From Columbine to Pensacola: Two Decades of Mass Shooting Violence in the United States, 1999 to 2019
For my final project in the Introduction to Data Visualization and Design Fundamentals course, I decided to analyze two decades of mass shooting violence in the United States from 1999 to 2019. I chose this topic because mass shootings are widely debated in the United States today and appear to be becoming more common. I wanted to determine whether there were any trends in the number of mass shootings and victims over time, which location types have the most total victims, and what the predominant race and gender of mass shooters are.
The data used for this analysis came from Mother Jones. It is called the “Mother Jones – Mass Shooting Database, 1982 to 2019” and is kept up to date by Mother Jones for public use. To be included in the database, a mass shooting must meet strict criteria set by Mother Jones.
For a mass shooting to be included in this database, four conditions must hold. First, the perpetrator must have killed at least four people (for shootings from 2013 onward, this was revised to three people). Second, the killings must have been carried out by a lone gunman (except in the case of the Columbine massacre and the Westside Middle School killings). Third, the shooting must have taken place in a public place (except in the cases of a party on private property in Crandon, Wisconsin, and in Seattle, Washington, where there was a public crowd). Fourth, the shooting must not have been the result of gang activity or armed robbery, and must not have been a mass killing in a private home. The victim tallies do not include any perpetrators who were injured or killed during the attack.
I used the case, date, fatalities, injured, total victims, location type, shooter, race, and gender variables from the database in my analysis. The data did not require much cleaning since it was published as a ready-to-use database, but I did edit the location types to be more specific and I had to create the shooter variable myself.
I created a total of seven visualizations in the course of my analysis.
The first visualization is a timeline of the mass shootings. I chose this visualization because I wanted to represent each data point over time. I decided to split the timeline in half to represent a decade each. This way the timelines could be used to compare the decades to each other.
The second and third visualizations are scatter plots with trend lines. I chose scatter plots for these visualizations because they were the easiest way to represent and see relationships between the variables of interest. The first scatter plot shows the number of mass shootings per year over time. The trend line for this visualization is a 3rd order polynomial, a curved fit rather than a straight line, which suggests that the number of mass shootings has increased over time at a rate that is not constant. This is polynomial growth, not exponential growth. The second scatter plot shows the total number of victims per year over time. The trend line for this visualization is also a 3rd order polynomial, which likewise suggests that total victims per year have increased at a changing rate over time.
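To make the idea of a 3rd order polynomial trend line concrete, here is a minimal sketch of a least-squares cubic fit in plain Python. It is not the code behind my charts, which came from a visualization tool; the function names `fit_polynomial` and `predict` and the yearly counts are hypothetical and exist only for illustration. In practice a library routine such as `numpy.polyfit` would be the usual choice.

```python
def fit_polynomial(xs, ys, degree=3):
    """Least-squares polynomial fit.

    Returns coefficients lowest order first: [c0, c1, c2, c3] for a cubic,
    so the fitted curve is c0 + c1*x + c2*x**2 + c3*x**3.
    """
    n = degree + 1
    # Normal equations (V^T V) c = V^T y, where V is the Vandermonde matrix.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]

    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


# Illustrative values only, NOT the Mother Jones figures.
# x is "years since 1999" so the powers of x stay small.
years_since_1999 = list(range(10))
shootings_per_year = [5, 1, 1, 0, 1, 1, 2, 3, 4, 3]

trend = fit_polynomial(years_since_1999, shootings_per_year, degree=3)
trend_values = [predict(trend, x) for x in years_since_1999]
```

A cubic can bend up or down within the plotted range, so the shape of the fitted curve should be read from the chart itself rather than assumed from the order of the polynomial.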
The fourth visualization is a tree map. I chose this visualization because it is a “part of a whole” type visualization. Each square represents one case, and its size is proportional to that case's share of the total number of victims across all cases. The larger the square, the more victims that mass shooting had compared to the cases with smaller squares.
The fifth visualization is a stacked bar chart. I chose this visualization because it was the best way to show the total number of victims for each location type while breaking that total down by case. It helps us see which location types have the most and fewest shootings, as well as which location types have the most total victims, without having to split this into two visualizations.
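The grouping behind a stacked bar chart like this one can be sketched as follows. This is only an illustration of the idea, assuming each record carries a case name, a location type, and fatality and injury counts; the field names, the function `victims_by_location`, and the placeholder rows are hypothetical and are not taken from the database.

```python
from collections import defaultdict


def victims_by_location(records):
    """Group total victims (fatalities + injured) by location type, then by case.

    Each location type becomes one bar; each case becomes one segment in it.
    """
    stacks = defaultdict(dict)
    for record in records:
        total = record["fatalities"] + record["injured"]
        stacks[record["location_type"]][record["case"]] = total
    return dict(stacks)


# Placeholder rows for illustration only, NOT real cases.
sample = [
    {"case": "Case A", "location_type": "School", "fatalities": 4, "injured": 2},
    {"case": "Case B", "location_type": "School", "fatalities": 5, "injured": 0},
    {"case": "Case C", "location_type": "Workplace", "fatalities": 3, "injured": 7},
]

stacks = victims_by_location(sample)

# Bar height = total victims for the location type.
bar_heights = {loc: sum(cases.values()) for loc, cases in stacks.items()}
# Number of segments = number of shootings at that location type.
segment_counts = {loc: len(cases) for loc, cases in stacks.items()}
```

The bar height answers "which location types have the most victims", while the number of segments answers "which location types have the most shootings", which is why one chart can cover both questions.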
The sixth and seventh visualizations are waffle grids. I chose this visualization because it is a “part of a whole” type visualization. Each square in the visualization represents 1%, so it is a great way to represent percentages and compare the relative sizes of categories. The first waffle grid compares black mass shooters to white mass shooters and shows us that mass shooters are predominantly white, 55%, as compared to 15% who are black. The second waffle grid compares male and female mass shooters and shows us that mass shooters are overwhelmingly male, 95%, as compared to 3% female mass shooters and 2% male and female mass shooter duos.
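Because a waffle grid has exactly 100 squares, the category percentages have to be rounded so that they still add up to 100. One way to do that is the largest-remainder method, sketched below. This is an assumption about how the rounding could be done, not a description of how my charts were built, and the function name `waffle_squares` is hypothetical.

```python
from collections import Counter


def waffle_squares(labels, total_squares=100):
    """Allocate waffle-grid squares to categories so they sum to total_squares.

    Uses the largest-remainder method: give each category the whole-number
    part of its share, then hand leftover squares to the categories with
    the largest fractional parts.
    """
    counts = Counter(labels)
    n = sum(counts.values())
    exact = {key: count * total_squares / n for key, count in counts.items()}
    squares = {key: int(share) for key, share in exact.items()}

    leftover = total_squares - sum(squares.values())
    by_remainder = sorted(
        exact, key=lambda key: exact[key] - squares[key], reverse=True
    )
    for key in by_remainder[:leftover]:
        squares[key] += 1
    return squares


# Labels constructed to match the gender percentages reported above.
gender_labels = ["Male"] * 95 + ["Female"] * 3 + ["Male & Female"] * 2
gender_grid = waffle_squares(gender_labels)
```

With a dataset whose category counts do not divide evenly, simple rounding can produce 99 or 101 squares, which is the case this method avoids.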
While the tree map is probably the visualization with the most visual impact in this analysis, it is important to note that it only compares the number of victims in each mass shooting. A larger total victim count does not mean that a shooting was more devastating. Each mass shooting is devastating in its own right and has had impacts that we could not visualize, since those impacts are subjective and we are only looking at the objective data. Due to the nature of the topic, it is hard to measure the impact of each mass shooting, since it affects families, friends, communities, and in many instances the public as a whole.
Given more time and data, I would have loved to take a deeper dive into this topic. Specifically, I would have liked to look at whether the mass shooters had a history of mental illness, what weapons they used, and whether those weapons were obtained legally.
I would also have liked to investigate a specific location type more closely to determine whether there are trends that are hidden when looking at all mass shooting location types together.
If I had a list of victims of each mass shooting, I would have loved to create a word cloud for each mass shooting to recognize the victims of these shootings. The victims of these senseless acts of violence deserve to be recognized more than their killers.