Monthly Archives: December 2019

White Paper

From Columbine to Pensacola: Two Decades of Mass Shooting Violence in the United States, 1999 to 2019

Motivation

For my final project in the Introduction to Data Visualization and Design Fundamentals course, I decided to analyze two decades of mass shooting violence in the United States, from 1999 to 2019. I chose this topic because mass shootings are widely debated in the United States today and seem to be becoming more common. I wanted to determine whether there were any trends in the number of mass shootings and victims over time, which locations have the most total victims, and what the predominant race and gender of mass shooters are.

Data

The data used for this analysis came from Mother Jones. It is called the “Mother Jones – Mass Shooting Database, 1982 to 2019” and is kept up to date by Mother Jones for public use. To be included in this database, a mass shooting must meet strict criteria, which are set by Mother Jones.

For a mass shooting to be included in this database, the perpetrator must have killed at least four people (after 2013 this was revised to three people); the killings must have been carried out by a lone gunman (except in the cases of the Columbine massacre and the Westside Middle School killings); the shooting must have occurred in a public place (except in the cases of a party on private property in Crandon, Wisconsin, and in Seattle, Washington, where there was a public crowd); and the shooting must not have resulted from gang activity or armed robbery, or have been a mass killing in a private home. The victim tallies do not include any perpetrators who were injured or killed during the attack.
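As a rough illustration of how these inclusion rules combine, here is a minimal Python sketch. It is not Mother Jones's actual process. The `Incident` class, its field names, and the `listed_exception` flag are hypothetical names introduced for this example, and treating 2013 itself as still under the four-fatality rule is an assumption based on the "after 2013" wording above.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """Hypothetical record; field names are illustrative, not database columns."""
    year: int
    fatalities: int            # victims killed, excluding any perpetrators
    shooter_count: int
    public_place: bool
    excluded_context: bool     # gang activity, armed robbery, or killing in a private home
    listed_exception: bool = False  # case included despite the lone-gunman or public-place rule


def fatality_threshold(year):
    # Four fatalities, revised to three "after 2013".
    # Assumption: 2013 itself still uses the threshold of four.
    return 3 if year > 2013 else 4


def meets_criteria(incident):
    """Return True if the incident would qualify under the rules described above."""
    if incident.excluded_context:
        return False
    if incident.fatalities < fatality_threshold(incident.year):
        return False
    if incident.listed_exception:
        # Named exceptions waive the lone-gunman and public-place rules.
        return True
    return incident.shooter_count == 1 and incident.public_place
```

For example, `meets_criteria(Incident(2016, 3, 1, True, False))` passes under the revised threshold, while the same incident dated 2005 would not.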

I used the case, date, fatalities, injured, total victims, location type, shooter, race, and gender variables from the database in my analysis. The data did not require much cleaning since it was published as a ready-to-use database, but I did edit the location types to be more specific, and I had to create the shooter variable.

The Visualizations

I created a total of seven visualizations in the course of my analysis.

The first visualization is a timeline of the mass shootings. I chose this visualization because I wanted to represent each data point over time. I split the timeline into two halves, each representing one decade, so that the decades can be compared to each other.

The second and third visualizations are scatter plots with trend lines. I chose scatter plots for these visualizations because they were the easiest way to represent and see relationships between the variables of interest. The first scatter plot shows the number of mass shootings over time. The trend line for this visualization is a 3rd order polynomial trend line, which suggests that the number of mass shootings rises at an accelerating (nonlinear) rate over time; note that a polynomial trend is not the same as exponential growth. The second scatter plot shows the total number of victims over time. The trend line for this visualization is also a 3rd order polynomial trend line, which suggests that total victims per year also rise at an accelerating rate over time.
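To make the idea of a 3rd order polynomial trend line concrete, the sketch below fits a cubic by least squares using only the Python standard library. It is an illustration under stated assumptions, not the method used by the charting tool in this project: `fit_polynomial` and `evaluate` are hypothetical helper names, and the toy values are generated from a known cubic rather than taken from the Mother Jones database. In practice, `numpy.polyfit(x, y, 3)` performs the same kind of fit.

```python
def fit_polynomial(xs, ys, degree=3):
    """Least-squares polynomial fit; returns coefficients, lowest power first."""
    n = degree + 1
    # Normal equations (V^T V) c = V^T y, where V[i][j] = xs[i] ** j.
    a = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(a[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (b[row] - tail) / a[row][row]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** power for power, c in enumerate(coeffs))


# Toy data only: years are centered on 2009 (x = year - 2009) to keep the
# powers small, and y follows a made-up cubic, not real shooting counts.
toy_coeffs = [4.0, 0.3, 0.05, 0.01]
toy_xs = list(range(-10, 11))          # 1999 through 2019 after centering
toy_ys = [evaluate(toy_coeffs, x) for x in toy_xs]

fitted = fit_polynomial(toy_xs, toy_ys, degree=3)
```

Centering the years before fitting is a design choice: raising raw values such as 2019 to the sixth power makes the normal equations badly conditioned, while centered values stay small.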

The fourth visualization is a tree map. I chose this visualization because it is a “part of a whole” type visualization. The area of each square represents the total number of victims for that case, so each square shows proportionally how many of the total victims that case accounts for. The larger the square, the more victims that mass shooting had.

The fifth visualization is a stacked bar chart. I chose this visualization because it was the best way to show the total number of victims for each location type while breaking that total down by case. It helps us see which location types have the most and fewest shootings, as well as which location types have the most total victims, without having to split this into two visualizations.
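The data behind a stacked bar chart like this one can be sketched as a simple grouping step: one bar per location type, one segment per case. The snippet below is only an illustration. The function name `victims_by_location`, the record keys, and the toy records are all hypothetical and do not come from the database.

```python
from collections import defaultdict


def victims_by_location(records):
    """Group cases by location type.

    Returns (segments, totals, ordered):
      segments: {location_type: {case: total_victims}}  -> the bar segments
      totals:   {location_type: summed victims}         -> the bar heights
      ordered:  location types sorted by total, largest first
    """
    segments = defaultdict(dict)
    for rec in records:
        segments[rec["location_type"]][rec["case"]] = rec["total_victims"]
    totals = {loc: sum(cases.values()) for loc, cases in segments.items()}
    ordered = sorted(totals, key=totals.get, reverse=True)
    return dict(segments), totals, ordered


# Toy records with made-up numbers, for illustration only.
toy_records = [
    {"case": "Case A", "location_type": "School", "total_victims": 10},
    {"case": "Case B", "location_type": "School", "total_victims": 7},
    {"case": "Case C", "location_type": "Workplace", "total_victims": 5},
    {"case": "Case D", "location_type": "Concert", "total_victims": 40},
]

segments, totals, ordered = victims_by_location(toy_records)
```

The number of segments in a bar gives the number of shootings for that location type, and the bar height gives the total victims, which is why one chart can answer both questions.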

The sixth and seventh visualizations are waffle grids. I chose this visualization because it is a “part of a whole” type visualization. Each square in the visualization represents 1%, so it is a good way to show percentages and compare categories. The first waffle grid compares black mass shooters to white mass shooters and shows that mass shooters are predominantly white (55%), compared to 15% black. The second waffle grid compares male and female mass shooters and shows that mass shooters are predominantly male (95%), compared to 3% female and 2% male and female duos.
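Since each square stands for 1%, building a waffle grid comes down to turning category counts into exactly 100 squares. One way to do that, sketched below, is largest-remainder rounding, which guarantees the squares always sum to 100. This is an assumption for illustration; the tool used for this project may round differently. The function name `waffle_cells` and the example counts are hypothetical (the counts are chosen only so that they reproduce the 55% and 15% shares reported above, with the remaining 30% lumped into "other").

```python
def waffle_cells(counts, total_cells=100):
    """Allocate grid squares to categories using largest-remainder rounding."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    # Exact (fractional) share of squares for each category.
    exact = {k: v * total_cells / total for k, v in counts.items()}
    # Start from the rounded-down share (counts are non-negative).
    cells = {k: int(share) for k, share in exact.items()}
    leftover = total_cells - sum(cells.values())
    # Hand the remaining squares to the largest fractional remainders.
    by_remainder = sorted(exact, key=lambda k: exact[k] - cells[k], reverse=True)
    for k in by_remainder[:leftover]:
        cells[k] += 1
    return cells


# Hypothetical counts, not taken from the database.
example = waffle_cells({"white": 11, "black": 3, "other": 6})
```

Plain rounding of each percentage can leave a grid with 99 or 101 squares; the largest-remainder step avoids that.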

Closing Remarks

While the tree map is probably the visualization with the most visual impact in this analysis, it is important to note that it only compares the number of victims of each mass shooting. A larger total victim count does not mean that a shooting was more devastating. Each mass shooting is devastating in its own right and has had impacts that we could not visualize, since those impacts are subjective and we are only looking at objective data. Due to the nature of the topic, it is hard to measure the impact of each mass shooting, since it affects families, friends, communities, and in many instances the public as a whole.

Next Steps

Given more time and data, I would have loved to take a deeper dive into this topic. Specifically, I would have liked to look at whether the mass shooters had a history of mental illness, what weapons they used, and whether those weapons were obtained legally.

I would have also liked to investigate a specific location type more closely to determine if there are trends that have been overlooked when looking at the overview of all mass shooting location types.

If I had a list of victims of each mass shooting, I would have loved to create a word cloud for each mass shooting to recognize the victims of these shootings. The victims of these senseless acts of violence deserve to be recognized more than their killers.

From Columbine to Pensacola: Two Decades of Mass Shooting Violence in the United States, 1999 to 2019

Mass shooting violence in the United States is an ongoing problem that seems to get worse as time goes by. There have been many pushes for adequate gun control laws to limit access to semiautomatic weapons, which are the most common type of weapon used in mass shootings, but the problem remains.

For this project, I used the “Mother Jones – Mass Shooting Database, 1982 to 2019,” a database that was compiled by Mother Jones and is kept up to date, to investigate mass shootings in the United States from 1999 to 2019. I was interested in determining whether there are trends in the shooter’s gender and race, as well as trends in the number of shootings and the number of victims over time.

The information garnered from these visualizations and this investigation overall could be used to show what types of places and locations are more prone to mass shooting violence so that those places could improve security and police presence. It could also be used to bring attention to mass shooting violence in the United States and create a push to (hopefully) stop this type of violence.

To start off my investigation, I created two timelines that each depict one decade of shooting violence to determine trends in mass shooting violence over time. The first timeline is from 1999 to 2009 and the second timeline is from 2010 to 2019. By comparing the two timelines, we can easily see that there was much more shooting violence in the second decade, and in recent years, than there was in the first. It is also worth noting that 2002 was the only year in which there were no mass shootings.
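The steps behind this comparison (counting shootings per year, splitting the years into the two timelines, and finding years with no shootings) can be sketched in a few lines of Python. The helper names and the toy list of years below are hypothetical and only illustrate the approach; they are not the database's actual records.

```python
from collections import Counter


def shootings_per_year(years, first=1999, last=2019):
    """Count incidents per year, including years with zero incidents."""
    counts = Counter(years)
    return {year: counts.get(year, 0) for year in range(first, last + 1)}


def split_timeline(per_year, boundary=2010):
    """Split yearly counts into the two timelines (before / from the boundary)."""
    early = {y: n for y, n in per_year.items() if y < boundary}
    late = {y: n for y, n in per_year.items() if y >= boundary}
    return early, late


def years_without_shootings(per_year):
    """List the years in the range that have no recorded incidents."""
    return [year for year, n in per_year.items() if n == 0]


# Toy input: one entry per incident, made up for illustration only.
toy_years = [1999, 1999, 2000, 2001, 2001, 2003]
per_year = shootings_per_year(toy_years, first=1999, last=2003)
early, late = split_timeline(per_year, boundary=2002)
empty_years = years_without_shootings(per_year)
```

Filling in zero-count years explicitly matters here: a plain count of the records would silently skip any year with no shootings instead of reporting it.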

As a follow-up to the timeline, I wanted to analyze the number of mass shootings and the total number of victims over time. To do this, I created two scatter plots and added trend lines to them. The first is a scatter plot of the number of shootings over time with a third-order polynomial trend line. It shows that as time increases, so does the number of shootings. This trend is significant at the 1% level, as the p-value of the trend line is less than 0.0001. The second is a scatter plot of the number of victims over time, also with a third-order polynomial trend line. It shows that as time increases, so does the number of victims. This trend is significant at the 5% level, as the p-value of the trend line is less than 0.05. It is important to note that there are no entity fixed effects in the regression analyses above. This could have affected the coefficients as well as the p-values and standard errors.

I then created a tree map to analyze the total number of victims of each mass shooting. In doing so, I determined that the Las Vegas Strip massacre perpetrated by Stephen Paddock had the most victims by far. Rounding out the top three were the Orlando nightclub massacre perpetrated by Omar Mateen and the Aurora theater shooting perpetrated by James Holmes.

Next, I wanted to investigate the location types of mass shootings. To do this, I created a stacked bar chart which shows the total number of victims for each location type. It shows that the most victims resulted from concert mass shootings, followed by other types of mass shootings, school mass shootings, workplace mass shootings, nightclub mass shootings, religious mass shootings, military mass shootings, multiple (spree) mass shootings, and festival shootings. The concert location type had the most victims, even though it only accounted for two shootings, because of the Las Vegas Strip massacre.

Finally, I wanted to analyze the gender and race of the mass shooters. To do this, I created two visualizations, both of which were waffle grids. The first waffle grid was used to compare black mass shooters to white mass shooters. It showed that white mass shooters made up the largest share, compared to black mass shooters and mass shooters of other races. The second waffle grid was used to compare male mass shooters to female mass shooters. It showed that male mass shooters made up the largest share, compared to female mass shooters and male and female duo mass shooters.

Given more time, I would have liked to take a closer look at a single category of mass shootings and do an in-depth analysis of just that category. This could reveal trends in gender, race, total victims, number of shootings, or anything else that may have been overlooked in the overview of mass shootings presented here.