[OC] Total deaths and injuries from US school shootings (1990-2024)



Data: https://en.wikipedia.org/wiki/Lists_of_school_shootings_in_the_United_States

Tools: R

General description: each cell represents the total number of individuals injured or killed in school shootings on that day of the week (see definitions in the source above). I started by looking at day of week within week of month within month of year, but that was far too difficult to read, and I probably would have had to process the data more heavily (because of nulls across weekends and other considerations about how to count a week). This analysis involved aggregating data on school shootings by date, calculating the total incidents per day, and extracting date components like the week of the month and the year. I then created a calendar heatmap using a color gradient from light purple to dark yellow to visually represent the frequency of incidents across different decades, months, and weeks within the month, with outliers mitigated using log scaling. I ordered the months from August through July to roughly follow the school year from start to end, and started the week on Monday to better reflect the first school day of the week.
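
A minimal sketch of the steps described above, written in Python rather than the R used for the chart. It is not the original code: the sample rows are made up, and the helper names (`week_of_month`, `aggregate`, `log_scaled`) and the choice to key cells by decade, month, and weekday are assumptions for illustration. In particular, "week of month" has several reasonable definitions, and the one below is just the simplest.

```python
import math
from collections import defaultdict
from datetime import date

# Hypothetical sample rows: (incident date, killed, injured). Not real data.
incidents = [
    (date(2001, 9, 3), 0, 1),
    (date(2001, 9, 3), 1, 2),
    (date(2001, 10, 17), 0, 4),
    (date(2002, 7, 7), 2, 0),
]

# School-year ordering: August first, July last.
SCHOOL_YEAR_MONTHS = [8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7]
# date.weekday() returns 0 for Monday, so this list starts the week on Monday.
WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def week_of_month(d):
    """Assumed definition: days 1-7 are week 1, days 8-14 are week 2, and so on."""
    return (d.day - 1) // 7 + 1


def aggregate(rows):
    """Sum casualties per date, then per (decade, month, weekday) cell."""
    daily = defaultdict(int)
    for d, killed, injured in rows:
        daily[d] += killed + injured

    cells = defaultdict(int)
    for d, total in daily.items():
        decade = d.year // 10 * 10
        cells[(decade, d.month, WEEKDAYS[d.weekday()])] += total
    return dict(cells)


def log_scaled(cells):
    """Compress outliers with log(1 + x); the +1 keeps small totals finite."""
    return {key: math.log1p(total) for key, total in cells.items()}


cells = aggregate(incidents)
scaled = log_scaled(cells)

# Print cells in decade, then school-year month, then Monday-first weekday order.
ordered = sorted(
    cells,
    key=lambda k: (k[0], SCHOOL_YEAR_MONTHS.index(k[1]), WEEKDAYS.index(k[2])),
)
for key in ordered:
    print(key, cells[key], round(scaled[key], 3))
```

The scaled values would drive the color gradient, while the raw totals stay available for labels or tooltips. Cells with no casualties are simply absent from the dictionary here, so a plotting step would need to fill them in explicitly to keep the same rows and columns in every panel.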

If people are interested I will post the data/code.

And here is one where I adjusted it to show just week of year. Again, this is more legible than combining day of week and week of year, but I felt day of week was more interesting. https://imgur.com/QdGeGIs

Posted by reporst

11 comments
  1. Your source has a very loose definition of a school shooting. There are many examples like this:

    > An individual who was not a student accidentally shot himself in the leg in the parking lot of Glades Central High School.

    > A worker fixing the roof of Canyon del Oro High School was fatally wounded after his unholstered weapon accidentally discharged.

    I don’t know if you are being purposefully misleading or if it’s your source, but you should make the definition clearer.

  2. How does a shooting that happens on a Sunday in July qualify as a school shooting?

    I know of no schools that would be in session then.

  3. The presentation by day of week/month is an interesting way to slice the data. I am not seeing a clear pattern, but definitely an interesting question.

    Briefly on politics/rhetoric. 

    I’d echo yo-chill; the choice of what is reported as a school shooting is a very political one. Not that you made that choice, just in general.

    It will conflate awful incidents like Sandy Hook with incidents like “A school resource officer accidentally discharged their firearm in a school bathroom (April 2, 2024).” Both could be “school shootings”, in that a firearm was discharged at a school. That said, they’re substantially different incidents, for which different solutions might be suggested. Conflating the two will inflate the perceived frequency of the more awful/memorable school shootings, while omitting the less severe incidents will mask the frequency of lower-level firearm negligence.

    You are coloring by deaths/injuries, which will omit entries with nobody hurt, which ameliorates this some.

    On presentation/visualization:

    I think there’s a point to be made that the log scale is a choice to be considered; without it, outliers might drown out other patterns in the data; with it, days with fewer deaths/injuries will appear more similar to days with many deaths/injuries.

    Aggregating incidents by DoW/Month/Year will make it difficult to distinguish whether there were eight 1-injury incidents or one 8-injury incident in a given cell.

    It looked like there are a few spots where a particular DoW/month was missing; I’m guessing there were no data for those cells, but I think it would be better to see the same rows/columns in each decade, for a clearer comparison.

    One more thing, which I don’t think would have a huge effect: in a given month, each DoW occurs a different number of times (four or five). It probably doesn’t change much, but it’s worth considering.

    Just a thought:

    One thing that might be interesting to do would be something like a stacked bubble chart; each incident being a bubble on a single line for time, with size for number of casualties; then slice that time to month-long windows. Display those windows with x axis being month, y axis as decade.

    Could do the same for DoW slices. 

    Would be a bit more difficult to appreciate from a top-down/single page view, but would offer a bit more detailed slice into the data.

    Neat project though, good work :).

  4. Good to know that Saturdays did not exist in the 1990s 😉

    And why are the days of the week listed in reverse order? Sun Sat Fri Thurs Wed Tues Mon

  5. I’m not sure this belongs in /r/dataisbeautiful, maybe /r/dataisterrifying.

    One thing that seems to be clear is that it’s getting more frequent. The colours might not be quite so dark in the last plot, but I’m not sure that isn’t just luck. Certainly having more occurrences leads to more opportunities for those colours to darken…
