Making Sense of Public Health Data for COVID-19

Pandemic news coverage in real time is… challenging.

At the beginning of the nationwide stay-at-home directive (whether issued officially or not), I wanted to help in any way I could. I started training up on NLP (natural language processing) models with the goal of tackling a Kaggle competition based on “classification and of COVID-19 related scientific papers”*.

I never got to the point of submitting any models, but I found a lot of interesting datasets in my research. One of these datasets was the NYTimes dataset of US COVID-19 cases & deaths by state & county, updated every day.

By month three of lockdown, I felt like I was battling a constant onslaught of perspective-skewing news. The initial grasp I had on the magnitude of this crisis was beginning to slip as the push notifications became less about numbers & geographies & more about pictures of crowded pools in the Ozarks.

With cities and states making moves to reopen, I needed to understand the context — was this societal fatigue that was pushing us into these eventualities? Did the view from the CDC actually look any different now than it did three months ago? Just how dangerous were the choices that everyone was making?

So I decided to use public COVID-19 datasets to construct the perspective I wasn’t getting from the news.

I built a Python-fueled machine to transform, format, & update this data every day. Then, happily, I leveraged the same technology the NYTimes uses (d3.js, get into it) to visualize this data. The main point of this project was not to recreate the NYTimes’ very-available COVID maps, but to fashion a tool that can help me keep the perspective I felt I was losing. I can extend this project to help answer, directly from the data, epidemiological questions I may wonder about in the future, without needing the answers filtered through the news cycle.
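For a sense of what one transform step in a pipeline like this might look like, here is a minimal sketch. It is not the project's actual code. It assumes input in the column layout of the NYTimes state-level file (date, state, fips, cases, deaths, with cumulative counts), uses made-up sample rows rather than real figures, and the function name `daily_new_cases` is hypothetical. It turns the running totals into per-day new cases and emits JSON, a shape that d3.js can load easily.

```python
import csv
import io
import json
from collections import defaultdict

# Made-up sample rows in the assumed column layout (cumulative cases and
# deaths per state per day). These are illustrative numbers, not real data.
SAMPLE_CSV = """date,state,fips,cases,deaths
2020-06-01,Missouri,29,100,5
2020-06-02,Missouri,29,130,6
2020-06-03,Missouri,29,125,6
2020-06-01,Kansas,20,50,1
2020-06-02,Kansas,20,70,1
"""


def daily_new_cases(csv_text):
    """Turn cumulative case counts into per-day new cases, grouped by state."""
    by_state = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        by_state[row["state"]].append((row["date"], int(row["cases"])))

    result = {}
    for state, series in by_state.items():
        series.sort()  # ISO-formatted dates sort chronologically as strings
        previous = 0
        days = []
        for date, cumulative in series:
            # Clamp at zero, since a cumulative total can be revised downward.
            days.append({"date": date, "new_cases": max(cumulative - previous, 0)})
            previous = cumulative
        result[state] = days
    return result


if __name__ == "__main__":
    # JSON keyed by state, ready to be loaded by a front-end chart.
    print(json.dumps(daily_new_cases(SAMPLE_CSV), indent=2))
```

Note that the first day in each series carries the whole running total to date, so a real pipeline would likely drop or flag that first row before charting.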

I iterated on these datasets throughout the pandemic, slowly adding interesting suggestions from friends and more detail to help answer questions I couldn’t let go. The vaccine allocation dataset is a manifestation of that restlessness; you can find more info on its project page.

*Check out more about the mechanics & stakes of this challenge from our friends at Freethink here.
