Build a COVID-19 Dashboard With Fewer Than 40 Lines of Code

Intro

One of the difficult parts of making sense of the COVID-19 pandemic (caused by the SARS-CoV-2 virus) is sorting through the myriad data coming at us from every direction. In this tutorial, I will show you a fun way of manipulating some of these data to build your own dashboard for viewing United States COVID-19 statistics. Afterward, you will hopefully be able to enhance your dashboard(s) to make unique visualizations.

This project uses Streamlit, an exciting and relatively young open-source data visualization project. It makes building beautiful data visualizations as easy as writing a Python script. If you like what you see here, I suggest you read through its documentation and follow the project as it grows.

As stated above, there are many data sources for COVID-19. We will be using covidtracking.com as it has a simple REST API. However, there are many other sources available. I also recommend checking with your local municipality for public data sources. Remember to be respectful and follow API guidelines.

Let's get started!

The recommended way to install Streamlit is via pip:

pip install streamlit

Open a new Python script (I called mine covid_dashboard.py), and begin by importing some libraries:

import streamlit as st
import matplotlib.pyplot as plt
import pandas as pd
from datetime import datetime, timedelta

Streamlit has a few simple built-in chart functions, but here we will use Matplotlib's pyplot to make our plots. We need pandas to manipulate the data, and we will also be using Python's built-in datetime library.

Next, we need to define a function to retrieve data from the API.

def fetch_data():
    df = pd.read_json('https://covidtracking.com/api/v1/us/daily.json')
    df['date'] = pd.to_datetime(df['date'], format="%Y%m%d")
    df.set_index('date', inplace=True)
    df.sort_index(ascending=True, inplace=True)
    return df

The API output is in JSON format, which we can read directly into a pandas DataFrame. We then convert the 'date' column to datetime objects and make that column the DataFrame's index. Finally, we sort the DataFrame by date.
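To see what the conversion and sorting steps do without calling the API, here is a minimal sketch on a few made-up records. It assumes the dates arrive as YYYYMMDD integers, newest first; the records list and the tidy() helper are hypothetical names used only for illustration.

```python
import pandas as pd

# Hypothetical sample records shaped like the API output: dates as YYYYMMDD
# integers, assumed here to arrive newest first. The values are made up.
records = [
    {"date": 20200303, "positive": 30},
    {"date": 20200302, "positive": 20},
    {"date": 20200301, "positive": 10},
]

def tidy(frame):
    """Apply the same steps as fetch_data(), minus the network call."""
    frame = frame.copy()
    # Cast to str first so the format string is applied to text like "20200301".
    frame["date"] = pd.to_datetime(frame["date"].astype(str), format="%Y%m%d")
    # Use the dates as the index and put the oldest row first.
    return frame.set_index("date").sort_index(ascending=True)

sample = tidy(pd.DataFrame(records))
print(sample)
```

With a DatetimeIndex in place, rows can later be selected by date range instead of by position.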

As the user interacts with our dashboard, the script is frequently re-run. However, we don't want to poll the API every time the user changes how they want to view the same data. Streamlit simplifies this with a @st.cache decorator that we can place ahead of any function that makes an API call. This way, the API is only called once even though fetch_data() runs many times, which keeps us from flooding the API endpoint with repeated requests for exactly the same data. (In recent Streamlit releases, st.cache has been superseded by st.cache_data.)

Our decorated function and function call should look like this:

@st.cache
def fetch_data():
    df = pd.read_json('https://covidtracking.com/api/v1/us/daily.json')
    df['date'] = pd.to_datetime(df['date'], format="%Y%m%d")
    df.set_index('date', inplace=True)
    df.sort_index(ascending=True, inplace=True)
    return df

df = fetch_data()

Again, fetch_data() triggers an API call only once; from then on, the data are cached.
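The caching idea can be illustrated outside Streamlit. The sketch below uses functools.lru_cache from the standard library as a stand-in; it is only an analogy for the decorator's behavior, not how Streamlit implements its cache, and fetch_numbers and call_count are hypothetical names.

```python
from functools import lru_cache

call_count = 0  # counts how many times the "expensive" body actually runs

@lru_cache(maxsize=None)
def fetch_numbers():
    """Stand-in for an API call; returns a tuple so the result is immutable."""
    global call_count
    call_count += 1
    return (1, 2, 3)

first = fetch_numbers()
second = fetch_numbers()  # served from the cache; the body does not run again
print(call_count)
```

The second call returns the stored result, so the counter stays at one no matter how many times the function is called.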

Next, we construct a dictionary of data descriptors and their corresponding columns in the DataFrame we just constructed from the API.

options = {"Cumulative Positive Results": 'positive',
    "Daily Positive Tests": 'positiveIncrease',
    "Cumulative Deaths": 'death',
    "Daily Deaths": 'deathIncrease',
    "Current Hospitalizations": 'hospitalizedCurrently',
    "Daily Hospitalizations": 'hospitalizedIncrease',
    "Cumulative Hospitalizations": 'hospitalizedCumulative',
    "Current ICU Patients": 'inIcuCurrently',
    "Cumulative ICU Patients": 'inIcuCumulative',
    "Current Ventilator Patients": 'onVentilatorCurrently',
    "Cumulative Ventilator Patients": 'onVentilatorCumulative',
    "Recovered Patients": 'recovered',
    "Daily Tests Performed": 'totalTestResultsIncrease',
    "Cumulative Tests Performed": 'totalTestResults'}

Now that we have collected our data, we can start the fun part: making our dashboard! Let's give it a title:

st.title('COVID-19 Dashboard: US Data')

It is also good practice to reference your data source:

st.subheader('Source: https://covidtracking.com')

Streamlit features a collapsible sidebar where you can put option menus and other things you don't want cluttering your main window. Most Streamlit components can be put in the sidebar by calling them as methods on the st.sidebar object. Let's add some date pickers for the start and end dates of our visualization. (I'll set an arbitrary initial start date of March 1, 2020, and an initial end date matching the latest date in the DataFrame.)

start_date = st.sidebar.date_input("Start Date", value=datetime(2020,3,1))
end_date = st.sidebar.date_input("End Date", value=df.index.max())

We now need to let the user pick which visualizations they want to see. Seeing only a single category of data is boring, so we should use a multi-select.

charts = st.sidebar.multiselect("Select individual charts to display:",
                options=list(options.keys()),
                default=list(options.keys())[:1])

The first parameter is the caption, options is the list of keys from our options dictionary defined above, and default is the first of those keys. The call returns a list of the chosen options, which we store in charts.

In our main window, we now loop through the charts list and plot each selection on the same set of axes.

for chart in charts:
    df[options[chart]].loc[start_date : end_date + timedelta(days=1)].plot(label=chart, figsize=(8,6))
    plt.xlabel('Date')
    plt.legend(loc="upper left")

The first line in the loop selects the column in the DataFrame corresponding to the chosen option, restricts it to the chosen date range, and uses the pandas plot function to plot the data from that column. The other lines format the axes; you can customize these to change the look of your charts.
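The date-range selection can be tried on its own with a small made-up series. This sketch assumes the date pickers return datetime.date objects and converts them to pandas Timestamps before slicing; the series values and variable names are illustrative only.

```python
import pandas as pd
from datetime import date, timedelta

# Made-up daily series indexed by date, standing in for one DataFrame column.
index = pd.date_range("2020-03-01", periods=10, freq="D")
series = pd.Series(range(10), index=index)

# Stand-ins for the values returned by the two date pickers.
start_date = date(2020, 3, 3)
end_date = date(2020, 3, 5)

# Converting to Timestamps keeps the comparison with the DatetimeIndex explicit.
start = pd.Timestamp(start_date)
end = pd.Timestamp(end_date + timedelta(days=1))

# Label-based slices with .loc include both endpoints.
window = series.loc[start:end]
print(window)
```

Because .loc slices include both endpoints, adding one day to the end date means the window runs through the day after the chosen end date whenever that row exists.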

Once all the plots are created, the figure still exists only in memory, so we have to tell Streamlit to show it by calling:

st.pyplot()
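As an alternative to relying on pyplot's implicit current figure, the plot can be built on an explicit Figure object and passed to st.pyplot, which accepts a figure argument. The following is a sketch under that assumption, with made-up data and a hypothetical build_figure() helper; the Streamlit call is left as a comment so the snippet runs on its own.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data standing in for two columns of the DataFrame.
index = pd.date_range("2020-03-01", periods=5, freq="D")
data = pd.DataFrame(
    {"positive": [1, 2, 4, 8, 16], "death": [0, 0, 1, 1, 2]}, index=index
)

def build_figure(frame, labels):
    """Plot each selected column on one shared Axes and return the Figure."""
    fig, ax = plt.subplots(figsize=(8, 6))
    for label, column in labels.items():
        frame[column].plot(ax=ax, label=label)
    ax.set_xlabel("Date")
    ax.legend(loc="upper left")
    return fig

fig = build_figure(
    data,
    {"Cumulative Positive Results": "positive", "Cumulative Deaths": "death"},
)
# In the dashboard script, the figure would then be handed to Streamlit:
# st.pyplot(fig)
```

Holding on to the Figure makes it clear which plot is being displayed, which matters once a page shows more than one chart.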

And that's it!

Entire script

import streamlit as st
import matplotlib.pyplot as plt
import pandas as pd
from datetime import datetime, timedelta

@st.cache
def fetch_data():
    df = pd.read_json('https://covidtracking.com/api/v1/us/daily.json')
    df['date'] = pd.to_datetime(df['date'], format="%Y%m%d")
    df.set_index('date', inplace=True)
    df.sort_index(ascending=True, inplace=True)
    return df

df = fetch_data()

options = {"Cumulative Positive Results": 'positive',
    "Daily Positive Tests": 'positiveIncrease',
    "Cumulative Deaths": 'death',
    "Daily Deaths": 'deathIncrease',
    "Current Hospitalizations": 'hospitalizedCurrently',
    "Daily Hospitalizations": 'hospitalizedIncrease',
    "Cumulative Hospitalizations": 'hospitalizedCumulative',
    "Current ICU Patients": 'inIcuCurrently',
    "Cumulative ICU Patients": 'inIcuCumulative',
    "Current Ventilator Patients": 'onVentilatorCurrently',
    "Cumulative Ventilator Patients": 'onVentilatorCumulative',
    "Recovered Patients": 'recovered',
    "Daily Tests Performed": 'totalTestResultsIncrease',
    "Cumulative Tests Performed": 'totalTestResults'}

## Build page
st.title('COVID-19 Dashboard: US Data')
st.subheader('Source: https://covidtracking.com')

start_date = st.sidebar.date_input("Start Date", value=datetime(2020,3,1))
end_date = st.sidebar.date_input("End Date", value=df.index.max())

charts = st.sidebar.multiselect("Select individual charts to display:",
                options=list(options.keys()),
                default=list(options.keys())[:1])

for chart in charts:
    df[options[chart]].loc[start_date : end_date + timedelta(days=1)].plot(label=chart, figsize=(8,6))
    plt.xlabel('Date')
    plt.legend(loc="upper left")
st.pyplot()

Launching your dashboard

Once your script is saved, you can view your dashboard by running:

streamlit run covid_dashboard.py

A browser window should open to your new, fancy dashboard! Try changing the dates and chart types in the sidebar.

This is just a starting point. There is a lot you can do to make it more interactive and show data in different ways. Here are some other ideas:

  • Source data from different states and chart comparisons between them.
  • Try computing derived series from your data, such as moving averages, to make the plots more interesting.
  • Try different chart types using matplotlib and seaborn.
  • Allow different user input to customize the views.
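As a starting point for the moving-average idea, here is a minimal sketch using pandas' rolling window on a made-up daily series. The 7-day window and the sample values are assumptions chosen for illustration.

```python
import pandas as pd

# Made-up daily counts standing in for a column such as 'positiveIncrease'.
index = pd.date_range("2020-03-01", periods=10, freq="D")
daily = pd.Series(
    [0, 7, 14, 7, 0, 7, 14, 7, 0, 7], index=index, name="positiveIncrease"
)

# 7-day moving average; the first six entries are NaN because a full
# window of seven observations is not yet available.
smoothed = daily.rolling(window=7).mean()
print(smoothed)
```

In the dashboard, the smoothed series could be plotted alongside the raw one to make trends easier to read through day-to-day noise.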

Stay safe and have fun!
