Knowing How To Read Your Data

adamlarosa

Adam La Rosa

Posted on March 22, 2020

With all the craziness of this worldwide pandemic, I couldn't help but spend a little time playing with a COVID-19 API (located at https://covid19api.com/). It offers a few different datasets, and the root endpoint returns a list of everything that's available, i.e...

0: {Name: "Get All Data", Description: "Returns all data in the system. Warning: this request returns 8MB+ and takes 5+ seconds", Path: "/all", Params: null}
1: {Name: "Get List Of Countries", Description: "Returns all countries and associated provinces. Th…y_slug variable is used for country specific data", Path: "/countries", Params: null}
2: {Name: "Get List Of Cases Per Country Per Province By Case Type", Description: "Returns all cases by case type for a country. Coun…ases must be one of: confirmed, recovered, deaths", Path: "/country/{country}/status/{status}", Params: Array(2)}
3: {Name: "Get List Of Cases Per Country By Case Type", Description: "Returns all cases by case type for a country. Coun…ases must be one of: confirmed, recovered, deaths", Path: "/total/country/{country}/status/{status}", Params: Array(2)}
4: {Name: "Get List Of Cases Per Country Per Province By Case Type From The First Recorded Case", Description: "Returns all cases by case type for a country from …ases must be one of: confirmed, recovered, deaths", Path: "/dayone/country/{country}/status/{status}", Params: Array(2)}
5: {Name: "Get List Of Cases Per Country By Case Type From The First Recorded Case", Description: "Returns all cases by case type for a country from …ases must be one of: confirmed, recovered, deaths", Path: "/total/dayone/country/{country}/status/{status}", Params: Array(2)}
6: {Name: "Add a webhook to be notified when new data becomes available", Description: "POST Request must be in JSON format with key URL a…sponse data is the same as returned from /summary", Path: "/webhook", Params: Array(2)}
7: {Name: "Summary of new and total cases per country", Description: "A summary of new and total cases per country", Path: "/summary", Params: null}
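Several of the Path values above contain placeholders such as {country} and {status}. As a minimal sketch of how one might fill them in, here is a small helper. The name fillPath is made up for illustration, and the country value "us" is an assumption rather than something taken from the API's documentation.

```javascript
// Hypothetical helper: replaces each {placeholder} in an endpoint Path
// with the matching value from params.
function fillPath(template, params) {
    return template.replace(/\{(\w+)\}/g, (match, name) => {
        if (!(name in params)) {
            throw new Error(`Missing value for path parameter "${name}"`)
        }
        // Encode the value so it is safe to use inside a URL path.
        return encodeURIComponent(params[name])
    })
}

// "us" is an assumed country value; "deaths" is one of the statuses listed above.
const path = fillPath("/dayone/country/{country}/status/{status}", {
    country: "us",
    status: "deaths"
})
console.log(path) // "/dayone/country/us/status/deaths"
```

Throwing on a missing parameter is a design choice here: it surfaces a mistyped key right away instead of sending a request with a literal {country} in it.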

Right off the bat "Get List Of Cases Per Country Per Province By Case Type From The First Recorded Case" caught my eye. "How many people in the United States have died from this thing??!?!" was my first thought. Let's see what this one gives us...

0:
Country: "US"
Province: "King County, WA"
Lat: 47.6062
Lon: -122.332
Date: "2020-02-29T00:00:00Z"
Cases: 1
Status: "deaths"
__proto__: Object
1: {Country: "US", Province: "King County, WA", Lat: 47.6062, Lon: -122.332, Date: "2020-03-01T00:00:00Z", …}
2: {Country: "US", Province: "King County, WA", Lat: 47.6062, Lon: -122.332, Date: "2020-03-02T00:00:00Z", …}
3: {Country: "US", Province: "Snohomish County, WA", Lat: 48.033, Lon: -121.834, Date: "2020-03-02T00:00:00Z", …}
4: {Country: "US", Province: "King County, WA", Lat: 47.6062, Lon: -122.332, Date: "2020-03-03T00:00:00Z", …}
5: {Country: "US", Province: "Snohomish County, WA", Lat: 48.033, Lon: -121.834, Date: "2020-03-03T00:00:00Z", …}

...etc...

Ok, the object has a key of "Cases" which gives me the number of deaths. Perfect! I'll just add those up & have my total! Luckily for me the object is numbered, so I'll just take the number of keys and iterate through them, grabbing my data along the way. Having put the data in an object named "results", I put this together...

let total = 0
const size = Object.keys(results).length

for (let i=0; i < size; i++){
    total = total + results[i]["Cases"]
}

This gave me a number of 1135 which was incredibly alarming as the news was reporting that the United States had only suffered 244 fatalities. So my first response, of course, was to panic.

WE'RE BEING LIED TO! THEY'RE SUPPRESSING THE REAL DATA!!!

Thankfully this only lasted a moment, and the cold realization that once again my logic was to blame splashed me with its icy truth. Time to take a deeper look at the data being presented to me.

0:
Country: "US"
Province: "King County, WA"
Lat: 47.6062
Lon: -122.332
Date: "2020-02-29T00:00:00Z"
Cases: 1
Status: "deaths"
__proto__: Object
1:
Country: "US"
Province: "King County, WA"
Lat: 47.6062
Lon: -122.332
Date: "2020-03-01T00:00:00Z"
Cases: 1
Status: "deaths"
__proto__: Object
2:
Country: "US"
Province: "King County, WA"
Lat: 47.6062
Lon: -122.332
Date: "2020-03-02T00:00:00Z"
Cases: 5
Status: "deaths"
__proto__: Object
3: {Country: "US", Province: "Snohomish County, WA", Lat: 48.033, Lon: -121.834, Date: "2020-03-02T00:00:00Z", …}
4:
Country: "US"
Province: "King County, WA"
Lat: 47.6062
Lon: -122.332
Date: "2020-03-03T00:00:00Z"
Cases: 6
Status: "deaths"
__proto__: Object
5: {Country: "US", Province: "Snohomish County, WA", Lat: 48.033, Lon: -121.834, Date: "2020-03-03T00:00:00Z", …}
6: {Country: "US", Province: "King County, WA", Lat: 47.6062, Lon: -122.332, Date: "2020-03-04T00:00:00Z", …}
7: {Country: "US", Province: "Placer County, CA", Lat: 39.0916, Lon: -120.804, Date: "2020-03-04T00:00:00Z", …}

...etc...

Here we can see that the first entry is from King County, WA, reporting one "case" of "deaths" on February 29th. Then on March 1st there is again only one "case" of "deaths". This is where I made my mistake. "Cases" isn't the number of new deaths on that date, as I assumed, but the running total of deaths recorded for that location up to that date. I.e. there was one on 2/29, still only one on 3/1, then a total of five on 3/2.
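To make the difference concrete, here is a small sketch using only the four King County values shown above. The variable names are my own and aren't part of the API.

```javascript
// King County, WA "Cases" values, copied from the entries shown above.
const kingCounty = [
    { Date: "2020-02-29T00:00:00Z", Cases: 1 },
    { Date: "2020-03-01T00:00:00Z", Cases: 1 },
    { Date: "2020-03-02T00:00:00Z", Cases: 5 },
    { Date: "2020-03-03T00:00:00Z", Cases: 6 }
]

// Summing running totals counts earlier deaths again on every later day.
const naiveSum = kingCounty.reduce((sum, entry) => sum + entry.Cases, 0)

// The running total on the most recent date is the figure we actually want.
const latestTotal = kingCounty[kingCounty.length - 1].Cases

// New deaths per day are the differences between consecutive running totals.
const dailyNew = kingCounty.map((entry, i) =>
    i === 0 ? entry.Cases : entry.Cases - kingCounty[i - 1].Cases
)

console.log(naiveSum)    // 13
console.log(latestTotal) // 6
console.log(dailyNew)    // [ 1, 0, 4, 1 ]
```

Four days of data already more than doubles the count, which is exactly the kind of inflation my first loop produced.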

Ok, I can work with this. All one would have to do is find the last entry for each state and grab its total, then add all of THOSE totals up. This presented a couple of new challenges.

First, all the states are mixed up due to the fact they were all reporting cases simultaneously. Not too complex, just separate them all by state during an iteration.

Second, and more exciting, was the fact that sometime around March 10th they changed the way the locations were formatted. E.g...

26: {Country: "US", Province: "Placer County, CA", Lat: 39.0916, Lon: -120.804, Date: "2020-03-09T00:00:00Z", …}
27: {Country: "US", Province: "Santa Rosa County, FL", Lat: 30.769, Lon: -86.9824, Date: "2020-03-09T00:00:00Z", …}
28: {Country: "US", Province: "Snohomish County, WA", Lat: 48.033, Lon: -121.834, Date: "2020-03-09T00:00:00Z", …}
29: {Country: "US", Province: "Florida", Lat: 27.7663, Lon: -81.6868, Date: "2020-03-10T00:00:00Z", …}
30: {Country: "US", Province: "New Jersey", Lat: 40.2989, Lon: -74.521, Date: "2020-03-10T00:00:00Z", …}
31: {Country: "US", Province: "Washington", Lat: 47.4009, Lon: -121.491, Date: "2020-03-10T00:00:00Z", …}

They switched from "County, State" to just "State". So during the iteration I'd have to check which format each entry uses, then sort the data appropriately. Which inevitably led me to this solution...

        const stateTable = {
            AL: "Alabama", AK: "Alaska", AZ: "Arizona", AR: "Arkansas",
            CA: "California", CO: "Colorado", CT: "Connecticut", DE: "Delaware",
            FL: "Florida", GA: "Georgia", HI: "Hawaii", ID: "Idaho",
            IL: "Illinois", IN: "Indiana", IA: "Iowa", KS: "Kansas",
            KY: "Kentucky", LA: "Louisiana", ME: "Maine", MD: "Maryland",
            MA: "Massachusetts", MI: "Michigan", MN: "Minnesota",
            MS: "Mississippi", MO: "Missouri", MT: "Montana", NE: "Nebraska", 
            NV: "Nevada", NH: "New Hampshire", NJ: "New Jersey", 
            NM: "New Mexico", NY: "New York", NC: "North Carolina", 
            ND: "North Dakota", OH: "Ohio", OK: "Oklahoma", OR: "Oregon", 
            PA: "Pennsylvania", RI: "Rhode Island", SC: "South Carolina", 
            SD: "South Dakota", TN: "Tennessee", TX: "Texas", UT: "Utah", 
            VT: "Vermont", VA: "Virginia", WA: "Washington", 
            WV: "West Virginia", WI: "Wisconsin", WY: "Wyoming" 
        }

        const resultsSize = Object.keys(results).length
        let states = {} 

        for (let i=0; i < resultsSize; i++){
            // Province is split between state & county.
            if (results[i].Province.split(", ")[1]) {
                let state = results[i].Province.split(", ")[1]

                if (Object.keys(states).includes(stateTable[state])) {
                    states[stateTable[state]].push({
                        location: results[i].Province.split(", ")[0],
                        date: results[i].Date,
                        deaths: results[i].Cases
                    })
                } else {
                    states[stateTable[state]] = []
                    states[stateTable[state]].push({
                        location: results[i].Province.split(", ")[0],
                        date: results[i].Date,
                        deaths: results[i].Cases
                    })
                }
            } else {
            // Only state name is specified.
                let state = results[i].Province

                if (Object.keys(states).includes(state)) {
                    states[state].push({
                        date: results[i].Date,
                        deaths: results[i].Cases
                    })
                } else {
                    states[state] = []
                    states[state].push({
                        date: results[i].Date,
                        deaths: results[i].Cases
                    })
                }
            }
        }
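The two branches above differ only in how they work out the state name, so the same grouping can be sketched more compactly. This is just one possible arrangement, not a replacement for the code above: parseProvince and groupByState are names I made up, the lookup table is cut down to three states to keep the example short, and the "Cases" value on the state-only sample entry is invented for illustration.

```javascript
// Abbreviated lookup for illustration; in practice the full stateTable above would be used.
const stateTable = { WA: "Washington", CA: "California", FL: "Florida" }

// Hypothetical helper: turns "King County, WA" or "Washington" into
// a state name plus an optional county name.
function parseProvince(province) {
    const [first, abbreviation] = province.split(", ")
    if (abbreviation === undefined) {
        // Only a state name was given.
        return { state: first, location: null }
    }
    // Fall back to the raw abbreviation if it is not in the table.
    return { state: stateTable[abbreviation] ?? abbreviation, location: first }
}

// Groups entries by state. Object.values works whether results is an
// array or an object with numbered keys.
function groupByState(results) {
    const states = {}
    for (const entry of Object.values(results)) {
        const { state, location } = parseProvince(entry.Province)
        if (!states[state]) states[state] = []
        states[state].push({ location, date: entry.Date, deaths: entry.Cases })
    }
    return states
}

const sample = [
    { Province: "King County, WA", Date: "2020-03-03T00:00:00Z", Cases: 6 },
    // Made-up Cases value, purely for illustration.
    { Province: "Washington", Date: "2020-03-10T00:00:00Z", Cases: 7 }
]
const grouped = groupByState(sample)
console.log(Object.keys(grouped))   // [ "Washington" ]
console.log(grouped.Washington.length) // 2
```

Falling back to the raw abbreviation means an unexpected value ends up under its own key instead of being lumped together under "undefined".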

I now have a new object called "states", keyed by state name, where each entry holds just the date and number of deaths (plus the county name, when there was one). Let's try adding THIS one up.

        // Running total across all states.
        let totalDeaths = 0

        for (const state in states) {
            // The last entry for a state holds its most recent running total.
            const stateSize = states[state].length
            const theDead = states[state][stateSize - 1].deaths
            const theTime = states[state][stateSize - 1].date

            totalDeaths = totalDeaths + theDead
            console.log(state, ":", theDead, theTime)
        }

...which gave a number of 244, just a bit lower than official numbers, which I presume is due to the API only updating once each night.

MUCH better. Less panic.
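One caveat with taking only the last entry per state: if a state's most recent date still had one entry per county, the last entry would cover a single county. Here's a hedged sketch of a variation that adds up every entry on each state's latest date instead. It assumes the same "states" shape built above; totalLatestDeaths is a made-up name and all the sample numbers are invented for illustration, not real figures.

```javascript
// Hypothetical helper: for each state, sum every entry recorded on
// that state's most recent date.
function totalLatestDeaths(states) {
    let total = 0
    for (const entries of Object.values(states)) {
        // ISO 8601 timestamps in the same format compare correctly as strings.
        const latestDate = entries.reduce(
            (latest, entry) => (entry.date > latest ? entry.date : latest),
            ""
        )
        for (const entry of entries) {
            if (entry.date === latestDate) total += entry.deaths
        }
    }
    return total
}

// Made-up sample data in the same shape as "states".
const sampleStates = {
    Washington: [
        { location: "King County", date: "2020-03-03T00:00:00Z", deaths: 6 },
        { location: "Snohomish County", date: "2020-03-03T00:00:00Z", deaths: 1 }
    ],
    Florida: [
        { date: "2020-03-09T00:00:00Z", deaths: 1 },
        { date: "2020-03-10T00:00:00Z", deaths: 2 }
    ]
}
console.log(totalLatestDeaths(sampleStates)) // 9
```

This still misses counties that stopped reporting before a state's latest date, so treat it as a sanity check rather than a guaranteed fix.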

What did we learn? Understand how your data is sent before jumping to conclusions. :)
