COVID-19: Perspective of a Data Scientist

We should understand what the data is telling us about COVID-19 and pay attention to what’s happening in our own communities. If we don’t pay attention and start preparing now, by the time this virus is in our neighborhoods, it will already be too late.

Even at a company that constantly identifies critical events for some of the most recognized companies in the world, the first reports of coronavirus didn’t raise significant alarms; it was far away and the reported numbers weren’t particularly concerning. At the office, we brushed it off as a bad flu. Our CTO, whose sister is an MD, didn’t seem particularly worried either. Mortality rates in China (read: questionable data) were relatively low. Same in Japan. As early reporting from South Korea started coming in, alarm bells began ringing. We were starting to see implications for global supply chains.

The first COVID-19 case in the US was reported on January 21st, followed by several reports of local transmission. On February 28th came the first reported death in the US due to COVID-19. Most of that was centered here, in Washington State. Then I saw reports from Italy: doctors conducting triage in hallways and sending sick patients away for lack of resources. Suddenly the “why” behind mortality rates became clear: hospitals were overwhelmed and lacked the resources to cope.

As COVID-19 progresses, patients require ventilators and ICU care. Recovery times are relatively long, measured in weeks, not days. Demand for beds and equipment quickly outpaces supply.

As a data guy who started a data company, I began thinking about what we would do in the US, and whether we could develop a real-time gauge of hospital occupancy in our communities. I did a bit of analysis and found the number of hospital beds by ZIP code on the Department of Homeland Security’s website. I then looked at reporting from the New York Times and at a dataset, posted as a project by the Harvard Global Health Initiative, that captured the ratio of ICU beds to normal beds as well as typical occupancy. What I found worried me: if previous patterns held true, no major US city would have the capacity to deal with the oncoming demand.

What Happens This Week

New York City is about to be a bellwether for the US. How the city copes with COVID-19 will be telling. And if the estimates I ran are correct, New York City’s healthcare system will begin redlining this week. Basically, when NYC hits roughly 9,400 reported cases, it will hit ICU capacity. As of right now (Sunday, March 22nd), the city is about 3,000 cases away from that threshold, and the count is increasing by thousands of cases per day. Fortunately, city leaders have shown strong leadership and a recognition of the magnitude of the impending crisis.

Here are my assumptions:

  • The NYT piece reviewed ~300 hospitals, or about 5% of the overall number of hospitals in the US. 
  • With simple averages I found that 11.4% of overall beds are ICU beds.
  • In the same NYT piece, I found that 63% of ICU beds are typically occupied.

Looking at the US Department of Homeland Security’s HIFLD data, which covers all hospitals in the US, I applied the averages from the NYT piece.


What I found was that NYC has about 11,000 hospital beds. Applying the 11.4% ratio to those 11,000 beds gives about 1,250 ICU beds. Typical occupancy of 63% accounts for roughly 790 of them, leaving about 460 available ICU beds.
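
For anyone who wants to check the arithmetic above, here is a minimal Python sketch. It uses only the figures quoted in this post (about 11,000 beds, an 11.4% ICU share, 63% typical ICU occupancy); the function name and parameter names are my own illustrative labels, not fields from any dataset.

```python
def available_icu_beds(total_beds, icu_share, icu_occupancy):
    """Estimate unoccupied ICU beds from a total hospital bed count.

    total_beds    -- all hospital beds in the area
    icu_share     -- fraction of beds that are ICU beds (assumed uniform)
    icu_occupancy -- fraction of ICU beds typically in use
    """
    icu_beds = total_beds * icu_share        # estimated ICU beds
    occupied = icu_beds * icu_occupancy      # ICU beds already in use
    return icu_beds - occupied               # ICU beds still free


# Figures quoted in this post for New York City.
nyc_free_icu = available_icu_beds(11_000, 0.114, 0.63)
print(round(nyc_free_icu))
```

Without the intermediate rounding used in the text, this comes out a few beds above 460. The gap is small next to the uncertainty in the inputs, since the sketch assumes national averages apply evenly to every NYC hospital.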

To take a swag at the number of confirmed COVID-19 cases that leads to a redline, start with the percentage of cases that end up in the ICU. From what I’ve found, that’s 5% of total COVID-19 cases. At 9,400 positive cases in NYC, this implies that 470 end up in the ICU, pushing the system beyond its capacity of 460 available ICU beds. As of Sunday morning, I see 6,211 confirmed cases in NYC.
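
The redline estimate can be sketched the same way. This is a rough illustration that assumes the figures quoted in this post (460 available ICU beds, 5% of confirmed cases needing ICU care, 6,211 confirmed cases) and treats every confirmed case as concurrent, which is a simplification. The function names are hypothetical labels of my own.

```python
def expected_icu_patients(confirmed_cases, icu_rate):
    """Expected number of confirmed cases that need an ICU bed."""
    return confirmed_cases * icu_rate


def redline_cases(free_icu_beds, icu_rate):
    """Confirmed-case count at which expected ICU patients equal free ICU beds."""
    return free_icu_beds / icu_rate


FREE_ICU_BEDS = 460   # available ICU beds estimated in this post
ICU_RATE = 0.05       # share of confirmed cases ending up in the ICU
CONFIRMED = 6_211     # confirmed NYC cases as of Sunday morning

threshold = round(redline_cases(FREE_ICU_BEDS, ICU_RATE))
print("break-even cases:", threshold)
print("ICU patients at 9,400 cases:", round(expected_icu_patients(9_400, ICU_RATE)))
print("cases remaining before break-even:", threshold - CONFIRMED)
```

Under these assumptions, the break-even point sits at about 9,200 confirmed cases, so 9,400 cases is already slightly past capacity.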

I hope I’m wrong about our hospitals’ ability to handle these cases. But we should understand what the data is telling us and pay attention to what’s happening in our own communities. If we don’t pay attention and start preparing now, by the time this virus is in our neighborhoods, it will already be too late.

As I mentioned previously, my focus is on the data, not on the medical science. I readily welcome feedback and input from others. If anyone out there can help refine this analysis, I would welcome the chance to talk with you.
