Coronavirus outbreak — five questions to ask big data

Let's explore the scale of the largest quarantine ever attempted in human history using big data. Zeming Yu uses Baidu's data platforms to visualise how the coronavirus outbreak has changed people's movement.

Update on 18 Feb 2020

This article was originally written on 31 January. Since then the scale of the outbreak has led to intensified measures to control the spread. For example, the residential community I live in has recently adopted closed-off management. I was given the entry permit below. Anyone who's not registered is not allowed to enter.

Back of the entry permit to my residential community in Beijing

The situation is similar in many cities across China. In some places, residents are given a quota for grocery shopping, e.g. once every two days. In Beijing and many other cities, people who arrive from outside the city need to observe a compulsory 14-day quarantine period. Schools have been closed, events canceled, and most companies are adopting a work-from-home policy.

Indeed, we are experiencing the largest quarantine ever attempted in human history! If you don't believe it, just look at the graph below, which shows the year-on-year comparison of total traffic flow at the top 30 traffic hubs (airports and train stations) across China. Red is this year and green is last year.

Traffic flow index for top 30 traffic hubs across China, year on year comparison

Let's hope that such strong measures can effectively combat the spread of this highly contagious disease.

As I write this article, the total number of coronavirus cases on the Diamond Princess cruise ship has jumped to 454, highlighting the need for quarantine.

Original article published on 1 Feb 2020

Baidu Maps is used 120 billion times a day around the world. Drawing on this geospatial data, Baidu launched a data visualization platform called Baidu Qianxi ("qianxi" means "to migrate"). Together with its search engine data, Baidu knows a lot about what's happening.

The coronavirus outbreak and the resulting lockdown in Wuhan and surrounding cities have caused major disruption to people's lives. What answers could Baidu provide us based on big data?

1. What's the traffic like in Wuhan right now?

In Wuhan, the highway traffic jam distance, a measure of traffic activity, is down by 99.7% compared to the same period last year (red vs green line). The lockdown is real and ongoing.

Traffic jam distance (km) year on year comparison - Wuhan

For comparison, here's the same graph for Beijing - 'only' down by about 50%.

Traffic jam distance (km) year on year comparison - Beijing
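
For readers who want to reproduce this kind of comparison, here is a minimal sketch of the year-on-year percentage change behind figures like "down by 99.7%". The function name `yoy_change_pct` and the input values are assumptions made for illustration; the numbers are picked to reproduce the percentages quoted above and are not read from Baidu's charts.

```python
def yoy_change_pct(this_year: float, last_year: float) -> float:
    """Percentage change of this year's value relative to last year's."""
    if last_year == 0:
        raise ValueError("last year's value must be non-zero")
    return (this_year - last_year) / last_year * 100.0


# Illustrative values only (hypothetical traffic jam distances in km),
# chosen to match the percentages quoted in the article.
wuhan = yoy_change_pct(this_year=0.3, last_year=100.0)
beijing = yoy_change_pct(this_year=50.0, last_year=100.0)
print(f"Wuhan: {wuhan:.1f}%, Beijing: {beijing:.1f}%")
```

A negative result means a drop relative to the same period last year.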

Traditionally, now is the time for people to come back to the big cities and start working. As we are still in the middle of this massive outbreak, there is a risk the traffic flow could further spread the virus.

Governments across China have either extended the public holiday or asked employers to arrange for their staff to work from home. Could this be the year of "working from home" for China?

2. What's the traffic like in Wuhan before and after the lockdown?

On 23 January, Wuhan suspended all public transportation from 10 a.m. onwards, including all bus, metro and ferry lines. Additionally, all outbound trains and flights were halted.

We can see from the graphs below that traffic flow dropped drastically after the announcement on 23 January, but the traffic control wasn't fully effective until 26 January.

Wuhan outbound traffic flow index (yellow = this year, white = last year)

Wuhan inbound traffic flow index (yellow = this year, white = last year)

Again, now should be the peak time for inbound travel into Wuhan city after the Chinese New Year, but this year is very different.
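
The observation that the traffic control only became fully effective a few days after the announcement can be made more systematic. The sketch below is one possible approach, not how Baidu Qianxi works: it finds the first day from which a daily index stays below a chosen fraction of a baseline. The function name `first_sustained_drop`, the 10% threshold, and all index values are assumptions made up for illustration; they are not taken from the charts.

```python
from datetime import date
from typing import Mapping, Optional


def first_sustained_drop(
    index_by_day: Mapping[date, float],
    baseline: float,
    fraction: float = 0.1,
) -> Optional[date]:
    """Return the first day from which the index stays below
    fraction * baseline until the end of the series, or None."""
    threshold = baseline * fraction
    result = None
    # Walk backwards from the latest day; stop at the first day
    # that is not below the threshold.
    for day in sorted(index_by_day, reverse=True):
        if index_by_day[day] < threshold:
            result = day
        else:
            break
    return result


# Made-up index values for illustration only.
outbound = {
    date(2020, 1, 22): 11.0,
    date(2020, 1, 23): 9.0,
    date(2020, 1, 24): 4.0,
    date(2020, 1, 25): 2.0,
    date(2020, 1, 26): 0.8,
    date(2020, 1, 27): 0.6,
}
print(first_sustained_drop(outbound, baseline=10.0))
```

Requiring the index to stay below the threshold, rather than dip below it once, avoids flagging a single quiet day as the start of the lockdown effect.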

3. For those that left Wuhan just before the lockdown, where did they go?

The graph below shows that most of them went to other cities within Hubei province. It's no surprise that almost the entire province was under lockdown only a few days after Wuhan.

Outbound traffic flow from Wuhan on 22 January

For reference, here's a table from Wikipedia about the impact of the traffic ban:

Image

4. What can we learn from Baidu search keyword trend?

Blue: coronavirus, Green: Wuhan, Orange: mask

The number of searches for 'coronavirus' (blue line) and 'Wuhan' (green line) took off around 19 January.

On 23 January, the number of searches for 'Wuhan' had another massive increase after the announcement of the traffic ban.

The number of searches for 'mask' (orange line) increased slowly, driven by wide publicity that wearing a mask is one of the best measures people can take to avoid spreading the virus.

5. What's the context of the keyword 'Wuhan' before and after the lockdown?

Baidu's 'needs graph' shows how strongly related keywords are correlated with a search term over time. As time passes, the context of the keyword also changes, which gives us insight into what people were thinking at the time.

During December, 'Wuhan' tended to be correlated with other cities in China (e.g. Changsha, Chengdu) or favorite travel destinations within Wuhan (e.g. Hubuxiang, Wuhan's famous breakfast alley), indicating tourism-related interests.

Image

Two weeks later, the word 'SARS' appeared in the graph for the first time, probably because people suspected a SARS outbreak and didn't yet know what to call the new virus.

Image

Fast forward to late January, and some of the most correlated keywords were 'Wuhan lockdown', 'Wuhan pneumonia' and even 'Huanan Seafood Market', which is where most people were exposed to the coronavirus in the early days of the outbreak.

Image

In the age of big data, we can learn a lot about people's travel patterns and search patterns using data providers like Baidu. None of this was available during the 2003 SARS outbreak. The extra data enables people to make more informed decisions this time and plays a pivotal role in the fight against the virus outbreak.

Tools used in the article

Baidu Qianxi is a big data product provided by Baidu which allows you to monitor real-time migration across China during the Chinese New Year.

Baidu keyword search and needs graph can be found in Baidu Index.

Both tools are purely point-and-click. Just follow the link and select the relevant cities that you'd like to monitor. Unfortunately, there is no English version of these tools. If you need to do some research in this area and don't know Chinese, maybe it's a good idea to find someone who knows the language to assist you.

Original article on Medium 

About the authors
Zeming Yu
Zeming Yu is an actuary with 20 years of experience across Australia, New Zealand, and China, specialising in the intersection of actuarial science, data science, and AI. He is currently Senior Manager, Actuarial and Analytics at Zurich Financial Services Australia, supporting data-driven innovation and AI initiatives in insurance. Previously, Zeming was Director of Data Science at LexisNexis Risk Solutions, leading analytics projects for the Chinese motor insurance market. He has also held data science and actuarial roles at Munich Re, Cover More Travel Insurance, and IAG. Zeming’s unique blend of actuarial, data science, and AI expertise enables him to drive strategic, analytics-led decision-making and foster innovation in the insurance sector.