How big data is being used in healthcare today
Uri Laserson, Data Scientist, Cloudera.
21 November 2014
We are being bombarded from all directions with information about
"Big Data" and how it will change our lives, especially as it
pertains to our health. But what is Big Data, and how is it actually
being used in the healthcare system today?
More than anything, Big Data represents a change in perspective.
Organizations have long understood the value in storing and
analysing their data. What makes Big Data different is the
realization that it can be highly valuable to store and analyse
literally every bit of information that can be stored and analysed.
Furthermore, the rise of Big Data has been enabled through the
development of a democratizing ecosystem of tools for processing
such large quantities: Hadoop. In the past, the ability to analyse
petabytes of data was inaccessible to all but the richest, most
sophisticated organizations (like Google). But the rise of cloud
computing and the Hadoop ecosystem have enabled even small groups of
people to record, store, and process petabytes of data.
However, much of the popular discussion around Big Data tends to be
high-level and abstract, so below we describe a few specific uses of
the Big Data mind-set that are actually being developed and deployed.
The first example concerns sepsis, which is an often fatal
condition caused by a strong inflammatory response to an infection,
typically during hospitalization. Early detection of sepsis is
crucial for increasing the chances of survival, and it requires
constant monitoring of patients. While most hospitalized
patients are connected to continuous monitoring systems (including
vital signs like heart rate, oxygen saturation, and body
temperature), this data is generally not stored or analysed in bulk
for the long term.
Today, one large hospital chain is using the Hadoop ecosystem to
collect such continuous monitoring data to train predictive models
for a patient's risk of developing sepsis over the coming hour.
These models are then deployed on another Hadoop system that
continuously scores incoming monitoring data against the model,
generating real-time predictions for each hospitalized patient and
enabling septic patients to be treated as early as possible.
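The scoring half of such a pipeline can be illustrated with a minimal sketch. The weights, intercept, and threshold below are invented for demonstration, not clinically derived, and a real deployment would use a model trained on historical monitoring data rather than hand-set coefficients:

```python
import math

# Hypothetical logistic model mapping an hour of averaged vital signs
# to a sepsis risk probability. These coefficients are illustrative
# only; they are NOT clinically validated.
WEIGHTS = {"heart_rate": 0.04, "spo2": -0.10, "temp_c": 0.9}
BIAS = -30.0  # hypothetical intercept

def sepsis_risk(vitals):
    """Return a risk probability in (0, 1) for one hour of vitals."""
    z = BIAS + sum(WEIGHTS[k] * v for k, v in vitals.items())
    return 1.0 / (1.0 + math.exp(-z))

# A stable patient vs. one trending toward sepsis
# (tachycardia, low oxygen saturation, fever).
stable = {"heart_rate": 72, "spo2": 98, "temp_c": 36.8}
at_risk = {"heart_rate": 128, "spo2": 89, "temp_c": 39.5}

# In a streaming deployment, each patient's latest window would be
# scored like this every hour, and high scores would trigger an alert.
ALERT_THRESHOLD = 0.5
for name, vitals in [("stable", stable), ("at_risk", at_risk)]:
    risk = sepsis_risk(vitals)
    print(name, round(risk, 3), "ALERT" if risk > ALERT_THRESHOLD else "ok")
```

The real systems described above differ mainly in scale: the same per-patient scoring function is applied across an entire hospital's monitoring streams, with Hadoop handling the storage and throughput.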
Another major example concerns health insurers' attempts at
controlling spiralling healthcare costs in the US, where a small
fraction of individuals is responsible for a disproportionately
large fraction of healthcare spending. Preventative health is one of
the pillars of reducing healthcare costs, so insurers are using a
diverse array of data to train predictive models that attempt to
classify which policy-holders will need the most attention.
Specifically, the insurers are using large troves of claims data,
supplemented with anything else they can access, including consumer
data, social media data, and even readings from quantified
self-monitoring devices, such as Fitbits, that they distribute to
policy-holders. By training such classification models, the
insurers can identify and proactively address the patients with the
most need, simultaneously providing better care and reducing costs
to the healthcare system at large.
The number of use cases actively being developed is large and
diverse. EMR companies are storing their medical records in a
centralized Hadoop cluster, allowing for population-wide analyses.
Pharmaceutical companies are instrumenting their vaccine
manufacturing facilities to help improve efficiencies and yields.
Hospitals are tracking large amounts of prescription data to help
fight drug abuse and fraud. Indeed, the attitude that all data can
be valuable coupled with the empowerment of the Hadoop ecosystem is
already having a significant impact on the quality and efficiency of
our healthcare system.