Harnessing Healthcare’s Data Explosion With AI-Based Natural Language Processing

David Lareau is CEO of Medicomp Systems, a provider of physician-driven point-of-care solutions that fix EHRs.



In 2020, the amount of healthcare data created globally was an estimated 2,314 exabytes, an unfathomable amount when you consider that a single exabyte is equivalent to one billion gigabytes. While it may be hard to wrap one’s head around such a figure, this much is clear: To make sense of healthcare’s ever-growing volumes of data, we need advanced technologies, such as artificial intelligence (AI)-based tools, to enhance user productivity and minimize burdensome searches.

One of the most promising AI technologies to help manage huge volumes of data is natural language processing (NLP). NLP is a branch of linguistics, computer science and AI that enables computers to read, understand and structure large volumes of human prose (i.e., natural language). Though NLP has been around for decades, the explosion of healthcare data in recent years has made it an increasingly valuable tool for interpreting and filtering all types of medical text to make the data more useful to clinicians, researchers, payers and other stakeholders.

Weeding Through An Abundance Of Data

As I discussed in a previous post, more data is not always better when it comes to helping clinicians make well-informed decisions that improve patient care and outcomes. Consider a typical patient chart, which may include a mix of physician-dictated progress notes from multiple specialists, scanned lab and test results, and structured data from clinical documentation. When data is dumped into a patient’s chart from disparate sources and not stored in a structured format, clinicians must manually search through page after page of data to find the problem-specific information they need for clinical decision making.

When NLP tools are applied to the patient data, however, the computer can automatically read the text-based information in real time, interpret the meaning of the words based on the context and transform the details into a structured format that is easy to search and extract insights from at the point of care. Advanced NLP engines can also recognize when multiple terms refer to the same or a similar concept, such as “broken wrist” and “fractured carpal,” and then codify the like terms with standard or custom ontologies.
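To make the idea of codifying like terms concrete, here is a minimal Python sketch. It is only a toy dictionary lookup, not how a production NLP engine works: real engines also weigh context, negation and spelling variants, and they map to established ontologies. The phrase list, the `CONCEPTS` table, the concept ID and the `extract_concepts` function are all hypothetical names invented for this illustration.

```python
import re

# Hypothetical mini-ontology: surface phrases mapped to a made-up concept ID.
# A real system would map to a standard or custom clinical ontology instead.
CONCEPTS = {
    "broken wrist": "CONCEPT_WRIST_FRACTURE",
    "fractured carpal": "CONCEPT_WRIST_FRACTURE",
}


def extract_concepts(note: str) -> list[dict]:
    """Return structured matches for every known phrase found in a free-text note."""
    found = []
    for phrase, concept_id in CONCEPTS.items():
        # Whole-phrase, case-insensitive match against the original text.
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(note):
            found.append(
                {
                    "concept": concept_id,
                    "text": match.group(0),
                    "start": match.start(),
                }
            )
    # Order matches by where they appear in the note.
    return sorted(found, key=lambda item: item["start"])


if __name__ == "__main__":
    notes = [
        "Patient fell while skating; X-ray shows a broken wrist.",
        "History of fractured carpal, left hand.",
    ]
    for note in notes:
        print(extract_concepts(note))
```

Both sample notes resolve to the same concept ID even though the wording differs, which is what makes the structured result searchable in a way the raw text is not.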


Similarly, clinical researchers can use NLP in place of manual reviews of lengthy physician narratives or lab results to accelerate research, drug discovery or therapy development. For example, NLP can glean insights from unstructured medical text to identify cohorts for clinical trials, understand disease progression or assess the efficacy of different therapies.

More Data Requires Better Search And Filtering Tools

The volume of healthcare data is expected to grow exponentially in the coming years — at a compound annual growth rate (CAGR) of 36% through 2025. The growth reflects the impact of accelerating volumes of data from patient-record outcomes, imaging technologies, remote patient monitoring devices, digital transcription and AI-enabled ambient listening tools that capture complete conversations in exam and operating rooms — which Microsoft is betting big on, as evidenced by its recent purchase of Nuance Communications for $19.7 billion.
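For a sense of what a 36% compound annual growth rate means in practice, the short Python sketch below compounds a starting volume year over year. Applying the rate to the 2020 estimate of 2,314 exabytes, and treating 2020 as the base year, are assumptions made purely for illustration; the `project_volume` function is a hypothetical helper, and the output is not a forecast.

```python
def project_volume(start_volume: float, annual_rate: float, years: int) -> float:
    """Compound a starting volume at a fixed annual growth rate."""
    return start_volume * (1 + annual_rate) ** years


# Assumptions for illustration only: the 2020 estimate as the base,
# and the 36% CAGR applied to each following year through 2025.
START_YEAR = 2020
START_EXABYTES = 2314
CAGR = 0.36

if __name__ == "__main__":
    for year in range(START_YEAR, 2026):
        volume = project_volume(START_EXABYTES, CAGR, year - START_YEAR)
        print(f"{year}: {volume:,.0f} exabytes")
```

Under those assumptions, five years of compounding leaves the 2025 figure at more than four times the 2020 one, which is why search and filtering tools matter more each year.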

Ambient listening technology, along with speech recognition tools that transform conversation into text, holds great promise for capturing detailed information about a patient’s health status. The challenge, however, is that providers, payers and researchers don’t have time to read lengthy transcriptions to find critical insights. As the volume of healthcare data continues to explode, tools that can search massive amounts of text and deliver key insights to stakeholders will become essential for the effective delivery of quality patient care and for minimizing clinician burnout. The technologies that persevere will be those that filter all the data for clinical relevancy at the point of care.

The Data/Burnout Connection

Clinician burnout is a longstanding issue that continues to plague healthcare. Excessive bureaucratic demands and long hours are top contributors to clinician frustrations that lead to career dissatisfaction, depression and, tragically, even suicide.

By creating efficiencies that minimize tedious searches and deliver the specific information users need when they need it, NLP tools let clinicians spend less time on the computer and more time interacting with patients. As healthcare data volumes continue to grow, healthcare leaders must embrace NLP and other advanced technologies to minimize clinician frustrations and give users ready access to relevant information. The systems of the future need tools that let clinicians quickly get clinically relevant, patient- and diagnosis-specific information. With the right tools, clinicians can spend less time on activities that fuel burnout and more time focused on care delivery and optimal patient outcomes.
