Advanced Artificial Intelligence Helping to Track the Progression of Geographic Atrophy


Verana Health

Until recently, the population of patients with geographic atrophy (GA), an advanced form of age-related macular degeneration that can lead to vision loss, and the progression of their disease were incredibly difficult to track using real-world evidence. The main reason was the absence of FDA-approved treatments; with no approved therapy, the condition was not always coded in electronic health record (EHR) systems.

Now, with two FDA-approved therapies on the market and advances in artificial intelligence (AI)-driven analytics that can build predictive models from near-real-time patient activity, it’s possible to track detailed healthcare journeys and better understand this previously difficult-to-identify population.

Extracting Key Insights with AI-powered Techniques

Advanced AI-powered natural language processing (NLP) models, and an exclusive data partnership with the American Academy of Ophthalmology IRIS® Registry (Intelligent Research in Sight), have made it possible for Verana Health to:

  • Ingest massive volumes of data from multiple sources (e.g., EHRs and images)
  • Identify patterns in the data
  • Generate inferences based on pattern recognition 

The IRIS Registry is an 11-year longitudinal database that includes outpatient EHR data on nearly 80 million de-identified patients from 15,000 contributing clinicians. These patient-specific data (found in structured and semi-structured EHR fields, unstructured EHR clinical notes, and imaging data) capture many aspects of the patient experience. By applying AI-driven rules-based NLP models to real-world data (RWD) for GA, we can codify signals for disease progression on a nationwide scale and find new ways to evaluate the real-world patient journey, as well as the usage and effectiveness of new treatments.

To extract meaningful insights from the data, Verana Health applies AI-powered NLP and machine learning (ML) models to analyze patterns of language in unstructured clinical notes, or lesion details in ophthalmic images, that signal key milestones and clinical events in the patient experience. Most importantly, Verana Health’s team of clinical experts, which includes experienced ophthalmologists with deep expertise in data-driven research, is continually training, testing, and establishing rules for how this unstructured data is cataloged and categorized to make it useful in the real world. This approach also includes robust testing and validation against comprehensive quality metrics before any model moves into production.

Verana Health is unique among healthcare data and analytics providers in its ability to analyze this depth and breadth of RWD at scale. While some companies have set out to manually parse clinical notes for insights, and others have tried to automate the entire process, Verana Health is the only company of its kind to model patterns of language in this manner using EHR data captured in the IRIS Registry.

Utilizing Machine Learning to Understand GA Prevalence

Let’s closely examine GA to better understand how Verana Health’s approach to unstructured data curation works in the real world. As mentioned earlier, GA was not always recorded in EHRs using structured, standardized codes. We know this because we’ve tracked it. When we identified GA patients using standard ICD-10 codes alone, the search yielded 330,000 patients. After applying ML to the unstructured clinical notes, we uncovered a significant undercount: an additional 476,000 patients, which expanded our total cohort to over 810,000 patients.
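
As a rough illustration of how a coded cohort and a notes-derived cohort can be combined, here is a minimal Python sketch. It is not Verana Health's pipeline. The code pattern (H35.31x3 and H35.31x4, the ICD-10-CM codes for advanced atrophic nonexudative AMD) is an assumption, since the article does not list the codes it used, and the function names and record layout are hypothetical.

```python
import re

# Assumption: GA is identified by ICD-10-CM codes H35.31x3 / H35.31x4
# (advanced atrophic nonexudative AMD). The article does not list its codes.
GA_CODE = re.compile(r"^H35\.31[0-3][34]$")


def structured_cohort(diagnoses):
    """Return the patient IDs that carry a GA diagnosis code.

    diagnoses: iterable of (patient_id, icd10_code) pairs (hypothetical layout).
    """
    return {pid for pid, code in diagnoses if GA_CODE.match(code)}


def combined_cohort(diagnoses, note_flagged_ids):
    """Merge coded patients with patients found only through clinical notes.

    note_flagged_ids: patient IDs that a notes model flagged as having GA.
    Returns (coded, notes_only, total) as sets of patient IDs.
    """
    coded = structured_cohort(diagnoses)
    notes_only = set(note_flagged_ids) - coded  # patients the codes missed
    return coded, notes_only, coded | notes_only


diagnoses = [("p1", "H35.3113"), ("p2", "H35.3211"), ("p3", "H35.3124")]
coded, notes_only, total = combined_cohort(diagnoses, {"p1", "p4"})
print(len(coded), len(notes_only), len(total))
```

Taking the set difference before the union keeps the "additional patients" count free of anyone already captured by a structured code.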

It’s also important to note that the drugs to treat GA were approved only recently. Product-specific J-codes often take some time to be assigned and adopted, so these treatments may take time to appear in medical claims data. They can, however, often be identified earlier in EHR data by mining the clinical notes for non-specific J-codes and overlaying other context on the treatment. Because many of the key variables used to identify GA and chart its progression come from ophthalmic images and data derived from unstructured fields in clinical notes, capturing a comprehensive snapshot of the patient population has required the integration of several data types.
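
The idea of pairing a non-specific J-code with context from the note can be sketched as follows. This is an illustration under stated assumptions, not the method described in the article: the two unclassified HCPCS codes (J3490, J3590), the drug names and brand names used as search terms, and the record fields `hcpcs_code` and `note_text` are all assumptions chosen for the example.

```python
# Assumption: J3490 (unclassified drugs) and J3590 (unclassified biologics)
# stand in for the "non-specific J-codes" mentioned in the text.
NONSPECIFIC_JCODES = {"J3490", "J3590"}

# Assumption: search terms for the two approved GA therapies; the article
# does not name them.
DRUG_TERMS = {
    "pegcetacoplan": ("pegcetacoplan", "syfovre"),
    "avacincaptad pegol": ("avacincaptad", "izervay"),
}


def infer_treatment(record):
    """Return the GA drug implied by a record, or None.

    record: dict with hypothetical keys 'hcpcs_code' and 'note_text'.
    A drug is inferred only when a non-specific J-code appears together
    with a drug mention in the accompanying note.
    """
    if record["hcpcs_code"] not in NONSPECIFIC_JCODES:
        return None
    text = record["note_text"].lower()
    for drug, terms in DRUG_TERMS.items():
        if any(term in text for term in terms):
            return drug
    return None


record = {"hcpcs_code": "J3590", "note_text": "Intravitreal Syfovre given OD."}
print(infer_treatment(record))
```

Requiring both signals is the point of the overlay: the J-code alone says that some unclassified drug was billed, and the note supplies which one.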

The key to finding missing patients and details around disease prevalence, as well as progression, is fine-tuning NLP models to flag keywords and patterns of language consistent with certain clinical cues. For example, we’ve developed ML models that leverage keywords in clinical notes, such as “visual acuity deteriorating” or “subfoveal involvement.” These keywords can provide critical clues that signal disease progression but would not show up in the traditional structured data fields of an EHR and are not captured in medical claims.
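
A toy version of keyword flagging, using the two phrases quoted above, might look like the sketch below. It is a deliberately simple rules-based stand-in and not the models described in the article; the negation cue list and the 40-character look-back window are assumptions made for illustration.

```python
import re

# Phrases quoted in the article as examples of progression cues.
PROGRESSION_PHRASES = ["visual acuity deteriorating", "subfoveal involvement"]

# Assumption: a small negation cue list. The pattern matches a cue that is
# not followed by a sentence break before the end of the look-back window.
NEGATION_CUES = re.compile(r"\b(no|not|without|denies)\b[^.;]*$", re.IGNORECASE)


def flag_progression(note, window=40):
    """Return the progression phrases found in a note and not negated."""
    hits = []
    for phrase in PROGRESSION_PHRASES:
        for match in re.finditer(re.escape(phrase), note, re.IGNORECASE):
            # Inspect the text just before the phrase for a negation cue.
            preceding = note[max(0, match.start() - window):match.start()]
            if not NEGATION_CUES.search(preceding):
                hits.append(phrase)
                break  # one non-negated mention is enough for this phrase
    return hits


print(flag_progression("Pt reports visual acuity deteriorating OD."))
print(flag_progression("GA OS without subfoveal involvement."))
```

The negation check matters for this vocabulary in particular, because "without subfoveal involvement" is itself a common clinical description.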

Other key variables involved in identifying and tracking GA progression are images. Ophthalmology is unique in its heavily standardized use of images to track key variables such as lesion size, location and growth rate, total number of lesions, and other criteria. Verana Health has tens of thousands of high-quality images for more than 2,000 patients with GA, allowing us to combine both highly structured and unstructured datasets to better understand patient-specific disease progression. These images can be utilized to train AI models that can identify GA disease progression at scale.
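
To make the notion of lesion growth rate concrete, here is a minimal sketch that computes an annualized rate from lesion areas measured on serial images. The input layout, a list of (date, area in mm²) pairs, and the simple first-to-last difference are assumptions for illustration; the article does not describe how its image-derived measurements are stored or modeled.

```python
from datetime import date


def annual_growth_rate(measurements):
    """Return lesion growth in mm^2 per year between first and last image.

    measurements: list of (date, lesion_area_mm2) pairs (hypothetical layout).
    """
    if len(measurements) < 2:
        raise ValueError("need at least two dated measurements")
    ordered = sorted(measurements)  # sort chronologically by image date
    (first_date, first_area), (last_date, last_area) = ordered[0], ordered[-1]
    years = (last_date - first_date).days / 365.25
    if years <= 0:
        raise ValueError("measurements must span more than one day")
    return (last_area - first_area) / years


series = [(date(2023, 1, 1), 6.0), (date(2022, 1, 1), 4.0)]
print(round(annual_growth_rate(series), 2))
```

A first-to-last difference ignores intermediate visits; fitting a line through all measurements would be a natural refinement when more than two images are available.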

Bringing Precision and Scale to Unlocking a Comprehensive View of GA

Real-world evidence (RWE) can bring both precision and scale to our understanding of disease prevalence and progression in the real-world clinical environment. However, without the ability to scrutinize all aspects of that RWE, analyses that use only ICD-10 codes in EHRs or medical claims are likely missing large swaths of the patient population. The only way to truly know you’re capturing the complete patient population and journey is to work with a source (i.e., structured, semi-structured, unstructured, and imaging data) that most closely represents the clinical interpretation of the patient’s condition and experience.

By training algorithms based on rules and nuanced interpretations that are developed and continually refined by practicing clinicians, we are able to deliver the most data on the largest universe of patients, and with the best insights into what’s really happening at each step of the healthcare journey. 

To learn about Verana Health’s Qdata Geographic Atrophy, and how it can unlock critical signals and spotlight important trends, click here.
