Real-world evidence generation from hybrid claims data: an interview with Seth Kuranz, Forian Inc

1 Sep 2023

Written by Seth Kuranz (Forian Inc)


In this interview, Seth Kuranz, RWE Principal, Forian, Inc., explores the opportunities for real-world evidence (RWE) generation using hybrid claims data. Seth will be speaking on this topic at the forthcoming 5th Edition: Excellence in Real World Evidence and Data Utilization taking place in Philadelphia, PA, USA on January 23–24, 2024.

Please could you introduce yourself and your organization?

My name is Seth Kuranz and I’m an epidemiologist and the RWE Principal at Forian, Inc. I’ve been with Forian since early 2022 supporting our partners and their use of Forian’s hybrid claims data to generate RWE. Before joining Forian, I spent time in academic and industry settings working on a variety of research topics with a focus on real-world data (RWD) and RWE.

I joined Forian because they developed one of the largest integrated repositories of healthcare data in the US. Forian provides SaaS solutions, RWD and RWE analytics, and de-identified healthcare information products. The Forian team includes experienced epidemiologists, data scientists, engineers, and life science executives who implement and oversee the ingestion and curation process for the complex and disparate RWD sources that make up Forian’s data ecosystem.

What is hybrid claims data and why is it being used as a source of RWE?

At Forian, we have a hybrid data ecosystem called CHRONOS. In this case, “hybrid” refers to healthcare data from both closed and open claims sources that have been linked and integrated at the patient level. The de-identified data are combined in a HIPAA-compliant manner, allowing researchers to observe and follow cohorts of patients across different sources.

Forian and our research partners are using hybrid claims data to address the limitations inherent in RWD. RWD is a valuable resource for life science organizations but is most often transactional in nature and collected through administrative processes supporting routine clinical care, which is to say that it’s not collected with research data needs in mind. A single RWD source may have gaps across time, be missing key data elements, and not be fit-for-purpose for many of the questions that life sciences organizations would like to ask. Combining multiple RWD sources, as in a hybrid claims data model, can begin to address these limitations and support the generation of varied RWE.

As an example, we can explore how closed claims data may support health economics and outcomes research measuring progression from chronic to end-stage renal disease. Closed claims data are sourced from healthcare payers, a closed system supporting the assumption that missing data is minimal while patients are enrolled in the payer’s health plan. But closed data sources have limitations. Often, they reflect patient populations covered by commercial health insurance plans or single payers, they have limited follow-up times restricted by patients or patients’ employers switching health insurance plans, and they may be missing out-of-network treatment or fully out-of-pocket payments. Although open claims data sources have their own limitations, they differ from closed data because they are sourced from clearing houses that manage the provider–payer revenue cycle. In the case of measuring end-stage renal disease, supplementing the closed claims with the open claims provides greater insight into outcomes such as dialysis and kidney transplants. These are longer-term outcomes that may be covered by a public insurance option, meaning they would be underestimated in a research study using closed claims alone. Hybrid claims provide a better option to answer questions related to the progression of chronic kidney disease.
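The undercounting described above can be illustrated with a small sketch. All patient tokens, event labels, and records below are hypothetical and invented for illustration; they do not reflect Forian's data or data model. The sketch only assumes that closed and open claims share a de-identified patient token.

```python
# Hypothetical, invented claims records keyed by a de-identified patient token.
closed_claims = [
    {"patient": "p1", "event": "ckd_diagnosis"},
    {"patient": "p1", "event": "dialysis"},
    {"patient": "p2", "event": "ckd_diagnosis"},
    {"patient": "p3", "event": "ckd_diagnosis"},
]
open_claims = [
    {"patient": "p2", "event": "dialysis"},      # outcome seen only in open claims
    {"patient": "p3", "event": "office_visit"},  # no end-stage outcome observed
]

# Assumed outcome definition for end-stage renal disease in this sketch.
OUTCOMES = {"dialysis", "kidney_transplant"}


def patients_with_outcome(claims):
    """Return the set of patient tokens with at least one outcome event."""
    return {c["patient"] for c in claims if c["event"] in OUTCOMES}


closed_only = patients_with_outcome(closed_claims)
hybrid = patients_with_outcome(closed_claims + open_claims)

print(sorted(closed_only))  # patients with an outcome in closed claims alone
print(sorted(hybrid))       # patients with an outcome in the combined data
```

In this toy data, patient p2 reaches dialysis only in the open claims, so a closed-only analysis would miss that outcome. A real study would also need to account for timing, enrollment, and duplicate claims across sources.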

What challenges do researchers face when using hybrid claims data?

Challenges faced by researchers when using hybrid claims data are often related to the complexity of RWD. Claims data, and other RWD sources such as electronic medical record data, differ from sources like randomized controlled trial data, in that RWD is not systematically collected with a research aim in mind. It can take some time and effort on the part of a researcher or data scientist to understand the scope of the data and any pitfalls they may come across while trying to answer a specific question.

When RWD is combined and integrated, such as in the case of hybrid claims data, the complexity only increases. Patients may not be present in both the closed and open sources. When a patient is present in both sources, it may be during different periods of observation. Patients drop out of the data for different reasons, such as a change in health insurance coverage, unemployment, or retirement, or for entirely administrative reasons, such as changes in provider and payer vendor relationships. These are only a few examples of the challenges researchers must grapple with as they think through the best analytic approach to make use of data across multiple sources. However, in an imperfect data world, hybrid claims data and other linked data sources are an improvement.
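One way to make these presence and timing issues concrete is to classify each patient by which sources they appear in and whether their observation periods overlap. The following is a minimal sketch under stated assumptions: the patient tokens, dates, and the idea of a single first-to-last observation window per source are all hypothetical simplifications, not a description of any vendor's data.

```python
from datetime import date

# Hypothetical observation windows per de-identified patient token:
# (first observed date, last observed date) within each source.
closed_windows = {
    "p1": (date(2019, 1, 1), date(2020, 12, 31)),
    "p2": (date(2018, 6, 1), date(2019, 5, 31)),
}
open_windows = {
    "p1": (date(2020, 7, 1), date(2022, 6, 30)),
    "p2": (date(2020, 1, 1), date(2021, 12, 31)),
    "p3": (date(2021, 3, 1), date(2021, 9, 30)),
}


def classify_patient(token, closed, open_):
    """Label a patient by source presence and window overlap."""
    in_closed = token in closed
    in_open = token in open_
    if in_closed and in_open:
        c_start, c_end = closed[token]
        o_start, o_end = open_[token]
        # Two windows overlap when the later start is on or before the earlier end.
        if max(c_start, o_start) <= min(c_end, o_end):
            return "both_overlapping"
        return "both_disjoint"
    if in_closed:
        return "closed_only"
    if in_open:
        return "open_only"
    return "absent"


all_tokens = sorted(set(closed_windows) | set(open_windows))
summary = {t: classify_patient(t, closed_windows, open_windows) for t in all_tokens}

for token, label in summary.items():
    print(token, label)
```

A tabulation like this is only a starting point. It can show how much of a cohort is observable in each source before a researcher decides how to handle gaps in follow-up.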

How can these be overcome?

There is no single solution. If hybrid claims data is being used to increase enrollment in a clinical trial by identifying physicians treating potentially eligible patients, combining more and more data is an asset, and the data may be relatively straightforward to use regardless of whether a physician is present in the closed claims data, open claims data, or both. For many other research aims, however, a researcher cannot simply combine data from different sources and then treat the data as if it came from one data source. Researchers must think through imputation methods, normalization approaches, and bias analyses. Doing so starts to unlock the full potential of using hybrid RWD.

This work can be helped along by a hybrid claims data source that has been curated to support varied and complex research aims. For example, a good hybrid data source will use some form of unified data model, which makes it easier for researchers to work with hybrid claims data, as long as they don’t lose track of how the closed and open claims data are sourced differently.

What are some examples of RWE studies that have used hybrid claims data?

Hybrid claims data is not an entirely new concept. Previous studies that have used hybrid claims or other combinations of RWD sources have often relied on ad hoc linking for a single project or research aim. Although this approach is inefficient, there are examples where it has been done successfully. Moving forward, we are likely to see more studies that capitalize on linked ecosystems that have been created independently of a singular research objective.

What are the future trends in the use of hybrid claims data for RWE generation?

I think the use of linked data overall, whether it is hybrid claims data, or claims datasets combined with electronic medical records, registries, clinical trial data, or other novel data sources, will continue to become more and more common. And of course, we have been discussing US hybrid claims data, but I believe ex-US data sources will also become increasingly available in some form.

Health care in the US is complex and patient care over time will never be completely captured for a patient cohort. Much of what hybrid claims data attempts to do is extend a researcher’s observation of a patient’s journey through a fractured system. Hybrid claims data begins to expand the view of this journey, but we are likely to see future trends toward more and more linkages.

Disclaimer:

The opinions expressed in this feature are those of the interviewee/author alone and do not necessarily reflect the views of The Evidence Base®.