How to Successfully Employ External Comparator Arm Studies Using Real World Data

Although randomised clinical trials (RCTs) are the gold standard for drug approval studies, the shift towards precision medicine has increased the use of single-arm trials (SATs). SATs lack results from control patients. To help contextualise study findings, external comparator arms (ECAs) can therefore be employed; these compile data from external sources, such as patient registries and other medical records. However, methodological issues must be considered carefully to ensure sound study conduct and to minimise potential biases in ECA study designs.

The differences between RCTs and ECAs

By randomly assigning patients to the treatment and control groups, RCTs allow researchers to control potential biases, including the influence of unmeasured variables. In addition, RCTs don’t have to rely on two different data sources, which may have different operational definitions, assessment methods, and measurement timing. However, in certain cases RCTs are not feasible, and this is when SATs can be used.

In precision medicine, which homes in on specific biomarkers among patient populations in the same disease category, SATs allow researchers to evaluate new treatment options for smaller patient cohorts. To increase the validity of SAT findings, researchers are using real world data (RWD) from patients with the same attributes to act as the external comparator arm. However, several factors should be considered before employing ECA studies to ensure the findings are statistically valid.

Importance of sample size in ECAs

As with any research study, sample size plays a critical role in ensuring the study results can be used as valid evidence for treatment efficacy and safety. For ECAs, there are several unique considerations, including whether data for one treatment arm are already available, whether additional conservativeness should be built into the estimated sample size, and how to account for the use of causal inference methods.

Populations and treatment conditions

A detailed description of the ECA population must be documented, including the mechanisms and conditions that led to patients being recorded in the data source. RWD sources usually document only comparator patients who actually received the treatment, not those who were intended to receive it. The RWD population therefore corresponds to a safety analysis population rather than an intention-to-treat (ITT) population. It is recommended to compare like with like in terms of analysis populations as well.

A second consideration is identifying the target population to which the treatment effect should be standardised when estimating marginal treatment effects. This means choosing the estimand, since each estimand describes the treatment effect in a different target population: the average treatment effect (ATE) refers to the overall patient population, while the average treatment effect on the treated (ATT) or on the untreated (ATU) refers to the treated or untreated population, respectively. The average treatment effect in the overlap population (ATO) is a further possible estimand, which focuses on the internal validity of the treatment comparison.
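As a minimal sketch of how these estimands differ in practice, the Python function below returns the balancing weight commonly associated with each estimand when propensity score weighting is used. The function name and arguments are illustrative assumptions, the propensity score is taken as already estimated, and the article does not prescribe this or any other particular implementation.

```python
def estimand_weight(treated: int, ps: float, estimand: str) -> float:
    """Return the weight of one patient under propensity score weighting.

    treated:  1 for the SAT (treatment) arm, 0 for the external comparator arm.
    ps:       estimated probability of belonging to the treatment arm
              (assumed to be given; estimating it is not shown here).
    estimand: one of "ATE", "ATT", "ATU", "ATO".
    """
    if treated not in (0, 1):
        raise ValueError("treated must be 0 or 1")
    if not 0.0 < ps < 1.0:
        raise ValueError("propensity score must lie strictly between 0 and 1")

    if estimand == "ATE":   # target: overall population
        return 1.0 / ps if treated else 1.0 / (1.0 - ps)
    if estimand == "ATT":   # target: treated population
        return 1.0 if treated else ps / (1.0 - ps)
    if estimand == "ATU":   # target: untreated population
        return (1.0 - ps) / ps if treated else 1.0
    if estimand == "ATO":   # target: overlap population
        return 1.0 - ps if treated else ps
    raise ValueError(f"unknown estimand: {estimand!r}")
```

Under the ATT, for example, treated patients keep a weight of 1 while comparator patients are reweighted to resemble them, which is often a natural choice when the SAT population is the population of interest.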

As with the study populations, the treatment conditions for both the treatment and comparator groups must be clearly described. The treatment group description will be available through the clinical study protocol, and the conditions in the RWD comparator group need to be described in the same way (e.g. eligible dosages, how the drug is administered, and frequency of drug intake). Because the controlled setting of the SAT may lead to longer exposure time in the treatment group than in the comparator group, researchers need to evaluate whether requiring a minimum number of treatment cycles or a minimum exposure time in both groups would help to address exposure differences between the data sources, potentially as a sensitivity analysis.
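The minimum-exposure idea can be sketched as a simple filter that applies the same threshold to both arms. The record layout, field names, and patient values below are hypothetical and serve only to illustrate the mechanics of defining such a sensitivity analysis set.

```python
def apply_minimum_exposure(patients, min_cycles):
    """Keep only patients with at least `min_cycles` treatment cycles.

    The same threshold is applied to SAT and ECA patients alike, so that
    both arms are restricted in the same way.
    """
    return [p for p in patients if p["cycles"] >= min_cycles]


# Hypothetical example records (not real data).
patients = [
    {"id": "SAT-01", "arm": "SAT", "cycles": 6},
    {"id": "SAT-02", "arm": "SAT", "cycles": 1},
    {"id": "RWD-01", "arm": "ECA", "cycles": 2},
    {"id": "RWD-02", "arm": "ECA", "cycles": 4},
]

# Sensitivity analysis set: at least two cycles in either arm.
sensitivity_set = apply_minimum_exposure(patients, min_cycles=2)
```

The main analysis would typically still use the unrestricted population, with the restricted set used to check whether conclusions change.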

Baseline and endpoint considerations

Prior to data collection, baseline definitions must be established to ensure that the variable definitions in both groups are as similar as possible. The index date, meaning the date from which each patient's follow-up is counted, should be defined as the date of treatment initiation, not the enrolment date, to create consistent definitions across data sources.
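A brief sketch of this index date convention, with hypothetical field names and dates: follow-up is counted from treatment initiation in both data sources, and the SAT enrolment date is deliberately ignored because RWD sources have no equivalent.

```python
from datetime import date


def follow_up_days(treatment_start: date, event_or_censor_date: date) -> int:
    """Days of follow-up counted from the index date (treatment initiation)."""
    if event_or_censor_date < treatment_start:
        raise ValueError("event or censoring date precedes the index date")
    return (event_or_censor_date - treatment_start).days


# Hypothetical patients (not real data). The SAT patient has an enrolment
# date, but it is not used as the index date.
sat_patient = {
    "enrolment": date(2021, 3, 1),
    "treatment_start": date(2021, 3, 15),
    "event": date(2021, 9, 15),
}
rwd_patient = {
    "treatment_start": date(2021, 3, 15),
    "event": date(2021, 9, 15),
}

sat_days = follow_up_days(sat_patient["treatment_start"], sat_patient["event"])
rwd_days = follow_up_days(rwd_patient["treatment_start"], rwd_patient["event"])
```

Had the enrolment date been used for the SAT patient, the two identical treatment courses would have yielded different follow-up times.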

Inclusion and exclusion criteria must also be defined as consistently as possible between the treatment and comparator groups. For example, the definition of lines of treatment in site-based RWD sources is based on the physician's clinical judgment, which may differ from the judgment of physicians at other sites and from the algorithm used in the SAT. To create more consistency across all datasets, it is necessary to check, and often reclassify, lines of treatment and potentially other baseline data.

The measurement of endpoints may also differ between RWD and SATs, for example in the timing of assessments. In oncology studies, for instance, progression of the disease in RWD may not be assessed using established classification rules, and some RWD sources may not even allow such rules to be applied. Because RW assessments of progression are typically less strict than those in SATs, bias may be introduced, potentially in favour of the SAT drug. One option for validating RWD progression assessments is a blinded central review, in which reviewers do not know which group the data came from.

Analysis considerations

Propensity score (PS) models are often used to analyse non-randomised data, but many alternative methods are available, for example doubly robust estimation, g-computation, non-PS weighting, or non-PS matching. A combination of approaches is also possible, e.g. performing PS weighting or g-computation after an initial matching step.
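To illustrate one of the non-PS alternatives, the sketch below implements g-computation in its simplest form: a saturated outcome model on a single categorical covariate, standardised to the pooled population to estimate the ATE. The function name and record layout are assumptions made for illustration; a real analysis would normally use a regression model with several covariates.

```python
from collections import defaultdict


def g_computation_ate(records):
    """Estimate the ATE by standardising stratum-specific outcome means.

    records: iterable of (treated, stratum, outcome) tuples, where
             treated is 1 (SAT arm) or 0 (external comparator arm).
    """
    outcome_sum = defaultdict(float)   # keyed by (treated, stratum)
    arm_count = defaultdict(int)       # keyed by (treated, stratum)
    stratum_count = defaultdict(int)   # keyed by stratum

    for treated, stratum, outcome in records:
        outcome_sum[(treated, stratum)] += outcome
        arm_count[(treated, stratum)] += 1
        stratum_count[stratum] += 1

    total = sum(stratum_count.values())
    ate = 0.0
    for stratum, n in stratum_count.items():
        n_treated = arm_count[(1, stratum)]
        n_control = arm_count[(0, stratum)]
        if n_treated == 0 or n_control == 0:
            # No overlap in this stratum: the effect is not identifiable here.
            raise ValueError(f"stratum {stratum!r} lacks patients in one arm")
        mean_treated = outcome_sum[(1, stratum)] / n_treated
        mean_control = outcome_sum[(0, stratum)] / n_control
        # Weight each stratum-specific difference by the stratum's share
        # of the pooled population.
        ate += (n / total) * (mean_treated - mean_control)
    return ate
```

The explicit check for empty arms within a stratum reflects the overlap requirement: where only one arm is represented, the comparison rests on extrapolation rather than data.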

Missing values and unmeasured covariates are central challenges of the ECA design, and effective handling of missing baseline covariate data is essential. Various sophisticated methods are available, and sensitivity analyses should be applied to check the robustness of the results.

Discussion

In general, RCTs are the optimal approach to clinical research and drug development. However, if an RCT is not feasible, for example because patient populations are too small when studying ultra-rare diseases, SATs with ECAs that use real world data can be a powerful tool to help identify better treatments for patients.

To obtain statistically valid results, it is critical to take factors such as sample size, population and treatment conditions, baseline measures, and endpoint comparisons into consideration when designing and analysing a study. Beyond these general points, a thorough case-by-case review is key to the successful use of ECAs, as each disease type and data source brings its own challenges in the research process.

About the author

Dr. Gerd Rippin, Director Biostatistics at IQVIA, received his bachelor’s degree in Statistics in 1995 from the University of Dortmund, Germany, and his PhD in 1999 from the University of Mainz, Germany. He spent most of his career within the CRO business, including working as a contractor and directly within the pharmaceutical industry. Dr. Rippin is a very experienced Biostatistician with over 20 years of expertise in applying statistical methods to clinical studies. His experience includes various indications and phases of medical research, and he has a special interest in External Comparator Arm (ECA) studies and in applying complex real-world statistical methodology in general.