Abstracts

2025 Annual Symposium on Risks and Opportunities of AI in Pharmaceutical Medicine

Day 1 Tutorial 1

Title:

“Large Scale Processing and Analysis of Wearable Device Data”
Lukas Adamowicz, Pfizer Inc.
Yiorgos Christakis, Pfizer, Inc.

Abstract:

 This tutorial will cover the use of Scikit Digital Health (SKDH) as a comprehensive framework for setting up scalable processing pipelines for wearable device data. Wearable devices generate vast amounts of data that require efficient processing and analytics to extract digital measures of health (e.g., metrics of physical activity, sleep, gait, etc.). SKDH provides a robust and flexible library of tools to handle this data at scale, enabling researchers and practitioners to streamline their workflows. Participants will learn how to leverage SKDH to load, preprocess, and analyze wearable data, with practical examples. By the end of this tutorial, attendees will be equipped with the knowledge to implement scalable data processing pipelines using SKDH, enhancing their ability to derive digital measures of health from wearable device data.

Day 1 Tutorial 2

Title:

“TMLE in Drug Development”
Susan Gruber, TL Revolution
Hana Lee, Food and Drug Administration
Mark van der Laan, UC Berkeley

Abstract:

Artificial intelligence and machine learning (AI/ML) is transforming the field of medical product development and evaluation. AI/ML tools are driving innovative solutions to accelerate processes, improve efficiency, and reduce costs, however naive use could result in biased findings that yield low quality evidence. The U.S. Food and Drug Administration (FDA) has developed regulatory frameworks and guidances to ensure the safe and effective integration of AI/ML. Targeted Learning provides a particularly useful framework for integrating AI/ML in the planning, design, and analysis of randomized controlled trials (RCTs) and real-world evidence (RWE) studies.

This course will first provide an overview of FDA’s perspectives on the use of AI/ML for drug development and evaluation, describing various uses of AI/ML across the full spectrum of drug development processes, landscape assessment of regulatory submissions, and available guidance and resources from FDA along with other FDA initiatives.

Targeted Learning (TL), a subfield of statistics that combines statistical theory, ML, and causal inference, provides a framework for addressing the many challenges to efficient unbiased estimation in a regulatory context. The course will present the Targeted Learning (TL) estimation roadmap for causal inference, with applications to both RCTs and RWE studies. The roadmap offers step-by-step guidance to specifying key design components including the target causal estimand, statistical estimand and identifying assumptions, required data, estimation, sensitivity analysis and interpretation. AI/ML solutions can be incorporated throughout, e.g., for outcome identification, power calculations, and setting tuning parameters for pre-specified analyses of primary and secondary outcomes.

Targeted maximum likelihood estimation (TMLE) is an efficient, double robust estimator that is suitable for analyses of point treatment, longitudinal, and survival data. TMLE is combined with super learning (SL), an ensemble of various ML algorithms, to provide superior causal inference when compared with traditional parametric methods, or the use of AI/ML alone. TMLE+SL provides valid statistical inference while preserving transparency, interpretability, and replicability to meet regulatory recommendations.

Case studies will be presented that illustrate the application of the TL roadmap and TMLE+SL in challenging areas of medical product development, such as evaluating point treatment effects and survival analysis in rare disease settings.

Day 2 Plenary Session 1

Title:

“Targeted Learning, HAL, and Causal Inference for Generating Real-World Evidence in Drug Development”
Mark van der Laan, UC Berkeley

Abstract:

Targeted Learning follows a general causal roadmap for 1) accurately translating the real world into a formal statistical estimation problem in terms of causal estimand, a corresponding statistical estimand, and statistical model; 2) a corresponding template for construction of a targeted maximum likelihood estimator (TMLE) of the statistical estimand; and finally 3) a sensitivity analysis addressing the possible causal gap. The TMLE represents an optimal plug-in machine learning based estimator of the estimand combined with formal statistical inference. The three pillars of TMLE are super-learning, Highly Adaptive Lasso (HAL), and the TMLE-update step, where the latter has various choices such as CV-TMLE/C-TMLE, and the recently developed adaptive TMLE (Lars van der Laan et al., 2023). Through super-learning it can incorporate high dimensional and diverse data sources such as images, NLP features, and state of art algorithms tailored for such data sources. To optimize finite sample performance, the precise specification of TMLE can be tailored towards the precise experiment and statistical estimation problem in question, while being theoretically grounded, optimal, and benchmarked. We provide a motivation, explanation, and overview of targeted learning; the key role of super-learning and HAL; discuss some of the key choices and considerations in specifying the TMLE-step; and discuss (a priori specified) SAP construction based on targeted learning, incorporating outcome-blind simulations to choose a best specification of the SAP. We also discuss various case studies including a Sentinel and FDA RWE demonstration project of targeted learning demonstrating SAP specification on real data.

Title:

“Using Statistical Foundations to Demonstrate Effectiveness of AI/ML Algorithms for Clinical Utility”

Charmaine Demanuele, Pfizer Inc.

Abstract:

AI presents significant opportunities for improving healthcare, especially in disease diagnosis (e.g., COVID-19) and prognosis (e.g., sepsis or acute kidney injury). However, many clinical predictive algorithms (CPAs) remain unused, exhibit bias, or fail to generalize across different populations and hospital platforms. This issue stems from two key factors. First, there is an overreliance on the area under the receiver operating characteristic curve (AUC) as the main measure of CPA quality, which does not consider the prevalence of the target characteristic, limiting the understanding of the CPA’s positive and negative predictive value. We advocate for a comprehensive approach to optimizing CPA development, akin to the methodologies employed in bioanalytical diagnostic tests and biologic biomarkers. Secondly, there is a lack of robust clinical trials to quantify the benefits and risks of implementing CPAs. To address this, we propose a fit-for-purpose, sequential clinical development approach, similar to the process used in evaluating new medicinal products and biomarkers, with potential oversight from regulatory agencies. By embracing these recommendations, the benefits and risks associated with CPAs will become clearer, fostering increased credibility, transparency, and utility. This will enable our healthcare system to fully harness the potential of digital medicine and AI. learning demonstrating SAP specification on real data.

Title:

“Transforming Pharmacovigilance with AI: A Roadmap”
Hussein Ezzeldin, FDA

Abstract:

Since its inception in the mid-19th century, Pharmacovigilance (PV), the science of detecting, assessing, and understanding adverse drug events, has undergone profound transformation. Over the decades, PV has evolved from paper-based systems, such as the UK Yellow Card scheme, to automated processing of electronic health records enabled by advancements in natural language processing, machine learning, and data interoperability. This presentation traces the trajectory of PV, highlighting key historical milestones from its origins to the application of large language models for safety signal detection, case summarization, and adverse events adjudication. Concluding with a forward-looking perspective, we explore the future of PV, where the integration of diverse datasets and emerging technologies converge to advance public health and foster the development of safer, more effective treatments.

Day 2 Plenary Session 2

Title:

“Safety First: Systematic Approaches to Hallucination Mitigation in Clinical AI”

Malaikannan Sankarasubbu, Saama Technologies

Abstract:

Large Language Models (LLMs) are increasingly being deployed in healthcare settings, promising to revolutionize clinical workflows and decision support systems. However, their tendency to hallucinate – generating plausible but factually incorrect information, poses unique challenges in medical contexts where accuracy is paramount.
This talk presents a comprehensive examination of hallucinations in medical LLMs, grounded in real-world deployment experiences from clinical settings. We analyze actual cases where hallucinations impacted clinical workflows and discuss the implemented safeguards.
The presentation covers cutting-edge research in hallucination detection and mitigation, including recent advances in knowledge-grounded generation and semantic consistency checking specifically tailored for medical contexts. We share our novel contributions to this field, including a proposed framework for real-time hallucination detection in clinical applications and automated verification systems integrated with medical knowledge bases.
Looking ahead, we explore emerging approaches for building more reliable medical LLMs and discuss the critical balance between innovation and safety in healthcare AI deployment. This talk bridges the gap between theoretical research and practical implementation, offering valuable insights for both researchers and healthcare technology practitioners.

Title:

“Stage-Aware Learning for Dynamic Treatments”

Annie Qu, UC Irvine

Abstract:

Recent advances in dynamic treatment regimes (DTRs) provide powerful optimal treatment searching algorithms, which are tailored to individuals’ specific needs and able to maximize their expected clinical benefits. However, existing algorithms could suffer from insufficient sample size under optimal treatments, especially for chronic diseases involving long stages of decision-making. To address these challenges, we propose a novel individualized learning method which estimates the DTR with a focus on prioritizing alignment between the observed treatment trajectory and the one obtained by the optimal regime across decision stages. By relaxing the restriction that the observed trajectory must be fully aligned with the optimal treatments, our approach substantially improves the sample efficiency and stability of inverse probability weighted based methods. In particular, the proposed learning scheme builds a more general framework which includes the popular outcome weighted learning framework as a special case of ours. Moreover, we introduce the notion of stage importance scores along with an attention mechanism to explicitly account for heterogeneity among decision stages. We establish the theoretical properties of the proposed approach, including the Fisher consistency and finite-sample performance bound. Empirically, we evaluate the proposed method in extensive simulated environments and a real case study for COVID-19 pandemic.

Title:

“Leveraging the power of AI/ML to enhance statistical inference in clinical trials: opportunities and learnings”
Scott McClain, SAS

Abstract:

Predicting both drug efficacy and safety, as well as model modes of actions using historical atomic level descriptors is a sort of “holy grail.” The what-if scenario is “what if I had the ability to predict the next new drug, its parameters of dose and expected adverse effects with fewer pre-clinical animal studies, much earlier in the pipeline?” There are vast amounts of hidden value in the years’ worth of animal and in vitro data, and we now have an ability to stack thousands of metrics for the drug chemistry. The tactical question is how to access the historical to train for a drug’s future performance. But it is not clear that black-box AI/ML modeling pointed at this data is a straightforward or better answer. In fact, its argued that with the advent of big data, data itself is a hindrance. We will discuss access to key approaches for a blended world of chemistry and biology: feature extraction, data reduction and clustering. Newer generative and predictive techniques, such as adversarial networks, will also be examined. Getting a view of stitching traditional and new techniques together to support trusted modeling is our objective in reviewing risks and key opportunities in a generative AI drug development world.

Title:

“Testing and decision making with e-values”
Aaditya Ramdas, Carnegie Mellon

Abstract:

This talk will describe an approach towards sequentially testing hypotheses and estimating functionals that is based on games. In short, to test a (possibly composite, nonparametric) hypothesis, we set up a game in which no betting strategy can make money under the null (the wealth is an “e-value” under the null). But if the null is false, then smart betting strategies will have exponentially increasing wealth. Thus, hypotheses are rewritten as constrained games, the statistician is a gambler, test statistics are derived from betting strategies, and the wealth obtained is directly a measure of evidence which is valid at any data-dependent stopping time (an e-value). The optimal betting strategies are typically Bayesian, but the guarantees are frequentist. This “game perspective” provides new statistically and computationally efficient solutions to many modern problems, with particular importance for an open-ended meta-analysis that can be updated on the fly. 

Title:

Leveraging the power of AI/ML to enhance statistical inference in clinical trials: opportunities and learnings

Abstract:

The increasingly competitive landscape is motivating accelerated timelines and more efficient development processes. Simultaneously, there are emerging opportunities: new data sources such as RWD and digital biomarkers as well as advancements in AI/ML capabilities. In particular, new causal inference methodologies leverage AI/ML capabilities while emphasizing statistical inference, which is crucial in reporting clinical trial outcomes. These methods could be used to adjust for prognostic score (leading to more efficient trials with smaller sample size), and/or to leverage external data sources to accelerate study timeline and conserve resource. However, there remain gaps in understanding exactly how particular algorithms might perform in practice, as well as in implementations for specific types of estimands. We will discuss some of these trends and Sanofi efforts to fill identified gaps. In particular, we present results from simulation studies that leverage generative AI (Generative Adversarial Networks) to produce fit-for-purpose synthetic data, which allows us to compare performance of AI/ML-related methods in specific indications.

Title:

Stage-Aware Learning for Dynamic Treatments

Abstract:

Recent advances in dynamic treatment regimes (DTRs) provide powerful optimal treatment searching algorithms, which are tailored to individuals’ specific needs and able to maximize their expected clinical benefits. However, existing algorithms could suffer from insufficient sample size under optimal treatments, especially for chronic diseases involving long stages of decision-making. To address these challenges, we propose a novel individualized learning method which estimates the DTR with a focus on prioritizing alignment between the observed treatment trajectory and the one obtained by the optimal regime across decision stages. By relaxing the restriction that the observed trajectory must be fully aligned with the optimal treatments, our approach substantially improves the sample efficiency and stability of inverse probability weighted based methods. In particular, the proposed learning scheme builds a more general framework which includes the popular outcome weighted learning framework as a special case of ours. Moreover, we introduce the notion of stage importance scores along with an attention mechanism to explicitly account for heterogeneity among decision stages. We establish the theoretical properties of the proposed approach, including the Fisher consistency and finite-sample performance bound. Empirically, we evaluate the proposed method in extensive simulated environments and a real case study for COVID-19 pandemic.

Title:

Sequential testing and meta-analysis with e-values

Abstract:

This talk will describe an approach towards sequentially testing hypotheses and estimating functionals that is based on games. In short, to test a (possibly composite, nonparametric) hypothesis, we set up a game in which no betting strategy can make money under the null (the wealth is an “e-value” under the null). But if the null is false, then smart betting strategies will have exponentially increasing wealth. Thus, hypotheses are rewritten as constrained games, the statistician is a gambler, test statistics are derived from betting strategies, and the wealth obtained is directly a measure of evidence which is valid at any data-dependent stopping time (an e-value). The optimal betting strategies are typically Bayesian, but the guarantees are frequentist. This “game perspective” provides new statistically and computationally efficient solutions to many modern problems, with particular importance for an open-ended meta-analysis that can be updated on the fly.