PhD candidate in Biostatistics, Jessica Butts, will present:
“Methods for Integrative Analysis and Prediction Accounting for Subgroup Heterogeneity”
PhD advisers: Sandra Safo and Lynn Eberly
Abstract: Multi-view data, in which multiple data views (e.g., genomics, proteomics) are measured on the same set of participants, have become increasingly available and require integrative analysis methods to fully utilize the available data and better understand complex diseases. At the same time, epidemiologic and genetic studies in many complex diseases suggest subgroup differences (e.g., by sex or race) in disease course and patient outcomes. While existing methods can either perform integrative analysis or account for subgroup heterogeneity, we are unaware of any that can do both simultaneously; these existing methods are thus unable to fully utilize the available data. We introduce HIP (Heterogeneity in Integration and Prediction), a novel one-step method for one or more continuous outcomes that (1) accounts for subgroup heterogeneity in multi-view data, (2) ranks variables based on importance, (3) can incorporate covariate adjustment, and (4) has efficient algorithms implemented in Python. We then extend HIP to accommodate multi-class, Poisson, and zero-inflated Poisson (ZIP) outcomes, allowing researchers to study other clinically relevant outcomes using HIP. We illustrate HIP using data from the COPDGene Study to explore the genes and proteins associated with exacerbation frequency for males and females. Finally, we offer an R Shiny application that provides a simple interface to the Python code, making HIP accessible to a wider audience.