CPCB students publish innovative machine learning model in Nature Methods 

Javad Rahimikollu and Hanxi Xiao, students in the Joint Carnegie Mellon-University of Pittsburgh PhD Program in Computational Biology (CPCB), led research on SLIDE, a tool for exploring complex biological interactions. 

They are the first-listed authors of the article “SLIDE: Significant Latent Factor Interaction Discovery and Exploration Across Biological Domains,” published in the competitive journal Nature Methods.

Rahimikollu and Xiao are graduate student researchers in the lab of Jishnu Das, assistant professor at the Center for Systems Immunology and Departments of Immunology and Computational and Systems Biology.  

Das describes SLIDE as an “engine of discovery.” It is a machine learning approach designed to enhance our understanding of complex biological systems by uncovering hidden interactions within high-dimensional omic datasets. 

“With rapid advances in modern technology, you have a very large number of datasets being generated at large scale,” Das said. “We’ve gotten very creative at ways to put these datasets together and get to predictive or correlative biology. What we don’t have is inferences, meaning not just the ‘what’ of a dataset but the ‘how or why.’” 

SLIDE addresses this gap by using statistical machine learning to move beyond predictive biomarkers and reveal the processes underlying immunological states.

Throughout this project, the Das team combined their skillsets to create this innovative model. Rahimikollu, whose background is in industrial engineering and statistics, led the statistical method development. Xiao, who holds a master’s degree in automation science, spearheaded implementation and context-specific applications. Das oversaw interdisciplinary research and collaboration with other scientists to define applications for the model. 

“One of the really cool things about SLIDE is that it doesn’t require that input data needs to come from a specific technology, which is quite unique,” Xiao said. “It doesn’t really matter where you got these data sets; it can vary from RNA readings to how many cups of coffee your colleagues drank today. It can handle any kind of input data set.” 

SLIDE makes no assumptions about the mechanisms that generate the data, allowing it to accommodate datasets from many widely used technologies.

Another key feature of the model is that it offers a robust framework for identifying significant standalone and interacting latent factors. These latent factors can drive outcomes but are not directly observed; they can only be inferred indirectly using a mathematical model.

“Interactions are very important, and many people miss the interaction of latent factors in their analysis,” Rahimikollu said. “Latent factors are one way to try and reconstruct the biological truth.” 

Latent factors provide deeper insights into the underlying mechanisms of diseases and biological processes. 

Since this paper was published, SLIDE has been adopted across disease and cellular contexts. Some of these uses include identifying latent factor interactions related to autoimmune disorders and lung disease.  

“I think a true test of a method is to ask, ‘Is it giving biological answers across contexts?’” Das said. “Getting into Nature Methods is awesome, but I would say seeing the utility of a method is the most satisfying part.”