Data-driven insights into cancer: from machine learning methods to biological discoveries

The recent deluge of genomics data has transformed biological research from a field limited by data acquisition into one limited by data interpretation. The potential of many rich but heterogeneous genotype and phenotype data sources remains relatively untapped, largely because we lack computational tools to appropriately combine and integrate data across studies. I develop computational methods and pipelines to rapidly and effectively integrate information from multiple data sources and leverage the strengths of human and model organism data. My research program has included collaborations with major consortia including The Cancer Genome Atlas (TCGA) and the West Coast Dream Team, as well as independent projects.

Here I will focus on two methods that I developed during the course of my research. The first is PLATYPUS, a semi-supervised machine learning method that I apply to drug sensitivity prediction in a large cancer cell line database to highlight key biological drivers of response. PLATYPUS builds ‘views’ from multiple data sources and biological priors that then jointly predict patient outcomes. I show that this learning strategy improves predictive performance over any single view and helps identify drivers of drug response, thereby generating models to inform clinical and biomedical research. The second method is a comparative genomics statistical framework that highlights the strengths of modeling disease across multiple species. For this, I profiled RNA from canine mammary tumors (CMTs) to model molecular changes in human breast cancer, using my statistical framework to demonstrate that dog-derived cancer signatures are predictive of survival for human breast cancer patients.

Speaker Bio:
Kiley is a Flatiron Research Fellow in Olga Troyanskaya’s lab at the Center for Computational Biology of the Simons Foundation Flatiron Institute and a visiting scholar at Princeton University. She earned her PhD in biomolecular engineering in Josh Stuart’s lab at the University of California, Santa Cruz. Kiley operates at the confluence of life science research and computer science. She has made significant contributions to cancer genomics through the development of novel machine learning methods that integrate large and heterogeneous datasets to address key questions in human health and disease. Kiley was a core member of several flagship projects within The Cancer Genome Atlas, which unlocked critical insights into cancer and enabled the development of personalized cancer therapy by myriad biomedical researchers. Most recently, she has applied her genomics expertise to creating a statistical framework for cross-species translational cancer analyses that inform our understanding of the mechanisms of human breast cancer.


Lectures, Seminars



Slonim Conference Room, Goldberg Computer Science Building




David Langstroth