The minimum requirements for graduation are:

  • At least 30 credit hours of graduate-level training, including:
    • At least 4 research credits of directed study
    • A summer internship or additional directed study, worth 3 credits
    • 3 credits of Foundations of Computational Biology
    • 1 credit of Professional Development
  • One course (taken or waived) from each of the five groups

Program duration is from 12 to 20 months, depending on the background of the student. To complete the degree in one year, students must demonstrate proficiency in programming, calculus and linear algebra.

Detailed Course Listing

Course Grouping 1: Computational Structural Biology
COBB 2030 Computational Structural Biology
(4 credits, Fall)

Topics covered include: applying computational and statistical methods to the analysis of DNA and protein structures; representing protein, DNA and RNA structure; homology modeling and protein structure prediction; theoretical description of basic interactions, along with computational methods to estimate them; statistical mechanical theory of molecules; molecular dynamics and other sampling methods; modeling protein flexibility, from side chains to loops to slow modes; reaction paths and basics of path sampling; protein-protein and protein-small molecule docking; supramolecular assembly; introduction to Quantitative Structure Activity Relationship (QSAR) in drug design.
Course Grouping 2: Computational Systems Biology
COBB 2041 Cell & Systems Modeling
(4 credits, Spring)

This course introduces students to the theory and practice of modeling biological systems from the molecular to the population level with an emphasis on intracellular processes. Topics covered include kinetic and equilibrium descriptions of biological processes, systematic approaches to model building and parameter estimation, analysis of biochemical circuits modeled as differential equations, modeling the effects of noise using stochastic methods. A range of biological models and applications are considered, including gene regulatory networks, cell signaling, molecular motors, and developmental biology.

Course Grouping 3: Computational Genomics
ISB 2020 Genomics for Systems Biology
(3 credits, Spring)

This course introduces students to genomic data and the basic analytical principles pertaining to them. Students will learn about high-throughput sequencing methods and applications, genomic variation, transcriptomics and epigenomic data. At the end of the course, students will be able to efficiently analyze these types of data sets using existing algorithms or algorithms they develop.

COBB 2070 Computational Genomics
(3 credits, Spring)

Dramatic advances in experimental technology and computational analysis are fundamentally transforming the basic nature and goal of biological research. The emergence of new frontiers in biology, such as evolutionary genomics and systems biology, is demanding new methodologies that can confront quantitative issues of substantial computational and mathematical sophistication. This course introduces classical approaches and the latest methodological advances in the context of the following biological problems: 1) computational genomics, focusing on gene finding, motif detection and sequence evolution; 2) analysis of high-throughput biological data, such as gene expression data, focusing on issues ranging from data acquisition to pattern recognition and classification; 3) molecular and regulatory evolution, focusing on phylogenetic inference and regulatory network evolution; and 4) systems biology, concerning how to combine sequence, expression and other biological data sources to infer the structure and function of different systems in the cell. On the computational side, this course focuses on modern machine learning methodologies for computational problems in molecular biology and genetics, including probabilistic modeling, inference and learning algorithms, pattern recognition, data integration, time series analysis and active learning.
Course Grouping 4: Computer Science
COBB 2066 Scalable Machine Learning for Big Data Biology
(4 credits, Spring)

Machine learning (ML) has become an integral part of computational thinking in the era of big data biology. This course focuses on understanding the statistical structure of large-scale biological datasets using ML algorithms. We cover the basics of ML algorithms and study their scalable versions for implementation on a distributed computing framework. We pursue distributed ML algorithms for matrix factorization, convex optimization, dimensionality reduction, clustering, classification, graph analytics and deep learning, among others. This course is project driven (3 to 4 small projects) with source material from genomic sciences, structural biology, drug discovery, systems modeling and biological imaging. Students are expected to design, implement and test their ML solutions in Apache Spark.
Course Grouping 5: Electives


Table 1: Course Schedule for 16-month CoBB MS Training (36 credits total)

Semester | Course | Number | Credits | Term credits
1st Fall semester | Introduction to Bioinformatics Programming in Python | COBB2025 | 4 | 11
1st Fall semester | Elective | | 3 |
1st Fall semester | Foundations in Computational Biology | COBB2010 | 3 |
1st Fall semester | Professional Development | COBB2055 | 1 |
1st Spring semester | Genomics for Systems Biology | COBB2020 | 3 | 11
1st Spring semester | Cell & Systems Modeling | COBB2041 | 4 |
1st Spring semester | Scalable Machine Learning for Big Data | COBB2066 | 4 |
1st Summer semester | Internship | | 3 | 3
2nd Fall semester | Elective | | 3 | 11
2nd Fall semester | Directed Study | COBB2080 | 4 |
2nd Fall semester | Computational Structural Biology | COBB2030 | 4 |
Total | | | 36 |


Fall, Spring – N/A
Fall – Introduction to Medical Imaging and Image Analysis presents the physics of image formation as well as methods for tomographic image reconstruction for major medical imaging modalities, including X-ray Computed Tomography (CT) and Magnetic Resonance Imaging (MRI). Also introduced are fundamentals of digital image processing, with particular emphasis on medical applications, including basic techniques to enhance image quality, image de-noising, methods for extracting, classifying, and tracking features of and objects in images, etc. Students will learn how to implement these techniques in MATLAB (The MathWorks Inc., Natick, MA) to solve practical image processing problems. MATLAB exercises will demonstrate to students how filtering operations applied in the image domain or the Fourier domain affect medical images. In addition to these fundamentals, more advanced algorithmic approaches for image segmentation and image as well as point-cloud registration techniques will also be reviewed.
Spring – This course is designed to teach the basic principles and applications of optical microscopy and imaging techniques commonly used in biomedical research. Optical microscopy has grown enormously and has become an essential tool to investigate biological processes, diagnose diseases and quantitatively measure biological systems at an unprecedented cellular and molecular level. It has become increasingly important for biomedical researchers to learn the proper use of optical microscopy, to understand the advantages and limitations of each type of optical microscopy, and to know how to apply them to specific biomedical applications. In this course, we will cover the physical principles of light and of basic and advanced optical microscopy techniques. Strong emphasis will be given to biomedical applications for each type of optical microscopy. At the end of the course, a student will have a thorough understanding of the basic principles of optical imaging and optical microscopy, and will have learned how to apply optical microscopy to address biological questions and perform basic quantitative image analysis.
Fall – This course introduces newly evolving multi-modal imaging techniques and analysis methods in biomedical applications. The course will briefly cover the fundamental physics, core signal processing and image reconstruction of a variety of current standalone imaging modalities such as X-ray, computed tomography (CT), magnetic resonance imaging (MRI), nuclear imaging (PET, SPECT), optical imaging (fluorescence, diffuse optical tomography, bioluminescence), and ultrasound. Subsequently, the concept, fundamental physics, and image analysis of some exemplary multi-modal imaging techniques and systems will be introduced. Their applications in biomedicine at different scales, from the organ to the cellular and molecular level, and from structural to functional imaging, will be discussed. The course will also briefly address the issues related to image-based diagnosis, intervention and therapy.
Fall – This 2-credit, graduate-level course introduces basic concepts and methods for statistical learning with emphasis on modern health science applications. The syllabus includes linear regression with regularization, supervised machine learning, unsupervised clustering, dimension reduction and other special topics (e.g. Bayesian networks and hidden Markov models). The target audience is second-year Biostatistics master's students or early PhD students with interests in statistical learning techniques for health science data. Through homework problem sets, computer labs and a final project, students train with hands-on materials to implement methods and interpret results in real applications.
Fall – This 2-credit, graduate-level course covers popular statistical and computational methods for high-throughput omics data analysis. With the rapid advances of many omics technologies, the course will focus on the fundamental concepts of various topics (e.g. data preprocessing, association analysis, causal mediation analysis, differential analysis, statistical learning, pathway analysis, etc.) and their specific applications to different omics data types (e.g. microarray, next-generation sequencing, single cell sequencing, mass spectrometry, microbiome, etc.). The major target audience is graduate students (master's or PhD students) interested in omics data analysis and related research. Through homework problem sets, computer labs and a final project, students train with hands-on materials to understand the methods, implement the algorithms and interpret results in real omics applications.
Fall – This is a 2-credit course in advanced statistical learning, covering topics related to the statistical interpretation and theory behind machine learning models/methods. Emphasis will be given to in-depth derivation of models/algorithms from topics covered in BIOST 2079 (Introductory Statistical Learning for Health Sciences) as well as additional topics on modern statistical learning methodologies, with special focus on methods for health science applications. This course is designed for graduate students in the Department of Biostatistics and other interested graduate students who already have sufficient statistical and programming background. Students are expected to be familiar with R. Experience in C/C++, Python or MATLAB may be helpful, but is not required. Programming skills/training shall be demonstrated by previous programming (or programming-heavy) courses in R, Python, MATLAB, C/C++, etc.
Fall – Personalized medicine is becoming a reality that is being driven by ongoing discoveries in cell biology, genomics, proteomics, and metabolomics. The translational speed of these discoveries, particularly in the diagnostic, prognostic, and theragnostic arenas, is rapid. We believe that in the future personalized medicine diagnostics will involve both physicians and basic scientists. A major obstacle to this approach is the lack of training components for basic scientists in this area. This course aims to close that gap by providing an appreciation for, and understanding of, key principles of clinical development and testing. The course is designed to delve into concepts of personalized medicine using focused topic areas. The first week will introduce the principles and overriding concepts of clinical test development, which differ qualitatively from investigational research. Next there will be six 2-week sessions, with each session focusing on a separate testing modality. Topics will include inherited genetic diseases and predispositions, acquired genetic changes (cancer), metabolomic profiles of endocrine diseases, immune networks for transplant and rejection, proteomic profiling in blood disorders, and proteomic detection of shock and organ failure.
Fall – This course examines molecular mechanisms of drug interactions with an emphasis on drugs that modulate cell signaling, cellular responses to drugs, and drug discovery. The course will include student participation through presentations and discussion of relevant contemporary scientific literature. Topics include: cell cycle checkpoints and anti-cancer drugs, therapeutic control of ion channels and blood glucose, anti-inflammatory agents and nuclear receptor signaling, and molecular mechanisms of drugs used for the treatment of cardiovascular diseases.
Spring – This course covers computational and mathematical neuroscience, including modeling and analysis of the complex dynamics of single neurons and large-scale networks. This course is offered every other Spring starting in 2022.
Fall – This course offers an introduction to modeling methods in neuroscience. Topics range from modeling the firing patterns of single neurons to using computational methods to understand neural coding. Some systems-level modeling is also done.
Spring – This course introduces a number of modeling methods for biological systems. We will examine a number of problems from cell biology, immunology, population biology, physiology and molecular genetics. The main tools will be techniques from ordinary and partial differential equations. Discrete and delay-differential equations will also be used; however, background in these will not be assumed. We will take models from current and classic papers in the field.
Fall – This course is focused on particular topics of great biologic complexity in critical illness, where modeling has the potential to translate into improved patient care. Lectures are provided by basic (biological and mathematical sciences) and clinical faculty, in conjunction with members of industry and speakers from outside institutions. This information will be communicated within the framework of defined themes that describe the complexity of inflammation in acute and chronic illnesses.
Spring – This is an introductory probability and statistics course intended primarily for biomedical informatics students. The first part of the course covers probability, including basic probability, random variables, univariate and multivariate distributions, transformations, expectation, numerical integration, and approximations. The second part of the course covers statistics, including study design, classical parametric inference, hypothesis testing, Bayesian inference, non-parametric methods, classification, ANOVA, and regression. We will use R for statistical computing and applications. Examples and applications will focus on biomedical informatics and related disciplines.
Summer – Science is increasingly inter-disciplinary, and programming has become a valuable skill in many investigations. This course is designed to empower you with the ability to solve scientific problems through writing computer programs. Emphasis is placed on using the R language to solve biology problems.
Fall – This survey course covers the principles of population genetics as applicable to human populations, including (1) the laws of inheritance that govern the organization of the genomes in populations, (2) the evolutionary forces and phenomena that impact genetic diversity in human populations, and (3) the foundational concepts of genetic epidemiology and gene discovery.
This course uses examples from across biology to illustrate how simple mathematical models can increase our understanding of biological systems. We will focus on several foundational modeling approaches, including systems of difference equations, matrix models, probability, and statistical data analysis. Students will discover how these approaches are used, their strengths and limitations, and how they could be extended to more complex problems. Students should be prepared to use both spreadsheet programs and scripts, written in a language such as Python or R, to explore these models.
This course examines the electrical properties of nerve cells and the mechanisms by which nerve cells communicate. The following topics will be covered: electrical principles used by nerve cells, the basis of the resting potential, the function of voltage-dependent ionic channels, the mechanisms by which action potentials are generated, neurotransmitter receptor function, and the physiology of fast synaptic communication.
Fall – Course is concerned primarily with the structure and functions of proteins and nucleic acids. These are large polymers where structure and function are determined by the sequence of monomeric units. Topics will include the physical and chemical properties of the monomer units (amino acids/nucleotides); the determination of the linear sequence of these units; the size, shape and general properties of the biopolymers in aqueous systems; and the relation between structure and function, particularly in transport (hemoglobin) and in catalysis (enzymes).
Spring – Evolution is a fundamental unifying principle of biology. This class takes a broad approach to illustrate how an evolutionary perspective augments medical research and practice. Topics covered range from the evolution of human populations, to antibiotic resistance, and include medical conditions as diverse as diabetes, cardiovascular disease, cancer or aging.