The minimum requirements for graduation are:
- At least 30 credit hours of graduate-level training, including:
  - At least 4 research credits of directed study
  - A summer internship or additional directed study, worth 3 credits
  - 3 credits of Foundations of Computational Biology
  - 1 credit of Professional Development
  - One course (taken or waived) from each of the five groups
Program duration ranges from 12 to 20 months, depending on the student's background. To complete the degree in one year, students must demonstrate proficiency in programming, calculus and linear algebra.
Detailed Course Listing
Course Grouping 1: Computational Structural Biology
(4 credits, Fall)
Topics covered include: applying computational and statistical methods to the analysis of DNA and protein structures; representing protein, DNA and RNA structure; homology modeling and protein structure prediction; theoretical description of basic interactions, along with computational methods to estimate them; statistical mechanical theory of molecules; molecular dynamics and other sampling methods; modeling protein flexibility, from side chains to loops to slow modes; reaction paths and basics of path sampling; protein-protein and protein-small molecule docking; supramolecular assembly; introduction to Quantitative Structure-Activity Relationship (QSAR) in drug design.
Course Grouping 2: Computational Systems Biology
(4 credits, Spring)
This course introduces students to the theory and practice of modeling biological systems from the molecular to the population level, with an emphasis on intracellular processes. Topics covered include kinetic and equilibrium descriptions of biological processes, systematic approaches to model building and parameter estimation, analysis of biochemical circuits modeled as differential equations, and modeling the effects of noise using stochastic methods. A range of biological models and applications is considered, including gene regulatory networks, cell signaling, molecular motors, and developmental biology.
Course Grouping 3: Computational Genomics
(3 credits, Spring)
COBB 2070 Computational Genomics
(3 credits, Spring)
This course introduces students to genomic data and the basic analytical principles pertaining to them. Students will learn about high-throughput sequencing methods and applications, genomic variation, transcriptomics and epigenomic data. At the end of the course, students will be able to efficiently analyze these types of data sets using existing algorithms or algorithms they develop themselves.
Dramatic advances in experimental technology and computational analysis are fundamentally transforming the basic nature and goals of biological research. The emergence of new frontiers in biology, such as evolutionary genomics and systems biology, demands new methodologies that can confront quantitative issues of substantial computational and mathematical sophistication. This course introduces classical approaches and the latest methodological advances in the context of the following biological problems: 1) computational genomics, focusing on gene finding, motif detection and sequence evolution; 2) analysis of high-throughput biological data, such as gene expression data, focusing on issues ranging from data acquisition to pattern recognition and classification; 3) molecular and regulatory evolution, focusing on phylogenetic inference and regulatory network evolution; and 4) systems biology, concerning how to combine sequence, expression and other biological data sources to infer the structure and function of different systems in the cell. On the computational side, this course focuses on modern machine learning methodologies for computational problems in molecular biology and genetics, including probabilistic modeling, inference and learning algorithms, pattern recognition, data integration, time series analysis and active learning.
Course Grouping 4: Computer Science
(4 credits, Spring)
Machine learning (ML) has become an integral part of computational thinking in the era of big data biology. This course focuses on understanding the statistical structure of large-scale biological datasets using ML algorithms. We cover the basics of ML and study scalable versions of ML algorithms for implementation on a distributed computing framework. We pursue distributed ML algorithms for matrix factorization, convex optimization, dimensionality reduction, clustering, classification, graph analytics and deep learning, among others. This course is project driven (3 to 4 small projects), with source material from genomic sciences, structural biology, drug discovery, systems modeling and biological imaging. Students are expected to design, implement and test their ML solutions in Apache Spark.
Course Grouping 5: Electives
Table 1: Course Schedule for 16-month CoBB MS Training (34 credits total)
|Semester|Course Area (*)|Credits|Term Credit|
|---|---|---|---|
|1st Fall semester|Introduction to Bioinformatics Programming in Python COBB2025|4|11|
||Foundations in Computational Biology COBB2010|3||
||Professional Development COBB2055|1||
|1st Spring semester|Genomics for Systems Biology COBB2020|3|11|
||Cell & Systems Modeling COBB2041|4||
||Scalable Machine Learning for Big Data COBB2066|4||
|1st Summer semester|Internship|3|3|
|2nd Fall semester|Elective|3|11|
||Directed Study COBB2080|4||
||Computational Structural Biology COBB2030|4||