Statistical genomics (19/20)

Faculty Science and Engineering
Year 2019/20
Course code WISG-09
Course name Statistical genomics (19/20)
Level(s) master
Language of instruction English
Period semester I a

Full course name Statistical Genomics (biennial 2019/2020)
Learning objectives At the end of the course, the student is able to:

1. translate applied and real-world problems into graphical models
2. interpret a graphical model with respect to its conditional independence properties
3. statistically model graphs
4. design MCMC sampling algorithms for graphs
5. infer graphical models from data
6. critically assess inference algorithms for graphical models
Description This course focuses mainly on directed graphical models (Bayesian networks), which can be used, for example, to infer gene regulatory networks and protein pathways in systems biology research. After a very brief introduction to these biological applications, we consider Bayesian networks from a mathematical perspective. Because Bayesian networks combine graph theory and probability theory, we first discuss various graph-theoretic concepts before modelling graphs statistically. In this context we briefly discuss (or review) some fundamentals of Bayesian statistics. Finally, we learn how to infer Bayesian networks from data. Throughout the course we use the statistical computing environment R to implement some of the discussed algorithms; other, more sophisticated implementations will be made available.

More specifically, the topics covered in this course include:
1. An introduction to graphical models in Statistical Genomics
2. An introduction to the general graphical model terminologies
3. The Markov property, conditional independence relations, and the concept of d-separation
4. Equivalence classes of graphs (CPDAGs) and single-edge operations
5. The BDe scoring metric for discrete Bayesian networks
6. The BGe scoring metric for Gaussian Bayesian networks
7. Graph inference with Greedy Search algorithms (for finding ‘the best’ graph)
8. Graph inference with Markov Chain Monte Carlo (MCMC) simulations (for model averaging)
9. Dynamic Bayesian networks for time series data
10. Other graphical models, advanced Bayesian network models and improved MCMC sampling schemes
Hours per week
Teaching method Lecture (LC), Assignment (ASM), Practical work (PRC)
(4 hours of lectures and 2 hours of tutorials (partly in a computer lab) per week)
Assessment Assignment (AST), Report (R)
(Assessment consists of three homework assignments and a final research project report (RPR), combined according to the formula: Final = 0.1 x max(HW1, RPR) + 0.1 x max(HW2, RPR) + 0.1 x max(HW3, RPR) + 0.7 x RPR, where HW1, HW2 and HW3 are the grades for the first, second and third homework assignments and RPR is the grade for the research project report.)
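The grading formula above can be sketched in a few lines of Python. This is only an illustration: the function name final_grade and the sample grades are hypothetical, and no rounding rule is applied because none is stated here. Since max(HW, RPR) is never below RPR, a homework grade can raise the final grade above the report grade but cannot lower it.

```python
def final_grade(hw1, hw2, hw3, rpr):
    """Combine three homework grades and the report grade (RPR).

    Each homework counts for 10%, but is replaced by the report grade
    whenever the report grade is higher; the report itself counts for 70%.
    """
    homework_part = sum(0.1 * max(hw, rpr) for hw in (hw1, hw2, hw3))
    return homework_part + 0.7 * rpr


# Hypothetical grades, chosen only for illustration.
# HW2 (5) is below the report grade (7), so it is replaced by 7:
# 0.1*8 + 0.1*7 + 0.1*9 + 0.7*7 = 7.3 (up to floating-point rounding)
example = final_grade(8, 5, 9, 7)
print(round(example, 2))
```

With all homework grades below the report grade, the result equals the report grade itself.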
Course type master
Coordinator prof. dr. M.A. Grzegorczyk
Lecturer(s) prof. dr. M.A. Grzegorczyk
Required literature
Title Author ISBN Price
The study material is provided in the form of self-explanatory lecture
Prerequisites The course unit assumes prior knowledge acquired from the two course units ‘Probability Theory’ and ‘Statistics’ of the Mathematics BSc Programme.
Included in
Programme Year Period Type
MSc Computing Science: Data Science and Systems Complexity (Guided choice course units) - semester I a elective
MSc Computing Science: Intelligent Systems and Visual Computing (Guided choice course units) - semester I a elective
MSc Courses for Exchange Students: Mathematics 19-20 - semester I a
MSc Mathematics: Science, Business and Policy (Science, Business and Policy: Statistics and Big Data) - semester I a elective group
MSc Mathematics: Statistics and Big Data (MSc Mathematics: Statistics and Big Data) - semester I a elective group
Biennial course units (Odd years) - semester I a -