Statistical genomics (17/18)
Faculty  Science and Engineering
Year  2017/18
Course code  WISG09
Course name  Statistical genomics (17/18)
Level(s)  master
Language of instruction  English
Period  semester II a
ECTS  5
Schedule  rooster.rug.nl
Full course name  Statistical Genomics (biennial, 2017/2018)
Learning objectives  Although graphical models, such as Bayesian networks, have recently become a very important and popular tool in the topical field of Statistical Genomics, these models are usually not discussed in classical statistics courses. The aim of this course is for students to learn how to combine graph theory and probability theory to infer graphical models from data. At the end of this course students are expected to be able
- to translate applied and real-world problems into graphical models
- to interpret a graphical model with respect to its conditional independency properties
- to statistically model graphs
- to design MCMC sampling algorithms
- to infer graphical models from data
- to critically assess inference algorithms for graphical models

Description  In the course we will mainly focus on directed graphical models (Bayesian networks), which can for example be used to infer gene regulatory networks and protein pathways in systems biology research. After a very brief introduction to these biological applications, we will consider Bayesian networks from a mathematical perspective. As Bayesian networks can effectively be seen as a marriage of graph theory and probability theory, we will first discuss various graph-theoretic concepts before we start modelling graphs statistically. In this context we will briefly discuss (or review) some fundamentals of Bayesian Statistics. Finally, we will learn how to infer Bayesian networks from data. Throughout the course we will use the statistical computing environment R to implement some of the discussed algorithms, while other, more sophisticated implementations will be made available. More specifically, the topics covered in this course include:
1. An introduction to graphical models in Statistical Genomics
2. An introduction to the general graphical model terminologies
3. The Markov property, conditional independency relations, and the concept of d-separation
4. Equivalence classes of graphs (CPDAGs) and single-edge operations
5. The BDe scoring metric for discrete Bayesian networks
6. The BGe scoring metric for Gaussian Bayesian networks
7. Graph inference with Greedy Search algorithms (for finding 'the best' graph)
8. Graph inference with Markov Chain Monte Carlo (MCMC) simulations (for model averaging)
9. Dynamic Bayesian networks for time series data
10. Other graphical models, advanced Bayesian network models and improved MCMC sampling schemes

Hours per week
Teaching method
Lecture (LC), Practical (PRC), Tutorial (T)
(4 hours of lectures and 2 hours of tutorials, the latter partly in a computer lab, per week)

Assessment
Assignment (AST), Written exam (WE)
(Final = 0.1 x max(HW1, ET) + 0.1 x max(HW2, ET) + 0.1 x max(HW3, ET) + 0.7 x ET, where HW1, HW2, and HW3 are the grades for the 1st, 2nd, and 3rd homework assignments and ET is the written exam grade.)
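The grading rule above can be illustrated with a minimal sketch. The function name `final_grade` and the example grades are hypothetical and chosen only for illustration; the weights and the max rule are taken from the formula as stated.

```python
def final_grade(hw1, hw2, hw3, et):
    """Final grade according to the formula stated in the assessment rule.

    Each homework grade counts for 10%, but is replaced by the exam
    grade ET whenever ET is higher; the exam itself counts for 70%.
    """
    homework_part = sum(0.1 * max(hw, et) for hw in (hw1, hw2, hw3))
    return homework_part + 0.7 * et


# Hypothetical example: homework grades 8, 5 and 9, exam grade 7.
# The second homework (5) is below the exam grade, so the exam grade
# is used in its place: 0.8 + 0.7 + 0.9 + 4.9
print(round(final_grade(8, 5, 9, 7), 2))
```

Because of the max rule, a homework grade can only raise the final grade relative to the exam grade; if all three homework grades are below ET, the final grade equals ET.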

Course type  master
Coordinator  dr. M.A. Grzegorczyk
Lecturer(s)  dr. M.A. Grzegorczyk
Required literature


Prerequisites  You need to be familiar with the principles of mathematical statistics (likelihood theory) and several standard statistical techniques (e.g. linear models). Some of the homework exercises require basic programming skills (software: R). Familiarity with Bayesian Statistics may be advantageous but is not required, since all required Bayesian concepts will be reviewed throughout the course.
Remarks  Note that there is NO mandatory literature, as the main parts of this lecture have been taken, collected and adapted from various textbooks and research papers. The lecture itself will be based on a set of slides, which will be made available via Nestor. As these slides are self-explanatory, there is no need for additional literature. However, useful additional and complementary information can be found in standard textbooks on Bayesian networks, such as the book mentioned above.
Included in
