Introduction to Data Science

Faculty Science and Engineering
Year 2021/22
Course code WMCS002-05
Course name Introduction to Data Science
Level(s) master
Language of instruction English
Period semester I a
ECTS 5
Schedule rooster.rug.nl

Full course name Introduction to Data Science
Learning objectives At the end of the course, the student is able to:
1) Describe several fundamental methods that play a central role in Data Science (in natural language and pseudocode)
2) Implement methods in a programming language
3) Reason about data science and individual methods concerning appropriateness, correctness and efficiency
4) Analyse data by means of experiments
5) Report orally and in writing about activities that involve the knowledge and skills listed above
Description You will learn fundamental principles of data preprocessing, discovery, analysis and evaluation. The course combines theory and methods with hands-on practicals and a look at big data analytics. Some practicals contain real-world data science problems (for example from medicine), for which each student team is expected to carry out the full data science life cycle.
Since this is an introductory course, we give a broad overview of general principles, with specific examples and references to other Master courses that treat the individual topics in more depth.

We will cover the following aspects with respect to data mining:
- multidimensional and multivariate data, sampling, preprocessing, quality and missingness
- classification
- optimization
- text analysis
These topics give a broad overview of challenges with respect to data science and methodologies.

Practical assignments are done in groups. Participants are assigned to groups automatically within the first 2 weeks of the course. The assignment is optimized for interdisciplinarity, aiming to simulate the composition of a realistic data science project team.

Groups will have to present their work in front of the class for selected assignments.

Attendance at some lectures may be mandatory (details will be announced in the first lecture).

Please note:
Due to the popularity of the course, enrolment is now restricted to 120 places, given on a first-come, first-served basis (see Remarks section).
We run a very strict procedure in case of 'ghost enrolments': people who never show up and have not de-enrolled receive an NA grade. This is necessary because of the automatic optimized group assignment in the first 2 weeks, and it limits the negative impact of personal choices on the groups. We strongly appeal to your conscience before you take one of the restricted places if you are not sure that you are committed to finishing the course, which requires 140 hours of your time.
Hours per week
Teaching method Seminar (S), Lecture (LC), Practical work (PRC)
(Some group presentations assume the presence of the entire group. Some of the lectures are mandatory.)
Assessment Assignment (AST), Written exam (WE)
(Assignments (some of a theoretical nature and some programmed in Matlab, R or Python) and an exam. The assignments (the A grade) together count for 60% of the grade; the exam (the E grade) counts for 40% of the grade. Both the A and E grades must be above 5.0 (a total grade of 5.5 will be rounded up to 6).)
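The weighting rule above can be read as a small calculation. The sketch below is one possible reading, not the official grading procedure. The names `final_grade` and `is_pass` are hypothetical, "above 5.0" is assumed to be a strict inequality, and 5.5 is assumed to be the pass mark because it rounds up to 6.

```python
from typing import Optional

ASSIGNMENT_WEIGHT = 0.6  # A grade: 60% of the total (from the course description)
EXAM_WEIGHT = 0.4        # E grade: 40% of the total (from the course description)
MIN_COMPONENT = 5.0      # A and E must both be above this (strictness is an assumption)


def final_grade(a_grade: float, e_grade: float) -> Optional[float]:
    """Return the weighted total, or None if a component is not above 5.0."""
    if a_grade <= MIN_COMPONENT or e_grade <= MIN_COMPONENT:
        return None
    # Round to two decimals to avoid floating-point noise in the weighted sum.
    return round(ASSIGNMENT_WEIGHT * a_grade + EXAM_WEIGHT * e_grade, 2)


def is_pass(total: Optional[float]) -> bool:
    """Assumption: a total of 5.5 rounds up to 6, so 5.5 is treated as the pass mark."""
    return total is not None and total >= 5.5
```

For example, `final_grade(7.0, 6.0)` gives 6.6, while `final_grade(8.0, 5.0)` gives `None` under the strict reading, because the exam grade is not above 5.0.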
Course type master
Coordinator prof. dr. K. Bunte
Lecturer(s) prof. dr. K. Bunte
Required literature
Title | Author | ISBN | Price
Genetic Algorithms in Search, Optimization and Machine Learning | David E. Goldberg | 0201157675 |
Introduction to Data Mining (recommended) | Pang-Ning Tan, Michael Steinbach, Vipin Kumar | |
Prerequisites The course unit recommends prior knowledge acquired from the courses Algorithms & Data Structures in C, Statistics and Advanced Algorithms & Data Structures from the BSc degree programme in Computing Science, or equivalent knowledge from other bachelor programmes.
Programming knowledge in at least one of the following languages is indispensable: C/C++, Matlab, R or Python.
The lab computers run Linux, so familiarity with it will help. For group work, we require the use of git and GitHub. If you are not familiar with these, please read up on them beforehand.
Remarks This course has limited enrolment:
- CS students can always enter the course, regardless of whether the course is mandatory for them or not.
- The number of enrolments for non-CS students is limited. These students need to meet the course prerequisites as listed on Ocasys. Priority is given to students for whom the course is an official elective (see list below).
- An exception can be made for exchange students if they have a CS background: please contact the FSE International Office. See here for more info about the enrolment procedure.
Included in
Programme | Year | Period | Type
MSc Artificial Intelligence (C - Elective Course Units) | - | semester I a | elective
MSc Astronomy: Quantum Universe | 1 | semester I a | compulsory: DS
MSc Computing Science: Data Science and Systems Complexity (Compulsory course units) | 1 | semester I a | compulsory
MSc Computing Science: Intelligent Systems and Visual Computing (Guided choice course units) | - | semester I a | elective
MSc Computing Science: Science Business and Policy (Compulsory course units) | 1 | semester I a | compulsory
MSc Computing Science: Software Engineering and Distributed Systems (Guided choice course units) | - | semester I a | elective
MSc Courses for Exchange Students: AI - Computing Science - Mathematics | - | semester I a |
MSc Human Machine Communication - as of 21-22 Computational Cognitive Science (C - Elective Course Units) | - | semester I a | elective
MSc Mathematics: Science, Business and Policy (Science, Business and Policy: Statistics and Big Data) | - | semester I a | elective group
MSc Mathematics: Statistics and Big Data (MSc Mathematics: Statistics and Big Data) | - | semester I a | elective group