Learning objectives  At the end of this module, students will be able to:
  identify and summarize existing big data processing techniques;
  select the most suitable technique for a given data processing problem;
  construct an algorithm for a given problem on top of existing data processing frameworks and techniques;
  evaluate the constructed solution and compare it with possible alternatives, both qualitatively and quantitatively.

Description  In recent years, the importance of efficient data processing has increased drastically. On the one hand, this is driven by the additional value extracted from data, which yields large economic and societal benefits for companies and for society as a whole. On the other hand, the demand for efficient processing is also driven by the ease of collecting and storing large amounts of data that need to be processed and analysed continuously. As data volumes keep growing, efficient data processing is not possible without fully exploiting the available computational resources. This course looks specifically at problems that require a large amount of resources and cannot easily be solved on a single computer using traditional methods. We discuss both CPU-intensive and data-intensive tasks and look into possible solutions for both classes of problems. The course does not concentrate on one particular method or technique; rather, it aims to present an overview of possible solutions (to name a few: GPU computing, map/reduce, and bulk synchronous parallel) and their pros and cons.
(Theory: one lecture per week. For the practical assignments, students are divided into groups (2-3 students per group). Each group is assigned a project that must be completed by the end of the course. Weekly meetings are organized to discuss progress on the ongoing projects.)

(Written exam 50%. Project 50% (report and presentation).) 

