Massive datasets are now common and require scalable analysis tools. Machine learning provides such tools and is widely used for modelling problems across many fields, including artificial intelligence, bioinformatics, finance, marketing, education, transportation, and health. In this context, we study how standard machine learning models for supervised learning (classification, regression) and unsupervised learning (for example, clustering and topic modelling) can be scaled to massive datasets using modern computational techniques (for example, computer clusters). In addition, we discuss recent models for recommender systems and for decision making (including multi-armed bandits and reinforcement learning).
Through a course project, students will have the opportunity to gain practical experience analysing datasets from their field(s) of interest. Some familiarity with computer programming is expected.