ADVANCED STATISTICAL LEARNING
2018/2019, Semester 2
Saw Swee Hock School of Public Health
Modular Credits: 4
This module will be taught on Tuesday and Friday mornings from 10am to 12 noon. Classes will commence on 15 January 2019.
For more information on topics and venues, please check the module schedule uploaded to the IVLE Files. Any changes will be reflected in that schedule.
This module introduces advanced topics in the analysis of large or complex datasets, with particular emphasis on biomedical data. We will cover fundamental machine learning techniques, with emphasis on both computing and data analysis. Topics include regression and classification, resampling-based techniques for evaluating performance, variable selection, tree-based methods for regression and classification, support vector machines, unsupervised clustering methods and factor analysis, neural networks, and deep learning based on neural networks.
The first part of the course is a mixture of sage-on-stage and flipped classroom learning. Substantial reading material is assigned each week, a lecture on Tuesday gives an overview of that material, and a computer practical on Friday has students, either individually or in groups, tackle problem sets provided by the instructor, with support from the instructor and other students.
The second part of the course introduces the basics of Bayesian statistics and Monte Carlo simulation. It will also introduce various models used in the simulations. Reading materials and programming examples (R scripts) will be distributed after each lecture. Datasets for the course project will be provided after the first lecture.
The third part of the course is an introduction to deep neural networks. Students will learn the fundamental concepts behind deep neural networks and their real-world applications.
Students interested in this module should have a background in statistics.
Problem-based learning involving the analysis of large biomedical and epidemiological datasets
Two project assessments which involve data analysis and report writing
Upon completion of this course, you will be able to:
Understand the basic machine learning models;
Understand the conceptual difference between Bayesian and Frequentist statistics;
Program simulation based routines for performing Bayesian inference;
Program Markov chain Monte Carlo algorithms in the R language and either JAGS or Stan;
Understand the use of importance sampling and hierarchical modelling;
Understand the concepts behind deep neural networks;
Implement neural networks for various applications;
Apply a variety of machine learning algorithms to perform data mining, inference, and prediction.
Total for CA
When a student is unable to attend the required sessions, an excuse may be granted for a limited period upon production of evidence of illness, misadventure, or an approved leave of absence.
Students must inform the Education Office if any of the above has taken place.
Failure to meet attendance requirements will affect module grading.