Indiana University Bloomington

Computer Science B555
Machine Learning

Contact: Martha White
Offered: Spring, 2016
Class Time: 09:30am-10:45am
Class Days: Tu, Th
Capacity: 50
Pre-Requisites: None.
Algebra Required: Basics: vector spaces, matrices, linear independence, solving linear systems.
Calculus Required: Basics: discrete and continuous functions, differentiation, integration.
Instructor: Christopher Raphael
Days Per Week Offered: Two.
Website: http://homes.soic.indiana.edu/martha/teaching.html
Keywords: Machine Learning, statistical inference, classification, regression, distribution learning.
Description: The course objective is to study the theory and practice of constructing algorithms that learn functions and choose optimal decisions from data and experience. Machine learning is a field whose goals overlap with those of several other disciplines, in particular statistics, algorithms, engineering, and optimization theory. It also has wide applications in a number of scientific areas such as finance, the life sciences, the social sciences, and medicine. The class will cover the theoretical foundations of machine learning and also provide examples from classification, regression, and statistical distribution learning. This is a core Computer Science course.

The course covers about 75% of the following topics, depending on the year:
  • mathematical foundations of machine learning (random variables and probabilities, probability distributions, high-dimensional spaces)
  • overview of machine learning (supervised, semi-supervised, unsupervised learning, inductive and transductive frameworks)
  • classification algorithms: linear and non-linear algorithms (logistic regression, naive Bayes, decision trees, neural networks, support vector machines)
  • regression algorithms (least squares linear regression, neural networks, relevance vector machines, regression trees)
  • density estimation (expectation-maximization algorithm, kernel-based density estimation)
  • kernel methods (dual representations, RBF networks)
  • graphical models (Bayesian networks, Markov random fields, inference)
  • ensemble methods (bagging, boosting, random forests)
  • practical aspects in machine learning (data preprocessing, overfitting, accuracy estimation, parameter and model selection)
  • special topics (introduction to PAC learning, sample selection bias, learning from graph data, learning from sequential data)
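As a small taste of the material, least squares linear regression (one of the regression topics above) can be sketched in a few lines of NumPy. The data below is synthetic and the variable names are illustrative, not taken from course materials:

```python
import numpy as np

# Synthetic data: y = 2x + 1 plus Gaussian noise (illustrative only)
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, size=50)

# Design matrix with a bias column, then solve the least squares problem
X = np.column_stack([x, np.ones_like(x)])
w, *_ = np.linalg.lstsq(X, y, rcond=None)

print(w)  # slope and intercept, close to [2.0, 1.0]
```

The same computation is a one-liner in MATLAB (`w = X \ y`), which is why both environments are used for demos in courses like this one.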
Books:
  Recommended textbook:
    Pattern Recognition and Machine Learning, by C. M. Bishop, Springer, 2006.

  Additional materials:
    Machine Learning, by Tom M. Mitchell, McGraw-Hill, 1997.
    The Elements of Statistical Learning, by T. Hastie, R. Tibshirani, and J. Friedman, 2009.
Applied/Theoretical: Balanced.
Formal Computing Lab: No.
Software: MATLAB, Python.
How Software is Used: To provide illustrative demos.
Problem Sets: Four homework assignments and five thought questions.
Data Analysis: Basic implementation and analysis of methods taught in class.
Presentation: Traditional whiteboard; PowerPoint when needed; demos.
Exams: Final exam (final week).
Comments: Instructor's code and demos are in MATLAB. Homework assignments will involve programming, but students are not required to use MATLAB.