Indiana University Bloomington

Informatics I529
Machine Learning In Bioinformatics

Contact: Haixu Tang
Offered: Spring, Every Year
Capacity: 35
Pre-Requisites: INFO-I519 or equivalent.
Algebra Required: Basics.
Calculus Required: Basics.
Contact Person for Authorization: Haixu Tang
Instructor: Haixu Tang
Days Per Week Offered: Two lectures and one lab every week
Syllabus: No Syllabus Available
Keywords: Bioinformatics, hidden Markov model, Bayesian network, Expectation-Maximization algorithm, MCMC.
Description: Machine learning techniques have been successful in analyzing biological data because of their ability to handle noisy data and to generalize. In this class, we will learn the basics of probabilistic models and machine learning techniques. We will focus on probabilistic models (Markov models, hidden Markov models, and Bayesian networks) for biological sequence analysis and systems biology. Other machine learning techniques, such as Naive Bayes, neural networks, and support vector machines, will be covered only briefly.
Books: Richard Durbin, Sean R. Eddy, Anders Krogh, and Graeme Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1999.
Applied/Theoretical: Balanced.
Formal Computing Lab: Yes
Software: C/C++, Python
How Software is Used: For data analysis and visualization.
Problem Sets: 5 homework assignments, including 3-4 programming assignments.
Data Analysis: Several small projects and one group project requiring analysis of large biological datasets.
Presentation: Required for the final group project.
Exams: One midterm and one final.
Comments: Students are required to implement algorithms in C/C++ or Python.