EEB 384K AI IN MOL BIO AND BIOCHEM
The BioML Seminar Series, launched by the BioML Society, has now been formalized into a graduate course at UT Austin. Taught by Dr. Claus Wilke, the course provides structured training in machine learning methods for biological data.
Topics covered include:
Machine Learning Basics
Model Training Tutorials
Various Network Architectures
How to apply Machine Learning in Biology and Biochemistry
For Fall 2025, you can sign up with Unique #56914 (EEB 384K), meeting on Monday/Wednesday (9am to 10:30am) in WEL 2.306.
Fall 2023 Seminar Course:
Machine Learning for Biochemical Applications
The BioML Society hosted an 8-week seminar course designed to be your essential primer for the latest in Bioinformatics and Machine Learning (ML) tools! Explore the forefront of scientific innovation as we delve into the world of state-of-the-art ML-based biology tools, uncovering their potential to revolutionize your research.
Each class was taught by current graduate student members of the BioML Society and consisted of a 30-minute lecture followed by a 30-minute in-depth Q&A and discussion. The course included homework assignments in which you learned to use and build on these tools. You did not need to know how to code, though coding experience was recommended to get the most out of the course.
The seminar is now over. Thank you for your interest!
Original Student-led Syllabus
-
Introduction to course
High-level ML overview
Architectures
Data representations for biology & chemistry
Limitations & scope of models
Lecture 1 slides
Lecture 1 video
-
Python, IDE, basic coding tools
Example problem using scikit-learn
AI coding assistants (Copilot, ChatGPT)
Lecture 2 slides
Lecture 2 video
Resources:
-
Classification
Regression
Evaluation Metrics
Over/Underfitting
Cross Validation
-
What are embeddings?
Embedding-based search
Classification & regression on embeddings
Transfer learning
Pre-training & fine-tuning example
-
Can we predict aroma?
Transformers
BERT
Attention visualization
Lecture 4 slides
Lecture 4 video
Useful links:
The Illustrated Transformer
Deconstructing BERT
-
Recent history of language & foundation models
Revisiting transformers
Applications of language models to genomics
-
Protein folding motivation & challenges
AlphaFold
RoseTTAFold
ESM & ESMFold
ColabFold
-
Why protein design? (and how?)
Point mutation models (ESM, MutCompute, Stability Oracle)
Span redesign models (ESM, ProteinMPNN)
De novo design (RFdiffusion, ProteinMPNN)
Recent breakthroughs in structure prediction and design (RFAA, AlphaFoldAA)
Limitations