Course announcement - Machine Learning Systems Design at Stanford!
Update:
- The course website is up and contains the latest syllabus, lecture notes, and slides.
- The course has been adapted into the book Designing Machine Learning Systems (O’Reilly 2022).
Ever since teaching TensorFlow for Deep Learning Research, I’ve known that I love teaching and want to do it again.
In early 2019, I started talking with Stanford’s CS department about the possibility of coming back to teach. After almost two years in development, the course has finally taken shape. I’m excited to let you know that I’ll be teaching CS 329S: Machine Learning Systems Design at Stanford in January 2021.
The course wouldn’t have been possible without the help of many people, including Christopher Ré, Jerry Cain, Mehran Sahami, Michele Catasta, and Mykel J. Kochenderfer.
Here’s a short description of the course. You can find the (tentative) syllabus below.
This project-based course covers the iterative process for designing, developing, and deploying machine learning systems. It focuses on systems that require massive datasets and compute resources, such as large neural networks. Students will learn about the different layers of the data pipeline; approaches to model selection, training, and scaling; and how to deploy, monitor, and maintain ML systems. In the process, students will learn about important issues including privacy, fairness, and security.
Prerequisites: At least one of the following: CS229, CS230, CS231N, CS224N, or equivalent. Students should have a good understanding of machine learning algorithms and should be familiar with at least one framework such as TensorFlow, PyTorch, or JAX.
If you’re a Stanford student interested in taking the course, you can fill in the application here. Students will be evaluated based on one final project (at least 50%), three short assignments, and class participation.
For those outside Stanford, I’ll try to make as much of the course materials available as possible. I’ll post updates about the course on Twitter, or you can check back here from time to time.
Since these are all new materials, I’m hoping to get early feedback. If you’re interested in becoming a reviewer for the course materials, please shoot me an email. Thank you!
Tentative syllabus
Week 1: Overview of machine learning systems design
- When to use ML
- ML in research vs. ML in production
- ML systems vs. traditional software
- ML production myths
- ML applications
- Case studies
Week 2: Iterative process
- Principles of a good ML system
- Iterative process
- Scoping the project
Week 3: Data management
- Challenges of real-world data
- How to collect, store, and handle massive data
- Different layers of the data pipeline
- Data processor & monitor
- Data controller
- Data storage
- Data ingestion: database engines
Week 4: Creating training datasets
- Feature engineering
- Data labeling
- Data leakage
- Data partitioning, slicing, and sampling
Week 5: Building and training machine learning models
- Baselines
- Model selection
- Training, debugging, and experiment tracking
- Distributed training
- Evaluation and benchmarking
- AutoML
Week 6: Deployment
- Inference constraints
- Model compression and optimization
- Training vs. serving skew
- Concept drift
- Server-side ML vs. client-side ML
- Releasing strategies
- Deployment evaluation
Week 7: Project milestone and discussion
- Ethical concerns
Week 8: Monitoring and maintenance
- What to monitor
- Metrics, logging, tags, alerts
- Updates and rollbacks
- Iterative improvement
Week 9: Hardware & infrastructure
- Architectural choices
- Hardware design
- Edge devices
- Clouds vs. private data centers
- Future of high-performance computing
Week 10: Integrating ML into business
- Model performance vs. business goals vs. user experience
- Team structure
- Why ML projects fail
- Best practices
- State of ML production
This blog post was edited by the wonderful Andrey Kurenkov.
I want to devote a lot of my time to learning. I’m hoping to find a group of people with similar interests and learn together. Here are some of the topics that I want to learn:
- How to bring machine learning to browsers
- Online predictions and online learning for machine learning
- MLOps in general
If you want to learn any of the above topics, join our Discord chat. We’ll be sharing learning resources and strategies. We might even host learning sessions and discussions if there’s interest. Serious learners only!