Resources
A lot of my writing is on this blog, but the rest is scattered across the Internet. Here is some of my work, organized by topic.
Machine learning in production
- [Book] Designing Machine Learning Systems (O’Reilly, June 2022)
- [Booklet] Machine learning systems design
- [Video] Machine learning production myths (Stanford’s MLSys Seminars)
- [CS 329S lecture] Introduction to machine learning in production
- [CS 329S lecture] Data system fundamentals for data scientists
- [CS 329S lecture] Creating training data: sampling, labeling, handling class imbalance, data augmentation
- [CS 329S lecture] Feature engineering
I track hundreds of machine learning and MLOps tools, and update the list as I discover new tools or as existing tools pivot. You can find the analyses (accompanied by a list of tools) of previous versions here:
Tutorials
- [Code] Python-is-cool: Cool Python features for machine learning that I used to be too afraid to use.
- [Code] just-pandas-things: Pandas quirks that used to traumatize me.
- [Code] Stanford’s TensorFlow tutorials.
Machine learning interviews
- [Free & open-sourced book] Machine Learning Interviews Book
- [Twitter thread] The ML interviews process
- [Post] Analysis of compensation, level, and experience details of 19k tech workers
- [Post] What Glassdoor interview reviews reveal about tech hiring cultures
- [Video] Chip Huyen on Machine Learning Interviews (Full Stack Deep Learning)
- [Code] Coding exercises and solutions for coding interviews
Courses
I’ve created and taught two courses on machine learning at Stanford. The slides and notes can be found on the courses’ websites.
- CS 20: TensorFlow for Deep Learning Research (2017, 2018)
- CS 329S: Machine Learning Systems Design (ongoing)