After 3.5 years struggling, I’ve finally graduated with a bachelor’s degree and a master’s degree in Computer Science (CS), Artificial Intelligence track. I completed a total of 228 units, including 36 CS courses. I guess you can say that I know CS courses at Stanford pretty well.
Through teaching and studying, I’ve had quite a few people asking me for advice on what classes to take. I realized that there’s a big gap between courses descriptions and reality. This post is an attempt to bridge the gap.
In this post, you will find reviews + related materials on many AI courses at Stanford. I rank each course on difficulty from 1 being the easiest to 5 being the hardest. The difficulty level is based on how much time I spent on the course and how well I did in it.
I also take the liberty to include some advice on how to navigate the CS major and a suggested 4-year plan for people interested in majoring in CS, AI track. Everything is this post is from my personal viewpoint and probably biased. It is NOT endorsed by my university or my employer. This list will be updated as I have more time.
Is it difficult to study CS at Stanford?
Stanford offers a strong liberal arts education, which means that you have a lot of freedom in designing your coursework. There are about 4000 courses offered each quarter for you to choose from. There are many different ways to fulfill a certain requirement and therefore, you can make your study as hard as you want.
I know quite a few people who deliberately take courses below their level to get easy A’s. I also know people who make bad life choices and end up getting wrecked every quarter. When I say the latter, I mean me.
How much time did you spend studying?
If you just want to finish your bachelor’s in four years and get a decent software engineering job, you can probably do just fine studying for about 40 hrs/wk. If you want to learn as much as you can, squeeze both your bachelor’s and master’s degree into 4 years, or get into a respectable PhD program, there is no upper bound on how much time you should be studying.
In my experience, if you’re a college student living on campus, you can easily work up to 80 hrs/wk while still maintaining a decent social life. You don’t have to commute. You don’t have to do grocery shopping, cooking, cleaning up. All your friends live nearby and you could hang out by doing homework together.
What’s the grading like?
In most CS classes I’ve taken, about 30 - 50% of the students get an A. It means that if you get a B+, you’re pretty much average. Few get a C. There were two courses I did very, very poorly in, like failing to complete the most important assignment, but somehow ended up with a B-. I wondered what I had to do to fail.
Of course, there are exceptions. Mykel Kochenderfer said in the first lecture that if you pay attention in class, he’d be happy to give you an A. 86% of students in his class get some form of A.
A lot of people criticize grade inflation at Stanford, but I think it’s nice in a way that it gets rid of the pressure to study for grade. It encouraged me to take difficult classes because I know that if I worked hard, I would make it through. I know it’s upsetting when you do really well in a class but still get an A just like everyone else, but who the hell cares about grades?
What is it like to take graduate courses as an undergrad?
At Stanford, CS courses with code number above 199 are graduate courses. It means that almost all AI courses are graduate-level. I don’t have the official statistics, but my impression is that about 70% of those enrolled in AI courses are graduate students.
A friend told me he was disheartened and discouraged when taking courses with grad students because they seemed to be doing much better than he was. Also, grad students make the curve bad for undergrads, like no matter how hard my friend tried, his grades were always hovering around or below the median.
For me, I don’t care if other people are doing better than me. I’m happy if I can learn something I didn’t know before.
AI course reviews
STATS 202: Data Mining and Analysis
Textbook: An Introduction to Statistical Learning with Applications in R by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani (Springer, 1st ed., 2013)
See course materials
This is probably one of the most underrated AI courses at Stanford. I’ve seen so-called ML people snickering when the course is mentioned because it’s too basic for them. It’s true that it’s basic, but that’s exactly why it’s useful. The course gives a stress-free, mathematical yet intuitive explanation of important concepts such as bias, variance, hypothesis test, p-value, PCA, regularization. You will also be given an introduction to R. I don’t like R either but it’s undeniably useful.
After taking the course, things that I’d studied in other “real” AI classes such as CS229, CS231N suddenly made a lot more sense. I’d highly recommend taking this course before taking other AI courses.
CS109: Introduction to Probability for Computer Scientists
Textbook: A First Course in Probability by Sheldon Ross (Pearson, 9th ed., 2014)
See archived course materials (2016)
You can’t do AI without knowing probability. Everything is a distribution. CS109 gives a comprehensive introduction into variables, variance, covariance, correlation, different distributions, bayesian methods, central theorems. It also dabbled into machine learning with logistic regression, naive bayes, MAP.
I took CS109 in my first winter quarter. I knew nothing about AI back then so I didn’t appreciate the course enough. Looking back, I wish I had studied the materials inside out.
It was unfortunate that I took the course with Ron Dror, who didn’t care much about the class and ended up giving uninspiring lectures. But we found Mehran Sahami’s old lecture videos and they were amazing. I’d highly recommend taking all the classes you can with Mehran Sahami. I’ve never taken a class with Chris Piech but I’ve section led for him once and everyone loved him.
The prerequisite for CS109 is CS103, but you don’t need to take CS103 before taking CS109. Difficulty: 2-4, depending on how intuitive probability is for you.
CS231N: Convolutional Neural Networks for Visual Recognition
Whether you’re into computer vision or not, CS231N will definitely help you become a much better AI researcher/practitioner. CS231N is hands down the best deep learning course I’ve come across. It balances theories with practices. The lecture notes are well written with visualizations and examples that explain well difficult concepts such as backpropagation, gradient descents, losses, regularizations, dropouts, batchnorm, etc. The assignments are fun and relevant. The instructors are all young and brilliant.
Even though the course description includes CS229 as a prerequisite, I think a student would benefit a lot more if they take CS231N BEFORE taking CS229. You’ll also have a much better time in the class if you are familiar with Python and NumPy as there’s a fair amount of coding involved.
CS224N: Natural Language Processing with Deep Learning
This is a must-take course for anyone interested in NLP. It’s been taught by two people that I have a lot of admiration for: Professor Richard Socher and Professor Chris Manning. The course is well organized and well taught. I was so impressed when I took the course and the professor mentioned a paper that was published last week. In Vietnam, we’re only taught things that were published at least a decade ago.
The assignments, while useful, can sometimes be frustrating as training NLP models can take a long time. The final project can be open-ended and fun.
I took the course before I took CS124: From Languages to Information and it was a bad decision. I missed out on a lot of details just because I didn’t know NLP basics. The course doesn’t list CS231N as a prerequisite, but I think you’d do much better in the course if you’ve taken CS231N.
CS229: Machine Learning
Talking about CS229, I’m going to state an unpopular opinion that I didn’t like CS229 that much. The course is ambitious. It aims to cover a lot of things and you’d probably do well if you could work through all the materials, but you’d probably need to drop out of all other classes to even hope to do so in 10 weeks. The lecture notes are dense. The assignments are difficult, but still nowhere as difficult as the real exam. The median the year I took it was 46/100.
The course says that its prerequisites include only linear algebra, basic probability and statistics, but my friend Debnil told me that it’s a lie. You need to be really good with linear algebra since the class is very mathy. If you take the course as your first introduction into machine learning, you’re in for a rough time.
CS221: Artificial Intelligence: Principles and Techniques
This is the class that got me interested in AI. I took it the same quarter I took CS229 and while I was crying my way out of CS229, I had a lot of fun making Pacman running away from the ghosts. The course covers fundamental AI concepts such as regression, clustering, gradient descent, nearest neighbors, path finding as well as basic RL concepts such as Monte Carlo, SARSA, Q-learning, policy/value iteration, etc.
The assignments are interesting, practical, and manageable. The minimax algorithm I learned in the course helped land me an internship offer my sophomore year as I used it to solve a card game.
CS228: Probabilistic Graphical Models: Principles and Techniques
Textbook: Probabilistic Graphical Models: Principles and Techniques by Daphne Koller and Nir Friedman (MIT Press)
See course materials
When I told my friends I was about to take the course, they all looked at me in pity. Its reputation as one of the most difficult courses at Stanford keeps the class size relatively low, 150+ students compared to 700-900 students in most other AI classes.
I struggled in the class but didn’t suffer. I found the materials interesting and important. It forced me to check many assumptions I never knew I was making while building ML models. Unlike most AI courses that introduce small concepts one by one or add one layer on top of another, CS228 tackles AI top down as it asks you to think about the relationships between different variables, how you represent those relationships, what independence you’re assuming, what exactly you’re trying to learn when you say machine learning.
The course changes the way I approach AI. I can’t recommend the course highly enough.
CS234: Reinforcement Learning
It’s surprising that Stanford didn’t have a real RL class until Professor Emma Brunskill joined Stanford in 2017. I took the class the first year it was offered and was a bit disappointed. However, I revisited it this year and noticed that the course was much better organized.
Given that there isn’t an alternative, if you’re interested in learning more about RL, you should definitely take the course. The assignments, in my experience, weren’t difficult but can be frustrating. The final project was hard for me because I didn’t know RL enough to come up with a novel yet still manageable idea, and I think it was the same for my friends. If you’re reluctant about taking the course, I’d recommend taking CS221 and CS238 to get a taste of RL.
CS238: Decision Making under Uncertainty (AA 228)
Textbook: Decision Making Under Uncertainty: Theory and Application by Mykel J. Kochenderfer et al. (MIT Lincoln Laboratory Series)
See course materials
This is an enjoyable class taught by Professor Mykel Kochenderfer, the kindest professor I’ve met at Stanford. That says a lot since 99% of the professors here are super nice.
The class covers a lot of important concepts that doesn’t require you to go deep in any of them. It covers a bit of many other AI classes such as representation, learning (CS228), model-free/model-based RL, Markov decision process (CS221, CS234), as well as new materials such as decision theory and POMDP.
Mykel is the only CS professor I know who does freestyle teaching. He doesn’t use slides. He has a notebook that he writes on as he teaches. He has a great sense of humor which makes learning fun. His favorite book is A Million Random Digits with 100,000 Normal Deviates.
CS246: Mining Massive Data Sets
Textbook: Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman, Jeff Ullman (Cambridge University Press)
See course materials
I’ve taken two classes with Professor Jure Leskovec and each time I was in awe of what an amazing teacher he is. As the name suggests, the course covers algorithms and techniques that we can use when handling large datasets, which will definitely come in handy when we do AI.
I thought the materials were fun, but a friend told me he was bored for a couple of weeks because the materials for those week weren’t challenging enough. The course has some math, some algorithms, and some coding.
Some generic advice
- Learn to use LaTeX early.
- Complete your 4-year plan as early as possible, preferably in your first year.
- Get the required courses out of the way early while your friends are still taking them. They become more annoying when you are the only upperclassman in the class.
- Find a partner if the course allows. I wouldn’t have been able to complete my coursework without partners.
- Take advantage of office hours.
- Prerequisites are for reference only.
- Don’t compete with other people since there will always be someone smarter than you at Stanford. Focus on how much you learn.
- Don’t overload yourself with more than 2 difficult courses per quarter. A good schedule is to take 2-3 easy/medium courses with 2 difficult courses a quarter.
- Don’t take classes for easy A’s. Sure, once in a while it’s nice to relax, but it’s lame take all the classes because you want immaculate GPA.
- Time management is everything.
A suggested schedule
Below is the schedule I wish I had followed now that I have the benefit of hindsight. I suggest that any student interested in AI takes as many math classes as you can, especially in linear algebra, probability, and optimization.
Winter: CS106B/X, CS109
Spring: CS103, CS107
Fall: CS221, CS131, STATS 202
Winter: CS124, CS161
Spring: CS231N, CS110
Winter: CS228, CS224N
Winter: CS246, CS234
Spring: Camp Stanford
Some people asked how I did it in 3.5 years, so I’m attaching my 4-year plan. James just told me it looks repulsive. True.