2.1.1.1 Technical skills

  1. Software engineering. As ML models often require extensive engineering to train and deploy, it’s important to have a good understanding of engineering principles. Aspects of computer science that are more relevant to ML include algorithms, data structures, time/space complexity, and scalability. You should be comfortable with the usual suspects: Python, Jupyter Notebook or Google Colab, NumPy, scikit-learn[19], and a deep learning framework. Knowing at least one performance-oriented language such as C++ or Go can come in handy. BestPracticer has an interesting list of the engineering skills needed at different levels.
  2. Data cleaning, analytics, and visualization. Data handling is important yet often overlooked in ML education. It’s a huge bonus when a candidate knows how to collect, explore, and clean data, as well as how to create training datasets. You should be comfortable with dataframe manipulation (pandas, dask) and data visualization (seaborn, altair, matplotlib, etc.). SQL is popular for relational databases and R for data analysis. Familiarity with distributed toolkits like Spark and Hadoop is also very useful.
  3. Machine learning knowledge. You should understand ML beyond citing buzzwords. Ideally, you should be able to explain every architectural choice you make. You might not need this understanding if all you do is clone an existing open-source implementation and it runs flawlessly on your data. But models seldom run flawlessly, so you need this understanding to evaluate potential solutions and debug your models.
  4. Domain-specific knowledge. You should have knowledge relevant to the products of the company you’re interviewing with. If it’s in the autonomous vehicle space, you’re probably expected to know computer vision techniques as well as computer vision tasks such as object detection, image segmentation, and motion analysis. If the company builds speech recognition systems, you should know about mel-filterbank features, CTC loss, and common benchmark datasets for speech recognition.
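
To give a flavor of the domain-specific knowledge mentioned in the last item, here is a minimal sketch of the mel scale that underlies mel-filterbank features. It assumes the commonly used formula m = 2595 · log10(1 + f / 700); other variants exist, and speech toolkits differ in which one they use. The function names (`hz_to_mel`, `mel_to_hz`, `mel_center_frequencies`) are made up for this illustration and do not come from any particular library.

```python
import math


def hz_to_mel(hz: float) -> float:
    """Convert a frequency in Hz to mels (assumed formula: 2595 * log10(1 + f / 700))."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters: int, f_min: float, f_max: float) -> list[float]:
    """Return filter center frequencies (in Hz) that are evenly spaced on the mel scale.

    Hypothetical helper: a full filterbank would also build the triangular
    filters around these centers and apply them to a power spectrum.
    """
    mel_lo, mel_hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (mel_hi - mel_lo) / (n_filters + 1)
    return [mel_to_hz(mel_lo + step * i) for i in range(1, n_filters + 1)]


if __name__ == "__main__":
    # Centers are dense at low frequencies and sparse at high frequencies.
    for center in mel_center_frequencies(n_filters=10, f_min=0.0, f_max=8000.0):
        print(f"{center:8.1f} Hz")
```

Because the centers are evenly spaced in mels rather than in Hz, the gaps between them widen as frequency increases, which roughly mirrors how human hearing resolves pitch. Being able to explain a detail like this is the kind of understanding an interviewer at a speech company might probe.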

19: As of 2019, scikit-learn is as popular as TensorFlow.
