1.1.3.4 Other technical roles in ML production

There are many other technical roles in the ML ecosystem. Many of them don’t require ML knowledge at all; for example, you can build tools and infrastructure for ML without having to know what a neural network is (though knowing might help with your job). Examples include framework engineers at NVIDIA who work on CUDA optimization, people who work on TensorFlow at Google, and AWS platform engineers at Amazon.

ML infrastructure engineer, ML platform engineer

Because ML is resource-intensive, it relies on infrastructure that can scale. Companies with mature ML pipelines often have dedicated infrastructure teams to build out this infrastructure. Valuable skills for ML infrastructure/platform engineers include familiarity with parallelism, distributed computing, and low-level optimization.

These skills are hard to learn and take time to master, so companies prefer hiring engineers who already have them and training those engineers in ML. If you are one of the rare people who know both systems and ML, you’ll be in demand.

ML accelerator/hardware engineer

Hardware is a major bottleneck for ML. Many ML algorithms are constrained by processors that can’t do computation fast enough, don’t have enough memory to store and load data and models, aren’t cheap enough to run experiments at scale, or don’t have enough power to run applications on device.

There has been an explosion of companies building hardware for both training and serving ML models, and for both cloud computing and edge computing. These hardware companies need people with ML expertise to guide their processor development: to decide which ML models to focus on, then implement, optimize, and benchmark these models on their hardware. More and more hardware companies are also looking into using ML algorithms to improve their chip design process [11, 12].

ML solutions architect

This role is often seen at companies that provide ML services and/or products to other companies. Because each customer has its own use cases and unique requirements, this role involves working with existing or potential customers to figure out whether your service or product can help with their use case and, if so, how.

Developer advocate, developer programs engineer

You might have seen developer relations (devrel) engineer roles for ML, such as developer advocate and developer programs engineer. These roles bridge communication between the people who build ML products and the developers who use those products. The exact responsibilities vary from company to company and role to role, but you can expect some combination of being among the first users of ML products, writing tutorials, giving talks, and collecting and addressing feedback from the community. Products like TensorFlow and AWS owe part of their popularity to the tireless work of their excellent devrel engineers.

Previously, these roles were only seen at major companies. However, many machine learning startups now follow the open-core business model -- open-sourcing the core or a feature-limited version of their products while offering commercial versions as proprietary software. Because these startups need to build and maintain good relationships with the developer community, devrel roles are crucial to their success. These roles are usually very hard to fill, as it’s rare to find great engineers who also have great communication skills. If you’re an engineer interested in more interaction with the community, you might want to consider this option.

11: Chip Design with Deep Reinforcement Learning (Google AI blog, 2020)

12: Accelerating Chip Design with Machine Learning (NVIDIA research blog, 2020)


