This is the second post in a series about the ways that machine learning (ML) has not changed. (The first post is here.) Not that ML is the same in every way, but that key principles from classical ML are still important today. This one is a bit of a koan: the thing that has not changed is that you should expect change, and understand that change in terms of ML’s purpose in your system.
Know why ML is involved
When developing a product feature based on machine learning, it’s important to remember that it is a product feature first, then a software system that provides that feature, and finally an ML-based system. The primary goal—its reason for existence—is to do something useful for your end users, in the socio-technical context of everything else your users are using, understanding, and doing. Some software system is necessary to deliver that feature, but it might not be necessary to use any ML model at all, especially in the larval stages of exploratory user research and system prototyping, or in baseline systems used for software testing, system evaluation, and data development. If models are necessary, they are just parts of that software system body, and a given part might not be the same class of model as you would use at a different stage or for another purpose.
A system has many parts, and each part has its function
A function that you believe will need a big model or a custom model might be better satisfied by deterministic rules, a random function, or a much simpler model, while you are focusing development on the overall software system, including metrics that can guide you towards solving your users’ real needs. On the other hand, you might want to use a very large custom model or a complicated pipeline in a slow offline system that you use as a silver labeler in data development, as a teacher model for a smaller faster production model, or as a foil in system evaluation and error analysis. So think carefully about why you need ML in the system, and for each ML role in the system, consider what trade-offs are preferred among quality, latency, reliability, usability, etc.
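One way to keep a stand-in swappable for a real model later is to have all candidates share a common interface. Below is a minimal sketch under assumed names (`SpamScorer`, `RuleScorer`, `RandomScorer`, `route_message` are all illustrative, not from the post): a deterministic rule baseline and a random baseline that the rest of the system can use interchangeably with a future learned model.

```python
import random
from typing import Protocol


class SpamScorer(Protocol):
    """Anything that scores a message (hypothetical interface)."""

    def score(self, text: str) -> float: ...


class RuleScorer:
    """Deterministic baseline: flag messages containing known spam phrases."""

    SPAM_PHRASES = ("free money", "act now", "winner")

    def score(self, text: str) -> float:
        lowered = text.lower()
        return 1.0 if any(p in lowered for p in self.SPAM_PHRASES) else 0.0


class RandomScorer:
    """Random baseline: a useful floor in evaluation, and a cheap stand-in
    for load testing before any model exists."""

    def __init__(self, seed: int = 0) -> None:
        self.rng = random.Random(seed)

    def score(self, text: str) -> float:
        return self.rng.random()


def route_message(scorer: SpamScorer, text: str, threshold: float = 0.5) -> str:
    """The surrounding system depends only on the interface, not the model."""
    return "spam" if scorer.score(text) >= threshold else "inbox"
```

With this shape, replacing `RuleScorer` with a trained model later is a one-line change at the call site, and the baselines remain available for evaluation and error analysis.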
Don’t let the tail wag the frog
A system with ML components requires considerations beyond those of a deterministic system, in addition to them rather than instead of them. The non-deterministic components need their own evaluations, and the system as a whole needs end-to-end evaluation based on statistical design, beyond the end-to-end testing of logic and data formats. Resource planning and load testing will also need to be tailored to the constraints of ML systems. But that is all on top of a foundation of good software development: early exploratory user research and technical feasibility studies feed into careful designs, deterministic components are developed iteratively with classical testing, and feedback from software profiling, usability tests, and trials guides revisions to the system. These adjustments to the architecture and development processes are for the sake of the product, for the users’ sake, not for the sake of machine learning as a goal, much less a particular class of model as a goal.
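As a sketch of what “end-to-end evaluation based on statistical design” can look like in practice, here is one common technique, a paired bootstrap over per-example scores from two system variants evaluated on the same inputs. The function name and setup are illustrative, not from the post.

```python
import random


def paired_bootstrap_delta(scores_a, scores_b, n_resamples=10_000, seed=0):
    """Estimate the mean per-example score difference between two system
    variants, with a 95% bootstrap confidence interval.

    Pairing by example controls for input difficulty, which makes the
    comparison far more sensitive than comparing two overall averages.
    """
    assert len(scores_a) == len(scores_b), "scores must be paired by example"
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)

    deltas = []
    for _ in range(n_resamples):
        # Resample examples with replacement and record the mean difference.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        deltas.append(sum(sample) / n)
    deltas.sort()

    lo = deltas[int(0.025 * n_resamples)]
    hi = deltas[int(0.975 * n_resamples)]
    return sum(diffs) / n, (lo, hi)
```

If the confidence interval excludes zero, the difference between the two variants is unlikely to be an artifact of which examples happened to land in the evaluation set.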
The only constant is change
As development progresses, the system will mature, your own understanding of the problem space will mature, and your data development efforts will expand and enrich your datasets. Besides the planned upgrades to models with better quality/latency/cost profiles, you may well find that initial prototypes were focused on the wrong problem, or that you need to replace pipelines with end-to-end models for quality reasons, or replace end-to-end models with pipelines for the sake of reliability, speed, or cost. You’ll need to find ways to adjust assumptions while still allowing contrastive evaluations and error analysis.
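One small habit that keeps contrastive evaluation possible across these swaps is recording which system and dataset versions produced each set of results. A minimal sketch, with hypothetical names and file layout:

```python
import datetime
import json


def log_eval_run(system_version, dataset_version, metrics, path="eval_log.jsonl"):
    """Append one evaluation record to a JSON-lines log so that results from
    different system generations (pipeline vs. end-to-end model, etc.)
    remain comparable for later contrastive analysis."""
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "system_version": system_version,
        "dataset_version": dataset_version,
        "metrics": metrics,
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

The point is not the logging mechanics but the discipline: when the architecture changes out from under you, paired records like these are what let you ask whether the new system is actually better on the same data.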