One of the most common mistakes when building software that involves data is reaching for machine learning when a simpler solution exists. ML is powerful for specific problems, but it is also slow to build, requires significant data, fails in unpredictable ways, and is hard to debug. Before starting any ML project, the question to ask is: can I write a clear heuristic for this? If yes, write the heuristic first.
This post gives you a concrete decision framework for when to use ML and when not to, with real examples of problems that look like they need ML but do not.
The Core Question: Can You Write a Rule?
Machine learning learns rules from data. If you can write the rules yourself, you do not need ML.
This sounds obvious, but it is violated constantly in practice. Here are problems people frequently reach for ML to solve, along with the simpler solutions that actually work.
"Is this email from our domain?" This looks like a classification problem. It is a string check.
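
    def is_company_email(email: str) -> bool:
        # Domain names are case-insensitive, so normalize before comparing.
        return email.strip().lower().endswith("@yourcompany.com")

One line of logic. Zero training data. Zero model maintenance. Negligible latency. Simple enough to test exhaustively.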
"Show me recent orders." This looks like a personalization or recommendation problem. It is a database query.

    SELECT *
    FROM orders
    WHERE user_id = $1
    ORDER BY created_at DESC
    LIMIT 20;

No ML, no embedding, no recommendation engine. Just a query sorted by date.
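For readers who want to run the query from application code, here is a minimal sketch using Python's built-in sqlite3 module. The table schema and the `recent_orders` helper are assumptions made for illustration, and sqlite3 uses `?` placeholders where the query above uses `$1`.

```python
import sqlite3


def recent_orders(conn: sqlite3.Connection, user_id: int, limit: int = 20) -> list:
    """Return one user's most recent orders, newest first."""
    # Parameterized query: values are passed separately, never formatted into the SQL.
    cursor = conn.execute(
        "SELECT id, created_at FROM orders "
        "WHERE user_id = ? "
        "ORDER BY created_at DESC "
        "LIMIT ?",
        (user_id, limit),
    )
    return cursor.fetchall()


# Demo on an in-memory database with a made-up schema (an assumption, not the post's).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO orders (user_id, created_at) VALUES (?, ?)",
    [(1, "2024-01-01"), (1, "2024-03-01"), (2, "2024-02-01")],
)
rows = recent_orders(conn, user_id=1)
```

Storing timestamps as ISO 8601 strings keeps the `ORDER BY` correct here, because those strings sort in chronological order.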
"Is this a valid phone number?" This looks like a validation problem that might need ML to handle international formats. It is a regex combined with a library.
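
    import phonenumbers

    def is_valid_phone(number: str, region: str = "US") -> bool:
        try:
            # region is only used when the number has no leading "+" country code.
            parsed = phonenumbers.parse(number, region)
            return phonenumbers.is_valid_number(parsed)
        except phonenumbers.NumberParseException:
            # Raised when the input cannot be parsed as a phone number at all.
            return False

The phonenumbers library, a Python port of Google's libphonenumber, ships numbering metadata for each region. No training data, no model, and its maintained per-country rules cover edge cases that a learned model would have to rediscover from examples.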
"Classify this support ticket into: billing, technical issue, or feature request." This looks like an NLP classification problem. If the categories are broad and you have a few dozen keywords per category, a keyword-matching approach often works well enough.

    BILLING_KEYWORDS = {"invoice", "charge", "payment", "refund", "billing", "subscription"}
    TECHNICAL_KEYWORDS = {"error", "bug", "crash", "broken", "not working", "fails"}
    FEATURE_KEYWORDS = {"feature", "request", "add", "would like", "suggestion", "idea"}

    def classify_ticket(text: str) -> str:
        text_lower = text.lower()
        # Substring matching: "add" also matches "address", so keep keywords specific.
        scores = {
            "billing": sum(1 for kw in BILLING_KEYWORDS if kw in text_lower),
            "technical": sum(1 for kw in TECHNICAL_KEYWORDS if kw in text_lower),
            "feature": sum(1 for kw in FEATURE_KEYWORDS if kw in text_lower),
        }
        # Ties, including zero matches everywhere, resolve to the first key ("billing").
        return max(scores, key=scores.get)

This is not perfect. ML classification would be more accurate on edge cases. But if 85% accuracy is sufficient for your use case, and you can build this in an hour versus an ML solution in a week, write this first. Measure the accuracy. Only upgrade to ML if the accuracy is actually insufficient.
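Measuring the accuracy can be as small as the sketch below. It assumes you have hand-labeled a sample of tickets; the `measure_accuracy` helper and the three-item `SAMPLE` are hypothetical names and data made up for illustration, and a real evaluation would use a few hundred real tickets.

```python
from typing import Callable, Iterable, Tuple


def measure_accuracy(
    classify: Callable[[str], str],
    labeled_tickets: Iterable[Tuple[str, str]],
) -> float:
    """Return the fraction of tickets whose predicted label matches the hand label."""
    total = 0
    correct = 0
    for text, expected in labeled_tickets:
        total += 1
        if classify(text) == expected:
            correct += 1
    if total == 0:
        raise ValueError("need at least one labeled ticket")
    return correct / total


# Hypothetical hand-labeled sample, for illustration only.
SAMPLE = [
    ("I was charged twice, please refund me", "billing"),
    ("The app shows an error on login", "technical"),
    ("It would be great to have dark mode", "feature"),
]

# Usage, with any classifier that maps text to a label:
#   accuracy = measure_accuracy(classify_ticket, SAMPLE)
```

Because the function takes the classifier as an argument, the same labeled sample can later score an ML model, which makes the comparison against the heuristic direct.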
The Decision Tree for Using ML
Work through this in order before starting any ML project:
1. Can you write an exact rule? String matching, regex, date comparison, threshold comparison. If yes, write the rule.
2. Can you write a heuristic that is good enough? Keyword scoring, weighted rules, decision trees you write by hand. If this meets your accuracy requirement, use it.
3. Is your problem actually about pattern recognition in unstructured data? Images, audio, natural language beyond simple classification, handwriting. If yes, ML is likely the right tool. If no, reconsider.
4. Do you have enough labeled data? As a rough guide: simple classification with few classes needs at minimum a few hundred labeled examples, ideally thousands. Complex tasks need tens of thousands. If you do not have data and it would take months to collect, ML may not be feasible.
5. Have you tried a pre-trained model or an LLM prompt first? For many NLP classification tasks, a simple GPT-4o prompt ("classify this text into one of these categories") can match or outperform a custom-trained model, and it requires zero training data. Try the LLM approach before training a model.
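The checklist above can be sketched as a small function. This is an illustration only: the `ProblemProfile` fields and the returned strings are names invented here, and it makes one interpretive choice, namely that the labeled-data check only gates custom training, since a pre-trained model or LLM prompt needs no training data.

```python
from dataclasses import dataclass


@dataclass
class ProblemProfile:
    """Honest answers to the five questions (field names are hypothetical)."""
    exact_rule_possible: bool
    heuristic_meets_accuracy: bool
    unstructured_pattern_recognition: bool
    has_enough_labeled_data: bool
    pretrained_or_llm_good_enough: bool


def recommend_approach(p: ProblemProfile) -> str:
    # Step 1: an exact rule beats everything else.
    if p.exact_rule_possible:
        return "write the rule"
    # Step 2: a hand-written heuristic that meets the accuracy requirement.
    if p.heuristic_meets_accuracy:
        return "use the heuristic"
    # Step 3: ML is mainly for pattern recognition in unstructured data.
    if not p.unstructured_pattern_recognition:
        return "reconsider: probably not an ML problem"
    # Step 5 is checked before step 4 because it needs no labeled data.
    if p.pretrained_or_llm_good_enough:
        return "use a pre-trained model or LLM prompt"
    # Step 4: custom training is only feasible with enough labeled data.
    if not p.has_enough_labeled_data:
        return "collect labeled data first"
    return "train a custom model"
```

For the company-email example, `exact_rule_possible` is true, so the function returns "write the rule" without looking at any other answer.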