There's a point in most businesses where data science starts to feel like the answer to everything. Better forecasting. Smarter personalisation. More efficient marketing spend. On paper, it makes complete sense. In reality, early data science models often fall short — not because the idea is wrong, but because the foundations aren't in place to support them.
The gap between what data science promises and what it delivers in the early stages is almost always a business problem rather than a technical one. The models are fine. The data, the commercial focus, and the ability to act on outputs are where things break down.
Here are the four challenges I see most consistently — and what to do about each one.
Poor data quality is the most common and most damaging early-stage problem. Many models are built on tracking data that isn't fully reliable: missing events, inconsistent definitions, gaps in coverage, or data from before tracking was properly implemented. The model doesn't question the data it receives. It just scales whatever you feed into it.
If the data is flawed, the output will be too. A churn model built on incomplete purchase history will generate inaccurate churn predictions. A personalisation model trained on session data that misses a significant share of mobile traffic will generate recommendations skewed toward desktop behaviour. The output looks credible — it has numbers, it has confidence intervals, it has a chart — but it isn't accurate.
Before building any model, audit the data it will be trained on. Check event coverage rates, validate key definitions across systems, and identify gaps. A data quality assessment before model development is not a delay — it's the difference between a model that works and one that confidently produces wrong answers. If you can't validate the input data, don't trust the output.
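As a rough illustration of the event coverage check described above, here is a minimal Python sketch. It assumes you can export, for a sample of sessions known from the order system to have ended in a purchase, the list of events the tracking actually captured. The event names, the `EXPECTED_EVENTS` tracking plan, the 95% threshold, and the sample data are all hypothetical placeholders, not a recommended standard.

```python
from collections import Counter

# Hypothetical tracking plan: events every purchasing session should emit.
EXPECTED_EVENTS = ("page_view", "add_to_cart", "purchase")


def event_coverage(sessions, expected=EXPECTED_EVENTS):
    """Return the share of sessions in which each expected event was captured."""
    if not sessions:
        return {event: 0.0 for event in expected}
    counts = Counter()
    for events in sessions:
        seen = set(events)  # count each event at most once per session
        for event in expected:
            if event in seen:
                counts[event] += 1
    return {event: counts[event] / len(sessions) for event in expected}


def coverage_gaps(coverage, threshold=0.95):
    """List events whose coverage falls below the (assumed) acceptable threshold."""
    return sorted(event for event, rate in coverage.items() if rate < threshold)


# Illustrative data only: each inner list holds the events logged for one session.
sessions = [
    ["page_view", "add_to_cart", "purchase"],
    ["page_view", "purchase"],
    ["page_view", "add_to_cart", "purchase"],
    ["purchase"],
]

coverage = event_coverage(sessions)
gaps = coverage_gaps(coverage)
print(coverage)
print(gaps)
```

In this made-up sample, `page_view` is captured in 75% of sessions and `add_to_cart` in 50%, so both would be flagged for investigation before any model is trained on them.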
It's easy to build something technically impressive — a churn model, a customer segmentation framework, a demand forecasting algorithm — and much harder to tie it back to a decision that actually drives revenue or improves performance. If there's no clear action linked to the output, the model doesn't add commercial value, regardless of how sophisticated it is.
This happens when data science work is commissioned without a clear commercial brief. The data science team builds what's technically interesting or what's within their capability to deliver — rather than starting from the question: what decision does this need to inform, and what would change in the business if we had this answer?
Start every data science initiative with a commercial brief, not a technical specification. The brief should answer: what decision are we trying to make better? What does the business do differently if the model is right? Who owns that decision and will they use the output? If those questions can't be answered clearly before development starts, the initiative isn't ready to begin.
Even when models are technically sound and commercially focused, they often fail to generate impact because the outputs sit outside the day-to-day tools and workflows that operational teams actually use. If marketing, trading, or product teams can't easily act on the output, it tends to get ignored — not because teams don't want to use it, but because the friction of incorporating it into existing workflows is too high.
A customer segmentation model that lives in a data science team's Python environment and requires a data request to access is not an operational tool. It's a research project. For data science to have sustained commercial impact, it needs to be embedded into how the business actually operates — in the CRM, in the merchandising tool, in the campaign planning workflow.
Define the activation pathway before the model is built. Where will the output surface: in the email platform, in the product recommendation engine, or in the trading dashboard? Who will act on it and how? The integration work required to make a model operational is often as significant as the model development itself. Plan for it from the start rather than treating it as a phase-two afterthought.
Early data science models are typically expected to deliver significant, visible results quickly — often because the initiative has been sold internally with a compelling business case that sets high expectations. In reality, the gains from early data science work are almost always incremental and take time to compound. Without realistic expectations, teams lose confidence before the real value starts to emerge.
This creates a particularly destructive pattern: the model is deployed, early results are underwhelming relative to what was promised, leadership loses interest, the team stops iterating, and the model gradually becomes stale. Six months later someone concludes that data science didn't work — when in fact the problem was the expectation, not the approach.
Set expectations honestly at the outset: early models typically deliver a 5–15% improvement in the relevant metric, not a step-change transformation. Frame the value as compounding, with each iteration improving accuracy and commercial relevance. Build a review cadence into the programme so that incremental progress is visible and celebrated rather than compared unfavourably to an inflated initial promise.
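To make the compounding point concrete, here is a small arithmetic sketch in Python. It assumes, purely for illustration, that every iteration delivers the same relative gain over the previous version, which real programmes will not do exactly. The function name `compounded_uplift` and the 5% figure (the low end of the range above) are illustrative choices.

```python
def compounded_uplift(gain_per_iteration: float, iterations: int) -> float:
    """Cumulative relative improvement after repeated gains of equal size.

    Assumption: each iteration improves the metric by the same fraction
    relative to the previous iteration.
    """
    return (1 + gain_per_iteration) ** iterations - 1


# Illustrative only: a 5% gain per iteration over one, two, and four iterations.
uplift_by_iteration = {n: round(compounded_uplift(0.05, n), 4) for n in (1, 2, 4)}
print(uplift_by_iteration)  # {1: 0.05, 2: 0.1025, 4: 0.2155}
```

Under that assumption, four 5% iterations add up to a cumulative improvement of just over 21%, which is the kind of trajectory a regular review cadence can make visible.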
Are You Ready to Embed Data Science?
Before investing in data science capability — whether through hiring, a vendor, or an internal programme — it's worth being honest about whether the conditions for success are in place.
- Tracking is implemented comprehensively and validated: you know what events you're capturing and what you're missing
- There is a clear commercial question the model needs to answer, not just a general desire for "better data"
- There is a named person, not a committee, who will own the output and make decisions based on it
- The activation pathway is defined: you know where in existing tools and workflows the output will surface
- Expectations are realistic: leadership understands that early gains are incremental and compound over time
- There is budget and organisational appetite for iteration, not a one-off project with no review cycle

To recap, the four challenges and their fixes:

- Data quality: models don't question the data they're fed. Audit input data quality before development starts, not after the model produces wrong answers
- Solving the wrong problem: start with a commercial brief, not a technical specification. If there's no clear decision linked to the output, the model has no commercial value
- The insight-to-action gap: define the activation pathway before building the model. If operational teams can't access the output in their existing tools, it won't be used
- Expectation vs reality: early gains are incremental and compound over time. Set honest expectations and build a review cadence that makes progress visible
