Why Manufacturing Demand Planning Fails
And the Data Problem at the Root of It

Q: Can you build a useful demand forecast with imperfect historical data?

Yes, with explicit trade-offs. Imperfect data does not make forecasting impossible - it changes the model choices available and the accuracy levels achievable. The most important discipline is being explicit about what is missing and why. A model trained on 18 months of clean history, with the limitations of that window documented, is more trustworthy than a model trained on 36 months of mixed-quality data where the limitations are unknown. Start with the most reliable signal you have, validate it honestly, and extend the model as data quality improves over time. Forecast value added (FVA) - does the model beat a naive baseline? - is the right measure of whether the current data is sufficient to produce useful predictions.
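The FVA check described above can be sketched in a few lines. This is a minimal illustration, not a prescribed method: it assumes MAPE as the error metric and a "last period repeats" naive baseline, and the function names and revenue figures are hypothetical.

```python
from statistics import mean


def mape(actuals, forecasts):
    """Mean absolute percentage error; periods with zero actuals are skipped."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actuals, forecasts) if a != 0]
    return mean(errors)


def forecast_value_added(actuals, model_forecasts):
    """Naive MAPE minus model MAPE. Positive means the model beats the baseline.

    model_forecasts[i] is the model's forecast for actuals[i + 1].
    """
    targets = actuals[1:]
    naive = actuals[:-1]  # naive baseline: next period equals the last observed period
    return mape(targets, naive) - mape(targets, model_forecasts)


# Illustrative numbers only, not taken from any real engagement.
monthly_revenue = [100.0, 110.0, 120.0, 130.0]
model_output = [108.0, 121.0, 128.0]  # forecasts for months 2 through 4

fva = forecast_value_added(monthly_revenue, model_output)
print(f"FVA vs naive baseline: {fva:.1%}")
```

For strongly seasonal demand, a same-month-last-year baseline may be the fairer comparison. The structure of the check stays the same.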

Q: What is the difference between a top-down and bottom-up demand forecast?

A top-down forecast starts at the aggregate level (total revenue for a facility or the portfolio) and allocates that total to lower-level dimensions using historical mix percentages. A bottom-up forecast builds the total from independent forecasts at the most granular level required, then sums them. Top-down is more stable and less sensitive to outliers in individual segments. Bottom-up is more sensitive to structural changes in product mix or customer demand. The reconciled approach runs both, identifies where they diverge, and treats the divergence as a signal worth investigating. For most PE-owned manufacturers with multi-entity portfolios, the reconciled approach is most powerful because cross-entity mix shifts - a customer consolidating purchases at one facility, or a new product category ramping at one plant - show up clearly in the gap between top-down and bottom-up numbers.

The instinct is to replace the model. The problem is the data.

When a manufacturing demand plan underperforms - forecasts that miss by 30%, plans that production can't execute, revenue projections that finance won't stand behind - the first response is almost always to examine the model. Maybe the algorithm is too simple. Maybe a more sophisticated approach, a neural network, a better seasonal decomposition, a larger feature set, would close the gap.

In practice, for most PE-owned manufacturers, the model is not the constraint. The data feeding it is. Three structural problems appear repeatedly in manufacturing environments that have grown by acquisition, run multiple ERP systems, or have not yet built a conformed data layer:

Fragmented transaction history - the same customer, product, or channel appears as multiple disconnected records across systems, so the model trains on a partial signal.
Unreliable feature histories - the leading indicators that make forecasts more accurate (open backlog, booking rates, quote activity) cannot be reliably reconstructed for the training window.
Wrong grain definition - the level of detail at which the forecast is built does not match the decisions it needs to drive, and the mismatch produces either unusable output or unacceptably high error rates.

Each of these is a data architecture decision, not a model selection decision. Resolving them before choosing a forecasting approach is the actual work - and it is the work that most demand planning initiatives skip.

A note on methodology

Demand forecasting is a deep field with many valid methodologies. The approaches here reflect what we have seen produce the best practical results in PE-owned manufacturing environments - not a claim that these are the only right answers. There are practitioners with far broader expertise in the discipline. One we have worked with directly and recommend without reservation is Nicolas Vandeput, whose work on demand forecasting and inventory optimization is worth reading by anyone building serious forecasting capability.

The three structural problems

Fragmented transaction history

Data Fragmentation

When a PE-owned manufacturer acquires a second business, it inherits that business's ERP, item master, and transaction history - all in a format that was never designed to be compared with the platform company's data. After a few acquisitions, the same customer might appear as three separate records across three ERP instances. The same product category is described under three different naming conventions. Revenue from the same end market sits in three different cost center structures, with three different period-close calendars.

A demand forecasting model trained on this data is not learning the true demand pattern for a customer or a channel. It is learning a fragmented partial history that represents some subset of the actual demand signal. The model finds patterns in the data it was given. If that data is missing 30% of the volume for the segment being forecast, the model will confidently predict 70% of reality - and there will be no flag in the output indicating that anything is wrong.

From the field

A manufacturer operating two plants - one on Epicor, one on Sage 100 - serves the same end customer through both facilities. Epicor records the account as "Boeing Commercial Airplanes." Sage records it as "BOEING-WA." Without a conformed customer master, every model treats these as two unrelated demand streams. Historical revenue for the account is split between two partial histories. Neither facility's forecast is wrong on its own terms. Both are wrong relative to the total relationship. The combined forecast is systematically low - and no error metric will reveal it, because the model has no visibility into the other half of the account.

The prerequisite for accurate demand forecasting in a multi-entity manufacturing environment is the same as the prerequisite for accurate gross margin analysis and working capital reporting: a conformed master data layer that resolves customer, item, and channel records to a single reference point before any model runs. Solving fragmentation is not a data hygiene exercise. It is a forecasting accuracy exercise.
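As a minimal sketch of what "resolving to a single reference point" means in practice, the example below maps the two account names from the field example onto one master identifier before revenue is aggregated. The crosswalk structure, the master ID format, the system keys, and the revenue figures are all hypothetical. A real conformed layer would typically live in a governed master data table rather than a hard-coded dictionary.

```python
from collections import defaultdict

# Hypothetical crosswalk: (source system, source customer name) -> master customer ID.
CUSTOMER_CROSSWALK = {
    ("epicor", "Boeing Commercial Airplanes"): "CUST-0001",
    ("sage100", "BOEING-WA"): "CUST-0001",
}


def conform_revenue(rows, crosswalk):
    """Aggregate revenue by (master ID, month) and list any unmapped source records."""
    totals = defaultdict(float)
    unmapped = []
    for system, customer, month, revenue in rows:
        master_id = crosswalk.get((system, customer))
        if master_id is None:
            # Surface the gap instead of silently dropping or guessing a match.
            unmapped.append((system, customer))
            continue
        totals[(master_id, month)] += revenue
    return dict(totals), unmapped


# Illustrative rows only: (system, customer name, month, revenue).
rows = [
    ("epicor", "Boeing Commercial Airplanes", "2024-01", 120000.0),
    ("sage100", "BOEING-WA", "2024-01", 80000.0),
    ("sage100", "ACME", "2024-01", 5000.0),
]

conformed, unmapped = conform_revenue(rows, CUSTOMER_CROSSWALK)
print(conformed)
print("Unmapped:", unmapped)
```

Returning the unmapped records explicitly is a deliberate choice here: the size of that list is a direct measure of how much demand signal the model would otherwise never see.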

Reconstructing leading indicators from ERP history

Leading Indicators

Historical revenue alone is a lagging signal. The most predictive features for a demand forecast are leading indicators: open backlog (what customers have ordered but not yet received), booking rate (the pace at which new orders are entering the system), and quote activity (pipeline that has not yet converted to orders). Models that incorporate these signals consistently outperform those built on revenue history alone - particularly at 3-to-12-month horizons where pure time-series momentum loses explanatory power.

The common assumption is that these features are impossible to reconstruct historically because ERPs record current state, not historical state. A sales order that was open on January 1, 2023 and has since shipped appears only as shipped - there is no native snapshot of what was open on that date. Many forecasting implementations abandon leading indicators entirely as a result and fall back to revenue history alone. This is the wrong conclusion.

Deep familiarity with how the major ERP platforms record transaction timestamps creates a path to reconstruct approximate as-of backlog states - even without native point-in-time snapshots. An imperfect leading indicator, honestly applied, improves a demand forecast more reliably than ignoring it entirely.

Order line creation dates, acknowledged ship dates, scheduled delivery dates, and shipment confirmation timestamps - when combined with ERP-specific reconstruction logic - allow us to answer "what was open in backlog as of date X?" with reasonable accuracy for Epicor, Dynamics 365, Infor SyteLine, Sage, and the other platforms in a typical PE-owned manufacturing portfolio. The reconstruction is an approximation, not a perfect audit trail. We are transparent about where approximation is involved and what the implied confidence interval is. But the result is a backlog series that extends the leading indicator window back through the available transaction history and gives the model a materially richer feature set than revenue history alone.

In practice

Using timestamp reconstruction logic against the sales order and shipment tables, we build monthly as-of backlog snapshots going back to the start of the available data window. That reconstructed backlog series - combined with the booking rate derived from order creation dates - becomes a feature in the forecasting model alongside historical revenue. The same CRM-independent approach applies to pipeline: ERP order creation velocity supplements quote activity where CRM history is thin or inconsistent. The combined signal is consistently more predictive than revenue in isolation, even when the reconstruction is imperfect.
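The core of the as-of logic can be sketched as follows. This is a simplified illustration under stated assumptions, not the ERP-specific reconstruction itself: it treats an order line as open from its creation date until its shipment date, ignores cancellations, partial shipments, and amount revisions, and uses hypothetical field names rather than any ERP's actual schema.

```python
from collections import defaultdict
from datetime import date


def backlog_as_of(order_lines, as_of):
    """Sum the lines created on or before as_of that had not yet shipped on that date."""
    return sum(
        line["amount"]
        for line in order_lines
        if line["created"] <= as_of
        and (line["shipped"] is None or line["shipped"] > as_of)
    )


def bookings_by_month(order_lines):
    """Booking rate proxy: total order value created in each (year, month)."""
    totals = defaultdict(float)
    for line in order_lines:
        totals[(line["created"].year, line["created"].month)] += line["amount"]
    return dict(totals)


# Illustrative order lines only; "shipped" is None while the line is still open.
order_lines = [
    {"created": date(2023, 1, 10), "shipped": date(2023, 2, 20), "amount": 1000.0},
    {"created": date(2023, 1, 25), "shipped": date(2023, 1, 30), "amount": 500.0},
    {"created": date(2023, 2, 5), "shipped": None, "amount": 750.0},
]

month_ends = [date(2023, 1, 31), date(2023, 2, 28)]
snapshots = {d: backlog_as_of(order_lines, d) for d in month_ends}
bookings = bookings_by_month(order_lines)
print(snapshots)
print(bookings)
```

Each simplification noted above is a place where the approximation can drift from the true historical backlog, which is why the size of the approximation should be documented alongside the feature.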

Forecasting is never perfect. The goal is not a perfect feature set - it is a better feature set than what most implementations default to. Knowing how each ERP platform stores its transaction history, and knowing which fields to combine to reconstruct a useful approximation of past state, is a meaningful capability advantage at the start of a forecast build.

Grain definition determines error rates

Forecast Grain

Grain (also referred to as granularity, entity, or hierarchy level in data science and forecasting literature) is the combination of dimensions that identifies a single forecast row - the level of detail at which the forecast is calculated and reported. This decision has a larger impact on forecast accuracy and operational utility than any model selection choice, and it is made before the first line of code is written.

The mathematical reality of grain is not a modeling problem. It is a statistical property of demand signals. A facility generating $24M per year has a relatively smooth monthly revenue curve with enough signal to support a reliable forecast. A specific product category within a specific end market at that facility, generating $180K per year, is inherently more volatile relative to its mean. Small volume cells have high coefficients of variation. A single large order, a seasonal spike, or one lost account can move the number by 40% in a month. No model produces a low-error forecast for a high-volatility cell, because the signal is genuinely noisy - not because the model is wrong.
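A quick way to see this property is to compare the coefficient of variation (standard deviation divided by mean) of an aggregate series with that of a small cell. The monthly figures below are invented for illustration and only chosen to sum to the $24M and $180K annual totals mentioned above. The function name is ours.

```python
from statistics import mean, pstdev


def coefficient_of_variation(series):
    """Standard deviation divided by the mean; higher means noisier relative to size."""
    return pstdev(series) / mean(series)


# Invented monthly revenue in $K. Facility sums to 24,000 ($24M); cell sums to 180 ($180K).
facility_monthly_k = [1900, 2000, 2100, 2000, 1950, 2050, 2000, 2100, 1900, 2000, 2050, 1950]
cell_monthly_k = [5, 30, 10, 8, 40, 5, 12, 20, 6, 25, 9, 10]

cv_facility = coefficient_of_variation(facility_monthly_k)
cv_cell = coefficient_of_variation(cell_monthly_k)
print(f"Facility CV: {cv_facility:.2f}, cell CV: {cv_cell:.2f}")
```

Running this kind of check per cell before modeling gives a rough, model-independent sense of which cells can support a tight forecast and which cannot.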

L1 - Aggregate: Revenue by Facility

Typical MAPE range: 8–12% for a well-built L1 model on a $20M+ annual revenue facility.

Stable demand signal, lower noise.
Reliable for revenue planning and working capital.
Fast to build and validate.
Too aggregated for procurement and production scheduling.

L2 - Operational: Facility × End Market × Product Category

Typical MAPE range: 25–40% for lower-grain cells, reflecting higher demand volatility.

Actionable for procurement, production, and inventory planning.
Reveals mix shifts invisible at L1.
Higher error is inherent to the signal - not a model failure.
Requires conformed item and customer master to build reliably.

The right grain is determined by what decision the forecast is meant to support. Finance needs L1 for revenue planning and working capital projections. Procurement and production scheduling need L2 or lower to size purchase orders and production runs by product category and facility. Neither is wrong. They serve different purposes and carry different accuracy expectations.

The most powerful approach in a PE-owned multi-entity portfolio is a reconciled top-down and bottom-up view. A top-down model forecasts total revenue at L1, then allocates proportionally to L2 using historical mix. A bottom-up model forecasts each L2 cell independently, then sums to L1. The gap between the two is not a problem to resolve - it is a signal to investigate. When the sum of bottom-up L2 forecasts diverges materially from the top-down L1 forecast, it typically indicates a structural change in product mix, a new customer ramping at one facility, or a demand shift in a specific segment that the aggregate model cannot detect on its own.
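The divergence check can be sketched as below. This is a minimal illustration assuming the L1 forecast, the historical mix shares, and the independent L2 forecasts already exist. The function names, cell labels, and figures are hypothetical.

```python
def allocate_top_down(l1_forecast, historical_mix):
    """Split the L1 total across L2 cells in proportion to historical mix shares."""
    total_share = sum(historical_mix.values())
    return {cell: l1_forecast * share / total_share for cell, share in historical_mix.items()}


def reconciliation_gaps(l1_forecast, historical_mix, bottom_up):
    """Return per-cell gaps (bottom-up minus top-down) and the total gap at L1."""
    top_down = allocate_top_down(l1_forecast, historical_mix)
    cells = sorted(set(top_down) | set(bottom_up))
    cell_gaps = {c: bottom_up.get(c, 0.0) - top_down.get(c, 0.0) for c in cells}
    total_gap = sum(bottom_up.values()) - l1_forecast
    return cell_gaps, total_gap


# Illustrative figures only.
l1_forecast = 1000.0
historical_mix = {"Aerospace/Castings": 0.6, "Industrial/Machined": 0.4}
bottom_up = {"Aerospace/Castings": 550.0, "Industrial/Machined": 500.0}

cell_gaps, total_gap = reconciliation_gaps(l1_forecast, historical_mix, bottom_up)
for cell, gap in cell_gaps.items():
    print(f"{cell}: {gap:+.1f}")
print(f"Total gap: {total_gap:+.1f}")
```

In this example the second cell runs well ahead of its historical share, which is the kind of mix shift the aggregate model alone would not surface.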

The questions that precede any model choice

The three problems above are not solvable by choosing a better algorithm. They are solvable - or at minimum, honestly sized - by answering a set of data and architecture questions before any modeling work begins. These questions determine what is achievable, what trade-offs are required, and what data investment needs to happen in parallel with the initial model build.

Is the transaction history conformed across entities? Customers, items, and channels must resolve to the same identifiers across all source systems before a cross-entity forecast is meaningful. If master data work has not happened, it precedes the forecast.

How far back does reliable, consistent history extend? The acquisition date is often the practical starting point. Pre-acquisition data from a legacy ERP, under a different item master and cost structure, frequently introduces more noise than signal. Be honest about the clean history window.

Which leading indicators can be reconstructed from ERP transaction history? Backlog, booking rate, and quote activity are the highest-value features. Even without native point-in-time snapshots, timestamp reconstruction logic across the major ERP platforms can recover approximate as-of backlog states - extending the leading indicator window back through the available data rather than abandoning these features entirely.

At what grain does the business decision actually happen? Define the forecast grain by the operational workflow it supports - not by what the data makes easy. Then set accuracy expectations accordingly: L2 MAPE of 30% is not a failure if the L2 signal is genuinely volatile.

Do you need a top-down, bottom-up, or reconciled view? If the answer is reconciled - which it usually is in a multi-entity portfolio - plan the architecture to produce both and measure the divergence. The gap is often the most actionable output the forecast produces.

A demand forecast built on honest answers to these five questions - even if the answers reveal significant data limitations - is more valuable than a sophisticated model built on unexamined assumptions about data quality. The model is the easy part. The foundation is the work.

Related reading: Master Data as a PE Strategic Advantage covers the conformed data layer that makes cross-entity forecasting possible. 5 Data Problems That Corrupt Your Inventory Analytics addresses the same data quality issues from an inventory planning perspective.

Common questions

Questions about manufacturing demand planning and the data foundation it requires.

How far back does a manufacturer need transaction history to build a reliable demand forecast?

The practical minimum is 24 months of consistent, conformed transaction history. With fewer than 24 months, the model sees fewer than two full annual cycles, which makes it difficult to separate the seasonality present in most manufacturing demand patterns from one-off events. 36 months is better. The key word is consistent: 36 months of fragmented, unmastered history from three different ERPs may produce a worse forecast than 24 months of clean, conformed data. The acquisition date is often the practical starting point for what counts as reliable - pre-acquisition data from a different ERP instance, under a different item master, frequently introduces more noise than signal.

What is forecast grain and how do you choose the right level?

Forecast grain is the combination of dimensions that defines a single forecast row - the level of detail at which the forecast is calculated. Revenue by Facility is a coarse grain (L1). Revenue by Facility, End Market, and Product Category is a finer grain (L2). The right grain is determined by the decision the forecast is meant to support. Finance needs L1 for working capital and revenue planning. Procurement and production scheduling need L2 or lower to size purchase orders and production runs. A common approach is to build both and reconcile them - the gap between a top-down L1 forecast and the sum of L2 bottom-up forecasts is itself a signal that something in the demand structure has shifted.

Can you build a useful demand forecast with imperfect historical data?

Yes - and imperfect data is not a reason to avoid leading indicators. For backlog and booking rate specifically, deep ERP knowledge creates a path to reconstruct approximate as-of states from transaction timestamps even without native point-in-time snapshots. Order line creation dates, acknowledged ship dates, and shipment confirmation records can be combined to answer "what was open in backlog on date X?" with reasonable accuracy across Epicor, Dynamics 365, Infor, and Sage. The reconstruction is an approximation, not a perfect audit trail, and we are transparent about that. But an imperfect backlog series, honestly applied, improves model accuracy more reliably than falling back to revenue history alone. Forecasting is never perfect - the goal is a better feature set than what most implementations default to, not a perfect one.

What is the difference between a top-down and bottom-up demand forecast?

A top-down forecast starts at the aggregate level and allocates to lower-level dimensions using historical mix percentages. A bottom-up forecast builds the total from independent forecasts at each granular cell, then sums. Top-down is more stable and less sensitive to outliers in individual segments. Bottom-up is more sensitive to structural changes in product mix or customer demand. The reconciled approach runs both, identifies where they diverge, and treats the divergence as a signal worth investigating. For PE-owned manufacturers with multi-entity portfolios, the reconciled approach is most powerful - cross-entity mix shifts and new product ramps show up clearly in the gap between top-down and bottom-up numbers before they appear in actuals.

The forecast is only as good as the data it runs on.
