You do not need a perfect MES before starting AI projects in aerospace. You do need a usable foundation: data that is identifiable, time-aligned, traceable to the product and process, and stable enough to support a narrow business question.
The right starting point is not “collect everything.” It is choosing one use case and confirming that the MES and adjacent systems capture the minimum facts needed to explain outcomes. In aerospace, that usually means linking what was built, how it was built, when it was built, on which equipment, by whom or by which role, with which materials, under which revision, and what quality result followed.
Minimum MES data foundation
- Order and routing context: work order or traveler ID, operation/step IDs, planned versus actual routing, operation status, start and end timestamps, queue and touch times.
- Part and configuration identity: part number, serial number or lot, configuration or effectivity context where relevant, revision identifiers, and as-built linkage.
- Material traceability: lot or batch numbers, substitutions if allowed, component genealogy, consumption records, and links to receiving or inventory transactions.
- Resource context: machine, workcenter, tool, fixture, test stand, and where possible the software or recipe version used during execution.
- Process parameters and event history: operator entries, measured values, alarms, exceptions, completions, rework loops, holds, and reason codes. The value depends heavily on consistency and timestamp quality.
- Quality outcomes: pass/fail results, defect codes, NCR references, rework and scrap records, test outcomes, inspection points, and disposition status where available.
- Document and revision context: work instruction revision, specification revision, approved change date, and evidence that execution occurred under the intended version.
- User and role attribution: who performed, approved, inspected, or overrode a step, or at least role-level attribution if individual attribution is restricted.
If those links are missing, many AI projects fail because the model cannot distinguish process variation from bad joins, stale revisions, or incomplete genealogy.
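As a rough illustration of the list above, the sketch below models one executed operation as a record and reports which links are absent. The class name `ExecutionRecord`, every field name, and the helper `missing_links` are hypothetical and would need to be mapped to whatever the actual MES and adjacent systems expose.

```python
from dataclasses import dataclass, fields
from datetime import datetime
from typing import Optional


@dataclass
class ExecutionRecord:
    """One operation executed against one unit. All field names are illustrative."""
    work_order_id: Optional[str]
    operation_id: Optional[str]
    part_number: Optional[str]
    serial_or_lot: Optional[str]
    part_revision: Optional[str]
    material_lot: Optional[str]
    resource_id: Optional[str]
    instruction_revision: Optional[str]
    performed_by_role: Optional[str]
    start_time: Optional[datetime]
    end_time: Optional[datetime]
    quality_result: Optional[str]


def missing_links(record):
    """Return the names of fields that are None or blank, in declaration order."""
    missing = []
    for f in fields(record):
        value = getattr(record, f.name)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(f.name)
    return missing


# Example record with made-up identifiers: material lot and instruction
# revision were never captured for this operation.
record = ExecutionRecord(
    work_order_id="WO-1001",
    operation_id="OP-020",
    part_number="PN-123",
    serial_or_lot="SN-0007",
    part_revision="C",
    material_lot=None,
    resource_id="CMM-02",
    instruction_revision="",
    performed_by_role="inspector",
    start_time=datetime(2024, 1, 15, 8, 0),
    end_time=datetime(2024, 1, 15, 8, 45),
    quality_result="pass",
)
print(missing_links(record))  # ['material_lot', 'instruction_revision']
```

Running a check like this over a sample of historical records gives a quick view of which links are systematically absent before any modeling starts.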
What matters more than volume
For most early aerospace AI use cases, data quality matters more than data quantity. The main checks are straightforward:
- Stable identifiers: part, serial, lot, work order, operation, resource, and defect codes must join reliably across MES, ERP, QMS, PLM, and sometimes historian or test systems.
- Timestamp integrity: events need consistent time zones, clock synchronization, and enough granularity to reconstruct sequence.
- Revision awareness: the data has to reflect engineering and process changes, or the model will mix unlike conditions.
- Completeness by workflow: missing data concentrated in specific shifts, cells, suppliers, or manual steps can bias outputs.
- Reason code discipline: free-text comments can help, but uncontrolled coding usually reduces repeatability.
- Change control: if process definitions, interfaces, or master data changed without clear version history, historical training data may not be comparable.
In practice, many plants hold a large volume of MES data, but only a subset is usable for analytics without substantial cleanup.
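Two of these checks, identifier joins and timestamp integrity, are easy to sketch. The example below is a minimal illustration, not a complete data-quality suite; the function names `join_rate` and `timestamp_issues`, the tuple layout of the events, and the sample identifiers are all assumptions made for the example.

```python
from datetime import datetime, timezone


def join_rate(left_keys, right_keys):
    """Fraction of left-side keys that find a match on the right side."""
    left = list(left_keys)
    if not left:
        return 0.0
    right = set(right_keys)
    return sum(1 for key in left if key in right) / len(left)


def timestamp_issues(events):
    """Flag events with naive timestamps or an end time before the start time.

    events: iterable of (event_id, start, end) tuples holding datetime objects.
    """
    issues = []
    for event_id, start, end in events:
        if start.tzinfo is None or end.tzinfo is None:
            # Without a time zone, events from different systems cannot be ordered safely.
            issues.append((event_id, "missing time zone"))
        elif end < start:
            issues.append((event_id, "end before start"))
    return issues


# Made-up serial numbers as they appear in MES and in QMS.
mes_serials = ["SN-001", "SN-002", "SN-003", "SN-004"]
qms_serials = ["SN-001", "SN-003", "SN-004", "SN-009"]
print(join_rate(mes_serials, qms_serials))  # 0.75

utc = timezone.utc
events = [
    ("E1", datetime(2024, 1, 15, 8, 0, tzinfo=utc), datetime(2024, 1, 15, 8, 30, tzinfo=utc)),
    ("E2", datetime(2024, 1, 15, 9, 0), datetime(2024, 1, 15, 9, 20)),
    ("E3", datetime(2024, 1, 15, 9, 0, tzinfo=utc), datetime(2024, 1, 15, 8, 50, tzinfo=utc)),
]
print(timestamp_issues(events))
# [('E2', 'missing time zone'), ('E3', 'end before start')]
```

Computing the join rate per shift, cell, or supplier rather than only in aggregate also helps surface the concentrated gaps described under completeness by workflow.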
Use-case-specific minimums
The minimum data depends on the AI project.
- Predicting quality escapes or rework: you typically need genealogy, operation history, measured process values, defect and NCR history, revision context, and enough closed-loop quality outcomes to learn from.
- Cycle time or bottleneck prediction: you need routing steps, queue times, labor or machine touch times, dispatch events, hold reasons, and rework loops.
- Anomaly detection on equipment or test processes: you often need higher-frequency historian, PLC, or test data in addition to MES context. MES alone is often too coarse.
- Copilot-style operator assistance: you need approved work instructions, revision-controlled procedures, defect history, and strong document governance. Uncontrolled document pools are a major risk.
So the answer is not one fixed dataset. It is the smallest traceable dataset that can explain the decision or prediction you want to support.
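One way to make this concrete is to write the per-use-case minimums down as data and compare them with what is actually available. The sketch below does that; the dictionary `REQUIRED_DATA`, its keys, and the function `data_gaps` are hypothetical labels that paraphrase the list above rather than any standard taxonomy.

```python
# Required data categories per use case, paraphrased from the list above.
# All keys and category names are illustrative labels.
REQUIRED_DATA = {
    "quality_escape_prediction": {
        "genealogy", "operation_history", "process_values",
        "defect_ncr_history", "revision_context", "quality_outcomes",
    },
    "cycle_time_prediction": {
        "routing_steps", "queue_times", "touch_times",
        "dispatch_events", "hold_reasons", "rework_loops",
    },
    "equipment_anomaly_detection": {
        "mes_context", "high_frequency_signals",
    },
    "operator_copilot": {
        "approved_work_instructions", "revision_controlled_procedures",
        "defect_history", "document_governance",
    },
}


def data_gaps(use_case, available):
    """Return the required categories that are not yet available, sorted by name."""
    return sorted(REQUIRED_DATA[use_case] - set(available))


available_today = {"routing_steps", "queue_times", "touch_times", "hold_reasons"}
print(data_gaps("cycle_time_prediction", available_today))
# ['dispatch_events', 'rework_loops']
```

An empty result does not mean the data is good enough, only that each category exists in some form; the quality checks still apply to each one.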
Brownfield reality
In aerospace, MES alone rarely holds everything needed. Useful AI projects usually depend on MES working alongside ERP, PLM, QMS, test systems, historians, and document control. That means your real prerequisite is often integration quality, not just MES data availability.
If work orders live in ERP, revisions in PLM, NCRs in QMS, and machine data in a historian, then AI readiness depends on whether those systems can be linked consistently and under change control. Full replacement strategies often fail here because qualification burden, validation cost, downtime risk, integration complexity, and long asset lifecycles make wholesale cutovers hard to justify. A narrower layer that connects existing systems is usually more realistic, but it still requires disciplined mapping and governance.
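A thin linking layer of the kind described here can be sketched in a few lines. The example below assembles one unit's view from three stand-in sources and records which links could not be resolved. The function `link_across_systems`, the shape of each input, and the identifiers are assumptions for illustration; real ERP, PLM, and QMS interfaces will look different.

```python
def link_across_systems(serial, erp_orders, plm_revisions, qms_ncrs):
    """Assemble one unit's view from separate systems and list links that failed.

    erp_orders:    dict mapping serial -> {"work_order": ..., "part_number": ...}
    plm_revisions: dict mapping part_number -> released revision
    qms_ncrs:      list of {"ncr_id": ..., "serial": ...}
    """
    unresolved = []

    order = erp_orders.get(serial)
    if order is None:
        unresolved.append("ERP work order")

    part_number = order["part_number"] if order else None
    revision = plm_revisions.get(part_number) if part_number else None
    if revision is None:
        unresolved.append("PLM revision")

    # A unit with no NCRs is normal, so an empty list is not treated as a failed link.
    ncr_ids = [ncr["ncr_id"] for ncr in qms_ncrs if ncr["serial"] == serial]

    return {
        "serial": serial,
        "work_order": order["work_order"] if order else None,
        "part_number": part_number,
        "revision": revision,
        "ncr_ids": ncr_ids,
        "unresolved": unresolved,
    }


# Made-up extracts from each system.
erp = {"SN-0007": {"work_order": "WO-1001", "part_number": "PN-123"}}
plm = {"PN-123": "C"}
qms = [
    {"ncr_id": "NCR-55", "serial": "SN-0007"},
    {"ncr_id": "NCR-56", "serial": "SN-0100"},
]

print(link_across_systems("SN-0007", erp, plm, qms)["unresolved"])  # []
print(link_across_systems("SN-0100", erp, plm, qms)["unresolved"])
# ['ERP work order', 'PLM revision']
```

The second unit has an NCR in the quality system but no matching work order, which is exactly the kind of orphaned record that makes a model's training set smaller or more biased than the raw row counts suggest.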
What is usually missing first
- Genealogy is incomplete below a certain assembly level.
- Manual rework is recorded in comments instead of structured transactions.
- Instruction revisions are not tied cleanly to execution events.
- Equipment IDs changed over time without cross-reference mapping.
- Defect codes are too inconsistent for reliable learning.
- Closed-loop outcomes are unavailable because NCR and disposition data sit outside MES.
- Important data exists only on paper, in PDFs, or in local spreadsheets.
If those issues are present, start by fixing the highest-value data link rather than launching a broad AI program.
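As one example of repairing a single link, the sketch below ties an execution event to the instruction revision that was in effect when it ran. It assumes date-based effectivity, where a revision applies from its effective date until the next one supersedes it; serial-based or lot-based effectivity would need different logic. The function name `revision_in_effect` and the sample history are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime


def revision_in_effect(revision_history, executed_at):
    """Return the revision effective at executed_at, or None if none applied yet.

    revision_history: list of (effective_from, revision) tuples,
                      sorted by effective_from in ascending order.
    """
    effective_dates = [effective_from for effective_from, _ in revision_history]
    # Index of the first revision whose effective date is after the execution time.
    idx = bisect_right(effective_dates, executed_at)
    if idx == 0:
        return None
    return revision_history[idx - 1][1]


# Made-up revision history for one work instruction.
history = [
    (datetime(2023, 1, 10), "A"),
    (datetime(2023, 6, 1), "B"),
    (datetime(2024, 2, 15), "C"),
]

print(revision_in_effect(history, datetime(2023, 8, 20)))  # B
print(revision_in_effect(history, datetime(2022, 12, 1)))  # None
```

Comparing this derived revision with whatever revision the execution record itself claims is a simple way to find events that ran under a stale or unexpected version.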
Practical starting threshold
Before starting, you should be able to answer yes to most of these questions:
- Can we trace each analyzed unit or batch through the relevant operations?
- Can we link execution records to the correct revision and effective date?
- Can we identify the outcome we are trying to predict or explain?
- Can we explain major missing data patterns?
- Can we reconstruct changes in process, equipment, or coding over time?
- Can we validate outputs against a business process with human review?
If the answer is no to most of those, the limiting factor is likely data readiness and governance, not AI capability.
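The checklist can be tracked as a simple tally. In the sketch below, the short keys stand in for the six questions above, and "most" is interpreted as more than half; both the keys and that threshold are assumptions chosen for the example, not a formal readiness standard.

```python
# Short, hypothetical keys for the six questions above, in the same order.
READINESS_QUESTIONS = [
    "unit_traceability",
    "revision_linkage",
    "outcome_defined",
    "missing_data_explained",
    "change_history",
    "human_review",
]


def readiness_summary(answers):
    """Summarize yes/no answers. Unanswered questions count as no.

    answers: dict mapping question key -> bool
    """
    yes = [q for q in READINESS_QUESTIONS if answers.get(q, False)]
    open_items = [q for q in READINESS_QUESTIONS if q not in yes]
    return {
        "yes_count": len(yes),
        "total": len(READINESS_QUESTIONS),
        # Assumption: "most" means strictly more than half.
        "mostly_yes": len(yes) > len(READINESS_QUESTIONS) / 2,
        "open_items": open_items,
    }


answers = {
    "unit_traceability": True,
    "revision_linkage": False,
    "outcome_defined": True,
    "missing_data_explained": True,
    "human_review": True,
}
summary = readiness_summary(answers)
print(summary["yes_count"], summary["mostly_yes"])  # 4 True
print(summary["open_items"])  # ['revision_linkage', 'change_history']
```

The open items are the useful output: they point to the specific data links to repair before the use case moves forward.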
Bottom line
Start with a narrow, high-value use case and verify five things: product identity, process sequence, revision context, quality outcome, and reliable timestamps. Add material, resource, and parameter data as needed. In aerospace, traceability and controlled change matter as much as model accuracy. Without that foundation, AI can still produce outputs, but they may not be trustworthy enough for regulated operations.