RSC Cluster: Performance Visibility (OEE, NPT, Shift Variance)

The Performance Visibility cluster translates execution data into outcomes leadership actually cares about. It focuses on OEE, nonproductive time, downtime, and shift-to-shift variance, with an emphasis on high-mix, low-volume aerospace environments. The content clearly distinguishes meaningful metrics from vanity KPIs and explains how to calculate, interpret, and act on performance data. The goal is to help teams move from anecdotal explanations to evidence-based improvement tied directly to execution reality.

  • Real-time visibility

    Real-time visibility is continuous access to current operational data as it is generated, presented in a form that can be monitored or analyzed without delay. In manufacturing and production environments, it means that machine status, work-in-progress, material movements, quality checks, and downtime events are captured, updated, and displayed as they occur, rather than in batches or after a shift.

    Operationally, real-time visibility typically involves:

    • Automatic data collection from equipment, systems, and manual inputs
    • Instant updating of dashboards, reports, and alerts when a status changes
    • A single, consolidated view of current conditions across lines, cells, or sites
    • Standard rules for how events (such as deviations, delays, or failures) are detected and surfaced

    In the context of a Manufacturing Execution System (MES), real-time visibility is achieved when the MES continuously aggregates and displays live production data so that supervisors, operators, and support teams see the same up-to-date information at the same time.

  • Production line

    A production line is a grouped set of equipment, workstations, and supporting resources arranged to perform a defined sequence of manufacturing operations for a specific product or family of products. It represents a physical and operational segment of a plant where materials flow through ordered steps to be transformed into finished goods or intermediates.

    In discrete and hybrid manufacturing, a production line typically includes machines, manual workstations, conveyors or material transfer systems, in-line inspection points, and local control systems. It is usually dedicated to a particular product type, variant, or process route, and can operate in batch, semi-continuous, or continuous modes.

    Role in manufacturing systems

    Within manufacturing operations and information systems, a production line is often used as a key organizational object for:

    • Planning and scheduling capacity and sequence of work orders
    • Collecting and aggregating production, quality, and downtime data
    • Assigning operators, maintenance, and support resources
    • Managing in-process inventory and material flow
    • Configuring MES, SCADA, and historian tags and reports

    In models aligned with ISA-95, a production line commonly appears as a physical and operational level below an area and above units, equipment modules, and control modules. It is distinct from enterprise or site structures used in ERP, although MES and ERP systems frequently map work centers or work center groups to production lines.

    What it includes and excludes

    A production line typically includes all equipment and workstations directly required to execute a defined sequence of process steps, such as filling, assembling, testing, or packaging. It may also include local buffer storage, inline quality inspection points, and the automation systems that control the line.

    It generally does not include:

    • Upstream bulk utilities (for example, plant-wide compressed air systems)
    • Site-wide warehousing and logistics functions
    • Enterprise-level planning or business processes

    Common confusion

    • Production line vs. process cell: In ISA-95 terminology, a production line is more typical in discrete or packaging environments, while a process cell often refers to an integrated processing area in continuous or batch process industries. Both occupy a similar level in the physical hierarchy but represent different process styles.
    • Production line vs. work center: In many ERP systems, a work center is a logical planning entity that may map to a single machine, a group of machines, or an entire production line. The production line is the physical and operational flow of equipment, while the work center is often a scheduling and costing construct.
    • Production line vs. unit: A unit is usually a more granular piece of equipment or functional segment within a production line (for example, a filler, a labeler, or a reactor), not the entire end-to-end flow.

    Use in regulated and integrated environments

    In regulated manufacturing, production lines are frequently defined as mastered objects in MES and related systems to support batch records or electronic device history records, line clearance procedures, equipment qualification tracking, and traceability of materials and results at the line level.

    When integrating MES, SCADA, and ERP, a clear, consistent definition of each production line helps align equipment hierarchies, routing definitions, and performance metrics such as OEE, throughput, and downtime at a level meaningful for both operations and business planning.

  • Equipment state model

    An equipment state model is a defined set of machine or equipment conditions and the allowed transitions between them, used to consistently describe how equipment is being used over time. It is typically implemented in MES, SCADA, or OEE systems to classify machine time into standardized states such as running, idle, setup, planned stop, unplanned stop, or maintenance.

    Key characteristics

    An equipment state model commonly includes:

    • Standard state definitions such as productive, standby, changeover, faulted, blocked, starved, or offline.
    • Transition rules that define which state changes are valid (for example, from running to faulted, or from planned stop to running).
    • Event sources such as PLC signals, operator input, or MES events that trigger a state change.
    • Time-based classification so each time segment for a piece of equipment is assigned to a single, clear state.
    • Mapping to performance metrics, particularly OEE components like availability, performance, and quality.
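
    The characteristics above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the state names, the ALLOWED transition table, and the time_in_state helper are all assumptions made for the example, and a real model would come from the plant's governed definition.

```python
from datetime import datetime
from enum import Enum


class State(Enum):
    RUNNING = "running"
    IDLE = "idle"
    SETUP = "setup"
    PLANNED_STOP = "planned_stop"
    UNPLANNED_STOP = "unplanned_stop"


# Hypothetical transition rules: which state changes the model accepts.
ALLOWED = {
    State.IDLE: {State.SETUP, State.RUNNING, State.PLANNED_STOP},
    State.SETUP: {State.RUNNING, State.IDLE, State.UNPLANNED_STOP},
    State.RUNNING: {State.IDLE, State.SETUP, State.PLANNED_STOP, State.UNPLANNED_STOP},
    State.PLANNED_STOP: {State.IDLE, State.RUNNING},
    State.UNPLANNED_STOP: {State.IDLE, State.RUNNING},
}


def time_in_state(events, end):
    """Sum seconds per state from ordered (timestamp, state) events up to `end`.

    Each time segment is assigned to exactly one state. Raises ValueError
    when an event implies a transition the model does not allow.
    """
    totals = {s: 0.0 for s in State}
    following = events[1:] + [(end, None)]
    for (ts, state), (next_ts, next_state) in zip(events, following):
        if next_state is not None and next_state not in ALLOWED[state]:
            raise ValueError(f"invalid transition {state.name} -> {next_state.name}")
        totals[state] += (next_ts - ts).total_seconds()
    return totals


# Illustrative shift: setup, run, a 30-minute unplanned stop, then run to end of shift.
events = [
    (datetime(2024, 1, 8, 6, 0), State.SETUP),
    (datetime(2024, 1, 8, 6, 45), State.RUNNING),
    (datetime(2024, 1, 8, 9, 15), State.UNPLANNED_STOP),
    (datetime(2024, 1, 8, 9, 45), State.RUNNING),
]
totals = time_in_state(events, end=datetime(2024, 1, 8, 14, 0))
```

    The per-state totals are the kind of categorized time data that availability or NPT calculations would consume; rejecting invalid transitions early is one way to keep that data interpretable.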

    In regulated or high-reliability environments, the equipment state model is often documented and governed so that reports, investigations, and audits can interpret equipment utilization and downtime consistently across lines, plants, or sites.

    Operational use in manufacturing

    In day-to-day operations, the equipment state model appears in:

    • Dashboards and HMIs that show current machine status (for example, running, changeover, maintenance).
    • Downtime capture workflows where operators confirm or refine the reason associated with an unplanned stop state.
    • Automatic data collection from controllers or sensors that update the state based on signal patterns or conditions.
    • Performance and reliability analysis, where time in each state is aggregated for OEE, NPT, or capacity studies.
    • Maintenance planning, where time in maintenance-related states is tracked and analyzed.

    The equipment state model is often aligned with standards such as ISA-95 or ISO 22400, which describe typical state categories and KPI definitions. Implementations may customize the model to specific processes, but retain a common core to support plant-wide or enterprise-wide comparisons.

    Scope and boundaries

    An equipment state model:

    • Includes the logical states of physical production assets such as machines, lines, cells, and test equipment.
    • Includes time-based events and status changes relevant to production, quality, and maintenance reporting.
    • Excludes detailed process parameters (for example, temperature, speed, pressure) that describe how the equipment is running, not what state it is in.
    • Excludes work order or product status models, which describe the state of a job or batch rather than the equipment itself.

    Common confusion

    • Equipment state model vs. OEE calculation: The state model provides the categorized time data (for example, run vs. downtime). OEE formulas then use that data to compute availability and related metrics, but are not the state model itself.
    • Equipment state model vs. workflow or routing: A routing or workflow describes the sequence of operations a product goes through. The equipment state model describes the condition of the machine, independent of which product or operation is running.

  • Work unit

    A work unit commonly refers to a clearly defined, trackable element in production or operations management. It is used to organize, execute, and measure manufacturing activities. The specific meaning depends on context, but it always describes something discrete that can be planned, assigned, and reported on.

    Common meanings in manufacturing and industrial systems

    In regulated and industrial environments, the term “work unit” is most often used in two ways:

    • As a unit of work: A specific, bounded task or set of tasks, such as an operation on a routing, a job step in a digital traveler, or a maintenance activity that can be started, completed, and recorded. It is often tied to a work order, operation number, or activity ID in MES or ERP.
    • As a unit of production resource: A logical or physical production entity that performs work, such as a machine, work center, cell, or production line segment. In this sense, a work unit is the resource on which work is scheduled and whose utilization, availability, or performance is tracked.

    Both usages share the idea of being a manageable granularity for planning, scheduling, execution, and performance measurement.

    Operational characteristics

    In OT/IT, MES, and ERP contexts, a work unit typically:

    • Has a unique identifier (for example, operation ID, activity code, or work center ID).
    • Can be associated with materials, tooling, labor, and digital work instructions.
    • Has status values such as planned, released, in process, on hold, or complete.
    • Is time-bound, enabling collection of cycle time, queue time, and downtime data.
    • Can carry traceability and genealogy data where required by quality or regulatory standards.

    In performance and OEE-style metrics, work units are often the level at which run time, non-productive time (NPT), scrap, and rework are recorded, enabling analysis by specific task or resource.

    Use in regulated and quality-focused environments

    In regulated manufacturing (for example, aerospace, medical device, or defense), work units are important for:

    • Traceability: Linking each work unit to specific parts, lots, travelers, and inspection records.
    • Evidence trails: Showing which operator, machine, and version of work instructions applied when the work unit was executed.
    • Nonconformance management: Associating defects or deviations with the exact work unit (operation or resource) where they occurred.
    • Capacity and planning: Aggregating individual work units to understand resource loading and throughput.

    Includes and excludes

    A work unit typically includes:

    • Individual operations or activities on a routing or traveler.
    • Discrete machine runs or job segments.
    • Logical production entities such as work centers or cells, when the term is used for resources.

    A work unit typically excludes:

    • Entire end-to-end value streams or production systems.
    • Purely financial constructs like general ledger cost centers without operational meaning.
    • Informal tasks that are not defined or tracked in a system of record.

    Common confusion

    • Work unit vs. work order: A work order is usually a higher-level instruction to produce a certain quantity of product or perform a job. Work units are the smaller execution elements or resources within or across those work orders.
    • Work unit vs. operation/step: An operation or step is often a specific type of work unit on a routing. Some systems use these terms interchangeably; others treat “work unit” as a generic abstraction.
    • Work unit vs. work center: A work center is typically a resource grouping (machines, people, or cells). A work unit may refer to that resource, or to a unit of work scheduled to run on it, depending on the system configuration.

    Relation to digital systems

    In MES and integrated OT/IT architectures:

    • Work units may be represented as records in an operations or activities table, linked to routings, BOMs, and work orders.
    • Digital work instructions can be attached at the work-unit level to ensure the correct procedure and revision are used.
    • Event data from machines or operators (start, stop, pause, completion, inspection) are frequently logged against specific work units to support audit trails, analytics, and continuous improvement.

  • How can aerospace manufacturers measure throughput in low-volume, high-mix environments?

    Throughput in aerospace high-mix, low-volume (HMLV) environments cannot be reduced to a simple parts-per-hour number. You typically need a layered approach that looks at throughput by work order, by routing step, and at the constraint resource, rather than only at finished units.

    1. Choose the right unit of measure for HMLV

    In HMLV aerospace, parts are complex, routings are long, and mix shifts daily. A single “widgets/hour” number is usually meaningless. Common, more practical throughput measures include:

    • Work orders completed per period (per week or month) by product family or program.
    • Routing steps completed per period on key resources (e.g., 5-axis machining, CMM, NDI, bonding, paint).
    • Throughput hours: sum of standard or planned hours completed on released work orders in a time window.
    • Constraint-step throughput: completed operations at the known bottleneck machine, cell, or department.

    Each of these requires reasonably accurate routings and labor standards. If those are weak or outdated, the first step is often to stabilize them before trusting derived throughput metrics.
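
    As a rough illustration of the hours-based measure, the sketch below sums standard hours for operations completed in a time window, grouped by product family. The record layout, the sample values, and the throughput_hours name are assumptions for the example; real data would come from ERP or MES completions.

```python
from datetime import date

# Hypothetical completion records: (work_order, family, standard_hours, completion_date)
completions = [
    ("WO-1001", "structural", 12.5, date(2024, 3, 4)),
    ("WO-1002", "sheet_metal", 3.0, date(2024, 3, 5)),
    ("WO-1003", "structural", 20.0, date(2024, 3, 7)),
    ("WO-1004", "assembly", 8.0, date(2024, 3, 12)),
]


def throughput_hours(records, start, end):
    """Sum standard hours per family for completions with start <= date <= end."""
    totals = {}
    for _work_order, family, std_hours, done in records:
        if start <= done <= end:
            totals[family] = totals.get(family, 0.0) + std_hours
    return totals


week = throughput_hours(completions, date(2024, 3, 4), date(2024, 3, 10))
# week == {"structural": 32.5, "sheet_metal": 3.0}
```

    The result is only as trustworthy as the standard hours behind it, which is why stabilizing routings and standards comes first.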

    2. Measure throughput at the work-order level

    Because individual part numbers move slowly, the work order is usually the most reliable lens:

    • Count work orders released vs. completed over a defined period, segmented by program, family, or process type (e.g., structural machining vs. sheet metal vs. assemblies).
    • Track work-order cycle time (release to completion) and lead-time adherence. Rising cycle time at constant release volume usually signals a throughput or WIP problem.
    • Use hours-based throughput: completed standard hours or earned hours per week is often more stable than units in HMLV.

    In a brownfield environment, this data often lives partly in ERP (work orders, standards) and partly in MES or manual travelers (actual progress). Without at least basic interoperability, you will only see a partial picture.
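
    A minimal sketch of the work-order view, assuming release and completion dates are available per order. The tuple layout and the flow_summary helper are hypothetical; the median is used here only because it is less sensitive to a few very long orders than the mean.

```python
from datetime import date
from statistics import median

# Hypothetical work orders: (id, release_date, completion_date or None if still open)
orders = [
    ("WO-1", date(2024, 3, 1), date(2024, 3, 15)),
    ("WO-2", date(2024, 3, 4), date(2024, 3, 29)),
    ("WO-3", date(2024, 3, 11), None),
    ("WO-4", date(2024, 3, 18), date(2024, 3, 28)),
]


def flow_summary(orders, start, end):
    """Count releases and completions in a window and report median cycle time."""
    released = sum(1 for _, rel, _ in orders if start <= rel <= end)
    closed = [(rel, done) for _, rel, done in orders if done and start <= done <= end]
    cycle_days = [(done - rel).days for rel, done in closed]
    return {
        "released": released,
        "completed": len(closed),
        "median_cycle_days": median(cycle_days) if cycle_days else None,
    }


march = flow_summary(orders, date(2024, 3, 1), date(2024, 3, 31))
# march == {"released": 4, "completed": 3, "median_cycle_days": 14}
```

    Releases outpacing completions over several periods, combined with a rising median, is the pattern the cycle-time bullet above describes.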

    3. Focus on constraint-step throughput

    For most aerospace shops, true throughput is limited by a few resources such as specialized machines, inspection, NDI, or a specific skilled labor pool. Measuring throughput at these constraint steps is usually more actionable than measuring finished assemblies:

    • Identify the constraint with loading studies or simple observation (persistent queues, high overtime, chronically late operations).
    • Measure completed operations at the constraint per day or per shift, ideally normalized by planned hours.
    • Track queue time before the constraint as an early indicator of collapsing throughput.
    • Segment by mix (family, complexity, customer) so you can see when the mix has effectively reduced the constraint’s output.

    This approach fits both legacy and modern MES: even if you only have paper travelers, you can sample how many operations exit a key machine or cell per day. Digital systems make it easier but do not replace the need to reason about the real constraint.
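
    One way to normalize constraint output by planned hours, and to expose shift-to-shift differences, is sketched below. The shift labels, sample numbers, and constraint_output name are illustrative assumptions; even a hand-collected tally from paper travelers could feed it.

```python
from collections import defaultdict

# Hypothetical completions at one constraint: (shift, earned_standard_hours)
completed_ops = [("day", 2.5), ("day", 3.0), ("night", 1.5), ("day", 1.0), ("night", 2.0)]

# Hours the constraint was scheduled to run on each shift (assumed values).
planned_hours = {"day": 8.0, "night": 7.0}


def constraint_output(ops, planned):
    """Return operation count and earned hours per planned hour for each shift."""
    earned = defaultdict(float)
    count = defaultdict(int)
    for shift, hours in ops:
        earned[shift] += hours
        count[shift] += 1
    return {
        shift: {
            "ops": count[shift],
            "earned_per_planned_hour": earned[shift] / planned[shift],
        }
        for shift in planned
    }


by_shift = constraint_output(completed_ops, planned_hours)
# day: 3 ops, 0.8125 earned hours per planned hour; night: 2 ops, 0.5
```

    A gap between shifts in a ratio like this is a prompt to look at mix, staffing, and delays before drawing conclusions about crew performance.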

    4. Use routing-step completion as an intermediate metric

    For complex, long-cycle assemblies, you will not see many finished units in any given week. Routing-step throughput gives a more continuous signal:

    • Count completed operations per resource or area (e.g., ops 20/30/40 in major machining cells) and trend them.
    • Track first-pass completion vs. rework operations to separate true throughput from churn.
    • Measure operation-level cycle time: start-to-complete at each critical step.

    Operation-level data often comes from MES, digital travelers, or time collection systems. If your plant still relies heavily on manual sign-off, the first step may be to digitize routing progress (even with light-weight scanners or tablets) before attempting fine-grained throughput measurement.
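
    The split between first-pass and rework completions can be sketched as a simple tally per resource. The record layout, the is_rework flag, and the step_throughput name are assumptions; how rework operations are actually flagged depends on the MES or traveler design.

```python
from collections import Counter

# Hypothetical operation completions: (resource, op_number, is_rework)
ops = [
    ("5-axis", 20, False),
    ("5-axis", 30, False),
    ("5-axis", 30, True),
    ("CMM", 40, False),
    ("CMM", 40, True),
    ("CMM", 40, True),
]


def step_throughput(ops):
    """Count first-pass and rework completions separately for each resource."""
    first_pass = Counter(res for res, _, rework in ops if not rework)
    rework = Counter(res for res, _, rework in ops if rework)
    resources = sorted(set(first_pass) | set(rework))
    return {res: {"first_pass": first_pass[res], "rework": rework[res]} for res in resources}


per_resource = step_throughput(ops)
# per_resource["CMM"] == {"first_pass": 1, "rework": 2}
```

    In this made-up sample the CMM completes three operations, but only one of them is new work, which a raw completion count would hide.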

    5. Combine throughput with WIP and lead time

    Throughput alone is easy to misread in HMLV without context on WIP and lead time:

    • WIP vs. throughput: if WIP keeps growing while throughput is flat, you are loading the system faster than it can execute.
    • Lead time vs. throughput: if lead times are rising while reported throughput is stable, your throughput metric may be missing rework, queueing, or partial completions.
    • Program-level view: measure throughput and WIP by program or customer to detect where mix is silently consuming capacity.

    These views usually require at least basic alignment between ERP (order/WIP quantities), MES (operation status), and scheduling tools. In many aerospace plants, spreadsheets bridge the gaps; this is workable if the interfaces and data definitions are tightly controlled and periodically reconciled.
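
    A quick cross-check for the WIP, throughput, and lead-time relationships above is Little's Law, which relates the long-run averages in a stable system: average lead time equals average WIP divided by average throughput. The sketch below applies it with made-up numbers; the function name and units are assumptions, and the result is a sanity check rather than a forecast.

```python
def implied_lead_time(avg_wip, throughput_per_period):
    """Little's Law: average lead time = average WIP / average throughput.

    Units follow the inputs, e.g. orders and orders per week give weeks.
    """
    if throughput_per_period <= 0:
        raise ValueError("throughput must be positive")
    return avg_wip / throughput_per_period


# Illustrative: 120 open orders, 15 completed per week.
baseline = implied_lead_time(avg_wip=120, throughput_per_period=15)  # 8.0 weeks

# WIP grows to 150 while throughput stays flat.
loaded = implied_lead_time(avg_wip=150, throughput_per_period=15)  # 10.0 weeks
```

    If measured lead time is far above the implied value, the throughput or WIP count is probably missing something, such as rework loops or partial completions.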

    6. Attribute non-productive time and variability

    HMLV throughput is often constrained by unplanned variability rather than by nominal cycle times. To understand true throughput, you need to distinguish productive from non-productive time:

    • Log major causes of delay at constraint resources: waiting on NC programs, tooling, FAI approval, MRB decisions, material, or engineering changes.
    • Quantify rework and scrap at each step, not just at final inspection. High rework consumes capacity and inflates apparent throughput if you only count operations completed.
    • Measure schedule adherence at the operation level: how often do operations start and finish within their planned window?

    Without reasonably consistent reason codes and operator reporting, any throughput number will hide as much as it reveals. Digital work instructions and digital travelers can help standardize cause coding, but only if governance and training are in place.
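
    Once delays are logged against reason codes, a Pareto ranking shows where lost hours concentrate. The reason codes, hours, and the pareto helper below are hypothetical and only illustrate the shape of the analysis.

```python
from collections import defaultdict

# Hypothetical delay log at a constraint resource: (reason_code, hours_lost)
delays = [
    ("NC_PROGRAM", 3.0),
    ("TOOLING", 1.5),
    ("FAI_APPROVAL", 4.0),
    ("NC_PROGRAM", 2.5),
    ("MATERIAL", 1.0),
    ("FAI_APPROVAL", 2.0),
]


def pareto(delays):
    """Rank reasons by hours lost and add the cumulative percentage of the total."""
    hours = defaultdict(float)
    for reason, lost in delays:
        hours[reason] += lost
    total = sum(hours.values())
    ranked = sorted(hours.items(), key=lambda item: item[1], reverse=True)
    rows, cumulative = [], 0.0
    for reason, lost in ranked:
        cumulative += lost
        rows.append((reason, lost, round(100 * cumulative / total, 1)))
    return rows


rows = pareto(delays)
# rows[0] == ("FAI_APPROVAL", 6.0, 42.9); the top two reasons cover 82.1% of lost hours
```

    A ranking like this is only meaningful when operators apply the reason codes consistently, which is the governance point made above.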

    7. Deal explicitly with FAI, one-offs, and engineering churn

    In aerospace, throughput is frequently distorted by FAIs, prototype lots, and engineering change-driven disruption:

    • Separate FAI and NPI lots from steady-state production when calculating throughput trends. FAIs often take longer and require more stops for inspection and approvals.
    • Tag one-offs and repairs so they are not mixed into baseline throughput metrics for recurring part numbers or assemblies.
    • Measure engineering-change impact explicitly (e.g., hours lost or days of delay due to ECO holds), rather than attributing all variability to the shop floor.

    In brownfield stacks, this usually requires clear coding in ERP and consistent use of routing or order attributes that MES can read. Without that discipline, data from FAIs and one-offs will contaminate “normal” throughput measures.
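
    The effect of mixing lot types can be shown with a small sketch that reports cycle time for steady-state production separately from FAI, NPI, and repair orders. The lot_type attribute, its values, and the sample numbers are assumptions standing in for whatever order coding the ERP actually carries.

```python
from statistics import mean

# Hypothetical completed orders: (order_id, lot_type, cycle_days)
orders = [
    ("WO-1", "production", 12),
    ("WO-2", "FAI", 30),
    ("WO-3", "production", 14),
    ("WO-4", "repair", 21),
    ("WO-5", "NPI", 40),
    ("WO-6", "production", 13),
]

# Lot types treated as steady-state baseline (assumed convention).
BASELINE_TYPES = {"production"}


def split_cycle_time(orders):
    """Compare mean cycle time for baseline lots, excluded lots, and the blend."""
    baseline = [days for _, lot, days in orders if lot in BASELINE_TYPES]
    excluded = [days for _, lot, days in orders if lot not in BASELINE_TYPES]
    everything = baseline + excluded
    return {
        "baseline_mean_days": mean(baseline) if baseline else None,
        "excluded_mean_days": mean(excluded) if excluded else None,
        "blended_mean_days": mean(everything) if everything else None,
    }


summary = split_cycle_time(orders)
# baseline mean is 13 days; the blended mean is about 21.7 days
```

    In this sample the blended figure overstates steady-state cycle time by more than a week, which is the contamination the paragraph above warns about.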

    8. Practical data strategies in brownfield environments

    In many aerospace plants, you will not be able to deploy a clean-sheet MES or scheduling system quickly due to validation, qualification, and downtime constraints. Throughput measurement must work with existing systems:

    • Start with what you have: use ERP work-order completions and simple time stamps to build an initial throughput view, even if some steps are manual.
    • Add light-weight data capture at constraints: barcode scans or basic digital travelers focused on the bottleneck resources often deliver more value than a plant-wide big-bang rollout.
    • Align master data before deep analytics: inconsistent routings, units, and part families across ERP and MES will undermine any throughput KPI, regardless of tooling.
    • Use incremental validation: for regulated environments, treat new throughput calculations as software features subject to change control and documented verification, especially if they feed planning or customer commitments.

    Full system replacement purely to improve throughput visibility is rarely justifiable in aerospace. The validation burden, integration complexity, and risk of disrupting qualified processes often outweigh potential gains. Layered, interoperable solutions and targeted digitization around constraints are usually safer and faster paths.

    9. Governance and interpretation

    Finally, throughput metrics in HMLV aerospace must be governed and interpreted carefully:

    • Define each metric precisely (what is counted, which orders, which hours) and lock that definition under document control.
    • Review metrics with operations, quality, and engineering together so that changes in throughput are not misattributed to a single function.
    • Periodically reconcile metrics to reality (shop-floor walks, sample job histories) to catch data quality or integration issues before they drive bad decisions.

    Used this way, throughput metrics in low-volume, high-mix aerospace environments become a tool for targeted improvement around real constraints, rather than a superficial scoreboard.

  • Which metrics best indicate production system health beyond delivery counts?

    Delivery counts and on-time delivery are lagging outcomes, not true indicators of production system health. In regulated, high-mix environments, you need a balanced set of leading and lagging metrics across flow, quality, assets, workforce, and system integrity. The right mix depends on your data maturity, integrations, and validation constraints.

    1. Flow and stability metrics

    These show whether work moves predictably through the system, independent of short-term expediting.

    • Throughput by constraint / bottleneck resource: Units or standard hours completed at the true constraint, not just shipped. Requires stable routing and time standards.
    • Work-in-process (WIP) by stage: WIP levels at key operations or value-stream segments. Rising WIP at a particular step often signals hidden defects, staffing gaps, or scheduling issues.
    • Queue time vs process time: Ratio of waiting time to actual touch time. A high ratio indicates systemic flow problems, even if deliveries are currently being met via expediting.
    • Schedule adherence: Percentage of orders completed in the planned sequence and time bucket, not just shipped on time. This is a good early-warning metric for firefighting behavior.
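
    Two of the flow metrics above reduce to short calculations, sketched here under simplifying assumptions. The function names and sample data are hypothetical, and this version of schedule adherence checks only the planned time bucket, not the planned sequence.

```python
def queue_to_process_ratio(queue_hours, process_hours):
    """Ratio of waiting time to touch time for an order or routing."""
    if process_hours <= 0:
        raise ValueError("process_hours must be positive")
    return queue_hours / process_hours


# Hypothetical orders: (order_id, planned_week, actual_completion_week)
orders = [("WO-1", 10, 10), ("WO-2", 10, 11), ("WO-3", 11, 11), ("WO-4", 12, 12)]


def schedule_adherence(orders):
    """Percentage of orders completed in their planned time bucket."""
    on_plan = sum(1 for _, planned, actual in orders if planned == actual)
    return 100.0 * on_plan / len(orders)


ratio = queue_to_process_ratio(queue_hours=72.0, process_hours=6.0)  # 12.0
adherence = schedule_adherence(orders)  # 75.0
```

    A ratio of 12 means the order waited twelve hours for every hour it was worked on, which points at flow rather than at cycle time.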

    2. Quality and rework metrics

    Healthy operations show stable, low variation in quality performance, with visible and acted-on feedback loops.

    • First pass yield (FPY) at key operations: Percentage of units passing a step without rework or deviation. In aerospace and similar environments, include concessions and use-as-is dispositions, not just hard rejects.
    • Final yield: Good units shipped vs total units started for a part number or family. Sensitive to scrap, rework, and test failures.
    • Cost of poor quality (COPQ): Labor, material, and overhead consumed by scrap, rework, MRB activity, and customer returns. Calculation methods vary and should be documented to remain auditable.
    • NCR rate and severity: Nonconformance count per unit or per labor hour, stratified by criticality (e.g., safety/airworthiness related vs minor). Requires consistent coding in your QMS or MES.
    • Rework cycle time: Time from NCR creation to closure. Long durations indicate systemic bottlenecks in MRB, inspector availability, or engineering decision-making.
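
    The yield definitions above can be made concrete with a short sketch. Following the FPY bullet, concessions and use-as-is dispositions are counted as non-first-pass here; the function names and the sample counts are assumptions, and a plant's documented definition should take precedence.

```python
def first_pass_yield(units_in, reworked, rejected, concessions):
    """Share of units passing a step with no rework, reject, or concession."""
    if units_in <= 0:
        raise ValueError("units_in must be positive")
    non_first_pass = reworked + rejected + concessions
    return (units_in - non_first_pass) / units_in


def final_yield(units_started, good_units_shipped):
    """Good units shipped as a share of units started."""
    if units_started <= 0:
        raise ValueError("units_started must be positive")
    return good_units_shipped / units_started


# Illustrative step: 40 units in, 3 reworked, 1 rejected, 2 accepted on concession.
fpy = first_pass_yield(units_in=40, reworked=3, rejected=1, concessions=2)  # 0.85

# Of the 40 started, 36 eventually shipped as good units.
fy = final_yield(units_started=40, good_units_shipped=36)  # 0.9
```

    The gap between the two numbers is the work recovered through rework and dispositions, which is capacity the first-pass figure makes visible.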

    3. Asset and equipment performance

    The goal is predictable capability and availability, not just high utilization.

    • Overall equipment effectiveness (OEE) for critical assets: The product of availability, performance, and quality rates. In high-mix contexts, OEE is useful mainly when normalized carefully and limited to selected constraint resources.
    • Planned vs unplanned downtime: Percentage of machine downtime that occurs as planned maintenance, setups, or changeovers vs unexpected events. A rising unplanned share is an early signal of reliability and maintenance issues.
    • Mean time between failures (MTBF) / Mean time to repair (MTTR): For key machines, especially those with long qualification cycles or tooling lead times.
    • Setup and changeover time: Particularly important in high-mix, low-volume operations. Trends here directly affect your ability to maintain flow without excess WIP.
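
    The asset metrics above follow standard formulas, sketched below with made-up inputs. Using earned standard hours over run hours for the performance rate is one possible normalization for high-mix work, not the only one; the function names and parameters are assumptions for the example.

```python
def oee(planned_hours, run_hours, earned_standard_hours, total_units, good_units):
    """Return availability, performance, quality, and their product (OEE)."""
    availability = run_hours / planned_hours
    # Assumed high-mix normalization: standard hours earned per hour actually run.
    performance = earned_standard_hours / run_hours
    quality = good_units / total_units
    return {
        "availability": availability,
        "performance": performance,
        "quality": quality,
        "oee": availability * performance * quality,
    }


def mtbf(operating_hours, failures):
    """Mean time between failures: operating time divided by failure count."""
    return operating_hours / failures


def mttr(repair_hours, repairs):
    """Mean time to repair: total repair time divided by repair count."""
    return repair_hours / repairs


# Illustrative shift on a constraint machine.
result = oee(planned_hours=8.0, run_hours=6.0, earned_standard_hours=5.4,
             total_units=20, good_units=19)
# availability 0.75, performance 0.9, quality 0.95, OEE about 0.641

machine_mtbf = mtbf(operating_hours=120.0, failures=3)  # 40.0 hours
machine_mttr = mttr(repair_hours=4.5, repairs=3)  # 1.5 hours
```

    Reporting the three rates alongside the product matters: the same OEE value can come from an availability problem or a quality problem, and the responses differ.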

    4. Labor, standard work, and workforce health

    Delivery can be maintained short-term by burning people out. System health metrics must expose this.

    • Labor productivity: Value-added hours vs total hours, or earned standard hours vs actual hours worked. In regulated settings, ensure the standard data and actuals are controlled and traceable.
    • Overtime level and distribution: Percentage of hours worked as overtime, by area. Sustained high overtime often masks capacity, planning, or training issues.
    • Training and certification coverage: Percentage of operations run by properly certified / qualified operators per QMS requirements. Depends on robust training records and controlled work instruction systems.
    • Adherence to standard work: Measured via layered process audits, digital work instruction usage, or similar. Non-adherence is a leading indicator of future quality and safety problems.

    5. Planning and material health

    Production health is fragile when material and planning signals are unstable, even if deliveries look fine right now.

    • Material availability at schedule release: Percentage of work orders that can start on time with all required materials, tooling, and documents available. Requires integration between ERP, MES, and stores.
    • Shortage count and recurrence: Number of active shortages, frequency of repeat shortages on the same parts, and impact on constrained resources.
    • Reschedule churn: Frequency and magnitude of work-order rescheduling and priority changes. High churn indicates weak demand signals or unstable planning parameters.

    6. System integrity and compliance signals

    In regulated environments, system health includes the trustworthiness and stability of the digital backbone.

    • Data integrity incidents: Number of issues such as misaligned revisions, missing signatures, incorrect routings, or broken genealogy links detected in production or audits.
    • Document and revision adherence: Percentage of work performed to the correct, approved revision of drawings, specifications, and work instructions. This generally requires MES or digital traveler controls.
    • Audit and LPA findings: Trends in internal audit and layered process audit findings tied to production processes and documentation control.
    • Rework related to configuration errors: Portion of defects caused by wrong parts, revisions, or routings, which often arise from weak system integration rather than operator skill.

    7. Choosing metrics realistically in brownfield environments

    The list above is intentionally broad. In most brownfield plants with mixed MES, ERP, PLM, and QMS systems, you cannot measure all of these reliably on day one.

    • Start from critical constraints: Focus on metrics around the few resources, operations, or product families that drive most lead time, risk, or margin.
    • Assess data readiness: Before setting a metric as a KPI, verify that definitions are clear, time stamps align across systems, and manual workarounds are sustainable and auditable.
    • Avoid full replacement as a prerequisite: Waiting for a new monolithic system to replace legacy MES/ERP/QMS to “get perfect data” typically delays improvement and introduces qualification and downtime risk.
    • Validate calculations in regulated contexts: Where metrics may be used in decisions that affect product quality or compliance (e.g., risk-based sampling, staffing decisions), ensure calculations and reports are controlled, versioned, and validated.

    8. Putting it together as a health dashboard

    A practical production health view usually includes a small, stable set of metrics across categories, not a long list:

    • Flow: WIP by stage, queue vs process time, schedule adherence.
    • Quality: FPY at key steps, NCR rate/severity, COPQ trend.
    • Assets: OEE or availability for a few critical assets, unplanned downtime.
    • Workforce: Overtime level, training coverage, layered process audit adherence.
    • Planning/material: Material availability at release, shortage count, reschedule churn.
    • System integrity: Revision adherence, configuration-related defects.

    The exact thresholds and targets will vary by plant, product, and regulatory context. What matters is that the metrics are defined clearly, traceable to their data sources, realistic given existing systems, and stable enough to drive disciplined problem solving rather than short-term firefighting.

  • How many KPIs should be global versus local?

    There is no universal number, but the usual answer is: keep the global set small and the local set purposeful.

    For most regulated manufacturing environments, a practical pattern is 5 to 12 global KPIs that are defined consistently across sites, plus a larger set of local KPIs owned by plants, lines, cells, or functions. The exact split depends on process similarity, data quality, governance discipline, and whether sites are actually comparable.

    What should be global

    Global KPIs should be limited to measures that meet all of these tests:

    • Leadership needs them for cross-site decisions, not just reporting.
    • The definition can be controlled consistently across plants.
    • The underlying data is available with acceptable quality and timing.
    • The metric can survive normal differences in routing, product mix, batch size, maintenance strategy, and quality workflows.

    Typical candidates include a small set around delivery, quality, schedule adherence, inventory, capacity, or cost of poor quality, but only if the calculation logic is actually harmonized. If one site books rework inside standard routing and another books it as a separate event, the same KPI can mean different things.

    What should stay local

    Local KPIs should capture what operators, supervisors, engineering, and quality teams can actually act on day to day. These often include bottleneck-specific losses, queue time between process steps, first-pass behavior by product family, inspection backlog, tooling availability, training coverage, or specific sources of scrap and rework.

    These measures are often more useful operationally than enterprise dashboards because they reflect local constraints. A site building stable repeat assemblies does not need the same local metrics as a high-mix repair operation or a tightly constrained outside-processing flow.

    Why not standardize everything

    Because full standardization usually breaks on operating reality.

    In brownfield environments, plants often run mixed MES, ERP, QMS, historian, spreadsheet, and manual log processes. They may also differ in work definitions, shift calendars, routing granularity, labor booking, and nonconformance handling. Forcing one enterprise KPI model across all sites without fixing those differences usually creates three problems:

    • Metrics look comparable when they are not.
    • Sites spend time arguing definitions instead of improving performance.
    • Teams create shadow reporting outside controlled systems.

    That is why a layered model is usually safer than an all-global model.

    A practical operating model

    A common structure is:

    • Tier 1 global: a small enterprise scorecard used for portfolio decisions and executive review.
    • Tier 2 functional/global-local hybrid: common categories with limited local parameterization, such as quality loss, schedule attainment, or material availability.
    • Tier 3 local: plant, line, cell, or program KPIs tied to actual constraints and daily management.

    This approach preserves comparability where it matters while allowing sites to manage the process they actually run.

    What ratio is reasonable

    If you need a rule of thumb, many organizations are better off with roughly 20 to 30 percent global and 70 to 80 percent local by count. But do not treat that as a target. Some networks need fewer global KPIs because products and processes vary too much. Others can support more global KPIs if they have strong master data, common routing logic, disciplined change control, and validated system integration.

    The real question is not the count. It is whether each KPI has a clear owner, stable definition, trusted source, and a decision that depends on it.
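
    One lightweight way to enforce that test is a KPI register that flags entries missing an owner, definition, source, or decision, and reports the global share by count. The Kpi fields, the sample entries, and the helper names below are hypothetical; this is a sketch of the idea rather than a governance tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kpi:
    name: str
    scope: str  # "global" or "local"
    owner: str
    definition: str
    source: str
    decision: str


REQUIRED = ("owner", "definition", "source", "decision")


def missing_fields(kpi):
    """List the required attributes that are blank for this KPI."""
    return [field for field in REQUIRED if not getattr(kpi, field).strip()]


def global_share(kpis):
    """Fraction of KPIs, by count, that are scoped as global."""
    return sum(1 for k in kpis if k.scope == "global") / len(kpis)


# Illustrative register entries.
register = [
    Kpi("COPQ", "global", "VP Quality", "Scrap + rework + MRB cost / COGS",
        "ERP cost ledger", "Site investment priorities"),
    Kpi("CMM queue time", "local", "Cell lead", "Hours from op complete to CMM start",
        "MES timestamps", "Daily inspection staffing"),
    Kpi("Tooling availability", "local", "", "Kits staged on time / kits required",
        "Tool crib log", ""),
    Kpi("Bonding FPY", "local", "Process engineer", "First-pass units / units in",
        "MES", "Cure cycle review"),
]

gaps = {k.name: missing_fields(k) for k in register if missing_fields(k)}
share = global_share(register)
# gaps == {"Tooling availability": ["owner", "decision"]}; share == 0.25
```

    A KPI that shows up in the gaps list is a candidate for removal or repair before it appears on any dashboard.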

    Common failure modes

    • Too many global KPIs, which turns review meetings into dashboard maintenance.
    • Global KPIs defined centrally but calculated differently in each plant.
    • Local KPIs with no link to business outcomes, which creates optimization in the wrong direction.
    • Metrics introduced before data readiness, causing manual workarounds and low trust.
    • Replacing existing reporting too aggressively, which is risky in validated or heavily controlled environments.

    That last point matters. Full replacement strategies often fail when legacy reporting is tied into qualified processes, audit evidence, or long-established operational routines. Coexistence is usually more realistic: stabilize a small canonical KPI layer first, map source systems carefully, validate calculations where required, and retire old reports gradually under change control.

    Bottom line

    Use as few global KPIs as you can govern well, and as many local KPIs as teams need to run the process responsibly. If a KPI cannot be defined consistently across sites, it should probably not be global.