A Practical Guide To Data Modelling for Asset Reliability Analytics

by Kudzai Manditereza
15 min read

Today, more than ever, manufacturing is about producing more with fewer resources. Many manufacturers are turning to analytics to boost productivity with existing equipment while minimizing poor-quality output. However, they continue to struggle with modeling their machine data in ways that allow them to easily and effectively extract the insights needed for critical decision-making.

In this article, we provide practical guidance on how to model machine data for key Asset Reliability KPIs and metrics, including Overall Equipment Effectiveness (OEE), Mean Time to Repair (MTTR), and Mean Time Between Failures (MTBF). We’ll start by examining why traditional approaches to machine data modeling often fall short, then introduce a more effective method and discuss its advantages. Finally, we’ll explain how to map these data model instances to an MQTT-based Unified Namespace, enabling real-time data streams through a contextualization pipeline.

Key Metrics and KPIs for Asset Reliability Analytics

Let’s begin by exploring some of the key KPIs that manufacturers often rely on for asset reliability analytics.

Overall Equipment Effectiveness (OEE)

One of the most impactful and relatively easy metrics to focus on is Overall Equipment Effectiveness (OEE), which measures how effectively a manufacturing operation is utilized. OEE is typically broken down into three components:

  • Availability – Tracks how much of the planned production time the machine is actually running, which is reduced by downtime.

  • Performance – Indicates whether the equipment is running at optimal throughput speeds by tracking machine cycles.

  • Quality – Accounts for defects, rework, and other production losses.

By assessing these factors, manufacturers can pinpoint where inefficiencies occur (machine failures, setup delays, slow cycle times, or excessive defects) and then take targeted steps to address them.
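The three components are conventionally combined by multiplication. The formulation below is the commonly used textbook definition, given here as a reference sketch rather than something prescribed by this article; the term names (run time, planned production time, total count, good count) are the usual ones and are assumptions as far as this text is concerned.

```latex
\[
\begin{aligned}
\text{Availability} &= \frac{\text{run time}}{\text{planned production time}} \\[4pt]
\text{Performance} &= \frac{\text{ideal cycle time} \times \text{total count}}{\text{run time}} \\[4pt]
\text{Quality} &= \frac{\text{good count}}{\text{total count}} \\[4pt]
\text{OEE} &= \text{Availability} \times \text{Performance} \times \text{Quality}
\end{aligned}
\]
```

As an illustration with made-up numbers: a 480-minute shift with 60 minutes of downtime gives a run time of 420 minutes and an Availability of 420 / 480 = 0.875. With an ideal cycle time of 2 minutes and 189 parts produced, Performance is (2 × 189) / 420 = 0.9. If 180 of those parts are good, Quality is 180 / 189 ≈ 0.952, and OEE is 0.875 × 0.9 × 0.952 = 0.75.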

Mean Time to Repair (MTTR)

The next critical KPI is Mean Time to Repair (MTTR), which measures how quickly machines are fixed following a breakdown. MTTR is key to understanding maintenance efficiency and reducing overall downtime.

Mean Time Between Failures (MTBF)

Also essential is Mean Time Between Failures (MTBF), which indicates how long, on average, equipment operates between failures. Monitoring MTBF helps gauge the reliability of machines and guides preventive maintenance strategies.
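Both maintenance KPIs are simple averages over a chosen time window. The definitions below are the widely used ones; conventions vary between organizations (for example, whether planned stops count as repairs), so treat them as a working assumption rather than a fixed standard.

```latex
\[
\text{MTTR} = \frac{\text{total repair time}}{\text{number of repairs}},
\qquad
\text{MTBF} = \frac{\text{total operating time}}{\text{number of failures}}
\]
```

For example, a machine that operated for 420 minutes and had 3 failures taking 60 minutes in total to repair has an MTTR of 60 / 3 = 20 minutes and an MTBF of 420 / 3 = 140 minutes.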

Challenges with the Conventional Machine Data Modelling Approach

A key question in asset reliability analytics is how to model and collect machine data so that metrics and KPIs can be extracted seamlessly. A common strategy for many manufacturers is to create a data model for each type of machine in the organization, then stream data through instances of these models into analytics tools for customized calculations of the metrics and KPIs.

However, this approach can pose several challenges, including the following:

Excessive Complexity

First, physical industrial systems are inherently complex, so attempting to capture every detail of every machine in a model quickly becomes difficult to manage.

Scalability Issues

Second, it is not uncommon for companies to have tens of thousands of assets, so defining every asset type and subtype, along with their relationships, significantly slows down equipment onboarding. Although teams may be able to model a few machines in a pilot project, expanding beyond that often proves overwhelming, and many initiatives stall at the pilot stage.

Limited Reusability

Third, when data models are built solely to represent specific machines or compute certain KPIs, they cannot be easily repurposed for other types of analytics or higher-level initiatives (for example, digital twins). This lack of flexibility undermines the return on investment from data analytics efforts.

Missing Contextual Information

Lastly, critical details for meaningful analytics, such as shift schedules, operator assignments, product details, or downtime reasons, usually fall outside the scope of a machine-specific data model.

Effective Machine Data Modelling for Asset Reliability Analytics

Let’s explore a more effective strategy for modeling machine data in order to generate meaningful performance analytics. The key is to identify units of information that are common across all machines, rather than modeling every machine individually. As noted earlier, the building blocks for generating OEE, MTTR, and MTBF metrics for a machine or production line are downtime events, machine cycles, and defects.

Instead of creating separate data models for each piece of equipment, you need to focus on building standardized foundational data models around these three core elements. By doing so, you establish a framework that supports consistent KPI generation across all machines and production lines.

Building Data Models for Machine Performance Analytics

Constructing these models involves unifying data from multiple sources through contextualization. For example, a data model for Downtime might combine:

  • Machine or PLC data indicating when downtime starts and ends

  • Manual or automated reason codes for the downtime

  • Operator and shift information

  • Physical location of the machine

This context-rich data structure provides everything needed for calculating Downtime-related KPIs and can be adapted to other metrics as well.
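The combination step listed above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `ASSET_REGISTRY` and `SHIFT_ROSTER` tables, the `stopped_at` / `resumed_at` field names of the raw PLC event, and the shift hours are all hypothetical stand-ins for whatever your PLC, MES, and HR systems actually expose.

```python
from dataclasses import dataclass, asdict
from datetime import datetime

# Hypothetical lookup tables standing in for an asset registry and a shift roster.
ASSET_REGISTRY = {
    "MP-001": {"site": "Munich", "area": "Pressing", "line": "PressLine1"},
}
SHIFT_ROSTER = [
    # (shift_id, operator_id, first hour inclusive, last hour exclusive)
    ("Shift 1", "Employee 001", 6, 14),
    ("Shift 2", "Employee 002", 14, 22),
]


@dataclass
class DowntimeEvent:
    machine_id: str
    downtime_start: str
    downtime_end: str
    reason_code: str
    shift_id: str
    operator_id: str
    site: str
    area: str
    line: str


def lookup_shift(timestamp: datetime) -> tuple:
    """Return (shift_id, operator_id) for the shift covering the timestamp."""
    for shift_id, operator_id, first_hour, last_hour in SHIFT_ROSTER:
        if first_hour <= timestamp.hour < last_hour:
            return shift_id, operator_id
    return "Unknown", "Unknown"


def contextualize(plc_event: dict, reason_code: str) -> DowntimeEvent:
    """Merge a raw stop/start signal with reason, shift, and location context."""
    started = datetime.fromisoformat(plc_event["stopped_at"])
    shift_id, operator_id = lookup_shift(started)
    location = ASSET_REGISTRY[plc_event["machine_id"]]
    return DowntimeEvent(
        machine_id=plc_event["machine_id"],
        downtime_start=plc_event["stopped_at"],
        downtime_end=plc_event["resumed_at"],
        reason_code=reason_code,
        shift_id=shift_id,
        operator_id=operator_id,
        **location,  # adds site, area, and line
    )


raw = {
    "machine_id": "MP-001",
    "stopped_at": "2025-01-15 13:56:04",
    "resumed_at": "2025-01-15 15:29:06",
}
event = contextualize(raw, reason_code="MNT-03")
print(asdict(event))
```

The point of the sketch is that the machine itself only supplies the two timestamps; everything else is joined in from other systems before the event is published.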

Let's look at examples of these three core data models in detail, and examine how these foundational models come together to generate the key asset reliability metrics you need.

Downtime Data Model

Below is an example JSON structure of a Downtime data model.

{
    "date_created": "2025-01-30 20:55:24",
    "machine_id": "MP-001",
    "machine_name": "MP1 - Mechanical Press 1",
    "downtime_start": "2025-01-15 13:56:04",
    "downtime_end": "2025-01-15 15:29:06",
    "downtime_type": "Unplanned",
    "reason_code": "MNT-03",
    "description": "emergency_repair",
    "shift_id": "Shift 1",
    "operator_id": "Employee 001",
    "site": "Munich",
    "area": "Pressing",
    "line": "PressLine1"
}

The Downtime data model captures essential information about any downtime event, regardless of the machine type. It includes the machine name and a unique ID, enabling you to group or filter data for specific machines in your analytics platform. Using the start and end times, you can calculate the total duration of each downtime event, which is crucial for Availability calculations and OEE metrics.

Additionally, downtime events are categorized as either planned or unplanned, providing valuable insights for root-cause analysis. Consistent reason codes allow you to track common issues and prioritize improvement efforts effectively. The model also includes location information, offering visibility and enabling comparisons across different sites or the entire enterprise.

By using a standardized Downtime data model, you ensure each downtime event is documented with all the relevant context, making it straightforward to aggregate, analyze, and ultimately reduce these events in the future. Essentially, this model forms the foundation for calculating OEE, MTTR, MTBF and other critical KPIs by providing a clear, consistent snapshot of every downtime incident across the enterprise.
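As a rough sketch of how these calculations might look, the Python below derives Availability, MTTR, and MTBF from a list of Downtime records. It assumes a simplified convention that is not stated in the article: `scheduled_minutes` is the total scheduled time in the window, every recorded downtime event is subtracted from it to get run time, and only events with `downtime_type` equal to `"Unplanned"` count as failures. The second and third sample events are invented for illustration.

```python
from datetime import datetime


def duration_seconds(event: dict) -> float:
    """Length of one downtime event, from its start and end timestamps."""
    start = datetime.fromisoformat(event["downtime_start"])
    end = datetime.fromisoformat(event["downtime_end"])
    return (end - start).total_seconds()


def reliability_kpis(events: list, scheduled_minutes: float) -> dict:
    """Availability, MTTR and MTBF for one machine over one time window."""
    unplanned = [e for e in events if e["downtime_type"] == "Unplanned"]
    total_down = sum(duration_seconds(e) for e in events) / 60
    repair = sum(duration_seconds(e) for e in unplanned) / 60
    run_time = scheduled_minutes - total_down
    failures = len(unplanned)
    return {
        "availability": run_time / scheduled_minutes,
        # Undefined when there were no failures in the window.
        "mttr_minutes": repair / failures if failures else None,
        "mtbf_minutes": run_time / failures if failures else None,
    }


events = [
    {"downtime_type": "Planned",
     "downtime_start": "2025-01-15 09:00:00", "downtime_end": "2025-01-15 09:30:00"},
    {"downtime_type": "Unplanned",
     "downtime_start": "2025-01-15 13:56:04", "downtime_end": "2025-01-15 15:29:06"},
    {"downtime_type": "Unplanned",
     "downtime_start": "2025-01-15 18:00:00", "downtime_end": "2025-01-15 18:26:58"},
]
kpis = reliability_kpis(events, scheduled_minutes=960)
print(kpis)
```

With these sample events and a 960-minute window, total downtime is 150 minutes, so Availability is 810 / 960 = 0.84375, MTTR is 120 / 2 = 60 minutes, and MTBF is 810 / 2 = 405 minutes.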

Cycle Data Model

Below is an example JSON structure for a Cycle data model.

{
  "machine_id": "MACHINE_001",
  "cycle_id": "CYCLE_20250130_000123",
  "start_time": "2025-01-30T10:00:00Z",
  "end_time": "2025-01-30T10:02:10Z",
  "cycle_time_seconds": 130,
  "ideal_cycle_time_seconds": 120,
  "operator_id": "OPERATOR_123",
  "shift_id": "Shift 1",

  "product_info": {
    "product_id": "PRODUCT_A",
    "part_id": "PART_8976"
  },

  "production_data": {
    "quantity_produced": 1,
    "quantity_scrapped": 0
  },

  "quality_data": {
    "inspection_type": "visual",
    "defects_found": 0
  },

  "additional_attributes": {
    "temperature_celsius": 200,
    "vibration_level": 0.3,
    "energy_consumption_kwh": 0.25
  },

  "site": "Munich",
  "area": "Pressing",
  "line": "PressLine1"
}

This Cycle data model includes machine identification along with the start and end times for each cycle. It also records the measured cycle time (derived from those timestamps) and the ideal cycle time; comparing the two yields the Performance component of Overall Equipment Effectiveness (OEE).

Additionally, the model provides context about the parts being produced and includes quality data to ensure production standards are met. It also offers the option to track environmental and process variables such as temperature (°C), vibration levels, and energy consumption (kWh). Monitoring these variables may support advanced analytics use cases beyond OEE, including predictive maintenance and energy efficiency improvements.
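To make the Performance calculation concrete, here is a minimal Python sketch that works on a list of Cycle records. It assumes that `quantity_produced` includes any scrapped parts and that Performance is computed as ideal time for the parts produced divided by the time actually spent; both are common conventions but are assumptions here. The `make_cycle` helper and the sample cycle times are invented for the example.

```python
def performance(cycles: list) -> float:
    """Ideal time for the parts produced, divided by the time actually spent."""
    ideal = sum(
        c["ideal_cycle_time_seconds"] * c["production_data"]["quantity_produced"]
        for c in cycles
    )
    actual = sum(c["cycle_time_seconds"] for c in cycles)
    return ideal / actual if actual else 0.0


def quality(cycles: list) -> float:
    """Share of produced parts that were not scrapped."""
    produced = sum(c["production_data"]["quantity_produced"] for c in cycles)
    scrapped = sum(c["production_data"]["quantity_scrapped"] for c in cycles)
    return (produced - scrapped) / produced if produced else 0.0


def make_cycle(cycle_time_seconds: int, scrapped: int = 0) -> dict:
    """Hypothetical helper that builds a pared-down Cycle record."""
    return {
        "cycle_time_seconds": cycle_time_seconds,
        "ideal_cycle_time_seconds": 120,
        "production_data": {"quantity_produced": 1, "quantity_scrapped": scrapped},
    }


cycles = [make_cycle(130), make_cycle(150), make_cycle(120, scrapped=1)]
print(performance(cycles), quality(cycles))
```

For the three sample cycles, the ideal time is 360 seconds against 400 seconds actually spent, so Performance is 0.9, and two of the three parts are good, so Quality is about 0.667.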

Defect Data Model

Below is an example JSON structure for a Defect data model.

{
    "date_created": "2025-01-30 20:55:24",
    "part_id": "PART7890",
    "machine_id": "BATCH5678",
    "defect_test_time": "2025-01-30T20:57:24.1840199+01:00",
    "category": "Dimensional",
    "defect_type": "Misalignment",
    "quantity": 3,
    "description": "Misaligned by 0.5 mm beyond the acceptable tolerance.",
    "detection_method": "Automated Inspection",
    "severity": "Major",
    "status": "In Review",
    "resolved_by": "",
    "resolution_details": {
        "resolution_action": "",
        "resolution_time": "",
        "comments": ""
    },
    "site": "Munich",
    "area": "Pressing",
    "line": "PressLine1"
}

The Defect data model captures detailed information about each defect produced by a machine. It includes contextual data about the part being manufactured, the machine's identity, the type of defect, and its current status. This information allows you to track the production of low-quality goods by each machine, which is essential for calculating the Quality component of Overall Equipment Effectiveness (OEE).

When combined, the Downtime, Cycle and Defect data models provide a complete view of production performance. You can see how efficiently machines are running, where quality issues arise, and how downtime events factor in, all of which help generate OEE, MTTR, and other critical analytics.
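One way to picture how the three models come together is to reduce each of them to a few totals for a time window and then multiply the resulting components. The Python below is a sketch under assumed conventions: the `WindowTotals` class and its field names are hypothetical, all downtime is subtracted from planned time, and every defective part is counted against Quality.

```python
from dataclasses import dataclass


@dataclass
class WindowTotals:
    planned_seconds: float      # scheduled production time in the window
    downtime_seconds: float     # summed over Downtime events
    ideal_cycle_seconds: float  # taken from Cycle events
    parts_produced: int         # summed over Cycle events
    defective_parts: int        # summed over Defect events


def oee(totals: WindowTotals) -> dict:
    """Compute the three OEE components and their product for one window."""
    run_seconds = totals.planned_seconds - totals.downtime_seconds
    availability = (
        run_seconds / totals.planned_seconds if totals.planned_seconds > 0 else 0.0
    )
    performance = (
        totals.ideal_cycle_seconds * totals.parts_produced / run_seconds
        if run_seconds > 0 else 0.0
    )
    quality = (
        (totals.parts_produced - totals.defective_parts) / totals.parts_produced
        if totals.parts_produced else 0.0
    )
    return {
        "availability": availability,
        "performance": performance,
        "quality": quality,
        "oee": availability * performance * quality,
    }


# Illustrative numbers: an 8-hour shift, 1 hour of downtime, 189 parts, 9 defective.
result = oee(WindowTotals(28800, 3600, 120, 189, 9))
print(result)
```

With these illustrative totals, Availability is 0.875, Performance is 0.9, Quality is about 0.952, and OEE comes out at 0.75.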

UNS for Streaming Data for Asset Reliability Analytics

Let’s explore how the Unified Namespace can be used to integrate OT data into these asset reliability data models for use in a datastore or analytics platform.

The diagram below illustrates how instances of Downtime, Cycle, and Defect data models are integrated into a UNS architecture that leverages MQTT. In this setup, data from various machine sources is streamed in real-time into model instances through a contextualization pipeline and then persisted in a database. 

Figure: Unified Namespace (UNS) for Streaming Data for Asset Reliability Analytics

The contextualization pipeline connects to your operational technology (OT) equipment and machines on one end. On the other end, it streams Downtime, Cycle, and Defect event data into the MQTT broker, which represents the Unified Namespace using MQTT topic hierarchies. These events are then persisted into a datastore within a Machine Performance Analytics platform, enabling comprehensive analysis and performance monitoring.
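As a minimal sketch of the mapping step, the Python below turns a model instance into an MQTT topic and JSON payload. The topic layout (enterprise/site/area/line/machine/event type) and the `acme` enterprise name are assumptions chosen for illustration; the article does not prescribe a specific hierarchy. The code only builds the topic and payload, which you would then hand to the publish call of whichever MQTT client you use.

```python
import json


def sanitize_level(value: str) -> str:
    """Make a value safe to use as a single MQTT topic level.

    '/' separates topic levels, and '+' and '#' are wildcards that may not
    appear in topic names, so all three are replaced with '_'.
    """
    for character in ("/", "+", "#"):
        value = value.replace(character, "_")
    return value.strip()


def uns_message(event: dict, event_type: str, enterprise: str = "acme") -> tuple:
    """Return (topic, payload) for one Downtime, Cycle, or Defect event."""
    levels = [
        enterprise,
        event["site"],
        event["area"],
        event["line"],
        event["machine_id"],
        event_type,
    ]
    topic = "/".join(sanitize_level(str(level)) for level in levels)
    payload = json.dumps(event, sort_keys=True)
    return topic, payload


downtime_event = {
    "machine_id": "MP-001",
    "downtime_start": "2025-01-15 13:56:04",
    "downtime_end": "2025-01-15 15:29:06",
    "downtime_type": "Unplanned",
    "reason_code": "MNT-03",
    "site": "Munich",
    "area": "Pressing",
    "line": "PressLine1",
}
topic, payload = uns_message(downtime_event, "downtime")
print(topic)
print(payload)
```

Because the location fields are already part of every model instance, the topic can be derived from the event itself, and a consumer can subscribe to a whole site, area, or line with a single wildcard subscription.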

Summary

By standardizing your data models, you create structures that are independent of machine types and software systems, allowing all machines and production lines to share the same foundational elements. With a contextualization pipeline, you can stream data from any type of machine into these standardized models that capture core activities, then store this data (for example, as rows in a database) where it can be aggregated and joined to generate higher-level asset reliability metrics.

In other words, focusing on base data models for machine activities, rather than creating a single, monolithic representation of every machine, enables you to derive numerous asset reliability insights from the same data foundation.
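The storage and aggregation step described in this summary can be sketched with Python's built-in `sqlite3` module. The table layouts, column names, and sample rows below are hypothetical and heavily pared down; a production datastore would likely be a time-series or analytical database, and the query is only meant to show that the standardized events can be aggregated and joined with plain SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE downtime (
        machine_id TEXT, line TEXT, downtime_type TEXT,
        downtime_start TEXT, downtime_end TEXT
    );
    CREATE TABLE cycle (machine_id TEXT, line TEXT, quantity_produced INTEGER);
""")
conn.executemany(
    "INSERT INTO downtime VALUES (?, ?, ?, ?, ?)",
    [
        ("MP-001", "PressLine1", "Planned", "2025-01-15 09:00:00", "2025-01-15 09:30:00"),
        ("MP-001", "PressLine1", "Unplanned", "2025-01-15 13:56:04", "2025-01-15 15:29:06"),
        ("MP-001", "PressLine1", "Unplanned", "2025-01-15 18:00:00", "2025-01-15 18:26:58"),
    ],
)
conn.executemany(
    "INSERT INTO cycle VALUES (?, ?, ?)",
    [("MP-001", "PressLine1", 1)] * 3,
)

# Aggregate each event type per line, then join the two aggregates.
SUMMARY_SQL = """
SELECT d.line, d.failures, d.mttr_minutes, c.parts_produced
FROM (
    SELECT line,
           COUNT(*) AS failures,
           AVG(CAST(strftime('%s', downtime_end) AS INTEGER)
             - CAST(strftime('%s', downtime_start) AS INTEGER)) / 60.0 AS mttr_minutes
    FROM downtime
    WHERE downtime_type = 'Unplanned'
    GROUP BY line
) AS d
JOIN (
    SELECT line, SUM(quantity_produced) AS parts_produced
    FROM cycle
    GROUP BY line
) AS c ON c.line = d.line
"""
rows = conn.execute(SUMMARY_SQL).fetchall()
print(rows)
```

Because every event carries the same `line`, `area`, and `site` columns regardless of machine type, the same query works unchanged as more machines are onboarded.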

    Kudzai Manditereza

    Kudzai is a tech influencer and electronic engineer based in Germany. As a Sr. Industry Solutions Advocate at HiveMQ, he helps developers and architects adopt MQTT and HiveMQ for their IIoT projects. Kudzai runs a popular YouTube channel focused on IIoT and Smart Manufacturing technologies and he has been recognized as one of the Top 100 global influencers talking about Industry 4.0 online.
