Solving Real-Time Smart Meter Data Inconsistencies via HiveMQ Data Hub

by Shashank Sharma

Jan 27, 2025 16 min read

The energy sector is experiencing a digital revolution by integrating smart grids and smart metering systems. These systems use IoT devices, such as smart meters, to collect real-time data on energy consumption from homes and businesses. The collected data is then transmitted to backend systems for processing, billing, and analysis.

However, a major challenge in this transformation is inconsistent data formats. Different smart meters from various manufacturers use different data formats and communication protocols. This inconsistency causes several issues:

Data misinterpretation: Backend systems struggle to parse and process data correctly when data formats differ.
Delays in reacting to failure: When a smart meter has any issue, you usually request data after the failure has occurred, process that data manually, and figure out root-cause thereby costing yourself critical time and business value for your customers..
Slow data integration: When different devices use various formats or protocols, it takes extra time to combine and standardize the data. This delay makes it difficult to monitor energy usage in real-time and respond quickly to issues. Data from multiple devices becomes cumbersome and inefficient, slowing real-time energy monitoring and decision-making.

This is where HiveMQ’s Data Hub helps. It helps energy providers ensure that all smart meter data follows a consistent, standardized format.

A Brief Introduction to HiveMQ Data Hub

HiveMQ offers our customers a powerful MQTT platform for moving their data across devices and IT systems. HiveMQ Data Hub, a part of the HiveMQ Platform, provides mechanisms to ensure data integrity and quality as it moves through MQTT-based systems. It allows businesses to define and enforce data policies, transform data formats, and validate incoming messages against schemas. This makes it easier to ensure that the data flowing through IoT systems is accurate, standardized, and ready for use in decision-making.

Here’s how Data Hub addresses the issue of data inconsistency:

Schema Validation: Data Hub enforces a simple, generic JSON schema that all MQTT messages coming into HiveMQ Broker must conform to. This ensures that every message is structured in the same way, making it easy for backend systems to process the data accurately.
Error Handling: If any device sends data that doesn’t meet the required format, Data Hub, based on a given policy, can reject or flag the message. This prevents errors from entering the system and causing downstream problems like data analysis issues.
Seamless Integration: By enforcing a standard format, Data Hub enables easy integration with other systems. Energy utility providers can now quickly aggregate data from various devices (e.g., different models of smart meters), allowing them to monitor energy consumption in real-time and make better-informed decisions.
Scalability: As smart grid systems grow and new devices are added, Data Hub ensures that all incoming data remains consistent, supporting future expansion and new technologies without disrupting operations.

With Data Hub in place, smart meter data is transmitted securely via MQTT over TLS and consistently to backend systems, making it easier for energy companies to handle large volumes of data while reducing the potential for errors.

In this blog post, we cover one of the many use cases covered by Data Hub, i.e. applying a strict data schema across all messages sent via smart meters through the HiveMQ MQTT Platform.

Data Hub Real-World Use Case Example

Let's take a look at a real-world example. A smart meter payload consists of:

Unique identifiers or timestamps
Values as Voltage, current, and energy

The payload in our example is in JSON format.

In such cases, usually data types—as well as precision to decimal points—also play a role. Such precision is sometimes required for accurate calculations such as billing. An example of such a payload would look as follows:

{ 
      "payload": [ "[2, 213.7, 0.38, 0.007420]" 
       ] 
}

In the above example, we see a string containing an array and a unique identifier, voltage, current, and energy in that order. the unique identifier is an integer. The voltage and current are float with two decimal places, and the energy is float with seven decimal places.

With MQTT, the need for complex additional security elements, such as security keys, is greatly simplified. Since MQTT is TLS-enabled, security mechanisms are efficiently managed at the TLS layer, ensuring secure communication. HiveMQ further enhances this by offering robust support for various security authentication and authorization mechanisms, making it easier to safeguard data while maintaining seamless and reliable communication.

To ensure data consistency, we can utilize HiveMQ’s Data Hub to process all incoming MQTT messages and do the in-flight processing. This can be done via a schema file and a data policy file in a two-step process.

Step 1: Create and upload a Schema

Schema: A schema simply refers to a blueprint or layout of data where its structure follows certain guidelines. We can upload a certain schema to Data Hub, and by that, we ask it to check if the incoming messages follow a particular schema.

A JSON schema (smart-meter-energy-data-schema.json) for such a simple use case would look as follows:

{
  "type": "object",
  "properties": {
    "payload": {
      "type": "array",
      "items": [
        {
          "type": "string",
          "pattern": "^\\[\\d+,\\s*\\d+\\.\\d{2},\\s*\\d+\\.\\d{2},\\s*\\d+\\.\\d{7}\\]$"
        }
      ]
    }
  },
  "required": ["payload"],
  "additionalProperties": false
}

A schema like the above is enforced via a Policy. The pattern property ensures that we adhere to the data types in which the values are written.

To upload this schema to the broker, run the following command:

mqtt hivemq schema create --id smart-meter-energy-data-schema --type json --file smart-meter-energy-data-schema.json

Step 2: Create and upload a data policy

Data Policy: It is a set of instructions that tell Data Hub how to process incoming messages by taking schema files as a blueprint. It is also composed in JSON format. In our case, all the smart meter data is stored in a specific topic structure: smart-meter/data/#. We can add this information to our data-policy file as well.

In our case, the data policy file (smart-meter-energy-data-validation-policy.json) would look as follows:

{
  "id": "smart-meter-energy-data-validation-policy",
  "matching": {
    "topicFilter": "smart-meter/data/#"
  },
  "validation": {
    "validators": [
      {
        "type": "schema",
        "arguments": {
          "strategy": "ALL_OF",
          "schemas": [
            {
              "schemaId": "smart-meter-energy-data-schema",
              "version": "latest"
            }
          ]
        }
      }
    ]
  },
  "onSuccess": {
    "pipeline": [
      {
        "id": "flagSchemaChecked",
        "functionId": "Mqtt.UserProperties.add",
        "arguments": {
          "name": "policy",
          "value": "success"
        }
      }
    ]
  },
  "onFailure": {
    "pipeline": [
      {
        "id": "logFailure",
        "functionId": "System.log",
        "arguments": {
          "level": "WARN",
          "message": "The client ${clientId} has sent an invalid energy data payload. The message will be dropped."
        }
      },
      {
        "id": "flagSchemaChecked",
        "functionId": "Mqtt.UserProperties.add",
        "arguments": {
          "name": "policy",
          "value": "failed"
        }
      },
      {
        "id": "dropThePublish",
        "functionId": "Mqtt.drop",
        "arguments": {}
      }
    ]
  }
}

This policy acts as a set of rules to validate the data sent by smart meters. It ensures that the data (such as energy usage, voltage, and current) follows a specific format. If the data is correct and follows the rules, it’s marked as "success." If there’s an error, the system logs a warning, marks the message as "failed," and stops the message from being processed. In other words, if the smart meter's data doesn't match the required format, this section ensures that the invalid message is ignored and not processed further in the system. That means invalid messages do not reach the backend systems.

Alternatively, you can choose to not drop messages that don’t pass your quality checks and store them elsewhere. You can adjust the Policy file accordingly. e.g., redirect invalid messages for further inspection.

To upload smart-meter-energy-data-validation-policy.json to the broker, run the following command:

mqtt hivemq data-policy create --file smart-meter-energy-data-validation-policy.json

This will start processing all incoming messages “in-flight” using the uploaded schema and policy. I used the following example from HiveMQ’s GitHub repo of Data Hub use cases as a baseline for my example, and my sample payload is based on this article. Please note, for running this example, you would need to enable REST-APIs in your configuration and you’ll also need to activate the Data Hub full-license trial.

Once you successfully upload both your schema and data-policy, you can view them in the Control Center.

HiveMQ Data Hub

The Benefits of Real-Time Smart Meter Data Processing

Many legacy systems store such data but do not process them in real time. As a result, when things go wrong, these legacy businesses react to failure instead of acting ahead of time. Instead, having such data in real-time gives you a live view which also impacts your business understanding. You can predict failures by examining and analyzing the real-time data. For example, individual temperature values in a valid range of a meter don't create any alarms, but as you track any changes or fluctuations over a time period, you are likely to identify patterns such as consistent temperature rise in a short time window. This may signal an issue but can be caught only if the data is being analyzed in real-time.

Real-time data flow also saves time for any future downstream batch processing of data as you now have the data in a consistent format. Ultimately, this leads to business efficiency by reducing wrong data, and data inconsistencies as well as enabling you to have critical information, such as total supply and demand, in real time.

A simple representation of such a data flow can look as follows:

Simple representation of data flow from smart meters to enterprise systems via MQTT

Conclusion

Leveraging HiveMQ’s Data Hub for real-time smart meter data processing brings significant advantages to the energy sector. By ensuring data consistency with schema and policy validation, energy providers can avoid data errors and improve operational efficiency. Real-time processing offers businesses a live view of their data, enabling predictive analysis and early detection of failures, which minimizes downtime and optimizes decision-making. Additionally, this approach reduces the need for costly downstream batch processing, as data is already in a consistent format. The move towards real-time data processing enhances business efficiency, providing critical insights into supply and demand in real-time, ultimately improving customer satisfaction.

To learn more about Data Hub use cases, visit our GitHub repo and download HiveMQ to get started using Data Hub today.

Shashank Sharma

Shashank Sharma is a product marketing manager at HiveMQ. He is passionate about technology, supporting customers, and enabling developer-centric workflows. He focuses on the HiveMQ Cloud offerings and has previous experience in application software tooling, autonomous driving, and numerical computing.

Solving Real-Time Smart Meter Data Inconsistencies via HiveMQ Data Hub

A Brief Introduction to HiveMQ Data Hub

Data Hub Real-World Use Case Example

The Benefits of Real-Time Smart Meter Data Processing

Conclusion

Shashank Sharma

Distributed Data Intelligence in Manufacturing: The Path, Benefits, and Pitfalls

Build Stateful IoT Data Pipelines with HiveMQ Data Hub

HiveMQ at Hannover Messe 2025: Unlock the Value of Your Industrial Data