The HiveMQ Data Hub provides mechanisms to define how MQTT data is handled in the HiveMQ broker. This ensures that data quality is assessed at an early stage in the data supply chain, eliminating the need for subscribers to perform resource-intensive validation before data reaches downstream devices or upstream services.
The HiveMQ Data Hub provides the following capabilities to help you enforce data integrity and quality.
Define a policy that dictates agreed-upon behaviors for how devices should work with the MQTT broker, and specify what happens when a client deviates, such as logging, stopping, or transforming the offending behavior. Flow control validates in-flight message flow patterns, and the scripting engine lets you implement behaviors tailored to your specific needs via JavaScript functions.
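For illustration only, a behavioral policy that limits how many messages a client may publish and logs a violation might look roughly like the sketch below; the policy ID, client regex, and quota values are placeholders, and the exact field names can vary between Data Hub versions.

```json
{
  "id": "publish-quota-policy",
  "matching": { "clientIdRegex": ".*" },
  "behavior": {
    "id": "Publish.quota",
    "arguments": { "minPublishes": 1, "maxPublishes": 10000 }
  },
  "onTransitions": [
    {
      "fromState": "Any.*",
      "toState": "Violated",
      "Mqtt.OnInboundPublish": {
        "pipeline": [
          {
            "id": "log-quota-violation",
            "functionId": "System.log",
            "arguments": {
              "level": "WARN",
              "message": "Client ${clientId} exceeded its publish quota"
            }
          }
        ]
      }
    }
  ]
}
```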
Convert or manipulate raw or structured data into the desired format while it moves through the MQTT broker and before it reaches consumers. Transformations move more operations to the edge so data can be standardized. For example, convert Fahrenheit to Celsius, or perform more sophisticated data manipulation with JavaScript.
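As a minimal sketch, a transformation script that converts a Fahrenheit reading to Celsius could look like the following. It assumes the transform(publish, context) entry point that Data Hub transformation scripts use and an already-deserialized JSON payload; the temperatureF and temperatureC field names are purely illustrative.

```javascript
// Minimal Data Hub transformation script sketch (payload field names are illustrative).
// Data Hub calls transform(publish, context) for each matching PUBLISH packet;
// the returned object replaces the original message.
function transform(publish, context) {
  const fahrenheit = publish.payload.temperatureF;
  if (typeof fahrenheit === "number") {
    // Convert Fahrenheit to Celsius and replace the original field.
    publish.payload.temperatureC = (fahrenheit - 32) * 5 / 9;
    delete publish.payload.temperatureF;
  }
  return publish;
}
```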
Use the simple user interface to manage schemas, data policies, and behavioral policies. The dashboard provides an overview of overall quality metrics, making it easy to locate bad actors and bad data sources. Visualize the MQTT data further in tools like Grafana.
Create the blueprint for how data is formatted. Both JSON and Protobuf formats are currently supported.
Define the set of rules and guidelines that enforce what format and values incoming data and messages are expected to have, such as requiring a temperature reading between 100 and 1200.
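Expressed as a JSON Schema that a data policy can reference, such a rule might look like this minimal sketch (the temperature property name is an assumption):

```json
{
  "type": "object",
  "properties": {
    "temperature": { "type": "number", "minimum": 100, "maximum": 1200 }
  },
  "required": ["temperature"]
}
```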
The HiveMQ REST API provides an interface for applications to interact programmatically with the HiveMQ Enterprise MQTT broker.
Create a schema that defines how data should be formatted. Both JSON and Protobuf formats are currently supported, and schemas can vary from simple to complex. The example below is a schema that validates GPS coordinates.
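A minimal version of such a schema, expressed as JSON Schema, could look like this (property names are illustrative):

```json
{
  "type": "object",
  "properties": {
    "latitude":  { "type": "number", "minimum": -90,  "maximum": 90 },
    "longitude": { "type": "number", "minimum": -180, "maximum": 180 }
  },
  "required": ["latitude", "longitude"]
}
```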
An appropriate policy tells HiveMQ how to handle incoming MQTT messages to enforce the rules and structure that the schema outlines. The example below drops the message and logs the result, but policies can be arbitrarily complex and could, for example, re-queue the message. On re-queue, the system could also apply transformation functions to fix bad data.
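A data policy along those lines might look roughly like the sketch below; the policy ID, topic filter, and schema ID are placeholders, and the exact function and field names can vary between Data Hub versions.

```json
{
  "id": "gps-coordinate-policy",
  "matching": { "topicFilter": "vehicles/+/location" },
  "validation": {
    "validators": [
      {
        "type": "schema",
        "arguments": {
          "strategy": "ALL_OF",
          "schemas": [ { "schemaId": "gps-coordinates", "version": "latest" } ]
        }
      }
    ]
  },
  "onFailure": {
    "pipeline": [
      {
        "id": "log-invalid-message",
        "functionId": "System.log",
        "arguments": {
          "level": "WARN",
          "message": "Dropped invalid GPS payload on topic ${topic}"
        }
      },
      {
        "id": "drop-invalid-message",
        "functionId": "Mqtt.drop",
        "arguments": { "reasonString": "Invalid GPS payload" }
      }
    ]
  }
}
```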
A demo use case introduces a quality metric and visualizes it in a Grafana dashboard. In addition to the quality metric, a list of bad clients is queried from a PostgreSQL database. A screenshot of the dashboard shows the data quality on the left-hand side and a list of the top 10 bad clients on the right-hand side.
Maximize the business value of data being transported by defining data policies, transformations, and validation requirements to ensure the data is accurate and meets the standards your organization requires.
Provide faster business insights on validated data and stop acting on rogue data that generates more noise than insight.
Ensure data quality standards are centrally defined and enforced across all devices and messages.
Stop bad-acting devices from misusing MQTT connections, sending bad data, and monopolizing resources.
Reduce redundant processing and storage costs by only acting on good data. Reduce the impact of acting on bad data.
Quarantine and further investigate bad data to prevent it from contaminating your systems, ultimately ensuring your data is accurate and reliable.
Manage everything in a single system, allowing data to be processed faster and negating the need to manage another standalone system to ensure data quality.
Browse the informative HiveMQ Policy Cookbooks repository on GitHub for additional use cases and code samples.
Learn how to measure the quality of your data pipeline, including creating MQTT data schemas and policies.
Understand how Data Hub prevents bad actors from causing outages that bring down your infrastructure.