
Managing IoT Device State Within MQTT

by Magnus McCune

As IoT solutions scale, knowing the current state of client devices and services is not just a feature of these solutions but a necessity for efficiency and responsiveness. A common yet flawed approach to this recurring architectural challenge is to build an external system, some form of dedicated ‘device state service’, for storing and querying that information. While functional, this method introduces its own complexities and dependencies, making it less than ideal. Fortunately, the flexible and open nature of the MQTT protocol, together with fully compliant MQTT brokers like HiveMQ, provides a reliable and scalable way to implement device state discovery with no external services required.

Identifying the Antipattern in IoT Device Connectivity

Let’s first consider the commonly applied solution of a Device State Service, an external function that might retain the current state of devices in a persistent database and make available an API for querying that state. While initially this seems like a reasonable and functional solution to a common challenge when working with distributed IoT solutions, it should be considered an antipattern.

Antipatterns are frequently used solutions to common architectural challenges that appear to be effective but can lead to negative consequences.

The overhead of maintaining separate systems, higher risk of data inconsistency, and increased latency — not to mention introducing a new dependency — can impede the effectiveness of IoT solutions. 

This solution might work well during development and testing but fail to scale with production workloads.

Proposing an MQTT-centric Solution

Where possible, we look to identify MQTT-native techniques to solve these recurring architectural challenges and document proven patterns that operate at scale. Implementing device state by leveraging the existing capabilities of MQTT assures us that the solution will scale along with our broker. 

It’s worth noting that specifications built on top of MQTT, such as Eclipse’s Sparkplug specification for Industrial IoT, have considerations built into them for addressing device state (see Sparkplug Session State Management). As these specifications are a collection of design & architectural choices defined on top of MQTT, it is possible to pick elements that we need for other projects without having to adopt the complete specification. HiveMQ has an excellent whitepaper on building a specification on top of MQTT that highlights this approach.

A simple yet highly effective and scalable approach to implementing device state in MQTT combines a device-specific topic, retained messages, and the Last Will and Testament feature of MQTT. When a device connects to the broker, it includes a Will: a message with a payload indicating an "offline" status, flagged as retained, which the broker publishes to a device-specific topic in the event of an ungraceful disconnect.

After connecting, the device publishes a retained message to that same device-specific topic, with a payload that indicates an "online" status. The message indicating the "online" status is retained by the broker and pushed to any subscribers of that topic, including those that may subscribe in the future (Diagram 1). If the device were to disconnect ungracefully, the broker would then publish the Will message to the device-specific topic, superseding the previously retained message, and pushing it to any subscribers (Diagram 2).
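The two device-side steps above can be sketched in a few lines. This is a minimal sketch, not a definitive implementation: the helper names, the `cc/v1/{device_id}/state` topic layout, and the QoS level are illustrative assumptions, and `client` is assumed to be any object that exposes `will_set` and `publish` with the keyword arguments shown (the Eclipse Paho Python client does). A small recording stand-in is included so the sketch runs without a broker.

```python
# Illustrative topic layout; substitute your own hierarchy.
STATE_TOPIC_TEMPLATE = "cc/v1/{device_id}/state"


def state_topic(device_id: str) -> str:
    """Build the device-specific state topic."""
    return STATE_TOPIC_TEMPLATE.format(device_id=device_id)


def register_will(client, device_id: str) -> None:
    """Before connecting: ask the broker to publish a retained "offline" for us."""
    client.will_set(state_topic(device_id), payload="offline", qos=1, retain=True)


def announce_online(client, device_id: str) -> None:
    """After the connection is acknowledged: publish a retained "online"."""
    client.publish(state_topic(device_id), payload="online", qos=1, retain=True)


class RecordingClient:
    """Hypothetical stand-in that records calls instead of talking to a broker."""

    def __init__(self):
        self.calls = []

    def will_set(self, topic, payload=None, qos=0, retain=False):
        self.calls.append(("will_set", topic, payload, qos, retain))

    def publish(self, topic, payload=None, qos=0, retain=False):
        self.calls.append(("publish", topic, payload, qos, retain))


client = RecordingClient()
register_will(client, "uniqueVIN")
announce_online(client, "uniqueVIN")
```

With a real client library, the Will has to be registered before the connection is opened, because it travels inside the CONNECT packet. With Paho, for example, `register_will` would run before `connect()` and `announce_online` from the `on_connect` callback.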


A Scalable Solution to Seamless IoT Device State Management

Let’s take a more detailed look at how this solution works by investigating each component.

Device-Specific Topics & Wildcard Subscriptions

For a solution like this to work, and as a general best practice, it is recommended to design topic structures that include the device’s unique identifier at some level of the topic hierarchy. This enables many use cases and design patterns that are critical to building a scalable IoT solution. Whether for state management, as this blog post covers, or for granular authorization, device commands, or over-the-air firmware updates, having a way to uniquely identify a device from the topic hierarchy is valuable. Examples might include using the Vehicle Identification Number in the topic structure of a connected car solution (cc/v1/uniqueVIN/state) or the Edge Node Descriptor of Sparkplug in IIoT settings, which is a combination of the Group ID and the EoN Node ID (spBv1.0/groupID/NBIRTH/eonNodeID).

To increase the effectiveness of this technique, wildcard subscriptions can be used if a given service needs to be aware of the current state of all devices in the hierarchy. In our earlier connected car example, a subscriber to the topic cc/v1/+/state would have the state changes of ALL vehicles pushed to it. 
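To make the wildcard behavior concrete, here is a small sketch of how a topic filter such as cc/v1/+/state selects topics, and how a subscriber might recover the device identifier from a matching topic. The function names are hypothetical, and the matcher is deliberately simplified: it handles the single-level `+` and multi-level `#` wildcards but ignores special cases such as topics beginning with `$` and shared subscriptions.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # '#' is only valid as the last level; it matches the parent
            # level and any number of child levels.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False  # the filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level mismatch
    # Without '#', both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


def device_id_from_topic(topic: str, level_index: int = 2) -> str:
    """Extract the device identifier, assuming it sits at a fixed level."""
    return topic.split("/")[level_index]


all_vehicles = "cc/v1/+/state"
matched = topic_matches(all_vehicles, "cc/v1/uniqueVIN/state")
vehicle = device_id_from_topic("cc/v1/uniqueVIN/state")
```

In practice the broker performs this matching; the sketch is only meant to show why keeping the identifier at a fixed level makes both the subscription and the extraction trivial.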

MQTT Retained Messages

In order to ensure that the device’s state is available to new subscribers, even if they weren’t subscribed at the time of the change in state, we use retained messages in conjunction with our device-specific topics. The MQTT protocol only allows for one message to be retained per topic, which ensures that only the most recently updated state is held. Any clients that subscribe to that device’s topic, or a wildcard topic that contains it — even after the initial message has been published — will receive the retained message. The device itself is responsible for publishing its “online” status after it makes a successful connection while the “offline” status is handled by the broker, through the Will mechanism.
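The retained-message rules can be illustrated with a toy, in-memory model of the broker's retained table. The class and method names are hypothetical and this is not how any particular broker is implemented; it only models exact-topic lookups, with one retained payload per topic, non-retained publishes leaving the retained entry untouched, and an empty retained payload clearing it.

```python
from typing import Dict, Optional


class RetainedStore:
    """Toy model of retained messages: at most one payload per topic."""

    def __init__(self) -> None:
        self._retained: Dict[str, str] = {}

    def publish(self, topic: str, payload: str, retain: bool) -> None:
        if not retain:
            # Delivered to current subscribers only; the retained entry
            # for this topic is neither stored nor replaced.
            return
        if payload == "":
            # A retained message with an empty payload clears the entry.
            self._retained.pop(topic, None)
        else:
            # A newer retained message replaces the previous one.
            self._retained[topic] = payload

    def on_subscribe(self, topic: str) -> Optional[str]:
        """What a new subscriber to this exact topic receives immediately."""
        return self._retained.get(topic)


store = RetainedStore()
topic = "cc/v1/uniqueVIN/state"
store.publish(topic, "online", retain=True)
store.publish(topic, "offline", retain=True)  # supersedes "online"
late_view = store.on_subscribe(topic)  # a late subscriber sees only the latest state
```

The empty-payload rule is worth remembering when devices are decommissioned: publishing an empty retained message to the device's state topic removes its last known state from the broker.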

MQTT Will Message

To update the status of a device that has been disconnected, perhaps due to network failure, broker action, or other ungraceful disconnection, we rely on the Will mechanism of MQTT, sometimes also known as Last Will and Testament (LWT). As part of the initial CONNECT packet, a device can set the Will flag and supply the topic for the Will message, its QoS level, whether it should be retained, and the payload. For this pattern, the Will must be flagged as retained. In the event of an ungraceful disconnection, the broker must publish the Will message to the defined topic, which is then pushed to all subscribers. The published and retained Will message, with its payload indicating the “offline” status of the device, then supersedes any previous state. If the device reconnects, it publishes its “online” status, which in turn supersedes the Will message.
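The broker-side rule described above can be modeled in a few lines. This is a simplified sketch with hypothetical names (`Will`, `end_session`), using a plain dictionary as the retained table. It follows the basic rule that a normal DISCONNECT discards the Will; MQTT 5 additionally lets a client request Will publication on DISCONNECT and delay it with a Will Delay Interval, neither of which is modeled here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Will:
    topic: str
    payload: str
    retain: bool = True


def end_session(retained: Dict[str, str], will: Optional[Will], graceful: bool) -> None:
    """Publish the Will only when the connection ends ungracefully."""
    if will is None or graceful:
        return  # a normal DISCONNECT discards the Will
    if will.retain:
        retained[will.topic] = will.payload  # supersedes the earlier state


topic = "cc/v1/uniqueVIN/state"
will = Will(topic=topic, payload="offline")

retained = {topic: "online"}  # the device announced itself after connecting
end_session(retained, will, graceful=False)  # e.g. the network dropped
after_crash = retained[topic]

retained[topic] = "online"  # the device reconnects and re-announces
end_session(retained, will, graceful=True)  # clean DISCONNECT
after_clean_exit = retained[topic]
```

Note what the second case shows: after a clean DISCONNECT the Will is discarded, so the retained state stays "online" unless the device itself publishes an "offline" status first.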

Further Enhancements & Advanced Use Cases

Thus far, examples have used a simple “online” or “offline” status as the payload for the state messages. However, these could be enhanced to implement additional functionality. A mechanism to include a timestamp with the Will payload could provide some useful context. Similarly, adding a reason code for the ungraceful disconnection could help in troubleshooting. Adding a mechanism for the device to publish a graceful disconnect message prior to sending its DISCONNECT packet would distinguish between planned and unplanned disconnections. Each of these would require some client-side code or a broker-side extension to implement, as we have done with the HiveMQ Sparkplug Aware extension.

An enhanced "offline" status message might look like this:

{
  "status": "offline",
  "reason": "client DISCONNECT",
  "timestamp": 1704388031
}
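A payload like this is straightforward to build and to consume. The sketch below uses hypothetical helper names and only the Python standard library. One caveat to keep in mind: a Will payload is fixed when the device connects, so a timestamp the client places in its Will records the connection time, not the moment of disconnection. An accurate disconnect time in the Will would need broker-side support.

```python
import json
import time
from typing import Optional


def build_status(status: str, reason: Optional[str] = None,
                 timestamp: Optional[int] = None) -> str:
    """Serialize a status message; the timestamp defaults to the current Unix time."""
    body = {
        "status": status,
        "timestamp": int(time.time()) if timestamp is None else timestamp,
    }
    if reason is not None:
        body["reason"] = reason
    return json.dumps(body)


def parse_status(payload: str) -> dict:
    """Accept both the plain "online"/"offline" strings and the JSON form."""
    try:
        decoded = json.loads(payload)
    except json.JSONDecodeError:
        return {"status": payload}  # plain-string payload
    return decoded if isinstance(decoded, dict) else {"status": payload}


graceful = build_status("offline", reason="client DISCONNECT", timestamp=1704388031)
parsed = parse_status(graceful)
legacy = parse_status("online")
```

A device that wants to signal a planned shutdown would publish the `graceful` payload as a retained message to its state topic and only then send DISCONNECT. Accepting both payload forms in `parse_status` lets subscribers keep working while devices are migrated from plain strings to JSON.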

Similarly, the “online” status payload could include additional information about the device, such as a firmware version, model number, or even a payload schema. The Sparkplug B specification is a fantastic example of these features being implemented in a robust and scalable manner. You can read more about how they are implemented with Sparkplug here:

HiveMQ Sparkplug Essentials - Session State Management  

HiveMQ Sparkplug Essentials - Payload Structures  

HiveMQ Sparkplug Essentials - Operational Behavior  

Another advanced use case might be a circumstance where multiple HiveMQ clusters are in use and the state of devices must be replicated across broker clusters. With the MQTT-native solution described here, this replication is possible without additional complication, thanks to offerings like the HiveMQ Enterprise Bridge Extension. Publishing an identifier for the connected cluster as part of the “online” status payload could further enhance this use case.

Conclusion

Adopting an MQTT-centric approach for managing device states offers a more integrated, near-real-time, and scalable solution compared to external systems. This strategy not only simplifies the architecture but also enhances the reliability and responsiveness of IoT systems. We encourage IoT developers and architects to explore this approach in their solutions, and we welcome any feedback or questions on this topic.

Magnus McCune

Magnus is a Principal Architect at HiveMQ. He is a passionate technologist with a proven background solving complex business and technical challenges through the design, implementation and operationalization of cloud and edge technologies. His expertise extends to network, cloud, & infrastructure architecture, cloud-native solutions design and large-scale automation projects.

