As the scale of IoT solutions grows, understanding the current state of client devices & services is not just a feature of these solutions but a necessity for creating efficiency and responsiveness. A common yet flawed approach to this recurring architectural challenge is to create external systems, some form of dedicated ‘device state service’, for storing and querying that information. While functional, this method introduces its own complexities and dependencies, making it less ideal. Fortunately, the flexible and open nature of the MQTT protocol and fully-compliant MQTT brokers like HiveMQ provides a reliable and scalable way to implement device state discovery — no external services required.

Identifying the Antipattern in IoT Device Connectivity

Let’s first consider the commonly applied solution of a Device State Service, an external function that might retain the current state of devices in a persistent database and make available an API for querying that state. While initially this seems like a reasonable and functional solution to a common challenge when working with distributed IoT solutions, it should be considered an antipattern.

Antipatterns are frequently used solutions to common architectural challenges that appear to be effective but can lead to negative consequences.

The overhead of maintaining separate systems, higher risk of data inconsistency, and increased latency — not to mention introducing a new dependency — can impede the effectiveness of IoT solutions.

This solution might work well during development and testing but fail to scale with production workloads.

Proposing an MQTT-centric Solution

Where possible, we look to identify MQTT-native techniques to solve these recurring architectural challenges and document proven patterns that operate at scale. Implementing device state by leveraging the existing capabilities of MQTT assures us that the solution will scale along with our broker.

It’s worth noting that specifications built on top of MQTT, such as Eclipse’s Sparkplug specification for Industrial IoT, have considerations built into them for addressing device state (see Sparkplug Session State Management). As these specifications are a collection of design & architectural choices defined on top of MQTT, it is possible to pick elements that we need for other projects without having to adopt the complete specification. HiveMQ has an excellent whitepaper on building a specification on top of MQTT that highlights this approach.

A simple yet highly effective and scalable approach to implementing device state in MQTT combines a device-specific topic, retained messages, and the Last Will and Testament feature. When a device connects to the broker, it includes a Will in its CONNECT message: a message addressed to a device-specific topic, flagged as retained, with a payload indicating an "offline" status. The broker publishes this Will on the device’s behalf in the event of an ungraceful disconnect.

After connecting, the device publishes a retained message to that same device-specific topic, with a payload that indicates an "online" status. The message indicating the "online" status is retained by the broker and pushed to any subscribers of that topic, including those that may subscribe in the future (Diagram 1). If the device were to disconnect ungracefully, the broker would then publish the Will message to the device-specific topic, superseding the previously retained message, and pushing it to any subscribers (Diagram 2).
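The sequence described above can be sketched with a small in-memory model. This is not a real broker or client library: `ToyBroker`, `Will`, and the client ID `VIN123` are hypothetical names used only to make the order of events concrete, and the model tracks nothing except retained state and registered Wills.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Will:
    topic: str
    payload: str
    retain: bool = True


class ToyBroker:
    """In-memory stand-in for a broker; models only retained state and Wills."""

    def __init__(self) -> None:
        self.retained: Dict[str, str] = {}
        self.wills: Dict[str, Will] = {}

    def connect(self, client_id: str, will: Optional[Will] = None) -> None:
        # The Will is registered at connect time, before any status is published.
        if will is not None:
            self.wills[client_id] = will

    def publish(self, topic: str, payload: str, retain: bool = False) -> None:
        if retain:
            self.retained[topic] = payload  # replaces any earlier retained message

    def disconnect(self, client_id: str, graceful: bool) -> None:
        will = self.wills.pop(client_id, None)
        # A normal DISCONNECT discards the Will; any other loss of connection triggers it.
        if will is not None and not graceful:
            self.publish(will.topic, will.payload, will.retain)


broker = ToyBroker()
topic = "cc/v1/VIN123/state"
broker.connect("VIN123", Will(topic, "offline"))
broker.publish(topic, "online", retain=True)
broker.disconnect("VIN123", graceful=False)  # e.g. the network drops
print(broker.retained[topic])  # offline
```

If the same client had disconnected gracefully, the retained "online" message would remain in place, which is why a device that shuts down cleanly should publish its own "offline" status first.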

Managing IoT Device State Within MQTT

A Scalable Solution to Seamless IoT Device State Management

Let’s take a more detailed look at how this solution works by investigating each component.

Device-Specific Topics & Wildcard Subscriptions

For a solution like this to work, and as a general best practice, it is recommended to design topic structures that include the device’s unique identifier at some level of the topic hierarchy. This enables many use cases and design patterns that are critical to building a scalable IoT solution. Whether for state management, as this blog post covers, or for granular authorization, device commands, or Over-The-Air firmware updates, having a way to uniquely identify a device from the topic hierarchy is valuable. Examples might include using the Vehicle Identification Number in the topic structure of a connected car solution (cc/v1/uniqueVIN/state) or the Edge Node Descriptor of Sparkplug in IIoT settings, which is a combination of the Group ID and the EoN Node ID (spBv1.0/groupID/NBIRTH/eonNodeID).

To increase the effectiveness of this technique, wildcard subscriptions can be used if a given service needs to be aware of the current state of all devices in the hierarchy. In our earlier connected car example, a subscriber to the topic cc/v1/+/state would have the state changes of ALL vehicles pushed to it.
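To make the wildcard behavior concrete, here is a minimal sketch of how a topic filter such as cc/v1/+/state is matched against concrete topics, and how a subscriber could recover the device identifier from a matching topic. The function names are hypothetical, the identifier position assumes the cc/v1/uniqueVIN/state layout used above, and the sketch ignores the special handling of topics that begin with `$`.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: matches the parent level and everything below it.
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            # '+' matches exactly one level; anything else must match literally.
            return False
    return len(filter_levels) == len(topic_levels)


def device_id_from_state_topic(topic: str) -> str:
    """Extract the identifier from a topic shaped like cc/v1/<id>/state."""
    return topic.split("/")[2]


print(topic_matches("cc/v1/+/state", "cc/v1/VIN123/state"))      # True
print(topic_matches("cc/v1/+/state", "cc/v1/VIN123/telemetry"))  # False
print(device_id_from_state_topic("cc/v1/VIN123/state"))          # VIN123
```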

MQTT Retained Messages

To ensure that the device’s state is available to new subscribers, even if they weren’t subscribed at the time of the change in state, we use retained messages in conjunction with our device-specific topics. The MQTT protocol allows only one message to be retained per topic, which ensures that only the most recently updated state is held. Any client that subscribes to that device’s topic, or to a wildcard topic filter that matches it, will receive the retained message, even if it subscribes long after the message was published. The device itself is responsible for publishing its “online” status after it makes a successful connection, while the “offline” status is handled by the broker through the Will mechanism.
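The one-retained-message-per-topic rule and its effect on late subscribers can be illustrated with a toy store. This is a simplified model under stated assumptions, not broker code: `ToyRetainedStore` and `matches` are hypothetical names, only the single-level `+` wildcard is handled, and delivery to already-connected subscribers is left out.

```python
from typing import Dict, List, Tuple


def matches(topic_filter: str, topic: str) -> bool:
    """Match a filter against a topic; only the single-level '+' wildcard is handled."""
    f_levels, t_levels = topic_filter.split("/"), topic.split("/")
    return len(f_levels) == len(t_levels) and all(
        f in ("+", t) for f, t in zip(f_levels, t_levels)
    )


class ToyRetainedStore:
    def __init__(self) -> None:
        self._retained: Dict[str, str] = {}

    def publish(self, topic: str, payload: str, retain: bool = False) -> None:
        if not retain:
            return  # non-retained messages reach current subscribers only
        if payload == "":
            # In MQTT, a retained message with an empty payload clears the topic.
            self._retained.pop(topic, None)
        else:
            self._retained[topic] = payload  # one retained message per topic

    def subscribe(self, topic_filter: str) -> List[Tuple[str, str]]:
        """Return the retained messages a new subscriber would receive."""
        return sorted(
            (t, p) for t, p in self._retained.items() if matches(topic_filter, t)
        )


store = ToyRetainedStore()
store.publish("cc/v1/VIN1/state", "online", retain=True)
store.publish("cc/v1/VIN1/state", "offline", retain=True)  # replaces "online"
store.publish("cc/v1/VIN2/state", "online", retain=True)
print(store.subscribe("cc/v1/+/state"))
# [('cc/v1/VIN1/state', 'offline'), ('cc/v1/VIN2/state', 'online')]
```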

MQTT Will Message

To update the status of a device that has been disconnected, perhaps due to network failure, broker action, or other ungraceful disconnection, we rely on the Will mechanism of MQTT, also known as Last Will and Testament (LWT). As part of the initial CONNECT message, a device can include the Will flag, the topic for the Will message, its QoS level, whether it should be retained, and the payload. In the event of an ungraceful disconnection, the broker must publish the Will message to the defined topic, which is then pushed to all subscribers. Because the Will is flagged as retained, its payload indicating the “offline” status of the device then supersedes any previous state. If the device reconnects, it publishes its “online” status, which in turn supersedes the Will message.
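For readers curious how the Will settings travel in the CONNECT message, the sketch below computes the Connect Flags byte. The bit positions follow the MQTT specification; the function name and its parameters are hypothetical, and in practice a client library sets these bits for you when you configure a Will.

```python
def connect_flags(
    *,
    clean_start: bool = True,
    will: bool = False,
    will_qos: int = 0,
    will_retain: bool = False,
    username: bool = False,
    password: bool = False,
) -> int:
    """Build the Connect Flags byte of an MQTT CONNECT packet."""
    if will_qos not in (0, 1, 2):
        raise ValueError("Will QoS must be 0, 1, or 2")
    if not will and (will_qos != 0 or will_retain):
        raise ValueError("Will QoS and Will Retain must be 0 when the Will Flag is 0")
    flags = 0
    if clean_start:
        flags |= 0x02          # bit 1: Clean Start (Clean Session in MQTT 3.1.1)
    if will:
        flags |= 0x04          # bit 2: Will Flag
        flags |= will_qos << 3  # bits 3-4: Will QoS
        if will_retain:
            flags |= 0x20      # bit 5: Will Retain
    if password:
        flags |= 0x40          # bit 6: Password Flag
    if username:
        flags |= 0x80          # bit 7: User Name Flag
    return flags


# A retained Will at QoS 1, as the state pattern requires:
print(hex(connect_flags(will=True, will_qos=1, will_retain=True)))  # 0x2e
```

The Will Retain bit is the detail that matters most for this pattern: without it, the "offline" message reaches current subscribers but does not replace the retained "online" state.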

Further Enhancements & Advanced Use Cases

Thus far, examples have used a simplistic “online” or “offline” status as the payload for the state messages. However, these could be enhanced to implement additional functionality. A mechanism to include a timestamp with the Will payload could provide some useful context. Similarly, adding a reason code for the ungraceful disconnection could help in troubleshooting. Adding a mechanism for the device to publish a graceful disconnect message prior to sending its DISCONNECT notification would distinguish between planned and unplanned disconnections. Each of these would require some client-side code or a broker-side extension to implement, as we have done with the HiveMQ Sparkplug Aware extension.

An enhanced "offline" status message might look like this:

{
  "status": "offline",
  "reason": "client DISCONNECT",
  "timestamp": 1704388031
}
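A payload of this shape could be produced on the client with a small helper. The sketch below is one possible approach: the function name `build_status_payload` is hypothetical, and the field names simply mirror the example above.

```python
import json
import time
from typing import Optional


def build_status_payload(
    status: str,
    reason: Optional[str] = None,
    timestamp: Optional[int] = None,
) -> str:
    """Serialize a device state message as JSON."""
    if status not in ("online", "offline"):
        raise ValueError("status must be 'online' or 'offline'")
    payload = {"status": status}
    if reason is not None:
        payload["reason"] = reason
    # Default to the current Unix time in seconds when no timestamp is given.
    payload["timestamp"] = int(time.time()) if timestamp is None else timestamp
    return json.dumps(payload)


print(build_status_payload("offline", "client DISCONNECT", 1704388031))
# {"status": "offline", "reason": "client DISCONNECT", "timestamp": 1704388031}
```

A helper like this suits the graceful case, where the device publishes its own "offline" message just before disconnecting. A Will payload is fixed when the client connects, so a timestamp built this way for a Will would record the connection time rather than the moment of disconnection.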

Similarly, the “online” status payload could include additional information about the device, such as a firmware version, model number, or even a payload schema. The Sparkplug B specification is a fantastic example of these features being implemented in a robust and scalable manner. You can read more about how they are implemented with Sparkplug here:

HiveMQ Sparkplug Essentials - Session State Management

HiveMQ Sparkplug Essentials - Payload Structures

HiveMQ Sparkplug Essentials - Operational Behavior

Another advanced use case arises when multiple HiveMQ clusters are in use and the state of devices must be replicated across them. With the MQTT-native solution described here, this replication is possible without additional complication, thanks to offerings like the HiveMQ Enterprise Bridge Extension. Publishing an identifier for the connected cluster as part of the “online” status payload could further enhance this use case.

Conclusion

Adopting an MQTT-centric approach for managing device states offers a more integrated, near-real-time, and scalable solution compared to external systems. This strategy not only simplifies the architecture but also enhances the reliability and responsiveness of IoT systems. We encourage IoT developers and architects to explore this approach in their solutions, and we welcome any feedback or questions on this topic.

Magnus McCune

Magnus is a Principal Architect at HiveMQ. He is a passionate technologist with a proven background solving complex business and technical challenges through the design, implementation and operationalization of cloud and edge technologies. His expertise extends to network, cloud, & infrastructure architecture, cloud-native solutions design and large-scale automation projects.
