MQTT Topic Tree & Topic Matching: Challenges and Best Practices Explained

by Lukas Brandl

As both the size and complexity of IoT projects continue to grow, we talk to many IT architects who are working through the technical challenges of designing a data foundation built for scale. In a large MQTT deployment, there may be thousands or even millions of clients subscribing to different topics.

This article explains why finding the matching subscriptions among millions of subscribers is a challenge and how an MQTT broker can overcome it.

What is Topic Matching in MQTT?

MQTT is a publish/subscribe protocol in which devices act as MQTT clients and exchange messages through an MQTT broker. MQTT clients send their data in PUBLISH control packets addressed to a specific topic. The topic is separate from the packet’s payload, so the MQTT broker can route the message without inspecting the payload. The MQTT broker delivers the published message to every client subscribed with a matching topic filter.

The main distinction between a topic and a topic filter is that a topic is used for publishing and cannot contain wildcard characters, whereas a topic filter is used for subscribing and can. Wildcard characters aggregate multiple streams of data into one and are therefore used on the subscriber’s side. A topic filter without wildcard characters matches at most one topic; this case is often referred to as an exact subscription. For more information, refer to our article, MQTT Topics and Wildcards.

In a nutshell, a topic filter can be thought of as a selector for the topics that PUBLISH packets are sent to. The MQTT broker must be able to find the matching subscriptions for each published message.
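
To make the distinction concrete, the following is a minimal Python sketch of how a single topic filter could be checked against a topic. The function name topic_matches is hypothetical and not part of any MQTT library. The sketch assumes a well-formed filter and ignores special cases such as topics starting with $.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if the topic filter matches the concrete topic (simplified)."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: matches the parent level and everything
            # below it, but is only valid as the last level of the filter.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False  # the filter has more levels than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # a literal level that differs from the topic
    # No '#' was seen, so both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


print(topic_matches("town/+/kitchen", "town/house/kitchen"))  # True
print(topic_matches("town/#", "town/house/kitchen"))          # True
print(topic_matches("town/house", "town/house/kitchen"))      # False
```

A filter without wildcards, such as town/house/kitchen, only ever returns True for the identical topic, which is the exact subscription case described above.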

Topic filter as a selector for topics that the PUBLISH packets are sent to in an MQTT broker

Subscriptions can contain wildcard characters to match a broad range of topics. Subscriptions with wildcards are often used when there is uncertainty about the topics that publishing clients will use. For example, when the publishing clients include their ID as one topic level, it may be impossible to reliably receive messages for all such topics without using wildcard characters. While this is useful for the clients, finding all the matching subscriptions presents a technical challenge. In some real-life scenarios, MQTT brokers check millions of subscriptions for every published message.

MQTT Wildcard Topic Matching Challenge Explained

Since there are many use cases for wildcards, let’s examine the technical challenge of supporting wildcard subscriptions. First, looking at every subscription for every published message is not scalable: the number of steps needed to find the matching subscriptions increases linearly with the number of subscriptions. Alternatively, the broker could keep a map from topic filters to their subscriptions and look up every filter that could match the topic of a published message. This method is also impractical because the number of potentially matching topic filters grows quickly with the number of topic levels. For instance, if a message is published to the topic “town/house/kitchen”, subscriptions with any of the following topic filters, among others, would match:

  • #

  • town/#

  • town/house/#

  • +/house/kitchen

  • town/+/kitchen

  • town/house/+

  • +/+/kitchen

  • town/+/+

  • +/+/+

  • town/house/kitchen

With this approach, the MQTT broker would have to look up every one of these topic filters in the map for each published message. In production workloads, the MQTT broker has to find the matching subscriptions for published messages thousands of times per second, so it needs a specialized data structure for a fast lookup.
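
To get a feel for how many lookups the map-based approach needs, the following Python sketch enumerates every topic filter that could match a given topic under the standard wildcard rules (+ stands for exactly one level, # for all remaining levels including the parent level). The helper name candidate_filters is hypothetical, and the enumeration ignores special cases such as topics starting with $.

```python
from itertools import product


def candidate_filters(topic: str) -> list[str]:
    """Enumerate all topic filters that would match the given topic (simplified)."""
    levels = topic.split("/")
    candidates = []

    # Filters ending in '#': any prefix of the topic (from empty to complete),
    # with each prefix level kept literal or replaced by '+'.
    for prefix_len in range(len(levels) + 1):
        options = [(level, "+") for level in levels[:prefix_len]]
        for choice in product(*options):
            candidates.append("/".join(choice + ("#",)))

    # Filters without '#': same number of levels as the topic,
    # with each level kept literal or replaced by '+'.
    for choice in product(*[(level, "+") for level in levels]):
        candidates.append("/".join(choice))

    return candidates


print(len(candidate_filters("town/house/kitchen")))  # 23
print(len(candidate_filters("a/b/c/d/e/f")))         # 191
```

For a topic with n levels, this enumeration yields 3 · 2^n − 1 candidates, so the number of map lookups per published message roughly doubles with every additional topic level. The ten filters listed above are among the 23 candidates for “town/house/kitchen”.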

What is a Topic Tree in MQTT?

The Topic Tree is a data structure used to solve the wildcard topic matching problem described above. The topic of the published message is used to collect the matching subscriptions present in the topic tree. We start at the root of the topic tree and proceed through its levels, using the topic segments to select the next node. At each node, subscriptions ending in the multi-level wildcard # are added to the matching subscriptions, and the branch for the single-level wildcard + is followed alongside the branch for the literal topic segment. Once there are no more segments left in the topic, the subscriptions stored at the current node match as well. If that node was reached without following a wildcard, these are the exact subscriptions for the topic. An exact subscription matches only messages published to exactly that topic.

MQTT Topic Tree Structure

Once it has found all matching subscriptions, the broker can deliver the published message to the subscribed clients. Storing the subscriptions this way also reduces memory usage because topic levels shared across multiple subscriptions are only stored once.
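
The traversal described in this section can be sketched in Python as a simple trie keyed by topic level. This is a minimal illustration under stated assumptions, not how any particular MQTT broker implements its topic tree: the class and method names (TopicTreeNode, TopicTree, subscribe, match) are hypothetical, filters are assumed to be well formed, and concerns such as unsubscribing, concurrency, and topics starting with $ are left out.

```python
class TopicTreeNode:
    def __init__(self):
        self.children = {}        # topic level (or '+' / '#') -> child node
        self.subscribers = set()  # clients whose topic filter ends at this node


class TopicTree:
    def __init__(self):
        self.root = TopicTreeNode()

    def subscribe(self, client_id: str, topic_filter: str) -> None:
        # Walk down the tree, creating nodes for levels not seen before.
        # Shared prefixes such as 'town/house' are stored only once.
        node = self.root
        for level in topic_filter.split("/"):
            node = node.children.setdefault(level, TopicTreeNode())
        node.subscribers.add(client_id)

    def match(self, topic: str) -> set:
        matches = set()
        self._collect(self.root, topic.split("/"), 0, matches)
        return matches

    def _collect(self, node, levels, index, matches):
        # A '#' child matches all remaining levels, including none at all.
        hash_child = node.children.get("#")
        if hash_child is not None:
            matches |= hash_child.subscribers
        if index == len(levels):
            # No segments left: filters ending exactly here match.
            matches |= node.subscribers
            return
        # Follow the literal segment and the single-level wildcard branch.
        for key in (levels[index], "+"):
            child = node.children.get(key)
            if child is not None:
                self._collect(child, levels, index + 1, matches)


tree = TopicTree()
tree.subscribe("c1", "#")
tree.subscribe("c2", "town/+/kitchen")
tree.subscribe("c3", "town/house/kitchen")
tree.subscribe("c4", "town/house/bathroom")
print(sorted(tree.match("town/house/kitchen")))  # ['c1', 'c2', 'c3']
```

In this sketch, a lookup only visits nodes that lie on a path compatible with the published topic, so its cost depends on the number of topic levels and the wildcard branches actually present rather than on the total number of subscriptions.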

Best Practices for MQTT Topic Tree & Topic Matching

A few topic design considerations help your application filter messages as intended, regardless of how the MQTT broker implements topic matching.

It is good practice to avoid topic levels that do not add information, such as a topic level that is identical across all subscriptions. The most common example is using the company name as the first level for every subscription. While some topic levels typically have less variety than others, you should omit topic levels that are the same for every topic. Similarly, a leading forward slash creates an additional, empty topic level that every subscription must then match, so avoid it unless you want that level present on all topics.

Bad practice (redundant first level):

  • home/livingarea/kitchen

  • home/livingarea/bathroom

  • home/garage

Bad practice (leading forward slash):

  • /livingarea/kitchen

  • /livingarea/bathroom

  • /garage

Good practice:

  • livingarea/kitchen

  • livingarea/bathroom

  • garage
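
The effect of these choices on the number of topic levels can be illustrated with a short Python sketch. The helper name topic_levels is hypothetical; it simply splits a topic at the forward slashes, which is how topic levels are separated.

```python
def topic_levels(topic: str) -> list[str]:
    """Split a topic into its levels at each forward slash."""
    return topic.split("/")


for topic in ("home/livingarea/kitchen", "/livingarea/kitchen", "livingarea/kitchen"):
    levels = topic_levels(topic)
    print(f"{topic!r}: {len(levels)} levels {levels}")
```

The first two topics each have three levels, with the leading slash producing an empty first level, while the third has only two. Every extra level is one more segment to match for each published message.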

Conclusion

In conclusion, topic matching is central to MQTT’s publish/subscribe model: it lets MQTT clients exchange messages through the MQTT broker with minimal effort. Topic filters select the topics to which PUBLISH packets are sent, and subscriptions with wildcard characters enable broad topic matching. Finding all matching subscriptions presents a technical challenge, but it can be solved using a specialized data structure called the Topic Tree. It is essential to follow best practices when designing topics so that they work well regardless of how a particular MQTT broker implements topic matching.

Lukas Brandl

Lukas Brandl is a Senior Software Engineer at HiveMQ. He has been with HiveMQ for more than a decade and plays a key role in product development and product engineering.
