Optimizing Data Cost Efficiency in MQTT-Based IoT and Connected Systems
In today’s data-driven world, optimizing data cost efficiency is essential for businesses leveraging IoT and connected systems. As MQTT-based architectures expand, streamlining data transmission while maintaining system integrity becomes a critical advantage. CTOs, Solution Architects, and IoT professionals must now balance maximizing data potential with controlling operational costs.
This blog post highlights innovative strategies to enhance data cost efficiency in MQTT deployments. By adopting these approaches, you can significantly cut expenses and unlock new opportunities for scaling and optimizing performance.
Understanding the Challenge
For IoT architects, industrial automation engineers, solutions architects, and data engineers, the challenge of data cost efficiency in MQTT-based systems is multifaceted and critical to address. While MQTT’s lightweight protocol is designed for efficient communication, every packet exchanged, whether PUBLISH, SUBSCRIBE, or any other control packet, contributes to data usage and costs. As systems scale, these costs can escalate, impacting both operational expenses and system performance. By implementing targeted strategies, you can fully leverage MQTT’s benefits while optimizing data cost efficiency.
Strategies for Data Cost Efficiency Optimization
Reduce the Number of Messages
In MQTT, message flow relies on packet exchange. Therefore, reducing the number of packets exchanged is a key strategy for optimizing data cost efficiency. Achieving this requires a clear understanding of the specific requirements and limitations of your use case. In most scenarios, the constraints stem from limitations on the end devices. If you control the end devices, their libraries, and their software, you have more flexibility to optimize message flow than when this control is limited.
Avoid Sending Overhead Data
When scaling up, it’s crucial to eliminate unnecessary data. Otherwise, as you scale, you multiply the amount of irrelevant data, leading to inefficiencies and, in some cases, direct costs such as higher cellular data plan charges. Where to start? There are several ways to reduce overhead. One effective approach is upgrading from MQTT 3 to MQTT 5, which adds features such as Topic Alias and User Properties. Next, assess what information needs to be included in the MQTT header. Consider whether certain data can be sent using User Properties instead of the Payload, or vice versa, to streamline message structure.
Shrink the Size of the Payload
One of the most common approaches to optimizing data efficiency is to reduce the amount of information in the payload, or to encode it in a compact binary serialization format such as Protobuf to minimize overall message size.
As advocates of MQTT, we encourage exploring ways to enhance its efficiency, especially in handling data-intensive applications. While Protobuf offers compact encoding, MQTT is inherently a lightweight protocol, designed for efficient data transmission even in constrained environments.
To leverage both, you can use HiveMQ Data Hub, which enables efficient handling of Protobuf-encoded MQTT messages by providing schema definition and validation, data transformation, and policy enforcement. It allows for the deserialization and manipulation of Protobuf data, facilitating interoperability and ensuring data integrity within IoT applications.
By combining the compact data encoding of Protobuf with the lightweight and reliable nature of MQTT through HiveMQ Data Hub, you can achieve optimal message size, improve data efficiency, and streamline data management within complex IoT systems.
Know More About MQTT
At HiveMQ, we strongly advocate for MQTT because of its proven advantages over other protocols. Its lightweight nature, efficiency in low-bandwidth environments, and reliability in delivering critical data make it a powerful choice for IoT applications. By deepening your understanding of MQTT, you can unlock its full potential and leverage it more effectively in your projects.
Staying updated on MQTT best practices, specifications, and emerging use cases empowers you to make informed decisions that optimize your system’s performance. A solid foundation in MQTT not only equips you to troubleshoot and refine existing implementations but also enables you to innovate with confidence as your IoT ecosystem grows.
Pro Tip: Leverage our HiveMQ University and MQTT Essentials to take your expertise to the next level.
Use HiveMQ to its Full Potential
HiveMQ is far more than just an MQTT broker. If you’re only using HiveMQ for basic message brokering, you’re missing out on its full range of capabilities.
For example, I recently spoke with a customer who needed to monitor which clients were connected. They initially solved this by implementing a “ping-pong” system, where backend systems would regularly exchange messages to verify client connections. While technically feasible, this solution introduced a significant number of unnecessary transactions.
By consulting with us, they discovered a simpler approach using the HiveMQ REST API, which provides real-time information about connected clients without excessive messaging overhead. With this change, they eliminated over 21 million unnecessary messages per day for their 15,000 devices, saving approximately 20.5 GB of data daily.
Build Business Rules
Every use case and business has a unique set of rules and requirements. Building efficient business rules ties together several of the concepts above, but it deserves special emphasis. It’s essential to optimize these rules by streamlining functions, procedures, software, and hardware, while also minimizing single points of failure.
Deeper Understanding of Your Use Case
This is a topic we frequently emphasize across various examples, and this context is no different. While there are common traits among use cases, each one also demands unique approaches tailored to its specific requirements.
For instance, in a smart city application, latency and real-time data processing are often critical, requiring efficient message routing and low-latency protocols. On the other hand, in industrial IoT scenarios, reliability and fault tolerance are paramount, which might involve robust failover mechanisms and enhanced security protocols. Similarly, healthcare IoT systems often prioritize data integrity and regulatory compliance, necessitating precise data handling and encryption measures.
Understanding the unique needs of each use case helps ensure that the solution is optimized to deliver the desired outcomes efficiently.
Applying Data Cost Efficiency Strategies in Practice
Now, let’s dive deeper and explore how each of these strategies can be effectively applied.
Convert Protobuf to JSON
Protobuf messages are often 2 to 10 times smaller than JSON because Protobuf uses compact binary encoding, making it ideal for MQTT-based systems where data efficiency is crucial.
A common strategy is to use Protobuf for communication with client devices and JSON for backend interactions. For example, Protobuf is well-suited for devices with cellular connections due to its smaller size, while JSON improves readability and processing on the IoT platform.
HiveMQ Data Hub supports this approach by enabling seamless validation and conversion between Protobuf and JSON, ensuring data integrity and compatibility within MQTT environments. Read our documentation for in-depth information about HiveMQ Data Hub.
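To get a feel for the size difference between a text encoding and a binary one, here is a minimal sketch using only the Python standard library. It does not use Protobuf itself: a fixed `struct` layout stands in for a schema-based binary format, and the field names and values are hypothetical. Real Protobuf output will differ in size because it adds field tags and uses variable-length integers.

```python
import json
import struct

# Hypothetical telemetry reading; field names and values are illustrative only.
reading = {"device_id": 32543, "timestamp": 1700000000, "temperature_c": 21.5}

# Text encoding: compact JSON with no extra whitespace.
json_bytes = json.dumps(reading, separators=(",", ":")).encode("utf-8")

# Binary stand-in for a schema-based format (NOT actual Protobuf):
# unsigned 32-bit id, unsigned 32-bit timestamp, 32-bit float, little-endian.
LAYOUT = "<IIf"
binary_bytes = struct.pack(
    LAYOUT, reading["device_id"], reading["timestamp"], reading["temperature_c"]
)

# The receiver needs the same layout (the "schema") to decode the bytes.
decoded = struct.unpack(LAYOUT, binary_bytes)

print(f"JSON: {len(json_bytes)} bytes, binary: {len(binary_bytes)} bytes")
```

The trade-off is the one described above: the binary form is smaller but unreadable without the schema, which is why converting to JSON on the backend side can be convenient.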
Discard Messages That are Invalid or Not Adding Value
Only high-quality, valuable data should be transmitted throughout your system. With HiveMQ Data Hub, you can ensure this by validating incoming packets against specified schemas and predefined values. Any messages that are invalid or fail to meet the criteria can be filtered out and discarded before they propagate downstream.
This approach is especially beneficial in fan-out scenarios, where a single packet is often republished to multiple topics. By discarding irrelevant messages early, you reduce unnecessary data transmission, lower costs, and maintain the integrity of your data flow.
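The idea of validating before forwarding can be sketched in plain Python. This is not the HiveMQ Data Hub API or its policy format. The rule set, field names, and ranges below are hypothetical and only illustrate the principle of dropping a message before it fans out.

```python
import json

# Hypothetical rules: each required field with its allowed numeric range.
RULES = {
    "temperature_c": (-40.0, 125.0),
    "humidity_pct": (0.0, 100.0),
}


def is_valid(payload: bytes) -> bool:
    """Return True only if the payload is a JSON object with all fields in range."""
    try:
        data = json.loads(payload)
    except ValueError:  # covers malformed JSON and invalid UTF-8
        return False
    if not isinstance(data, dict):
        return False
    for field, (low, high) in RULES.items():
        value = data.get(field)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        if not low <= value <= high:
            return False
    return True


def forward_valid(messages: list[bytes]) -> list[bytes]:
    """Keep only the messages worth sending downstream."""
    return [m for m in messages if is_valid(m)]
```

In a fan-out scenario, every message rejected here saves one delivery per downstream subscriber, not just one packet.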
Use MQTT 5’s Shared Subscriptions Feature
Shared Subscriptions, a feature standardized in MQTT 5, are commonly used when a group of clients struggles to handle all the messages being published. By distributing the message load among multiple clients, Shared Subscriptions help alleviate the processing burden in high-traffic situations.
Another important use case is ensuring that each message is received by only one client, rather than all clients in a group. This selective distribution minimizes unnecessary data duplication and reduces bandwidth usage.
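For example, compare this with every client in the group receiving every message. With two devices on a shared subscription, you’re already cutting the delivered data by 50%, and this is the worst case. With n clients in the group, each message is delivered once instead of n times, so the saving is 1 - 1/n: 75% with four clients and 90% with ten. The saving keeps growing with group size, but with diminishing returns rather than proportionally.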
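The delivery arithmetic can be sketched as follows. The `$share/<group>/<topic filter>` syntax is the MQTT 5 shared subscription format. The helper names and the group and topic values are hypothetical, and the model assumes every client in the group would otherwise subscribe to the same topic individually.

```python
def shared_filter(group: str, topic_filter: str) -> str:
    """Build an MQTT 5 shared subscription filter: $share/<group>/<topic filter>."""
    return f"$share/{group}/{topic_filter}"


def deliveries(messages: int, subscribers: int, shared: bool) -> int:
    """Outbound PUBLISH packets the broker sends to the group.

    Normal subscriptions: every subscriber gets every message.
    Shared subscription: each message goes to exactly one group member.
    """
    return messages if shared else messages * subscribers


def saving(subscribers: int) -> float:
    """Fraction of deliveries avoided by sharing: 1 - 1/n."""
    return 1 - 1 / subscribers


print(shared_filter("workers", "sensors/#"))
for n in (2, 4, 10):
    print(n, deliveries(1000, n, shared=False), deliveries(1000, n, shared=True), saving(n))
```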
Use MQTT 5’s Topic Alias Feature
The Topic Alias feature is exclusive to MQTT 5. Our blog post, MQTT Topic Alias, uses the following example topic:
data/europe/germany/south/bavaria/munich/schwabing/box-32543y/junction/consumption/current
This topic name describes the current power consumption of a specific junction box, with the payload being a single integer value. In cases like these, the Topic Alias feature simplifies communication by replacing long and complex topic strings with single integers.
For instance, instead of repeatedly publishing packets to the topic data/europe/germany/south/bavaria/munich/schwabing/box-32543y/junction/consumption/current, you can assign it a topic alias, such as 321.
In this example, the original topic string is 90 bytes, while the alias is just 3 bytes on the wire: a one-byte property identifier plus a two-byte integer holding the value 321. Counting only topic and payload, the packet was initially 99 bytes (90 bytes for the topic + 9 bytes for the payload). With a topic alias, it drops to 12 bytes, an 87-byte reduction per packet, or roughly 88%.
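The arithmetic above can be reproduced with a few lines of Python. This sketch follows the article’s simplified accounting: it counts only topic, alias, and payload bytes and ignores the fixed header and length prefixes, which are present in both cases. The 9-byte payload size is taken from the text.

```python
TOPIC = (
    "data/europe/germany/south/bavaria/munich/schwabing/"
    "box-32543y/junction/consumption/current"
)

topic_bytes = len(TOPIC.encode("utf-8"))  # MQTT topic names are UTF-8 strings
PAYLOAD_BYTES = 9  # payload size stated in the example above
ALIAS_BYTES = 3    # 1-byte property identifier + 2-byte integer alias

before = topic_bytes + PAYLOAD_BYTES  # full topic name in every PUBLISH
after = ALIAS_BYTES + PAYLOAD_BYTES   # alias only, once the mapping exists
reduction = before - after
percent = 100 * reduction / before

print(f"{before} -> {after} bytes, saving {reduction} bytes ({percent:.1f}%)")
```

Note that the first PUBLISH on a connection has to carry both the full topic name and the alias to establish the mapping, so the saving applies from the second message onward.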
Read our blog, Underutilized MQTT 5 Features that Enhance Modern IoT Data Flows, to understand MQTT 5 features that enhance IoT data flows.
Decrease Message Rate
In typical telemetry use cases, individual messages are often not as critical as the overall collection of data. This presents an opportunity to optimize by adjusting the message rate based on the use case requirements.
To determine the optimal rate, it’s essential to understand the specific needs of your application. For example, if the goal is to track temperature changes, sending a packet every minute might be unnecessary. Instead, it would be more efficient for the client to send a new packet only when a significant change in temperature occurs.
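Sending only on significant change is often called report-by-exception or a deadband filter. Here is a minimal client-side sketch. The class name, the 0.5 degree threshold, and the sample readings are hypothetical, and a real client would call its MQTT library’s publish function where this sketch just collects values.

```python
class DeadbandFilter:
    """Let a reading through only if it moved enough since the last one sent."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.last_sent: float | None = None

    def should_publish(self, value: float) -> bool:
        # Always send the first reading, then only significant changes.
        if self.last_sent is None or abs(value - self.last_sent) >= self.threshold:
            self.last_sent = value
            return True
        return False


readings = [20.0, 20.1, 20.2, 20.6, 20.7, 21.3]  # hypothetical temperatures
deadband = DeadbandFilter(threshold=0.5)
sent = [r for r in readings if deadband.should_publish(r)]

print(f"published {len(sent)} of {len(readings)} readings: {sent}")
```

One design choice worth noting: the comparison is against the last value sent, not the last value measured, so slow drift still triggers a message once it accumulates past the threshold.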
Use MQTT Session Expiration and Keep Alive Feature
In certain use cases, it’s essential to minimize data transmission to the lowest possible level. For sensors with built-in batteries that are difficult to access, reducing the amount of data sent is crucial for maximizing battery life. While this also helps save on data costs, the primary goal in these scenarios is to extend battery longevity.
To achieve this, extend the Session Expiry Interval. The broker then keeps the session, including its subscriptions and queued messages, even if the connection is temporarily lost, so a reconnecting client can resume it instead of sending its SUBSCRIBE packets again. Increasing the Keep Alive interval complements this: an idle client sends PINGREQ packets less often, conserving both battery power and bandwidth.
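As a rough illustration of the Keep Alive effect, the sketch below estimates ping traffic for a client that sends nothing else. It assumes one PINGREQ/PINGRESP exchange per Keep Alive interval, although real clients may ping somewhat earlier. It counts MQTT bytes only (each of the two packets is 2 bytes), so TCP/IP and TLS overhead, which is usually larger, is not included. The function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400
PING_EXCHANGE_BYTES = 4  # PINGREQ (2 bytes) + PINGRESP (2 bytes), MQTT layer only


def idle_ping_bytes_per_day(keep_alive_s: int) -> int:
    """MQTT-level ping bytes per day for a client that is otherwise silent."""
    if keep_alive_s == 0:
        return 0  # a Keep Alive of 0 turns the mechanism off
    exchanges = SECONDS_PER_DAY // keep_alive_s
    return exchanges * PING_EXCHANGE_BYTES


for interval in (60, 600, 3600):
    print(f"keep alive {interval:>4}s -> {idle_ping_bytes_per_day(interval)} bytes/day")
```

A longer interval also means the broker takes longer to notice a dead connection, so the right value depends on how quickly you need to detect offline devices.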
Use MQTT Quality of Service (QoS) Levels Wisely
Each Quality of Service (QoS) level generates a different amount of traffic:
QoS 0 uses a single packet.
QoS 1 requires at least 2 packets.
QoS 2 requires at least 4 packets.
It’s important to understand that these are minimum packet counts for delivering a single message between a client and the broker; retransmissions after a lost acknowledgment add to them. With QoS 0, exactly one packet is sent and nothing is acknowledged or resent, so its traffic is easy to quantify.
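The per-message packet counts listed above can be tallied end to end with a small sketch. It assumes no retransmissions and uses the MQTT rule that a message is delivered at the lower of the publish QoS and the QoS granted to the subscription. The function name is hypothetical, and broker-internal cluster traffic is not modeled.

```python
# Minimum control packets per message on one client-broker hop:
# QoS 0: PUBLISH; QoS 1: PUBLISH + PUBACK;
# QoS 2: PUBLISH + PUBREC + PUBREL + PUBCOMP.
PACKETS_PER_HOP = {0: 1, 1: 2, 2: 4}


def min_packets(publish_qos: int, subscribe_qos: int, subscribers: int) -> int:
    """Minimum packets for one message: publisher to broker, broker to subscribers."""
    inbound = PACKETS_PER_HOP[publish_qos]
    # Delivery to a subscriber uses the lower of the two QoS levels.
    delivery_qos = min(publish_qos, subscribe_qos)
    outbound = PACKETS_PER_HOP[delivery_qos] * subscribers
    return inbound + outbound


for qos in (0, 1, 2):
    print(f"QoS {qos}, one subscriber: {min_packets(qos, qos, 1)} packets")
```

This makes the cost of a blanket QoS 2 policy visible: with one subscriber it needs at least four times the packets of QoS 0 for the same payload.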
In contrast, higher QoS levels (1 and 2) require acknowledgments. This can significantly impact the system, especially with QoS 2, as its persistence and synchronization requirements increase network traffic beyond just the packets exchanged with clients. In these cases, the clusterReplicaCount also plays a role in the amount of internal data flow.
If you’re operating a cluster with nodes distributed across different Availability Zones, this added traffic can have a considerable impact on performance and cost. It’s crucial to strike a balance—adjusting QoS levels or replication settings can help, but be cautious, as lowering these may increase the risk of data loss.
Conclusion
Improving data cost efficiency in MQTT-based systems requires a multifaceted approach. By implementing these strategies, CTOs and Solution Architects can significantly reduce data transmission costs while maintaining system performance and reliability. Remember that each use case may require a unique combination of these techniques, so it's essential to thoroughly understand your specific requirements and limitations.
As you optimize your MQTT implementation, continue to stay informed about the latest developments in the protocol and related technologies. Regularly reassess your system's performance and adjust your strategies as needed to ensure ongoing cost efficiency and operational excellence. Contact us if you need assistance in optimizing your MQTT systems.
Francisco Menéres
Francisco Menéres is Senior Customer Success Manager – EMEA at HiveMQ. Francisco excels at helping customers achieve their business goals by bringing a unique perspective and a proactive approach to problem-solving. His ability to see the big picture allows him to develop effective strategies and drive success, whether he’s working with people one-on-one or leading a team.