Monitoring an MQTT Broker for Key Performance Indicators (KPIs)

by HiveMQ Team

Oct 30, 2024

The MQTT protocol has become the leading standard for enabling efficient and reliable communication between devices. At the heart of this ecosystem lies the MQTT broker – a crucial component that facilitates message exchange. But how do we ensure this vital piece of infrastructure is performing optimally? The answer lies in effective MQTT Broker monitoring. There are various tools for monitoring KPIs, such as Prometheus, Grafana, AWS CloudWatch, Splunk, HiveMQ Control Center, HiveMQ Health API, and more.

In this guide, we will walk you through the importance of monitoring MQTT Brokers and different ways to achieve this.

The Importance of Monitoring MQTT Brokers

Imagine a smart city where traffic lights, public transport, and emergency services all rely on real-time data exchange. Or picture a manufacturing plant where production lines, inventory systems, and quality control are interconnected through IoT. In these scenarios and countless others, the smooth operation of an MQTT broker is not just desirable - it's mission-critical. Monitoring MQTT brokers is essential for several reasons:

Ensuring System Reliability: Proactively identifying and addressing issues before they impact the IoT ecosystem.
Optimizing Performance: Fine-tuning the MQTT broker for optimal performance and scalability.
Security Management: Detecting unusual patterns or potential security breaches.
Capacity Planning: Planning for future growth and resource allocation.
Troubleshooting: Quickly diagnosing and resolving issues when they arise.

Many different roles within an organization also benefit from MQTT broker monitoring, each using the insights to fulfill their specific responsibilities. For example:

IoT developers can debug issues and optimize message flows by streamlining communication, removing duplicates, and detecting rogue clients through effective monitoring.
Site Reliability Engineers (SREs) can ensure the MQTT infrastructure meets service level objectives (SLOs).
Solution architects can design scalable and resilient IoT solutions.
Operations Teams (OT) can maintain day-to-day operations and respond to alerts.
Business stakeholders can understand the performance and reliability of their IoT infrastructure.

Now, let’s look at some of the Key Performance Indicators (KPIs) for monitoring an MQTT broker in production.

Key Performance Indicators (KPIs) for MQTT Broker Monitoring

Resource Consumption: Monitor CPU usage, memory usage, and heap memory.
Network Activity and Traffic: Track incoming connections, disconnections, concurrent connections, and message flow metrics such as incoming and outgoing PUBLISH packets, subscribe/unsubscribe rates, and QoS levels distribution.
Latency: Measure time-to-deliver for messages from a publisher to a subscriber.
System Health: Monitor overload protection levels and garbage collection metrics to maintain system stability.
Error Tracking: Log errors, failed tasks, and issues.
Cluster Performance: Track the number of nodes and data replication status in multi-node setups.
Business-Critical Data Integrity: Monitor data-loss incidents, such as dropped messages or failed message deliveries, to ensure integrity in business-critical applications.
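To make the latency KPI more concrete, here is a minimal sketch of how delivery latency could be summarized once you have collected publish and receive timestamps. It assumes the publisher embeds a send timestamp in each message and that publisher and subscriber clocks are synchronized; the function names are hypothetical and not part of any HiveMQ API.

```python
import math


def delivery_latencies_ms(samples):
    """Convert (sent_ts, received_ts) pairs in seconds to latencies in ms."""
    return [(received - sent) * 1000.0 for sent, received in samples]


def percentile(values, pct):
    """Nearest-rank percentile (pct in 0..100) of a non-empty list."""
    if not values:
        raise ValueError("no latency samples")
    ordered = sorted(values)
    # Nearest-rank method: the smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]
```

Tracking a high percentile such as p95 or p99 alongside the median tends to surface slow deliveries that an average would hide.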

Read our blog Monitoring HiveMQ: A Comprehensive Guide to get an overview of important KPIs when monitoring a production-level HiveMQ MQTT Broker.

Now, let's explore a few methods for monitoring an MQTT broker.

Method 1: Prometheus for MQTT Broker Metrics Collection

Prometheus is an excellent choice for monitoring MQTT brokers, especially when used with HiveMQ. One of the best ways to monitor your MQTT Broker using Prometheus is the free Prometheus Monitoring extension. This extension allows HiveMQ to expose metrics to a Prometheus application. Here's how to set it up:

Step 1: Install the HiveMQ Prometheus Extension

1. Download the HiveMQ Prometheus Monitoring Extension.

2. Unpack the zip file.

3. Move the hivemq-prometheus-extension folder to the extensions folder of your HiveMQ installation.

4. Adjust the prometheusConfiguration.properties file inside the extension folder to suit your needs.

Step 2: Configuring Prometheus

To install Prometheus, follow the Prometheus Guide. A working prometheus.yml file based on the HiveMQ Prometheus Extension configuration looks like this:

global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'hivemq'
    scrape_interval: 5s
    metrics_path: '/metrics'
    static_configs:
      # Port 9399 is the port configured in the HiveMQ Prometheus Extension
      - targets: ['<node1-ip>:9399', '<node2-ip>:9399']

Note: The above code example is tailored for a two-node cluster. If you want more nodes, you need to add the additional nodes to the targets.
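Before pointing Prometheus at a node, it can help to confirm that the extension is actually exposing metrics. The sketch below fetches the scrape endpoint and parses the Prometheus text exposition format into a dictionary. It assumes the port (9399) and path (/metrics) used in the configuration above; the metric name in the usage note is an assumption and should be checked against the metrics your installation exposes.

```python
from urllib.request import urlopen


def parse_metrics(text):
    """Parse Prometheus text exposition lines into {sample_name: float}.

    Comment lines (# HELP, # TYPE) and blank lines are skipped. Labels stay
    part of the key, and an optional trailing timestamp is ignored.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Label values may contain spaces, so split after the closing brace.
            name, rest = line.rsplit("}", 1)
            name += "}"
        else:
            name, rest = line.split(None, 1)
        metrics[name] = float(rest.split()[0])
    return metrics


def fetch_metrics(host, port=9399, path="/metrics", timeout=5):
    """Fetch and parse the metrics endpoint of one HiveMQ node."""
    with urlopen(f"http://{host}:{port}{path}", timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

For example, fetch_metrics("<node1-ip>") would return all samples from the first node, and a key such as com_hivemq_networking_connections_current (assumed name) would hold the current connection count.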

Step 3: Using Prometheus for Displaying Metrics

Access the Prometheus web interface at http://<prometheus-ip>:9090.
Use the Expression browser to query HiveMQ metrics directly.

This configuration provides a starting point with useful metrics for most MQTT deployments, including:

Total number of connected clients
Total messages published
Total messages delivered to subscribers
Message size statistics
Total number of subscriptions
Subscription and unsubscription rates
CPU usage
Memory usage
Network throughput
Latency measurements
Authorization attempts
Queue sizes
Persistence store performance
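Beyond the Expression browser, the same queries can be run programmatically through the Prometheus HTTP API. The following sketch builds an instant query against /api/v1/query and extracts one value per scraped instance. The helper names are hypothetical, and any HiveMQ metric name you pass in should be verified against what your broker exposes.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def extract_values(response_body):
    """Return {instance_label: float} from an instant-query vector result."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error')}")
    return {
        item["metric"].get("instance", ""): float(item["value"][1])
        for item in payload["data"]["result"]
    }


def query_prometheus(base_url, promql, timeout=5):
    """Run an instant query and return the per-instance values."""
    with urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        return extract_values(resp.read().decode("utf-8"))
```

Calling query_prometheus("http://<prometheus-ip>:9090", "up{job=\"hivemq\"}") is a quick way to see which cluster nodes Prometheus is currently able to scrape.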

Read our blog HiveMQ - Monitoring with Prometheus and Grafana and our HiveMQ documentation for an in-depth, step-by-step guide.

Method 2: Monitor and Visualize MQTT Data with InfluxDB, Telegraf, and Grafana

InfluxDB and Grafana provide a powerful combination for storing time-series data and creating customizable dashboards for monitoring an MQTT Broker. Here's a detailed guide on setting up this monitoring solution for your HiveMQ MQTT broker:

Step 1: Install and Configure InfluxDB

1. Install InfluxDB on your system, start it, and create a database using the commands below.

$ influx
Connected to http://localhost:8086 version 1.3.0
InfluxDB shell version: v1.2.3
> CREATE DATABASE hivemq

Attention: InfluxDB does not provide authentication by default. When you run InfluxDB on an external server, a lack of authentication can expose your metrics to a third party. Make sure you address this potential security issue adequately.

2. Set up a retention policy to manage data storage. For example, create a two-week retention policy.

$ influx
Connected to http://localhost:8086 version 1.3.0
InfluxDB shell version: v1.2.3
> CREATE RETENTION POLICY "two_weeks_only" ON "hivemq" DURATION 2w REPLICATION 1
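To check that the database and retention policy accept data before connecting HiveMQ, you could write a test point by hand. The sketch below formats a point in InfluxDB line protocol and prepares a request for the InfluxDB 1.x /write endpoint. The measurement name hivemq_probe and the helper names are hypothetical, only numeric fields are handled, and the request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def _escape(value):
    """Escape commas, equals signs, and spaces in tag/field keys and tag values."""
    return str(value).replace(",", r"\,").replace("=", r"\=").replace(" ", r"\ ")


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point; integer fields get the 'i' suffix, others become floats."""
    tag_part = "".join(f",{_escape(k)}={_escape(v)}" for k, v in sorted(tags.items()))
    field_part = ",".join(
        f"{_escape(k)}={v}i" if isinstance(v, int) else f"{_escape(k)}={float(v)}"
        for k, v in sorted(fields.items())
    )
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


def build_write_request(base_url, database, retention_policy, lines):
    """Prepare a POST to /write targeting a specific retention policy."""
    query = urlencode({"db": database, "rp": retention_policy, "precision": "ns"})
    return Request(
        f"{base_url.rstrip('/')}/write?{query}",
        data="\n".join(lines).encode("utf-8"),
        method="POST",
    )
```

Passing the prepared request to urllib.request.urlopen sends it; a 204 response from InfluxDB 1.x indicates the point was accepted.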

Step 2: Configure HiveMQ InfluxDB Extension

Download the HiveMQ InfluxDB Monitoring Extension, which allows HiveMQ to connect to an instance of InfluxDB for time series monitoring.
In the folder hivemq-influxdb-extension, modify the influxdb.properties file to fit your needs. Check that the mandatory properties (host, port) are set.
Copy the folder hivemq-influxdb-extension to your <HIVEMQ_HOME>/extensions folder.

Step 3: Install and Configure Telegraf

Install Telegraf on each HiveMQ cluster node.
You need to configure the telegraf.conf file to tell Telegraf which metrics to gather and write to InfluxDB. Here's a basic configuration:

[global_tags]
[agent]
  interval = "5s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "5s"
  flush_jitter = "0s"
  precision = ""
  debug = false
  quiet = false
  hostname = ""
[[outputs.influxdb]]
  urls = ["http://127.0.0.1:8086"]
  database = "hivemq"
  retention_policy = ""
  write_consistency = "any"
  timeout = "5s"
[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = false
[[inputs.system]]
[[inputs.disk]]
[[inputs.diskio]]
[[inputs.kernel]]
[[inputs.mem]]
[[inputs.processes]]

Step 4: Install and Configure Grafana

1. Install and start Grafana (default address: localhost:3000).

2. Configure InfluxDB as a data source in Grafana:

Add a new data source.
Select InfluxDB as the type.
Set the URL to your InfluxDB instance (e.g., http://localhost:8086)
Set the database name to hivemq.

3. Download the HiveMQ dashboard template JSON file, which is available here.

4. In Grafana, go to "Create" > "Import" and upload the JSON file.

5. Select the InfluxDB data source you configured earlier.

The imported dashboard provides a starting point with useful metrics for most MQTT deployments, including:

System metrics, such as CPU, memory usage, and disk usage.
MQTT-specific metrics, such as number of connections, messages published/received, etc.
Cluster health (for multi-node setups)

Read our blog Monitoring MQTT Messages with InfluxDB and Grafana for an in-depth, step-by-step guide.

Method 3: Flexible Monitoring with Node-RED

Node-RED enables users to monitor MQTT brokers like HiveMQ effectively. By creating a visual monitoring system, you can gain real-time insights into broker performance, client connections, and message flow, enhancing your ability to manage and optimize your IoT deployments. Here’s a step-by-step guide:

Ensure Node-RED is installed and running on your system.
Open the Node-RED editor in your web browser, typically at http://localhost:1880.
Drag an "mqtt in" node from the palette onto your workspace.
Double-click the node to open its configuration panel.
Click the pencil icon next to the "Server" field to create a new broker configuration.
In the "Connection" tab:
a. Enter your HiveMQ broker address (e.g., yourcluster.hivemq.cloud)
b. Set the port (usually 8883 for secure connections)
c. Enable "Use TLS" for secure connections
d. Set the Protocol to "MQTT V3.1.1" or "MQTT V5" depending on your broker's support.
In the Security tab, enter your HiveMQ credentials (username and password)
Click Update to save the broker configuration.
Create separate flows for different metrics:
a. Add additional mqtt in nodes for specific topics like $SYS/broker/clients/connected for client count
b. Use gauge or chart nodes to visualize these metrics
Set up alerts by adding a switch node to check for specific conditions (e.g., client count below a threshold) and connect it to a notification node (like "email" or "Slack") for alerts.
Add dashboard nodes to create a visual interface:
a. Use gauge nodes for metrics like connected clients
b. Add chart nodes for time-series data like message rates
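The threshold check performed by the switch node in the alerting step can be expressed in a few lines. The sketch below shows that logic in Python for clarity; in Node-RED itself you would configure the switch node or write the equivalent in a function node. The function name and message wording are hypothetical.

```python
def evaluate_client_count(payload, minimum_clients):
    """Return an alert message if the client count is too low, else None.

    payload is the raw MQTT message payload (bytes, str, or int) received on
    the client-count topic.
    """
    if isinstance(payload, bytes):
        payload = payload.decode("utf-8", errors="replace")
    try:
        count = int(str(payload).strip())
    except ValueError:
        # An unparseable payload is itself worth an alert.
        return f"unparseable client count: {payload!r}"
    if count < minimum_clients:
        return f"connected clients dropped to {count} (minimum {minimum_clients})"
    return None
```

Returning None when everything is fine mirrors how a switch node simply passes nothing on to the notification node.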

To see an exhaustive list of metrics for HiveMQ, visit our documentation.

Method 4: Monitoring HiveMQ Broker with Splunk

Using the HiveMQ Splunk Extension, users can post MQTT messages as well as their HiveMQ metric data to a Splunk HTTP Event Collector (HEC). This allows MQTT topics to be mapped directly to Splunk indexes and messages to be forwarded to multiple Splunk instances in a high-performing, scalable, and reliable manner.

Here is a step-by-step guide to monitoring an MQTT Broker using the HiveMQ Splunk Extension:

Download the Extension: Obtain the HiveMQ Splunk Extension by downloading the ZIP file from the SVA website. This file contains the latest version of the extension along with documentation.
Unzip the File: Extract the contents of de.sva.hivemq.extensions.splunk_<version>.zip to the directory <HIVEMQ_HOME>/extensions/de.sva.hivemq.extensions.splunk.
Add the License File: Place the license file for the extension in the directory <HIVEMQ_HOME>/extensions/de.sva.hivemq.extensions.splunk.
Configure Settings: Locate the configuration.json file within the de.sva.hivemq.extensions.splunk folder. This file comes preconfigured with standard settings for forwarding all MQTT topics and some recommended metrics. Modify the configuration as needed to suit your monitoring requirements. Detailed explanations of each field are provided in the documentation accompanying the extension.
Start HiveMQ: Launch HiveMQ to activate the Splunk Extension. Ensure that you are using either HiveMQ Professional or HiveMQ Enterprise, as this extension is compatible only with these versions.
Monitor MQTT Data: Once configured, your MQTT messages and HiveMQ metric data will be posted to a Splunk HTTP Event Collector (HEC). This setup allows for the efficient mapping of MQTT topics directly to Splunk indexes and the efficient forwarding of messages to multiple Splunk instances.

Note that the evaluation version of the extension is limited to 5 hours of operation. To reset this limit, restart HiveMQ.
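To verify that your HEC endpoint and token work before enabling the extension, you could send a test event yourself. The sketch below prepares a request for the Splunk HEC /services/collector/event endpoint. It is not how the extension is implemented; the helper name, the example index, and the event fields are assumptions chosen for illustration, and the request is built but not sent.

```python
import json
from urllib.request import Request


def build_hec_request(hec_url, token, event, index=None, sourcetype=None, time_s=None):
    """Prepare a POST carrying one event for a Splunk HTTP Event Collector."""
    body = {"event": event}
    # Optional metadata is only included when provided.
    if index is not None:
        body["index"] = index
    if sourcetype is not None:
        body["sourcetype"] = sourcetype
    if time_s is not None:
        body["time"] = time_s
    return Request(
        f"{hec_url.rstrip('/')}/services/collector/event",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

A request built with build_hec_request("https://<splunk-host>:8088", "<token>", {"topic": "plant/line1", "payload": "21.5"}, index="mqtt") can then be sent with urllib.request.urlopen.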

Here are the KPIs you can monitor with this extension:

Message Flow Metrics, such as Publish Delay, Message Throughput, and Message Success Rate.
Client Metrics, such as Subscriber Count and Connected Clients
Resource Usage, such as CPU Utilization, Memory Usage, and Network Traffic
System Health, such as HiveMQ Instance Health and Service Level Objectives (SLOs).

Method 5: Monitoring Using AWS CloudWatch

The HiveMQ Extension for AWS CloudWatch allows HiveMQ to report its metrics to AWS CloudWatch. Any HiveMQ metric can be sent to CloudWatch, but each metric you want reported must be configured explicitly. This keeps costs down, since CloudWatch can become expensive if you publish unnecessary data.

Here's a step-by-step guide to monitoring using the AWS CloudWatch Extension:

Download the Extension: Obtain the HiveMQ AWS CloudWatch Extension from the HiveMQ website. Refer to the GitHub documentation for more details.
Install the Extension: Unzip the downloaded file into the extensions folder of your HiveMQ installation.
Configure AWS Credentials: Set up AWS credentials either through environment variables, AWS credentials file, or IAM roles if running on EC2.
Edit Configuration File: Locate the cloudwatch-extension-config.xml file in the extension folder. Specify the AWS region and the metrics you want to monitor. Kindly note that the evaluation version of the extension is limited to 5 hours of operation. You will need to restart HiveMQ to reset this limit if necessary.
Access CloudWatch: Log into your AWS Console and navigate to the CloudWatch service. Look for the metrics under the "HiveMQ" namespace.

With the AWS CloudWatch Extension, you can monitor various KPIs, such as:

Client Metrics: Connected clients count, Connection rate, and Disconnection rate.
Message Metrics: Incoming and outgoing message rate, Publish and subscribe rates, and Message size statistics.
Subscription Metrics: Total subscriptions and Subscription change rate.
Cluster Metrics (for clustered setups): Cluster node count and cluster availability.
System Metrics: CPU usage, memory utilization, and disk usage.
MQTT-specific Metrics: MQTT version distribution, QoS level distribution, and retained message count.
Performance Metrics: Message processing time and subscription matching time.
Security Metrics: Authentication successes and failures and authorization checks.

The complete list of metrics is available in the HiveMQ documentation.

If you'd like to create alerts or alarms while monitoring, you can use Amazon CloudWatch alarms. There are two types of alarms:

Metric alarms, which track a single metric and trigger actions like SNS notifications or EC2 actions based on thresholds
Composite alarms, which combine the states of multiple alarms through a rule expression (using AND, OR, and NOT) to reduce noise and trigger alerts only when that rule is satisfied. Visit this AWS documentation to learn more.
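The two alarm types can be sketched as follows. The first helper assembles parameters for a metric alarm, and the second builds a rule expression for a composite alarm. The helper names, the evaluation settings (three one-minute periods), and the alarm names are assumptions for illustration. The namespace defaults to "HiveMQ" as described above, so check it against your extension configuration.

```python
def metric_alarm_params(alarm_name, metric_name, threshold, sns_topic_arn,
                        namespace="HiveMQ"):
    """Parameters for a metric alarm that fires above a threshold."""
    return {
        "AlarmName": alarm_name,
        "Namespace": namespace,
        "MetricName": metric_name,
        "Statistic": "Average",
        "Period": 60,              # seconds per evaluation period
        "EvaluationPeriods": 3,    # consecutive periods that must breach
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def composite_rule(alarm_names, operator="AND"):
    """Build a composite alarm rule such as ALARM("a") AND ALARM("b")."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be AND or OR")
    return f" {operator} ".join(f'ALARM("{name}")' for name in alarm_names)
```

With boto3 installed and AWS credentials configured, these could be passed to the CloudWatch client as put_metric_alarm(**params) and put_composite_alarm(AlarmName=..., AlarmRule=rule).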

Method 6: HiveMQ-Specific Monitoring Options

If you're using HiveMQ as your MQTT broker, you have access to additional specialized monitoring tools:

HiveMQ Health API

The HiveMQ Health API offers comprehensive insights into the health of the HiveMQ Platform. It not only provides Readiness and Liveness REST API resources for every node in a HiveMQ cluster but also includes detailed status and health information for each installed HiveMQ extension. Site Reliability Engineers (SREs) can leverage this data to proactively oversee platform health, swiftly identify and address issues, and maintain maximum uptime.

Example Health API configuration:

<hivemq>
  <!-- ... -->
  <health-api>
    <enabled>true</enabled>
    <listeners>
      <http>
        <port>8889</port>
        <name>health-api-listener</name>
        <bind-address>127.0.0.1</bind-address>
      </http>
    </listeners>
  </health-api>
</hivemq>
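With the listener above enabled, a probe script or orchestrator can poll the liveness and readiness resources. The sketch below is a minimal example of such a probe. The URL path /api/v1/health/<resource>, the JSON "status" field, and the convention that a non-200 response means unhealthy are assumptions that should be confirmed against the HiveMQ Health API documentation for your version.

```python
import json
from urllib.error import HTTPError
from urllib.request import urlopen


def interpret_health(status_code, body):
    """Return (healthy, status_text) for one health response."""
    try:
        status = json.loads(body).get("status", "UNKNOWN")
    except (ValueError, TypeError, AttributeError):
        # Body missing, not JSON, or not a JSON object.
        status = "UNKNOWN"
    return status_code == 200, status


def probe(base_url, resource="liveness", timeout=3):
    """Query one health resource, e.g. 'liveness' or 'readiness'."""
    url = f"{base_url.rstrip('/')}/api/v1/health/{resource}"
    try:
        with urlopen(url, timeout=timeout) as resp:
            return interpret_health(resp.status, resp.read().decode("utf-8"))
    except HTTPError as err:
        # Error responses still carry a body that describes the state.
        return interpret_health(err.code, err.read().decode("utf-8"))
```

Because the example configuration binds the listener to 127.0.0.1, probe("http://127.0.0.1:8889", "readiness") has to run on the broker host itself unless you change the bind address.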

HiveMQ Control Center

The HiveMQ Control Center provides the management tools and analytics required by administrators to monitor and maintain a deployed HiveMQ system. The Control Center provides a dashboard for monitoring the health of the system, a client overview, and advanced analytics to identify irregular behavior. Here's an overview of its key features:

Dashboard

The Control Center provides a real-time dashboard for monitoring the overall health of the HiveMQ MQTT broker deployment. It displays:

Number of MQTT client sessions
Inbound and outbound publish rates
Subscriptions
Retained messages
Queued messages

Additionally, administrators can query individual HiveMQ nodes for node-specific performance statistics.

Client Management

The Control Center offers a detailed client overview with the following capabilities:

Filtering clients based on Client ID, Connection Status, Client name, and IP address
Drilling down into specific client information
Viewing detailed client session information, including IP, KeepAlive time, TLS information, and Last Will and Testament

Analytics

HiveMQ Control Center provides advanced analytics features, including:

Analysis of dropped messages, showing reasons for message drops, affected clients, and impacted shared subscription groups
Trace Recording functionality, allowing administrators to select and log specific messages from particular clients or topics for diagnostics and debugging purposes

System Health Monitoring

The Control Center enables administrators to:

Monitor the overall health of the HiveMQ system
Identify irregular behavior
Maintain the deployed HiveMQ system effectively

By offering these comprehensive monitoring and management tools, the HiveMQ Control Center empowers administrators to ensure the smooth operation and optimal performance of their MQTT broker deployments.

Wrap-Up

Monitoring an MQTT broker is not just a best practice but a necessity for businesses leveraging IoT and IIoT ecosystems. Whether it’s maintaining system reliability, optimizing performance, managing security, or planning for scalability, robust monitoring capabilities are essential. Organizations can effectively track and analyze critical KPIs by utilizing various tools like Prometheus, InfluxDB, Grafana, Node-RED, and specialized HiveMQ solutions, empowering different stakeholders to take proactive and informed actions. The right monitoring strategy ensures a resilient IoT infrastructure, driving successful deployments and smoother operations.

With the ever-growing demands on IoT networks, a well-monitored MQTT broker acts as the backbone for seamless communication and reliable data exchange. By understanding and implementing the methods outlined in this guide, you’re setting up your MQTT broker for success, paving the way for a more efficient and secure IoT future.
