How HiveMQ Optimizes High-volume Data Ingest into AWS

by Łukasz Malinowski, Gaurav Suman
13 min read

Streamlined telemetry transfer and real-time data ingest from connected devices are vital for informed decision-making in real-world IoT scenarios. For example, modern cars generate an unprecedented volume of data. Those records include information about the vehicle's performance, environmental conditions, and driver behavior, which is crucial for advanced use cases like predictive maintenance.

Yet delivering those benefits to end users requires first solving a complex challenge: putting in place a robust solution for telemetry transfer that ensures timely and cost-efficient data transmission from connected devices (for example, vehicles) to backend platforms.

In this article, we demonstrate how HiveMQ simplifies solution architecture. We compare two approaches: using native AWS services to transfer data to Amazon Kinesis Data Streams and achieving the same goal with HiveMQ.

Note: Amazon Kinesis Data Streams is a popular destination for streaming high-volume data into the AWS cloud. Kinesis makes it easy to collect, process, and analyze data streams in real time.

IoT Solutions Overview

AWS Native

Let’s start by analyzing the AWS-native design. Don’t be discouraged if you are unfamiliar with the AWS offering; we will guide you through the entire solution.

AWS native solution design to ingest IoT data

On the left side of the above diagram, we have a car equipped with various sensors. Those sensors gather data about the vehicle and its surroundings. We analyze this information using cloud technologies to understand and predict the condition of monitored devices.

To make this example more relatable, let's consider the following structure of generated information:

{
    "timestamp": "2024-05-17T11:03:56.497482Z",
    "vehicle_id": "car0001",
    "engine_status": "running",
    "speed": 38.3,
    "fuel_level": 63.0,
    "engine_temperature": 92.0,
    "oil_pressure": 37.5,
    "tire_pressure": {
        "front_left": 32.1,
        "front_right": 32.2,
        "rear_left": 33.5,
        "rear_right": 32.9
    },
    "location": {
        "latitude": 60.657454,
        "longitude": -64.338256
    },
    "battery_voltage": 11.1,
    "acceleration": {
        "x": 0.65,
        "y": -0.98,
        "z": 0.12
    },
    "fuel_consumption_rate": 0.06,
    "ambient_temperature": 24.0,
    "humidity": 77.8,
    "engine_load": 26.5,
    "air_flow_rate": 19.8,
    "oxygen_sensor_readings": {
        "sensor1": 0.89,
        "sensor2": 0.79,
        "sensor3": 0.72,
        "sensor4": 0.82
    }
}
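To experiment with this payload shape locally, a small simulator script can produce matching records. The sketch below mirrors a subset of the sample structure; the value ranges are illustrative assumptions, not taken from a real vehicle:

```python
import json
import random
from datetime import datetime, timezone

def generate_telemetry(vehicle_id):
    """Build one telemetry record matching the sample payload structure."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "vehicle_id": vehicle_id,
        "engine_status": "running",
        "speed": round(random.uniform(0.0, 120.0), 1),
        "fuel_level": round(random.uniform(0.0, 100.0), 1),
        "engine_temperature": round(random.uniform(70.0, 110.0), 1),
        "tire_pressure": {
            corner: round(random.uniform(30.0, 35.0), 1)
            for corner in ("front_left", "front_right", "rear_left", "rear_right")
        },
        "location": {"latitude": 60.657454, "longitude": -64.338256},
        # (remaining fields from the sample payload omitted for brevity)
    }

record = generate_telemetry("car0001")
print(json.dumps(record, indent=4))
```

A simulator loop can call `generate_telemetry` on an interval and publish each record as the MQTT message payload.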

AWS IoT Core is the main Internet of Things service Amazon Web Services offers. 

To connect a device to AWS IoT Core, we must:

  • Create a Thing

  • Define the IoT Policy

  • Generate the X.509 Certificate

Representation of a device in the AWS cloud

A Thing logically represents the device (a car in our case) in AWS. We leverage Things to store static metadata describing the corresponding devices.

A Thing in the AWS Web Console

An IoT Policy defines the allowed actions for a Thing, like publishing and receiving MQTT messages from specified MQTT Topics. To learn more about MQTT, check the HiveMQ MQTT Essentials guide.

You need to manage IoT Policies carefully, as a misconfiguration can create a severe security risk, enabling hackers to exploit our system and steal confidential information.

An IoT Policy in the AWS Web Console
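For illustration, a least-privilege IoT Policy for our vehicle might look like the sketch below. The account ID, Region, and topic structure are placeholder assumptions, not values taken from the screenshots; the idea is that the device may connect only with its own client ID and publish only under its own topic prefix:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "iot:Connect",
            "Resource": "arn:aws:iot:eu-west-1:123456789012:client/car0001"
        },
        {
            "Effect": "Allow",
            "Action": "iot:Publish",
            "Resource": "arn:aws:iot:eu-west-1:123456789012:topic/car0001/*"
        }
    ]
}
```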

The X.509 Certificate is proof of identity for our device. We must generate, distribute, and manage the Private Key and the corresponding X.509 Certificate with utmost care. Any mishandling could lead to severe security breaches, with attackers potentially utilizing the intercepted Private Key to impersonate one of our devices and disrupt operations.

An X.509 Certificate in the AWS Web Console

That is good progress, but we still have a long way to go. As you can see, establishing the initial connectivity requires extensive knowledge of AWS and IT security.

The AWS IoT Core service does not store received MQTT messages. We must use AWS IoT Rules to deliver them to Amazon Kinesis Data Streams for storage and analytics.

AWS IoT Core Rule overview

AWS IoT Rules analyze MQTT messages and deliver them to various AWS services (Amazon Kinesis Data Streams included). The sample configuration of a Rule looks as follows:

AWS IoT Rule configuration
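The heart of a Rule is its SQL statement, which selects the messages to forward. A minimal statement for our scenario might look like this (the topic name is an assumption matching the vehicle ID used throughout this article):

```sql
SELECT * FROM 'car0001/#'
```

`SELECT *` forwards the message payload unchanged; the `FROM` clause is an MQTT topic filter, so `#` captures every topic published below `car0001/`.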

At this stage, we encounter another challenge. We must explicitly define an IAM Role with a relevant IAM Policy to permit the AWS IoT Rule to write messages into the Amazon Kinesis Data Stream. IAM stands for Identity and Access Management, the AWS service used to manage identities and their access to AWS services and resources. A misconfigured IAM Policy might disturb the information flow or, even worse, create a security vulnerability in our system.

AWS IAM Role for the IoT Rule

AWS IAM Role and IAM Policy in the AWS Web Console
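As a sketch (the account ID, Region, and stream name are placeholders), the IAM Policy attached to the Rule's IAM Role only needs to allow writes to the target stream:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "kinesis:PutRecord",
                "kinesis:PutRecords"
            ],
            "Resource": "arn:aws:kinesis:eu-west-1:123456789012:stream/TelemetryDataStream"
        }
    ]
}
```

Scoping the `Resource` to a single stream ARN, rather than `*`, is what keeps a compromised Rule from writing into other streams in the account.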

As the final stage of our AWS journey, we configure the Amazon Kinesis Data Stream.

AWS Kinesis Data Streams connection

AWS Kinesis Data Stream in the AWS Web Console

That was a long way to go, but we are finally ready to test the complete solution. Let’s start generating data and verify that it reaches Amazon Kinesis Data Streams.

Sample telemetry data generated by a simulated device

MQTT messages stored in the Amazon Kinesis Data Stream

Success! We avoided numerous pitfalls, and our end-to-end solution is working.

That concludes the AWS-native setup. The cloud configuration demanded a deep understanding of several AWS services and the analysis of various attack vectors on our system. This complexity underscores the need for a simpler, more streamlined solution.

End-to-end AWS-native solution

HiveMQ Enterprise Extension for Amazon Kinesis

Can we leverage HiveMQ to achieve the same results using simpler architecture?

Yes, we can! The diagram below shows the complete deployment. Even without going into details, we can see that it is not as complicated as the AWS-native solution.

End-to-end solution leveraging HiveMQ

As in the AWS-native solution, we use the same Amazon Kinesis Data Stream as our target.

On the AWS side, we need to create an IAM User with permission to write data into the Amazon Kinesis Data Stream. We use the same set of permissions as for the AWS IoT Rule.

IAM User in the AWS Web Console

That is the complete setup on the AWS cloud! No AWS IoT Thing, X.509 Certificate, IoT Policy, or IoT Rule is required! With HiveMQ, we achieve the same results with a significantly simpler architecture, minimizing the risk of misconfiguration.

How is that possible? Let me show you how we can leverage the HiveMQ Enterprise Extension for Amazon Kinesis.

The HiveMQ configuration file below defines all aspects of our solution:

<hivemq-amazon-kinesis-extension xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                                 xsi:noNamespaceSchemaLocation="config.xsd">
    <aws-credential-profiles>
        <aws-credential-profile>
            <id>aws-credential-profile-01</id>
            <profile-file>/opt/hivemq/extensions/hivemq-amazon-kinesis-extension/aws-credentials</profile-file>
        </aws-credential-profile>
    </aws-credential-profiles>

    <mqtt-to-kinesis-routes>
        <mqtt-to-kinesis-route>
            <id>telemetry-mqtt-to-kinesis-route</id>
            <enabled>true</enabled>
            <aws-credential-profile-id>aws-credential-profile-01</aws-credential-profile-id>
            <region>eu-west-1</region>
            <mqtt-topic-filters>
                <mqtt-topic-filter>car0001/#</mqtt-topic-filter>
            </mqtt-topic-filters>
            <processor>
                <mapping>
                    <kinesis-streams>
                        <kinesis-stream>
                            <name>TelemetryDataStream</name>
                            <partition-key>
                                <mqtt-topic/>
                            </partition-key>
                            <explicit-hash-key>
                                <random/>
                            </explicit-hash-key>
                        </kinesis-stream>
                    </kinesis-streams>
                </mapping>
            </processor>
        </mqtt-to-kinesis-route>
    </mqtt-to-kinesis-routes>
</hivemq-amazon-kinesis-extension>

The aws-credential-profile specifies the AWS credentials of the IAM User we created in the previous step.
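Assuming the referenced profile file follows the standard AWS credentials file format, it might look like the sketch below (the key values are obvious placeholders for the IAM User's actual access keys):

```ini
[default]
aws_access_key_id = AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```

Keep this file readable only by the HiveMQ process, since it carries the same risk as any long-lived credential.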

Then, we enable the MQTT to Kinesis communication:

<mqtt-to-kinesis-routes>
    <mqtt-to-kinesis-route>
        <id>telemetry-mqtt-to-kinesis-route</id>
        <enabled>true</enabled>

Define the MQTT Topic Filter:

<mqtt-topic-filters>
    <mqtt-topic-filter>car0001/#</mqtt-topic-filter>
</mqtt-topic-filters>
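The `#` in `car0001/#` is the MQTT multi-level wildcard, so the route captures every topic published below `car0001/`. As a sketch, the matching behavior can be modeled in a few lines of Python (a simplified model covering the `+` and `#` wildcards and ignoring edge cases such as `$`-prefixed topics):

```python
def topic_matches(topic_filter, topic):
    """Simplified MQTT topic filter matching for '+' and '#' wildcards."""
    filter_parts = topic_filter.split("/")
    topic_parts = topic.split("/")
    for i, part in enumerate(filter_parts):
        if part == "#":
            return True                      # '#' matches this level and everything below
        if i >= len(topic_parts):
            return False                     # filter is deeper than the topic
        if part != "+" and part != topic_parts[i]:
            return False                     # literal level must match exactly
    return len(filter_parts) == len(topic_parts)

# Topics published by car0001 match the route; other vehicles do not.
print(topic_matches("car0001/#", "car0001/telemetry"))  # True
print(topic_matches("car0001/#", "car0002/telemetry"))  # False
```

A fleet-wide route would instead use a filter such as `+/telemetry`, where `+` matches exactly one topic level.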

And the target Kinesis Data Stream:

<kinesis-streams>
    <kinesis-stream>
        <name>TelemetryDataStream</name>
        <partition-key>
            <mqtt-topic/>
        </partition-key>
        <explicit-hash-key>
            <random/>
        </explicit-hash-key>
    </kinesis-stream>
</kinesis-streams>
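With `<mqtt-topic/>` as the partition key, all messages from one topic would normally land on the same shard, because Kinesis routes each record by the MD5 hash of its partition key. The `<random/>` explicit hash key overrides that hash, spreading a busy topic's records across shards. The Python sketch below models this documented routing rule under the simplifying assumption that the 128-bit hash key space is split evenly across shards:

```python
import hashlib
import random

HASH_SPACE = 2 ** 128  # Kinesis hash key space: 0 .. 2^128 - 1

def shard_for(num_shards, partition_key, explicit_hash_key=None):
    """Pick a shard index the way Kinesis routes records (simplified model)."""
    if explicit_hash_key is None:
        # Default routing: MD5 of the partition key, read as a 128-bit integer.
        hash_key = int.from_bytes(hashlib.md5(partition_key.encode()).digest(), "big")
    else:
        hash_key = explicit_hash_key  # <random/> supplies a random value here
    return hash_key * num_shards // HASH_SPACE

# With only the MQTT topic as partition key, one topic always maps to one shard:
assert shard_for(4, "car0001/telemetry") == shard_for(4, "car0001/telemetry")

# A random explicit hash key spreads records of the same topic across shards:
print(shard_for(4, "car0001/telemetry", random.randrange(HASH_SPACE)))
```

Random distribution maximizes throughput, but note that it also gives up per-topic ordering within a shard, which is the usual trade-off when choosing an explicit hash key strategy.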

That is all! To enable the HiveMQ Enterprise Extension for Amazon Kinesis, simply remove the corresponding DISABLED file:

rm /opt/hivemq/extensions/hivemq-amazon-kinesis-extension/DISABLED

The HiveMQ Enterprise Extension for Amazon Kinesis supports hot reloading of the extension configuration. Changes we make are updated while the extension is running, with no need to restart. When the extension recognizes a new valid configuration, the previous configuration file is automatically archived in the config-archive of the extension home folder.

We can start sending MQTT messages to the local HiveMQ broker, and it will securely forward them to Amazon Kinesis Data Streams:

Sample telemetry data generated by a simulated device

We can confirm that the above telemetry reached AWS:

MQTT messages stored in the Amazon Kinesis Data Stream

Summary

In this article, we explored how HiveMQ can simplify the IoT solution architecture for telemetry transfer to the AWS cloud. We compared two approaches: using native AWS services to store data in Amazon Kinesis and achieving the same goal with HiveMQ.

The native AWS design involves setting up AWS IoT Core, creating a Thing, defining an IoT Policy, and generating an X.509 Certificate. On the other hand, HiveMQ offers a robust MQTT broker that streamlines telemetry transfer, making it easier to collect, process, and analyze real-time data from various connected devices. This comparison shows that HiveMQ is a more efficient and user-friendly solution.

Check the documentation to learn more about the HiveMQ Enterprise Extension for Amazon Kinesis.

Łukasz Malinowski

Łukasz Malinowski is an Internet of Things (IoT) advisor and trainer. He previously worked at Amazon Web Services, where he helped the world's largest corporations design, implement and secure global IoT solutions. Currently, he conducts independent consulting and training activities, empowering companies to achieve business goals by leveraging modern technologies.

  • Contact Łukasz Malinowski via e-mail

Gaurav Suman

Gaurav Suman, Director of Product Marketing at HiveMQ, has over a decade of experience in roles like Solutions Architect and Business Development Manager. His journey includes launching market-first products and achieving a 2X revenue increase in the past year. Eager to connect with industry peers, Gaurav pushes the boundaries of what Product Marketing can achieve for businesses.

  • Contact Gaurav Suman via e-mail