Practical Application of LLM with MQTT, Google Gemini, and Unified Namespace for IIoT
Now that we have a basic understanding of AI through LLMs and UNS, let's look at the technical side and how to implement them.
While several techniques are commonly used to interact with and/or optimize LLMs, such as Fine-Tuning and sampling methods like Top-k and Top-p Sampling, we will focus on the most basic one: Prompt Engineering.
Prompt engineering (also called prompt design) for Large Language Models (LLMs) involves crafting well-defined input cues or instructions, known as prompts, to guide the model towards generating desired outputs. Prompt engineering is crucial for eliciting specific responses from LLMs and shaping their behavior to suit particular tasks or applications.
Here's a breakdown of the prompt engineering steps:
Defining the Prompt: The prompt serves as the initial input provided to the LLM to initiate the generation process. It can be a single sentence, a paragraph, or even a series of questions or directives. The content of the prompt sets the context and provides guidance for the model on what type of response is expected.
Providing Context: Effective prompts provide sufficient context for the LLM to understand the task or query at hand. They may include relevant information about the topic, background details, or specific cues that direct the model's attention towards certain aspects of the input.
Tailoring for Specific Tasks: Depending on the application, prompts can be tailored to elicit responses that are relevant to a particular task or domain. For example, prompts for language translation tasks may include source language sentences to be translated, while prompts for text summarization tasks may include longer passages of text to be condensed.
Guiding Generation: Prompts guide the generation process by influencing the model's decision-making at each step. They help steer the model towards producing outputs that align with the intended meaning or purpose conveyed in the prompt. Carefully crafted prompts can encourage the model to exhibit desired behaviors, such as generating coherent, informative, or creative responses.
Iterative Refinement: Prompt engineering often involves an iterative process of refinement, where users experiment with different prompts and observe the model's responses. By analyzing the generated outputs and adjusting the prompts accordingly, users can iteratively improve the quality and relevance of the model's responses over time.
Adapting to Model Capabilities: Prompts should be designed with an understanding of the capabilities and limitations of the LLM being used. For instance, prompts for smaller models may need to be more concise and focused, while prompts for larger models can afford to provide more detailed instructions or context.
Prompt engineering plays a crucial role in harnessing the capabilities of LLMs effectively. By crafting thoughtful and contextually rich prompts, we can guide LLMs to produce outputs that meet our specific requirements and achieve desired outcomes across a wide range of tasks and applications.
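To make the iterative refinement step concrete, compare a vague prompt with a refined one that adds a role, context, and constraints (a generic illustration, not specific to our UNS use case):

```
Vague:   "Summarize this report."

Refined: "You are a maintenance engineer. Summarize the following shift
          report in three bullet points, focusing on equipment alarms
          and unresolved issues: <report text>"
```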
Our Architecture to Build a UNS Chat
In short, we listen to changes in our UNS in real-time from our frontend. When Gemini is asked a question, we prepare a prompt in which we transmit the context of the request and the snapshot of our UNS. This approach gives Gemini a better understanding of what we are asking it and enables it to give a more relevant response.
Let’s Start with Google Gemini Access Setup
To be able to query Gemini via its API, we assume that you already have an active account on Google Cloud. If that's not the case, you can get some free credits by signing up for the trial.
We then need to perform a few configuration steps in order to access the Google API, which allows us to query Gemini Pro. Since Google has already documented everything, I will just share the links so you can follow the guides:
At the end of this process, you should have a JSON file that you will rename to service-principal-gemini-credential.json with the API key and Vertex enabled on your Google project.
Build the API
As I'm a fan of the simplicity of implementing an API with Node.js, we're going to create a simple API exposing an endpoint through which we'll send our requests to Gemini. The API will take care of the interaction with Gemini Pro, including collecting the different pieces of the response and assembling them before sending the result back to our client interface.
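The complete implementation is in the GitHub repository; below is a minimal sketch of such a server, assuming an Express app and a hypothetical `askGemini` helper (both names are mine, not necessarily the project's):

```javascript
// server.js -- minimal Express API sketch (illustrative, not the exact project code)
const express = require('express');
const helmet = require('helmet');   // sets security-related HTTP headers
const morgan = require('morgan');   // HTTP request logging
const cors = require('cors');       // cross-origin resource sharing
require('dotenv').config();

const { askGemini } = require('./helpers/gemini'); // our Gemini helper (see below)

const app = express();
app.use(helmet());
app.use(morgan('combined'));
app.use(cors());
app.use(express.json());

// Single endpoint that forwards the assembled prompt to Gemini Pro
app.post('/api/chat', async (req, res) => {
  try {
    const { prompt } = req.body;
    if (!prompt) {
      return res.status(400).json({ error: 'Missing "prompt" in request body' });
    }
    const answer = await askGemini(prompt);
    res.json({ answer });
  } catch (err) {
    console.error(err);
    res.status(500).json({ error: 'Failed to query Gemini' });
  }
});

const port = process.env.PORT || 3000;
app.listen(port, () => console.log(`API listening on port ${port}`));
```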
As you can see, we also integrate a few middleware components, helmet, morgan, and cors, to add basic security headers, request logging, and cross-origin support to our API.
I’m also creating a helper that handles the interaction with Google API.
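A sketch of that helper, using the @google-cloud/vertexai Node.js SDK, could look like the following. The model name, project, and location values are assumptions, and depending on the SDK version the model factory may live under `vertexAI.preview`. As mentioned above, we collect the streamed chunks and assemble them into a single answer:

```javascript
// helpers/gemini.js -- sketch of a Gemini Pro helper (model/location values are assumptions)
const { VertexAI } = require('@google-cloud/vertexai');

// Credentials are picked up from the GOOGLE_APPLICATION_CREDENTIALS env variable
const vertexAI = new VertexAI({
  project: process.env.GOOGLE_PROJECT_ID,
  location: process.env.GOOGLE_LOCATION || 'us-central1',
});

// Older SDK versions expose this as vertexAI.preview.getGenerativeModel(...)
const generativeModel = vertexAI.getGenerativeModel({ model: 'gemini-pro' });

// Sends the prompt, collects the streamed chunks, and assembles the full answer
async function askGemini(prompt) {
  const result = await generativeModel.generateContentStream({
    contents: [{ role: 'user', parts: [{ text: prompt }] }],
  });

  let answer = '';
  for await (const chunk of result.stream) {
    const part = chunk.candidates?.[0]?.content?.parts?.[0];
    if (part?.text) {
      answer += part.text;
    }
  }
  return answer;
}

module.exports = { askGemini };
```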
Before you start the API, you will also need to create a .env file where you will define the local variables needed:
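For instance (the variable names here are mine and should match whatever your code reads):

```
# .env -- illustrative variable names for the API
PORT=3000
GOOGLE_PROJECT_ID=your-gcp-project-id
GOOGLE_LOCATION=us-central1
GOOGLE_APPLICATION_CREDENTIALS=./service-principal-gemini-credential.json
```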
Create a Simple Chat Frontend
As always, an API is nothing without a consumer. So we're going to prepare a little interface in React so that we can talk to our AI.
I'm not going to explain all the code, but broadly speaking we're creating a chat component that will carry the history of interactions with the API. When the page is loaded, we open a WebSocket connection to our HiveMQ cluster in order to subscribe to the information in our UNS and send it to the API. We also prepare the prompt to give some context to Google Gemini Pro.
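A condensed sketch of that wiring, using the MQTT.js client over WebSocket, could look like this (the topic hierarchy, environment variable names, and component structure are assumptions; the full component is in the repository):

```javascript
// Chat.jsx -- simplified sketch of the chat component's MQTT wiring
import { useEffect, useRef, useState } from 'react';
import mqtt from 'mqtt';

export default function Chat() {
  const [messages, setMessages] = useState([]); // history of questions and answers
  const unsSnapshot = useRef({});               // latest payload per UNS topic

  useEffect(() => {
    // Open a secure WebSocket connection to the HiveMQ cluster
    const client = mqtt.connect(process.env.REACT_APP_MQTT_URL, {
      username: process.env.REACT_APP_MQTT_USERNAME,
      password: process.env.REACT_APP_MQTT_PASSWORD,
    });

    // Subscribe to the UNS root topic (placeholder hierarchy)
    client.on('connect', () => client.subscribe('enterprise/#'));

    // Keep an up-to-date snapshot of the UNS to embed in each prompt
    client.on('message', (topic, payload) => {
      unsSnapshot.current[topic] = payload.toString();
    });

    return () => client.end(); // clean up on unmount
  }, []);

  return null; // chat UI (history rendering, input box) omitted from this sketch
}
```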
The context in the designed prompt looks like the following:
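The exact wording ships with the source code; it is along these lines (an illustrative reconstruction, not the verbatim prompt):

```
You are an assistant for an Industrial IoT Unified Namespace (UNS).
You will receive a snapshot of the UNS as JSON, where each key is an
MQTT topic and each value is the latest payload published on that topic.
Answer the user's question using only the data contained in the snapshot.
```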
When querying our API, we pack together the context, the UNS data in JSON format, and the user's question:
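Continuing the sketch above (names remain assumptions), the request could be assembled like this:

```javascript
// Illustrative assembly of the final prompt and the POST to our API
const prompt = [
  context,                                      // the instructions shown above
  'UNS snapshot:',
  JSON.stringify(unsSnapshot.current, null, 2), // latest UNS data as JSON
  `Question: ${question}`,                      // the user's input
].join('\n\n');

const response = await fetch(`${process.env.REACT_APP_API_URL}/api/chat`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ prompt }),
});
const { answer } = await response.json();       // Gemini's assembled reply
```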
The magic happens, and we get the answer from our AI.
Before you start the frontend, you will need to define local environment variables to be able to connect to your broker:
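For example, following Create React App conventions (the variable names are assumptions):

```
# .env -- illustrative variable names for the frontend
REACT_APP_MQTT_URL=wss://your-cluster.hivemq.cloud:8884/mqtt
REACT_APP_MQTT_USERNAME=your-username
REACT_APP_MQTT_PASSWORD=your-password
REACT_APP_API_URL=http://localhost:3000
```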
Let’s Test It
We now have all the pieces in place to start asking questions about our UNS data to Gemini.
As an example, we know that we have multiple silos publishing telemetry on temperature and humidity. We can easily ask for the highest temperature or humidity, the average across all silos, or whether we have any active alarms.
Wrap Up
The integration of Google Gemini with UNS represents a significant stride towards unlocking the full potential of distributed systems and IoT deployments. By combining Google's powerful Gemini language model with the efficient and standardized communication protocol of MQTT along with Unified Namespace, organizations can harness the power of natural language understanding to create intelligent and context-aware applications.
As we look towards the future, the convergence of advanced language models like Gemini and robust communication protocols like MQTT and Unified Namespace holds immense promise for transforming how we interact with technology and leverage data in distributed systems. By embracing this integration and harnessing its capabilities, organizations can unlock new levels of efficiency, productivity, and intelligence in their IoT deployments and beyond.
The journey towards fully realizing the potential of Google Gemini and UNS integration is just beginning. We invite you to explore further, experiment, and innovate as we continue to push the boundaries of what's possible at the exciting intersection of natural language processing and distributed computing.
Source code: As always, I share the source code, which you’ll find on GitHub.
Anthony Olazabal
Anthony is part of the Solutions Engineering team at HiveMQ. He is a technology enthusiast with many years of experience working in infrastructures and development around Azure cloud architectures. His expertise extends to development and cloud technologies, with a keen interest in IaaS, PaaS, and SaaS services, as well as writing about MQTT and IoT.