Practical Application of LLM with MQTT, Google Gemini, and Unified Namespace for IIoT
Now that we have a basic understanding of AI through LLMs and UNS, let's look at the technical side and how to implement them.
While several techniques are commonly used to interact with and/or optimize LLMs, such as Fine-Tuning and sampling methods like Top-k and Top-p Sampling, we will focus on the most basic one: Prompt Engineering.
Prompt engineering (also called prompt design) for Large Language Models (LLMs) involves crafting well-defined input cues or instructions, known as prompts, to guide the model towards generating desired outputs. Prompt engineering is crucial for eliciting specific responses from LLMs and shaping their behavior to suit particular tasks or applications.
Here's a breakdown of the prompt engineering steps:
Defining the Prompt: The prompt serves as the initial input provided to the LLM to initiate the generation process. It can be a single sentence, a paragraph, or even a series of questions or directives. The content of the prompt sets the context and provides guidance for the model on what type of response is expected.
Providing Context: Effective prompts provide sufficient context for the LLM to understand the task or query at hand. They may include relevant information about the topic, background details, or specific cues that direct the model's attention towards certain aspects of the input.
Tailoring for Specific Tasks: Depending on the application, prompts can be tailored to elicit responses that are relevant to a particular task or domain. For example, prompts for language translation tasks may include source language sentences to be translated, while prompts for text summarization tasks may include longer passages of text to be condensed.
Guiding Generation: Prompts guide the generation process by influencing the model's decision-making at each step. They help steer the model towards producing outputs that align with the intended meaning or purpose conveyed in the prompt. Carefully crafted prompts can encourage the model to exhibit desired behaviors, such as generating coherent, informative, or creative responses.
Iterative Refinement: Prompt engineering often involves an iterative process of refinement, where users experiment with different prompts and observe the model's responses. By analyzing the generated outputs and adjusting the prompts accordingly, users can iteratively improve the quality and relevance of the model's responses over time.
Adapting to Model Capabilities: Prompts should be designed with an understanding of the capabilities and limitations of the LLM being used. For instance, prompts for smaller models may need to be more concise and focused, while prompts for larger models can afford to provide more detailed instructions or context.
Prompt engineering plays a crucial role in harnessing the capabilities of LLMs effectively. By crafting thoughtful and contextually rich prompts, we can guide LLMs to produce outputs that meet our specific requirements and achieve desired outcomes across a wide range of tasks and applications.
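To make the iterative refinement step concrete, compare a vague prompt with a refined one that adds a role, context, and constraints (a generic illustration, not specific to our UNS use case):

```
Vague:   "Summarize this report."

Refined: "You are a maintenance engineer. Summarize the following shift
          report in three bullet points, focusing on equipment alarms
          and unresolved issues: <report text>"
```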
Our Architecture to Build a UNS Chat
In short, we listen to changes in our UNS in real-time from our frontend. When Gemini is asked a question, we prepare a prompt in which we transmit the context of the request and the snapshot of our UNS. This approach gives Gemini a better understanding of what we are asking it and enables it to give a more relevant response.
Let’s Start with Google Gemini Access Setup
To be able to query Gemini via its API, we assume that you already have an active account on Google Cloud. If that's not the case, you can get some free credits by signing up for the trial.
We then need to perform a few configuration steps in order to access the Google API, which allows us to query Gemini Pro. Since Google has already documented everything, I will just share the links so you can follow the guides:
At the end of this process, you should have a JSON file that you will rename to service-principal-gemini-credential.json with the API key and Vertex enabled on your Google project.
Build the API
As I'm a fan of the simplicity of implementing an API with Node.js, we're going to create a simple API exposing an endpoint through which we'll send our requests to Gemini. The API will take care of the interaction with Gemini Pro, including collecting the different pieces of the response and assembling them before sending the result back to our client interface.
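The complete implementation is in the GitHub repository; below is a minimal sketch of such a server, assuming an Express app and a hypothetical `askGemini` helper (both names are mine, not necessarily the project's):

```javascript
// server.js -- minimal Express API sketch (illustrative, not the exact project code)
const express = require('express');
const helmet = require('helmet');   // sets security-related HTTP headers
const morgan = require('morgan');   // HTTP request logging
const cors = require('cors');       // cross-origin resource sharing
require('dotenv').config();

const { askGemini } = require('./helpers/gemini'); // our Gemini helper (see below)

const app = express();
app.use(helmet());
app.use(morgan('combined'));
app.use(cors());
app.use(express.json());

// Single endpoint that forwards the assembled prompt to Gemini Pro
app.post('/api/chat', async (req, res) => {
  try {
    const { prompt } = req.body;
    if (!prompt) {
      return res.status(400).json({ error: 'Missing "prompt" in request body' });
    }
    const answer = await askGemini(prompt);
    res.json({ answer });
  } catch (err) {
    console.error(err);
    res.status(500).json({ error: 'Failed to query Gemini' });
  }
});

const port = process.env.PORT || 3000;
app.listen(port, () => console.log(`API listening on port ${port}`));
```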
As you can see, we also integrate a few middleware components, helmet, morgan, and cors, to add basic security headers, request logging, and cross-origin support to our API.
I’m also creating a helper that handles the interaction with Google API.
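A sketch of that helper, using the @google-cloud/vertexai Node.js SDK, could look like the following. The model name, project, and location values are assumptions, and depending on the SDK version the model factory may live under `vertexAI.preview`. As mentioned above, we collect the streamed chunks and assemble them into a single answer:

```javascript
// helpers/gemini.js -- sketch of a Gemini Pro helper (model/location values are assumptions)
const { VertexAI } = require('@google-cloud/vertexai');

// Credentials are picked up from the GOOGLE_APPLICATION_CREDENTIALS env variable
const vertexAI = new VertexAI({
  project: process.env.GOOGLE_PROJECT_ID,
  location: process.env.GOOGLE_LOCATION || 'us-central1',
});

// Older SDK versions expose this as vertexAI.preview.getGenerativeModel(...)
const generativeModel = vertexAI.getGenerativeModel({ model: 'gemini-pro' });

// Sends the prompt, collects the streamed chunks, and assembles the full answer
async function askGemini(prompt) {
  const result = await generativeModel.generateContentStream({
    contents: [{ role: 'user', parts: [{ text: prompt }] }],
  });

  let answer = '';
  for await (const chunk of result.stream) {
    const part = chunk.candidates?.[0]?.content?.parts?.[0];
    if (part?.text) {
      answer += part.text;
    }
  }
  return answer;
}

module.exports = { askGemini };
```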
Before you start the API, you will also need to create a .env file where you will define the local variables needed:
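For instance (the variable names here are mine and should match whatever your code reads):

```
# .env -- illustrative variable names for the API
PORT=3000
GOOGLE_PROJECT_ID=your-gcp-project-id
GOOGLE_LOCATION=us-central1
GOOGLE_APPLICATION_CREDENTIALS=./service-principal-gemini-credential.json
```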
Create a Simple Chat Frontend
As always, an API is nothing without a consumer. So we're going to prepare a little interface in React so that we can talk to our AI.
I'm not going to explain all the code, but broadly speaking we're creating a chat component that will carry the history of interactions with the API. When the page is loaded, we open a WebSocket connection to our HiveMQ cluster in order to subscribe to the information in our UNS and send it to the API. We also prepare the prompt to give some context to Google Gemini Pro.
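A condensed sketch of that wiring, using the MQTT.js client over WebSocket, could look like this (the topic hierarchy, environment variable names, and component structure are assumptions; the full component is in the repository):

```javascript
// Chat.jsx -- simplified sketch of the chat component's MQTT wiring
import { useEffect, useRef, useState } from 'react';
import mqtt from 'mqtt';

export default function Chat() {
  const [messages, setMessages] = useState([]); // history of questions and answers
  const unsSnapshot = useRef({});               // latest payload per UNS topic

  useEffect(() => {
    // Open a secure WebSocket connection to the HiveMQ cluster
    const client = mqtt.connect(process.env.REACT_APP_MQTT_URL, {
      username: process.env.REACT_APP_MQTT_USERNAME,
      password: process.env.REACT_APP_MQTT_PASSWORD,
    });

    // Subscribe to the UNS root topic (placeholder hierarchy)
    client.on('connect', () => client.subscribe('enterprise/#'));

    // Keep an up-to-date snapshot of the UNS to embed in each prompt
    client.on('message', (topic, payload) => {
      unsSnapshot.current[topic] = payload.toString();
    });

    return () => client.end(); // clean up on unmount
  }, []);

  return null; // chat UI (history rendering, input box) omitted from this sketch
}
```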
The context in the designed prompt looks like the following:
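The exact wording ships with the source code; it is along these lines (an illustrative reconstruction, not the verbatim prompt):

```
You are an assistant for an Industrial IoT Unified Namespace (UNS).
You will receive a snapshot of the UNS as JSON, where each key is an
MQTT topic and each value is the latest payload published on that topic.
Answer the user's question using only the data contained in the snapshot.
```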
When querying our API, we pack together the context, the UNS data in JSON format, and the user's question:
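Continuing the sketch above (names remain assumptions), the request could be assembled like this:

```javascript
// Illustrative assembly of the final prompt and the POST to our API
const prompt = [
  context,                                      // the instructions shown above
  'UNS snapshot:',
  JSON.stringify(unsSnapshot.current, null, 2), // latest UNS data as JSON
  `Question: ${question}`,                      // the user's input
].join('\n\n');

const response = await fetch(`${process.env.REACT_APP_API_URL}/api/chat`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ prompt }),
});
const { answer } = await response.json();       // Gemini's assembled reply
```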
The magic happens, and we get the answer from our AI.
Before you start the frontend, you will need to define local environment variables to be able to connect to your broker:
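For example, following Create React App conventions (the variable names are assumptions):

```
# .env -- illustrative variable names for the frontend
REACT_APP_MQTT_URL=wss://your-cluster.hivemq.cloud:8884/mqtt
REACT_APP_MQTT_USERNAME=your-username
REACT_APP_MQTT_PASSWORD=your-password
REACT_APP_API_URL=http://localhost:3000
```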
Let’s Test It
We now have all the pieces in place to start asking questions about our UNS data to Gemini.
As an example, we know that we have multiple silos publishing telemetry on temperature and humidity. We can easily ask for the highest temperature or humidity, the average across all silos, or whether we have any active alarms.
Wrap Up
The integration of Google Gemini with UNS represents a significant stride towards unlocking the full potential of distributed systems and IoT deployments. By combining Google's powerful Gemini language model with the efficient and standardized communication protocol of MQTT along with Unified Namespace, organizations can harness the power of natural language understanding to create intelligent and context-aware applications.
As we look towards the future, the convergence of advanced language models like Gemini and robust communication protocols like MQTT and Unified Namespace holds immense promise for transforming how we interact with technology and leverage data in distributed systems. By embracing this integration and harnessing its capabilities, organizations can unlock new levels of efficiency, productivity, and intelligence in their IoT deployments and beyond.
The journey towards fully realizing the potential of Google Gemini and UNS integration is just beginning. We invite you to explore further, experiment, and innovate as we continue to push the boundaries of what's possible at the exciting intersection of natural language processing and distributed computing.
Source code: As always, I share the source code, which you’ll find on GitHub.
Anthony Olazabal
Anthony is part of the Solutions Engineering team at HiveMQ. He is a technology enthusiast with many years of experience working in infrastructures and development around Azure cloud architectures. His expertise extends to development and cloud technologies, with a keen interest in IaaS, PaaS, and SaaS services, as well as writing about MQTT and IoT.