Lessons Learned: Deploying Connected Car Services in Production
Introduction
The automotive industry is investing a lot of time and energy in connected car platforms and services. Automotive manufacturers and key Tier 1 suppliers are all using IoT technologies to create new services. These innovative IoT services offer potential new sources of revenue, improved customer experiences, and telemetry data that can benefit the overall process of automobile design.
Many organizations in the automotive industry are adopting MQTT as the standard for moving data from the car to the cloud. The publish/subscribe pattern of MQTT has proven to be the superior way to deal with connectivity issues on unreliable networks. I believe that MQTT is the only approach that provides the scalability and reliability that the deployment of connected car services in production requires.
Over the last 5 to 6 years, I have worked with over 100 companies that use MQTT to create connected products, including connected cars. In particular, I’ve worked for more than two and a half years with a major automotive manufacturer on the development of their connected car platform using MQTT and HiveMQ. Based on this experience, there are a number of key lessons that I suggest everyone who builds a connected car service should consider.
Testing
Let me start by pointing out a number of observations I made and lessons I learned about testing your connected car deployment.
Scalability Testing the End-to-end Scenarios
A connected car platform involves many individual components that must be integrated together to build the final application. Naturally, the entire system is only as strong as the links between the individual components. This is especially true when it comes to scalability testing.
For instance, it’s easy to test whether an MQTT broker can scale to handle millions of connections and messages. However, if the API of your device-authentication system can’t handle the connection rates that the MQTT broker delivers, the system will fail. Similarly, if a back-end data collection service is set up as an MQTT subscriber, this service is likely to become a bottleneck that needs a solution. In this case, MQTT shared subscriptions can be used to allow multiple instances of the back-end service to run on a cluster.
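To illustrate how shared subscriptions spread the load, here is a minimal Python sketch. It builds a topic filter in the `$share/<group>/<filter>` form that MQTT 5 defines for shared subscriptions, and it models a group in which each message goes to exactly one member. The group name `collectors`, the topic `cars/+/telemetry`, the instance names, and the round-robin strategy are all assumptions for illustration. A real broker picks its own distribution strategy, and this toy class is not a broker.

```python
from collections import defaultdict
from itertools import cycle


def shared_filter(group: str, topic_filter: str) -> str:
    """Build an MQTT shared-subscription filter: $share/<group>/<filter>."""
    # MQTT 5 forbids '/', '+' and '#' in the share name, and it must not be empty.
    if not group or any(ch in group for ch in "/+#"):
        raise ValueError(f"invalid share name: {group!r}")
    if not topic_filter:
        raise ValueError("topic filter must not be empty")
    return f"$share/{group}/{topic_filter}"


class ToySharedGroup:
    """Toy model (not a broker): each message goes to exactly one member."""

    def __init__(self, members):
        self._next_member = cycle(members)  # round-robin is an assumption
        self.delivered = defaultdict(list)

    def dispatch(self, message: str) -> str:
        member = next(self._next_member)
        self.delivered[member].append(message)
        return member


# Hypothetical setup: three instances of one back-end collector service.
subscription = shared_filter("collectors", "cars/+/telemetry")
group = ToySharedGroup(["collector-1", "collector-2", "collector-3"])
for i in range(6):
    group.dispatch(f"telemetry-{i}")
```

Each of the three instances ends up with two of the six messages, which is the effect you want when a single subscriber can no longer keep up.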
The key lesson here is that if you really want to test scalability, you always have to test the end-to-end scenarios.
Testing with Real Devices on a Real Network
To facilitate testing, the automotive industry makes heavy use of simulators. Simulation is fine for testing basic functionality but has limitations when it comes to complete system testing. Many simulators assume a stable network connection. However, stable network connections are a fallacy when you deal with real devices and networks. Real cars that connect through real networks experience network latency and dropped connections. In addition, cars contain complex sets of electronics that do not always behave as intended once they are brought together.
In my experience, testing with real cars over real networks as early in the development cycle as possible is critically important for identifying system issues in a timely manner. The complexity of a connected car makes it impractical to rely solely on simulation.
Debugging
Due to the vast number of connections and messages, and the distributed nature of the systems, debugging a connected car service can be a significant challenge. Many interconnected systems appear as black boxes, which makes it easy to draw hasty conclusions. For instance, it may look like the MQTT broker lost a message, when in fact the log files show that a mobile app sent the message incorrectly or a load balancer forwarded the message to the wrong broker.
Suggestions for making a connected car service less opaque for debugging purposes:
Select architectural components that publish frequent metrics so there is a degree of transparency within a particular service.
Create meaningful dashboards that show application-level data that can be used to trace through an entire system.
Ensure you have centralized logging that can track device interactions and messaging to and from the device.
Note: In early-stage development, logging and data collection often run in a verbose mode that assists initial debugging. However, data-privacy laws in some countries require that unique identifiers, such as the vehicle identification number (VIN) of a car, are removed before production. Turning off verbose mode therefore has to include removing personally identifiable information such as the VIN.
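One way to keep VINs out of centralized logs is to scrub them before a log record leaves the service. The Python sketch below uses a standard `logging.Filter` for this. The placeholder text and the class and function names are my own assumptions. The pattern only checks the shape of a VIN (17 characters, never the letters I, O, or Q), so it also redacts other tokens that happen to have that shape.

```python
import logging
import re

# Shape of a VIN: 17 characters, never the letters I, O or Q.
# This is a shape check only; it does not validate the check digit.
VIN_PATTERN = re.compile(r"\b[A-HJ-NPR-Z0-9]{17}\b")
PLACEHOLDER = "[VIN-REDACTED]"  # assumed placeholder text


def redact_vins(text: str) -> str:
    """Replace every VIN-shaped token in the text with a placeholder."""
    return VIN_PATTERN.sub(PLACEHOLDER, text)


class VinRedactionFilter(logging.Filter):
    """Logging filter that scrubs VIN-shaped tokens from each record."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Merge %-style arguments first so VINs passed as args are caught too.
        record.msg = redact_vins(record.getMessage())
        record.args = None
        return True  # keep the record, only its text changes


# Hypothetical logger name for a production service.
logger = logging.getLogger("connected_car.demo")
logger.addFilter(VinRedactionFilter())
```

Attaching the filter to the handler that ships logs to the central store, instead of to a single logger, would cover records from every logger that uses that handler.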
Security
Overall, a system is only as secure as each layer in its stack. For MQTT-based systems, the recommendation is to only allow traffic to the ports that MQTT requires (typically port 1883 for plain TCP and port 8883 for MQTT over TLS). For the MQTT broker, I suggest that you consider the following:
Use TLS for encryption and authentication to ensure that the data is private.
Implement authentication and authorization processes for specific topic namespaces to ensure that devices and data are only accessible based on predefined security policies.
Apply MQTT-connection throttling to protect against malicious or involuntary Denial of Service scenarios that large bursts of connection attempts can cause.
Add application-layer restrictions for MQTT, such as a maximum message size that fits your use case. With this limit in place, a client finds out right away when it tries to send a message that is too large for its use case.
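The last three suggestions can be sketched in a few lines of Python. This is an illustration of the ideas, not the API of any particular broker. The `cars/<client_id>/` topic namespace, the 256 KiB limit, and all names are assumptions. The throttle is a plain token bucket with an injectable clock so that it can be tested without waiting.

```python
import time

MAX_PAYLOAD_BYTES = 256 * 1024  # assumed limit; pick one that fits your use case


def may_publish(client_id: str, topic: str, payload: bytes) -> bool:
    """Allow a publish only inside the client's own namespace and size limit."""
    # Assumed policy: a car may only publish below cars/<its own client id>/.
    in_own_namespace = topic.startswith(f"cars/{client_id}/")
    return in_own_namespace and len(payload) <= MAX_PAYLOAD_BYTES


class ConnectionThrottle:
    """Token bucket: about `rate` new connections per second, bursts up to `burst`."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self._rate = rate
        self._burst = burst
        self._clock = clock
        self._tokens = float(burst)  # start with a full bucket
        self._last = clock()

    def allow(self) -> bool:
        """Return True if one more connection attempt may proceed right now."""
        now = self._clock()
        # Refill in proportion to the elapsed time, capped at the burst size.
        refill = (now - self._last) * self._rate
        self._tokens = min(self._burst, self._tokens + refill)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


throttle = ConnectionThrottle(rate=100.0, burst=500)
accepted = throttle.allow() and may_publish("car-42", "cars/car-42/telemetry", b"{}")
```

A burst of reconnects, for example after a cellular outage, drains the bucket quickly, and further attempts are refused until it refills. This protects the systems behind the broker, such as the device-authentication API.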
Network Timeout Considerations
Keep Alive is a very useful MQTT feature that lets developers set the maximum interval that may pass without any traffic between a client and the broker. When the connection is otherwise idle, the client sends a PING to keep it open. However, MQTT messages are typically sent over a network whose components, such as routers, firewalls, and load balancers, have their own timeouts. For example, the default timeout for a connection with the AWS load balancer is 3 minutes. Even if Keep Alive is set to 20 minutes, an idle connection between a device and an MQTT broker running on AWS will only last 3 minutes, because the AWS load balancer closes it.
Specific to connected cars are the timeouts that are implicit in NAT firewalls. Due to NAT firewall rules, many of the IoT SIM cards that cellular networks provide have a default of 3 minutes before they close any open connections.
The key lesson for network timeouts is that you need to create a detailed overview of network architecture early in the development process. It is imperative that you take the time to understand the potential timeout points in your network and synchronize the heartbeats accordingly.
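Once the timeout points are known, choosing a heartbeat is simple arithmetic. The Python sketch below picks a Keep Alive value that stays below the tightest idle timeout on the path. The 3-minute values come from the examples above. The function name, the dictionary keys, and the safety factor of 0.5 are assumptions, not recommendations from any specification.

```python
def choose_keep_alive(requested_s: int, idle_timeouts_s: dict,
                      safety_factor: float = 0.5) -> int:
    """Return a Keep Alive (seconds) that stays below every idle timeout."""
    if not 0 < safety_factor <= 1:
        raise ValueError("safety_factor must be in (0, 1]")
    if not idle_timeouts_s:
        return requested_s  # nothing known about the path, keep the request
    tightest = min(idle_timeouts_s.values())
    # Stay well below the tightest timeout so that a delayed PING still
    # arrives before a middlebox drops the connection.
    ceiling = max(1, int(tightest * safety_factor))
    if requested_s <= 0:
        # In MQTT, a Keep Alive of 0 disables the mechanism; use the ceiling.
        return ceiling
    return min(requested_s, ceiling)


# Idle timeouts in seconds, taken from the 3-minute examples in the text.
network_path = {"nat_firewall": 180, "load_balancer": 180}
keep_alive = choose_keep_alive(20 * 60, network_path)  # 90
```

With a requested Keep Alive of 20 minutes and two 3-minute timeouts on the path, the sketch settles on 90 seconds, so a PING crosses every middlebox long before it considers the connection idle.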
Organizational Hurdles
A connected-car platform project introduces a number of organizational issues that need to be addressed. The key challenge is that the complexity of these types of systems requires the involvement of many different teams and vendors. A connected car project can include teams and vendors who work on hardware development, manufacturing, networks, device software, cloud software, security, etc. Typically, the technology used is so new that barely any prior experience is available within the institution.
Here are some suggestions on how to address your organizational hurdles:
At the operational level, clearly define the points of contact for each team and vendor. Make sure that the people who are doing the work have contact with the other teams.
Ensure there is a single owner for the entire system.
Coordinate system requirements and capacities early in the process so that teams can plan in advance.
Integrate experts into the teams to provide efficient transfer of knowledge.
Invest in architectural discussions and planning.
Choose software that is supported and proven.
Conclusion
Connected car platforms and services are an important area of innovation for the automotive industry. It is clear that IoT is ready for production deployments of connected car services. Today, companies are rolling out reliable and scalable production systems that are built using MQTT.
These lessons learned are based on my experience working with many companies in the automotive industry and beyond. The key to these successful engagements is that these companies take the time to acquire the knowledge and expertise they need to build their connected car service and invest early in a holistic approach to architecture, testing, and deployment. By taking a holistic approach, these companies are able to deploy high-quality new services that meet and exceed the expectations of their customers.
HiveMQ works with some of the major automotive manufacturers to enable their connected car services. For more information about our Automotive Solutions, check out the white papers and customer stories on our website.