Meet Waterstream

Waterstream is a full-featured MQTT broker running on any Kafka-compatible platform via native Kafka consumers and producers.

Why Waterstream

Waterstream is an MQTT broker that uses Kafka as its sole storage and distribution engine, combining the most popular IoT protocol with the de facto standard streaming API.

MQTT

MQTT is the most popular IoT protocol for very good reasons: it’s lightweight, client libraries exist for all major programming languages, and it’s built for poor-connectivity scenarios such as mobile networks. But it’s not designed for stream processing and does not support the reprocessing of events.

Apache Kafka

Apache Kafka is a streaming platform that enables large scale, high availability, long-term storage, and seamless integration with other technologies. But it’s not designed for the IoT: it requires a stable network, is not built to handle tens of thousands of client connections, and does not support IoT-specific features such as Keep Alive or Last Will.

The best of both ecosystems

Waterstream gives you the best of both ecosystems in one, making MQTT and Kafka streaming work together with high scalability, millions of connections, real-time stream processing, and easy integration with databases, key-value stores, search indexes, and file systems.

How it works

Incoming MQTT messages from clients as well as client-state are saved directly into Kafka without intermediate storage.

Waterstream can read records from Kafka topics and then send them to clients over MQTT or WebSockets.

Every Waterstream node is stateless because everything is stored in Kafka. This allows low latency and excellent scalability.

Reference architecture

A typical scenario requires deploying multiple Waterstream instances, also called nodes, to provide fault tolerance and scalability. Waterstream nodes do not store any information: everything is persisted in Kafka, so nodes can be added or removed dynamically according to the load. A load balancer is required between MQTT clients and Waterstream nodes to distribute network traffic.

Waterstream persists incoming MQTT messages to the configured Kafka topics. Once in Kafka, the data can be consumed by any Kafka client, such as a Kafka consumer, Kafka Connect, or a Kafka Streams application. Kafka producers can send messages back to the MQTT clients by writing to designated Kafka topics.
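To make the idea of routing MQTT messages to configured Kafka topics more concrete, here is a minimal sketch of one way such a mapping could work: each rule pairs an MQTT topic filter with a Kafka topic, and the first matching rule wins. The rule list, the topic names, and the functions `matches` and `kafka_topic_for` are hypothetical illustrations, not Waterstream's actual configuration syntax or API. The matcher follows the standard MQTT wildcard rules (`+` for one level, a trailing `#` for everything below) but ignores the special handling of topics that start with `$`.

```python
def matches(topic_filter: str, topic: str) -> bool:
    """MQTT-style filter match: '+' is exactly one level, '#' is any remainder."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True  # also matches the parent level, e.g. "alerts/#" matches "alerts"
        if i >= len(topic_levels):
            return False  # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


# Hypothetical mapping rules: (MQTT topic filter, Kafka topic). First match wins.
RULES = [
    ("sensors/+/temperature", "iot_temperature"),
    ("alerts/#", "iot_alerts"),
]
DEFAULT_KAFKA_TOPIC = "mqtt_messages"  # assumed fallback topic


def kafka_topic_for(mqtt_topic: str, rules=RULES, default=DEFAULT_KAFKA_TOPIC) -> str:
    """Return the Kafka topic an incoming MQTT message would be written to."""
    for topic_filter, kafka_topic in rules:
        if matches(topic_filter, mqtt_topic):
            return kafka_topic
    return default


for t in ("sensors/kitchen/temperature", "alerts/fire/building-2", "devices/x"):
    print(t, "->", kafka_topic_for(t))
```

Under this scheme, a Kafka Streams job interested only in temperature readings would subscribe to a single Kafka topic, regardless of how many devices publish under `sensors/`.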

Waterstream provides integrated observability through Prometheus and Grafana. Custom metrics solutions can be added through a plugin system.

Manage millions of clients

Waterstream scales out linearly. For most operations, its nodes don’t depend on each other and more nodes can be added to support an increasing number of clients.

Several scalability tests have been run to measure and tune Waterstream performance. As shown in the graph below, Waterstream was able to manage more than one million connected devices using only 12 nodes of modest computing power (2 CPUs, 7.5 GB RAM each).
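As a rough back-of-the-envelope aid, the test figures above can be turned into a node estimate. The sketch below assumes that scaling stays linear at the tested ratio and treats "more than one million" as exactly 1,000,000, so the per-node figure is a conservative lower bound. The function name and the headroom parameter are illustrative, and real sizing also depends on message rates and payload sizes, which this ignores.

```python
CLIENTS_IN_TEST = 1_000_000  # lower bound taken from the scalability test
NODES_IN_TEST = 12           # nodes with 2 CPUs and 7.5 GB RAM each

# About 83,333 connected clients per node in the test setup.
CLIENTS_PER_NODE = CLIENTS_IN_TEST // NODES_IN_TEST


def nodes_needed(target_clients: int, headroom_pct: int = 0) -> int:
    """Estimate node count for target_clients, assuming linear scale-out.

    headroom_pct adds spare capacity, e.g. 20 means plan for 20% more clients.
    Integer arithmetic avoids floating-point rounding in the ceiling.
    """
    numerator = target_clients * (100 + headroom_pct) * NODES_IN_TEST
    denominator = 100 * CLIENTS_IN_TEST
    return -(-numerator // denominator)  # ceiling division


print(nodes_needed(2_500_000))      # 2.5 million clients at the tested ratio
print(nodes_needed(2_500_000, 20))  # same target with 20% headroom
```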

Deploy everywhere with any Kafka-compatible platform

Waterstream is distributed as a Docker image (x86/ARM64) with minimal RAM and CPU requirements. It can be deployed at the edge, on-premises, and in the cloud, as a standalone process or inside a Kubernetes cluster. To learn more, see our documentation.

Waterstream requires Apache Kafka version 1.1.0 or later. Several Kafka distributions meet this requirement, such as Confluent Cloud and IBM Event Streams. Waterstream is a Confluent Verified Integration, meaning it meets the standard quality and functional requirements for working with Confluent Cloud.

Waterstream also works with alternative implementations of the Kafka protocol, such as Redpanda. To learn more, check out our Redpanda integration demo.

Ready to get started?

Request a demo or talk to our technical sales team to get your questions answered.