Conf42: Python 2022


Building a real-time analytics dashboard with Streamlit, Apache Pinot, and Apache Kafka

Mark Needham
Developer Relations Engineer @ StarTree


When you hear “decision-maker”, it’s natural to think “C-suite” or “executive”. But these days, we’re all decision-makers. Restaurant owners, bloggers, big-box shoppers, and diners all have important decisions to make and need instant, actionable insights. To provide these insights to end users like us, businesses need access to fast, fresh analytics.

In this session, we will learn how to build our own real-time analytics application on top of a streaming data source using Apache Kafka, Apache Pinot, and Streamlit. Kafka is the de facto standard for real-time event streaming, Pinot is an OLAP database designed for ultra-low latency analytics, and Streamlit is a Python-based tool that makes it super easy to build data-based apps.

After introducing each of these tools, we’ll stream data into Kafka using its Python client, ingest that data into a Pinot real-time table, and write some basic queries using Pinot’s Python SDK. Once we’ve done that, we’ll glue everything together with an auto-refreshing dashboard in Streamlit so that we can see changes to the data as they happen. There will be lots of graphs and other visualisations!
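As a rough illustration of the pipeline described above, here is a minimal sketch, not the code from the session. It assumes the `confluent-kafka` package as the Kafka Python client and the `pinotdb` package as Pinot's Python driver. The `orders` topic and table, the `order_id`, `amount`, and `ts` fields, and the localhost addresses are all hypothetical.

```python
import json
import random
import time


def make_event(rng):
    """Build one hypothetical order event as a plain dict."""
    return {
        "order_id": rng.randrange(1_000_000),
        "amount": round(rng.uniform(5, 100), 2),
        "ts": int(time.time() * 1000),  # event time in epoch milliseconds
    }


def encode_event(event):
    """Serialise an event to the UTF-8 JSON bytes sent as the Kafka message value."""
    return json.dumps(event).encode("utf-8")


def stream_events(topic="orders", bootstrap="localhost:9092", count=10):
    """Publish `count` events to Kafka (needs confluent-kafka and a running broker)."""
    # Imported lazily so the helpers above can be used without the package installed.
    from confluent_kafka import Producer

    producer = Producer({"bootstrap.servers": bootstrap})
    rng = random.Random()
    for _ in range(count):
        event = make_event(rng)
        producer.produce(topic, key=str(event["order_id"]), value=encode_event(event))
    producer.flush()  # block until all queued messages are delivered


# Hypothetical query against a Pinot real-time table named `orders`;
# ago('PT1M') evaluates to the epoch-millisecond timestamp one minute ago.
RECENT_ORDERS_SQL = """
SELECT count(*) AS orders, sum(amount) AS revenue
FROM orders
WHERE ts > ago('PT1M')
"""


def fetch_recent_orders(host="localhost", port=8099):
    """Run the query through the Pinot broker (needs pinotdb and a running cluster)."""
    from pinotdb import connect

    conn = connect(host=host, port=port, path="/query/sql", scheme="http")
    cursor = conn.cursor()
    cursor.execute(RECENT_ORDERS_SQL)
    return cursor.fetchone()
```

A Streamlit script could then call `fetch_recent_orders()` on each rerun and display the two values with `st.metric`. How the dashboard is refreshed automatically is left open here.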

This session is aimed at application developers and data engineers who want to quickly make sense of streaming data.
