Conf42: Machine Learning 2021

...

Hail Hydrate! From Stream to Lake

Tim Spann
Developer Advocate @ StreamNative

A cloud data lake that is empty is not useful to anyone.

How can you quickly, scalably and reliably fill your cloud data lake with diverse sources of data you already have and new ones you never imagined you needed? Built from open source Apache tools, the FLaNK stack enables any data engineer, programmer or analyst to build reusable modules with low or no code.

In this talk we will use Apache NiFi, Apache Pulsar, Apache Flink and MiNiFi agents to load CDC, logs, REST, XML, images, PDFs, documents, text, semi-structured data, unstructured data, structured data and a hundred other data sources you could never dream of streaming before.

I will teach you how to fish in the deep end of the lake and return as a data engineering hero. Let’s hope everyone is ready to go from 0 to Petabyte hero.
