Courage Noko has over 12 years of experience in data engineering. He started at the New York Times, where he was a Hadoop administrator and developer, maintaining Hadoop clusters and writing MapReduce pipelines in Java and Pig. He also designed and implemented the first real-time recommendation engine for nytimes.com.
He has led a team at Spotify to build a scalable real-time streaming platform that allows engineers to write real-time recommendation engines, receive hundreds of millions of live events per second from streaming devices, and run real-time performance diagnostics. Over time, he has worked on several streaming frameworks: Apache Beam/Dataflow, Spark, Flink, and Storm.
Lately, he has taken an interest in real-time analytics, leading teams to build and deploy platforms that support OLAP databases such as Druid, ClickHouse, and Pinot.
Knowledge of basic data processing operations (filter, map, group-by, sum, count).
Data Scientists, Data Engineers and ML Engineers
Batch vs Streaming data - comparing the main differences between bounded and unbounded data processing.
Real-world examples - a look at some real-world examples of real-time stream processing.
Overview of streaming architecture - a detailed look at the components that form the core of a streaming infrastructure.
Examples of streaming engines.
Event processing - the flow of events through a streaming pipeline.
Stream window operations - a detailed look at how different execution engines handle aggregations.
Stream Joins - joining multiple sources to streaming events.
Unit tests for streaming pipelines.
Real-world streaming project covering concepts in modules 1-3.
Monitoring streaming jobs.
Best practices.
Review of the real-world streaming project from module 3.