Stream Processing
Learn how to efficiently analyze massive data sets, create data pipelines, and gain insights in real time. We'll cover tools like Apache Kafka, Apache Flink, and Apache Spark Streaming, as well as streaming data architectures.
Courage has over 12 years of experience in data engineering. He started at The New York Times, where he was a Hadoop administrator and developer, maintaining Hadoop clusters and writing MapReduce pipelines in Java and Pig. He also designed and implemented the first real-time recommendation engine for nytimes.com.
He then led a team at Spotify that built a scalable real-time streaming platform, enabling engineers to write real-time recommendation engines, ingest hundreds of millions of live events per second from streaming devices, and run real-time performance diagnostics. Over time, he has worked with several streaming frameworks: Apache Beam/Dataflow, Spark, Flink, and Storm.
Lately, he has taken an interest in real-time analytics, leading teams that build and deploy platforms supporting OLAP databases such as Druid, ClickHouse, and Pinot.
This section will focus on Stream Processing with Apache Spark. You will learn how to use Spark Streaming to process real-time data streams and explore Spark's Structured Streaming API for continuous applications. You will also discover how to manage stateful computations, handle late-arriving data, and integrate Spark with other streaming technologies. Additionally, you will learn about the deployment and monitoring of Spark Streaming applications.
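To give a flavor of the API, here is a minimal PySpark sketch of a Structured Streaming job. It assumes a Kafka broker at localhost:9092, a hypothetical topic named "events", and that the Spark Kafka connector package is available; the watermark shows how Spark bounds how long it waits for late-arriving data before finalizing a window.

```python
# A minimal sketch, assuming a local Kafka broker and a topic named "events"
# (both illustrative), with the spark-sql-kafka connector on the classpath.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json, window
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("stream-sketch").getOrCreate()

schema = StructType([
    StructField("user_id", StringType()),
    StructField("event_time", TimestampType()),
])

# Read a stream of JSON events from Kafka and parse them into columns.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "events")
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Stateful windowed count; the watermark tells Spark how long to keep
# waiting for late-arriving data before a window's result is final.
counts = (
    events.withWatermark("event_time", "10 minutes")
    .groupBy(window(col("event_time"), "5 minutes"), col("user_id"))
    .count()
)

query = counts.writeStream.outputMode("update").format("console").start()
query.awaitTermination()
```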
In this section, you will explore Event-driven Architecture and Design Patterns for Stream Processing. You will learn how to design and build scalable and fault-tolerant event-driven systems for real-time data processing. You will also discover how to implement common event-driven design patterns, such as event sourcing, CQRS, and pub/sub models. Additionally, you will explore how to integrate event-driven systems with streaming technologies and different data storage solutions.
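As a taste of one of these patterns, below is a minimal, in-memory sketch of event sourcing with a simple CQRS-style read model: state is never mutated directly, it is rebuilt by replaying an append-only event log. The event names and fields (deposited, withdrawn, account, amount) are illustrative assumptions, not part of any particular framework.

```python
# A minimal event-sourcing sketch; event kinds and fields are illustrative.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Event:
    kind: str      # e.g. "deposited" or "withdrawn" (illustrative)
    account: str
    amount: int

@dataclass
class EventStore:
    log: List[Event] = field(default_factory=list)

    def append(self, event: Event) -> None:
        # Events are immutable facts; they are only ever appended.
        self.log.append(event)

def project_balances(events: List[Event]) -> Dict[str, int]:
    """Replay the event log into a read model (the CQRS 'query' side)."""
    balances: Dict[str, int] = {}
    for e in events:
        delta = e.amount if e.kind == "deposited" else -e.amount
        balances[e.account] = balances.get(e.account, 0) + delta
    return balances

store = EventStore()
store.append(Event("deposited", "acct-1", 100))
store.append(Event("withdrawn", "acct-1", 30))
print(project_balances(store.log))  # {'acct-1': 70}
```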
In this section, you will learn how to process data at scale for Stream Processing. You will explore different techniques for scaling stream processing applications, such as parallelism, distributed systems, and containerization. You will also discover how to optimize stream processing pipelines for high throughput and low latency. Additionally, you will learn how to manage large-scale data and deal with common issues like data skew, data loss, and data consistency.
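To illustrate one of these techniques, the sketch below shows "key salting", a common way to spread a skewed key across many parallel tasks in Spark. The data and column names are illustrative; the same two-stage aggregation idea applies in a streaming job.

```python
# A minimal key-salting sketch for skew mitigation; data is illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("skew-sketch").getOrCreate()
df = spark.createDataFrame(
    [("hot_key", 1)] * 1000 + [("cold_key", 1)] * 10, ["key", "value"]
)

SALT_BUCKETS = 8

# Stage 1: partial aggregation on (key, salt), so the hot key is split
# across SALT_BUCKETS tasks instead of landing on a single reducer.
partial = (
    df.withColumn("salt", (F.rand() * SALT_BUCKETS).cast("int"))
    .groupBy("key", "salt")
    .agg(F.sum("value").alias("partial_sum"))
)

# Stage 2: combine the partial sums back into one total per key.
totals = partial.groupBy("key").agg(F.sum("partial_sum").alias("total"))
totals.show()
```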
In this section, you will learn about managing and monitoring stream processing applications. You will explore how to deploy, manage and maintain stream processing pipelines at scale. You will also discover how to monitor the performance, health, and security of stream processing systems using different tools and techniques. Additionally, you will learn how to troubleshoot common issues and optimize stream processing applications for reliability, scalability, and maintainability.
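As a small example of the kind of monitoring covered here, the sketch below polls a running Structured Streaming query's built-in progress metrics: input rate, processing rate, and batch duration. It assumes a query object like the one started in the earlier Spark sketch; in practice you would ship these fields to your metrics system rather than print them.

```python
# A minimal monitoring sketch, assuming `query` is a running
# Structured Streaming query (e.g. from the earlier example).
import time

def log_progress(query, interval_seconds: int = 10) -> None:
    while query.isActive:
        progress = query.lastProgress  # dict with the latest micro-batch metrics
        if progress:
            print(
                f"batch={progress['batchId']} "
                f"inputRowsPerSecond={progress.get('inputRowsPerSecond')} "
                f"processedRowsPerSecond={progress.get('processedRowsPerSecond')} "
                f"triggerExecutionMs={progress.get('durationMs', {}).get('triggerExecution')}"
            )
        time.sleep(interval_seconds)

# Usage: run log_progress(query) in a background thread, or wire the same
# fields into your monitoring stack instead of printing them.
```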
Bite-sized daily lessons that you can easily fit into your schedule. Each day, we release new lessons that take no longer than 15 minutes. Our lessons are carefully curated to ensure that they're both engaging and informative, allowing you to learn something new every day, at your own pace.
Collaborate with other engineers from around the world, providing you with a unique opportunity to learn from others and build your professional network.
Our live learning sessions are designed to be interactive and engaging, giving you the opportunity to ask questions and interact with subject-matter experts.
Learn by solving real-world problems. Our courses are designed to get rid of the fluff and provide you with the most relevant information to help you apply your learning.
Fill in your details and we’ll reach out to you within 24h.