Course Outline

Introduction

Overview of Spark Streaming Features and Architecture

  • Supported data sources
  • Core APIs

Preparing the Environment

  • Dependencies
  • Spark and streaming context
  • Connecting to Kafka

Processing Messages

  • Parsing inbound messages as JSON
  • ETL processes
  • Starting the streaming context

Performing a Windowed Stream Processing

  • Slide interval
  • Checkpoint delivery configuration
  • Launching the environment

Prototyping the Processing Code

  • Connecting to a Kafka topic
  • Retrieving JSON from data source using Paw
  • Variations and additional processing

Streaming the Code

  • Job control variables
  • Defining values to match
  • Functions and conditions

Acquiring Stream Output

  • Counters
  • Kafka output (matched and non-matched)

Troubleshooting

Summary and Conclusion

Requirements

  • Experience with Python and Apache Kafka
  • Familiarity with stream-processing platforms

Audience

  • Data engineers
  • Data scientists
  • Programmers
 7 Hours

Number of participants



Price per participant

Related Categories