
Amazon Kinesis

"Collect, process, and analyze real-time, streaming data."

What is Amazon Kinesis?

Amazon Kinesis is a platform for streaming data on AWS. It allows you to collect, process, and analyze real-time, streaming data such as video, audio, application logs, website clickstreams, and IoT telemetry data.

Key Components

1. Kinesis Data Streams

  • Ingest massive amounts of data in real-time.
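  • Shards: Throughput is provisioned in "shards". Each shard accepts up to 1 MB/s or 1,000 records/s of writes and serves up to 2 MB/s of reads. In provisioned mode you manage the shard count yourself; On-Demand mode scales capacity automatically.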
  • Retention: Data is stored for 24 hours by default; retention can be extended up to 365 days.
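
To make the shard idea concrete, here is a minimal sizing sketch for provisioned mode. It assumes the commonly published per-shard limits (1 MB/s or 1,000 records/s for writes, 2 MB/s for shared reads); the function name `estimate_shards` and the sample traffic figures are hypothetical, and real quotas should be checked against current AWS documentation.

```python
import math

# Assumed per-shard limits for provisioned mode (verify against AWS quotas).
WRITE_MB_PER_SHARD = 1.0         # ingest limit per shard, MB/s
WRITE_RECORDS_PER_SHARD = 1000   # ingest limit per shard, records/s
READ_MB_PER_SHARD = 2.0          # shared read limit per shard, MB/s


def estimate_shards(write_mb_per_sec, records_per_sec, read_mb_per_sec):
    """Return the smallest shard count that satisfies all three limits."""
    needed = max(
        write_mb_per_sec / WRITE_MB_PER_SHARD,
        records_per_sec / WRITE_RECORDS_PER_SHARD,
        read_mb_per_sec / READ_MB_PER_SHARD,
    )
    # A stream always has at least one shard.
    return max(1, math.ceil(needed))


# Hypothetical workload: 4.5 MB/s in, 3,000 records/s, 9 MB/s read by consumers.
print(estimate_shards(write_mb_per_sec=4.5, records_per_sec=3000, read_mb_per_sec=9.0))  # 5
```

The tightest of the three limits decides the count, which is why a stream with small but very frequent records can need more shards than its byte rate suggests.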

2. Kinesis Data Firehose (Now Amazon Data Firehose)

  • Load streaming data into data stores.
  • Easiest way to capture, transform, and load data into Amazon S3, Redshift, OpenSearch, and Splunk.
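  • Near Real-time: Delivery is not instantaneous. Firehose buffers incoming records and delivers them in batches once a size or time threshold is reached (typically a buffer interval of 60 seconds or more).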
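
The buffering behind "near real-time" can be illustrated with a toy model. This is a sketch of the general flush-on-size-or-age idea, not Firehose's actual implementation; the class `BufferedDelivery`, its thresholds, and the in-memory `batches` list (standing in for a destination such as S3) are all hypothetical.

```python
class BufferedDelivery:
    """Toy model: deliver a batch when the buffer is big enough or old enough."""

    def __init__(self, max_bytes, max_age_seconds):
        self.max_bytes = max_bytes
        self.max_age_seconds = max_age_seconds
        self.records = []
        self.size = 0
        self.opened_at = None   # time the first record of the current batch arrived
        self.batches = []       # stands in for objects written to the destination

    def put(self, data, now):
        """Buffer one record; flush if a threshold is crossed."""
        self.tick(now)  # deliver an aged-out batch before adding new data
        if self.opened_at is None:
            self.opened_at = now
        self.records.append(data)
        self.size += len(data)
        if self.size >= self.max_bytes:
            self._flush()

    def tick(self, now):
        """Flush on the time threshold, even if no new records arrive."""
        if self.records and now - self.opened_at >= self.max_age_seconds:
            self._flush()

    def _flush(self):
        self.batches.append(b"".join(self.records))
        self.records, self.size, self.opened_at = [], 0, None


buf = BufferedDelivery(max_bytes=10, max_age_seconds=60)
buf.put(b"aaaa", now=0)    # buffered (4 bytes)
buf.put(b"bbbb", now=5)    # buffered (8 bytes)
buf.put(b"cccc", now=6)    # 12 bytes >= 10, delivered by size
buf.put(b"dd", now=10)     # buffered, new batch opened at t=10
buf.tick(now=70)           # 60 seconds elapsed, delivered by time
print(len(buf.batches))    # 2
```

A record that arrives just after a flush can wait the full interval before it reaches the destination, which is the delay the bullet above refers to.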

3. Kinesis Data Analytics (Now Managed Service for Apache Flink)

  • Analyze streaming data with SQL or Apache Flink.

Exam Tips

[!IMPORTANT] Streams vs Firehose:

  • Kinesis Data Streams: "I need to write custom code to process data in real-time." (Think: Custom Consumers).
  • Kinesis Firehose: "I need to SAVE data to S3/Redshift/OpenSearch." (Think: Delivery service).
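
On the producer side, the difference shows up in how a record is written. The sketch below assumes clients created with `boto3.client("kinesis")` and `boto3.client("firehose")` and passes them in as parameters; the helper names, stream names, and event shape are hypothetical.

```python
import json


def send_to_stream(kinesis_client, stream_name, event, partition_key):
    """Write one record to a Kinesis data stream.

    Nothing is saved anywhere else: your own consumer code must read
    the stream and decide what to do with each record.
    """
    return kinesis_client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=partition_key,  # determines which shard receives the record
    )


def send_to_firehose(firehose_client, delivery_stream_name, event):
    """Write one record to a Firehose stream.

    Firehose buffers the record and delivers it to the configured
    destination (for example S3) without any consumer code.
    """
    return firehose_client.put_record(
        DeliveryStreamName=delivery_stream_name,
        # Newline-delimit records so they stay separable once batched into a file.
        Record={"Data": (json.dumps(event) + "\n").encode("utf-8")},
    )
```

With real clients, a call would look like `send_to_stream(boto3.client("kinesis"), "clicks", {"page": "/home"}, partition_key="user-42")`, where the stream name and key are placeholders.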

[!NOTE] Real-time: Kinesis is the go-to answer for "Real-time streaming data ingestion".

Common Use Cases

  • Log and Event Data Collection: Collecting logs from servers and applications in real-time.
  • Real-time Analytics: Calculating metrics such as leaderboard scores or stock prices as the data arrives.
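
As a rough illustration of the leaderboard case, the sketch below updates a running top-N after every event. A plain Python list stands in for the stream; in practice this logic would live in a stream consumer or a Flink application. The function name, player names, and scores are made up for the example.

```python
from collections import Counter


def leaderboard(events, top_n=3):
    """Yield the current top-N (player, score) pairs after each event."""
    scores = Counter()
    for player, points in events:
        scores[player] += points
        yield scores.most_common(top_n)


# Hypothetical score events arriving in order.
events = [("ana", 10), ("bo", 7), ("ana", 5), ("cy", 20)]

for snapshot in leaderboard(events, top_n=2):
    print(snapshot)
# Final snapshot: [('cy', 20), ('ana', 15)]
```

Each event produces a fresh result immediately, which is the difference from batch analytics, where the ranking would only be recomputed after the data has been stored.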