Data & Analytics

Amazon Athena

"Serverless interactive query service to analyze data in S3 using SQL."

What is Amazon Athena?

Amazon Athena is an interactive query service that makes it easy to analyze data directly in Amazon S3 using standard SQL. It is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.

Key Features

  • Serverless: No EC2 instances to manage. No warehouse to provision.
  • Standard SQL: Runs on the open-source Presto and Trino engines under the hood (Trino as of engine version 3), supporting ANSI SQL.
  • Direct S3 Querying: You don't need to load data into a database. You query files (CSV, JSON, Parquet, ORC) in place in S3 buckets.
  • Integration with Glue: Uses the AWS Glue Data Catalog to store table metadata.
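To make the features above concrete, here is a minimal sketch of submitting a query through boto3's Athena client. The database "weblogs", the table "requests", and the bucket "example-athena-results" are hypothetical names chosen for illustration. The request builder runs without AWS access, while run_query assumes boto3 is installed and credentials are configured.

```python
from typing import Any, Dict


def build_query_request(sql: str, database: str, output_location: str) -> Dict[str, Any]:
    """Assemble keyword arguments for boto3's Athena start_query_execution call."""
    return {
        "QueryString": sql,
        # The database name is resolved through the Glue Data Catalog.
        "QueryExecutionContext": {"Database": database},
        # Athena writes result files to this S3 prefix.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request: Dict[str, Any]) -> str:
    """Submit the query and return its execution ID (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    client = boto3.client("athena")
    response = client.start_query_execution(**request)
    return response["QueryExecutionId"]


# Hypothetical names, for illustration only.
request = build_query_request(
    sql="SELECT status, COUNT(*) AS hits FROM requests GROUP BY status",
    database="weblogs",
    output_location="s3://example-athena-results/queries/",
)
```

There are no servers to start or stop here: the call only hands Athena a SQL string, a catalog database, and a place to write results.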

Exam Tips

[!IMPORTANT] Analyze S3 Data: If the question says "Analyze data in S3 using SQL" or "Query log files in S3", the answer is Athena.

[!NOTE] Cost Model: You pay per TB of data scanned.

  • Tip: Use columnar formats (like Parquet/ORC) and partition your data to scan less data and save money!
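A rough sketch of why the tip above saves money. The price of 5.0 USD per TB, the binary definition of a terabyte, and the table layout (1 TB of data, 30 equally sized daily partitions, 10 equally wide columns) are all assumptions for illustration. Check the current Athena pricing page for real numbers. Compression is ignored.

```python
BYTES_PER_TB = 1024 ** 4  # assumption: binary terabyte


def estimate_query_cost(bytes_scanned: float, usd_per_tb: float = 5.0) -> float:
    """Cost of one query under a simple pay-per-data-scanned model (assumed rate)."""
    return bytes_scanned / BYTES_PER_TB * usd_per_tb


# Hypothetical table: 1 TB total, 30 daily partitions, 10 equally wide columns.
table_bytes = 1 * BYTES_PER_TB
partitions = 30
columns = 10

# Row-oriented and unpartitioned: every byte is scanned.
full_scan_cost = estimate_query_cost(table_bytes)

# Partitioned by day: a query filtered to one day scans one partition.
one_day_bytes = table_bytes / partitions
partitioned_cost = estimate_query_cost(one_day_bytes)

# Partitioned and columnar: the same query reads only 2 of the 10 columns.
two_columns_bytes = one_day_bytes * 2 / columns
columnar_cost = estimate_query_cost(two_columns_bytes)

print(f"full scan:            ${full_scan_cost:.4f}")
print(f"one partition:        ${partitioned_cost:.4f}")
print(f"one partition, 2 col: ${columnar_cost:.4f}")
```

Under these assumptions each optimization reduces the bytes scanned multiplicatively, so partitioning and columnar formats compound rather than overlap.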

Common Use Cases

  • Log Analysis: Querying ALB logs, CloudTrail logs, or VPC Flow Logs stored in S3.
  • Ad-hoc Analysis: Quickly running a SQL query on a CSV file someone uploaded to S3.
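As an illustration of the log analysis use case, the sketch below builds a query that counts 5xx responses for a single day. The table name "alb_logs", the string partition column "day", and the integer column "elb_status_code" are assumptions about how the log table was defined. Adjust them to match your own table.

```python
from datetime import date


def build_error_count_query(table: str, day: date) -> str:
    """Return SQL that counts 5xx responses for one day of logs.

    Assumes a table partitioned by a string column `day` (YYYY-MM-DD)
    with an integer column `elb_status_code`; both are hypothetical.
    """
    # Accept only simple identifiers so the table name cannot inject SQL.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    return (
        "SELECT elb_status_code, COUNT(*) AS requests\n"
        f"FROM {table}\n"
        # Filtering on the partition column limits the scan to one day's files.
        f"WHERE day = '{day.isoformat()}'\n"
        "  AND elb_status_code >= 500\n"
        "GROUP BY elb_status_code\n"
        "ORDER BY requests DESC"
    )


query = build_error_count_query("alb_logs", date(2024, 5, 1))
print(query)
```

Filtering on the partition column first keeps the scan, and therefore the bill, limited to the day being investigated.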