Amazon Redshift

Amazon Redshift is a fully managed, petabyte-scale data warehouse for analyzing data with standard SQL and your existing Business Intelligence (BI) tools. It is built on columnar storage and massively parallel processing (MPP).

Key Concepts

Columnar Storage: Data is stored by column rather than by row, so a query reads only the columns it needs. Ideal for analytical queries (OLAP), which often aggregate data over a few columns (e.g., "Sum of Sales").
MPP (Massively Parallel Processing): Distributes data and query execution across multiple nodes.
Redshift Serverless: Automatically provisions and scales capacity in seconds.
Redshift Spectrum: Query data directly in S3 (data lake) without loading it into Redshift tables.
Zero-ETL: Replicates data from sources such as Aurora, RDS, and DynamoDB into Redshift in near real time, without building and maintaining ETL pipelines.
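
Columnar storage and MPP can be pictured together with a small sketch. The Python below is a toy model only, not how Redshift is implemented: the sample rows, the function names, and the use of order_id as a distribution key are all assumptions made for illustration. It spreads rows across nodes, lets each node sum only its own "sales" column, and has a leader combine the partial results.

```python
from collections import defaultdict

# Hypothetical sample data: each tuple is (order_id, region, sales).
ROWS = [
    (1, "east", 100),
    (2, "west", 250),
    (3, "east", 50),
    (4, "north", 75),
    (5, "west", 125),
]


def to_columns(rows):
    """Pivot row-oriented tuples into one list per column."""
    order_ids, regions, sales = zip(*rows)
    return {
        "order_id": list(order_ids),
        "region": list(regions),
        "sales": list(sales),
    }


def distribute(rows, node_count):
    """Assign each row to a node by order_id (a stand-in distribution key)."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[row[0] % node_count].append(row)
    return nodes


def parallel_sum_sales(rows, node_count):
    """Each node sums its own slice; the leader adds up the partial sums."""
    partials = []
    for node_rows in distribute(rows, node_count).values():
        columns = to_columns(node_rows)
        # Only the "sales" column is touched; the other columns are never read.
        partials.append(sum(columns["sales"]))
    return sum(partials)


print(parallel_sum_sales(ROWS, node_count=2))
```

The total is the same regardless of node_count; adding nodes only changes how the work is split, which is the point of MPP.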

Node Types

RA3 Nodes: Separate compute and storage. Scale each independently. Best for most workloads.
Dense Compute (DC): SSD-based. Best for high performance on smaller datasets.
Dense Storage (DS): Legacy. Best for large storage needs at lower cost.

Exam Tips

"Data Warehouse" / "OLAP" / "SQL Analytics": The answer is Redshift.
"Columnar Storage": Redshift feature for performance.
Single-AZ: A Redshift cluster runs in a single AZ by default. If that AZ goes down, the cluster is unavailable, although its data is backed up to S3. Note: Multi-AZ deployments are now available for RA3 clusters, but exam questions often assume classic Single-AZ behavior.
Redshift Spectrum: Keyword "Query data in S3 without loading it".
BI Tools: Integrates with Amazon QuickSight, Tableau, and Power BI.

Common Use Cases

Enterprise Data Warehousing.
Big Data Analytics.
Log Analysis.
Migration from on-premises solutions such as Teradata and Oracle DW.