Amazon Athena

Amazon Athena is a serverless, interactive query service that runs standard SQL directly against data in Amazon S3 — no cluster to provision, no data to load. It's powered by Trino (for SQL) and Apache Spark (for notebook workloads) and integrates with the AWS Glue Data Catalog for schema and table metadata.


Key Features:


Cost & Performance Best Practices:


Example Query:


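-- Daily error and total event counts per region for April 2026.
-- Assumes year and month are string-typed partition columns; if so, the
-- WHERE clause limits the scan to the matching partitions in S3.
SELECT
    date_trunc('day', event_time)                  AS day,
    region,
    count_if(status = 'error')                     AS errors,
    count(*)                                       AS total
FROM   logs.app_events
WHERE  year = '2026' AND month = '04'
GROUP BY 1, 2
ORDER BY day, region;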
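
Because Athena is serverless, a query like the one above can also be submitted through the API instead of the console. The following is a minimal sketch, not a definitive implementation: it assumes a client object that behaves like boto3's Athena client (start_query_execution and get_query_execution), and the helper name run_query, its parameters, and the polling limits are illustrative choices rather than anything prescribed by Athena.

```python
import time

# Terminal states reported by Athena's GetQueryExecution API.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def run_query(client, sql, database, output_location,
              poll_seconds=1.0, max_polls=300):
    """Submit `sql` to Athena and poll until it reaches a terminal state.

    `client` is assumed to behave like boto3.client("athena").
    Returns (query_execution_id, bytes_scanned) on success.
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        # Athena writes result files to this S3 location.
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    for _ in range(max_polls):
        execution = client.get_query_execution(
            QueryExecutionId=query_id
        )["QueryExecution"]
        state = execution["Status"]["State"]
        if state in TERMINAL_STATES:
            if state != "SUCCEEDED":
                reason = execution["Status"].get(
                    "StateChangeReason", "no reason given"
                )
                raise RuntimeError(
                    f"Query {query_id} ended as {state}: {reason}"
                )
            # Bytes scanned is what partition filters help reduce.
            stats = execution.get("Statistics", {})
            return query_id, stats.get("DataScannedInBytes", 0)
        time.sleep(poll_seconds)

    raise TimeoutError(
        f"Query {query_id} still running after {max_polls} polls"
    )
```

With boto3 installed and AWS credentials configured, a call might look like run_query(boto3.client("athena"), sql, "logs", "s3://example-bucket/athena-results/"), where the bucket name is a placeholder. Passing the client in as an argument keeps the helper free of a hard boto3 dependency and lets it be exercised with a stub.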
  


Athena vs. Redshift Spectrum vs. EMR:

Athena is the everyday entry point to the S3-based data lake — it lets analysts and engineers query raw and curated data with SQL without standing up any infrastructure.