Introduction

Welcome to the AWS Analytics workshop. This workshop aims to acquaint you with the different data analytics services available in the AWS Analytics portfolio.

The workshop comprises several modules, each covering a different aspect of building a data analytics platform on AWS. You will learn about data ingestion, storage, transformation, and consumption using analytics services such as AWS Glue, Amazon Athena, Amazon Kinesis, Amazon EMR, Amazon QuickSight, AWS Lambda, and Amazon Redshift.

The overall architecture is shown in the diagram below:

[Figure: Data Analytics on AWS architecture diagram]

By the end of this workshop, you will have hands-on experience with:

  1. Designing a serverless data lake architecture.
  2. Building a data processing pipeline and a data lake that uses Amazon S3 for storage.
  3. Utilizing Amazon Kinesis for real-time data streaming.
  4. Using Amazon Kinesis Data Analytics for real-time data analysis.
  5. Using AWS Glue to catalog datasets automatically.
  6. Data transformation.
  7. Running interactive ETL scripts in a Jupyter notebook in AWS Glue Studio using AWS Glue interactive sessions.
  8. Utilizing Glue Studio to execute and monitor ETL jobs in AWS Glue.
  9. Using Glue DataBrew for data preparation.
  10. Using EMR to run a Spark transformation job.
  11. Loading data into Amazon Redshift from Glue.
  12. Reviewing best practices for Amazon Redshift design.
  13. Querying data with Amazon Athena and visualizing data with Amazon QuickSight.