Analytics on AWS workshop

Welcome to the Analytics on AWS workshop. Its purpose is to introduce you to the services in the AWS Analytics portfolio.

The workshop is organized into modules, each covering a different aspect of building an analytics platform on AWS. You will learn how to ingest, store, transform, and consume data using AWS Glue, Amazon Athena, Amazon Kinesis, Amazon EMR, Amazon QuickSight, AWS Lambda, and Amazon Redshift. The architecture diagram below shows the overall design:

[Figure: Data Analytics on AWS architecture diagram]

Learning outcomes from this workshop:

  1. Designing a serverless data lake architecture.
  2. Building data processing pipelines and data lakes using Amazon S3 for data storage.
  3. Utilizing Amazon Kinesis for real-time streaming data.
  4. Employing Amazon Kinesis Data Analytics for real-time data analysis.
  5. Using AWS Glue to catalog datasets automatically.
  6. Performing data transformation.
  7. Running interactive ETL scripts in Jupyter notebooks in AWS Glue Studio, backed by an AWS Glue interactive session.
  8. Using Glue Studio to run and monitor ETL jobs on AWS Glue.
  9. Utilizing Glue DataBrew to prepare data.
  10. Running Spark transform jobs on EMR (Elastic MapReduce).
  11. Loading data from AWS Glue into Amazon Redshift.
  12. Getting an introduction to Amazon Redshift design best practices.
  13. Querying data with Amazon Athena and visualizing it with Amazon QuickSight.
  14. Cleaning up resources.
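
To give a feel for the querying outcome listed above, here is a minimal Python sketch of how an Amazon Athena query request might be assembled. It is an illustration, not the workshop's own code: the helper `build_athena_query_request`, the table `raw_events`, the database `analytics_workshop_db`, and the bucket `example-workshop-bucket` are all hypothetical names chosen for this example. The dictionary keys follow the parameters of the boto3 Athena client's `start_query_execution` call.

```python
def build_athena_query_request(sql, database, output_location):
    """Assemble keyword arguments for an Athena query submission.

    All names passed in are placeholders; replace them with the
    resources created in your own account.
    """
    # Athena writes query results to S3, so the output location
    # is expected to be an S3 URI.
    if not output_location.startswith("s3://"):
        raise ValueError("output_location must be an S3 URI (s3://...)")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


# Hypothetical table, database, and bucket names, for illustration only.
request = build_athena_query_request(
    sql="SELECT * FROM raw_events LIMIT 10",
    database="analytics_workshop_db",
    output_location="s3://example-workshop-bucket/athena-results/",
)

# With boto3 installed and AWS credentials configured, the request
# could then be submitted like this (not executed here):
#   import boto3
#   boto3.client("athena").start_query_execution(**request)
```

Building the request separately from submitting it keeps the example runnable without an AWS account and makes the parameters easy to inspect before any query is issued.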