AWS EMR Studio

Xin Cheng
Apr 28, 2021

Overview

EMR Studio provides fully managed Jupyter notebooks that are integrated with EMR. EMR Notebooks existed before it, but EMR Studio differs from them in a few ways.

Integration with AWS SSO (Single Sign-On) is interesting. Users do not need the AWS Management Console to use EMR Studio: it appears like an enterprise application that users can open from a user portal.

Provision

Enable AWS SSO

You must enable AWS SSO first. Otherwise you cannot create an EMR Studio.

https://console.aws.amazon.com/singlesignon

You need to configure AWS SSO (e.g. users, groups).

Create EMR Studio

AWS provides a sample script to easily provision an EMR Studio.

Make sure you have the latest AWS CLI (version 1 or 2).
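One way to check which AWS CLI is installed before running the sample script is to read the output of `aws --version`. The sketch below is only an illustration: the helper names `parse_cli_version` and `installed_cli_version` are made up for this post, and it assumes the version string starts with `aws-cli/<major>.<minor>.<patch>`.

```python
import re
import shutil
import subprocess


def parse_cli_version(version_text):
    """Extract (major, minor, patch) from text such as 'aws-cli/2.1.39 Python/3.8.8 ...'."""
    match = re.search(r"aws-cli/(\d+)\.(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_text!r}")
    return tuple(int(part) for part in match.groups())


def installed_cli_version():
    """Return the installed AWS CLI version, or None if `aws` is not on PATH."""
    if shutil.which("aws") is None:
        return None
    result = subprocess.run(
        ["aws", "--version"], capture_output=True, text=True, check=True
    )
    # Some older CLI builds print the version to stderr, so check both streams.
    return parse_cli_version(result.stdout + result.stderr)
```

Calling `installed_cli_version()` returns a tuple you can compare against whatever minimum version you want to require, or `None` when the CLI is missing.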

git clone https://github.com/aws-samples/emr-studio-samples.git
cd emr-studio-samples/
chmod +x create_demo_studio_with_dependencies.sh
./create_demo_studio_with_dependencies.sh

The script uses CloudFormation to create the necessary resources, e.g. an S3 bucket and the EMR Studio. You can list the created studios with the following command:

aws emr list-studios --region us-east-1

The output looks like this:

{
    "Studios": [
        {
            "StudioId": "es-*",
            "Name": "test-emr-studio",
            "VpcId": "vpc-*",
            "Url": "https://es-*.emrstudio-prod.us-east-1.amazonaws.com",
            "CreationTime": 1619408283.745
        }
    ]
}
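If you want to pick the studio URL out of that output programmatically, a short script can parse the JSON. This is a minimal sketch: it embeds the sample output shown above (with the same masked `es-*` and `vpc-*` values), and the function name `summarize_studios` is just an illustrative choice. It also converts `CreationTime`, which is a Unix timestamp, into a readable UTC time.

```python
import json
from datetime import datetime, timezone

# Sample `aws emr list-studios` output copied from above (IDs are masked).
SAMPLE_OUTPUT = """
{
    "Studios": [
        {
            "StudioId": "es-*",
            "Name": "test-emr-studio",
            "VpcId": "vpc-*",
            "Url": "https://es-*.emrstudio-prod.us-east-1.amazonaws.com",
            "CreationTime": 1619408283.745
        }
    ]
}
"""


def summarize_studios(cli_output):
    """Return one dict per studio with its name, URL, and creation time in UTC."""
    studios = json.loads(cli_output).get("Studios", [])
    return [
        {
            "name": studio["Name"],
            "url": studio["Url"],
            # CreationTime is seconds since the Unix epoch.
            "created": datetime.fromtimestamp(studio["CreationTime"], tz=timezone.utc),
        }
        for studio in studios
    ]


for row in summarize_studios(SAMPLE_OUTPUT):
    print(row["name"], row["url"], row["created"].isoformat())
```

In practice you would feed the function the real command output, for example by piping `aws emr list-studios --region us-east-1` into the script and reading from standard input.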

With the EMR Studio created, add the user you created in AWS SSO to the studio. The AWS SSO user portal will then show one added application.
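The user can also be assigned from the command line with `aws emr create-studio-session-mapping`, which expects a session policy ARN in addition to the studio ID and the user name. The sketch below only builds and prints the command so you can review it before running it. The function name `build_session_mapping_command` is hypothetical, and the studio ID, user name, and policy ARN are placeholders, not values from this walkthrough.

```python
import shlex


def build_session_mapping_command(studio_id, user_name, session_policy_arn,
                                  region="us-east-1"):
    """Build the AWS CLI arguments that map one AWS SSO user to an EMR Studio."""
    return [
        "aws", "emr", "create-studio-session-mapping",
        "--studio-id", studio_id,
        "--identity-type", "USER",
        "--identity-name", user_name,
        "--session-policy-arn", session_policy_arn,
        "--region", region,
    ]


# Placeholder values: replace them with your own studio ID, SSO user, and policy.
command = build_session_mapping_command(
    "es-EXAMPLE",
    "jane@example.com",
    "arn:aws:iam::123456789012:policy/ExampleSessionPolicy",
)
print(shlex.join(command))
```

Once the printed command looks right, it can be pasted into a shell or passed to `subprocess.run(command, check=True)`.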

Create workspace

Voila! Now you can attach the workspace to an EMR cluster or to EMR on EKS.
