Setup an Amazon EKS cluster :: Distributed training with Amazon SageMaker / Amazon EKS Workshop

Setup an Amazon EKS cluster

Navigate to distributed-training-workshop > notebooks > part-3-kubernetes

The cpu_eks_cluster.sh and gpu_eks_cluster.sh files contain the options needed to launch a CPU or GPU cluster. Take a look at the options for the CPU cluster by running the following commands:

cd ~/SageMaker/distributed-training-workshop/notebooks/part-3-kubernetes/
cat cpu_eks_cluster.sh

You should see the following output

eksctl create cluster \
    --name aws-tf-cluster-cpu \
    --version 1.14 \
    --region us-west-2 \
    --nodegroup-name cpu-nodes \
    --node-type c5.xlarge \
    --nodes 2 \
    --node-volume-size 50 \
    --node-zones us-west-2a \
    --timeout=40m \
    --zones=us-west-2a,us-west-2b,us-west-2c \

To launch a cluster with GPUs, use the script gpu_eks_cluster.sh instead. If you wish to launch a cluster with more than 2 nodes, update the --nodes argument to the number of nodes you want in the cluster.
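
As a minimal sketch of changing the node count without editing the file by hand, the helper below prints a copy of a script with the --nodes value replaced. The function name set_node_count and the output file name in the example are hypothetical, and the sketch assumes the script contains a "--nodes <number>" line like the one in the listing.

```shell
# Hypothetical helper: print SCRIPT with its "--nodes <N>" value replaced by COUNT.
# Assumes the script contains a line of the form "--nodes 2".
# The pattern requires a space and digits after "--nodes", so options such as
# --nodegroup-name and --node-type are left untouched.
set_node_count() {
    script="$1"
    count="$2"
    sed -E "s/(--nodes )[0-9]+/\1${count}/" "$script"
}

# Example (run from the directory that holds the workshop scripts):
#   set_node_count cpu_eks_cluster.sh 4 > cpu_eks_cluster_4nodes.sh
#   sh cpu_eks_cluster_4nodes.sh
```

Writing the result to a new file keeps the original script unchanged, so the default 2-node configuration remains available.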

Now launch an EKS cluster:

sh cpu_eks_cluster.sh

You should see output similar to this:

[Image: eks output]

Creating the cluster may take about 15 minutes. You can monitor the progress in the AWS CloudFormation console.
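
To follow the progress from a terminal instead of the console, a sketch like the following could be used. It assumes the AWS CLI and kubectl are installed and configured, and that eksctl names the cluster's CloudFormation stack eksctl-<cluster-name>-cluster. The helper names stack_name and cluster_status are hypothetical.

```shell
# Cluster name and region as set in cpu_eks_cluster.sh.
CLUSTER_NAME="aws-tf-cluster-cpu"
REGION="us-west-2"

# Hypothetical helper: build the CloudFormation stack name for a cluster,
# assuming eksctl's "eksctl-<cluster-name>-cluster" naming convention.
stack_name() {
    printf 'eksctl-%s-cluster\n' "$1"
}

# Hypothetical helper: print the current status of the cluster's stack,
# for example CREATE_IN_PROGRESS or CREATE_COMPLETE.
cluster_status() {
    aws cloudformation describe-stacks \
        --region "$REGION" \
        --stack-name "$(stack_name "$CLUSTER_NAME")" \
        --query 'Stacks[0].StackStatus' \
        --output text
}

# Example:
#   cluster_status
#   kubectl get nodes    # once eksctl has finished
```

Once eksctl has finished, kubectl get nodes should list the worker nodes of the new cluster.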