Frameworks: This workshop currently uses TensorFlow 1.14, Keras and Horovod 0.18.
Dataset: The CIFAR-10 dataset consists of 60,000 32x32 color images belonging to 10 different classes (6,000 images per class).
CIFAR-10 dataset includes:
Here are the classes in the dataset, as well as 10 random images from each:
Note: Although the dataset is small and the problem is relatively simple, all the steps we’ll take apply equally to large datasets that don’t fit in memory. Amazon SageMaker has native Pipe mode support for streaming a dataset directly from S3 to the training instances. With Amazon EKS, we’ll set up an Amazon FSx for Lustre file system that’s accessible to every worker.
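To make "small" concrete, here is a rough back-of-the-envelope sketch of how much memory the raw CIFAR-10 pixels occupy. It assumes 3 color channels (RGB) and one byte per channel value (uint8). The constant and function names are illustrative and are not part of the workshop code.

```python
# Rough estimate of CIFAR-10's raw in-memory size.
# Assumptions (not stated in the workshop text): RGB images stored as uint8.

NUM_CLASSES = 10
IMAGES_PER_CLASS = 6_000
HEIGHT, WIDTH = 32, 32
CHANNELS = 3         # assumption: RGB
BYTES_PER_VALUE = 1  # assumption: uint8 pixel values


def raw_dataset_bytes(num_images, height=HEIGHT, width=WIDTH,
                      channels=CHANNELS, bytes_per_value=BYTES_PER_VALUE):
    """Return the number of bytes needed to hold the uncompressed pixels."""
    return num_images * height * width * channels * bytes_per_value


num_images = NUM_CLASSES * IMAGES_PER_CLASS
total_bytes = raw_dataset_bytes(num_images)
total_mib = total_bytes / 2**20

print(f"{num_images} images, {total_bytes} bytes (~{total_mib:.1f} MiB)")
```

Under these assumptions the whole dataset is well under 1 GiB, so it fits comfortably in the memory of a single training instance. Streaming from S3 or reading from a shared file system matters when the same arithmetic yields a size larger than the available memory.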