Learning Ray - Flexible Distributed Python for Machine Learning

-- Max Pumperla, Edward Oakes, Richard Liaw

Online version of "Learning Ray" (O'Reilly). All code and diagrams used in the book are fully open-sourced, and self-contained notebooks accompanying the book are available for free. You won't get exactly the same reading experience as with the printed book, but you should get a good idea of whether the book is for you. If you want to support this project and buy the book, you can get it directly from O'Reilly or from Amazon. The book will be published in May 2023, but online formats should be available before that.

Overview

The book is organized to guide you chapter by chapter from Ray's core concepts to more sophisticated topics. The first three chapters teach the basics of Ray as a distributed Python framework with practical examples. Chapters four to ten introduce Ray's high-level libraries and show how to build applications with them. The last chapter gives you an overview of Ray's ecosystem and shows you where to go next. Here's what you can expect from each chapter.

Chapter 1, An Overview of Ray: Introduces you at a high level to all of Ray's components, how Ray can be used in machine learning and other tasks, what the Ray ecosystem currently looks like, and how Ray as a whole fits into the landscape of distributed Python.
Chapter 2, Getting Started with Ray: Walks you through the foundations of the Ray project, namely its low-level API. It also discusses how Ray Tasks and Actors naturally extend from Python functions and classes. You also learn about all of Ray's system components and how they work together.
Chapter 3, Building Your First Distributed Application with Ray Core: Gives you an introduction to distributed systems and what makes them hard. We'll then build a first application together and discuss how to peek behind the scenes and get insights from the Ray toolbox.
Chapter 4, Reinforcement Learning with Ray RLlib: Gives you a quick introduction to reinforcement learning and shows how Ray implements important concepts in RLlib. After building some examples together, we'll also dive into more advanced topics like preprocessors, custom models, and working with offline data.
Chapter 5, Hyperparameter Optimization with Ray Tune: Covers why efficiently tuning hyperparameters is hard, how Ray Tune works conceptually, and how you can use it in practice for your machine learning projects.
Chapter 6, Data Processing with Ray: Introduces you to Ray's Dataset abstraction and how it fits into the landscape of other data structures. You will also learn how to bring pandas data frames, Dask data structures, and Apache Spark workloads to Ray.
Chapter 7, Distributed Training with Ray Train: Provides you with the basics of distributed model training and shows you how to use Ray Train with popular frameworks such as TensorFlow or PyTorch, and how to combine it with Ray Tune for hyperparameter optimization.
Chapter 8, Serving Models with Ray Serve: Introduces you to model serving with Ray, why it works well within the framework, and how to do single-node and cluster deployment with it.
Chapter 9, Working with Ray Clusters: Covers how to configure, launch, and scale Ray clusters for your applications. You'll learn about Ray's cluster launcher CLI and autoscaler, as well as how to set up clusters in the cloud and how to deploy on Kubernetes and other cluster managers.
Chapter 10, Getting Started with the Ray AI Runtime: Introduces you to Ray AIR, a unified toolkit for your ML workloads that offers many third-party integrations for model training or accessing custom data sources.
Chapter 11, Ray's Ecosystem and Beyond: Gives you an overview of the many interesting extensions and integrations that Ray has attracted over the years.