As data science tools scale to ever larger datasets, their requirements grow as well, which imposes a non-trivial human cost. To bring the focus of data science tools back to the data scientist, we built Modin, a platform that scales data scientists' capabilities without requiring them to learn distributed computing concepts. In this talk, we discuss the design of Modin and how Ray enables our modular design to scale. We also discuss our experience enabling data scientists to connect to multiple Ray clusters from the same notebook.
Devin Petersohn is a fifth-year Computer Science PhD student at the UC Berkeley RISELab and a Machine Learning Engineer at Intel. The early focus of his PhD work was scaling genomic workloads to enable large-scale DNA analysis. In recent years, Devin has focused on making scalable...
Thursday October 1, 2020 10:10am - 10:40am PDT
Virtual 3