Scale EDA & ML Workloads To Clusters & Back With Dask | PyData Chicago January 2022 Meetup

Speaker: Gus Cavanaugh

Speaker’s Bio: “Big Data” & “The Cloud” promised me infinite scale. But that’s not what I found when I stumbled onto a Hadoop cluster after college. What seemed so simple when the architects at my big consulting employer got out the whiteboard became much less so when I had my hands on the keyboard. I found solace in Python, specifically the Anaconda distribution, which I could run on the most archaic Windows workstation or cluster of Linux servers. Eventually, I switched from consulting to software where I thought I was helping companies deploy data science platforms but I really spent my time as an unpaid AWS/Azure consultant fighting with Kubernetes. I recently reunited with former Anaconda colleagues at Coiled, where we provide software and support for commercial and community users of Dask.

Abstract: While “Big Data” may be an overhyped buzzword, it’s not uncommon for Python users to end up with more data than can fit on t
