Generating Synthetic Data at Scale with the Help of Modern Execution Technologies (PyData Tel-Aviv Dec21)

Speaker: Or Sher, Infrastructure Team Lead, Datagen

Datagen started out creating synthetic images on on-premise consumer GPU machines, which did not provide the flexibility and scalability required for larger-scale operations. We needed a scalable system for two workloads: large-scale generation of 3D environments, a CPU-intensive process, and rendering images from within those environments, a GPU-intensive process. This presentation shares our journey of building an internal, K8s-based, cloud-agnostic system that lets us provision and use thousands of GPU and CPU resources exactly and only when we need them.

We will cover reliability, performance, efficiency, and cost optimization, as well as:

- What synthetic data is
- The challenges of generating simulated data at scale while serving many customers
- Architecture and coding challenges
- How to move fast and keep code clean