A talk by Iaroslav Geraskin at TikTok.
In the rapidly evolving landscape of machine learning (ML) serving, optimising performance is paramount. This presentation delves into the innovative strategy of leveraging distributed caches to accelerate ML serving. By strategically caching frequently accessed model predictions and intermediate computations, organisations can significantly reduce latency and improve throughput in ML inference pipelines. Through practical insights, attendees will gain a comprehensive understanding of the benefits and challenges of incorporating distributed caches into ML serving architectures. From cache design considerations to implementation best practices, this session equips participants with the knowledge and tools necessary to harness the full potential of distributed caching for accelerated ML serving.
Technical Level: Technical practitioner
This session was part of the Data Science Festival MayDay event 2024. Find out more at datasciencefestival.com/event...
The Data Science Festival is the place for data-driven people to come together, share cutting-edge ideas, and solve real-world problems. We run monthly events, meet-ups, and the biggest free-to-attend data festivals in the UK. Join the community at datasciencefestival.com/