High Performance Data Engineering in Rust - Efficient Deduplication Example

2,437 views

Pragmatic AI Labs

1 year ago

I demonstrate Rust's speed and efficiency on a real-world data engineering use case: building a fast deduplication tool. The video walks through the thread pools, progress bars, command-line interface, and checksumming logic, with code examples.
github.com/noahgift/rdedupe
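
As a rough illustration of the checksumming and worker-thread ideas mentioned above, here is a minimal sketch that is not taken from the rdedupe repository. It assumes a hand-rolled FNV-1a checksum as a stand-in for whatever hash the tool really uses, and scoped standard-library threads as a stand-in for a thread pool. The names `checksum` and `find_duplicates` are hypothetical, and the progress bar and command-line parsing are left out.

```rust
use std::collections::HashMap;
use std::fs::File;
use std::io::{self, Read};
use std::path::{Path, PathBuf};
use std::thread;

/// Stream a file through 64-bit FNV-1a.
/// Assumption: FNV-1a is only a stand-in here; it is not collision resistant,
/// so a real tool would likely use a stronger hash.
fn checksum(path: &Path) -> io::Result<u64> {
    let mut file = File::open(path)?;
    let mut hash: u64 = 0xcbf29ce484222325; // FNV-1a offset basis
    let mut buf = [0u8; 8192];
    loop {
        let n = file.read(&mut buf)?;
        if n == 0 {
            break;
        }
        for &byte in &buf[..n] {
            hash ^= u64::from(byte);
            hash = hash.wrapping_mul(0x100000001b3); // FNV-1a prime
        }
    }
    Ok(hash)
}

/// Checksum `paths` on up to `workers` scoped threads, then return every
/// group of paths that share a checksum. Unreadable files are skipped.
fn find_duplicates(paths: &[PathBuf], workers: usize) -> Vec<Vec<PathBuf>> {
    let workers = workers.max(1);
    // Ceiling division so each worker gets one contiguous slice of paths.
    let chunk_size = ((paths.len() + workers - 1) / workers).max(1);

    let results: Vec<(u64, PathBuf)> = thread::scope(|scope| {
        // Spawn all workers first, then join them.
        let handles: Vec<_> = paths
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .filter_map(|p| checksum(p).ok().map(|sum| (sum, p.clone())))
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    });

    // Group paths by checksum and keep only groups with more than one file.
    let mut groups: HashMap<u64, Vec<PathBuf>> = HashMap::new();
    for (sum, path) in results {
        groups.entry(sum).or_default().push(path);
    }
    let mut dupes: Vec<Vec<PathBuf>> =
        groups.into_values().filter(|g| g.len() > 1).collect();
    // Sort for a stable, reproducible order.
    for group in &mut dupes {
        group.sort();
    }
    dupes.sort();
    dupes
}
```

Calling `find_duplicates(&paths, 4)` on a list of file paths returns the groups of files whose contents hash to the same value. Because this sketch matches on checksum alone, a careful tool would also compare file sizes or bytes before deleting anything.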
✨I build courses: insight.paiml.com/bzf
📚LLMOps Specialization: insight.paiml.com/a8e
📚Introduction to Generative AI: insight.paiml.com/ee2
📚Operationalizing LLMs on Azure: insight.paiml.com/e2u
📚Databricks to Local LLMs: insight.paiml.com/i6k
📚Advanced Data Engineering: insight.paiml.com/uvi
📚Rust Programming Specialization: insight.paiml.com/qwh
📚Rust for DevOps: insight.paiml.com/x14
📚Rust LLMOps: insight.paiml.com/g3b
📚Rust Fundamentals: insight.paiml.com/qyt
📚Data Engineering with Rust: insight.paiml.com/zm1
📚Python and Rust with Linux Command Line Tools: insight.paiml.com/jot
📚Applied Python Data Engineering Specialization: insight.paiml.com/5r9
📚Data Visualization with Python: insight.paiml.com/y9p
📚Virtualization, Docker, and Kubernetes for Data Engineering: insight.paiml.com/xtp
📚Spark, Hadoop, and Snowflake for Data Engineering: insight.paiml.com/f6j
📚MLOps | Machine Learning Operations Specialization: insight.paiml.com/l5u
📚Python Essentials for MLOps: insight.paiml.com/uvm
📚DevOps, DataOps, MLOps: insight.paiml.com/ggi
📚MLOps Tools: MLflow and Hugging Face: insight.paiml.com/y2v
📚MLOps Platforms: Amazon SageMaker and Azure ML: insight.paiml.com/ymb
📚Python, Bash and SQL Essentials for Data Engineering Specialization: insight.paiml.com/2or
📚Linux and Bash for Data Engineering: insight.paiml.com/d31
📚Scripting with Python and SQL for Data Engineering: insight.paiml.com/n3b
📚Python and Pandas for Data Engineering: insight.paiml.com/nz7
📚Web Applications and Command-Line Tools for Data Engineering: insight.paiml.com/o86
📚Building Cloud Computing Solutions at Scale Specialization: insight.paiml.com/hrt
📚Cloud Computing Foundations: insight.paiml.com/zrb
📚Cloud Data Engineering: insight.paiml.com/75t
📚Cloud Machine Learning Engineering and MLOps: insight.paiml.com/jjh
📚Cloud Virtualization, Containers and APIs: insight.paiml.com/ce5
📝 Guided Projects:
📝Object-Oriented Programming in Python: insight.paiml.com/n4h
📝MySQL-for-Data-Engineering: insight.paiml.com/e1k
📝Python Generators: insight.paiml.com/i9l
📝Build a Static Website with Rust and Zola: insight.paiml.com/a2h
📝Building Rust AWS Lambda Microservices with Cargo Lambda: insight.paiml.com/8ed
📝Rust Secret Cipher CLI: insight.paiml.com/zzr

Comments: 8
@shamaldesilva9533 1 year ago
Awesome video 🥳. By the way, I am a data engineer and I primarily work with Python. I am trying to find reasons to learn Rust. Could you do a video on how we can integrate Rust with Python in the context of data engineering? ✌️
@pragmaticai 1 year ago
Yes, I can.
@sany2k8 1 year ago
Hello Sir, this looks awesome 😎. I am a backend/data engineer working mostly with Python, APIs, and databases. I want to learn Rust and use it in my data engineering tasks. Can you suggest a complete walkthrough Rust tutorial or project?
@pragmaticai 1 year ago
Yes, this course here: www.coursera.org/learn/devops-dataops-mlops-duke
@jonathanduran2921 1 year ago
Love Rust, but unfortunately I have not run into a single company hiring for a DE with Rust skills (I interview a handful of times a month and apply to 50ish companies a month). Really hoping this changes.
@pragmaticai 1 year ago
This is often the case with innovation. You can be ahead of the curve, and when things eventually turn out the way you expected, you could do quite well.
@michah3956 1 year ago
Currently, Rust is the best programming language.
@pragmaticai 1 year ago
I would agree with you; the breadth and features beat anything right now.