Demo: Optimizing Gemma inference on NVIDIA GPUs with TensorRT-LLM

2,135 views

Google for Developers

Even the smallest large language models are compute-intensive, which significantly affects the cost of your generative AI application. Your ability to increase throughput and reduce latency can make or break many business cases. NVIDIA TensorRT-LLM is an open-source tool that can considerably speed up the execution of your models, and in this talk we demonstrate its application to Gemma.
Watch more videos of Gemma Developer Day 2024 → goo.gle/440EAIV
Subscribe to Google for Developers → goo.gle/developers
#Gemma #GemmaDeveloperDay

Comments: 2
@LuigiGoogle, 2 months ago
Hello Google, I am inspired by your company. Could you create open-source projects on this topic of neural networks, to work with people who are also interested in it?
@ShayansCodeCommunity, 2 months ago
Hi, I am Shayan, a beginner Python developer. I want to learn from code jams or online coding challenges. Could the Google team do that for me, please 🥺? I really want to learn something from Google Code Jam.