ibm.com
Faster LLMs: Accelerate Inference with Speculative Decoding
Isaac Ke explains speculative decoding, a technique that accelerates LLM inference speeds by 2-4x without compromising output quality. Learn how "draft and verify" pairs smaller and larger models to optimize token generation, GPU usage, and resource efficiency.
11 months ago
Accelerating LLM Inference with Staged Speculative Decoding LLM Inference
How to Quadruple LLM Decoding Performance with Speculative Decoding (SpD) and Microscaling (MX) Formats on Qualcomm® Cloud AI 100
qualcomm.com
Aug 1, 2024
1:58
Idea Atlas — Day 7: Japan's sovereign AI stack — AEC knowledge graph in Obsidian
YouTube
Idea Atlas
7 views
1 month ago
3:49
T-pro 2.0: Efficient Russian Reasoning LLM
YouTube
AI Research Roundup
5 months ago
Top videos
0:18
Introducing LM Studio 0.3.10 with 🔮 Speculative Decoding! It's an LLM inferencing technique that can speed up token generation by up to 1.5x-3x in some cases 🏎️💨 - Supported for both GGUF and… | LM Studio | 10 comments
linkedin.com
10 views
Feb 19, 2025
0:23
DFlash: Faster LLM Inference with Speculative Decoding
YouTube
OnlyCS
7 views
6 days ago
11:34
Generate 10 Tokens At Once - Faster LLM INFERENCE - AdaSPEC - Speculative Decoding Improvement
YouTube
Vuk Rosić
505 views
6 months ago
Accelerating LLM Inference with Staged Speculative Decoding Speculative Decoding
6:13
Speculative Decoding: Make AI 2-3x Faster for Free | Tech Decoded
YouTube
Toc am
3 views
1 month ago
0:39
Unlock True LLM Performance on Your Consumer Hardware
YouTube
Github Signals
7 views
4 weeks ago
3:08
What is Speculative Decoding ?
YouTube
DeepManim
38 views
1 week ago
1:16:02
Speculative Decoding and Efficient LLM Inference with Chris Lott - 717
1.8K views
Feb 3, 2025
YouTube
The TWIML AI Podcast with Sam Charrington
Speculative Decoding — Think Fast⚡, Then Think Right✅
Apr 13, 2025
substack.com
1:23
Speculative Speculative Decoding for Faster LLM Inference
2.1K views
2 months ago
YouTube
Rajistics - data science, AI, and machine learning
37:34
Speculative Decoding Explained
7.8K views
Dec 21, 2023
YouTube
Trelis Research
1:05
What is Speculative decoding - Speculative decoding Explained #
…
309 views
2 months ago
YouTube
Med Bou | AI Tutorials
2:27:59
COLING 2025 Tutorial: Speculative Decoding for Efficient LLM Inference
398 views
Jan 23, 2025
bilibili
云安Ann
40:31
CS 886 | Lecture 13 Efficient LLM Inference | PABEE, CALM and Spe
…
1.2K views
Mar 3, 2024
YouTube
Rushabh Solanki
6:18
What is Speculative Sampling? | Boosting LLM inference speed
4K views
Nov 20, 2024
YouTube
AssemblyAI
2:42
AI Explained: Speculative decoding with vLLM
1.1K views
2 months ago
YouTube
Red Hat
5:04
Speculative Decoding: 2-3x Faster LLMs for Free
1 views
1 month ago
YouTube
The AI Century
7:40
Speculative Decoding: 3× Faster LLM Inference with Zero Quality L
…
709 views
4 months ago
YouTube
Tales Of Tensors
40:19
Speculation is all you need: Intro to Speculative Decoding for High Per
…
753 views
2 months ago
YouTube
Modal
0:25
Speculative execution for LLMs is an excellent inference-time optimi
…
1.2M views
Aug 31, 2023
x.com
Andrej Karpathy
9:39
Faster LLMs: Accelerate Inference with Speculative Decoding
22.1K views
11 months ago
YouTube
IBM Technology
14:37
Understanding Speculative Decoding: Boosting LLM Efficienc
…
470 views
Apr 6, 2025
YouTube
MLWorks
7:06
The Secret to Faster LLMs: How Speculative Decoding Works
7 views
5 months ago
YouTube
Zaharah
12:45
Speculative Decoding & Inference Speed — 2-3x Faster LLMs With Z
…
1 week ago
YouTube
Jeff Heidelberger
12:46
Speculative Decoding: When Two LLMs are Faster than One
32.9K views
Oct 12, 2023
YouTube
Efficient NLP
24:17
Fast Inference from Transformers via Speculative Decoding
1.3K views
Sep 12, 2023
YouTube
Arxiv Papers
17:56
Behind the Stack, Ep 11 - Speculative Decoding
70 views
6 months ago
YouTube
Doubleword
SwiftSpec: Disaggregated Speculative Decoding and Fused
…
1 month ago
acm.org
0:41
Speculative Decoding in AI & LLMs
1.9K views
2 months ago
YouTube
Hareesh Rajendran
3:42
AdaSPEC: Selective KD for Faster LLM Spec Decoding
6 views
5 months ago
YouTube
AI Research Roundup
6:53
How Speculative Decoding Makes LLMs 2.5x Faster (The Secret to F
…
159 views
8 months ago
YouTube
FranksWorld of AI