#311 Stefano Ermon: Why Diffusion Language Models Will Define the Next Generation of LLMs
About this title
This episode is sponsored by AGNTCY. Unlock agents at scale with an open Internet of Agents.
Visit https://agntcy.org/ and add your support.
Most large language models today generate text one token at a time. That design choice creates a hard limit on speed, cost, and scalability.
In this episode of Eye on AI, Stefano Ermon breaks down diffusion language models and why a parallel, inference-first approach could define the next generation of LLMs. We explore how diffusion models differ from autoregressive systems, why inference efficiency matters more than training scale, and what this shift means for real-time AI applications like code generation, agents, and voice systems.
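The contrast described above can be sketched in a few lines of Python. This is a toy illustration only, not the API of any real diffusion or autoregressive model: ToyModel, predict_next, and denoise are invented stand-ins, and random sampling substitutes for real predictions. The point is the loop structure: autoregressive decoding makes one model call per generated token, while diffusion-style decoding runs a small, fixed number of denoising passes that update every position in parallel.

```python
import random

MASK = "<mask>"
VOCAB = ["the", "cat", "sat", "on", "a", "mat"]

class ToyModel:
    """Hypothetical stand-in for a real LM; both methods just sample a toy vocabulary."""
    def predict_next(self, tokens):
        return random.choice(VOCAB)  # one model call per generated token

    def denoise(self, prompt, tokens):
        # Refine every position at once; real diffusion LMs keep confident
        # tokens and re-predict uncertain ones over a few steps.
        return [random.choice(VOCAB) if t == MASK or random.random() < 0.3 else t
                for t in tokens]

def autoregressive_decode(model, prompt, n_tokens):
    """Sequential decoding: n_tokens model calls, one token appended per call."""
    out = list(prompt)
    for _ in range(n_tokens):
        out.append(model.predict_next(out))
    return out[len(prompt):]

def diffusion_decode(model, prompt, n_tokens, n_steps=4):
    """Parallel decoding: a fixed, small number of denoising passes, regardless of length."""
    out = [MASK] * n_tokens
    for _ in range(n_steps):
        out = model.denoise(prompt, out)
    return out

model = ToyModel()
print(autoregressive_decode(model, ["the"], 5))  # 5 sequential model calls
print(diffusion_decode(model, ["the"], 5))       # 4 parallel passes for 5 tokens
```

The cost difference is the episode's core argument: generating n tokens autoregressively takes n sequential passes, while the diffusion loop's pass count stays constant as the output grows.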
This conversation goes deep into AI architecture, model controllability, latency, cost trade-offs, and the future of generative intelligence as AI moves from demos to production-scale systems.
Stay Updated:
Craig Smith on X: https://x.com/craigss
Eye on A.I. on X: https://x.com/EyeOn_AI
(00:00) Autoregressive vs Diffusion LLMs
(02:12) Why Build Diffusion LLMs
(05:51) Context Window Limits
(08:39) How Diffusion Works
(11:58) Global vs Token Prediction
(17:19) Model Control and Safety
(19:48) Training and RLHF
(22:35) Evaluating Diffusion Models
(24:18) Diffusion LLM Competition
(30:09) Why Start With Code
(32:04) Enterprise Fine-Tuning
(33:16) Speed vs Accuracy Tradeoffs
(35:34) Diffusion vs Autoregressive Future
(38:18) Coding Workflows in Practice
(43:07) Voice and Real-Time Agents
(44:59) Reasoning Diffusion Models
(46:39) Multimodal AI Direction
(50:10) Handling Hallucinations