AI at Scale with Nico Martin from Hugging Face | Transformers.js, Tokenizers, On-Device Inference
About this title
Can you really run state-of-the-art machine learning models directly in the browser, with no server, no API calls, and full privacy by default?
In this episode, Nico Martin, Open Source Machine Learning Engineer at Hugging Face and Google Developer Expert in AI and Web Technologies, walks through how Transformers.js makes on-device AI a reality. Nico's journey is anything but conventional. He started as a ski and windsurf instructor, taught himself web development on the side, spent years as a freelancer (including five at a bank building e-banking front ends), and recently landed what he calls his dream job at Hugging Face.
We unpack what Hugging Face actually is (the GitHub for machine learning), how Transformers.js brings the Python Transformers API to the browser, and the real engineering challenges of running models on whatever hardware your users happen to have. Nico explains quantization, ONNX as the standard for portable model architectures, the role of tokenizers, how text becomes tensors, and why WebGPU matters for running larger models client-side.
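To make the quantization idea concrete: a minimal, purely illustrative sketch of 4-bit linear quantization, mapping each float weight to one of 16 levels between the tensor's min and max. Real Q4 schemes used for on-device models quantize per block with scales and zero-points; the function names here are invented for illustration.

```javascript
// Toy 4-bit linear quantization: each weight becomes a code in 0..15,
// so it fits in 4 bits instead of 16/32. Dequantizing recovers an
// approximation of the original value, within half a quantization step.
function quantize4bit(weights) {
  const min = Math.min(...weights);
  const max = Math.max(...weights);
  const scale = (max - min) / 15; // 16 levels => 4 bits per weight
  const codes = weights.map(w => Math.round((w - min) / scale));
  return { codes, min, scale };
}

function dequantize4bit({ codes, min, scale }) {
  return codes.map(c => min + c * scale);
}

const weights = [-0.52, 0.13, 0.98, -1.0, 0.4];
const q = quantize4bit(weights);       // codes are small integers, 0..15
const restored = dequantize4bit(q);    // close to the original floats
```

The trade-off Nico describes is exactly this: a 4x-8x smaller download at the cost of a small, bounded approximation error per weight.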
We also dig into the bigger picture: privacy-preserving AI, the difference between open weights and truly open source models, agents and MCP, and what front-end developers should actually learn to stay relevant in an AI-first world.
Key Topics:
- What Hugging Face is and the role of the Hub, Transformers, and Diffusers
- Transformers.js: bringing the Python Transformers API to JavaScript and the browser
- The biggest challenge of browser ML: running on unknown client hardware
- Quantization explained: compressing models from 16/32-bit floats down to 4-bit (Q4) weights
- ONNX and ONNX Runtime Web: the standard for portable model architectures
- Open weights vs open source models and why the distinction matters
- Tokenizers, token IDs, and why each model needs its own tokenizer
- From text to tensors: pre-processing, inference, and post-processing
- Text embeddings explained through a simple animal feature analogy
- WebGPU and what it unlocks for in-browser inference
- Agents, tool calling, MCP, and how context windows get consumed
- Advice for developers who want to break into AI and ML engineering
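The embedding analogy from the episode can be sketched in a few lines: if each animal is a vector of hand-picked features, cosine similarity measures how alike two animals are. Real text embeddings work the same way, just with hundreds of learned dimensions instead of three invented ones (the feature names below are assumptions for illustration, not from the episode).

```javascript
// Toy "embeddings": each animal is a vector of made-up features,
// e.g. [furriness, size, lives-in-water].
const embeddings = {
  cat:     [0.9, 0.2, 0.0],
  dog:     [0.8, 0.4, 0.0],
  dolphin: [0.0, 0.6, 1.0],
};

// Cosine similarity: how closely two vectors point in the same
// direction, in [-1, 1]. Higher means more similar.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// In this feature space, cat is much closer to dog than to dolphin.
const catDog = cosine(embeddings.cat, embeddings.dog);
const catDolphin = cosine(embeddings.cat, embeddings.dolphin);
```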
🔗 FOLLOW NICO
💼 LinkedIn: https://www.linkedin.com/in/nicodotdev/
🐦 X/Twitter: https://twitter.com/nic_o_martin
🦋 Bluesky: https://bsky.app/profile/nico.dev
🐙 GitHub: https://github.com/nico-martin
🌐 Website: https://nico.dev
🎙️ FOLLOW & SUBSCRIBE
📸 Instagram: https://www.instagram.com/senorsatscale/
📸 Instagram: https://www.instagram.com/neciudev
🎙 Podcast URL: https://neciudan.dev/senors-at-scale
📬 Newsletter: https://neciudan.dev/subscribe
💼 LinkedIn: https://www.linkedin.com/in/neciudan
💼 LinkedIn: https://www.linkedin.com/company/senors-scale/
📚 ADDITIONAL RESOURCES
- Transformers.js: https://huggingface.co/docs/transformers.js
- Hugging Face: https://huggingface.co
- ONNX: https://onnx.ai
- ONNX Runtime: https://onnxruntime.ai
- WebGPU: https://www.w3.org/TR/webgpu/
- Utopia for Realists by Rutger Bregman
#MachineLearning #AI #HuggingFace #TransformersJS #WebML #OnDeviceAI #WebGPU #ONNX #JavaScript #Frontend #WebDev #SenorsAtScale #OpenSource
💬 Would you trust on-device AI over cloud-based models for sensitive data? Share your thoughts in the comments!