Module 2: The Encoder (BERT) vs. The Decoder (GPT)

About this title

Shay breaks down the encoder vs decoder split in transformers: encoders (BERT) read the full text with bidirectional attention to understand meaning, while decoders (GPT) generate text one token at a time using causal attention.
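To make the two attention patterns concrete, here is a minimal sketch in plain Python. It is not taken from the episode; the helper names `attention_mask` and `show` are made up for illustration. It builds the "who may look at whom" grid for a four-token sequence: every position sees every other position in the bidirectional case, and each position sees only itself and earlier positions in the causal case.

```python
def attention_mask(seq_len, causal):
    """Return a seq_len x seq_len grid; mask[i][j] is True if position i may attend to position j."""
    return [
        [(j <= i) if causal else True for j in range(seq_len)]
        for i in range(seq_len)
    ]


def show(mask):
    """Render the grid as text: 'x' means attention is allowed, '.' means it is blocked."""
    return "\n".join(
        " ".join("x" if allowed else "." for allowed in row) for row in mask
    )


# Encoder-style (BERT-like): every token attends to the full sequence.
bidirectional = attention_mask(4, causal=False)

# Decoder-style (GPT-like): each token attends only to itself and earlier tokens.
causal = attention_mask(4, causal=True)

print("bidirectional:")
print(show(bidirectional))
print("causal:")
print(show(causal))
```

In a real model the blocked entries are typically set to a very large negative score before the softmax, so they receive effectively zero attention weight.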

She ties the architecture to training (masked-word prediction vs next-token prediction), explains why decoder-only models dominate today (they can both interpret prompts and generate efficiently with KV caching), and previews the next episode on the MLP layer, where most learned knowledge lives.
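The KV caching point can be illustrated with a toy sketch, again in plain Python and not drawn from the episode. The assumptions are loud ones: `project` is a hypothetical stand-in for the learned key and value projections (a real model uses weight matrices), there is a single attention head, and the token vectors are invented. The idea it shows is that with causal attention the keys and values of earlier tokens never change, so a decoder can store them and project only the newest token at each step.

```python
import math


def project(vec, scale):
    # Hypothetical stand-in for a learned linear projection (assumption for illustration).
    return [scale * x for x in vec]


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    norm = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / norm for key in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]


def decode(tokens, use_cache):
    """Run causal attention step by step; count how many key/value projections are computed."""
    cache_k, cache_v = [], []
    projections = 0
    outputs = []
    for t, tok in enumerate(tokens):
        if use_cache:
            # Project only the newest token and append it to the cache.
            cache_k.append(project(tok, 0.5))
            cache_v.append(project(tok, 2.0))
            projections += 1
            keys, values = cache_k, cache_v
        else:
            # Re-project the entire prefix at every step.
            prefix = tokens[: t + 1]
            keys = [project(p, 0.5) for p in prefix]
            values = [project(p, 2.0) for p in prefix]
            projections += len(prefix)
        outputs.append(attend(tok, keys, values))
    return outputs, projections


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]  # made-up token vectors
slow_out, slow_count = decode(tokens, use_cache=False)
fast_out, fast_count = decode(tokens, use_cache=True)
print(slow_count, fast_count)
```

Both runs produce the same attention outputs, but the uncached run's projection count grows quadratically with sequence length while the cached run's grows linearly. A bidirectional encoder cannot reuse work this way during generation, because adding a token would change what every earlier position attends to.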
