Mirror of https://github.com/rasbt/LLMs-from-scratch.git (synced 2024-11-25 16:22:50 +08:00, commit 1183fd7837)
# Chapter 3: Coding Attention Mechanisms

## Main Chapter Code

- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code.

## Bonus Materials

- [02_bonus_efficient-multihead-attention](02_bonus_efficient-multihead-attention) implements and compares different implementation variants of multi-head attention
- [03_understanding-buffers](03_understanding-buffers) explains the idea behind PyTorch buffers, which are used to implement the causal attention mechanism in chapter 3
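To illustrate the role PyTorch buffers play in causal attention, here is a minimal single-head sketch. The class and parameter names (`CausalSelfAttention`, `d_in`, `d_out`, `context_length`) are illustrative and not necessarily those used in the chapter; the key point is that `register_buffer` stores the causal mask as non-trainable module state that moves with the model across devices and is saved in its `state_dict`:

```python
import torch
import torch.nn as nn


class CausalSelfAttention(nn.Module):
    # Illustrative single-head causal attention (names are hypothetical)
    def __init__(self, d_in, d_out, context_length):
        super().__init__()
        self.W_query = nn.Linear(d_in, d_out, bias=False)
        self.W_key = nn.Linear(d_in, d_out, bias=False)
        self.W_value = nn.Linear(d_in, d_out, bias=False)
        # The mask is registered as a buffer: it is not a trainable
        # parameter, but it follows the module to GPU/CPU with .to()
        # and is included in state_dict() for saving/loading
        self.register_buffer(
            "mask",
            torch.triu(torch.ones(context_length, context_length), diagonal=1),
        )

    def forward(self, x):
        b, num_tokens, _ = x.shape
        queries = self.W_query(x)
        keys = self.W_key(x)
        values = self.W_value(x)
        attn_scores = queries @ keys.transpose(1, 2)
        # Mask out future positions so each token attends only to the past
        attn_scores.masked_fill_(
            self.mask.bool()[:num_tokens, :num_tokens], float("-inf")
        )
        attn_weights = torch.softmax(attn_scores / keys.shape[-1] ** 0.5, dim=-1)
        return attn_weights @ values
```

Because the mask lives in a buffer rather than a plain attribute, calling `model.to("cuda")` moves it along with the weights, which avoids device-mismatch errors in the forward pass.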