Mirror of https://github.com/rasbt/LLMs-from-scratch.git, synced 2024-11-25 16:22:50 +08:00
updated RoPE statement (#423)
Some checks failed
Code tests (Linux) / test (push) Has been cancelled
Code tests (macOS) / test (push) Has been cancelled
Test PyTorch 2.0 and 2.5 / test (2.0.1) (push) Has been cancelled
Test PyTorch 2.0 and 2.5 / test (2.5.0) (push) Has been cancelled
Code tests (Windows) / test (push) Has been cancelled
Check hyperlinks / test (push) Has been cancelled
Spell Check / spellcheck (push) Has been cancelled
PEP8 Style checks / flake8 (push) Has been cancelled
* updated RoPE statement
* updated .gitignore
* Update ch05/07_gpt_to_llama/converting-gpt-to-llama2.ipynb

---------

Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
This commit is contained in:
parent b5f2aa3500
commit 81eed9afe2
.gitignore (vendored): 2 changes

@@ -44,6 +44,8 @@ ch05/07_gpt_to_llama/Llama-3.1-8B
ch05/07_gpt_to_llama/Llama-3.1-8B-Instruct
ch05/07_gpt_to_llama/Llama-3.2-1B
ch05/07_gpt_to_llama/Llama-3.2-1B-Instruct
ch05/07_gpt_to_llama/Llama-3.2-3B
ch05/07_gpt_to_llama/Llama-3.2-3B-Instruct

ch06/01_main-chapter-code/gpt2
ch06/02_bonus_additional-experiments/gpt2
ch05/07_gpt_to_llama/converting-gpt-to-llama2.ipynb

@@ -409,7 +409,7 @@
 "self.pos_emb = nn.Embedding(cfg[\"context_length\"], cfg[\"emb_dim\"])\n",
 "```\n",
 "\n",
-"- Instead of these absolute positional embeddings, Llama uses relative positional embeddings, called rotary position embeddings (RoPE for short)\n",
+"- Unlike traditional absolute positional embeddings, Llama uses rotary position embeddings (RoPE), which enable it to capture both absolute and relative positional information simultaneously\n",
 "- The reference paper for RoPE is [RoFormer: Enhanced Transformer with Rotary Position Embedding (2021)](https://arxiv.org/abs/2104.09864)"
 ]
},
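For context on the updated statement, here is a minimal, illustrative sketch of rotary position embeddings, not the repository's implementation; the function name rope_sketch and the toy tensor shapes are assumptions for demonstration only. It rotates pairs of query/key dimensions by position-dependent angles, so the positional part of the attention dot product depends only on the relative offset between tokens, which is how RoPE carries both absolute and relative positional information.

```python
import torch

def rope_sketch(x, theta_base=10_000.0):
    # x: (batch, heads, seq_len, head_dim); head_dim must be even.
    _, _, seq_len, head_dim = x.shape
    half = head_dim // 2
    # Per-pair inverse frequencies, as in the RoFormer paper: base^(-2i/head_dim)
    inv_freq = 1.0 / (theta_base ** (torch.arange(half, dtype=torch.float32) / half))
    positions = torch.arange(seq_len, dtype=torch.float32)
    angles = positions[:, None] * inv_freq[None, :]        # (seq_len, half)
    cos, sin = torch.cos(angles), torch.sin(angles)
    # Split the head dimension into two halves and apply a 2D rotation to each pair.
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

# Toy usage: rotate dummy queries for a 2-head attention layer.
q = torch.randn(1, 2, 6, 8)     # batch=1, heads=2, seq_len=6, head_dim=8
print(rope_sketch(q).shape)     # torch.Size([1, 2, 6, 8])
```

Note that this sketch pairs the first and second halves of the head dimension, a convention used by several common implementations; the RoFormer paper instead rotates adjacent interleaved pairs, and the repository's notebooks may follow a different convention.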