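- get_rotary_emb keeps cos/sin stored as real tensors; forward combines them into a complex tensor
- apply_rotary_emb uses view_as_complex and a single complex multiplication in place of repeated view/mul/stack operations
- Remove the Tuple[Tensor, Tensor] type from GQA, MLA, and DecoderBlock
- Decoding time drops from 4.24s to 3.49s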
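The equivalence behind the apply_rotary_emb change can be sketched without PyTorch. This is a minimal illustration, not the repository's code: the function names, the interleaved (even, odd) pairing, and the `base=10000.0` default are assumptions made for the example. It shows that rotating each pair with separate cos/sin terms gives the same result as treating the pair as one complex number and multiplying once, which is what `torch.view_as_complex` makes possible on tensors.

```python
import cmath
import math


def rope_angles(position, dim, base=10000.0):
    """Rotation angles for one position (assumed frequency schedule)."""
    return [position / base ** (2 * i / dim) for i in range(dim // 2)]


def rotate_pairs_real(x, angles):
    """Rotate each (even, odd) pair using explicit cos/sin arithmetic."""
    out = []
    for i, theta in enumerate(angles):
        a, b = x[2 * i], x[2 * i + 1]
        c, s = math.cos(theta), math.sin(theta)
        out.extend([a * c - b * s, a * s + b * c])
    return out


def rotate_pairs_complex(x, angles):
    """Same rotation: view each pair as a complex number, multiply once."""
    out = []
    for i, theta in enumerate(angles):
        z = complex(x[2 * i], x[2 * i + 1]) * cmath.exp(1j * theta)
        out.extend([z.real, z.imag])
    return out


x = [1.0, 0.0, 0.5, -0.5]
angles = rope_angles(position=3, dim=4)
real_out = rotate_pairs_real(x, angles)
complex_out = rotate_pairs_complex(x, angles)
```

Both paths compute the same values up to floating-point rounding; the complex form simply replaces four multiplies and a re-stack per pair with one complex product.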
| Name |
|---|
| config |
| dataset |
| inference |
| model |
| parallel |
| tokenize |
| trainer |
| __init__.py |
| factory.py |
| serialization.py |