- Divide `total_steps` by `accumulation_steps` so that it matches how often `optimizer.step()` is called.
- Clamp `warmup_steps` with `min` so that `lr_decay_steps` cannot become negative.

| Name |
|---|
| .. |
| demo |
| tools |
| docker.sh |
| pre_commit.sh |
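
The two fixes in the commit note can be illustrated with a small sketch. This is not the repository's code. The function name `compute_schedule_steps`, its parameters, the use of floor division, and the choice of `total_steps` as the upper bound for the clamp are all assumptions made for illustration; the commit note only says that `total_steps` is divided by `accumulation_steps` and that `warmup_steps` is truncated with `min`.

```python
def compute_schedule_steps(batches_per_epoch, num_epochs,
                           accumulation_steps, warmup_steps):
    """Hypothetical helper: derive scheduler step counts from loop sizes."""
    total_batches = batches_per_epoch * num_epochs

    # With gradient accumulation, optimizer.step() runs once every
    # `accumulation_steps` batches, so the scheduler sees fewer steps
    # than there are batches. Floor division is an assumption here.
    total_steps = max(1, total_batches // accumulation_steps)

    # Clamp warmup so it can never exceed the total number of steps
    # (assumed bound); otherwise the decay length below would be negative.
    warmup_steps = min(warmup_steps, total_steps)

    lr_decay_steps = total_steps - warmup_steps
    return total_steps, warmup_steps, lr_decay_steps


if __name__ == "__main__":
    # 300 batches with accumulation of 4 gives 75 optimizer steps.
    print(compute_schedule_steps(100, 3, 4, 10))
    # A warmup longer than the whole run is clamped instead of going negative.
    print(compute_schedule_steps(100, 3, 4, 500))
```

Without the division, a scheduler configured for 300 steps would only ever be advanced 75 times in this example, so the learning rate would never reach the end of its decay. Without the clamp, a configured warmup of 500 would leave a decay length of 75 - 500 = -425.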