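- Swap the default values of adamw_beta1/adamw_beta2: beta1=0.95, beta2=0.99
- Change the label_smoothing default to 0.05
- Update the documentation examples consistently: train_type=pt, weight_decay=0.01
- Remove the outdated strategy default annotations from the documentation
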
| Name |
|---|
| benchmark.py |
| generate.py |
| perplexity.py |
| server.py |
| train.py |
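
As a rough illustration of the values named in the commit message, the sketch below collects them in one place. The `TrainDefaults` class and the `DOC_EXAMPLE` dictionary are hypothetical names introduced here for illustration only; how `train.py` actually declares or parses these options is not shown in this listing and may differ.

```python
from dataclasses import dataclass


@dataclass
class TrainDefaults:
    """Hypothetical container for the defaults listed in the commit message."""

    adamw_beta1: float = 0.95      # default after the swap
    adamw_beta2: float = 0.99      # default after the swap
    label_smoothing: float = 0.05  # new default


# Values used in the updated documentation examples. The commit message
# presents them as example settings, not necessarily as defaults.
DOC_EXAMPLE = {"train_type": "pt", "weight_decay": 0.01}

defaults = TrainDefaults()
# AdamW-style optimizers usually take the two betas as a pair.
betas = (defaults.adamw_beta1, defaults.adamw_beta2)
print(betas, defaults.label_smoothing, DOC_EXAMPLE)
```

Keeping the two betas in separate fields mirrors the option names in the commit message, while the tuple is the form most optimizer constructors expect.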