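- setup_parallel takes a local_rank parameter and no longer derives it from environment variables - TorchrunStrategy reads LOCAL_RANK from the environment; LocalStrategy uses rank - _detect_launcher() uses tiered detection in place of the inline RANK check - _run_single_rank is the unified entry point, removing the duplication between _run_single and _run_multi - Graceful shutdown: except BaseException terminates the child processes and re-joins them - The gradient_checkpointing_modules check is extracted into an external variable |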
| | | |
|---|---|---|
| demo | ||
| tools | ||
| docker.sh | ||
| pre_commit.sh | ||