AstrAI/scripts
ViperEkura 3ab4f237e5 refactor: 重构训练后端为 Executor 模式
- backend.py → executor.py,BaseTrainingBackend → BaseExecutor
- 新增 NoneExecutor(单卡)和 DDPExecutor(DDP,world_size=1 自动降级)
- 新增 GradientState 分离梯度同步状态,AccumOptimizer/AccumScheduler 包裹拦截
- 新增 astrai/protocols.py:OptimizerProtocol/SchedulerProtocol 结构子类型
- TrainContext.backend → executor,TrainConfig 移除 parallel_wrapper/state_dict_fn,新增 parallel_mode/executor_kwargs
- 训练循环用 accumulate() 包裹,on_optimizer_step 命名约定=gate
- scripts/tools/train.py 移除 ddp_wrap/prepare_checkpoint,新增 --parallel_mode
2026-05-24 20:35:44 +08:00
..
demo refactor: generate_ar 改用流式输出并去除冗余注释 2026-05-17 10:23:42 +08:00
tools refactor: 重构训练后端为 Executor 模式 2026-05-24 20:35:44 +08:00
docker.sh fix: docker-compose UID/GID 添加默认值,修复 docker.sh logs 命令 2026-05-18 14:24:00 +08:00
pre_commit.sh ci: 优化 GitHub Actions 工作流 2026-04-05 22:40:16 +08:00