AstrAI

Commit Graph

Author	SHA1	Message	Date
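ViperEkura	48a53121ba	refactor: filter factory kwargs and clean up component parameters - BaseFactory.create() filters out surplus kwargs based on the __init__ signature - remove redundant **kwargs from GQA/MLA/MLP/DeepSeekMoE - unify the MLP/DeepSeekMoE parameter name as dim_ffn - add an explicit None check for max_seq_len in the scheduler - raise the default max_prompt_len to 2048	2026-05-16 16:47:41 +08:00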
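ViperEkura	e3382f6bb5	fix: resolve several correctness and concurrency issues in the inference engine's batch decode - scheduler: group decode tasks by exact next_pos instead of power-based bucketing, eliminating KV cache position mismatches - task: activate() now holds a lock while modifying active_tasks, eliminating a data race - engine: add a timeout to wait_completion to prevent a permanent deadlock when allocation fails - sample: vectorize TopKStrategy with a per-sample threshold so each task's top_k is respected - cache: handle -1 pages in Storage.write/gather with a mask to prevent data corruption - executor: replace the per-task prefill loop with a single tensor call	2026-05-14 21:31:39 +08:00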
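ViperEkura	205b40bd28	refactor: restructure the cache and inference parameter system, separating storage from allocation - merge GenerationRequest/GenerationParams and standardize on the max_tokens parameter name - split PagePool/PrefixCache into Allocator + PrefixCache + PagePool - extract KV storage into a standalone Storage class; rename PagedCache → KVCache and CacheView → KvcacheView - remove LRU in Allocator.inc_ref to prevent races; add a negative-page guard to Storage.write - add threading.Lock to Allocator/PrefixCache/TaskTable for thread safety - server.py: pass the app object to uvicorn.run to fix an import error - adapt benchmark.py to the new KVCache API	2026-05-14 20:05:08 +08:00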
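ViperEkura	2196c34c52	refactor: restructure the inference module architecture, introducing design patterns and grouping files - add a protocol.py protocol layer; the Template Method pattern removes 45% duplication between the streaming and non-streaming branches - SSEBuilder unifies SSE construction; StopChecker isolates stop_sequence detection - AnthropicHandler tracks already-emitted text, fixing duplicate deltas on stop - server.py routes shrink from about 100 lines to 3 - split into core/ (cache/executor/scheduler/task) and api/ (protocol/server) - keep the two-level external import path (from astrai.inference import Name) - delete all separator-line comments; code is grouped naturally by meaning	2026-05-14 17:42:37 +08:00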

4 Commits
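
Commit 48a53121ba describes a factory that filters kwargs against the target's __init__ signature. The sketch below shows one way such filtering could work using the standard library's inspect module. It is not the AstrAI implementation: filter_kwargs, create, and ToyMLP are hypothetical names chosen for illustration, and only the dim_ffn parameter name is taken from the commit message.

```python
import inspect


def filter_kwargs(cls, kwargs):
    """Return only the entries of kwargs that cls.__init__ accepts by name.

    Hypothetical helper; the real BaseFactory.create() may differ.
    """
    params = inspect.signature(cls.__init__).parameters
    # If __init__ declares **kwargs itself, every keyword is acceptable.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    accepted = {
        name
        for name, p in params.items()
        if name != "self" and p.kind in keyword_kinds
    }
    return {k: v for k, v in kwargs.items() if k in accepted}


class ToyMLP:
    """Stand-in component with no **kwargs in its constructor."""

    def __init__(self, dim, dim_ffn):
        self.dim = dim
        self.dim_ffn = dim_ffn


def create(cls, **kwargs):
    """Build cls, silently dropping keywords its __init__ does not take."""
    return cls(**filter_kwargs(cls, kwargs))


# n_heads is not a ToyMLP parameter, so it is dropped instead of raising TypeError.
layer = create(ToyMLP, dim=8, dim_ffn=32, n_heads=4)
```

Filtering at the factory lets each component declare an exact constructor signature, which is consistent with the same commit removing **kwargs from the individual components. The trade-off is that misspelled keywords are dropped silently unless the factory also logs or validates them.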