AstrAI

Commit Graph

Author	SHA1	Message	Date
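ViperEkura	2d5dc93b3d	fix: correct type annotations and unify CLI argument naming - AutoRegressiveLM.forward return type annotation -> Dict[str, Tensor] - EmbeddingEncoder: remove redundant automatic creation of position_ids - CLI scripts: model directory argument unified as --param_path	2026-05-27 20:49:44 +08:00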
ViperEkura	737585a32a	feat: add NTK-Aware RoPE scaling support - RotaryEmbedding accepts a rope_scaling config and automatically computes the scaled base - AutoRegressiveLMConfig and EncoderConfig gain a rope_scaling field	2026-05-25 21:22:07 +08:00
ViperEkura	97c7ac0f4f	refactor: rename Transformer to AutoRegressiveLM and add EmbeddingEncoder - AutoRegressiveLM registered name changed to autoregressive_lm - add EmbeddingEncoder with mean/cls/last pooling support - ModelConfig gains pooling_type / normalize_embeddings fields - imports, comments, and tests all updated to match	2026-05-17 15:29:20 +08:00
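ViperEkura	a44fd22a99	fix: fix parameter-passing issues in training and model code - pass state_dict_fn to CheckpointCallback, fixing lost key prefixes under multi-GPU DDP - MLA adds use_qk_norm support, so the parameter is no longer silently dropped - moe_topk_method renamed to topk_method for consistency - checkpoint callback moved to the front	2026-05-17 11:20:13 +08:00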
ViperEkura	1d54491809	refactor: replace the uniform normal_(0.006) with recursive submodule init - Embedding.reset_parameters: normal_(std=0.02) - Linear.reset_parameters: kaiming_uniform_ + uniform_ bias - Transformer._init_weights recursively calls each submodule's reset_parameters via apply - remove the global normal_(0.006) override so each module uses a more suitable distribution	2026-05-17 10:44:18 +08:00
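ViperEkura	0ba8c70ce1	fix: fix multiple MLA bugs and shrink test model parameters - fix MLA kv_b_proj output dimension and q_rope split offset - wire the MLA config through from ModelConfig to DecoderBlock - rope_theta config is no longer ignored; MLA uses qk_rope_head_dim - tie_weight uses is True so that None does not take effect implicitly - fix type annotations for norm_eps / rope base - shrink test model parameters (dim=8, head_dim=4) - add forward-pass tests for 6 architecture configs × 2 scenarios	2026-05-16 14:57:43 +08:00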
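ViperEkura	e12f1a7ee5	feat: BaseModelConfig + DeepSeekMoE + factory pattern replacing if/else - BaseModelConfig: exact field matching via fields() + type coercion + warnings on unknown keys - DeepSeekMoE: shared experts + routed experts + top-K gating - AttnFactory/FFNFactory: decorator-based registration, zero branches in DecoderBlock - config drives component selection via attn_type/ffn_type	2026-05-15 20:34:52 +08:00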
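ViperEkura	ef25efffa2	refactor: split module.py into a components subpackage - rope/linear/norm/embedding/mlp/attention/decoder_block each get their own file - dependencies are one-directional with no cycles - public interface unchanged; no external changes needed	2026-05-15 20:08:36 +08:00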
ViperEkura	205b40bd28	refactor: restructure the cache and inference parameter system, separating storage from allocation - merge GenerationRequest/GenerationParams and unify the max_tokens parameter name - split PagePool/PrefixCache into Allocator + PrefixCache + PagePool - extract KV storage into a standalone Storage class; PagedCache → KVCache, CacheView → KvcacheView - Allocator.inc_ref removes LRU to prevent races; Storage.write adds a guard against negative pages - Allocator/PrefixCache/TaskTable add threading.Lock for thread safety - server.py: uvicorn.run now receives the app object, fixing an import error - benchmark.py adapted to the new KVCache API	2026-05-14 20:05:08 +08:00
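ViperEkura	2196c34c52	refactor: restructure the inference module architecture, introducing design patterns and grouping files - add protocol.py as a protocol layer; the Template Method pattern removes 45% duplication between the streaming and non-streaming branches - SSEBuilder unifies SSE construction; StopChecker handles stop_sequence detection on its own - AnthropicHandler tracks emitted text, fixing duplicate deltas on stop - server.py routing reduced from about 100 lines to 3 - split into core/ (cache/executor/scheduler/task) and api/ (protocol/server) - externally, the two-level import path is preserved (from astrai.inference import Name) - remove all divider comments; code is grouped naturally by meaning	2026-05-14 17:42:37 +08:00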
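ViperEkura	466c2e1efd	fix: in-place write after expand in process_attention_mask raises an alias error - the view produced by pad.view.expand has multiple elements pointing to the same memory, so the attend &= write fails - changed to .expand().clone() to get independent memory before the in-place op	2026-05-14 16:30:31 +08:00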
ViperEkura	6d6ef99e66	perf: eliminate the position_ids GPU sync in PagedCache.write, speeding up decoding by 15% - CacheView.write derives start_pos from total_len - k.size(1), replacing position_ids[0,0].item() - remove the no-longer-used position_ids parameter from GQA/MLA/DecoderBlock - PagedCache.write parameter position_ids:Tensor → start_pos:int	2026-05-14 15:37:48 +08:00
ViperEkura	c0effc9f5b	refactor: positional encoding now uses position_ids [B,S], simplifying attention mask construction - RotaryEmbedding/CacheView accept position_ids instead of start_pos - process_attention_mask uses position_ids >= arange for per-position causal masking - position_ids=None is handled internally for training / no KV cache - remove redundant input_mask construction in executor/benchmark	2026-05-14 13:26:31 +08:00
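ViperEkura	a3c8296135	fix: out-of-bounds crash on page cache allocation failure + termination when length limit is exceeded - astrai/inference/scheduler.py: add_task adds a max_seq_len check and sends a STOP signal to terminate immediately when the limit is exceeded - astrai/inference/scheduler.py: _maybe_alloc_page returns bool; on alloc failure it marks the task ABORTED + sends STOP - astrai/inference/scheduler.py: _execute_decode filters out tasks whose allocation failed, avoiding page_table out-of-bounds access - astrai/inference/scheduler.py: _remove_finished_tasks cleans up ABORTED tasks and releases their pages - astrai/inference/scheduler.py: _execute_prefill input_mask now covers the full prompt_len - astrai/model/transformer.py: the seq_mask is None branch fills in the start_pos + seq_len columns	2026-05-10 20:14:38 +08:00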
ViperEkura	c95ace41aa	fix: attention mask too short during prefill causes an expand crash - astrai/inference/scheduler.py: prefill input_mask changed from [batch, seq_len] to [batch, prompt_len], covering all KV positions - astrai/model/transformer.py: the seq_mask is None branch fills in the start_pos + seq_len columns, avoiding an expand mismatch on a non-singleton dimension	2026-05-10 19:56:41 +08:00
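ViperEkura	30cc2d67a4	refactor: replace fixed slots with a paged KV cache; remove PrefixCache and related dead code - replace the fixed-slot KV cache with PagedCache + CacheView; attention layers index only indirectly through page_table - remove PrefixCache (radix tree) and all prefix cache hit/insert/release logic in the scheduler - remove unused functions: pin, version, free_count, _mark_seq_mask, and seq_mask allocation - fix write miscomputing chunks when offset goes negative during multi-page prefill - _make_page_table_tensor now builds the tensor once from a concatenated list instead of assigning element by element - clean up model interface parameters: kv_cache, slot_indices → paged_cache (CacheView) - trim docstrings to a single line; remove redundant section comments and old code - fix missing import pytest in test_scheduler_concurrency.py	2026-05-08 20:44:05 +08:00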
ViperEkura	b89f8436ea	refactor: push KV cache slot mapping down into the model's attention layer; remove _remap_kv and _writeback_kv	2026-05-06 20:01:22 +08:00
ViperEkura	3fee87897d	chore: fix typos	2026-04-06 09:28:16 +08:00
ViperEkura	fc278d17ab	feat: implement a dynamic model registration mechanism	2026-04-05 19:38:12 +08:00
ViperEkura	b531232a9b	style: switch to explicit imports	2026-04-04 16:02:49 +08:00
ViperEkura	0852b852f8	refactor: streamline parameter passing and clean up import style	2026-04-03 22:06:32 +08:00
ViperEkura	2e009cf59a	chore: update project name	2026-03-31 09:34:11 +08:00

22 Commits