ViperEkura
0 Followers
·
0 Following
Joined on
2026-04-02
Repositories
8
Public Activity
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-17 20:30:52 +08:00
d0e3464663
docs: fix class and field names in the docs that did not match the code
2c2697390d
feat: add GradientCheckpointingCallback
7621f05d3f
docs: change AdamW beta defaults to (0.9, 0.95)
Compare 3 commits »
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-17 16:45:35 +08:00
10ebd7211f
feat: add Muon optimizer
42a391f0fb
feat: add a validation loop to training
97c7ac0f4f
refactor: rename Transformer to AutoRegressiveLM and add EmbeddingEncoder
8f1b32f2b6
fix: remove redundant request parameter and make the tokenizer more robust
c241a5dcef
refactor: streamline parallel training configuration and launch management
Compare 5 commits »
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-17 12:04:04 +08:00
44dab27fdc
feat: validate required fields when loading datasets
a44fd22a99
fix: fix parameter passing issues in training and model code
8a11a7d444
fix: fix two parameter passing issues in the training script
1d54491809
refactor: replace the uniform normal_(0.006) with recursive per-submodule init
Compare 4 commits »
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-17 10:31:56 +08:00
ad9f4d9cf6
refactor: switch generate_ar to streaming output and remove redundant comments
e1638a7ade
fix: correct AdamW hyperparameter defaults and documentation examples
f91bfee33e
refactor: unify Config serialization under the BaseConfig base class
d7a7f570ed
refactor: change the training loop to two nested iterations and unify parameter naming
7dea929788
refactor: save checkpoints as separate .pt files in the HF style, with a callback handling restore
Compare 10 commits »
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-15 23:39:25 +08:00
3d12a03909
docs: split the documentation and add the classes and relationship lines missing from the class diagram
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-15 22:56:13 +08:00
c169659611
docs: fix the class diagram, data flow, parameter docs, and contributing guide under assets/docs/
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-15 21:37:58 +08:00
e12f1a7ee5
feat: BaseModelConfig + DeepSeekMoE + factory pattern replacing if/else
ef25efffa2
refactor: split module.py into a components subpackage
19532440b4
chore: bump version to 1.3.5
Compare 3 commits »
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-15 18:11:08 +08:00
9096e413c3
refactor: RotaryEmbedding merges cos/sin into a single complex cache
9d5e9fa6c4
perf: add gradient_as_bucket_view/static_graph/broadcast_buffers to DDP, use fused AdamW
08dde46778
fix: correct the step/backward order in the training loop, restructure into three nested loops
Compare 3 commits »
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-14 21:38:17 +08:00
513f1f7826
perf: switch waiting_queue to deque, reducing pull_candidates from O(n²) to O(1)
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-14 21:31:43 +08:00
e3382f6bb5
fix: fix multiple correctness and concurrency issues in the inference engine's batch decode
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-14 21:27:36 +08:00
29b5717a38
fix: fix multiple correctness and concurrency issues in the inference engine's batch decode
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-14 21:00:44 +08:00
f0339022c1
fix: add chat template and system prompt to the batch inference example
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-14 20:32:13 +08:00
d8da2cf17c
docs: fix class names, method signatures, and module ownership in the docs that did not match the source
205b40bd28
refactor: rework the cache and inference parameter system, separating storage from allocation
18fe6e9339
refactor: eliminate several duplicated patterns, unify factories and parameter passing
2196c34c52
refactor: restructure the inference module architecture, introduce design patterns, and group files
466c2e1efd
fix: in-place write after expand in process_attention_mask caused an aliasing error
Compare 18 commits »
ViperEkura
pushed to
main
at
ViperEkura/postgraduate-prep
2026-05-14 15:40:39 +08:00
89a3b16ef5
feat: integrals, universal (Weierstrass) substitution: add discriminant analysis (a²-b² determines the arctan/ln form)
2fdf8e35ab
feat: add mistake-log problem: integral of 1/(1+√((1+x)/x)) (rationalize the denominator + complete the square)
f3dfb5273d
feat: add mistake-log problem: integral of 1/(a+b cosx) (universal substitution + case analysis by discriminant)
900e6c3e62
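refactor: align the study plan with the Zhang Yu and Wangdao courses, break it down by subtopic, aiming to finish the first round by the end of June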
Compare 4 commits »
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-11 17:52:52 +08:00
38e18fdfd3
refactor: PagedCache as a Facade, extract PagePool/PrefixCache/TaskTable
4753958f92
refactor: move page state into PagedCache, make Task a pure domain object
73d6cc0f26
refactor: strip page management out of TaskManager, move STOP to task.py
317ed90bac
refactor: split scheduler into TaskManager + Executor
Compare 4 commits »
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-10 21:07:53 +08:00
951df8155c
perf: vectorize gather
a58fab8d6e
fix: max_seq_len check now sends STOP only when the prompt exceeds the limit; the excess in max_tokens is clamped
a3c8296135
fix: out-of-bounds crash on page cache allocation failure + terminate when the length limit is exceeded
c95ace41aa
fix: attention mask too short during prefill caused expand to crash
Compare 4 commits »
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-10 18:25:42 +08:00
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-10 18:23:33 +08:00
a3bde30fb1
feat: serving infrastructure: bounded queue/timeouts/graceful shutdown/metrics
3da428e0e4
perf: persistent prefix cache for PagedCache + LRU eviction
133a9de98f
feat: _generate_streaming supports batch mode
Compare 3 commits »
ViperEkura
pushed to
main
at
ViperEkura/postgraduate-prep
2026-05-10 17:31:11 +08:00
6335d21418
feat: add miscellaneous notes (personal findings and summaries of scattered patterns)
ViperEkura
pushed to
main
at
ViperEkura/AstrAI
2026-05-10 17:25:42 +08:00
523eacf5fe
release: v1.3.4
cffedaad5e
perf: eliminate CPU busy-waiting in non-streaming inference and reduce redundant GPU tensor allocation during decode
Compare 2 commits »