ViperEkura 283bcaf2ff fix: fix missing/duplicate CLI arguments, device_ids out-of-range indexing, inconsistent generate argument name, scheduler timing, non-streaming truncation, and other bugs
- train.py: add the missing --batch_size and --grpo_clip_eps; remove 3 duplicate --group_size definitions
- generate.py: rename --model_dir to --param_path to match the README
- automodel.py: add a strict parameter to from_pretrained (default True)
- parallel/setup.py: fix device_ids index out of range
- train_callback.py: move scheduler.step() to on_step_end
- test_train_strategy.py: add the missing optimizer.step() in the test
- engine.py: non-streaming mode now loops until all tasks complete; add remove_task cleanup
- scheduler.py: add a _pages_freed flag to Task to prevent double free
- trainer.py: clamp accumulation_steps to 1 when it is 0
- tokenizer.py: add a _tokenizer is None check to save_pretrained
- benchmark.py: fix outdated ModelConfig import path
- inference/__init__.py: fix stale docstring
2026-05-09 14:36:42 +08:00
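.github ci: add Docker image auto-build workflow 2026-04-10 13:09:58 +08:00
assets feat: GRPO CLI integration + on-policy, parameterize top_k in the OpenAI API, add training parameter table 2026-05-09 12:22:33 +08:00
astrai fix: fix missing/duplicate CLI arguments, device_ids out-of-range indexing, inconsistent generate argument name, scheduler timing, non-streaming truncation, and other bugs 2026-05-09 14:36:42 +08:00
scripts fix: fix missing/duplicate CLI arguments, device_ids out-of-range indexing, inconsistent generate argument name, scheduler timing, non-streaming truncation, and other bugs 2026-05-09 14:36:42 +08:00
tests fix: fix missing/duplicate CLI arguments, device_ids out-of-range indexing, inconsistent generate argument name, scheduler timing, non-streaming truncation, and other bugs 2026-05-09 14:36:42 +08:00
.dockerignore build: update the Docker build process 2026-04-10 11:25:00 +08:00
.gitattributes ci: optimize GitHub Actions workflows 2026-04-05 22:40:16 +08:00
.gitignore feat: add one-command Docker Compose deployment with GPU/CPU dual-mode support 2026-05-09 11:57:46 +08:00
CONTRIBUTING.md docs: improve documentation structure and add GitHub templates 2026-03-31 10:00:49 +08:00
Dockerfile build: update Docker configuration 2026-04-10 12:53:08 +08:00
LICENSE Change license from Apache 2.0 to GPL v3.0 2026-02-22 21:20:34 +08:00
README.md feat: GRPO CLI integration + on-policy, parameterize top_k in the OpenAI API, add training parameter table 2026-05-09 12:22:33 +08:00
docker-compose.yml feat: add one-command Docker Compose deployment with GPU/CPU dual-mode support 2026-05-09 11:57:46 +08:00
pyproject.toml fix: fix factory pattern issue and add chat-template setting 2026-04-04 12:05:05 +08:00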

README.md


A lightweight Transformer training & inference framework





English

Features

  • 🚀 High Performance: Optimized for both training and inference with efficient parallelization.
  • 🔧 Flexible: Supports seq/sft/dpo/grpo training and customizable model architectures.
  • 💡 Easy to Use: Simple API with comprehensive examples and demos.
  • 📦 Lightweight: Minimal dependencies, easy to deploy.
  • 🔬 Research-Friendly: Modular design, easy to experiment with new ideas.
  • 🤗 HuggingFace Integration: Compatible with HuggingFace models and datasets.
  • 🔌 Dual API Compatibility: Supports both OpenAI and Anthropic chat completion APIs out of the box.

Quick Start

Installation

git clone https://github.com/ViperEkura/AstrAI.git
cd AstrAI
pip install -e .

For development dependencies:

pip install -e ".[dev]"

Train a Model

python scripts/tools/train.py --train_type=seq --data_root_path=/path/to/dataset --param_path=/path/to/model
| Parameter | Description | Default |
| --- | --- | --- |
| --train_type | Training type (seq, sft, dpo, grpo) | required |
| --data_root_path | Dataset root directory | required |
| --param_path | Model / checkpoint path | required |
| --n_epoch | Training epochs | 1 |
| --batch_size | Batch size | 1 |
| --accumulation_steps | Gradient accumulation steps | 1 |
| --warmup_steps | LR warmup steps | 1000 |
| --max_lr | Peak learning rate (cosine decay) | 3e-4 |
| --max_grad_norm | Max gradient norm for clipping | 1.0 |
| --adamw_beta1 | AdamW beta1 | 0.9 |
| --adamw_beta2 | AdamW beta2 | 0.95 |
| --adamw_weight_decay | AdamW weight decay | 0.01 |
| --random_seed | Random seed | 3407 |
| --num_workers | DataLoader workers | 4 |
| --window_size | Max input sequence length | auto |
| --stride | Sequence stride | auto |
| --label_smoothing | Label smoothing for cross entropy | 0.1 |
| --dpo_beta | DPO beta | 0.1 |
| --grpo_clip_eps | GRPO clip epsilon | 0.2 |
| --grpo_kl_coef | GRPO KL penalty coefficient | 0.01 |
| --group_size | GRPO group size | 4 |
| --grpo_sync_interval | GRPO ref model sync interval (steps) | 200 |
| --ckpt_interval | Checkpoint interval (iters) | 5000 |
| --ckpt_dir | Checkpoint directory | checkpoint |
| --start_epoch | Start epoch (for resume) | 0 |
| --start_batch | Start batch (for resume) | 0 |
| --nprocs | Number of GPUs | 1 |
| --device_type | Device type | cuda |

Full reference at Parameter Guide.
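
The parameter list names --warmup_steps and describes --max_lr as the peak of a cosine decay, but it does not spell out the exact curve. As a rough illustration only, the sketch below assumes a linear warmup to max_lr followed by a cosine decay to zero. The function name lr_at_step, the total_steps argument, and the zero floor are assumptions made for this example and are not taken from the AstrAI scheduler.

```python
import math


def lr_at_step(step: int, total_steps: int, max_lr: float = 3e-4,
               warmup_steps: int = 1000) -> float:
    """Illustrative schedule: linear warmup, then cosine decay to zero.

    Assumption: this is NOT read from AstrAI's scheduler; it only shows the
    general shape suggested by --warmup_steps and --max_lr.
    """
    if step < warmup_steps:
        # Ramp linearly from max_lr / warmup_steps up to max_lr.
        return max_lr * (step + 1) / warmup_steps
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return 0.5 * max_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the learning rate at a few points of a hypothetical 10000-step run.
    for s in (0, 999, 1000, 5500, 10000):
        print(s, lr_at_step(s, total_steps=10000))
```

With the defaults from the table, the rate climbs during the first 1000 steps, reaches 3e-4 at the end of warmup, and falls to about half of that midway through the decay phase.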

Generate Text

python scripts/tools/generate.py --param_path=/path/to/param_path

Docker

Build and run with Docker (recommended for GPU environments):

# Build image
docker build -t astrai:latest .

# Run with GPU support
docker run --gpus all -it astrai:latest

# Run with specific GPUs
docker run --gpus '"device=0,1"' -it astrai:latest

# Run inference server
docker run --gpus all -p 8000:8000 astrai:latest \
  python -m scripts.tools.server --port 8000 --device cuda

# Run with volume mount for data
docker run --gpus all -v /path/to/data:/data -it astrai:latest

# Docker Compose (GPU, default)
docker compose up -d

# Docker Compose (CPU only)
docker compose --profile cpu up -d

Note: --gpus all is required for CUDA support. Without it, torch.cuda.is_available() will return False.
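
To confirm from inside the container whether the GPU is actually visible, a small check like the one below can help. It is a minimal sketch: pick_device is a hypothetical helper name, not part of AstrAI, and it falls back to "cpu" when PyTorch is not installed or CUDA is unavailable.

```python
def pick_device() -> str:
    """Return "cuda" if PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(pick_device())
```

If this prints cpu inside a container that was expected to use the GPU, the --gpus flag is the first thing to check.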

Start HTTP Server

Start the inference server, which exposes OpenAI- and Anthropic-compatible HTTP APIs:

python -m scripts.tools.server --port 8000 --device cuda

Make requests:

# OpenAI-compatible
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 512
  }'

# OpenAI-compatible streaming
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Tell a story"}],
    "stream": true,
    "max_tokens": 500
  }'

# Anthropic-compatible
curl -X POST http://localhost:8000/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "astrai",
    "system": "You are a helpful assistant.",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 512
  }'

# Anthropic-compatible streaming with stop sequences
curl -X POST http://localhost:8000/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "astrai",
    "messages": [{"role": "user", "content": "Write a story"}],
    "max_tokens": 500,
    "stream": true,
    "stop_sequences": ["The end"]
  }'

# Health check
curl http://localhost:8000/health

Demo

Check out the demos in the scripts/demo/ folder:

# Download preprocessed data (required before running demos)
python scripts/demo/download.py

# Interactive streaming chat
python scripts/demo/stream_chat.py

# Batch generation
python scripts/demo/generate_batch.py

# Autoregressive generation
python scripts/demo/generate_ar.py

Watch a video walkthrough on bilibili.

Documentation

| Document | Description |
| --- | --- |
| Parameter Guide | Training & inference parameters |
| Design Document | Framework architecture & module design |
| Data Flow | Data processing pipeline details |
| Model Introduction | Model architecture & technical details |

Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

  1. Fork the repository.
  2. Create a feature branch.
  3. Commit your changes.
  4. Open a Pull Request.

For major changes, please open an issue first to discuss what you would like to change.

Community

License

This project is licensed under the GPL-3.0 License.


A lightweight Transformer framework designed for both high performance and ease of use.