Go to file
ViperEkura 629e72385b fix : 修复存储层 bug,JSON 切换为 JSONL,补齐测试覆盖
- save_bin/load_bin: save_json/load_json 替换为直接 json.dump/json.load,修复致命 bug
- _normalize: 空 cum 列表 guard,防止 IndexError
- load_json: 改为仅支持 JSONL 逐行解析 (json.loads),移除 .json 支持
- detect_format: 只匹配 *.jsonl,不再匹配 *.json
- save_json: 输出扩展名改为 .jsonl
- GRPODataset.__getitem__: 补齐 .to(dtype=torch.long/bool) 与其他数据集一致
- load_bin: np.memmap mode='r+' 消除 PyTorch 不可写 tensor 警告
- 新增 16 个测试: bin roundtrip, mmap load, 空 key, JSONL 多行/文本, GRPO dtype/load, detect_format bin/jsonl, fetch multi-key/越界, json_to_bin 转换, DPO from JSONL, 显式 storage_type
2026-05-28 15:29:46 +08:00
.github docs: 修正 assets/docs/ 类图、数据流、参数文档及贡献指南 2026-05-15 22:54:41 +08:00
assets docs : 更新架构文档与 storage 注释,同步 Store 重构 2026-05-28 14:36:18 +08:00
astrai fix : 修复存储层 bug,JSON 切换为 JSONL,补齐测试覆盖 2026-05-28 15:29:46 +08:00
scripts fix : 修正类型标注与统一 CLI 参数命名 2026-05-27 20:49:44 +08:00
tests fix : 修复存储层 bug,JSON 切换为 JSONL,补齐测试覆盖 2026-05-28 15:29:46 +08:00
.dockerignore build: 修改docker 构建流程 2026-04-10 11:25:00 +08:00
.gitattributes ci: 优化 GitHub Actions 工作流 2026-04-05 22:40:16 +08:00
.gitignore feat: 新增 Docker Compose 一键部署,支持 GPU/CPU 双模式 2026-05-09 11:57:46 +08:00
CONTRIBUTING.md docs: 修正 assets/docs/ 类图、数据流、参数文档及贡献指南 2026-05-15 22:54:41 +08:00
Dockerfile fix: docker-compose UID/GID 添加默认值,修复 docker.sh logs 命令 2026-05-18 14:24:00 +08:00
LICENSE Change license from Apache 2.0 to GPL v3.0 2026-02-22 21:20:34 +08:00
README.md docs: 更新视频链接 2026-05-19 17:34:01 +08:00
docker-compose.yml fix: docker-compose UID/GID 添加默认值,修复 docker.sh logs 命令 2026-05-18 14:24:00 +08:00
pyproject.toml fix: 修复工厂模式问题并增加chat-template设置 2026-04-04 12:05:05 +08:00

README.md

Logo

A lightweight Transformer training & inference framework

python license release stars forks


📖 Table of Contents


English

Features

  • 🚀 High Performance: Optimized for both training and inference with efficient parallelization.
  • 🔧 Flexible: Support for seq/sft/dpo/grpo training, customizable model architectures.
  • 💡 Easy to Use: Simple API with comprehensive examples and demos.
  • 📦 Lightweight: Minimal dependencies, easy to deploy.
  • 🔬 ResearchFriendly: Modular design, easy to experiment with new ideas.
  • 🤗 HuggingFace-Style API: AutoModel/AutoTokenizer APIs inspired by HuggingFace for easy model and tokenizer loading.
  • 🔌 Dual API Compatibility: Supports both OpenAI and Anthropic chat completion APIs out of the box.

Quick Start

Installation

git clone https://github.com/ViperEkura/AstrAI.git
cd AstrAI
pip install -e .

For development dependencies:

pip install -e ".[dev]"

Download Pre-trained Model

Download pre-trained model weights (1B bilingual checkpoint) to params/:

python scripts/demo/download.py

Or download manually from HuggingFace into params/.

Train a Model

export CUDA_VISIBLE_DEVICES=0,1,2,3

nohup python scripts/tools/train.py \
    --nprocs=4 \
    --train_type=seq \
    --data_root_path=/path/to/dataset \
    --param_path=/path/to/model \
    --batch_per_device=4 \
    --grad_accum_steps=8 \
    --warmup_ratio=0.05 \
    --max_lr=1e-4 \
    --max_grad_norm=1.0 \
    --adamw_beta1=0.9 \
    --adamw_beta2=0.95 \
    --adamw_weight_decay=0.01 \
    --window_size=2048 \
    --ckpt_interval=10000 \
    --ckpt_dir=./checkpoint \
    --random_seed=3407 \
    --label_smoothing=0.05 \
    > out.log 2> err.log &

Full reference at Parameter Guide.

Generate Text

python scripts/tools/generate.py \
    --param_path /path/to/model \
    --input_json_file /path/to/input.json \
    --output_json_file /path/to/output.json

Docker

Build and run with Docker (recommended for GPU environments):

# Build image
docker build -t astrai:latest .

# Run with GPU support
docker run --gpus all -it astrai:latest

# Run with specific GPUs
docker run --gpus '"device=0,1"' -it astrai:latest

# Run inference server
docker run --gpus all -p 8000:8000 astrai:latest \
  python -m scripts.tools.server --port 8000 --device cuda

# Run with volume mount for data
docker run --gpus all -v /path/to/data:/data -it astrai:latest

# Docker Compose (GPU, default)
docker compose up -d

# Docker Compose (CPU only)
docker compose --profile cpu up -d

Note: --gpus all is required for CUDA support. Without it, torch.cuda.is_available() will return False.

Start HTTP Server

Start the inference server with OpenAI and Anthropic-compatible HTTP API:

python -m scripts.tools.server --port 8000 --device cuda

Make requests:

# OpenAI-compatible
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 512
  }'

# OpenAI-compatible streaming
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Tell a story"}],
    "stream": true,
    "max_tokens": 500
  }'

# Anthropic-compatible
curl -X POST http://localhost:8000/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "astrai",
    "system": "You are a helpful assistant.",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 512
  }'

# Anthropic-compatible streaming with stop sequences
curl -X POST http://localhost:8000/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "astrai",
    "messages": [{"role": "user", "content": "Write a story"}],
    "max_tokens": 500,
    "stream": true,
    "stop_sequences": ["The end"]
  }'

# Health check
curl http://localhost:8000/health

Demo

Check out the demos in the scripts/demo/ folder:

# Download preprocessed data (required before running demos)
python scripts/demo/download.py

# Interactive streaming chat
python scripts/demo/stream_chat.py

# Batch generation
python scripts/demo/generate_batch.py

# Autoregressive generation
python scripts/demo/generate_ar.py

Watch a video walkthrough on bilibili.

Documentation

Document Description
Parameter Guide Training & inference parameters
Architecture System architecture, class diagram & design patterns
Training Training loop, strategies & formulas
Inference KVCache, continuous batching, sampling & HTTP API
Data Flow Data pipeline, storage backends & dataset architecture

Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

  1. Fork the repository.
  2. Create a feature branch.
  3. Commit your changes.
  4. Open a Pull Request.

For major changes, please open an issue first to discuss what you would like to change.

Community

License

This project is licensed under the GPL-3.0 License.


A lightweight Transformer framework designed for both high performance and ease of use.