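- Merge the three layers BaseStorage + MultiSegmentFetcher + BaseSegmentFetcher into a single Store ABC - Store._data holds Dict[str, List[Tensor]] directly, with no forced concatenation, to avoid OOM - _fetch_key uses bisect uniformly for cross-segment slicing, so single-segment and multi-segment data share one code path - _length is stored explicitly (min total across keys), so __len__ returns in O(1) - MmapStore/H5Store/JSONStore all go through _normalize() to register segments and precompute cumulative lengths - All I/O functions (save_h5/load_h5/json_to_bin, etc.) remain unchanged |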
||
|---|---|---|
| config | ||
| dataset | ||
| inference | ||
| model | ||
| parallel | ||
| tokenize | ||
| trainer | ||
| __init__.py | ||
| factory.py | ||
| protocols.py | ||
| serialization.py | ||
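
The design described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: plain Python lists stand in for `Tensor` segments, the `_cum` attribute, the `load` method, and the `ListStore` subclass are hypothetical names, and the exact signature of `_fetch_key` is an assumption. Only `Store`, `_data`, `_length`, `_normalize`, and `_fetch_key` are named in the commit message.

```python
import bisect
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence


class Store(ABC):
    """Sketch of a segmented store; lists stand in for Tensor segments."""

    def __init__(self) -> None:
        self._data: Dict[str, List[Sequence]] = {}  # key -> list of segments
        self._cum: Dict[str, List[int]] = {}        # hypothetical: cumulative lengths
        self._length = 0                            # stored explicitly for O(1) len

    @abstractmethod
    def load(self, source) -> None:
        """Hypothetical hook: subclasses read their format, then call _normalize."""

    def _normalize(self, raw: Dict[str, List[Sequence]]) -> None:
        # Register segments as-is (no concatenation) and precompute
        # the running total of segment lengths for each key.
        self._data, self._cum = {}, {}
        for key, segments in raw.items():
            total, cum = 0, []
            for seg in segments:
                total += len(seg)
                cum.append(total)
            self._data[key] = list(segments)
            self._cum[key] = cum
        # The usable length is the smallest total across all keys.
        self._length = min(
            (c[-1] if c else 0 for c in self._cum.values()), default=0
        )

    def __len__(self) -> int:
        return self._length

    def _fetch_key(self, key: str, begin: int, end: int) -> list:
        if not 0 <= begin <= end <= self._length:
            raise IndexError(f"range [{begin}, {end}) out of bounds")
        segments, cum = self._data[key], self._cum[key]
        out: list = []
        # First segment whose cumulative end lies strictly beyond `begin`.
        i = bisect.bisect_right(cum, begin)
        pos = begin
        while pos < end:
            seg_start = cum[i - 1] if i > 0 else 0
            stop = min(end, cum[i])
            out.extend(segments[i][pos - seg_start:stop - seg_start])
            pos = stop
            i += 1
        return out


class ListStore(Store):
    """Hypothetical in-memory subclass used only for this example."""

    def load(self, source: Dict[str, List[Sequence]]) -> None:
        self._normalize(source)


store = ListStore()
store.load({"ids": [[0, 1, 2], [3, 4], [5, 6, 7, 8]], "mask": [[1] * 10]})
print(len(store))                     # min total across keys
print(store._fetch_key("ids", 2, 7))  # slice spanning three segments
```

Because a single segment is just a one-element list of segments, the same `bisect` lookup handles both cases, which is the point of sharing one code path. Only the requested range is copied, so the full dataset is never concatenated in memory.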