- Shrink GQA title (42→34) to fit screen
- Move GQA annotation from left overflow to right-bottom of V box
- Enlarge heatmap cells (0.52→0.65) and labels (12→14, 9→10), raise grid
- Remove Repeat KV section (shortens scene by ~2s)
- Add position labels to auto-regressive token sequence
- Add layer stack effect behind transformer block
- Upgrade font sizes and spacing throughout for readability
- Replace 4 vertical system-phase boxes with 3 horizontal lanes
(PENDING queue / RUNNING batch / FINISHED done) to match the
request lifecycle in scheduler.py:197-200
- Show system phases (Refill, Prefill, Decode, Cleanup) as
transition labels between lanes
- Place tokens below lanes with a dynamic state badge and cumulative
token count, updated each tick via ReplacementTransform
- Fix prefix_cache collective FadeOut by sweeping self.mobjects
- Remove weight=BOLD across all scenes to prevent text drift
- Adjust GQA y-coordinates for subtitle clearance
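
Note on the lane model: the sketch below is a plain-Python illustration of the three-lane lifecycle the scene now animates, not the code in scheduler.py. The names `Lane`, `Request`, and `tick` are hypothetical, and the mapping of phases onto transitions (Refill moves PENDING to RUNNING, Prefill and Decode act inside RUNNING, Cleanup moves RUNNING to FINISHED) is an assumption made for illustration. The `tokens` field stands in for the cumulative token count shown in each badge.

```python
from dataclasses import dataclass
from enum import Enum


class Lane(Enum):
    # Lane names follow the commit text; the values are the lane captions.
    PENDING = "queue"
    RUNNING = "batch"
    FINISHED = "done"


@dataclass
class Request:
    # Hypothetical request record, for illustration only.
    rid: int
    prompt_len: int
    max_new_tokens: int
    lane: Lane = Lane.PENDING
    tokens: int = 0          # cumulative token count shown in the badge
    prefilled: bool = False


def tick(requests, batch_size):
    """Advance one tick and return the (phase, rid) labels that fired."""
    events = []
    running = [r for r in requests if r.lane is Lane.RUNNING]

    # Refill: move PENDING requests into RUNNING while slots are free.
    for r in requests:
        if r.lane is Lane.PENDING and len(running) < batch_size:
            r.lane = Lane.RUNNING
            running.append(r)
            events.append(("Refill", r.rid))

    # Prefill on a request's first running tick, Decode on later ticks.
    for r in running:
        if not r.prefilled:
            r.tokens = r.prompt_len
            r.prefilled = True
            events.append(("Prefill", r.rid))
        else:
            r.tokens += 1
            events.append(("Decode", r.rid))

    # Cleanup: move requests that reached their token budget to FINISHED.
    for r in running:
        if r.tokens >= r.prompt_len + r.max_new_tokens:
            r.lane = Lane.FINISHED
            events.append(("Cleanup", r.rid))

    return events
```

In the scene, each returned label would correspond to one transition label between lanes, and each change to `tokens` to one badge update.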