If you use this repository, please cite the accompanying paper: “Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models” (2025). See the **Citation** section ...