Data-mixing framework that uses Kernel Ridge Leverage Scores (KRLS) to weight domains in LLM pretraining and finetuning. Computes domain embeddings and domain weights, with the aim of improving general-purpose generalization and transferability. Integrates with existing training pipelines through simple scripts.
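As a rough illustration of the idea, the sketch below computes kernel ridge leverage scores from a matrix of domain embeddings and normalizes them into sampling weights. It is not this framework's actual code: the RBF kernel, the parameters `gamma` and `lam`, the function names, and the choice to make weights proportional to the scores are all assumptions made for the example. The leverage score of domain i is taken as the i-th diagonal entry of K (K + lam I)^{-1}, the standard ridge leverage definition.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0):
    """Pairwise RBF kernel between rows of X (assumed kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def ridge_leverage_scores(K, lam=1.0):
    """Return diag(K (K + lam * I)^{-1}) for a symmetric PSD kernel matrix K."""
    n = K.shape[0]
    # K and (K + lam * I) commute, so solving (K + lam * I) Z = K gives the same matrix.
    Z = np.linalg.solve(K + lam * np.eye(n), K)
    return np.diag(Z).copy()


def domain_weights(embeddings, gamma=1.0, lam=1.0):
    """Hypothetical weighting rule: weights proportional to leverage scores."""
    K = rbf_kernel(np.asarray(embeddings, dtype=float), gamma=gamma)
    scores = ridge_leverage_scores(K, lam=lam)
    return scores / scores.sum()


# Three toy domains: the first two are identical, the third is far away.
toy_embeddings = [[0.0, 0.0], [0.0, 0.0], [5.0, 5.0]]
weights = domain_weights(toy_embeddings)
print(weights)
```

Under these assumptions, redundant domains share leverage and so receive smaller weights than a domain whose embedding is far from the rest. Whether the framework up-weights or down-weights high-leverage domains is not stated in the description, so the proportional rule here is only a placeholder.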
This page was last edited on 2026-03-03.