PEPO

Preference-alignment framework for large language models (LLMs) implementing DPO, RLHF, and related techniques. It uses Hydra for configuration management and supports SLURM cluster scheduling, CUDA-enabled systems, training and evaluation pipelines, and environment-variable management. Pre-commit hooks are included for code quality.
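PEPO's internals are not shown here, but the DPO objective it implements can be sketched generically. The snippet below is a minimal, framework-free illustration of the standard DPO loss for a single preference pair; the function name, argument names, and default `beta` are illustrative assumptions, not PEPO's API.

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) preference pair.

    Inputs are summed log-probabilities of each response under the
    policy being trained and under a frozen reference model.
    """
    # Implicit reward of each response, measured relative to the reference model
    chosen_margin = policy_logp_chosen - ref_logp_chosen
    rejected_margin = policy_logp_rejected - ref_logp_rejected
    # beta scales how strongly the policy may deviate from the reference
    logits = beta * (chosen_margin - rejected_margin)
    # -log sigmoid(logits): minimized when the policy prefers the chosen response
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When the policy and reference agree, both margins are zero and the loss sits at log 2; as the policy shifts probability toward the chosen response, the loss decreases.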

Tags: AI Safety, Large Language Model, Optimization
Key facts
  • Maturity: Technical
  • Support: C4DT: Inactive; Lab: Active

Laboratory for Information and Inference Systems


Prof. Volkan Cevher

At LIONS, we study how to extract information optimally from signals and large data volumes. To that end, we develop mathematical theory and computational methods for recovering information from highly incomplete data.

This page was last edited on 2026-03-03.