Inactive

CENTER FOR
DIGITAL TRUST

HalluHard

Hard multi-turn hallucination benchmark for evaluating language models across domains.

HalluHard is a multi-turn hallucination benchmark that evaluates LLMs across diverse domains. Its tasks are designed to be hard, with the aim of eliciting hallucinations. The project is installed via Pixi and provides scripts for response generation, claim-based judgment backed by web scraping (or a coding_direct mode), and report creation. It supports multiple models and is configured from the command line.

2026 Proposal, AI Safety, Benchmark, Large Language Model

Maturity

Support

C4DT

Lab

Technical

Source code: Lab GitHub
Last commit: 2026-05-15

Machine Learning and Optimization Laboratory

Prof. Martin Jaggi

The Machine Learning and Optimization Laboratory is interested in machine learning, optimization algorithms and text understanding, as well as several application domains.

This page was last edited on 2026-03-03.
