Inactive

CENTER FOR DIGITAL TRUST

MTBBench

Benchmark for evaluating multimodal LLM reasoning in complex oncology clinical decision-making scenarios

MTBBench is a benchmark for evaluating multimodal LLM reasoning in oncology, covering two core challenges: multimodal integration (pathology, genomics, radiology) and longitudinal reasoning across patient timelines. It includes agentic tasks requiring interaction with external foundation-model-based tools such as TRIDENT for pathology and DrugBank for pharmacology.

Benchmark, Large Language Model, Machine Learning, Medical

Maturity

Support

C4DT

Lab

Technical

Source code: Lab Github
Last commit: 2025-10-23

Artificial Intelligence in Molecular Medicine

Prof. Charlotte Bunne

Our research aims to advance personalized medicine by utilizing machine learning and large-scale biomedical data.

This page was last edited on 2026-03-19.
