Why Foundation Models in Pathology Are Failing (arxiv.org)

🤖 AI Summary
A short arXiv paper argues that foundation models (FMs), the large self-supervised and often multimodal models that transformed general computer vision and language, are underperforming in computational pathology. Systematic evaluations report low diagnostic accuracy, poor robustness to simple image transformations, geometric instability, high compute and memory costs on giga-pixel whole-slide images, and safety vulnerabilities such as reliance on spurious, non-biological cues.

The authors identify seven interrelated causes: the biological complexity of tissue, ineffective self-supervision for histology, harmful overgeneralization, excessive architectural complexity, a lack of pathology-specific inductive biases, insufficient and heterogeneous clinical data, and a fundamental design flaw in the common patch-based framing, whose fixed patch size does not match the scale of meaningful tissue context.

The critique is significant because it reframes recent failures as conceptual rather than purely engineering problems: transplanting mainstream FM recipes (large patchified transformers, generic pretext tasks) into histopathology misses core properties of tissue morphology and of clinical tasks. The paper implies concrete shifts for the field: develop domain-aware self-supervision and inductive biases, rethink patching strategies to preserve multi-scale context, prioritize curated and diverse clinical datasets, and avoid overreliance on generic large models for high-stakes diagnostics until these misalignments are resolved.
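To make the patch-context mismatch concrete, here is a minimal back-of-the-envelope sketch. All numbers are assumed typical values (a common 0.25 µm/pixel resolution at 40x, a 224-pixel ViT patch, a hypothetical slide size), not figures taken from the paper: a single patch covers only a few tens of micrometers of tissue, while one slide decomposes into on the order of a hundred thousand such patches.

```python
# Back-of-the-envelope sketch of patch-based WSI tiling.
# Values below are illustrative assumptions, not numbers from the paper.

MICRONS_PER_PIXEL_40X = 0.25        # assumed typical scanner resolution at 40x
PATCH_PIXELS = 224                  # common ViT-style patch size
SLIDE_PIXELS = (100_000, 80_000)    # hypothetical giga-pixel slide dimensions

# Physical extent of one patch, in micrometers per side.
patch_microns = PATCH_PIXELS * MICRONS_PER_PIXEL_40X

# Number of non-overlapping patches the slide is tiled into.
n_patches = (SLIDE_PIXELS[0] // PATCH_PIXELS) * (SLIDE_PIXELS[1] // PATCH_PIXELS)

print(f"One patch covers ~{patch_microns:.0f} um per side")  # ~56 um
print(f"Slide yields ~{n_patches:,} patches")                 # ~159,222 patches
```

Under these assumptions, a 224-pixel patch spans roughly 56 µm of tissue, which is smaller than many diagnostically relevant structures (glands, tumor-stroma interfaces), while the slide itself fragments into well over 100,000 patches, illustrating both the context mismatch and the compute burden the summary describes.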