Publication

Powerful and accurate case-control analysis of spatial molecular data

Reshef, Yakir
Sood, Lakshay
Curtis, Michelle
Rumker, Laurie
Stein, Daniel J
Palshikar, Mukta G
Nayar, Saba
Filer, Andrew
Jonsson, Anna Helena
Korsunsky, Ilya
Raychaudhuri, Soumya
Affiliation
Brigham and Women's Hospital; Harvard Medical School; Massachusetts Institute of Technology; University Hospitals Birmingham NHS Foundation Trust; University of Birmingham; University of Colorado
Publication date
2025-08-07
Abstract
As spatial molecular data grow in scope and resolution, there is a pressing need to identify key spatial structures associated with disease. Current approaches typically make restrictive assumptions such as representing tissue regions by local abundances of manually typed, discrete cell types, or representing samples in terms of abundances of manually called, discrete spatial structures; this risks overlooking important signals. Here we introduce variational inference-based microniche analysis (VIMA), a method that combines deep learning with principled statistics to discover associated spatial features with greater flexibility and precision. VIMA trains an ensemble of variational autoencoders to extract numerical "fingerprints" from small tissue patches that capture their biological content. It uses these fingerprints to define a large number of data-dependent "microniches" - small, potentially overlapping groups of tissue patches with highly similar biology that span multiple samples. It then meta-analyzes across the autoencoders to identify microniches whose abundance correlates with case-control status while controlling for multiple testing. We show in simulations that VIMA is well calibrated. We then apply VIMA to spatial datasets spanning three different diseases and spatial modalities: a 7-marker immunofluorescence (IF) microscopy dataset in rheumatoid arthritis (RA), a 52-marker CO-Detection by indEXing (CODEX) dataset in ulcerative colitis (UC), and a 140-gene spatial transcriptomics dataset in dementia. In each case, we recapitulate known biology and identify novel spatial features of disease that were not discoverable with current state-of-the-art methods.
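The last step described in the abstract, testing whether microniche abundance correlates with case-control status while controlling for multiple testing, can be illustrated with a minimal sketch. This is not the authors' implementation: it skips the variational autoencoders and the meta-analysis across them, and it assumes a precomputed samples-by-microniches abundance matrix. The function name `microniche_association`, the permutation scheme, and the max-statistic correction are all assumptions chosen for illustration, and the data are synthetic.

```python
import numpy as np

def microniche_association(abundance, is_case, n_perm=2000, seed=0):
    """Toy permutation test (hypothetical helper, not part of VIMA).

    Correlates each microniche's per-sample abundance with case-control
    status and adjusts for multiple testing with the max-statistic method.
    """
    rng = np.random.default_rng(seed)
    abundance = np.asarray(abundance, dtype=float)  # (n_samples, n_microniches)
    is_case = np.asarray(is_case, dtype=float)      # (n_samples,)

    # Center and scale each column so a dot product is a Pearson correlation.
    a = abundance - abundance.mean(axis=0)
    a /= np.linalg.norm(a, axis=0) + 1e-12

    def corr_with(labels):
        y = labels - labels.mean()
        return a.T @ (y / np.linalg.norm(y))

    observed = corr_with(is_case)

    # Null distribution of the largest |correlation| under shuffled labels.
    null_max = np.array([
        np.abs(corr_with(rng.permutation(is_case))).max()
        for _ in range(n_perm)
    ])

    # Adjusted p-value: share of permutations whose maximum statistic is at
    # least as large as the observed statistic for each microniche.
    exceed = (null_max[None, :] >= np.abs(observed)[:, None]).sum(axis=1)
    p_adj = (1 + exceed) / (1 + n_perm)
    return observed, p_adj

# Synthetic example: 20 samples (10 controls, 10 cases), 50 microniches,
# with microniche 0 made more abundant in cases.
data_rng = np.random.default_rng(1)
is_case = np.repeat([0, 1], 10)
abundance = data_rng.normal(size=(20, 50))
abundance[:, 0] += 5.0 * is_case

r, p_adj = microniche_association(abundance, is_case)
print("strongest microniche:", int(np.argmax(np.abs(r))))
```

Permuting sample labels rather than patches keeps the sample as the unit of analysis, which matters because patches from the same sample are not independent.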
Citation
Reshef Y, Sood L, Curtis M, Rumker L, Stein DJ, Palshikar MG, Nayar S, Filer A, Jonsson AH, Korsunsky I, Raychaudhuri S. Powerful and accurate case-control analysis of spatial molecular data. bioRxiv [Preprint]. 2025 Aug 7:2025.02.07.637149. doi: 10.1101/2025.02.07.637149.
Type
Article