[spora] Spatial Proteomics Dataset

AI-ready spatial proteomics dataset

Download and explore standardized multiplexed imaging data to understand tissue architecture across cancer types and organs with spora.

MX Images: 12,596
Patients: 5,254
Cohorts: 31
Technologies: 4

A project by: EPFL

Diverse Data Collection

spora spans 31 cohorts across CRC, TNBC, NSCLC, Lymphoma, Melanoma, PDAC, and more, captured across IMC, CODEX, Orion, and MIBI platforms.

Explore included cohorts ›

Standardized file types and annotations for each sample:

Multiplex Images
Paired H&E Images
Tissue Masks
Cell Masks
Cell Types
Patient Data
Meta Data

Dataset Browser

Browse and filter all 31 cohorts by cancer type, technology, and more. Download any cohort directly from the command line or through the spora-io Python library.

View all cohorts ›
Cohort                  Publication        MX Imgs  Type  Technology  Cancer
danenberg2022breast     Nat Genetics 2022      718  TC    IMC         Breast
lin2023high             Nat Cancer 2023         41  WSI   Orion       Colorectal
schurch2020coordinated  Cell 2020              251  TC    CODEX       Colorectal
cords2024cancer         Cancer Cell 2024     2,040  TC    IMC         Lung
hoch2022multiplexed     Sci Immunol 2022        50  TC    IMC         Melanoma

spora [io]

Load a single dataset, or any combination of datasets, in two lines of Python. The spora-io library provides a unified interface to browse, tile, and load spatial proteomics data with matched H&E images and a choice of intensity standardizations.

Read the docs ›
example.py
from spora_io import ComposedImagingDataset

# Load H&E + IMC images
dataset = ComposedImagingDataset(
    name="danenberg2022breast",
    path="/path/to/dataset",
    modalities=["he", "imc"],
    resolution=1.0,
    crop_size=224,
    modality_kwargs={"imc": {"standardization": "identity"}},
)

tissue_ids = dataset.get_tissue_ids()
composed = dataset.get_composed_tissue(tissue_ids[0])
composed.modalities["he"]  # H&E Tissue
composed.modalities["imc"] # Multiplex Tissue (C, H, W)
# .channel_names  — marker names
# .uniprot_ids    — aligned UniProt IDs
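
Once a multiplex tissue is loaded as a (C, H, W) stack with its marker names, a common next step is to pull out a few markers by name. The sketch below illustrates that on a synthetic NumPy array. It does not call spora-io: the helper `select_channels`, the marker names, and the assumption that the image behaves like a NumPy array are all illustrative, not part of the library.

```python
import numpy as np


def select_channels(image, channel_names, wanted):
    """Return the channels of a (C, H, W) stack named in `wanted`, in that order.

    Hypothetical helper for illustration; not part of spora-io.
    """
    index = {name: i for i, name in enumerate(channel_names)}
    missing = [name for name in wanted if name not in index]
    if missing:
        raise KeyError(f"markers not in panel: {missing}")
    # Integer-list indexing on axis 0 keeps the (H, W) axes untouched.
    return image[[index[name] for name in wanted]]


# Synthetic stand-in for one multiplex crop: 4 made-up markers, 224 x 224 pixels.
channel_names = ["DNA1", "CD3", "CD8", "panCK"]
rng = np.random.default_rng(0)
image = rng.random((len(channel_names), 224, 224), dtype=np.float32)

subset = select_channels(image, channel_names, ["panCK", "CD3"])
print(subset.shape)  # (2, 224, 224)
```

Looking channels up by name rather than by position keeps the same code usable across cohorts whose antibody panels list markers in a different order.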

spora [bench]

Benchmark multiplex foundation models across cell-level and patient-level tasks with standardized evaluation protocols.

View benchmarks ›
Cell Phenotyping (Macro F1-score)

Dataset                  KRONOS  VirTues  astir
cords2023cancer          0.599   0.763    0.414
danenberg2022breast      0.356   0.605    0.358
hoch2022multiplexed      0.576   0.842    0.431
lin2023high              0.393   0.639    0.275
meyer2025stratification  0.480   0.704    0.424
moldoveanu2022spatially  0.532   0.688    0.491
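
For readers unfamiliar with the metric, the macro F1-score is the unweighted mean of the per-class F1 scores, so rare cell types count as much as abundant ones. The following is a minimal sketch of that standard definition in plain Python; the cell-type labels are made up, and it is not necessarily how spora-bench computes its numbers.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with three illustrative cell types.
y_true = ["T cell", "T cell", "B cell", "Tumor", "Tumor", "Tumor"]
y_pred = ["T cell", "B cell", "B cell", "Tumor", "Tumor", "T cell"]
print(round(macro_f1(y_true, y_pred), 3))  # 0.656
```

Here the per-class scores are 0.5 (T cell), 0.667 (B cell), and 0.8 (Tumor), which average to about 0.656.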

Explore Tissue Data

Explore every cohort in spora through an interactive multiplex visualizer with multi-channel overlays, right in your browser.

Launch Tissue Visualizer ›

Getting Started

Download the full collection or explore a small cohort as a quick demo.

1. Download via command line

Browse all cohorts and download spora or specific datasets using rclone from your terminal.

Open Dataset Browser ›
2. Download a small cohort directly

Try spora immediately by downloading allam2022spatially, a small lung cancer IMC cohort (26 images, 340 MB), straight from your browser.

Download allam2022spatially ›
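
For the command-line route in step 1, a scripted download might look roughly like the sketch below, which only builds an `rclone copy` command and prints it. The remote name `spora`, the remote path layout (one folder per cohort), and the helper `rclone_copy_command` are placeholders, not confirmed by this page; the Dataset Browser gives the actual rclone remote and paths.

```python
import shlex


def rclone_copy_command(cohort, dest, remote="spora"):
    """Build an `rclone copy` command for one cohort.

    `remote` and the `<remote>:<cohort>` path layout are hypothetical
    placeholders; substitute the values shown in the Dataset Browser.
    """
    return [
        "rclone", "copy",
        f"{remote}:{cohort}",   # source on the (assumed) remote
        f"{dest}/{cohort}",     # local destination folder
        "--progress",           # show transfer progress
    ]


cmd = rclone_copy_command("allam2022spatially", "./data")
print(shlex.join(cmd))

# To actually run the download (requires rclone to be installed and configured):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems if a destination path contains spaces.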

Citation

If you use spora in your research, please cite:

@article{spora2026,
  title   = {},
  author  = {},
  journal = {},
  year    = {},
  doi     = {}
}