This vignette outlines the steps for inference, analysis and visualization of cell-cell communication networks from spatial multiomics data using CellChat. We showcase CellChat by applying it to a mouse spleen dataset generated with SPOTS, a high-throughput technology that co-profiles spatial transcriptomics and proteomics.

CellChat requires gene expression, protein abundance and spatial location data of spots/cells as user input. It models the probability of cell-cell communication by integrating gene expression and protein abundance with spatial distance, together with prior knowledge of the interactions between signaling ligands, receptors and their cofactors.

Upon inferring the intercellular communication network, CellChat’s various functions can be used for further data exploration, analysis, and visualization.

Load the required libraries

ptm = Sys.time()

library(CellChat)
library(patchwork)
options(stringsAsFactors = FALSE)

Part I: Data input & processing and initialization of CellChat object

When inferring spatially proximal cell-cell communication from spatial multiomics data, users should also provide the spatial coordinates/locations of spot/cell centroids. In addition, to filter out cell-cell communication beyond the maximum diffusion range of molecules (e.g., ~250 μm), CellChat needs to compute the cell centroid-to-centroid distance in micrometers. Therefore, for spatial technologies that only provide spatial coordinates in pixels, CellChat requires users to input a conversion factor, which it uses to convert the coordinates from pixels to micrometers.

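CellChat requires four user inputs, which correspond to the arguments passed to createCellChat later in this vignette:

  • data.input: the expression data of spots/cells (here, the combined RNA and ADT features returned by preProcMultiomics), with cell names as column names.

  • meta: a data frame of cell information whose row names are the cell names, containing the cell labels and a `samples` column.

  • coordinates: the spatial coordinates of the spot/cell centroids.

  • spatial.factors: a data frame with two columns, `ratio` (the conversion factor from pixels to micrometers) and `tol` (a tolerance factor, set to half of the spot size in this example).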

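When inferring contact-dependent or juxtacrine signaling by setting contact.dependent = TRUE in computeCommunProb and using L-R pairs classified as Cell-Cell Contact signaling in CellChatDB$interaction$annotation, CellChat requires one additional user input:

  • contact.range: the distance within which contact-dependent signaling is allowed (e.g., contact.range = 100 for 10X Visium data, as used later in this vignette).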

Instead of providing contact.range, users may alternatively provide a value of contact.knn.k, in order to restrict the contact-dependent signaling within the k-nearest neighbors (knn).
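
To make the k-nearest-neighbor idea concrete, here is a minimal base R sketch on five hypothetical spot positions along a line. It only illustrates what "restricting to the k nearest neighbors" means; how CellChat applies contact.knn.k internally is not shown in this vignette, and the names `x`, `d` and `knn` are made up for the example.

```r
# Hypothetical 1D spot positions in micrometers (illustration only)
x <- c(s1 = 0, s2 = 100, s3 = 200, s4 = 300, s5 = 450)

# Pairwise distances between spots
d <- as.matrix(dist(x))

# For each spot, keep the names of its k nearest neighbors.
# The first entry of the sorted row is the spot itself (distance 0), so skip it.
k <- 2
knn <- t(apply(d, 1, function(row) names(sort(row))[2:(k + 1)]))
knn
```

Each row of `knn` lists the two spots closest to the spot named in the row label. Unlike a fixed distance cutoff, every spot gets exactly k candidate partners, even in sparse regions of the tissue.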

Load data

# Here we load the RNA and ADT data matrices of the SPOTS mouse spleen dataset and the associated cell meta data
load("/Users/suoqinjin/Library/CloudStorage/OneDrive-Personal/works/CellChat/tutorial/data_mouse_spleen_RNA_ADT.RData")

# Prepare input data for CellChat analysis
data.list <- list(RNA = data.input.rna, ADT = data.input.adt)
res.multi <- preProcMultiomics(data.list, db = CellChatDB.mouse)
data.input <- res.multi$data.input

# define the meta data:
# a column named `samples` should be provided for spatial transcriptomics analysis; it is used when analyzing cell-cell communication by aggregating multiple samples/replicates.
# Of note, for comparison analysis across different conditions, users still need to create a CellChat object separately for each condition.
meta = data.frame(labels = meta$annotations, samples = "sample1", row.names = colnames(data.input)) # manually create a dataframe consisting of the cell labels
meta$samples <- factor(meta$samples)
unique(meta$labels) # check the cell labels
#> [1] Mac-1   B-cells Mac-2   T-cells
#> Levels: Mac-1 Mac-2 B-cells T-cells
unique(meta$samples) # check the sample labels
#> [1] sample1
#> Levels: sample1

# load spatial transcriptomics information
# Spatial locations of spots from full (NOT high/low) resolution images are required. For 10X Visium, this information is in `tissue_positions_list.csv` (named `tissue_positions.csv` in newer Space Ranger outputs).
spatial.locs <- read.csv("/Users/suoqinjin/Library/CloudStorage/OneDrive-Personal/works/CellChat/tutorial/spatial_imaging_data-mouse_spleen/tissue_positions_list.csv", header = F, row.names = 1)
spatial.locs = spatial.locs[rownames(meta), c(4,5)]
# Spatial factors of spatial coordinates
# For 10X Visium, the conversion factor from pixels to micrometers can be computed as the ratio of the theoretical spot size (i.e., 65 um) to the number of pixels that span the diameter of a theoretical spot in the full-resolution image (i.e., 'spot_diameter_fullres' in pixels in the 'scalefactors_json.json' file).
# Of note, the 'spot_diameter_fullres' factor is different from the `spot` in the Seurat object, so users still need to get the value from the original json file.
scalefactors = jsonlite::fromJSON(txt = file.path("/Users/suoqinjin/Library/CloudStorage/OneDrive-Personal/works/CellChat/tutorial/spatial_imaging_data-mouse_spleen", 'scalefactors_json.json'))
spot.size = 65 # the theoretical spot size (um) in 10X Visium
conversion.factor = spot.size/scalefactors$spot_diameter_fullres
spatial.factors = data.frame(ratio = conversion.factor, tol = spot.size/2)

d.spatial <- computeCellDistance(coordinates = spatial.locs, ratio = spatial.factors$ratio, tol = spatial.factors$tol)
min(d.spatial[d.spatial!=0]) # this value should approximately equal 100um for 10X Visium data
#> [1] 98.89004
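
The pixel-to-micrometer conversion above can be sketched in base R as follows. All numbers here are hypothetical (they are not taken from the spleen dataset), and `dist()` is used as a stand-in for the distance computation; this is not a reimplementation of computeCellDistance.

```r
# Hypothetical values for illustration only
spot.size <- 65                # theoretical Visium spot diameter in um
spot.diameter.fullres <- 90    # assumed 'spot_diameter_fullres' in pixels
ratio <- spot.size / spot.diameter.fullres   # um per pixel

# Hypothetical spot centroids in full-resolution pixel coordinates
coords.px <- matrix(c(0,   0,
                      138, 0,
                      0,   276),
                    ncol = 2, byrow = TRUE,
                    dimnames = list(c("spotA", "spotB", "spotC"), c("x", "y")))

# Convert to micrometers, then compute centroid-to-centroid distances
d.um <- as.matrix(dist(coords.px * ratio))
round(d.um, 1)

# Smallest non-zero distance, analogous to the sanity check above
min(d.um[d.um != 0])
```

If the smallest non-zero distance is far from the expected center-to-center spacing of the technology, the conversion factor is the first thing to re-check.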

Create a CellChat object

Users can create a new CellChat object from a data matrix or a Seurat object. If the input is a Seurat object, the meta data in the object will be used by default and users must provide group.by to define the cell groups, e.g., group.by = “ident” for the default cell identities in the Seurat object.

NB: If users load a previously calculated CellChat object (version < 2.1.0), please update the object via updateCellChat.

cellchat <- createCellChat(object = data.input, meta = meta, group.by = "labels",
                           datatype = "spatial", coordinates = spatial.locs, spatial.factors = spatial.factors)
#> [1] "Create a CellChat object from a data matrix"
#> Create a CellChat object from spatial transcriptomics data... 
#> Set cell identities for the new CellChat object 
#> The cell groups used for CellChat analysis are  Mac-1, Mac-2, B-cells, T-cells
cellchat
#> An object of class CellChat created from a single dataset 
#>  3044 genes.
#>  2568 cells. 
#> CellChat analysis of spatial data! The input spatial locations are 
#>                    x_cent y_cent
#> AAACACCAATAACTGC-1   2033    928
#> AAACAGAGCGACTCCT-1    727   2180
#> AAACAGCTTTCAGAAG-1   1568    761
#> AAACAGGGTCTATATT-1   1684    828
#> AAACCGGGTAGGTACC-1   1539   1078
#> AAACCGTTCGTCCAGG-1   1830   1312

Set the ligand-receptor interaction database

Before users can employ CellChat to infer cell-cell communication, they need to set the ligand-receptor interaction database and identify over-expressed ligands or receptors.

Our database CellChatDB is a manually curated database of literature-supported ligand-receptor interactions in both human and mouse. CellChatDB v2 contains ~3,300 validated molecular interactions, including ~40% secreted autocrine/paracrine signaling interactions, ~17% extracellular matrix (ECM)-receptor interactions, ~13% cell-cell contact interactions and ~30% non-protein signaling interactions. Compared to CellChatDB v1, CellChatDB v2 adds more than 1,000 protein and non-protein interactions, such as metabolic and synaptic signaling. Note that for molecules that are not directly related to genes measured in scRNA-seq, CellChat v2 estimates the expression of ligands and receptors from those molecules’ key mediators or enzymes, in order to capture potential communication mediated by non-proteins.

CellChatDB v2 also adds functional annotations of ligand-receptor pairs, such as UniProtKB keywords (including biological process, molecular function, functional class, disease, etc.), subcellular location and relevance to neurotransmitters.

Users can update CellChatDB by adding their own curated ligand-receptor pairs. Please check the tutorial on updating the ligand-receptor interaction database CellChatDB.

When analyzing human samples, use the database CellChatDB.human; when analyzing mouse samples, use the database CellChatDB.mouse. CellChatDB categorizes ligand-receptor pairs into different types, including “Secreted Signaling”, “ECM-Receptor”, “Cell-Cell Contact” and “Non-protein Signaling”. By default, the “Non-protein Signaling” pairs are not used.

CellChatDB.use <- res.multi$db.use
# set the used database in the object
cellchat@DB <- CellChatDB.use

Preprocessing the expression data for cell-cell communication analysis

To infer cell state-specific communications, CellChat identifies over-expressed ligands or receptors in one cell group and then identifies over-expressed ligand-receptor interactions if either the ligand or the receptor is over-expressed.

We also provide a function to project gene expression data onto a protein-protein interaction (PPI) network. Specifically, a diffusion process is used to smooth genes’ expression values based on their neighbors defined in a high-confidence, experimentally validated protein-protein network. This function is useful when analyzing single-cell data with shallow sequencing depth because the projection reduces the dropout effects of signaling genes, in particular possible zero expression of subunits of ligands/receptors. One might be concerned about possible artifacts introduced by this diffusion process; however, it will only introduce very weak communications. By default, CellChat uses the raw data (i.e., object@data.signaling) instead of the projected data. To use the projected data, users should run the function projectData before running computeCommunProb, and then set raw.use = FALSE when running computeCommunProb.

# subset the expression data of signaling genes for saving computation cost
cellchat <- subsetData(cellchat) # This step is necessary even if using the whole database
future::plan("multisession", workers = 4) 
cellchat <- identifyOverExpressedGenes(cellchat)
cellchat <- identifyOverExpressedInteractions(cellchat, variable.both = F)
#> The number of highly variable ligand-receptor pairs used for signaling inference is 43
 
# project gene expression data onto PPI (optional: when running it, users should set `raw.use = FALSE` in the function `computeCommunProb()` in order to use the projected data)
# cellchat <- projectData(cellchat, PPI.mouse)
execution.time = Sys.time() - ptm
print(as.numeric(execution.time, units = "secs"))
#> [1] 16.54874

Part II: Inference of cell-cell communication network

CellChat infers biologically significant cell-cell communication by assigning each interaction a probability value and performing a permutation test. CellChat models the probability of cell-cell communication by integrating gene expression with prior knowledge of the interactions between signaling ligands, receptors and their cofactors using the law of mass action.

CAUTION: The number of inferred ligand-receptor pairs clearly depends on the method for calculating the average gene expression per cell group. By default, CellChat uses a statistically robust mean method called ‘trimean’, which produces fewer interactions than other methods. However, we find that CellChat performs well at predicting stronger interactions, which is very helpful for narrowing down interactions for further experimental validation. In computeCommunProb, we provide an option for using other methods, such as the 5% and 10% truncated mean, to calculate the average gene expression. Of note, ‘trimean’ approximates the 25% truncated mean, implying that the average gene expression is zero if the percentage of expressed cells in one group is less than 25%. To use the 10% truncated mean, users can set type = "truncatedMean" and trim = 0.1. To determine a proper value of trim, CellChat provides a function computeAveExpr, which can help to check the average expression of signaling genes of interest, e.g., computeAveExpr(cellchat, features = c("CXCL12","CXCR4"), type = "truncatedMean", trim = 0.1). Therefore, if well-known signaling pathways in the studied biological process are not predicted, users can try truncatedMean with lower values of trim to change the method for calculating the average gene expression per cell group.
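
The effect of the averaging method can be checked on a toy vector in base R. The sketch below assumes the standard definition of the trimean, (Q1 + 2 × median + Q3) / 4, and uses base R's `mean(x, trim = ...)` for the truncated mean; the helper `tri_mean` and the expression values are hypothetical and are not CellChat code.

```r
# Hypothetical helper: Tukey's trimean, (Q1 + 2 * median + Q3) / 4
tri_mean <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  (q[1] + 2 * q[2] + q[3]) / 4
}

# A signaling gene expressed in only 4 of 20 cells (20%) of a cell group
x <- c(rep(0, 16), 1, 2, 3, 4)

tri_mean(x)            # 0: fewer than 25% of cells express the gene
mean(x, trim = 0.25)   # 0: the 25% truncated mean behaves the same way here
mean(x, trim = 0.10)   # 0.1875: the 10% truncated mean keeps a non-zero signal
```

With the trimean, this gene would contribute nothing to the communication probability for this group, whereas a 10% truncated mean retains a small average. This is why lowering `trim` can recover interactions driven by genes expressed in a minority of cells.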

Compute the communication probability and infer cellular communication network

To quickly examine the inference results, users can set nboot = 20 in computeCommunProb. In that case, “pvalue < 0.05” means that none of the permutation results is larger than the observed communication probability.
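
The arithmetic behind this statement can be sketched in base R. The sketch assumes the p-value is the fraction of permuted probabilities that exceed the observed one, as the sentence above implies; the observed and permuted values are simulated and purely hypothetical.

```r
nboot <- 20

# With 20 permutations, the p-value can only take the values 0, 1/20, 2/20, ...
possible.p <- (0:nboot) / nboot
smallest.nonzero.p <- min(possible.p[possible.p > 0])
smallest.nonzero.p   # 0.05, so "p < 0.05" is only possible when p is exactly 0

# Hypothetical observed probability and simulated permutation results
set.seed(1)
observed <- 0.8
permuted <- runif(nboot, min = 0, max = 0.5)

# Fraction of permuted values larger than the observed one
p.value <- mean(permuted > observed)
p.value
```

A small nboot is therefore useful for a quick look, but it gives a very coarse p-value; more permutations are needed for finer resolution.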

Users may need to adjust the parameter scale.distance when working with data from other spatial transcriptomics technologies. Please check the documentation in detail via ?computeCommunProb.

When inferring contact-dependent or juxtacrine signaling, users should provide a value of contact.range and set contact.dependent = TRUE. Briefly, users can set contact.range = 10, which is a typical human cell size. However, for low-resolution spatial data such as 10X Visium, it should be the cell center-to-center distance (i.e., contact.range = 100 for 10X Visium data). Please check the vignette of FAQ on applying CellChat to spatially resolved transcriptomics data for detailed explanations. In this example, we did not use the L-R pairs from Cell-Cell Contact signaling, so we could set contact.dependent = FALSE and contact.range = NULL. As an illustration, however, we use the following settings, which lead to the same results.

ptm = Sys.time()

cellchat <- computeCommunProb(cellchat, type = "truncatedMean", trim = 0.1,
                              distance.use = TRUE, interaction.range = 250, scale.distance = 0.01,
                              contact.dependent = TRUE, contact.range = 100)
#> truncatedMean is used for calculating the average gene expression per cell group. 
#> [1] ">>> Run CellChat on spatial transcriptomics data using distances as constraints of the computed communication probability <<< [2025-04-16 16:54:38.343122]"
#> The input L-R pairs have both secreted signaling and contact-dependent signaling. Run CellChat in a contact-dependent manner for `Cell-Cell Contact` signaling, and in a diffusion manner based on the `interaction.range` for other L-R pairs. 
#> [1] ">>> CellChat inference is done. Parameter values are stored in `object@options$parameter` <<< [2025-04-16 16:54:44.828905]"
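
To picture how the two distance cutoffs differ, here is a simplified base R sketch on a hypothetical 3 × 3 square grid of spots spaced 100 μm apart (real Visium spots sit on a hexagonal grid). It only shows which spot pairs fall within each range; it is not CellChat's internal implementation, and `within.diffusion` and `within.contact` are names made up for the example.

```r
# Hypothetical 3 x 3 grid of spot centroids, 100 um apart
grid <- expand.grid(x = c(0, 100, 200), y = c(0, 100, 200))
rownames(grid) <- paste0("s", seq_len(nrow(grid)))   # s1 is a corner, s5 the center

d <- as.matrix(dist(grid))   # centroid-to-centroid distances in um

interaction.range <- 250     # cutoff for diffusible signaling
contact.range <- 100         # cutoff for contact-dependent signaling

# Spot pairs (excluding a spot with itself) that fall within each cutoff
within.diffusion <- d > 0 & d <= interaction.range
within.contact <- d > 0 & d <= contact.range

rowSums(within.diffusion)[c("s1", "s5")]   # s1: 7, s5: 8
rowSums(within.contact)[c("s1", "s5")]     # s1: 2, s5: 4
```

The contact cutoff keeps only directly adjacent spots, while the diffusion cutoff also admits spots two steps away, which is the distinction the message above draws between `Cell-Cell Contact` pairs and other L-R pairs.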

Users can filter out cell-cell communication involving cell groups that contain only a few cells. By default, the minimum number of cells required in each cell group for cell-cell communication is 10.

cellchat <- filterCommunication(cellchat, min.cells = 10)

Extract the inferred cellular communication network as a data frame

We provide a function subsetCommunication to easily access the inferred cell-cell communications of interest. For example,

  • df.net <- subsetCommunication(cellchat) returns a data frame consisting of all the inferred cell-cell communications at the level of ligands/receptors. Set slot.name = "netP" to access the inferred communications at the level of signaling pathways.

  • df.net <- subsetCommunication(cellchat, sources.use = c(1,2), targets.use = c(4,5)) gives the inferred cell-cell communications sending from cell groups 1 and 2 to cell groups 4 and 5.

  • df.net <- subsetCommunication(cellchat, signaling = c("WNT", "TGFb")) gives the inferred cell-cell communications mediated by signaling WNT and TGFb.

Infer the cell-cell communication at a signaling pathway level

CellChat computes the communication probability at the signaling pathway level by summarizing the communication probabilities of all ligand-receptor interactions associated with each signaling pathway.

NB: The inferred intercellular communication networks of each ligand-receptor pair and each signaling pathway are stored in the slots ‘net’ and ‘netP’, respectively.

cellchat <- computeCommunProbPathway(cellchat)
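
The pathway-level summary can be illustrated with a toy example in base R. The sketch assumes that "summarizing" means summing the probabilities of the L-R pairs belonging to the same pathway; the array `prob`, the L-R pair names and the pathway assignment are all hypothetical.

```r
groups <- c("A", "B")
lr.pairs <- c("L1_R1", "L2_R1", "L3_R2")

# Hypothetical communication probabilities: source x target x L-R pair
prob <- array((1:12) / 100, dim = c(2, 2, 3),
              dimnames = list(groups, groups, lr.pairs))

# Hypothetical pathway membership of each L-R pair
pathway <- c(L1_R1 = "P1", L2_R1 = "P1", L3_R2 = "P2")
pathways <- unique(pathway)

# Sum the L-R probabilities within each pathway
prob.pathway <- array(0, dim = c(2, 2, length(pathways)),
                      dimnames = list(groups, groups, pathways))
for (p in pathways) {
  prob.pathway[, , p] <- apply(prob[, , pathway == p, drop = FALSE], c(1, 2), sum)
}

prob.pathway[, , "P1"]   # element-wise sum of the L1_R1 and L2_R1 slices
```

The result has one source-by-target matrix per pathway, which mirrors the relationship between the L-R level and pathway-level networks described above.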

Calculate the aggregated cell-cell communication network

We can calculate the aggregated cell-cell communication network by counting the number of links or summarizing the communication probability. Users can also calculate the aggregated network among a subset of cell groups by setting sources.use and targets.use.

cellchat <- aggregateNet(cellchat)

execution.time = Sys.time() - ptm
print(as.numeric(execution.time, units = "secs"))
#> [1] 7.520704
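
The two aggregation modes described above, counting links and summing probabilities, can be sketched on a toy probability array in base R. The values are hypothetical, and treating any non-zero probability as a link is an assumption of this sketch rather than a statement about aggregateNet internals.

```r
groups <- c("A", "B")

# Hypothetical probabilities: source x target x L-R pair (zero means no inferred link)
prob <- array(c(0,    0.2, 0.1, 0,     # L-R pair 1
                0,    0,   0.3, 0,     # L-R pair 2
                0.05, 0,   0,   0),    # L-R pair 3
              dim = c(2, 2, 3),
              dimnames = list(groups, groups, paste0("LR", 1:3)))

# Number of L-R pairs with a non-zero probability between each pair of groups
count <- apply(prob > 0, c(1, 2), sum)

# Total communication probability between each pair of groups
weight <- apply(prob, c(1, 2), sum)

count
weight
```

Here the A-to-B edge has two links with a total weight of 0.4, while B-to-B has none. The count and the weight can rank group pairs differently, so it is worth looking at both.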

Part III: Visualization of cell-cell communication network

Upon inferring the cell-cell communication network, CellChat provides various functions for further data exploration, analysis, and visualization. Here we showcase only the chord diagram and the spatial feature plot.

Visualization of cell-cell communication at different levels: One can visualize the inferred communication network of signaling pathways using netVisual_aggregate, and visualize the inferred communication networks of individual L-R pairs associated with that signaling pathway using netVisual_individual.

Here we take one signaling pathway as an example. All the signaling pathways showing significant communications can be accessed by cellchat@netP$pathways.

pathways.show <- c("IL16") 
# Chord diagram
par(mfrow=c(1,1), xpd = TRUE) # `xpd = TRUE` should be added to show the title
netVisual_aggregate(cellchat, signaling = pathways.show, layout = "chord", scale = T)


execution.time = Sys.time() - ptm
print(as.numeric(execution.time, units = "secs"))
#> [1] 7.658171

Compute and visualize the network centrality scores:

# Compute the network centrality scores
cellchat <- netAnalysis_computeCentrality(cellchat, slot.name = "netP") # the slot 'netP' means the inferred intercellular communication network of signaling pathways
#> Warning: UNRELIABLE VALUE: One of the 'future.apply' iterations
#> ('future_sapply-1') unexpectedly generated random numbers without declaring so.
#> There is a risk that those random numbers are not statistically sound and the
#> overall results might be invalid. To fix this, specify 'future.seed=TRUE'. This
#> ensures that proper, parallel-safe random numbers are produced via the
#> L'Ecuyer-CMRG method. To disable this check, use 'future.seed = NULL', or set
#> option 'future.rng.onMisuse' to "ignore".
# Visualize the computed centrality scores using heatmap, allowing ready identification of major signaling roles of cell groups
par(mfrow=c(1,1))
netAnalysis_signalingRole_network(cellchat, signaling = pathways.show, width = 4, height = 2.5, font.size = 10)
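
As a rough intuition for what "signaling roles" mean, the simplest centrality measures on a weighted, directed network are the outgoing and incoming strengths of each node. The base R sketch below computes them for a hypothetical pathway network with made-up group names and weights; it does not reproduce the set of measures computed by netAnalysis_computeCentrality.

```r
groups <- c("A", "B", "C", "D")

# Hypothetical pathway-level weights: rows are senders, columns are receivers
w <- matrix(c(0, 0.1, 0, 0.4,
              0, 0,   0, 0.2,
              0, 0,   0, 0.1,
              0, 0,   0, 0),
            nrow = 4, byrow = TRUE, dimnames = list(groups, groups))

outgoing <- rowSums(w)   # total signal sent by each group
incoming <- colSums(w)   # total signal received by each group

names(which.max(outgoing))   # "A": dominant sender in this toy network
names(which.max(incoming))   # "D": dominant receiver in this toy network
```

In this toy network, group A sends the most signal and group D receives the most, which is the kind of pattern the heatmap above is meant to make visible.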

Visualize gene expression distribution on tissue

# Take an input of a few genes
gg1 <- spatialFeaturePlot(cellchat, features = c("Il16"), point.size = 0.8, color.heatmap = "Reds", direction = 1, show.legend = F)
gg2 <- spatialFeaturePlot(cellchat, features = c("CD4"), point.size = 0.8, color.heatmap = "Blues", direction = 1, show.legend = F)
patchwork::wrap_plots(gg1, gg2, ncol =2)


# Take an input of a ligand-receptor pair
# spatialFeaturePlot(cellchat, pairLR.use = "IL16_CD4", point.size = 0.5, do.binary = FALSE, cutoff = NULL, enriched.only = F, color.heatmap = "Reds", direction = 1)

NB: Upon inferring the intercellular communication network from spatial transcriptomics data, CellChat’s various functions can be used for further data exploration, analysis, and visualization. Please check the other functionalities in the basic tutorial of CellChat.

Part V: Save the CellChat object

saveRDS(cellchat, file = "cellchat_SPOTS_mouse_spleen.rds")

Part VI: Explore the cell-cell communication through the Interactive CellChat Explorer

runCellChatApp(cellchat)

Part VII: Application to different technologies of spatially-resolved transcriptomics

Please check the vignette of FAQ on applying CellChat to spatially resolved transcriptomics data for the detailed settings required by different spatial transcriptomics technologies.

sessionInfo()
#> R version 4.3.1 (2023-06-16)
#> Platform: aarch64-apple-darwin20 (64-bit)
#> Running under: macOS Ventura 13.5
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: Asia/Shanghai
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] patchwork_1.3.0     CellChat_2.2.0      Biobase_2.60.0     
#> [4] BiocGenerics_0.46.0 ggplot2_3.5.1       igraph_1.3.5       
#> [7] dplyr_1.1.3        
#> 
#> loaded via a namespace (and not attached):
#>   [1] pbapply_1.7-2         rlang_1.1.1           magrittr_2.0.3       
#>   [4] clue_0.3-64           GetoptLong_1.0.5      gridBase_0.4-7       
#>   [7] matrixStats_1.0.0     compiler_4.3.1        systemfonts_1.0.4    
#>  [10] png_0.1-8             vctrs_0.6.3           reshape2_1.4.4       
#>  [13] ggalluvial_0.12.5     stringr_1.5.0         pkgconfig_2.0.3      
#>  [16] shape_1.4.6           crayon_1.5.2          fastmap_1.1.1        
#>  [19] magick_2.8.1          backports_1.4.1       ellipsis_0.3.2       
#>  [22] labeling_0.4.3        utf8_1.2.3            promises_1.2.1       
#>  [25] rmarkdown_2.24        network_1.18.1        purrr_1.0.2          
#>  [28] xfun_0.40             cachem_1.0.8          jsonlite_1.8.7       
#>  [31] highr_0.10            later_1.3.1           BiocParallel_1.34.2  
#>  [34] irlba_2.3.5.1         broom_1.0.5           parallel_4.3.1       
#>  [37] cluster_2.1.4         R6_2.5.1              bslib_0.5.1          
#>  [40] stringi_1.7.12        RColorBrewer_1.1-3    reticulate_1.31      
#>  [43] parallelly_1.36.0     car_3.1-2             jquerylib_0.1.4      
#>  [46] Rcpp_1.0.11.6         iterators_1.0.14      knitr_1.43           
#>  [49] future.apply_1.11.0   IRanges_2.34.1        FNN_1.1.3.2          
#>  [52] httpuv_1.6.11         Matrix_1.6-5          tidyselect_1.2.0     
#>  [55] abind_1.4-5           rstudioapi_0.15.0     yaml_2.3.7           
#>  [58] doParallel_1.0.17     codetools_0.2-19      listenv_0.9.0        
#>  [61] lattice_0.21-8        tibble_3.2.1          plyr_1.8.8           
#>  [64] shiny_1.7.5           withr_2.5.0           coda_0.19-4          
#>  [67] evaluate_0.21         future_1.33.0         circlize_0.4.16      
#>  [70] pillar_1.9.0          BiocManager_1.30.22   ggpubr_0.6.0         
#>  [73] carData_3.0-5         rngtools_1.5.2        foreach_1.5.2        
#>  [76] stats4_4.3.1          generics_0.1.3        S4Vectors_0.38.1     
#>  [79] munsell_0.5.0         scales_1.3.0          NMF_0.26             
#>  [82] ggnetwork_0.5.12      globals_0.16.2        xtable_1.8-4         
#>  [85] glue_1.6.2            tools_4.3.1           data.table_1.14.9    
#>  [88] BiocNeighbors_1.18.0  RSpectra_0.16-1       ggsignif_0.6.4       
#>  [91] registry_0.5-1        Cairo_1.6-2           cowplot_1.1.1        
#>  [94] grid_4.3.1            tidyr_1.3.0           colorspace_2.1-0     
#>  [97] presto_1.0.0          cli_3.6.1             fansi_1.0.4          
#> [100] svglite_2.1.1         ComplexHeatmap_2.15.4 gtable_0.3.4         
#> [103] rstatix_0.7.2.999     sass_0.4.7            digest_0.6.33        
#> [106] ggrepel_0.9.3         sna_2.7-1             farver_2.1.1         
#> [109] rjson_0.2.21          htmltools_0.5.6       lifecycle_1.0.3      
#> [112] statnet.common_4.9.0  GlobalOptions_0.1.2   mime_0.12