CellTracer: A comprehensive database to dissect the causative multilevel interplay contributing to cell development trajectories


During the complex process of tumour development, the unique destiny of cells is driven by the fine-tuning of multilevel features such as gene expression, network regulation and pathway activation. The dynamic formation of the tumour microenvironment influences the therapeutic response and clinical outcome. Thus, characterizing the developmental landscape and identifying driver features at multiple levels will help us understand the pathological development of disease in individual cell populations and further contribute to precision medicine. Here, we describe a database, CellTracer (http://bio-bigdata.hrbmu.edu.cn/CellTracer), which aims to dissect the causative multilevel interplay contributing to cell development trajectories. CellTracer consists of the gene expression profiles of 1,941,552 cells from 222 single-cell datasets and provides the development trajectories of different cell populations exhibiting diverse behaviours. By using CellTracer, users can explore the significant alterations in molecular events and causative multilevel crosstalk among genes, biological contexts, cell characteristics and clinical treatments along distinct cell development trajectories. CellTracer also provides 12 flexible tools to retrieve and analyse gene expression, cell cluster distribution, cell development trajectories, cell-state variations and their relationship under different conditions. Collectively, CellTracer will provide comprehensive insights for investigating the causative multilevel interplay contributing to cell development trajectories and serve as a foundational resource for biomarker discovery and therapeutic exploration within the tumour microenvironment.

Overview of data content and functions of CellTracer. The top panels demonstrate the dataset content and expansion of multilevel features. The middle and bottom panels demonstrate the functional frameworks to retrieve, analyse and visualize the scRNA-seq data.


CellTracer's Home: CellTracer provides quick search, powerful analysis tools and high-throughput single-cell datasets


1. Click this search button to start a quick search for a gene, organ, disease, etc. 2. Click this button to get an introduction to CellTracer. 3. CellTracer's navigation bar. 4. Click this button to get our contact information and our recent work. 5. All of CellTracer's powerful analysis tools. 6. CellTracer's summary statistics.

Figure 2-1


Quick search in CellTracer: CellTracer provides a user-friendly search and result interface


1. You can search for multiple data types in CellTracer, including gene, disease, organ, cell type, etc. CellTracer then returns the relevant datasets based on your input. 2. Enter your query in the search box, then press "Enter" or click the search button to search.

Figure 3-1


Find datasets of interest in CellTracer: CellTracer provides a variety of search methods, including gene names, disease names, tissues, species, sequences and customized searches (example shown: customized search)


1. Select the data type you are interested in to locate datasets in CellTracer. Filters include, but are not limited to, platform, cell type and treatment. 2. Result display. This data table shows multi-dimensional information on the datasets that match the filters above.

Figure 4-1 Figure 4-2


Basic information for Dataset / Cell in CellTracer: CellTracer provides basic information, which is displayed in Figure 5-1 and Figure 5-2.


1. Basic information for the dataset you searched. 2. Click the analysis results on the right to further explore the dataset that interests you. CellTracer has 6 comprehensive analysis tools and 6 mini analysis tools. After selecting a dataset of interest, you can explore it further with CellTracer's analysis tools to gain more information about the dataset. 3. Basic information for the cell you searched.

Figure 5-1

Figure 5-2


The geneCellCluster tool can explore gene expression and cellular distribution across different clusters, cell types, states, primary/metastatic sites, etc.


1. CellTracer provides multi-dimensional basic information about the dataset, including dataset name, disease name, TCGA, species, data number, treatment method, sample number, platform information, primary/metastatic, and cell type. 2. CellTracer provides a personalized way to explore the dataset, including selecting different cell types, coordinates, and scatter sizes. 3. This figure shows the cell clustering map at different resolutions. 4. This figure shows the clustering of cells according to various types (including state, sample, stage, etc.). 5. This figure shows the expression of a gene in the dataset. 6. This figure shows boxplots of the average expression values of a gene in each category at different resolutions of the dataset. 7. This figure shows a histogram of the expression values of a gene in different cell types of the dataset. 8. This figure shows statistical values, such as the variance, of a gene in different cell types in the dataset.

Figure 6-1

The geneCellTraject tool can explore gene expression and the detailed distribution of cell subpopulations along the cell development trajectory.


1. CellTracer provides multi-dimensional basic information about the dataset, including dataset name, disease name, TCGA, species, data number, treatment method, sample number, platform information, primary/metastatic, and cell type. 2. CellTracer provides a personalized way to explore the dataset, including selecting different cell types, track line, and scatter sizes. 3. This figure shows the cell trajectory map at different resolutions (Monocle2 or Monocle3). 4. This figure shows the trajectory of cells according to various types, including state, sample, stage, etc. (Monocle2 or Monocle3). 5. This figure shows the expression of a gene in the dataset (Monocle2 or Monocle3). 6. This figure shows boxplots of the average expression values of a gene in each category at different resolutions of the dataset (Monocle2). 7. This figure shows a histogram of the expression values of a gene in different cell types of the dataset (Monocle2). 8. This figure shows statistical values, such as the variance, of a gene in different cell types in the dataset (Monocle2).

Figure 7-1

The geneStateTraject tool can explore causative interplay between gene expression and cell states contributing to cellular development trajectory and cell fates.


1. CellTracer provides multi-dimensional basic information about the dataset, including dataset name, disease name, TCGA, species, data number, treatment method, sample number, platform information, primary/metastatic, and cell type. 2. CellTracer provides a personalized way to explore the dataset, including selecting track line and scatter sizes. 3. This figure shows the expression of a gene in the dataset. 4. This figure shows the cell state scores in the dataset. 5. This graph shows the correlation between gene expression values and cell state scores in this dataset. 6. The user can further explore the dataset by selecting the parameters in the drop-down box. 7. This figure shows cell trajectories in different cell states of the dataset. 8. This figure shows bubble plots of the 14 cell states as they change along pseudotime.

Figure 8-1

The geneFuncTraject tool can explore causative interplay between gene expression and functions (GO terms, pathways, Hallmarks, etc.) contributing to cellular development trajectory and cell fates.


1. CellTracer provides multi-dimensional basic information about the dataset, including dataset name, disease name, TCGA, species, data number, treatment method, sample number, platform information, primary/metastatic, and cell type. 2. CellTracer provides a personalized way to explore the dataset, including selecting track line and scatter sizes. 3. This figure shows the expression of a gene in the dataset. 4. This figure shows the biological pathway scores in the dataset. 5. This graph shows the correlation between gene expression values and biological pathway scores in this dataset. 6. The user can further explore the dataset by selecting the parameters in the drop-down box. 7. This figure shows cell trajectories in different cell states of the dataset. 8. This figure shows bubble plots of the selected biological pathway scores as they change along pseudotime.

Figure 9-1

The Multi-Omics-3D tool can explore the multi-omics interplay that contributes to cellular development trajectory and cell fates.


1. You can select datasets and genes of interest to obtain the analysis shown in Figure 10-1. You can further explore the 3D plot by selecting different parameters in the upper right corner.

Figure 10-1

CellStateTrans: A fast tool to visualize dynamic cell-state variation along cell development pseudotime


1. The user can choose the dataset of interest, one of two grouping methods (pseudotime or state), the pseudotime step size and whether the coordinate axes are displayed. 2. This graph presents dynamic bar charts and pie charts of the 14 cell state scores as a function of pseudotime or state. CellTracer also provides a visual representation of some of the cells in this dataset. Users can click on a cell to go to the cell detail page for more information. 3. This figure shows cell trajectories in different cell states of the dataset. 4. This figure shows bubble plots of the 14 cell states as they change along pseudotime.

Figure 11-1
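One way to read the pseudotime step size in CellStateTrans is as the width of consecutive pseudotime windows within which a cell state score is averaged. The Python sketch below illustrates that general idea only; it is an assumption, not CellTracer's implementation, and the function name `mean_score_by_pseudotime` and all toy values are hypothetical.

```python
import math
from collections import defaultdict


def mean_score_by_pseudotime(cells, step):
    """Average a score inside consecutive pseudotime windows of width `step`.

    cells: iterable of (pseudotime, score) pairs, one per cell.
    Returns {window_start: mean_score}, ordered by window start.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pseudotime, score in cells:
        window = math.floor(pseudotime / step)  # index of the window
        sums[window] += score
        counts[window] += 1
    return {w * step: sums[w] / counts[w] for w in sorted(sums)}


# Toy (pseudotime, proliferation score) pairs; the values are made up.
toy_cells = [(0.5, 0.25), (1.5, 0.75), (2.5, 0.5), (3.0, 1.0), (4.5, 1.5)]
binned = mean_score_by_pseudotime(toy_cells, step=2.0)
for start, mean in binned.items():
    print(f"pseudotime >= {start}: mean score {mean}")
```

A smaller step gives a finer but noisier profile; a larger step smooths the profile at the cost of resolution along the trajectory.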

CellCluster: A fast tool to visualize detailed distribution of cell subpopulations in different clusters


1. The user can choose the dataset of interest, one of multiple clustering methods (including class at different resolutions, stage, state, sample, etc.), and one of two kinds of coordinates to explore the dataset. 2. CellCluster's analysis result.

Figure 12-1

CellTraject: A fast tool to visualize detailed distribution of cell subpopulations along cell development trajectory


1. The user can choose the dataset of interest, one of multiple clustering methods (including class at different resolutions, pseudotime, stage, state, sample, etc.), and whether to show the track line to explore the dataset. 2. CellTraject's analysis result. 3. CellTracer trajectory analysis includes a comprehensive analysis of all cell types as well as a specific analysis of each cell type.

Figure 13-1

GeneExpression: A fast tool to visualize gene expression in different cellular clusters.


1. The user can choose the data set of interest, multiple genes and two kinds of coordinates to explore the dataset. 2. GeneExpression's analysis result.

Figure 14-1

GeneSurvival: Exploring Cox regression and Kaplan-Meier survival curves of gene expression across thousands of cancer patients.


1. The user can choose the data set of interest, multiple genes, data segmentation, color and line thickness of survival curves to explore the dataset. 2. GeneSurvival's analysis result.

Figure 15-1
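To make the Kaplan-Meier part of GeneSurvival concrete, the Python sketch below splits patients into high and low expression groups and estimates a survival curve for each group. It is a minimal illustration under stated assumptions: the median cut-off is only one possible data segmentation, the function names `split_by_median` and `kaplan_meier` are hypothetical, the patient values are made up, and Cox regression is not covered.

```python
def split_by_median(expression):
    """Label each sample 'high' or 'low' relative to the median expression."""
    ordered = sorted(expression)
    n = len(ordered)
    median = (ordered[n // 2] + ordered[(n - 1) // 2]) / 2
    return ["high" if value > median else "low" for value in expression]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival)] at each observed event time.

    events[i] is 1 if the event (e.g. death) was observed at times[i],
    or 0 if the patient was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy cohort; expression, follow-up times and event flags are made up.
expression = [1.0, 4.0, 2.0, 8.0, 3.0, 9.0]
times = [5, 20, 8, 25, 12, 30]
events = [1, 0, 1, 1, 1, 0]

groups = split_by_median(expression)
for label in ("low", "high"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curve = kaplan_meier([times[i] for i in idx], [events[i] for i in idx])
    print(label, curve)
```

Censored patients lower the number at risk without producing a step in the curve, which is why the estimate only changes at observed event times.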

GeneStateInterplay: A fast tool to explore correlation between a gene and a cell state.


1. The user can choose the data set of interest, multiple genes, 14 cell states and 3 data methods to explore the dataset. 2. GeneStateInterplay's analysis result.

Figure 16-1

GeneFuncInterplay: A fast tool to explore correlation between a gene and a functional context (GO term, pathway, hallmark, etc.).


1. The user can choose the dataset of interest, multiple genes, up to 5 biological pathways and 3 data methods to explore the dataset. 2. GeneFuncInterplay's analysis result.

Figure 17-1

How to download datasets.


1. We offer download links for 222 single-cell dataset sources. You can click on "Processing data" to get the analysis results of the corresponding dataset. 2. You can click on "Gene info" to get the gene information of the corresponding dataset. 3. You can click the accession number to view detailed data information. 4. We also provide the results of each single-cell analysis in each dataset, displayed as web-based results that users can download as needed.

Figure 18-1

A tutorial that guides readers through the basic steps of CellTracer.


1 Construction of developmental trajectory within tumour-infiltrating lymphocytes


To demonstrate the potential application of CellTracer in characterizing the developmental landscape and identifying driver features at multiple levels, we analysed a scRNA-seq dataset of breast cancer T cells (GSE110686). We used the CellCluster tool to identify cellular clusters in this dataset (Figure T1).

Figure T1. The CellCluster tool for identification of cellular clusters. (1) Choose a dataset of interest. (2) Choose the plot type of this dataset. (3) Choose the resolution parameter to cluster cells. (4) Choose the tSNE or UMAP method to generate coordinates.

We found that T cells were distributed into seven unique clusters at a resolution of 0.1 (Figure T2).

Figure T2. The cellular clusters of BRCA_GSE110686 dataset.

The developmental trajectories were constructed and visualized with the CellTraject tool of CellTracer (Figures T3-T4).

Figure T3. The CellTraject tool for construction of cellular developmental trajectories. (1) Choose a dataset of interest. (2) Choose Cluster to set cell colors in the plot. (3) Choose the resolution parameter to cluster cells. (4) Choose the parameter to show trajectory lines.

Figure T4. The CellTraject tool for construction of cellular developmental trajectories. (1) Choose a dataset of interest. (2) Choose Pseudotime to set cell colors in the plot. (3) Choose the parameter to show trajectory lines.

We found that T cells were distributed into seven unique clusters with different cellular pseudotime along the development trajectory, indicating a complex tumour-infiltrating microenvironment with diverse T cell sub-populations and cellular states (Figures T5-T6).

Figure T5. The cellular trajectories of BRCA_GSE110686 dataset colored by clusters.

Figure T6. The cellular trajectories of BRCA_GSE110686 dataset colored by pseudotime.

To further explore the diverse cellular composition, we used the GeneExpression tool of CellTracer to visualize marker gene expression across cellular clusters and lineages (Figures T7-T8).

Figure T7. The GeneExpression tool for visualization of gene expression across different cells. (1) Choose a dataset of interest. (2) Input a gene to be analysed. (3) Choose the tSNE or UMAP method to generate coordinates.

Figure T8. The GeneExpression tool for visualization of gene expression across different cells. (1) Choose a dataset of interest. (2) Input a gene to be analysed. (3) Choose the trajectory method to generate coordinates.

Figure T9. Marker gene expression across cellular clusters and lineages.

We found that four clusters (C2, C3, C4 and C6 in Figure T1) had high expression of marker genes (ITGAE and GZMB) suggestive of a tissue-resident memory T (TRM) cell phenotype (1,2). Cluster C1 had high expression of marker genes (KLRG1 and TRDC) for T effector memory (TEM) cells (3,4). The composition of TRM and TEM cells revealed diverse T cell sub-populations with different immune states. TRM cells exhibit tissue residency and provide rapid and superior control of localized infections (5), whereas TEM cells mediate protective memory by migrating to inflamed peripheral tissues (6). We observed that TRM and TEM cells were localized at opposite ends of the developmental trajectory (Figure T10), indicating the distinct gene expression profiles and dynamic state transition of these cells (7). High expression of several immune checkpoints, such as PDCD1 and CTLA4, was observed in TRM cells (Figure T9), which is consistent with previous reports (7,8) that TRM cells express high levels of immune checkpoint molecules.

Figure T10. Cell type annotation of different cellular clusters.
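The marker-based reasoning used for this annotation can be sketched in Python as follows. The marker sets (ITGAE and GZMB for TRM, KLRG1 and TRDC for TEM) come from the text above; everything else is an assumption for illustration: the function name `annotate_clusters`, the `min_score` threshold, the cluster name `Cx` and the mean expression values are hypothetical and are not taken from GSE110686 or from CellTracer's pipeline.

```python
# Marker genes named in the tutorial text.
MARKERS = {
    "TRM": ["ITGAE", "GZMB"],
    "TEM": ["KLRG1", "TRDC"],
}


def annotate_clusters(cluster_means, markers=MARKERS, min_score=0.5):
    """Assign each cluster the label whose markers have the highest mean.

    cluster_means: {cluster: {gene: mean expression in that cluster}}.
    Clusters whose best score is below `min_score` stay 'unassigned'.
    """
    labels = {}
    for cluster, means in cluster_means.items():
        scores = {
            label: sum(means.get(gene, 0.0) for gene in genes) / len(genes)
            for label, genes in markers.items()
        }
        best = max(scores, key=scores.get)
        labels[cluster] = best if scores[best] >= min_score else "unassigned"
    return labels


# Made-up per-cluster mean expression values, for illustration only.
toy_means = {
    "C1": {"ITGAE": 0.2, "GZMB": 0.3, "KLRG1": 1.8, "TRDC": 1.4},
    "C2": {"ITGAE": 1.6, "GZMB": 1.2, "KLRG1": 0.1, "TRDC": 0.2},
    "Cx": {"ITGAE": 0.1, "GZMB": 0.2, "KLRG1": 0.1, "TRDC": 0.3},
}
labels = annotate_clusters(toy_means)
print(labels)
```

The threshold guards against labelling a cluster that expresses none of the markers strongly; in practice such calls should still be checked against the expression plots.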

2 Dissection of functional heterogeneity along developmental lineage


To dissect the phenotypic and functional heterogeneity of tumour-infiltrating lymphocytes, we used the GeneStateTraject tool of CellTracer to evaluate cellular state transitions based on different functional contexts (Figure T11).

Figure T11. The GeneStateTraject tool for visualization of cellular state transition. (1) Choose a dataset of interest. (2) Input a gene to be analysed.

A previous study reported a sub-population of TRM cells that displayed mitotic features and proliferation activity and was distributed at the end of the pseudotime path (7). This cellular heterogeneity was also captured by the CellTracer analysis pipeline. The C6 sub-population was localized at the end of the cellular trajectory (with high pseudotime) and highly expressed the proliferation gene HMGB2 (Figure T5). Increased proliferation, cell cycle and division activities were also observed in the C6 sub-population with relatively higher pseudotime (Figure T12).

Figure T12. The cell state scores along the cellular development trajectory.

Based on the results of the GeneStateTraject tool of CellTracer, HMGB2 expression was positively correlated with the cellular proliferation and cell cycle scores (Figure T13). These observations revealed the heterogeneity of TRM cells by characterizing a subset of cells undergoing proliferation and division.

Figure T13. The gene-function correlation analysis of CellTracer.
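A gene-to-score correlation of this kind can be illustrated with a short Python sketch. The source does not state which correlation measure CellTracer uses, so Pearson and Spearman coefficients are shown here purely as common choices; the function names and the toy HMGB2 and proliferation values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up per-cell values: HMGB2 expression and a proliferation score.
hmgb2 = [0.0, 0.5, 1.0, 2.0, 3.5]
proliferation = [0.1, 0.1, 0.4, 0.6, 0.9]
print("Pearson:", round(pearson(hmgb2, proliferation), 3))
print("Spearman:", round(spearman(hmgb2, proliferation), 3))
```

Spearman is often preferred for sparse single-cell expression because it depends only on ranks, while Pearson is sensitive to a few highly expressing cells.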

3 Identification of driver features contributing to cell state transition


To further identify potential driver features and multi-level interplay contributing to T cell state transition, we used the GeneFuncTraject tool of CellTracer to evaluate the activity of the transforming growth factor β (TGF-β) pathway at single-cell resolution (Figure T14).

Figure T14. The GeneFuncTraject tool for visualization of cellular function transition. (1) Choose a dataset of interest. (2) Input a gene to be analysed. (3) Choose the pathway functional context. (4) Choose pathways to be analysed.

Figure T15. The GeneFuncTraject tool for visualization of cellular function transition. (1) Choose a dataset of interest. (2) Input a gene to be analysed. (3) Choose the biological processes functional context. (4) Choose GO terms to be analysed.

The TGF-β pathway is required for the formation and maintenance of tissue residency of TRM cells (2). Increased TGF-β activity was observed along the cellular developmental trajectory (from lower to higher pseudotime), indicating its contribution to TRM cell formation (Figure T16).

Figure T16. The pathway scores along the cellular development trajectory.

Further, we focused on the transcription factors that were highly expressed in TRM cell clusters to identify genes essential for the formation of TRM cells. We found that the RUNX3 gene was highly expressed in the C4 and C6 clusters and moderately expressed in the C2 and C3 clusters (Figure T17).

Figure T17. The expression distribution of RUNX3 in different cellular clusters. This figure was generated by GeneCellCluster tool of CellTracer.

In the cellular trajectory, the C2 and C3 clusters (with lower pseudotime) were localized adjacent to the C4 and C6 clusters (with higher pseudotime), indicating the developmental process of TRM cell formation. A recent study demonstrated that RUNX3 is a critical regulator of CD8+ T cell tissue residency, while RUNX3-deficient cells lacked the TGF-β transcriptional network that underpins tissue residency (5).
We used the GeneFuncTraject tool to explore the interplay between RUNX3 expression and TGF-β pathway activity in T cells and found a positive correlation between these molecular and functional features (Figure T18), indicating that RUNX3 is a potential driver gene of TRM cell development.

Figure T18. The gene-function correlation analysis of CellTracer.

Previous studies revealed that a TRM gene signature identified from scRNA-seq profiles was significantly associated with patient survival (7,9). We used the GeneSurvival tool of CellTracer to perform survival analysis of the TRM marker gene ITGAE (Figure T19) across a panel of 16 breast cancer bulk datasets and found that ITGAE was a prognostic factor in the TCGA and GSE18229 datasets (Figure T20).

Figure T19. The GeneSurvival tool to perform survival analysis. (1) Input a gene to be analysed. (2) Choose the cut-off value to group samples. (3) Choose the style to plot curves. (4) Choose a dataset of interest.

Figure T20. Survival analysis of ITGAE in a panel of 16 breast cancer bulk datasets based on GeneSurvival tool in CellTracer.

4 Combination and illustration of multi-level features interplay


Above, we analysed cellular clusters, cellular trajectories, marker gene expression and cellular states using different CellTracer tools for different purposes. These multi-level features and their cross talk can be combined and visualized by the Multi-Omics-3D tool of CellTracer (Figure T21).

Figure T21. The Multi-Omics-3D tool in CellTracer. (1) Choose a dataset of interest. (2) Input genes to be analysed. (3) Choose the pathway functional context. (4) Choose pathways to be analysed.

Users can simultaneously map these multi-level features to the x, y and z axes, node colours and symbol size. As an example of the Multi-Omics-3D tool, Figure T22 shows that the distinct T cell clusters can be directly visualized along different cellular developmental trajectories. With this construction, it is easy and flexible to explore the cross talk between molecular features (such as the TRM cell marker ITGAE, the immune checkpoint PDCD1 and the TGF-β pathway driver gene RUNX3), functional and state features (such as cell cycle, proliferation and the TGF-β pathway) and clinical features (such as stage, source and treatments).

Figure T22. An example of Multi-Omics-3D tool in CellTracer. This illustration was generated by setting x=cluster_0.1_resolution, y=monocle_dim2, z=pseudotime, node size=RUNX3_exp, color=gene_count, and backgroundColor=#FFFFFF.
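The settings listed for Figure T22 amount to a mapping from plot channels to per-cell fields. The Python sketch below expresses that mapping as plain data and applies it to a small cell table. The field names are taken from the caption; the function name `to_plot_points`, the record layout and the toy values are hypothetical and do not describe CellTracer's internals.

```python
# Plot channel -> per-cell field, following the Figure T22 caption.
AXIS_MAPPING = {
    "x": "cluster_0.1_resolution",
    "y": "monocle_dim2",
    "z": "pseudotime",
    "size": "RUNX3_exp",
    "color": "gene_count",
}


def to_plot_points(cells, mapping=AXIS_MAPPING):
    """Turn per-cell records into points keyed by plot channel."""
    points = []
    for cell in cells:
        missing = [field for field in mapping.values() if field not in cell]
        if missing:
            raise KeyError(f"cell record lacks fields: {missing}")
        points.append({channel: cell[field] for channel, field in mapping.items()})
    return points


# Two made-up cells, for illustration only.
toy_cells = [
    {"cluster_0.1_resolution": 2, "monocle_dim2": -1.3, "pseudotime": 4.2,
     "RUNX3_exp": 0.8, "gene_count": 1500},
    {"cluster_0.1_resolution": 6, "monocle_dim2": 2.1, "pseudotime": 11.7,
     "RUNX3_exp": 2.4, "gene_count": 2300},
]
points = to_plot_points(toy_cells)
print(points[0])
```

Swapping a field in the mapping, for example setting the size channel to a pathway score, changes what the plot encodes without touching the cell table.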

Further usage examples of the Multi-Omics-3D tool are illustrated in Figure T23.

Figure T23. Examples of the Multi-Omics-3D tool in characterizing cellular states of Monocyte/Macrophage cells (nodes in red colour). This cluster of cells exhibits high inflammatory activity with increased expression of IL1B, which is an important mediator of the inflammatory response.

5 References


1. Vieira Braga, F.A., Kar, G., Berg, M., Carpaij, O.A., Polanski, K., Simon, L.M., Brouwer, S., Gomes, T., Hesse, L., Jiang, J. et al. (2019) A cellular census of human lungs identifies novel cell states in health and in asthma. Nat Med, 25, 1153-1163.

2. Mackay, L.K., Rahimpour, A., Ma, J.Z., Collins, N., Stock, A.T., Hafon, M.L., Vega-Ramos, J., Lauzurica, P., Mueller, S.N., Stefanovic, T. et al. (2013) The developmental pathway for CD103(+)CD8+ tissue-resident memory T cells of skin. Nat Immunol, 14, 1294-1301.

3. Park, J.H. and Lee, H.K. (2020) Re-analysis of Single Cell Transcriptome Reveals That the NR3C1-CXCL8-Neutrophil Axis Determines the Severity of COVID-19. Front Immunol, 11, 2145.

4. Zhang, F., Gan, R., Zhen, Z., Hu, X., Li, X., Zhou, F., Liu, Y., Chen, C., Xie, S., Zhang, B. et al. (2020) Adaptive immune responses to SARS-CoV-2 infection in severe versus mild individuals. Signal Transduct Target Ther, 5, 156.

5. Fonseca, R., Burn, T.N., Gandolfo, L.C., Devi, S., Park, S.L., Obers, A., Evrard, M., Christo, S.N., Buquicchio, F.A., Lareau, C.A. et al. (2022) Runx3 drives a CD8(+) T cell tissue residency program that is absent in CD4(+) T cells. Nat Immunol.

6. Sallusto, F., Geginat, J. and Lanzavecchia, A. (2004) Central memory and effector memory T cell subsets: function, generation, and maintenance. Annu Rev Immunol, 22, 745-763.

7. Savas, P., Virassamy, B., Ye, C., Salim, A., Mintoff, C.P., Caramia, F., Salgado, R., Byrne, D.J., Teo, Z.L., Dushyanthen, S. et al. (2018) Single-cell profiling of breast cancer T cells reveals a tissue-resident memory subset associated with improved prognosis. Nat Med, 24, 986-993.

8. Nolan, E., Savas, P., Policheni, A.N., Darcy, P.K., Vaillant, F., Mintoff, C.P., Dushyanthen, S., Mansour, M., Pang, J.B., Fox, S.B. et al. (2017) Combined immune checkpoint blockade as a therapeutic strategy for BRCA1-mutated breast cancer. Sci Transl Med, 9.

9. Nalio Ramos, R., Missolo-Koussou, Y., Gerber-Ferder, Y., Bromley, C.P., Bugatti, M., Nunez, N.G., Tosello Boari, J., Richer, W., Menger, L., Denizeau, J. et al. (2022) Tissue-resident FOLR2(+) macrophages associate with CD8(+) T cell infiltration in human breast cancer. Cell, 185, 1189-1207 e1125.

CellTracer statistics: 222 datasets; 1,941,552 cells; 42 diseases; 80 organs and tissues.