This process enables you to perform standard exploration of a Single-Cell RNA-Seq data set. It analyzes the expression patterns of suites of genes within single cells and facilitates the identification of gene groups that may be associated with aberrant or otherwise unusual cell types. The same genes can also be used to identify known cell groups or cell groups that have not previously been described. The process first selects variable genes using either the Dispersion or the VST method. It then generates an interactive report that includes a Data Overview, a Variable Gene Plot, Clustering, Feature Importance Screening, and Violin Plots of individual gene expression levels. The report also provides ways to navigate customized marker genes, launch ANOVA for Differential Gene Expression, and perform t-SNE or UMAP visualizations, provided the necessary R packages are installed. The process then clusters cells into families based on the expression levels of various marker genes.
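As a rough illustration of what dispersion-based selection of variable genes means, the sketch below ranks genes by their variance-to-mean ratio and keeps the top few. This is only an assumption about the general idea, not the workflow's actual Dispersion or VST implementation, which may normalize or bin the values differently. The function name `top_variable_genes`, the gene names, and the counts are all hypothetical.

```python
from statistics import mean, pvariance


def top_variable_genes(counts, n_top):
    """Rank genes by variance-to-mean ratio (a simple dispersion measure).

    counts: dict mapping a gene name to its list of per-cell counts.
    Returns the n_top gene names with the highest dispersion.
    """
    scores = {}
    for gene, values in counts.items():
        mu = mean(values)
        if mu == 0:
            continue  # gene not detected in any cell; dispersion undefined
        scores[gene] = pvariance(values) / mu
    # Sort gene names from most to least dispersed.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_top]


# Hypothetical toy data: four genes measured in five cells.
counts = {
    "GENE_A": [0, 0, 10, 0, 12],  # expressed in only some cells
    "GENE_B": [3, 3, 3, 3, 3],    # constant across cells
    "GENE_C": [1, 2, 1, 2, 1],    # mildly variable
    "GENE_D": [0, 0, 0, 0, 0],    # never detected
}
selected = top_variable_genes(counts, 2)
print(selected)
```

Genes with constant expression score zero and undetected genes are skipped, so only genes whose expression differs between cells are carried forward.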
• An Input Data Set that contains all of the numeric data to be analyzed. The PMBC_dense.sas7bdat data set serves as an example and is partially shown below. It has 32739 columns and 2700 rows. Note that this is a wide data set. Each row represents a single cell, identified by a bar code. Each column represents a single gene and contains the number of transcript copies detected in each cell.
• An Experimental Design Data Set (EDDS). This data set tells how the experiment was performed, providing information about the columns in the input data set. Note that one column in the EDDS must be named ColumnName and the values contained in this column must exactly match the column names in the input data set. The PBMC.edf EDF (shown below) used in this example has two columns.
Note: This data set is required only when you wish to use the ANOVA process in your analysis. It is then used in conjunction with the output data from this workflow for the ANOVA.
The output generated by this process is summarized in a Tabbed report. Refer to the Basic Single Cell RNA-Seq Workflow output documentation for detailed descriptions and guides to interpreting your results.
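The ColumnName requirement for the EDDS can be checked before the ANOVA is launched. The following is a minimal sketch, assuming the input column names and the EDDS rows have already been read into plain Python structures. The helper `unmatched_column_names`, the `Group` column, and all of the values are hypothetical and are not taken from PBMC.edf.

```python
def unmatched_column_names(input_columns, edds_rows):
    """Return EDDS ColumnName values that are not input column names.

    The comparison is an exact, case-sensitive string match.
    """
    known = set(input_columns)
    return [
        row["ColumnName"]
        for row in edds_rows
        if row["ColumnName"] not in known
    ]


# Hypothetical column names from an input data set.
input_columns = ["Barcode", "GENE_A", "GENE_B"]

# Hypothetical EDDS rows; "Group" stands in for the second column.
edds_rows = [
    {"ColumnName": "GENE_A", "Group": "x"},
    {"ColumnName": "gene_b", "Group": "y"},  # wrong case, so no match
]

problems = unmatched_column_names(input_columns, edds_rows)
print(problems)
```

An empty list means every ColumnName value has a matching column in the input data set. Any names that are returned need to be corrected in the EDDS.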