Several processes are available for generating or binning counts, inferring gene structure, and creating and importing variant call format (VCF) files from next-generation sequencing data.

Count SAS Data Generation

The first three processes import a set of files and generate count data, which is combined into SAS data sets containing chromosome, location, and sequence identity with respect to a reference sequence.
Tip: Binning counts can be useful to reduce the number of rows in a large data set in preparation for downstream plotting and modeling.

Summarizing position-level intensity data into exon and intron bins as defined by an isoform definition file in UCSC format
Tip: Output from a process such as SAM Input Engine can be used as input for this process.

The remaining processes focus on the detection of single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs, also known as deletion insertion polymorphisms, or DIPs), generating VCF or SAS files.
Generating variant call format (VCF) files from SNPs/INDELs called (using SAMtools/BCFtools) from BAM files
Importing CLC bio SNP or DIP Detection Table .csv files into SAS data set(s)
Importing variant call format (VCF) files into SAS data set(s)
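
To illustrate what importing a VCF file into a tabular data set involves, the following Python sketch reads VCF data lines and produces one row per alternate allele, labeled as a SNP or an INDEL. It is not the import process described above. It relies only on the standard VCF column order (CHROM, POS, ID, REF, ALT, ...). The names Variant, classify, and parse_vcf_lines are hypothetical, and the classification rule is a simplification that assumes plain nucleotide alleles.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """One row of the imported table (hypothetical layout)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    kind: str  # "SNP", "INDEL", or "OTHER"


def classify(ref: str, alt: str) -> str:
    """Simplified rule (an assumption): compare allele lengths only."""
    bases = set("ACGTN")
    if not (set(ref.upper()) <= bases and set(alt.upper()) <= bases):
        return "OTHER"  # symbolic or missing alleles such as <DEL>, *, or .
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) != len(alt):
        return "INDEL"
    return "OTHER"  # equal-length multi-base substitutions


def parse_vcf_lines(lines):
    """Turn VCF text lines into Variant rows, one per alternate allele."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines, ## meta lines, and the #CHROM header line.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos, _id, ref, alt_field = fields[:5]
        # ALT may list several comma-separated alleles.
        for alt in alt_field.split(","):
            rows.append(Variant(chrom, int(pos), ref, alt, classify(ref, alt)))
    return rows


example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tAT\tA\t60\tPASS\t.",
    "chr2\t300\t.\tC\tT,CA\t70\tPASS\t.",
]

for row in parse_vcf_lines(example):
    print(row.chrom, row.pos, row.ref, row.alt, row.kind)
```

A site with several alternate alleles yields several rows, which keeps each row's SNP or INDEL label unambiguous. A real import would also carry QUAL, FILTER, INFO, and the per-sample genotype columns.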