Annotation SAS Data Set
Specify the annotation SAS data set that contains annotation information (for example, Chromosome and Position). It is merged with the Input SAS Data Set to create a new data set.
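The merge described above can be pictured with a small sketch. This is only an illustration, not how JMP Genomics performs the merge: it assumes that both data sets share an identifier column (the hypothetical `ProbeID`), that every Input row is kept, and that rows without a matching annotation receive missing values. The rows and the `merge_annotation` helper are invented for the example.

```python
# Hypothetical rows standing in for the Input and Annotation SAS data sets.
input_rows = [
    {"ProbeID": "P1", "Intensity": 7.2},
    {"ProbeID": "P2", "Intensity": 5.9},
    {"ProbeID": "P3", "Intensity": 6.4},
]
annotation_rows = [
    {"ProbeID": "P1", "Chromosome": "1", "Position": 1500},
    {"ProbeID": "P2", "Chromosome": "X", "Position": 820},
]


def merge_annotation(input_rows, annotation_rows, key):
    """Attach annotation columns to each input row by a shared key.

    Assumption: every input row is kept (a left join); rows with no
    matching annotation get None in the annotation columns.
    """
    lookup = {row[key]: row for row in annotation_rows}
    annotation_columns = []
    if annotation_rows:
        annotation_columns = [c for c in annotation_rows[0] if c != key]

    merged = []
    for row in input_rows:
        match = lookup.get(row[key], {})
        new_row = dict(row)  # copy so the input rows are not modified
        for column in annotation_columns:
            new_row[column] = match.get(column)
        merged.append(new_row)
    return merged


merged_rows = merge_annotation(input_rows, annotation_rows, "ProbeID")
```

Here `merged_rows` holds one row per Input row, with `Chromosome` and `Position` added where an annotation exists. Whether the software keeps or drops unmatched rows is not stated in this section, so treat the left-join behavior as an assumption.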
Annotation Data Sets
An Annotation Data Set contains biological or chemical information and properties about genes, SNPs, probes, probesets, or peptides. This annotation information comes from various online bioinformatics resources, including government agencies, academic organizations, and commercial entities. You can use this information to create a custom Annotation Data Set for your analysis.
The structure of an annotation data set and the information that it provides can vary depending on the nature of the experiment, the source of the data, and the application that generated it. The table below lists information commonly contained in an Annotation Data Set. Keep in mind that different providers might name annotation information differently.
Item | Description |
Probe or Probeset ID | A unique identifier given to a probe or probeset in a probe array or microarray. |
GenBank Accession Number | A unique identifier given to a biological polymer sequence (such as DNA or a protein) when it is submitted to a sequence database (GenBank, EMBL, DDBJ). |
UniGene Cluster ID | A unique identifier given to a cluster of sequences in UniGene. |
Gene ID | A unique identifier assigned to a gene record in Entrez Gene. It is an integer and is species specific. |
Description | Description of a gene, probe, or probeset. |
Chromosomal Location | The physical location of a gene or sequence on a chromosome. |
Ensembl ID | A unique identifier assigned to a sequence in Ensembl. |
Swiss-Prot ID | A unique identifier assigned to a protein sequence in Swiss-Prot, a curated protein sequence database that provides a high level of annotation (such as the description of protein function, domain structures, post-translational modifications, variants, and so on), a minimal level of redundancy, and significant integration with other databases. |
EC Number | A number assigned to an enzyme according to a scheme of standardized enzyme nomenclature developed by the Enzyme Commission of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (IUBMB). The EC number is a unique identifier in ENZYME, the enzyme nomenclature database maintained at the ExPASy molecular biology server. |
OMIM ID | A unique identifier assigned to a genetic disorder in Online Mendelian Inheritance in Man (OMIM). OMIM is a directory of human genes and genetic disorders, with links to literature references, sequence records, maps, and related databases. |
dbSNP ID | A unique identifier assigned to a single nucleotide polymorphism (SNP) when it is submitted to the SNP database. Also known as an 'rs' ID. |
RefSeq Accession | A unique identifier given to a sequence in the NCBI RefSeq database. The RefSeq database is a curated, non-redundant set including genomic DNA contigs, mRNAs and proteins for known genes, and entire chromosomes. |
Gene Ontology ID | A unique alphanumerical identifier given to a GO term. |
Genomic Location or Coordinate | A location assigned to a gene or a sequence at both the chromosome and sequence levels. |
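Because providers might name the items in the table differently, it can help to map provider headers onto one set of names before building a custom Annotation Data Set. The sketch below shows one way to do that, together with a light format check for three identifier types. All of it is illustrative: the alias spellings, the canonical column names (`GeneID`, `dbSNP_ID`, `GO_ID`), and both helper functions are hypothetical, and the patterns reflect the commonly seen forms of these identifiers (an integer Gene ID, `rs` followed by digits, `GO:` followed by seven digits) rather than anything JMP Genomics enforces.

```python
import re

# Hypothetical alias table: these header spellings are illustrative only
# and are not taken from any particular annotation provider.
COLUMN_ALIASES = {
    "gene id": "GeneID",
    "entrez gene": "GeneID",
    "dbsnp id": "dbSNP_ID",
    "rs id": "dbSNP_ID",
    "gene ontology id": "GO_ID",
    "go id": "GO_ID",
}

# Assumed formats for three identifier types from the table.
ID_PATTERNS = {
    "GeneID": re.compile(r"\d+"),      # an integer
    "dbSNP_ID": re.compile(r"rs\d+"),  # 'rs' followed by digits
    "GO_ID": re.compile(r"GO:\d{7}"),  # 'GO:' followed by seven digits
}


def harmonize_header(name):
    """Map a provider header to a canonical name, or return it trimmed."""
    key = re.sub(r"[\s_]+", " ", name.strip().lower())
    return COLUMN_ALIASES.get(key, name.strip())


def has_expected_format(column, value):
    """Return True if the value matches the assumed pattern for the column.

    Columns without a registered pattern are always accepted.
    """
    pattern = ID_PATTERNS.get(column)
    if pattern is None:
        return True
    return pattern.fullmatch(str(value)) is not None
```

For example, `harmonize_header(" Entrez_Gene ")` maps to `GeneID`, and `has_expected_format("GO_ID", "GO:123")` is rejected because it has fewer than seven digits. Headers that are not in the alias table pass through unchanged, so unknown provider columns are never silently renamed.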
For detailed information about the files and data sets used or created by JMP Genomics software, see Files and Data Sets.
To Specify the Annotation SAS Data Set:
The method for specifying this data set depends on whether JMP is connected to SAS on your local machine or to SAS on a server. Refer to the Specifying Folders, Files, and Data Sets documentation for detailed information.
To View the Contents of the Specified Data Set:
Click .