The Imputed SNP (Tall Format) Input Engine imports a set of files created by a SNP
imputation program, such as IMPUTE (Marchini et al., 2007) or BEAGLE (Browning and Browning, 2009). This process outputs three different SAS
genotype data sets that can be used for subsequent analyses.
Consult the Imputed SNP Import Tutorial (Genomics > Import > Other Genetics > Imputed SNP Import Tutorial) for help on which options to use for your particular files. You should also refer to Data Sets Used in JMP Genomics Processes for information about data set formats.
At least one genotype probability file is required for this process. This file must be in the tall format, in which SNPs are in rows and each individual is represented by a set of genotype probability columns. The options provided allow files from imputation programs such as IMPUTE or BEAGLE to be imported and analyzed.
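To make the tall layout concrete, the following Python sketch splits one row of a genotype probability file into its SNP annotation and its per-individual probability sets. It is an illustration only, not part of JMP Genomics. It assumes the IMPUTE-style layout of five leading annotation columns (SNP ID, rs ID, position, and two alleles) followed by three probabilities per individual; the function name parse_gen_line, the n_lead parameter, and the example row are hypothetical.

```python
def parse_gen_line(line, n_lead=5):
    """Split one tall-format row into SNP annotation and probability triplets.

    Assumes n_lead annotation columns followed by three genotype
    probabilities (AA, AB, BB) per individual, as in IMPUTE .gen files.
    """
    fields = line.split()
    lead = fields[:n_lead]
    probs = [float(x) for x in fields[n_lead:]]
    if len(probs) % 3 != 0:
        raise ValueError("probability columns are not a multiple of three")
    # Group consecutive values into one (AA, AB, BB) triplet per individual.
    triplets = [tuple(probs[i:i + 3]) for i in range(0, len(probs), 3)]
    return lead, triplets


# Made-up row: one SNP, two individuals.
example_row = "SNP1 rs1 1000 A C 1 0 0 0.2 0.7 0.1"
lead, triplets = parse_gen_line(example_row)
```

Here each row yields one triplet per individual, so the number of triplets should be the same on every row of the file. Files from other programs may use a different number of leading columns, which is why n_lead is left adjustable in this sketch.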
A second, optional file is the sample file. This text file contains information about the samples in the genotype probability file(s). It must be space-delimited, with column names in the first row and data beginning on the third row. The rows of samples must be in the same order as the columns of samples in the genotype file(s). During the input process, columns from this file are merged with the genotype columns.
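The sample file layout described above can be sketched in Python as follows. This is a hedged illustration rather than the import engine's own logic. It assumes that the second row, which the engine skips, holds column type codes as in IMPUTE sample files; the function name read_sample_rows and the example contents are hypothetical.

```python
import io


def read_sample_rows(handle):
    """Read a space-delimited sample file: names in row 1, data from row 3."""
    header = handle.readline().split()
    handle.readline()  # Row 2 is skipped (assumed to hold column type codes).
    rows = [dict(zip(header, line.split())) for line in handle if line.strip()]
    return header, rows


# Made-up sample file contents with two samples.
sample_text = (
    "ID_1 ID_2 missing\n"
    "0 0 0\n"
    "ind1 ind1 0.0\n"
    "ind2 ind2 0.01\n"
)
header, rows = read_sample_rows(io.StringIO(sample_text))
```

Because samples are matched to genotype columns by position only, a useful check before import is that the number of data rows read here equals the number of individuals represented in each row of the genotype probability file.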
The following example uses the example.gen and example.sample files included in the Sample Data folder, which are example files from the IMPUTE program. They are provided courtesy of Jonathan Marchini at the University of Oxford.