The Imputed SNP (Tall Format) Input Engine imports a set of files created by a SNP imputation program, such as IMPUTE (Marchini et al., 2007) or BEAGLE (Browning and Browning, 2009). This process outputs three different SAS genotype data sets that can be used for subsequent analyses.
Consult the Imputed SNP Import Tutorial (Genomics > Import > Other Genetics > Imputed SNP Import Tutorial) for help on what options to use for your particular files. You should also refer to Data Sets Used in JMP Genomics Processes for information about data set formats.
At least one genotype probability file is required for this process. This file must be in the tall format, in which SNPs are in rows and each set of genotype probability columns corresponds to one individual. With the options provided, files produced by different imputation programs can be imported and analyzed.
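To make the tall layout concrete, the following Python sketch splits one row of a genotype probability file into its SNP information and one set of probabilities per individual. It is an illustration only and is not part of the Input Engine. It assumes an IMPUTE-style row with five leading SNP columns followed by three probabilities per individual; the names parse_tall_row, LEADING_COLS, and PROBS_PER_SAMPLE, and the sample row itself, are hypothetical. Other programs may lay out their columns differently.

```python
# Assumed layout (IMPUTE-style .gen row); adjust both constants for other programs.
LEADING_COLS = 5       # assumed: SNP ID, rs ID, position, allele A, allele B
PROBS_PER_SAMPLE = 3   # assumed: P(AA), P(AB), P(BB) for each individual


def parse_tall_row(line):
    """Split one tall-format row into SNP info and per-individual probabilities."""
    fields = line.split()
    info = fields[:LEADING_COLS]
    values = [float(x) for x in fields[LEADING_COLS:]]
    if len(values) % PROBS_PER_SAMPLE != 0:
        raise ValueError("probability columns do not form complete sets")
    # Group consecutive probability columns into one tuple per individual.
    probs = [
        tuple(values[i:i + PROBS_PER_SAMPLE])
        for i in range(0, len(values), PROBS_PER_SAMPLE)
    ]
    return info, probs


# Made-up row: one SNP, two individuals.
row = "SNP1 rs1 1000 A C 1 0 0 0.1 0.8 0.1"
info, probs = parse_tall_row(row)
```

Here each element of probs belongs to one individual, in the order in which the individuals appear across the columns of the file.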
A second, optional file is the sample file. This text file contains information about the samples in the genotype probability file(s). It must be space-delimited, with column names in the first row and data beginning on the third row, and its rows of samples must be in the same order as the columns of samples in the genotype file(s). During the input process, columns from this file are merged with the genotype columns.
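The sample file layout described above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the Input Engine's own logic: the function name read_sample_file, the column names, and the file contents are all made up for the example. The second row is simply skipped, because data begins on the third row.

```python
import io


def read_sample_file(handle):
    """Read a space-delimited sample file into a list of per-sample records."""
    rows = [line.split() for line in handle if line.strip()]
    header = rows[0]       # column names are in the first row
    data_rows = rows[2:]   # the second row is skipped; data begins on the third row
    return [dict(zip(header, row)) for row in data_rows]


# Made-up file contents with two samples.
text = "ID_1 ID_2 missing\n0 0 0\ns1 s1 0.0\ns2 s2 0.01\n"
samples = read_sample_file(io.StringIO(text))
```

Because the rows of the sample file follow the order of the sample columns in the genotype file(s), the record at position i in samples describes the individual whose genotype probabilities occupy the i-th set of probability columns.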
The following example uses the example.gen and example.sample files included in the Sample Data folder, which are example files from the IMPUTE program. They are provided courtesy of Jonathan Marchini at the University of Oxford.