Glossary

Definitions/Explanations of Terms Used in JMP Clinical Documentation
Term	Definition/Explanation
ADaM	CDISC Analysis Data Model.
Adverse Event (AE)	Any adverse change in health or side effect that occurs in a person who participates in a Clinical Research trial while the patient is receiving treatment or within a previously specified length of time after treatment completion.
Alpha	Significance level. Although alpha can be any value between 0 and 1, it is typically set at 0.01, 0.05, or 0.10.
Alternative Hypothesis	A position that a researcher evaluates in an experiment. The alternative hypothesis, H1 (or Ha), is the hypothesis that sample observations are influenced by a specific non-random cause. It is rival to the Null Hypothesis, H0.
ANCOVA	Analysis of covariance; a general linear model with a continuous outcome Variable and multiple predictor variables, with at least one nominal and one continuous predictor variable. Considered a hybrid of regression for continuous variables and ANOVA, ANCOVA can determine whether specific factors have an impact on the outcome variable after removing variance resulting from Covariates (the quantitative predictors).
ANOVA	Statistical models and procedures that partition observed Variance in a Variable into components attributable to different variation sources. By analyzing comparisons of variance estimates, ANOVA can determine whether the Means of several groups are statistically equal.
Arm	In a Clinical Research trial, the group of patients receiving a certain type of therapy. For example, one arm of a clinical trial might consist of patients receiving a new medication, another arm might consist of patients receiving a standard-of-care medication, and another of patients receiving a placebo pill.
Bar Chart	A graphical representation of discrete or non-continuous data. It consists of rectangular bars whose lengths are proportional to the magnitude of the values that they represent. See Bar Chart.
BCPNN	Bayesian Confidence Propagation Neural Network 1
Beta	Denotes the Type II Error rate, and is related to the Power of a test (power = 1 - beta).
Bin	A group of related or functionally similar Observations that are considered as a unit for statistical analysis.
Binary Variable	A Variable that contains two discrete values (0 and 1, for example).
Binomial Regression	A regression method where the Dependent Variable contains binomial values (for example, 0 and 1, often corresponding to ‘no’ and ‘yes’, or ‘failure’ and ‘success’, respectively).
Bioinformatics	A scientific field of study involving the integration and application of computer science, information technology, mathematics, and statistics to the fields of biology, genetics, genomics, and medicine.
Bivariate	Involving two Variables.
Body System	A group of organs that work together to perform a task. Examples in humans include the digestive system, the nervous system, and the endocrine system.
Bootstrap	The practice of using with-replacement empirical distributions of Observations to estimate the statistical properties of the population from which the observations were made.
Box Plot	Used to display the response distribution at different combinations of factor levels. Box plots can reveal differences in the response Mean at different levels, suggesting Main Effects. Box plots can also reveal whether the response variation is homogeneous across factor levels, an assumption made in ANOVA. See Box Plot.
Bubble Plot	A two-dimensional Scatterplot showing the relationship between two Variables over time. Each circle, or bubble, represents a single instance of an ID variable. See Bubble Plot.
BY Group	All of the Observations with the same values for all BY Variables.
BY Variable	An optional (in most reports) Variable specification whose values define groups of Observations, such as hour, month, or year. Specifying a BY variable enables you to animate an image so that you can see how response values change according to some grouping, like over time. Alternatively, BY variables can enable analyses to be performed separately on different groups as defined by a variable such as gender.
CDISC	Clinical Data Interchange Standards Consortium, a nonprofit organization that has “established standards to support the acquisition, exchange, submission, and archive of Clinical Research data and Metadata” whose mission is “to develop and support global, platform-independent data standards that enable information system interoperability to improve medical research and related areas of health-care”. See the CDISC website for more information.
Cell Plot	Direct representations of a data table, drawn as a rectangular array of cells with each cell corresponding to a data table entry. Colors are assigned to each cell based on the range and type of values found in the column.
Censor Variable(s)	These columns specify those Observations for which data have been censored or truncated. For example, investigations of the effects of certain genes on life span might be terminated before all of the individuals have expired. The ultimate life spans for these individuals are unknown. All that can be said is that they exceed the period of the study. These data are considered censored.
Character Variable	A Variable whose values can consist of alphabetic and special characters as well as numeric characters.
Chart	A graphical representation of data. Charts can take many forms. See Chart.
Check Box	An item in a dialog or window that you can select without affecting any other items. You can deactivate a check box by selecting it again.
Chi-square Test	A statistical test used to test the existence of a relationship between two nominal Variables where the sampling distribution of the Test Statistic is a chi-squared distribution when the Null Hypothesis is true (or where it is asymptotically true).
Class Variable	The Variable whose values define the groups for analysis. Class variables can have continuous values, but they typically have a few discrete values that define the classifications of the variable. Values can either be character or numeric.
Clinical Research	A medical science branch focused on determining both the safety and effectiveness of diagnostic products, medications, medical devices, and treatment regimens for human health.
Clustering	The process of dividing a data set into mutually exclusive groups such that the Observations for each group are as close as possible to one another, and different groups are as far as possible from one another.
Cochran-Mantel-Haenszel Test	A statistical test used for repeated tests of nominal variable independence.
Color Variable	A Variable whose values are used to specify how the graphical output of an analysis is to be colored.
Conditional Probability	The probability of an event (for example, X) given that another specific event (for example, Y) occurs. Conditional probability is often expressed as P(X|Y) or P_Y(X).
Contingency Plot	See Mosaic Plot.
Contingency Table	A table used to record and analyze the relationship between two or more categorical variables. See Contingency Table.
Correlation	A relationship between Variables in terms of dependence.
Correlation Coefficient	Also known as the Pearson product-moment correlation coefficient, it is equal to the Covariance of two Variables divided by the product of their Standard Deviations.
Covariance	A measure of the relationship between two Variables. It equals the Correlation Coefficient between the two variables times the square roots of their Variances.
Covariate	An Independent Variable, not manipulated by the experimenter, that can influence the outcome of the experiment.
Cox Proportional Hazards Model	A classical semiparametric (sometimes considered nonparametric) method that relates the time of an event (for example, failure or death) to explanatory variables (Covariates). This model assumes that hazard rate, rather than survival time, is a function of the explanatory variables. There are no assumptions made on the shape or nature of the hazard function.
CSV	Comma-separated value format. This text format stores tabular data, with line breaks and commas used to delimit table rows and columns, respectively.
Dendrogram	A tree-like diagram used to summarize a Clustering report. A dendrogram shows where each cluster divides in a hierarchical fashion. See Dendrogram.
Dependent Variable	A Variable whose value is determined by the value of another variable or by the values of a set of variables. This variable lists the responses you measure. In a two-dimensional plot, the dependent variable is usually plotted on the y (vertical) axis.
Deviance Residual	A Residual that measures the disagreement between the maxima of the fitted and observed log likelihood functions.
Dialog	An interactive window that enables you to set parameters for and run a report.
Distance Matrix	A square matrix whose entries are the pairwise distances between the elements of a set.
Distribution	Graphics showing the number or proportion of events falling within a particular interval. JMP Clinical software presents these distributions as histograms or Parallel Plots. See Distribution.
Dot Product	An algebraic operation that takes two equal-length number sequences (usually coordinate vectors) and returns a single number obtained by multiplying corresponding entries and summing those products.
Double False Discovery Rate (FDR) Adjustment	The Double FDR method of Mehrotra and Heyse (2004)2 is used to compare the incidence of adverse events among treatments, leveraging the grouping of related adverse events (typically defined by the MedDRA system organ class). The method considers whether related terms within a group show differences between the treatments and upweights or downweights the significance of an individual term within the group accordingly. In the 2004 paper, the FDR adjustment is performed twice, and simulations are used to control the false discovery rate. Mehrotra and Adewale (2011) refine the Double FDR method to avoid the need for simulations by applying FDR adjustment thrice.
Drill Down	To start at one level of a dimension hierarchy and to click through one or more lower levels until you reach the data that you are interested in.
Eigenvalue	A scalar value that determines by how much a corresponding eigenvector is scaled by the square matrix for which it is defined. In principal components analysis, the eigenvalues of the Covariance or Correlation matrix represent the Variance of the components.
Eigenvector	For a given square matrix, a nonzero vector that changes length, but not direction, when multiplied by the matrix. The computation of principal components for a set of Variables uses the eigenvectors of the variables' Covariance or Correlation matrix.
Electrocardiogram (EG)	A record or display of a person’s heartbeat produced by electrocardiography, the transthoracic interpretation of electrical activity of the heart over a period of time.
Euclidean Distance	The distance between two points that would be measured with a ruler. It is derived from the Pythagorean equation: a^2 + b^2 = c^2.
Extensible Markup Language (XML)	A markup language that structures information by tagging it for content, meaning, or use. Structured information contains both content (for example, words or numbers) and an indication of what role the content plays.
Factor	Also referred to as an Independent Variable or predictor variable, a factor is a Variable included in a model to account for variation in a response. Factors are the variables whose values (levels) you set to study their relationship to a response. You often experiment with many potentially influential factors at the same time.
False Discovery Rate (FDR)	The expected proportion of a set of predictions that are false. For example, if an analysis that predicts the association of 10 genes with a particular trait has a false discovery rate of 0.1, you can expect 9 of the predictions to be correct.
Familywise Error Rate (FWER)	The probability of making one or more false discoveries (Type I Errors) among all hypotheses while performing multiple pairwise tests.
Field	An area in which you can view, enter, or modify a value.
Fisher’s Exact Test	A statistical significance test used in the analysis of contingency tables where sample sizes are small. It is useful when you want to conduct a Chi-square Test, but one of your cells has an expected frequency of five or less. Its name is derived from its inventor, R.A. Fisher, and reflects that the significance of the deviation from a Null Hypothesis can be calculated exactly (as opposed to relying on an approximation whose exactness is realized only as sample size approaches infinity).
Fixed Effects	The effects that drive the variation that you are interested in assessing that have a fixed number of well-defined levels. They can also include nuisance variables that you need to consider in your model. Fixed effects include factors such as experimental treatment, disease status, age or developmental status of the test organisms, and gender. Variation due to fixed effects is the variation that you are interested in estimating and must be kept in the analysis.
Forest Plot	A graphical display designed to illustrate the relative strength of treatment effects (or relative degree of gene enrichment), in multiple quantitative scientific studies (or databases) addressing the same question. Forest plots generally display results for each study (or other data source) as horizontal lines representing the 95% confidence interval of the effect observed in that trial. See Forest Plot.
Gaussian Graphical Models	Multivariate probability distributions encoding a dependency network among variables.
Geometric Mean	The nth root of the product of the data. The statistic is helpful when the data contains a large value in a skewed distribution.
Group Variable	A Variable that is used for grouping results.
Heat Map	A heat map is a visual representation that shows the intensity of a phenomenon as color in two dimensions.
Hepatotoxicity	Chemical-driven damage (toxicity) to the liver.
Hierarchical Clustering	A method of cluster analysis that constructs a hierarchy of clustering. Strategies include the agglomerative approach, where each cluster initially contains only one observation, and the divisive approach, where all observations are initially contained in one cluster. Hierarchical clustering results are commonly presented in Dendrogram form.
High Level Group Term (HLGT)	The second-highest level of the Medical Dictionary for Regulatory Activities (MedDRA) Hierarchy, below System Organ Class (SOC) and above High Level Term (HLT). An example of an HLGT is “Respiratory tract infections”.
High Level Term (HLT)	The third-highest level of the Medical Dictionary for Regulatory Activities (MedDRA) Hierarchy, below High Level Group Term (HLGT) and above Preferred Term (PT). An example of an HLT is “Viral upper respiratory tract infections”.
Hoeffding Correlation (D)	A nonparametric measure of association that detects general departures from independence. This Statistic approximates a weighted sum over observations of chi-square statistics for two-by-two classification tables.
Hotelling T-squared Test	A test of the Null Hypothesis that “the population Mean vector is equal to the given mean vector”. It is the multidimensional equivalent of the one-sample t-test.
Hypothesis Test	A decision-making rule based on data from an experiment or observational study. A hypothesis test is used to conclude significance of a result based on the sufficiently low likelihood (set by the predefined significance level) that it occurred because of random chance alone.
Hy’s Law	An ominous prognostic indicator (in Clinical Research) that a pure drug-induced liver injury (DILI) leading to jaundice, without a hepatic transplant, has a case fatality rate of 10-50%.
Imputation	The computation of replacement values for missing input values.
Independent Variable	This Variable does not depend on the value of another variable; it represents the condition or parameter that is manipulated by the investigator. In a two-dimensional plot, the independent variable is usually plotted on the x (horizontal) axis.
Index Variable	One or more columns specifying how the Observations are to be classified.
Jitter	A random shifting of points by a slight amount along an axis so that more of those points can be effectively visualized in a graphical display.
JMP Scripting Language (JSL)	A scripting language used in JMP applications.
Journal	In JMP, a journal is a file (.jrn) that contains results of user-specified reports.
Kaplan-Meier Survival Curve	A curve based on the survival function estimator from life-time or clinical outcome data. For example, it can be used to measure the proportion of patients living for a given amount of time after treatment, or to measure the time until a tumor disappears.
Kendall Correlation	A metric used to measure the degree of correspondence between two sets of rankings where the metrics used to assess each set of rankings are not equivalent.
K-Means Clustering	A statistical method that creates optimally separated groups of Observations in data using one of several methods. A set of points called cluster seeds is selected as a first guess of the means of the clusters. One cluster seed is selected for each of k clusters. Each observation is assigned to the nearest seed to form temporary clusters. The seeds are then replaced by the Means of the temporary clusters, and the process is repeated until no further changes occur in the clusters.
Label Variable	A column containing descriptive labels that can be printed in the output by certain procedures instead of, or in addition to, the Variable name (which is also known as the SAS Variable Name). Synonymous with SAS Variable Label.
Log-rank Test	A nonparametric Hypothesis Test to compare the survival distributions of two samples. It is appropriate when data are right-skewed and non-informatively censored. Also known as a Mantel-Cox test. This test can be considered a time-stratified Cochran-Mantel-Haenszel Test.
Logistic Function	A common sigmoid curve that can model the S-shaped curve of population growth. Initial growth approximates an exponential curve, followed by slowing growth as saturation begins, followed by no growth at maturity.
Logistic Regression	A generalized linear model used for prediction of the probability of event occurrence (Binomial Regression) by fitting data to a logistic curve (see Logistic Function and Logit Function).
Logit Function	The inverse of the sigmoidal Logistic Function. It is synonymous with Log-odds.
Log-odds	See Logit Function.
Lowest Level Term (LLT)	The lowest level of the Medical Dictionary for Regulatory Activities (MedDRA), below Preferred Term (PT). This level is reserved for non-current, vague, ambiguous, truncated, or misspelled terms, or for terms taken from other terminologies that do not conform to MedDRA rules.
LSMeans	Least squares Means, which are estimates of means of classification effects that would be observed, assuming that the experimental design is balanced.
Macro	A single statement, instruction, or catalog entry that automatically expands into a set of statements, instructions, or text.
Mahalanobis Distance	A distance measure based on Correlations between Variables. In contrast to Euclidean Distance, it is better adapted to non-spherically symmetric distributions, and is scale-invariant. See Mahalanobis Distances.
Main Effect	An effect measures the extent to which the response depends on the factors involved in the effect. A main effect is the change in the response due to a single factor. For two-level factors, the main effect is the difference between the mean response at the high level of a factor and the Mean response at its low level.
MANCOVA	Multivariate analysis of covariance; an extension of the analysis of covariance (ANCOVA) for multiple Dependent Variables or where it is not feasible to combine dependent variables. MANCOVA is similar to MANOVA, but enables control for additional continuous Independent Variables (Covariates).
Manhattan Plot	A type of scatter plot commonly used to display dense data, or data of highly diverse orders of magnitude.
MANOVA	Multivariate analysis of variance, a generalized form of univariate analysis of variance (ANOVA), used when there are two or more Dependent Variables. This analysis is useful in determining whether changes in the Independent Variables have significant effects on the dependent variables, as well as the associated interactions among dependent and independent variables.
Matched Pairs Analysis	A Clinical Research comparison of average score during baseline and a summary score during the trial for each finding. See Matched Pairs Analysis.
Mean	Mathematical average for a collection of n Observations. It is calculated by dividing the sum of the observations by n.
Medical Dictionary for Regulatory Activities (MedDRA)	A clinically validated international medical terminology dictionary used by regulatory authorities and the biopharmaceutical industry.
Median	In any set of n Observations arranged in order of magnitude, the median is the middle value: the observation at position (n+1)/2 when n is odd, or the average of the observations at positions n/2 and (n/2)+1 when n is even.
Menu Bar	The primary list of items at the top of a window, which represent the actions or classes of actions that can be executed. Selecting an item executes an action, opens a Pull-down Menu, or opens a Dialog box that requests additional information.
Metadata	Descriptive data on the content of primary data.
MGPS	Multi-Item Gamma Poisson Shrinker 3
Missing Value	A value in the SAS System indicating that no data is stored for the Variable in the current Observation. It is indicated by a single dot (.) for a numeric variable or a blank for a character variable.
Mixed Model	A statistical model containing both Fixed Effects and Random Effects.
Mode	The value that occurs most often in a probability Distribution or data set.
Model	A formula or algorithm that computes output values from input values.
Modus Tollens	An argument of proof by contradiction; often known as denying the consequent. It has the general argument form of: 1. If P, then Q. 2. Not Q. 3. Therefore, not P.
Mosaic Plot	A graphical representation of a two-way frequency table or Contingency Table. A mosaic plot is divided into colored rectangles, so that the area of each rectangle is proportional to the proportions of the Y Variable in each level of the X Variable. See Mosaic Plot.
Nominal Variable	A Variable that contains discrete values that do not have a logical order. Includes names and other verbal descriptions.
Null Hypothesis	A general or default position that a researcher tests (and attempts to reject) in an experiment. The null hypothesis, H0, is an essential part of a research design, and usually proposes that sample Observations result purely from chance. The null hypothesis can never be proven; data can only reject it or fail to reject it. If a null hypothesis is rejected, an Alternative Hypothesis, H1 (or Ha), is accepted.
Numeric Variable	A Variable that contains only numeric values and related symbols, such as decimal points, plus signs, and minus signs.
Observation	A row (horizontal component) in a SAS data set. Each observation contains one data value for each Variable in the data set.
One-way ANOVA	Analysis of variance with one between-groups factor. This is useful when you have a nominal Independent Variable and a normally distributed interval Dependent Variable, and you want to compare differences in the means of the dependent variable according to levels of the independent variable.
One-way Plot	A plot showing the response points along the Y axis for each X factor value. Using the plot, you can compare the distribution of the response across the levels of the X factor. The distinct values of X are sometimes called levels. See One-way Plot.
One-way Repeated Measures ANOVA	Analysis of variance used on one nominal Independent Variable and a normally distributed interval Dependent Variable that is repeated at least twice for each subject. This method is equivalent to the paired samples t-test, but allows for two or more nominal variable levels.
Operator	A symbol in an expression that requests a comparison, a logical operation, or arithmetic computation.
Optimistic Bias	The established systematic tendency for humans to be overly optimistic about the outcome of planned actions. The likelihoods of positive and negative events are overestimated and underestimated, respectively. This tendency varies based on person and type of action.
Ordinal Variable	A Variable that contains discrete values that have a logical order. For example, a variable called Rank could have values such as 1, 2, 3, 4, and 5.
Overlay Plot	A plot showing several lines or markers on the Y axis overlaid to a common variable on the X axis. See Overlay Plot.
Parallel Plot	A plot consisting of connected line segments across all responses for each row in a data table. See Parallel Plot.
PCTL	See Percentile.
Pearson Correlation	A parametric measure of association for two variables. It measures both the strength and the direction of a linear relationship. If one variable X is an exact linear function of another variable Y, a positive relationship exists if the correlation is 1 and a negative relationship exists if the correlation is -1. If there is no linear predictability between the two variables, the correlation is 0. If the two variables are normal with a correlation 0, the two variables are independent. However, correlation does not imply causality because, in some cases, an underlying causal relationship might not exist.
Percentile	The value of a Variable below which a certain percent of Observations fall. For example, the 60th percentile is the value below which 60% of the observations can be found. Note the following percentile landmarks: 25th percentile = first quartile = Q1; 50th percentile = second quartile = median = Q2; 75th percentile = third quartile = Q3.
Plain Text Format	The format of a text (.txt) file that is readable with little to no processing. Files in this format cannot be embellished with multiple font styles, underlining, italicization, emboldening, and so on.
Posterior Probability	The Conditional Probability of a random event that is assigned after relevant evidence is considered. Contrast with Prior Probability.
Power	The probability of a statistical significance test enabling you to reject the Null Hypothesis when the Alternative Hypothesis is true. Power equals one minus Beta (the rate of Type II Error).
Preferred Term (PT)	The fourth-highest level of the Medical Dictionary for Regulatory Activities (MedDRA) Hierarchy, below High Level Term (HLT) and above Lowest Level Term (LLT). An example of a PT is “Influenza”.
Prior Probability	The probability of an event computed before collection of new data (often based on an experienced expert opinion or rules-of-thumb). An experimenter begins with a prior probability of an event and then revises it in light of new data. Contrast with posterior probability.
PROC	A SAS procedure; a group of SAS statements that call and execute a procedure, usually with a SAS data set as input.
PRR	Proportional Reporting Ratio 4
Pull-down Menu	The list of menu items or choices that appears when you choose an item from a Menu Bar or from another menu.
p-Value	The probability that a Statistic is as extreme as or more extreme than the observed value, assuming that the Null Hypothesis is true. A smaller p-value provides stronger evidence against the null hypothesis.
Q Matrix	The n x p population structure incidence matrix where n is the number of individuals assayed and p is the number of populations defined.
Quantile	Portions taken at regular intervals along a distribution that divide a data set into discrete subsets.
Random Effects	The effects that cause extraneous variation in your results and have little to do with the questions being addressed. Random effects include factors such as physical differences between the arrays, or batch effects resulting from performing different parts of the experiments at different times, on different days, using different lots of reagents, and so on. Variation resulting from random effects can confound your results and should be eliminated from your analysis.
Random Number Seed	The starting point for a random number generator. Unless a number is specified, an arbitrary value, such as the date or time of an event, is used.
Regression Analysis	Techniques for modeling and analyzing several Variables, with the focus on the relationship between dependent and independent variables. Regression analysis is useful in uncovering how values of a Dependent Variable change when a single Independent Variable is varied.
Reliability Diagram	A graph where the conditional distribution of the observations, given the forecast probability, is plotted against the forecast probability. The distributions for perfectly reliable forecasts are plotted along the 45-degree diagonal. See Reliability Diagram.
Residual	Value equal to the response value minus the predicted value.
Rho	See Spearman Correlation.
Root Mean Square Error (RMSE)	A measure of the differences between the values predicted by a model or an estimator and the values actually observed. It is calculated by taking the square root of the Mean square error value.
ROR	Reporting Odds Ratio 5
Sample Size	The number of Observations that constitute a statistical sample. For example, the sample size in a study might consist of the number of subjects. Greater sample sizes lead to greater precision and Power for a study design to detect an effect at a given size.
SAS Data Set	A file whose contents are in one of the native SAS file formats. SAS data sets contain data values in addition to descriptor information that is associated with the data.
SAS Transport File	A file with a compressed format used in SAS. Transport files can be used to move SAS libraries, SAS catalogs, and SAS data sets across different operating systems. Files of this format have the extension .xpt.
SAS Variable Label	Variables (columns) in a SAS data set can have a SAS Variable Label. This label has much less restrictive creation rules than the corresponding SAS Variable Name. Blank spaces, special characters, and longer lengths are permitted.
SAS Variable Name	Every Variable (column) in a SAS data set must have a unique SAS Variable Name. This name must conform to a number of conventions, with notable restrictions on the first character, blank spaces, special characters, and length.
Scatterplot	A graph showing the relationship between two Variables. Multiple scatterplot formats exist, including scatterplot matrices, three-dimensional scatterplots, and Bubble Plots. See Scatterplot.
Screen Failure	A subject in a Clinical Research study that skips treatments or otherwise does not meet treatment criteria. In clinical data sets, a value of “Screen Failure” is given in the treatment column for this subject.
SDTM	CDISC Study Data Tabulation Model. Refer to the SDTM website for more details.
Sensitivity	The proportion of true positives that are correctly identified. Specifically, sensitivity equals the number of true positives divided by the sum of the number of true positives and the number of false negatives.
Shift Plot	A graphical display enabling you to compare how an experimental population responds to an experimental treatment. See Shift Plot.
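Singular Value Decomposition	The factorization of a real or complex matrix that expresses the matrix as a product of three matrices: a unitary matrix, a rectangular diagonal matrix of singular values, and the conjugate transpose of a second unitary matrix. Every m x n matrix has a singular value decomposition.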
Smoothing Bandwidth	A number determining the degree of smoothing for certain algorithms.
Spearman Correlation	Nonparametric method for examining whether two quantitative Variables co-vary. Each pair of variables is converted to ranks and is linked with an “unseen” nominal variable.
Specificity	The proportion of true negatives that are correctly identified. Specifically, specificity equals the number of true negatives divided by the sum of the number of true negatives and the number of false positives.
Standard Deviation	A statistical measure of how “spread out” the data are. It is calculated by taking the positive square root of the sum of the squared deviations of each Observation from the sample Mean divided by (n-1).
Standard Error	The standard deviation of the sample mean. It is calculated by dividing the Standard Deviation by the square root of the Sample Size.
Standard MedDRA Query (SMQ)	Standardised MedDRA Queries (SMQs) are used to support signal detection and monitoring. SMQs are validated, standard sets of MedDRA terms. Some SMQs are a simple set of Preferred Terms (PTs), while other SMQs are hierarchical, containing subordinate SMQs. SMQs include narrow, broad, and algorithmic terms.
Standardization	Multiple meanings are possible: (1) transformation of a data set to have zero Mean and unit Variance; (2) making all regression coefficients have the same scale; (3) normalization.
Statistic	A single measure of a sample attribute. Statistics are derived from the application of a function to sample data. An example is the sample median.
Strata Variable	A Variable that partitions the data into blocks with similar characteristics.
Survival Curves	Plots of survival functions estimated for each subject. See Survival Curves.
Survival Plot	A plot summarizing the survival of patients in each experimental Arm over the course of a Clinical Research trial. See Survival Plot.
System Organ Class (SOC)	The highest level of the Medical Dictionary for Regulatory Activities (MedDRA) Hierarchy, above High Level Group Term (HLGT). An example of an SOC is “Respiratory, thoracic, and mediastinal disorders”.
Tau Value	A nonparametric measure of association based on the number of concordances and discordances in paired Observations. Concordance occurs when paired observations vary together, and discordance occurs when paired observations vary differently. Also used for the truncated product p-Value adjustment method, where rejecting the overall null hypothesis indicates that there is at least one false Null Hypothesis among those with p-values less than tau.
Test Statistic	A function of the data sample that reduces and summarizes the data to either one or a few values that can be used to conduct a Hypothesis Test.
Transformation	The process of applying a function to a Variable in order to adjust the variable's range, variability, or both.
Truncated Product Method (TPM)	A method that smooths p-Values over windows of markers for n Hypothesis Tests by taking the product of those p-values less than a specified cutoff value and evaluating the probability of this product under the overall hypothesis that all n null hypotheses are true.
t-statistic	A measure of how extreme a statistical estimate is. It is calculated by subtracting a reference or hypothesized value from your estimate and then dividing the difference by the Standard Error of the estimate.
t-test	A test that assesses whether the Means of two different experimental groups differ significantly. The Test Statistic follows a Student’s t distribution if the Null Hypothesis is true. If only one Variable is chosen (one-sample t-test), the null hypothesis is that “the population mean is equal to the given mean”.
Type I Error	An incorrect decision made when a test rejects a true Null Hypothesis (H0). This is comparable to a false positive error. Type I error rate is denoted by Alpha, and is referred to as the size of the test.
Type II Error	An incorrect decision made when a test fails to reject a false Null Hypothesis (H0). This is comparable to a false negative error. Type II error rate is denoted by Beta, and is related to the Power of a test (power = 1-beta).
Variable	A column (vertical component) in a SAS data set. The data values for each variable describe a single characteristic for all Observations.
Variance	A measure of deviation of a group of samples from the mean. It is calculated by squaring the Standard Deviation.
Venn Diagram	A graphical representation composed of two or more overlapping circles that shows all of the possible logical relationships between two or more data sets.
Volcano Plot	A Scatterplot of the negative log10-transformed p-Values derived from a specific t-test against the log2-fold change in expression. Genes whose expression is decreased lie to the left of the Mean; genes whose expression is increased lie to the right of the mean. See Volcano Plot.
WHERE Clause	A SAS statement that enables you to filter a set of Observations so that only the subset of data meeting the specific filtering criteria are considered in the analysis.
Wilcoxon Signed-rank Test	A nonparametric statistical hypothesis test used when comparing two related samples or repeated measurements on a single sample to assess whether their population Mean ranks differ. It is appropriate as an alternative to the paired Student’s t-test when the population is not normally distributed or the data is ordinal.
Wizard	An interactive utility program that consists of a series of dialog boxes, windows, or pages. You supply information in each dialog box, window, or page, and the wizard uses that information to perform a task.
Workflow	A series of processes run in a specified order, whose output is collected in a Journal. Given a constant basic experimental design and analysis objectives, a workflow can be used repeatedly with different data sets.
XML	See Extensible Markup Language (XML).
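
Several of the entries above (Sensitivity, Specificity, Standard Deviation, Standard Error, Variance, and t-statistic) are defined by short formulas. The following is a minimal Python sketch of those formulas, using only the standard library. It is not part of JMP Clinical. The function names, counts, and sample values are hypothetical and chosen only for illustration.

```python
import math
import statistics


def sensitivity(true_pos, false_neg):
    """True positives divided by (true positives + false negatives)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """True negatives divided by (true negatives + false positives)."""
    return true_neg / (true_neg + false_pos)


def standard_error(values):
    """Sample standard deviation divided by the square root of the sample size."""
    return statistics.stdev(values) / math.sqrt(len(values))


def t_statistic(values, reference):
    """(sample mean - reference value) divided by the standard error."""
    return (statistics.mean(values) - reference) / standard_error(values)


# Hypothetical sample, for illustration only.
values = [4.0, 6.0, 8.0, 10.0]

sd = statistics.stdev(values)      # uses the (n-1) denominator
var = statistics.variance(values)  # equals sd squared
se = standard_error(values)
t = t_statistic(values, reference=5.0)

print(f"standard deviation = {sd:.4f}")
print(f"variance           = {var:.4f}")
print(f"standard error     = {se:.4f}")
print(f"t-statistic        = {t:.4f}")
print(f"sensitivity        = {sensitivity(90, 10):.2f}")
print(f"specificity        = {specificity(80, 20):.2f}")
```

Note that `statistics.stdev` and `statistics.variance` use the (n-1) denominator, which matches the Standard Deviation entry. The `pstdev` and `pvariance` functions divide by n instead.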