Europe
Discovery Summit
Exploring Data | Inspiring Innovation
Prague | 21-23 March 2017
Abstracts
Triskaidekaphilia
John Sall, Co-Founder and Executive Vice President, SAS
Triskaidekaphilia. This word means “love of the number 13.” With the release of JMP® 13, we plan to make this word meaningful. This session is a tour of some feature highlights of the new release.
To Explain or to Predict?
Galit Shmueli, Tsing Hua Distinguished Professor, Institute of Service Science, National Tsing Hua University, Taiwan
Statistical modeling is a powerful tool for developing and testing theories by way of causal explanation, prediction and description. In many disciplines, there is near-exclusive use of statistical modeling for causal explanation, with the assumption that models with high explanatory power are inherently of high predictive power. Conflation of explanation and prediction is common, yet the distinction must be understood for scientific knowledge to progress and for models to be used properly in practice.
Understanding the differences between explanatory and predictive modeling and assessment is crucial for being able to assess a data set's information quality – its potential to achieve a scientific/practical goal using data analysis. While the explain-predict distinction has been recognized in the philosophy of science, the statistical and data mining literature lacks a thorough discussion of the many differences that arise in the process of modeling for an explanatory versus a predictive goal. In this talk I will clarify the distinction between explanatory and predictive modeling and reveal the practical implications in terms of data analysis.
From Quality by Design to Information Quality: A Journey Through Science and Business Analytics
Ron S. Kenett, Research Professor, Mathematics Department, University of Turin, Italy
This talk is a journey meandering between science and business analytics. To provide context I will first list, very briefly, my role models. Specifically, I will mention Sir David Cox, who taught me the introduction to statistics class as an undergraduate at Imperial College; Sam Karlin, who was my PhD adviser at Stanford and the Weizmann Institute; George Box and Bill Hunter, who opened the door for me to applied statistics in business and industry; as well as Stu Hunter, Ed Deming and Joe Juran. They all had a significant impact on my career. The next stop on the journey will provide a brief introduction to Quality by Design (QbD), as applied in the pharmaceutical industry. The third stop will discuss a topic of growing concern in science – reproducibility of research findings. To address this issue, I will sketch a new proposal based on generalizability of findings. Generalization is one of the eight dimensions of information quality (InfoQ), and this stop represents joint work with Galit Shmueli carried out over the last eight years and summarized in our recent book. A final stop will discuss challenges ahead for analytics and statistics. The motivation behind this journey is to demonstrate the key role of statistical thinking in modern analytics and its impact on both science and business applications. Ultimately, these thoughts and examples are driven by the ambition to put statistics back in the driver's seat of data-driven work. Throughout, I will show some examples using JMP to make the case.
Statistical Thinking and Politics: Perspectives From a Parliamentary Experience
Pedro Manuel Saraiva, Full Professor, Chemical Engineering Department, University of Coimbra, Portugal
After some decades in an academic career, during which I used a strong statistical background to conduct research activities, I was given the opportunity to run twice for election as a Member of the National Parliament of Portugal. It is mostly about this challenging experience and period of my life (2009 to 2015) as a Member of Parliament (MP) that I will share examples, thoughts and conclusions. This talk will show how statistical thinking and tools, as well as fact-based approaches, can provide a better understanding of how Parliaments work and of some of the strongest features of their organizational cultures, and can help achieve better results, efficiency and efficacy. For that purpose, I will provide specific illustrations of how statistical tests, variation analysis, clustering and Bayesian interpretations were applied to several situations related to the Portuguese Parliament, politics and politicians. I hope that this presentation will provide enough support to show that statistical thinking and tools can indeed help us understand and improve Parliaments, and help politicians make better fact-based decisions. Parliaments and societies are likely to improve if more people with a good statistical background accept the challenge of becoming an MP, at least for a while.
-
A Constraint for the Prediction Profiler for a Partial Least Squares Regression Model in JMP®
Zhiwu Liang, PhD, Principal Statistician, Procter & Gamble Company
- Topic: Predictive Modeling
- Level: 4
Since H. Wold (1985) and S. Wold (1983) developed the partial least squares regression (PLSR) method, the approach has been broadly applied in many areas, especially in industry. PLSR can handle data with many highly correlated predictors, which would cause collinearity problems in an ordinary linear regression model. This is very useful for chemical engineers who build prediction models from data in which many highly correlated chemical variables serve as predictors. JMP Pro provides a very good PLSR modeling tool and Prediction Profiler graph. However, the Prediction Profiler also brings challenges for prediction from PLSR models: the predictors in a PLSR model are highly correlated, while the JMP Prediction Profiler treats all predictors as independent. In this paper, we present a linear constraint profiler for the predictors in a PLSR model that correctly predicts response variables by taking the correlation among predictors into account. In Section 1 we present the constraint formula for a PLSR model with linear terms only. In Section 2 we extend the formula to PLSR models with higher-order terms such as interactions, quadratic terms or categorical variables. In Section 3 we present a real case study showing the application of the method.
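To make the collinearity point concrete, here is a minimal Python sketch (not the authors' constrained profiler, which is JMP-based) of fitting a PLS regression to highly correlated predictors and moving a prediction point along a latent direction rather than one predictor at a time; all data and settings are synthetic.

```python
# A minimal sketch (not the authors' constrained profiler): fitting a PLS
# regression on highly correlated predictors. All data are synthetic.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
n = 100
latent = rng.normal(size=(n, 2))                       # two underlying factors
X = latent @ rng.normal(size=(2, 8)) + 0.05 * rng.normal(size=(n, 8))
y = latent @ np.array([1.5, -2.0]) + 0.1 * rng.normal(size=n)

pls = PLSRegression(n_components=2)
pls.fit(X, y)

# Because the predictors are correlated, a sensible "profile" moves X only
# within the latent subspace, not one column at a time.
x_new = X.mean(axis=0) + pls.x_loadings_[:, 0]         # step along latent direction 1
print(pls.predict(x_new.reshape(1, -1)))
```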
-
An Image Analysis Project
Olivier Brack, Consultant in Industrial Statistics, KSIC
- Topic: Data Visualization
- Level: 4
In order to improve the automation of photo processing, we develop various scripts and implement new analytical methods. Our approach is based on the analysis of color shades, combining the image analysis functions offered as standard in JMP, the ImageToDataTable.jmp add-in (developed by J. Ponte), contour plots, and the “In Polygon” function available for expression columns (vectors, in our case). Among the topics we have addressed, we will present:
- Identification and quantification of impurities in quality control on food production.
- Analysis of the Olive Tail Moment (OTM) distribution, based on Comet Assay results.
- Comparison of anti-pollution products from the cosmetics industry.
Alongside the image analysis, we’ll present a script to analyze the performance of vehicles according to the type of terrain encountered, using the Google Earth map background in JMP.
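As a rough illustration of the point-in-polygon idea behind the “In Polygon” function, the following hedged Python sketch classifies pixel coordinates against a hand-drawn contour; the polygon and coordinates are hypothetical.

```python
# A rough analog (in Python, not JSL) of the "In Polygon" idea: classify
# pixel coordinates by whether they fall inside a contour.
import numpy as np
from matplotlib.path import Path

polygon = Path([(1, 1), (4, 1), (4, 3), (1, 3)])       # hypothetical contour
pixels = np.array([[2.0, 2.0], [5.0, 2.0], [1.5, 2.5]])
inside = polygon.contains_points(pixels)
print(inside)                                          # [ True False  True]
```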
-
Analysis Strategies for Wide, Omic Data Sets
Stanley Young, PhD, CEO, CGStat
Paul Fogel, PhD, Independent Statistical Consultant
- Topic: Predictive Modeling
- Level: 3
Omic data sets are becoming widely available from public repositories. Most of these data sets have substantially fewer observations than variables. There is a need for analysis strategies that deal efficiently and effectively with multiple mechanisms/subgroups: often several mechanisms (groups of variables) give rise to the same observed phenotype, so that one named disease comprises two or more etiologies. We examine a topical public data set of about 700 subjects and about 27,000 variables. The response of interest is subject age: can the age of a person be deduced from the methylation state of a blood sample? We highlight two JMP add-ins, P-value Plot and Variable Partitioning, the latter a combination of the lasso and non-negative matrix factorization. We also use recursive partitioning (JMP Partition) and visualizations. The benefit of our work is a simple analysis strategy that can identify subgroups of subjects that follow a similar mechanism.
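A hedged sketch of the general screen-then-group strategy (plain Python, not the JMP add-ins named above): the lasso screens a wide matrix, then non-negative matrix factorization groups the surviving variables. The data are synthetic stand-ins.

```python
# A hedged sketch of the general idea (not the JMP add-ins themselves):
# screen a wide matrix with the lasso, then group the surviving variables
# with non-negative matrix factorization. Synthetic stand-in data.
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.decomposition import NMF

rng = np.random.default_rng(1)
n, p = 120, 2000                      # far more variables than subjects
X = rng.random((n, p))                # e.g., methylation proportions in [0, 1]
age = 40 + 30 * X[:, :5].mean(axis=1) + rng.normal(0, 1, n)

lasso = LassoCV(cv=5).fit(X, age)
keep = np.flatnonzero(lasso.coef_ != 0)

# NMF on the retained (non-negative) columns suggests variable subgroups.
W = NMF(n_components=2, init="nndsvda", max_iter=500).fit_transform(X[:, keep])
print(len(keep), W.shape)
```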
-
Application of the SEMI C64 Algorithm to Calculate Control Limits for Non-Normally Distributed Process Data Using JMP® Scripting
Marc Stanke, PhD, Software Engineer, Life Science Analytics, Accantec Consulting
Albrecht Uhlig, PhD, Manager Quality Assurance, Atotech
- Topic: Quality and Reliability
- Level: 2
Statistical process control offers the opportunity to reduce the risk of batch faults and, therefore, potentially reduces reconditioning costs or the loss of a whole batch. Six Sigma is a measure of statistical certainty: the extent to which the process stays within the specification limits. Usually statistical process control assumes a normal distribution. But what if a normal distribution does not describe the process data? This presentation gives insight into applications of the SEMI C64 Ship to Control algorithm, its use for non-normally distributed process data, and the effects of the skewness of the process data, as well as of the number of available measurements, on the calculated control limits. It also reports experiences from applying the C64 algorithm to real-life data. The authors will give insight into the implementation in JSL, as well as the integration with JMP, and give an estimate of the applicability to real processes.
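The SEMI C64 algorithm itself is not reproduced here; as background, the sketch below shows only the generic idea of percentile-based control limits from a fitted skewed distribution, the usual alternative to mean ± 3σ limits under normality. The gamma model and all values are illustrative assumptions.

```python
# NOT the SEMI C64 algorithm: a generic sketch of percentile-based control
# limits from a fitted skewed distribution, as an alternative to
# mean +/- 3 sigma under normality. Illustrative data only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.gamma(shape=2.0, scale=1.5, size=200)          # skewed process data

shape, loc, scale = stats.gamma.fit(x)
lcl, ucl = stats.gamma.ppf([0.00135, 0.99865], shape, loc=loc, scale=scale)
print(f"LCL = {lcl:.3f}, UCL = {ucl:.3f}")             # ~3-sigma-equivalent coverage
```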
-
Beyond the Trees: The Majesty of the Forest Plot
Richard C. Zink, PhD, JMP Principal Research Statistician Developer, SAS
- Topic: Data Visualization
- Level: 1
Given the cost and complexity of conducting a clinical trial, and the uncertainty as to how patients may respond to an intervention, researchers often collect as much data as possible in order to describe the safety and efficacy of a novel treatment. Data visualization plays a key role to effectively summarize and communicate the results of these investigations. One graphical display of note is the forest plot, a figure that presents one or more confidence or credible intervals vertically to communicate either the findings of multiple endpoints or a single endpoint from multiple groups. In this presentation, we describe several applications of the forest plot in the context of clinical trials, including safety and quality screening, subgroup analysis, sensitivity analysis of a primary endpoint and meta-analysis. While we present many of the aforementioned examples in a Frequentist context, we also discuss how forest plots can easily communicate the results of complex Bayesian models that utilize Markov Chain Monte Carlo. We illustrate these examples using the freely available JMP forest plot add-in.
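For readers unfamiliar with the display, a minimal matplotlib sketch of a forest plot (not the JMP add-in itself) might look like the following; the subgroups and interval estimates are invented for illustration.

```python
# A minimal matplotlib sketch of a forest plot (not the JMP add-in):
# point estimates with confidence intervals for several subgroups.
import matplotlib.pyplot as plt

groups = ["Overall", "Age < 65", "Age >= 65", "Female", "Male"]   # hypothetical
est    = [0.80, 0.75, 0.88, 0.78, 0.83]                           # odds ratios
lo     = [0.68, 0.58, 0.70, 0.61, 0.66]
hi     = [0.94, 0.97, 1.11, 1.00, 1.04]

y = range(len(groups))
fig, ax = plt.subplots()
ax.errorbar(est, y, xerr=[[e - l for e, l in zip(est, lo)],
                          [h - e for e, h in zip(est, hi)]], fmt="o", capsize=3)
ax.axvline(1.0, linestyle="--", color="gray")   # no-effect reference line
ax.set_yticks(list(y)); ax.set_yticklabels(groups)
ax.set_xlabel("Odds ratio (95% CI)")
plt.show()
```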
-
Determining the Right Sample Size for an MSA Study
Laura Lancaster, PhD, JMP Principal Research Statistician Developer, SAS
Chris Gotwalt, PhD, JMP Director of Statistical Research and Development, SAS
- Topic: Quality and Reliability
- Level: 3
Measurement systems analysis (MSA) studies are designed experiments that determine how much measurement variation is contributing to overall process variation. These studies are a critical first step in determining whether a measurement system will be able to detect process shifts with control charts or correctly identify product improvements in designed experiments. The typical MSA studies presented in guidebooks, college textbooks and journals have small sample sizes and have been widely emulated, even though little research has been published on the sample sizes these studies need to give reliable results. We will show the results of a series of simulation experiments that investigate the relationship between sample size, estimation method (REML versus Bayesian), the actual quality of the measurement system, and the way the MSA study is collected (crossed versus nested). Based on the results, we recommend the sample sizes needed to correctly determine the adequacy of a measurement system with high probability. As part of the presentation, we will demonstrate how we used the new Simulate option in JMP Pro 13 to make the simulation experiments easy.
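A hedged sketch of the simulation idea in ordinary Python (not JMP Pro's Simulate option): repeatedly simulate a simple balanced parts-by-replicates study, estimate the variance components by the method of moments, and count how often the measurement system is correctly judged acceptable. The true variance components and the 30% acceptance cut-off are illustrative assumptions.

```python
# A hedged sketch of the simulation idea (not JMP Pro's Simulate feature).
import numpy as np

rng = np.random.default_rng(3)
parts, reps, n_sim = 10, 3, 1000
sd_part, sd_gauge = 4.0, 1.0            # true process vs. measurement sd

hits = 0
for _ in range(n_sim):
    p_eff = rng.normal(0, sd_part, parts)
    y = p_eff[:, None] + rng.normal(0, sd_gauge, (parts, reps))
    ms_within = y.var(axis=1, ddof=1).mean()            # E = sigma_e^2
    ms_between = reps * y.mean(axis=1).var(ddof=1)      # E = sigma_e^2 + r*sigma_p^2
    var_p = max((ms_between - ms_within) / reps, 0.0)
    gauge_ratio = np.sqrt(ms_within / (ms_within + var_p))
    hits += gauge_ratio < 0.3                           # a common acceptance cut-off
print(f"Correctly judged acceptable in {hits / n_sim:.1%} of simulated studies")
```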
-
Development of JMP® Scripts That Enable Reliable, Reproducible and Fast Data Analysis and Reduce the Amount of Rework
Birgitte Høiriis Kaae, PhD, Manager of the Chemistry Laboratory, Radiometer Medical
- Topic: JSL Application Development
- Level: 3
Biomarkers for medical devices are subjected to complex verification and validation studies. Inconsistency within the data, as well as subjective evaluations of the data, can result in data analyses that are inconclusive or non-reproducible, which in turn result in rework and waste. In this project we have implemented standard work for data evaluation and analysis through the use of best practices and JMP scripts for all major verification and validation studies. The developed JMP scripts include:
- Logging of input parameters.
- Built-in IQ-OQ function to ensure that only validated scripts are employed.
- Consistency check of the data.
- Data analysis.
- Reporting.
The JMP scripts can be controlled centrally, enabling version control. The user interface is simple and accommodates a large user group. The implementation of standard work has had a remarkable effect on the quality of the results produced. Data quality has improved, the need to repeat studies because of errors has decreased, and the need for data cleanup has dropped dramatically. The data analyses are fast and reproducible using advanced JMP scripts.
-
Fitting Repeated Measures Data With JMP® Pro 13: Bigger and Better
Don McCormack, JMP Technical Enablement Engineer, SAS
- Topic: Predictive Modeling
- Level: 3
Accurate characterization of repeated measures mixed models depends on assumptions about the correlation of observations across measurement periods. Adjacent time periods are likely more similar than distant time periods, but how is that similarity captured? The correlation structure of a repeated measures model answers that question. From the perspective of model implementation, the more structures available, the better the chance of approximating the way the data actually behave. Prior to JMP Pro 13, only three commonly used time-based repeated measures covariance structures were available: unstructured, autoregressive of order 1, and residual. This made characterization challenging, leaving a large number of potential solutions unavailable to the modeler. JMP Pro 13 introduces seven new structures, greatly expanding the modeling opportunities. This talk will discuss modeling repeated measures data using any of the existing covariance structures, including those traditionally reserved for spatial relationships. It will provide examples illustrating their relationship to each other, when to use which structure, and how to compare them to find the best fit. A brief overview of repeated measures mixed models will be given, as well as the SAS code corresponding to the examples.
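To make the notion of a covariance structure concrete, here is a small numpy sketch of two classic structures, first-order autoregressive and compound symmetry; the variance and correlation values are illustrative, and JMP's menu of structures is broader than this.

```python
# A small numpy sketch of two classic repeated measures covariance
# structures (illustrative parameter values).
import numpy as np

t = 5                      # number of measurement periods
sigma2, rho = 4.0, 0.6

# AR(1): correlation decays with the distance between time points.
lags = np.abs(np.subtract.outer(np.arange(t), np.arange(t)))
ar1 = sigma2 * rho ** lags

# Compound symmetry: every pair of periods is equally correlated.
cs = sigma2 * ((1 - rho) * np.eye(t) + rho * np.ones((t, t)))

print(np.round(ar1, 2))
print(np.round(cs, 2))
```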
-
Increase Efficiency and Model Applicability Domain When Testing Options That Are at First Glance Multilevel Categorical Factors
Silvio Miccio, Modeling and Simulation, Trainer and Consultant for Empirical Modeling and Optimization, Procter & Gamble Service
- Topic: Design of Experiments
- Level: 3
When testing options such as different raw materials or formulation ingredients, it is common practice to vary them as a multilevel categorical variable (A, B, C) in an experiment. To identify the best option, all of them must be tested. This leads to time-consuming physical testing, and the resulting model can only predict the tested options, not options that have not been tested. A much more efficient approach is to design the experiment based on the physical/chemical properties of each option. This significantly decreases the number of required experimental conditions and yields a model that can predict options never tested before. The presentation will demonstrate how to efficiently:
- Compress the available information of physical/chemical descriptors by principal components.
- Select the “corners of the box” for testing representative options based on design of experiments.
- Model the data via PLS.
- Find the overall optimum solution.
- Identify physically available options closest to the calculated optimum solution.
Notice that what is shown here is based on a method commonly used in chemometrics called quantitative structure-activity relationship (QSAR).
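A hedged Python sketch of the first two steps, descriptor compression and “corners of the box” selection; the descriptor matrix here is random stand-in data, whereas a real application would use measured physical/chemical properties.

```python
# A hedged sketch of descriptor compression and "corner" selection;
# the descriptor values are hypothetical stand-ins.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
descriptors = rng.normal(size=(30, 6))     # 30 raw materials x 6 properties

scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(descriptors))

# Pick the candidates that are most extreme on each principal component:
# these span the property space and make good design "corners".
corners = set()
for pc in range(scores.shape[1]):
    corners.add(int(scores[:, pc].argmin()))
    corners.add(int(scores[:, pc].argmax()))
print("Materials to test:", sorted(corners))
```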
-
jClick! – Revolutionizing Data Analysis and Delivery in Semiconductor High-Volume Manufacturing
Kevin Lennon, Staff Product Development Engineer, Intel Corporation
Adrian Porter, Intel Senior Staff Technologist, Intel Corporation
- Topic: Data Access and Manipulation
- Level: 1
Intel Corporation, one of the world’s largest semiconductor manufacturers, supplies processors used in every part of the connected world, from high-powered multi-core servers to ultra-low-power Internet of Things devices. In the creation of these world-class products, the manufacture of silicon wafers creates terabytes of data – data that can, and must, be used to maximize value. In 2009 the authors initiated a data analytics revolution by creating an automated data delivery system at Intel Ireland’s Fab 24 site that has gone on to save over 1,000 engineers hundreds of thousands of hours in engineering time. It has automated mundane tasks to allow immediate solutions, and it has detected defect signals that were previously undetectable. The system is called jClick!, and JMP is right at its heart. What started as a lean project in a single process engineering department has spread to seven high-volume manufacturing facilities and is being used by thousands of employees daily. By focusing on the individual needs of the engineer and looking for lots of small wins, jClick! has succeeded in making analytics personal. jClick! is a remarkable tale of technical innovation and inventive marketing, one everyone should hear.
-
JMP® 13 Bayesian Design and Analysis Delivers Profitable Market Share Gains
Robert Reul, Founder and Managing Director, Isometric Solutions
- Topic: Predictive Modeling
- Level: 3
A major appliance repair provider sought to profitably increase market share by introducing the most competitive product it could deliver. The challenge was to address a vast spectrum of considerations with just one study, which, when rendered, produced a choice experiment with more than 2.6 million combinations, varying characteristics such as service price, repair warranty, appointment window and repair urgency. Further, specific consideration needed to be given to differing customer populations and appliances needing repair.
Getting the product right: A Bayesian choice design was generated by phasing the study. A first wave was fielded with the employee population. This produced a Bayesian prior distribution of likely parameter values, which was then used to create the Bayesian D-optimal design fielded in subsequent waves to both customer and prospect populations. Further, because preference characteristics were believed to vary by appliance, the survey was fielded so that respondents evaluated only choices based on the specific appliance breakdowns they had experienced within the previous year.
Targeting the right product: A hierarchical Bayes choice analysis was used to fit parameter estimates for each individual respondent surveyed. By modeling respondent-level parameter vectors, the company was able to identify the prospective customers, based on specific demographic, geographic and psychographic profiles, who were most receptive to the new repair service offering.
Bringing home the profit: The company chairman believed there was untapped revenue potential in the home repair business; using willingness-to-pay estimates, this potential was monetized and tested in market with premium pricing for improved service offerings. The results of these market tests will be shared, along with extensive demonstrations using the JMP Choice platform and follow-up analysis of the respondent-level information on the anonymized but real study data.
-
Localize Your Custom JMP® Applications for International Users
Michael Hecht, JMP Principal Systems Developer, SAS
- Topic: JSL Application Development
- Level: 4
JSL, the powerful scripting language in JMP, can be used to extend JMP features in a variety of ways. But when your JSL script has an international audience, what is the best way to localize its user interface? This paper describes the tools and methodology used by the JMP development team to localize the JSL scripts that are built into JMP. You can combine these same tools with your own methodology to give your scripts international appeal.
-
Missing Genuine Effects Is Bad, but Identifying False Effects Can Be Worse
Robert Anderson, JMP Senior Statistical Consultant, SAS
- Topic: Predictive Modeling
- Level: 3
Scientists and engineers need to be able to find the best possible model for their process or product and correctly identify which factors are genuinely important and which are not. Often the greatest concern is that an important or vital factor will be missed. However, a more insidious and potentially worse problem is that statistical modeling methods frequently identify factors that are statistically significant but not genuinely active. The identification of these non-genuine effects results in valuable scientific and engineering resources being squandered on further investigation of these false effects. It may take considerable time and resources before the flawed model and non-genuine effects are recognized. The holdback validation methodology in JMP Pro provides a powerful way of suppressing this model overfitting problem, even with relatively small data sets. Using simulated data sets, this paper will demonstrate how frequently and easily the problem of detecting non-genuine effects can occur and how holdback validation can effectively suppress it. However, holdback validation is not a magic bullet, and some examples of when it doesn’t work well will also be shown.
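The phenomenon is easy to reproduce. The following hedged sketch (generic Python, not the JMP Pro workflow) selects the most promising of 30 pure-noise factors on a training set and shows how a holdback set discredits them.

```python
# A hedged sketch of the phenomenon (not the JMP Pro workflow): with pure
# noise predictors, in-sample selection finds "significant" effects that a
# holdback set promptly discredits.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(5)
X = rng.normal(size=(60, 30))             # 30 candidate factors, all inactive
y = rng.normal(size=60)                   # response is pure noise

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.33, random_state=0)

# Greedily keep the 5 training-set columns most correlated with y ...
corr = np.abs([np.corrcoef(X_tr[:, j], y_tr)[0, 1] for j in range(X.shape[1])])
best = np.argsort(corr)[-5:]

model = LinearRegression().fit(X_tr[:, best], y_tr)
print("Training R^2:", round(model.score(X_tr[:, best], y_tr), 2))   # looks real
print("Holdback R^2:", round(model.score(X_te[:, best], y_te), 2))   # ~0 or negative
```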
-
Multivariate Analysis of Sensory Data With JMP®
Jianfeng Ding, JMP Senior Research Statistician Developer, SAS
- Topic: Data Exploration
- Level: 3
Data sets resulting from sensory and consumer studies can be quite large, with many different columns and data types. A variety of multivariate data analysis methods can be useful in the exploration and analysis of sensory data. Over the past few years, many of these methods have been added to JMP. In this paper, we present how to apply methods such as analysis of variance (ANOVA), hierarchical cluster analysis, principal components analysis (PCA), multidimensional scaling (MDS), multiple correspondence analysis (MCA) and text exploration to sensory data, emphasizing how each of these procedures operates, how each is interpreted, and how they relate to one another. By illustrating the best methods to address sensory problems, the goal of this paper is to familiarize analysts with sensory evaluation and appropriate multivariate methods so that each analyst can effectively use these methods in JMP for their own sensory studies.
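As a small taste of two of the methods named above, here is a hedged Python sketch of PCA and hierarchical clustering on a toy products-by-attributes sensory matrix; the ratings are invented for illustration.

```python
# A compact sketch of two of the named methods on a toy sensory matrix
# (products x attribute ratings); data are invented for illustration.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.decomposition import PCA

rng = np.random.default_rng(6)
ratings = rng.uniform(1, 9, size=(12, 8))       # 12 products, 8 attributes

scores = PCA(n_components=2).fit_transform(ratings - ratings.mean(axis=0))

tree = linkage(ratings, method="ward")          # hierarchical clustering
clusters = fcluster(tree, t=3, criterion="maxclust")
print(scores[:3])
print(clusters)
```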
-
Multivariate Analysis Overcomes Complexities in Injection Molding
Wayne Levin, President, Predictum
David Calder, Six Sigma Black Belt, Magna International
- Topic: Quality and Reliability
- Level: 2
Over the years, automotive exterior parts have become more complex and substantially larger, yet are molded at faster cycle times. The transformation in design and challenging manufacturing demands have driven changes in tool design, hot runner design, material formulation and molding machine functionality. With these increasing challenges, we have to ask ourselves whether conventional methods of quality control, which are typically univariate, are still effective. The short answer is no. This presentation demonstrates how multivariate analysis extracts pertinent information from large amounts of complex data, identifying the correlation structure and relationships that exist between multiple process variables and presenting them visually. We’ll present a project comparing univariate and multivariate approaches. These methods hold the promise to both reduce the dependency on subjective, visual inspection and make lights-out manufacturing more viable.
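One common multivariate monitoring statistic is Hotelling's T²; the hedged numpy sketch below shows how it flags an observation whose individual variables each look plausible but whose combination violates the learned correlation structure. The presenters' actual approach may differ; the reference data and covariance are synthetic.

```python
# A hedged numpy sketch of one common multivariate monitoring statistic
# (Hotelling's T^2); the presenters' actual approach may differ.
import numpy as np

rng = np.random.default_rng(7)
ref = rng.multivariate_normal([0, 0, 0], np.array([[1.0, 0.8, 0.5],
                                                   [0.8, 1.0, 0.6],
                                                   [0.5, 0.6, 1.0]]), size=200)
mu, S_inv = ref.mean(axis=0), np.linalg.inv(np.cov(ref, rowvar=False))

def t2(x):
    d = x - mu
    return float(d @ S_inv @ d)

in_control  = np.array([0.1, 0.2, 0.1])
out_of_sync = np.array([1.5, -1.5, 0.0])   # each variable is plausible alone,
print(t2(in_control), t2(out_of_sync))     # but the combination breaks the correlation
```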
-
Nonlinear Modeling for Dissolution Data Treatment
Noëlle Boussac-Marliere, Senior Statistician, Merial
- Topic: Data Exploration
- Level: 3
A comparison of dissolution data for a pharmaceutical product will be presented. The analysis approach is not standard in the pharmaceutical domain, but it was requested in this specific context. For each condition, and for each batch of product and vessel, a nonlinear regression using a specific model is applied. The model is defined in-line because it is not in the predefined list. The parameters of interest are estimated by inverse prediction and analyzed to compare the conditions. The statisticians involved are based at Merial's pharmaceutical manufacturing site in Toulouse.
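The abstract does not disclose the in-house model, so the following hedged sketch uses a standard Weibull-type dissolution curve as a stand-in: fit the nonlinear model with scipy, then invert it to estimate the time to reach 80% dissolved. All data points are synthetic.

```python
# A hedged sketch of the general workflow (the actual in-house model is not
# shown in the abstract): fit a Weibull-type dissolution curve and invert it
# for inverse prediction. Synthetic data.
import numpy as np
from scipy.optimize import curve_fit

def weibull_diss(t, a, b):
    """Percent dissolved at time t (Weibull dissolution model)."""
    return 100.0 * (1.0 - np.exp(-(t / a) ** b))

t_obs = np.array([5, 10, 15, 20, 30, 45, 60], dtype=float)        # minutes
y_obs = np.array([22, 41, 55, 66, 80, 91, 96], dtype=float)       # % dissolved

(a, b), _ = curve_fit(weibull_diss, t_obs, y_obs, p0=(20.0, 1.0))

# Inverse prediction: solve weibull_diss(t) = 80 for t analytically.
t80 = a * (-np.log(1 - 0.80)) ** (1.0 / b)
print(f"a = {a:.2f}, b = {b:.2f}, t(80%) = {t80:.1f} min")
```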
-
Pictures From the Gallery Two: Select Advanced Graph Builder Views
Scott Wise, JMP Beacon Account Technical Manager, SAS
- Topic: Data Exploration
- Level: 1
This presentation will show how to build the next round of advanced graphs in JMP 13 Graph Builder. We will feature several popular industry graph formats that you may not have known could be easily built in JMP. Views such as density overlays, Sankey diagrams and more will be shown, all of which can help breathe life into your analytics and give you a compelling way to communicate your results upward.
-
Regression Control Charting Using the Random Coefficient Regression Method
Pius Dahinden, PhD, Manager of Analytical Sciences, Tillotts Pharma
- Topic: Quality and Reliability
- Level: 2
Since the new EU GMP Guideline (EudraLex Volume 4) Chapter 6 (Quality Control) took effect in 2014, performing a trend analysis on drug product stability data has been an additional analytical quality control requirement. The European Compliance Academy's (ECA) Analytical Quality Control Working Group (AQCWG) has worked out a draft standard operating procedure for the assessment of out-of-expectation (OOE) and out-of-trend (OOT) results. For the assessment of stability data, it is recommended to use the regression control chart (RCC) procedure for the comparison of current batches with data of historical batches. However, the slopes and/or intercepts estimated from the historical batch data are themselves subject to uncertainty, so the ECA AQCWG proposes a simplified random coefficient regression (RCR) method that takes this into account. The JMP script that implements the recommended RCC/RCR method is presented. The script makes identifying OOE/OOT points in stability data simple and provides all the necessary information in the output, together with an appropriate graphical display.
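The ECA SOP and the full RCR method are not reproduced here; the strongly simplified Python sketch below conveys only the basic regression-control-chart logic: fit each historical batch, then judge a new batch's slope against the spread of the historical slopes. The batch data and the 3-sigma rule are illustrative assumptions.

```python
# A strongly simplified sketch of the regression control chart idea (NOT the
# ECA SOP nor the full RCR method). Synthetic stability data.
import numpy as np

rng = np.random.default_rng(8)
months = np.array([0, 3, 6, 9, 12], dtype=float)

# Historical stability batches: assay declining ~0.1 %/month.
hist_slopes = []
for _ in range(12):
    assay = 100 - 0.1 * months + rng.normal(0, 0.2, months.size)
    slope, _ = np.polyfit(months, assay, 1)
    hist_slopes.append(slope)

m, s = np.mean(hist_slopes), np.std(hist_slopes, ddof=1)

new_assay = 100 - 0.25 * months + rng.normal(0, 0.2, months.size)  # degrades faster
new_slope, _ = np.polyfit(months, new_assay, 1)
print("Out of trend?", abs(new_slope - m) > 3 * s)
```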
-
Second-Order Terms in Decision Trees
Sebastian Hoffmeister, Trainer and Statistical Consultant, STATCON
Bertram Schäfer, Owner, STATCON
- Topic: Data Exploration
- Level: 2
Decision trees are a well-known method when it comes to data mining and predictive modeling. Compared to other methods, decision trees are unique in that they not only deliver good predictions but also display the relationships in the data. This talk showcases the use of decision trees on technical process data as a preliminary step before a designed study: decision trees are used to find potential factors in process data that might be worth investigating. As part of this presentation, we introduce the use of interactions and higher-order terms in decision trees – a technique that has received very little research so far. We will show how including interactions can make decision trees smaller, more precise and easier to explain.
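The core trick can be sketched in a few lines of Python (an illustration of the idea, not the presenters' implementation): append an interaction column before growing the tree, so that a single split can act on X1·X2 directly. The data are simulated with a pure interaction effect.

```python
# A hedged sketch of the core trick: append interaction columns before
# growing the tree, so a single split can use X1*X2 directly.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(9)
X = rng.uniform(-1, 1, size=(300, 3))
y = 2.0 * X[:, 0] * X[:, 1] + rng.normal(0, 0.1, 300)   # pure interaction effect

X_aug = np.column_stack([X, X[:, 0] * X[:, 1]])          # add the X1*X2 term

plain = DecisionTreeRegressor(max_depth=3).fit(X, y)
aug   = DecisionTreeRegressor(max_depth=3).fit(X_aug, y)
print("R^2 without interaction column:", round(plain.score(X, y), 2))
print("R^2 with interaction column:   ", round(aug.score(X_aug, y), 2))
```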
-
The Experience of Applying a Definitive Screening Design in a Polymerization Process
Maria Lanzerath, Statistician, W.L. Gore & Associates
- Topic: Design of Experiments
- Level: 4
Definitive screening designs (DSDs) have some very attractive properties, as well as some differences from optimal designs. This talk tells the story of the first use of a DSD at Gore in Germany. It was applied at an intermediate stage of process development for the production of a PTFE resin (a chemical process). The team of engineers preferred to run a DSD over an optimal design because of the center points and the smaller design size, with the caveat of not being able to deliberately choose the interactions. The design comprises nine factors over three process steps. An interesting detail of the analysis is that we have responses after each process step, so there will be up to three different models, depending on the process step after which a response was measured. The design is being executed at the moment. I want to show the whole journey, from the considerations and decisions during the planning phase all the way through to the analysis and the lessons we took from this design. We will also discuss whether the design was a good choice.
-
The New and Improved Formula Editor in JMP® 13
Mike Muhlada, JMP Development Tester, SAS
- Topic: JSL Application Development
- Level: 1
The Formula Editor got a facelift for JMP 13. While this new design is intuitive and flows well, there are many new enhancements to explore. This paper will start by showing off the new Formula Editor layout with its filterable function list, standard interactive column list, formula parameter/constant list and new larger workspace. Display updates include readability features for configurable matrix display, associative array display, automatic reformat of displayed formula depending on window size, and mini- and full-screen script editors. Using interactive examples, we will show more new features, including multiple column selection, drag/drop into formulas, context clicking replace and undo/redo functionality. We will share some lesser-known tricks such as alt-click formula replacement, direct editing of local variables and typing directly into formulas. We will also explore the new custom formats and show how these propagate to data tables and graphs.
-
Using JMP® 13 to Compare Designed Experiments
Bradley Jones, PhD, JMP Principal Research Fellow, SAS
- Topic: Design of Experiments
- Level: 3
When considering an experimental study, it is desirable for the design team to compare the properties of two or three designed experiments before choosing one to actually run. It is often useful to compare designs having the same number of runs but generated using different criteria. It is also meaningful to compare designs having different numbers of runs to evaluate whether the extra runs are worth the extra cost. This talk provides examples of both cases with advice about how to use this new feature to best advantage.
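As background on one criterion that design comparisons typically involve, here is a hedged numpy sketch of D-efficiency for two candidate main-effects designs; JMP's design comparison feature reports considerably more than this single number.

```python
# A hedged numpy sketch of one design comparison criterion (D-efficiency);
# JMP's comparison feature reports much more than this single number.
import numpy as np

def d_efficiency(X):
    """100 * |X'X|^(1/p) / n for a model matrix X (n runs, p terms)."""
    n, p = X.shape
    return 100.0 * np.linalg.det(X.T @ X) ** (1.0 / p) / n

# Two candidate 2-factor main-effects designs (columns: intercept, A, B):
full_factorial = np.array([[1, -1, -1], [1, -1, 1],
                           [1, 1, -1], [1, 1, 1]], dtype=float)
one_factor_at_a_time = np.array([[1, -1, -1], [1, 1, -1],
                                 [1, -1, 1], [1, -1, -1]], dtype=float)

print(d_efficiency(full_factorial))          # 100.0: orthogonal and balanced
print(d_efficiency(one_factor_at_a_time))    # lower
```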
-
Visually Exploring Design of Experiments Models With the Generalized Regression Platform
Chris Gotwalt, PhD, JMP Director of Statistical Research and Development, SAS
Clay Barker, PhD, JMP Senior Research Statistician Developer, SAS
- Topic: Design of Experiments
- Level: 2
The Generalized Regression platform (GenReg) has evolved into a world-class framework for the analysis of designed experiments. In this presentation, we use simple case studies to demonstrate how easy it is to use GenReg's powerful new automated model selection capabilities, such as the double lasso and two-stage forward selection. Most importantly, we will show how GenReg's interactive diagnostic plots demystify the analysis of designed experiments. The case studies will include supersaturated designs, definitive screening designs, and designed experiments for binomial responses. As part of the presentation, we will also introduce the new Simulate capability in JMP Pro 13 as a way to evaluate the statistical power of the model selection capabilities in GenReg. Finally, we will show how to use GenReg and Simulate together to obtain power calculations for non-standard analyses, such as logistic regression for designed experiments with binary outcomes.
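A hedged sketch of the simulation-based power idea in plain Python (not JMP Pro's Simulate feature): repeatedly generate data from a small two-level experiment with one truly active factor and count how often lasso selection recovers it. Sample sizes, effect size and the random design are illustrative assumptions.

```python
# A hedged sketch of simulation-based power (not JMP Pro's Simulate itself):
# how often does lasso selection recover a truly active factor?
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(10)
n_sim, n, p, beta1 = 200, 20, 8, 1.0      # small screening-sized experiment

recovered = 0
for _ in range(n_sim):
    X = rng.choice([-1.0, 1.0], size=(n, p))      # random two-level design
    y = beta1 * X[:, 0] + rng.normal(0, 1, n)     # only factor 1 is active
    sel = LassoCV(cv=5).fit(X, y)
    recovered += sel.coef_[0] != 0
print(f"Estimated power for the active factor: {recovered / n_sim:.0%}")
```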
-
World State of Quality (WSQ): Multivariate Statistical Contributions
Marco S. Reis, PhD, Professor, Department of Chemical Engineering, University of Coimbra
Catarina Cubo, Researcher, University of Minho
Paulo Sampaio, Professor of Quality and Organizational Excellence, University of Minho
João d’ Orey, Executive Director, National Observatory of Human Resources
Pedro M. Saraiva, Full Professor, University of Coimbra
- Topic: Quality and Reliability
- Level: 3
In a joint project involving two Portuguese universities (University of Minho and University of Coimbra), a first attempt is being made to establish an international quality-oriented ranking of the performance achieved by different countries, with a pilot conducted so far across the European Union (EU) countries. An extensive data collection was therefore performed across a variety of sources for the 28 EU countries, in order to assess their maturity on quality-related performance and characterize their different profiles in terms of contributions to the overall level of quality achieved. For that purpose, we considered 21 indicators of quality, partitioned according to 10 different dimensions (education, research, organizations, professionals, health, innovation and entrepreneurship, sustainability, satisfaction, social cohesion, and competitiveness). This rich data set was analyzed using the multivariate graphical and data analytics capabilities provided by JMP. Statistical methodologies employed include PCA, PLS, linear regression with variable selection, clustering (including dendrograms and constellation plots) and parallel coordinates (used interactively, providing insights about the specificities of each country). In the end, the application of these methodologies contributed to a clear picture of the current state of quality in the EU, with plans to extend the work later on to other countries as well.
Level key:
- 1: Beginner
- 2: Intermediate
- 3: Advanced
- 4: Power user