Fostered by Industry 4.0, complex and massive data sets are currently available in many industrial settings, and manufacturing is facing a new renaissance due to the widespread adoption of emerging process technologies (e.g., additive manufacturing, micro-manufacturing) combined with a paradigm shift in sensing and computing.
On the one hand, the product quality is characterized by free-form complex...
The high production rate of modern industrial and chemical processes and the high cost of inspections make it unfeasible to label each data point with its quality characteristics. This is fostering the use of active learning for the development of soft sensors and predictive models. Instead of performing random inspections to obtain quality information, labels are collected by evaluating the...
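As a minimal, hedged illustration of such label-efficient sampling (the abstract's actual acquisition criterion is cut off above, so plain uncertainty sampling with a Gaussian process stands in for it; all data and variable names are hypothetical):

```python
# Minimal uncertainty-sampling sketch (illustrative only; the abstract's actual
# acquisition criterion is not specified here). Assumes a fitted probabilistic
# regressor that returns a predictive standard deviation.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
X_labelled = rng.uniform(0, 1, size=(10, 2))           # inspected (labelled) points
y_labelled = X_labelled.sum(axis=1) + rng.normal(0, 0.05, 10)
X_pool = rng.uniform(0, 1, size=(500, 2))               # unlabelled production data

model = GaussianProcessRegressor().fit(X_labelled, y_labelled)
_, std = model.predict(X_pool, return_std=True)

# Inspect next the point whose quality prediction is most uncertain
next_to_inspect = X_pool[np.argmax(std)]
print("candidate selected for inspection:", next_to_inspect)
```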
The statistical evaluation of process effectiveness via statistics such as capability or performance indices rests on strong assumptions such as normality, homogeneity or independence. It can be problematic to check these assumptions for automated, unsupervised data streams. Approaches such as probability models are applied to standard data as well as to data violating the assumptions. It has been shown...
The growing complexity of the shapes produced in modern manufacturing processes, Additive Manufacturing being the most striking example, constitutes an interesting and vastly unexplored challenge for Statistical Process Control: traditional quality control techniques, based on a few numerical descriptors or parsimonious parametric models, are not suitable for objects characterized by great...
After completing the experimental runs of a screening design, the responses under study are analyzed by statistical methods to detect the active effects. To increase the chances of correctly identifying these effects, a good analysis method should provide alternative interpretations of the data, reveal the aliasing present in the design, and search only meaningful sets of effects as defined by...
In analytical chemistry, high dimensionality problems in regression are generally solved using two types of dimension reduction techniques: projection methods, such as PLS, or variable selection algorithms, such as the lasso. Sparse PLS combines both approaches by adding a variable selection step to the PLS dimension reduction scheme. However, in most existing algorithms, interpretation of the...
Interval-valued data are often encountered in practice, namely when only upper and lower bounds on observations are available. As a simple example, consider a random sample $x_1, \dots, x_n$ from a distribution $\Phi$; the task is to estimate some of the characteristics of $\Phi$, such as moments or quantiles. Assume that the data $x_1, \dots, x_n$ are not observable; we have only bounds...
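A minimal sketch of how such interval bounds translate into bounds on monotone sample statistics, assuming for illustration that each observation is only recorded up to integer bounds (the data and rounding scheme are invented, not from the talk):

```python
# Sketch: bounding the sample mean and a quantile when only interval bounds
# [lo_i, up_i] on each unobservable x_i are available (illustrative example).
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(10, 2, size=50)          # true but unobservable sample
lo, up = np.floor(x), np.ceil(x)        # e.g. values recorded only to integer bounds

# Any statistic that is monotone in each observation, evaluated at the bounds,
# brackets the true value.
mean_bounds = (lo.mean(), up.mean())
median_bounds = (np.median(lo), np.median(up))
print("sample mean lies in", mean_bounds)
print("sample median lies in", median_bounds)
```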
Rail transport demand in Europe has increased over the last few years, as has the comfort level of passengers, which has been playing a key role in the fierce competition among transportation companies. In particular, passenger thermal comfort is also in the spotlight of recent European regulations, which urge railway companies to install data acquisition systems to continuously...
The most widely used methods for online change detection have been developed within the Statistical Process Control framework. These methods are typically used for controlling the quality during a manufacturing process. In general, the problem concerns detecting whether or not a change has occurred and identifying the times of any such changes. In the last decade, some new approaches based on...
This work will focus on experiments with several variables (responses) to be observed over time. The observations will be taken on different experimental units that may have distinct characteristics, and they may be correlated in several ways, namely intra-correlation between different responses on the same subject at the same time, and inter-correlation between observations of the same...
ENBIS Live is back: the session in which two open problems are discussed by the audience.
There are two ways of participating: proposing and helping. You can propose a project (must be open, you will have 7 minutes to present what it is about and after that the audience will ask questions and give suggestions) and you can help with another project by asking good questions and giving...
The improvement of surgical quality and the corresponding early detection of its changes is of increasing importance. To this end, sequential monitoring procedures such as the risk-adjusted CUSUM chart are frequently applied. The patient risk score population (patient mix), which considers the patients’ perioperative risk, is a core component for this type of quality control chart....
Multi-group data have N observations partitioned into m groups sharing the same set of P variables. This type of data is commonly found in industrial applications where production takes place in groups or layers, so the observations can be linked to the specific groups of products, creating a multiple-group structure in the data. The commonly used methodological solution for modelling such...
One of the most important product test machines (ATOS) is investigated in this global Autoliv project, with the goal of introducing an automated alarm system for product test data and a root cause analysis. We wanted a flexible automated software solution, which can transfer data into a SQL database and perform a root cause analysis. Furthermore, we wanted to send web-based links of reports to...
The goal of this study is to design an experiment to detect a specific kind of heteroscedasticity in a non-linear regression model, i.e.
$$
y_i=\eta(x_i;\beta)+\varepsilon_i,\; \varepsilon_i\sim N(0;\sigma^2 h(x_i;\gamma)),\quad i=1,\ldots,n,
$$
where $\eta(x_i;\beta)$ is a non-linear mean function, depending on a vector of regression coefficients $\beta\in \mathbb{R}^m$, and $\sigma^2...
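A small simulation sketch of this model, with assumed illustrative choices $\eta(x;\beta)=\beta_1(1-e^{-\beta_2 x})$ and $h(x;\gamma)=e^{\gamma x}$, so that $\gamma=0$ recovers homoscedasticity (all parameter values are hypothetical):

```python
# Sketch of the model above with illustrative (assumed) choices
# eta(x; beta) = beta1 * (1 - exp(-beta2 * x)) and h(x; gamma) = exp(gamma * x);
# under gamma = 0 the errors are homoscedastic.
import numpy as np

rng = np.random.default_rng(2)
n, sigma2 = 30, 0.04
beta1, beta2, gamma = 2.0, 1.5, 0.8
x = np.linspace(0.1, 2.0, n)

eta = beta1 * (1 - np.exp(-beta2 * x))            # non-linear mean function
h = np.exp(gamma * x)                             # variance function
y = eta + rng.normal(0.0, np.sqrt(sigma2 * h))    # heteroscedastic responses
```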
Gas chromatography (GC) plays an essential role in manufacturing daily operations for quality and process control, troubleshooting, research and development. The reliable operation of chromatography equipment ensures accurate quantitative results and effective decision-making. In many quality control and analytical labs, the operational procedure for GC analysis requires the chromatogram to be...
From the perspective of anomaly detection when data are functional and spatially dependent, we explore death counts from all causes observed throughout 2020 in the provinces and municipalities of Italy. Our aim is to isolate the spatio-temporal perturbation brought by COVID-19 to the country's expected mortality process during the first two waves of the pandemic.
Within the framework of...
As data collection systems grow in size, multivariate Statistical Process Monitoring (SPM) methods begin to experience difficulties in detecting localized faults, the occurrence of which is masked by the background noise of the process associated with the many sources of unstructured variability. Moreover, these methods are primarily non-causal and do not consider or take advantage of the...
When the run size of an experiment is a multiple of four, D-optimal designs for a main effects model can be obtained by dropping the appropriate number of factor columns from a Hadamard matrix. Alternatively, one can use a two-level orthogonal array. It is well known that not all orthogonal arrays are equally good, and this has led to a rich literature on the choice of the best orthogonal arrays...
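A minimal sketch of the Hadamard-based construction mentioned above, using scipy's Sylvester-type Hadamard matrices (so the run size here is a power of two; the factor count and run size are purely illustrative):

```python
# Sketch: a main-effects design in k factors obtained by dropping columns of a
# normalized Hadamard matrix (scipy only builds Sylvester-type matrices, so the
# run size N is a power of two in this toy example).
import numpy as np
from scipy.linalg import hadamard

N, k = 16, 6
H = hadamard(N)                 # first column is the all +1 intercept column
design = H[:, 1:k + 1]          # keep k factor columns, drop the rest

# For a main-effects model the information matrix X'X is diagonal (orthogonal
# columns), which is what makes the design D-optimal among N-run two-level designs.
X = np.column_stack([np.ones(N), design])
print(np.linalg.det(X.T @ X))   # equals N**(k+1) for an orthogonal design
```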
Classical optimality criteria for the allocation problem of experimental designs usually focus on the minimization of the variance of estimators.
Optimal designs for parameter estimation somehow minimize the variance of the parameter estimates. Some criteria just use the variances (A-optimality, E-optimality) whereas other criteria also implicitly consider the covariances of the parameter...
New data acquisition technologies facilitate the acquisition of data that may be described as functional data. The detection of significant changes in group functional means determined by shifting experimental settings, which is known as functional analysis of variance (FANOVA), is of great interest in many applications. When working with real data, it is typical to find outliers in the...
When comparing a medical treatment with placebo it is usual to apply a two-sample t-test. $n_1$ patients are given the treatment and $n_2$ patients are given placebo. The standard assumptions for using a two-sample t-test are assumed. It is also assumed that large response values in the treatment group are desirable. Usually $H_0$: “The distribution means are equal” is tested against $H_1$: “The distribution...
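A minimal sketch of the standard two-sample t-test in this setting, with a one-sided alternative reflecting that larger treatment responses are desirable (the abstract's exact alternative is truncated, and all numbers are simulated for illustration):

```python
# Minimal two-sample t-test sketch under the standard assumptions mentioned
# above (illustrative data; n1 treated patients, n2 on placebo).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n1, n2 = 25, 25
treatment = rng.normal(loc=5.5, scale=1.0, size=n1)
placebo = rng.normal(loc=5.0, scale=1.0, size=n2)

# One-sided alternative: treatment mean exceeds placebo mean
t_stat, p_value = stats.ttest_ind(treatment, placebo, alternative="greater")
print(f"t = {t_stat:.2f}, one-sided p = {p_value:.3f}")
```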
In health registries, like cancer registries, patient outcomes are registered over time. It is then often of interest to monitor whether the distribution of the time to an event of interest changes over time – for instance if the survival time of cancer patients changes over time. A common challenge in monitoring survival times based on registry data is that time to death, but not cause of...
Over the last decade, 3-dimensional in vitro cell-based systems, like organs-on-chips, have gained in popularity in the pharmaceutical industry because they are physiologically relevant and easily scalable for high-throughput measurements. We wish to detect influential steps in a cell-based bio-assay using the OrganoPlate® platform developed by the company Mimetas BV. The cells are to form...
A unit circulating in a business process is characterized by a unique identifier, a sequence of activities, and timestamps to record the time and date at which said activities have started. This triplet constitutes an individual journey. The aim of predictive Business Process Monitoring is to predict the next or remaining activities of an ongoing journey, and/or its remaining time, be it until...
Pocket detection is a key step inside the process of drug design and development. Its purpose is to prioritize specific areas of the protein surface with high chance of being binding sites. The primary byproduct of this is to avoid blind docking. During a blind docking, the software tries to fit the ligand into the target protein without prior knowledge, thus it scans the whole protein surface...
Statistical Process Control, introduced by Walter Shewhart a hundred years ago, has always been a tough topic for organizations to adopt, apply and use completely and correctly.
In research studies we conducted in the last three decades, we found that very few organizations that tried to apply SPC succeeded; most either failed or dropped it within two years. But in the recent decade, SPC has undergone a...
This presentation will briefly discuss the concepts, methodologies, and applications of In-Process Quality Improvement (IPQI) in complex manufacturing systems. As opposed to traditional quality control concepts that emphasize process change detection, acceptance sampling, and offline designed experiments, IPQI focuses on integrating data science and system theory, taking full advantage of...
In my presentation I will briefly describe how throughout my career I have always challenged the system; always questioning why executives and managers do what they do, by looking at their processes from a perspective of Statistical Thinking and System Thinking. Remember it is executives and managers who are responsible for developing the systems and processes that their organization...
In reliability analysis, methods used to estimate failure probability are often limited by the costs associated with model evaluations. Many of these methods, such as multifidelity importance sampling (MFIS), rely upon a computationally efficient, surrogate model like a Gaussian process (GP) to quickly generate predictions. The quality of the GP fit, particularly in the vicinity of the failure...
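As a simplified, hedged sketch of the surrogate idea (plain Monte Carlo on a Gaussian process fit rather than the full MFIS scheme; the limit-state function, failure definition g(x) < 0, and sample sizes below are invented for illustration):

```python
# Simplified sketch: estimating a failure probability with a Gaussian process
# surrogate and plain Monte Carlo (not the full multifidelity importance
# sampling scheme; all settings are assumptions for illustration).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def g(x):                                   # expensive limit-state function (toy)
    return 3.0 - x[:, 0] ** 2 - 0.5 * x[:, 1]

rng = np.random.default_rng(4)
X_train = rng.normal(size=(40, 2))          # small design of expensive runs
gp = GaussianProcessRegressor().fit(X_train, g(X_train))

X_mc = rng.normal(size=(100_000, 2))        # cheap surrogate evaluations
pf_hat = np.mean(gp.predict(X_mc) < 0.0)    # failure when predicted g < 0
print("estimated failure probability:", pf_hat)
```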
Self-starting control charts have been proposed as alternative methods for testing process stability when the in-control (IC) process parameters are unknown and their prospective monitoring has to start with few initial observations. Self-starting schemes use consecutive observations to simultaneously update the parameter estimates and check for out-of-control (OC) conditions. Although such...
Re-identification is a deep-learning-based method, defined as the process of not only detecting but also identifying a previously recorded subject over a network of cameras. During this process, the subject in question is assigned a unique descriptor, used to compare the current subject with previously recorded ones, stored in a database. Due to the use of a unique descriptor instead of a class,...
Detailed information on the occurrence and duration of human activities is crucial to enhance the efficiency of manual processes. Thus, methods of sensor-based human activity recognition (HAR) gain relevance. Training a classifier for this task demands a large amount of data, as human movements are highly variable and diverse, in particular in the diverse environments of industrial...
With the advent of ‘Big Data’, massive data sets are becoming increasingly prevalent. Several subdata selection methods have been proposed in the last few years, both to reduce the computational burden and to improve cost effectiveness and learning of the phenomenon. Some of these proposals (Drovandi et al., 2017; Wang et al., 2019; Deldossi and Tommasi, 2021, among others) are inspired by Optimal...
Polymerase Chain Reaction (PCR) diagnostic tests for the SARS-CoV-2 virus (COVID) have been commonplace during the global pandemic. PCR tests involve genomic sequencing of saliva samples. Genomic sequencing allows scientists to identify the presence and evolution of COVID. When a sample is run through a sequencer, the sequencer will make a read on each genomic base pair and the number of times...
The detection of anomalous radioxenon atmospheric concentrations is an important activity, carried out by the International Data Centre (IDC) of the Comprehensive Nuclear Test-Ban Treaty Organization (CTBTO), for revealing both underground nuclear explosions and radioactive emissions from nuclear power plants or medical isotope production facilities. The radioxenon data are validated by IDC in...
Although confirmatory modeling has dominated much of applied research in medical, business, and behavioral sciences, modeling large data sets with the goal of accurate prediction has become more widely accepted. The current practice for fitting and evaluating predictive models is guided by heuristic-based modeling frameworks that lead researchers to make a series of often isolated decisions...
The availability of real-time data from processes and systems has shifted the focus of maintenance from preventive to condition-based and predictive maintenance. There is a very wide variety of maintenance policies depending on the system type, the available data and the policy selection method. Recently, reinforcement learning has been suggested as an approach to maintenance planning. We...
To overcome the frequently debated “reproducibility crisis” in science, replicating studies is becoming increasingly common across a variety of disciplines such as psychology, economics and medicine. Their aim is to assess whether the original study is statistically consistent with the replications, and to assess the evidence for the presence of an effect of interest. While the majority of...
In the production of pelleted catalysts products, it is critically important to control the physical properties of the pellets, such as their shape, density, porosity and hardness. Maintaining these critical quality attributes (CQAs) within their in-specification boundaries requires the manufacturing process to be robust to process disturbances and to have good knowledge of the relationships...
Physical experimentation for complex engineering and technological processes can be too costly or, in certain cases, impossible to perform. Thus, computer experiments are increasingly used in such contexts. Specific surrogate models, which are statistical interpolators of the simulated input-output data, are adopted for the analysis of computer experiments. Among such surrogate models, a...
The basic available control charts for attributes are based on either the binomial or the Poisson distribution (p-chart and u-chart) with the assumption of a constant in-control parameter for the mean. The corresponding classical control limits are then determined by the expected sampling variation only. If common cause variability is present between subgroups, these control limits could be...
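For reference, a minimal sketch of the classical p-chart limits driven by binomial sampling variation only, i.e. ignoring any between-subgroup variability (counts and subgroup size are toy values):

```python
# Sketch: classical p-chart limits from binomial sampling variation only,
# i.e. without any between-subgroup (common cause) variability (toy data).
import numpy as np

defectives = np.array([4, 6, 5, 8, 3, 7, 5, 6, 4, 9])   # per subgroup
n = 200                                                  # subgroup size
p = defectives / n
p_bar = p.mean()

sigma_p = np.sqrt(p_bar * (1 - p_bar) / n)
ucl, lcl = p_bar + 3 * sigma_p, max(p_bar - 3 * sigma_p, 0.0)
print(f"centre = {p_bar:.4f}, LCL = {lcl:.4f}, UCL = {ucl:.4f}")
```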
In this paper, a multi-component series system is considered. The system is periodically inspected, and at inspection times the failed components are replaced by new ones. Therefore, this maintenance action is perfect corrective maintenance for the failed components, and it can be considered as imperfect corrective maintenance for the system. The inspection interval is considered as a...
The present work is done in collaboration with an industrial partner that manufactures plastic products via an injection moulding process. Different plastic products can be produced by using metal moulds in the injection moulding machine. The moulds are complex and built utilizing various parts. High quality of mould parts is crucial for ensuring that the plastic products are produced within...
This talk focuses on one aspect of DOE practice. When applying DOE, we always seek to save time, money and resources, to enable further experimentation. After asking the right questions, we often encounter an opportunity to obtain some form of partial, interim results before a full experiment is run and complete results become available. How can we exploit this opportunity? How can we take it...
The reliability of repairable systems is commonly analyzed with the use of simple Poisson processes. Using data for the operation of wind turbines as motivation and illustration, we show, step by step, how certain extensions of such a model can increase its usefulness as both a realistic and easily interpretable mathematical model. In particular, standard regression modeling may for example account...
Modelling human self-reported longitudinal health data is a challenge: data accuracy, missingness (at random or not), between- and within-subject variability, correlation, … pose challenges even in the framework of modelling “just” for hypothesis generation.
In this talk I will share my experience on modelling (for the purpose of describing) peak pain migraine-attack severity in...
The progress of technology and market demand led chemical process industries to abandon stationary production in favour of more flexible operation models that are able to respond to rapid changes in market demand (Zhang et al. 2021). Therefore, being able to move production from a source product grade A to a target product grade B with minimal effort and cost is highly desirable. Since the new...
In the aerospace industry, high reliability and safety standards must be ensured in order to eliminate hazards, where possible, and minimize risks where those hazards cannot be eliminated. Special attention is also paid to aircraft availability, a measure of the percentage of time aircraft can be flown on training or missions, and flying hours per aircraft per year, since this metric is usually...
Solving inverse problems with the Bayesian paradigm relies on a sensible choice of the prior. Elicitation of expert knowledge and formulation of physical constraints in a probabilistic sense is often challenging. Recently, the advances made in machine learning and statistical generative models have been used to develop novel approaches to Bayesian inference relying on data-driven and highly...
The high level of automation, the process miniaturization, the multiple consecutive operation steps, and the permanent entrant flows make semiconductor manufacturing one of the most complex industrial processes. In this context, the development of a Run-to-Run (R2R) controller that automatically adjusts recipe parameters to compensate for process variations becomes a top priority.
Since...
The development of a complex technical system can usually be described along a V-process (cf., e.g., Forsberg and Mooz (1994)). It starts with the identification of the system requirements and allocates them top-down to subsystems and components. Verification activities start, where possible, on the component level and should be integrated bottom-up in the subsystem and system...
We consider balanced one-way, two-way, and three-way ANOVA models to test the hypothesis that the fixed factor A has no effect. The other factors are fixed or random. We determine the noncentrality parameter for the exact F-test, describe its minimal value by a sharp lower bound, and thus we can guarantee the worst-case power for the F-test. These results allow us to compute the minimal sample...
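A minimal sketch of the power computation this enables, with an illustrative noncentrality value standing in for the sharp lower bound derived in the talk (all numbers are assumptions):

```python
# Sketch: power of the exact F-test for factor A in a balanced one-way layout,
# evaluated at an assumed noncentrality parameter (lam is illustrative; the
# talk's sharp lower bound would be substituted here for worst-case power).
from scipy import stats

a, n_per_group, alpha = 4, 10, 0.05          # levels of A, replicates, level
df1, df2 = a - 1, a * (n_per_group - 1)
lam = 8.0                                    # assumed noncentrality parameter

f_crit = stats.f.ppf(1 - alpha, df1, df2)
power = stats.ncf.sf(f_crit, df1, df2, lam)  # P(noncentral F > critical value)
print(f"power = {power:.3f}")
```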
In semiconductor industry, Statistical Process Control (SPC) is a mandatory methodology to keep a high production quality. It has two main objectives: the detection of out-of-controls and the identification of potential root causes in order to correct them. Contrary to the first objective which is generally well covered by the different techniques already developed, the root cause analysis is...
The online quality monitoring of a process with low volume data is a very challenging task, and the attention is most often placed on detecting when some of the underlying (unknown) process parameter(s) experience a persistent shift. Self-starting methods, both in the frequentist and the Bayesian domain, aim to offer a solution. Adopting the latter perspective, we propose a general closed-form...
Many industries produce products that are exposed to varying climate conditions. To ensure adequate robustness to climate, variation in the relevant features of climate must be quantified, and the design space of interest must be defined. This is challenging due to the complex structure of climate data, which contains many sources of variation, including geography, daily/seasonal/yearly time...
In many industrial applications, the goal is to predict (possibly in real-time) some target property based on a set of measured process variables. Process data always need some sort of preprocessing and restructuring before modelling. In continuous processes, an important step in the pipeline is to adjust for the time delay between target and input variables.
Time delay can generally be...
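A minimal sketch of one common way to estimate such a delay, maximising the lagged cross-correlation between an input variable and the target (simulated data with a known 7-sample delay; real pipelines may use other criteria):

```python
# Sketch: estimating a constant time delay between an input variable and the
# target by maximising the lagged cross-correlation (illustrative data).
import numpy as np

rng = np.random.default_rng(5)
n, true_delay = 2000, 7
u = rng.normal(size=n)                                   # measured process variable
y = np.roll(u, true_delay) + 0.3 * rng.normal(size=n)    # delayed, noisy target

max_lag = 30
lags = np.arange(0, max_lag + 1)
corr = [np.corrcoef(u[:n - k], y[k:])[0, 1] for k in lags]
print("estimated delay:", lags[int(np.argmax(corr))])
```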
Car detailing is a tough job. Transforming a car from a muddy, rusty, full of pet fur box-on-wheels into a like-new clean and shiny ride takes a lot of time, specialized products and a skilled detailer.
But…what does the customer really appreciate on such a detailed car cleaning and restoring job? Are shiny rims most important for satisfaction? Interior smell? A shiny waxed hood? It is...
Advancing the spread and practice of statistics enhances an organization's ability to successfully achieve its mission. While there are well-known examples of corporate leadership mandates to employ statistical concepts, more often the spread of statistical concepts flourishes more effectively through the practice of statistical influence. At first glance, the term influence may seem to...
In recent years, active subspace methods (ASMs) have become a popular means of performing subspace sensitivity analysis on black-box functions. Naively applied, however, ASMs require gradient evaluations of the target function. In the event of noisy, expensive, or stochastic simulators, evaluating gradients via finite differencing may be infeasible. In such cases, often a surrogate model is...
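A minimal sketch of the gradient-based construction that ASMs rely on, here with an analytic toy gradient standing in for the surrogate-based gradients discussed in the talk (function and dimensions are hypothetical):

```python
# Sketch: estimating an active subspace from gradient samples via an
# eigendecomposition of the average outer product of gradients.
import numpy as np

def grad_f(x):                               # toy function f(x) = sin(a @ x)
    a = np.array([1.0, 0.5, 0.0, 0.0])
    return np.cos(a @ x) * a

rng = np.random.default_rng(6)
X = rng.uniform(-1, 1, size=(1000, 4))
G = np.array([grad_f(x) for x in X])

C = G.T @ G / len(G)                         # empirical gradient covariance
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
print("eigenvalues:", eigvals[order])        # sharp drop => low-dim. active subspace
print("leading direction:", eigvecs[:, order[0]])
```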
With the development of Industry 4.0, logistics systems will increasingly implement data-driven, automated decision-making processes. In this context, the quality of forecasts with multiple time-dependent factors is of particular importance.
In this talk, we compare time series and machine learning algorithms in terms of out-of-the-box forecasting performance on a broad set of simulated...
Unbiased assessment of the predictivity of models learnt by supervised machine-learning methods requires knowledge of the learnt function over a reserved test set (not used by the learning algorithm). Indeed, some industrial contexts require the model predictivity to be estimated on a test set strictly disjoint from the learning set, which excludes cross-validation techniques. The quality of...
Multiblock data analysis has become a standard tool for analysis of data from several sources, be it linking of omics data, characterisation or prediction using various spectroscopies, or applications in sensory analyses. I will present some basics concerning possibilities and choices in multiblock data analysis, introduce some of the standard methods, show some examples of usage and refer to...
Industry 4.0, along with efforts in digitalization, has brought forth many challenges but also opportunities in data analytics applications in manufacturing. Many of the conventional methods are in need of extension, as they often fall short in accommodating the characteristics of modern production data, while there has been an increasing influx of new data analytics methods from machine learning...
Logistics networks are complex systems due to drivers of change such as globalization and digitalization. In this context, decision makers in supply chain management are challenged with maintaining a logistics network in a good and competitive state. Because of the complexity, decision makers should be supported accordingly in answering a multitude of logistics tasks. A crucial factor to...
Traditional mid-term electricity forecasting models rely on calendar and meteorological information such as temperature and wind speed to achieve high performance. However, depending on such variables has drawbacks, as they may not be informative enough during extreme weather. While ubiquitous, textual sources of information are hardly included in prediction algorithms for time series, despite...
A Control Chart for signal detection in the Covid-19 pandemic
Bo Bergman, Svante Lifvergren, Emma Thonander Hallgren
The spread of the SARS-CoV-2 virus since late 2019 has been problematic to follow and has often surprised epidemiologists and statisticians in their efforts to predict its future course.
Objective: Interventions such as recommended social distancing or other...
The development of electric vehicles is a major lever towards low-carbon transport. It comes with a growing number of charging infrastructures that can be used as flexible assets for the grid. To enable this smart-charging, an effective daily forecast of charging behaviours is necessary. The purpose of our work is to evaluate the performance of models for predicting load curves and charging...
Maintaining quality pavement is important for road safety. Further, effective pavement maintenance targeted at the right locations is important for maximising the socioeconomic benefits from the resources allocated to maintenance activities. A flexible pavement is multilayered, with asphalt concrete at the top, base and subbase courses followed by compacted soil subgrade. Several laboratory...
Third and final edition - this face-to-face session follows online sessions in 2020 and 2021
Are you interested in case studies and real-world problems for active learning of statistics?
Then come and join us in this one-hour interactive session organised by the SIG Statistics in Practice.
A famous project for students to apply the acquired knowledge of design of experiments is...
Prognostics and health management and calculation of residual useful life are important topics in automotive industry. In the context of autonomous cars, it is imperative to lower the residual risk to an acceptable level.
On the semiconductor level, various advanced statistical models are used to predict degradation on the basis of accelerated life time stress tests. The change of electrical...
In semiconductor manufacturing, Virtual Metrology (VM) refers to the task performed to predict post-process metrology variables based on machine settings and sensor data. To improve the efficiency of a VM system, the paradigm of transfer learning is used to leverage the knowledge extracted when exploiting a source domain of a source task, by applying it to a new task and/or new domain. The...
The ICH Q8 guideline [1] emphasized the Quality by Design (QbD) approach, according to which quality should be built into the product from its conception. A key component of the QbD paradigm is the definition of the Design Space (DS), defined as the multidimensional combination and interaction of input variables that have been demonstrated to provide assurance of quality. Besides, Rozet et...
In solar power plants, high reliability of critical assets must be ensured; these include inverters, which combine the power from all solar cell modules. While avoiding unexpected failures and downtimes, maintenance schedules aim to take advantage of the full equipment lifetimes. So-called predictive maintenance schedules trigger maintenance actions by modelling the current equipment...
Predictive monitoring techniques produce signals in case of a high probability of an undesirable event, such as machine failure, heart attacks, or mortality. When using these predicted probabilities to classify the unknown outcome, a decision threshold needs to be chosen in statistical and machine learning models. In many cases, this is set to 0.5 by default. However, in a high number of...
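A minimal sketch of choosing the threshold from a target false alarm rate rather than defaulting to 0.5 (predicted probabilities and outcomes are simulated for illustration; the talk's own criterion may differ):

```python
# Sketch: choosing a decision threshold so that the false alarm rate on
# non-event cases does not exceed a target value, instead of defaulting to 0.5.
import numpy as np

rng = np.random.default_rng(7)
y_true = rng.binomial(1, 0.1, size=5000)                 # rare undesirable event
p_hat = np.clip(0.1 + 0.5 * y_true + rng.normal(0, 0.15, 5000), 0, 1)

target_far = 0.01                                        # allowed false alarm rate
threshold = np.quantile(p_hat[y_true == 0], 1 - target_far)
alarms = p_hat >= threshold
print(f"threshold = {threshold:.3f}, "
      f"false alarm rate = {alarms[y_true == 0].mean():.4f}, "
      f"detection rate = {alarms[y_true == 1].mean():.3f}")
```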
Exploratory well-bore drilling is fundamental to future oil and gas supplies. It is also a highly financially risky investment. While a large literature exists estimating the relationship between oil prices and drilling activity, the mechanism behind this relationship clearly relates to decision making at the firm level and in turn the financial state of individual firms. However, there has...
When products are made-to-order, sales forecasts must rely on data from the sales process as a basis for estimation. Process mining can be used for constructing the sales process from event logs, but it also provides characteristics from the process itself that can be utilized in prediction algorithms to create the actual sales predictions. Based on literature, the encoding of process...
The Markovchart R package is able to minimise the cost of a process, using Markov-chain based $x$-charts under general assumptions (partial repair, random shift-size, missing samples). In this talk a further generalisation will be presented. Quite often the degradation can take different forms (e.g. if there is a chance for abrupt changes besides the "normal" wear), which might be modelled by...
Engineers often have to make decisions based on noisy data that have to be collected first (e.g., fine-tuning of a pilot plant). In this case, there is a vast range of situations about which data could be collected, but only time and money to explore a few. Efficient data collection (i.e. optimal experimental design and sampling plan) is an important skill, but there is typically little...
There are often constraints among the factors in experiments that are important not to violate, but are difficult to describe in mathematical form. In this presentation, we illustrate a simple workflow of creating a simulated dataset of candidate factor values. From there, we identify a physically realisable set of potential factor combinations that is supplied to the new Candidate Set...
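A minimal sketch of the first part of this workflow, with an invented constraint standing in for the hard-to-formalise physical restrictions (factor names, ranges, and the constraint itself are hypothetical):

```python
# Sketch of the workflow described above: simulate candidate factor settings,
# keep only physically realisable combinations via an (assumed) constraint, and
# pass the survivors to a candidate-set-based optimal design tool.
import numpy as np

rng = np.random.default_rng(9)
temp = rng.uniform(20, 120, 5000)            # candidate temperatures
time = rng.uniform(1, 60, 5000)              # candidate times
conc = rng.uniform(0.1, 2.0, 5000)           # candidate concentrations

# Example constraint that is awkward to state in closed form in a DOE dialog:
feasible = (temp * conc < 90) & ~((temp > 100) & (time > 30))
candidates = np.column_stack([temp, time, conc])[feasible]
print(candidates.shape[0], "physically realisable candidate runs")
```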
For linear errors-in-variables regression, various methods are available to estimate the parameters, e.g. least-squares, maximum likelihood, method of moments and Bayesian methods. In addition, several approaches exist to assign probability distributions, and herewith uncertainties, to such estimators.
Following the standard approach in metrology (the Guide to the expression of uncertainty...
The use of "Null hypothesis significance testing" and $p$-values in empirical work has come in for widespread criticism from many directions in recent years. Nearly all this commentary has, understandably, focused on research practice, and less attention has been devoted to how we should teach econometrics (my home discipline) and applied statistics generally. I suggest that it is possible to...
One of the quality characteristics of a technological process is the measured degree of variation between produced objects. The probability of a certain ordinal response of an object under test depends on its ability, given thresholds characterizing the specific test item. A class of models borrowed from item response theory was recently adapted to business and industry...
The IMR (individuals and moving range) control chart is a classical proposal in both SQC books and ISO standards for control charting single observations. Simple rules are given for setting the limits. However, it is more or less known that the MR limits are misplaced. There were some early accurate numerical Average Run Length (ARL) results by Crowder in 1987! However, they are not flawless. We...
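For context, a sketch of the simple textbook limits in question (the usual constants 2.66 and 3.267 on toy data); these are exactly the rules whose ARL behaviour the talk revisits:

```python
# Sketch of the classical IMR limits referred to above (constants 2.66 and
# 3.267 from standard SQC practice; data are toy values).
import numpy as np

x = np.array([10.2, 9.8, 10.5, 10.1, 9.7, 10.4, 10.0, 9.9, 10.3, 10.1])
mr = np.abs(np.diff(x))                      # moving ranges of consecutive points
x_bar, mr_bar = x.mean(), mr.mean()

i_limits = (x_bar - 2.66 * mr_bar, x_bar + 2.66 * mr_bar)   # individuals chart
mr_ucl = 3.267 * mr_bar                                      # moving range chart
print("I-chart limits:", i_limits, " MR UCL:", mr_ucl)
```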
When dealing with uncertainty quantification (UQ) in numerical simulation models, one of the most critical hypotheses is the choice of the probability distributions of the uncertain input variables which are propagated through the model. Bringing stringent justifications to these choices, especially in a safety study, requires quantifying the impact of potential uncertainty on the input...
When a Cusum signals an alarm, often it is not initially clear whether the alarm is true or false. We argue that in principle the observations leading to a signal may provide information on whether or not an alarm is true. The intuition behind this is that the evolution of a false alarm has a well-defined stochastic behavior, so if observations preceding the alarm were to exhibit a behavior...
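A minimal sketch of the upper one-sided CUSUM recursion underlying this idea; the observations accumulated just before the alarm are the ones the talk proposes to examine (the reference value and control limit below are illustrative choices):

```python
# Sketch: a standard upper one-sided CUSUM recursion on simulated data with a
# mean shift at t = 60 (reference value k and limit h are illustrative).
import numpy as np

rng = np.random.default_rng(8)
x = np.concatenate([rng.normal(0, 1, 60), rng.normal(1.0, 1, 40)])

k, h = 0.5, 4.0
c = 0.0
for t, xt in enumerate(x):
    c = max(0.0, c + xt - k)                 # CUSUM recursion
    if c > h:
        print(f"alarm at t = {t}, CUSUM = {c:.2f}")
        break
```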
The approach is based on participant-centered learning in a one-hour workshop. The class starts with a real problem for the students: estimating the true proportion of red beads in a box containing approximately 4000 beads (red and white). Using random sampling of 50 units, each student draws his/her own sample with a paddle twice, with the sample size doubled on the second draw. An MS-Excel...
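A minimal sketch of the accompanying computation, assuming illustrative counts for a first draw of 50 beads:

```python
# Sketch of the classroom computation: estimating the proportion of red beads
# and a normal-approximation confidence interval from a paddle sample
# (counts are made up for illustration).
import numpy as np

n, red = 50, 9                               # first paddle draw
p_hat = red / n
se = np.sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - 1.96 * se, p_hat + 1.96 * se)
print(f"estimate = {p_hat:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")

# Doubling the sample size on the second draw shrinks the standard error by a
# factor of sqrt(2), which is the point the exercise illustrates.
```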
Recently, low-rank sparse decomposition methods such as Smooth-Sparse Decomposition and Robust PCA have been widely applied in various applications for their ability to detect anomalies. The essence of these methods is to break down the signal into a low-rank mean and a set of sparse components that are mainly attributed to anomalies. In many applications, a simple decomposition of the signal...
Since the early works by Shewhart and Deming, manufacturing has mainly been controlled by adapting a design and its tolerances to the statistical variation of the process. In design development work there is a challenge in taking into account variation and uncertainty in connection with the geometrical outcome for products where fabrication is used, where bits and pieces are joined together by...
JMP 17 is a feature-packed new release with tremendous new capabilities for statisticians and data scientists of all levels of experience. In this presentation, we will demonstrate Easy DoE, a guided process for the design and analysis of experiments that offers a softer entry for new DoE practitioners. We will also cover new JMP Pro capabilities, including spectral analysis features such as wavelet...
Turning data into accurate decision support is one of the challenges in daily organizational life. There are several aspects of it related to variation in the interaction between technology, organization, and humans, where the normal managing and engineering methods based on an outside-in perspective of system development do not always work. Problems such as lack of common understanding...
High-dimensional data to describe various aspects of a patient's clinical condition have become increasingly abundant in the medical field across a variety of domains. For example, in neuroimaging applications, electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) can be collected simultaneously (i.e., EEG-fMRI) to provide high spatial and temporal resolution of a...
In many statistical process control applications data acquired from multiple sensors are provided as profiles, also known as functional data. Building control charts to quickly report shifts in the process parameters or to identify single anomalous observations is one of the key aims in these circumstances. In this work, the R package funcharts is introduced, which implements new methodologies...
Telemonitoring is the use of electronic devices such as smartphones to remotely monitor patients. It provides great convenience and enables timely medical decisions. To facilitate the decision making for each patient, a model is needed to translate the data collected by the patient’s smartphone into a predicted score for his/her disease severity. To train a robust predictive model,...
The process of digitalization is happening at a great pace and is driven by enabling technologies such as Internet of Things (IoT), cloud computing, simulation tools, big data analytics and artificial intelligence (AI). Altogether, these allow to create virtual copies of physical systems or even complete environments. The concept of Digital Twins (DTs) provides a framework for integrating...
Batch processes are widely used in industrial processes ensuring repeatable transition of raw materials into the desired products. Examples include chemical reactions and biological fermentation processes in chemical, pharmaceutical and other industries.
To optimize the performance and quality of end products various data analytical approaches can be considered depending on the available...
Sometimes measurements in different (industrial) sectors do not have a quantitative range, but a qualitative range consisting of a finite number of categories, which in turn exhibit a natural order. Several proposals have been made in the literature to monitor such data. In a comprehensive review, we present existing control charts that focus on independent and identically distributed ordinal...
In 2016, the ASA published their now famous statement on the use & misuse of p-values. In the same year, I started working with CL:AIRE (who represent the UK land contamination & remediation industry) to update their statistical guidance "Comparing soil contamination data with a critical concentration". CL:AIRE's older guidance used 1-way hypothesis testing that ran into many of the...
In this talk we reflect upon the ramifications of two decades of Lean Six Sigma implementations in Dutch healthcare institutions in the light of the current COVID-19 pandemic. We provide an evaluation of the impact that Lean Six Sigma implementations have had on the ability of Dutch healthcare institutions to respond adequately to healthcare needs during the COVID-19 crisis. An assessment of...
One of the challenges of the industrial revolution taking place today is the fact that engineers are increasingly faced with the need to deal with new types of data, which are significantly different from ordinary numerical data by virtue of their nature and the operations that can be performed with them (spectrograms, for example). Basic concepts related to processing of such data, e.g.: data...
Nowadays, in order to guarantee and preserve the high quality of their products, most manufacturing companies design monitoring schemes which allow abnormal events to be quickly, easily and efficiently recognised and their possible root causes to be correctly identified. Traditionally, these monitoring schemes are constructed by calibrating a so-called in-control model on data collected...
The problem of allocation of large portfolios requires modeling joint distributions, for which the copula machinery is most convenient. While currently copula-based settings are used for a few hundred variables, we explore and promote the possibility of employing dimension-reduction tools to handle the problem in ultra-high dimensions, up to thousands of variables that use up to 30 times...
Design of experiments (DOE) [1], the key tool in the Six Sigma methodology, provides causal empirical models that allow process understanding and optimization. However, in the Industry 4.0 era, it may be difficult to carry them out, if not unfeasible, due to the generally high number of potential factors involved, and the complex aliasing [2]. Nevertheless, nowadays, large amounts of...
Brake pads and braking systems are among the parts of the vehicle that are harder to innovate. The extreme temperatures and pressures and the presence of dust make them an inhospitable environment for sensors and electronics. Despite these difficulties, GALT. | an ITT company managed to develop SmartPad, an innovative technology that acquires data from the braking pads. It aims to elaborate...
I will discuss sample size calculations and treatment effect estimation for randomized clinical trials under a model where the responses from the treatment group follow a mixture distribution. The mixture distribution is aimed at capturing the reality that not all treated patients respond to the treatment. Both fixed sample trials and group sequential trials will be discussed. It will be...
Predictive process monitoring aims to produce early warnings of unwanted events. We consider the use of the machine learning method extreme gradient boosting as the forecasting model in predictive monitoring. A tuning algorithm is proposed as the signaling method to produce a required false alarm rate. We demonstrate the procedure using a unique data set on mental health in the Netherlands....
During the last decades, foreign ownership of domestic business has shown a strong increase at the global level and now represents an important share of the world economy. The economic literature widely debates the possible spillover effects of foreign ownership of business on employment. This paper examines this presence in Italy during a crucial period for the country (2002-2007), right after...
The application of design of experiments has undergone major changes in the last two decades. For instance, optimal experimental design has gained substantial popularity, and definitive screening designs have been added to experimenters’ toolboxes. In this keynote lecture, I will focus on some of the newest developments in the area of experimental design. More specifically, I will introduce a...
Within the PrimaVera (Predictive maintenance for Very effective asset management) project, we carried out a case study on monitoring procedures for failure detection of bearings in diesel engines of ocean-going patrol vessels. Monitoring is based on bearing temperature, since the two most important failure modes (abrasive wear and cavitation) cause an increase in these temperatures.
A...