Statistics came of age when manufacturing was king. But today's industries are focused on information technology. Remarkably, a lot of our expertise transfers directly. This talk will discuss statistics and AI in the context of computational advertising, autonomous vehicles, large language models, and process optimization.
Supervised learning under measurement constraints presents a challenge in the field of machine learning. In this scenario, while predictor observations are available, obtaining response observations is arduous or cost-prohibitive. Consequently, the optimal approach involves selecting a subset of predictor observations, acquiring the corresponding responses, and subsequently training a...
We propose a generalized linear model for distributed multimodal data, where each sample contains multiple data modalities, each collected by an instrument. Unlike centralized methods that require access to all samples, our approach assumes that samples are distributed across several sites and that pooling the data is not allowed due to data-sharing constraints. Our approach constructs a set...
In their simplest form, orthogonal arrays (OAs) are experimental designs where all level-combinations of any two factors occur equally often. As a result, the main effects of the factors are orthogonal to each other. There are also more involved OAs for which the level-combinations of any three factors occur equally often. In such OAs, the main effects are orthogonal to each other as well as...
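To make the defining property concrete, here is a minimal Python sketch (the toy array and checker are ours, for illustration only, not part of the talk): it verifies that a small two-level array has strength two, i.e., that every pair of columns contains each level combination equally often.

```python
from itertools import combinations
from collections import Counter

# A toy strength-2 orthogonal array: 4 runs, 3 two-level factors.
oa = [
    [0, 0, 0],
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]

def is_strength_2(array, levels=2):
    """True if every pair of columns contains all level combinations equally often."""
    n_cols = len(array[0])
    for i, j in combinations(range(n_cols), 2):
        counts = Counter((row[i], row[j]) for row in array)
        if len(counts) != levels ** 2 or len(set(counts.values())) != 1:
            return False
    return True

print(is_strength_2(oa))  # True: the main effects are orthogonal to each other
```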
SVM Regression Oblique Trees: A Novel Approach to Regression Tasks. This technique combines feature selection based on predictor correlation and a weighted support vector machine classifier with a linear kernel. Evaluation on simulated and real datasets reveals the superior performance of the proposed method compared to other oblique decision tree models, with the added advantage of enhanced...
Orthogonal minimally aliased response surface (OMARS) designs permit the screening of quantitative factors at three levels using an economical number of runs. In these designs, the main effects are orthogonal to each other and to the quadratic effects and two-factor interactions of the factors, and these second-order effects are never fully aliased. Complete catalogs of OMARS designs with up...
After a rich history in medicine, randomized controlled trials (RCTs), both simple and complex, are in increasing use in other areas, such as web-based A/B testing and the planning and design of decisions. A main objective of RCTs is to estimate parameters, and contrasts in particular, while guarding against bias from hidden confounders. After careful definitions of classical entities...
In the era of Industry 4.0, ensuring the quality of Printed Circuit Boards (PCBs) is essential for maintaining high product quality, reliability, and reducing manufacturing costs. Anomaly detection in PCB production lines plays a critical role in this process. However, imbalanced datasets and the complexities of diverse data types pose significant challenges. This study explores the impact of...
The focus is on the homogeneity test that evaluates whether two multivariate samples come from the same distribution. The problem arises naturally in various applications, and many methods are available in the literature. Based on data depth, several tests have been proposed for this problem, but they may not be very powerful. In light of the recent development of data depth as an important...
The use of a statistical classifier can be limited by its conditional misclassification rates (i.e., false positive rate and false negative rate) even when the overall misclassification rate is satisfactory. When one or both conditional misclassification rates are high, a neutral zone can be introduced to lower and possibly balance these rates. In this talk the need for neutral zones will be...
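As a rough illustration of the idea (a minimal sketch with hypothetical thresholds, not the procedure developed in the talk), a score-based classifier with a neutral zone simply withholds a call when the score is ambiguous:

```python
import numpy as np

def classify_with_neutral_zone(scores, t_low, t_high):
    """Assign class 0 below t_low, class 1 above t_high, and no call in between.
    Widening the neutral zone lowers (and can help balance) the false positive
    and false negative rates, at the price of more withheld decisions."""
    scores = np.asarray(scores, dtype=float)
    labels = np.full(scores.shape, "neutral", dtype=object)
    labels[scores <= t_low] = "class 0"
    labels[scores >= t_high] = "class 1"
    return labels

print(classify_with_neutral_zone([0.05, 0.40, 0.60, 0.95], t_low=0.3, t_high=0.7))
# ['class 0' 'neutral' 'neutral' 'class 1']
```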
Stratification on important variables is a common practice in clinical trials, since ensuring cosmetic balance on known baseline covariates is often deemed a crucial requirement for the credibility of the experimental results. However, the actual benefits of stratification are still debated in the literature. Some authors have shown that it does not improve efficiency in large samples...
In advanced manufacturing processes, high-dimensional (HD) streaming data (e.g., sequential images or videos) are commonly used to provide online measurements of product quality. Although there exist numerous research studies for monitoring and anomaly detection using HD streaming data, little research has been conducted on feedback control based on HD streaming data to improve product quality,...
Much has been written about augmenting preliminary designs for first-order regression with additional runs to support quadratic models. This is a reasonable approach to practical sequential experimentation, allowing an early stop if the preliminary first-order result does not look promising. Central composite designs are especially well-suited to this (Box and Wilson, 1951), as all or part of...
Dental practices are small businesses. Like any other business, they need cash flow management and financial planning to be viable, if not highly profitable. What a lot of practices may not realize is that they are sitting on a treasure trove of data that can be used in more ways than plain accounting and financial forecasting. Here we focus on longitudinal data, such as the timing of each patient’s...
Drought is a major natural hazard which can cause severe consequences on agricultural production, the environment, the economy and social stability. Consequently, increasing attention has been paid to drought monitoring methods that can assist governments in implementing preparedness plans and mitigation measures to reduce the economic, environmental, and social impacts of drought. The...
In recent decades, machine learning and industrial statistics have moved closer to each other. CQM, a consultancy company, performs projects in supply chains, logistics, and industrial R&D that often involve building prediction models using techniques from machine learning. For these models, challenges persist, e.g. if the dataset is small, has a group structure, or is a time series. At the...
A tool for analysis of variation of qualitative (nominal) or semi-quantitative (ordinal) data obtained according to a cross-balanced design is developed based on one-way and two-way CATANOVA and ORDANOVA. The tool calculates the frequencies and relative frequencies of the variables, and creates the empirical distributions for the data. Then the tool evaluates the total data variation and its...
Gliomas are the most common form of primary brain tumors. Diffuse Low-Grade Gliomas (DLGG) are slow-growing tumors and are often asymptomatic during a long period. Eventually they progress to a higher grade, leading to the patient’s death. Treatments are surgery, chemotherapy and radiotherapy, with the aim of controlling tumor evolution. Neuro-oncologists estimate the tumor size evolution by delineating tumor...
Conditional Average Treatment Effect (CATE) is widely studied in medical contexts. It is one tool used to analyze causality. In the banking sector, interest in causal methods is increasing. As an example, one may be interested in estimating the average effect of a financial crisis on credit risk, conditionally on macroeconomic as well as internal indicators. On the other hand, transfer...
Classification precision is particularly crucial in scenarios where the cost of erroneous output is high, e.g. medical diagnosis, search engine results, product quality control, etc. A statistical model for analyzing classification precision from collaborative studies will be presented. Classification (categorical measurement) means that the object’s property under study is presented by each...
Despite the success of machine learning in the past several years, there has been an ongoing debate regarding the superiority of machine learning algorithms over simpler methods, particularly when working with structured, tabular data. To highlight this issue, we outline a concrete example by revisiting a case study on predictive monitoring in an educational context. In their work, the authors...
In most discrete choice experiments (DCEs), respondents are asked to choose their preferred alternative. But it is also possible to ask them to indicate the worst, or the best and worst alternative among the provided alternatives or to rank all or part of the alternatives in decreasing preference. In all these situations, it is commonly assumed that respondents only have strict preferences...
In spreading processes such as opinion spread in a social network, interactions within groups often play a key role. For example, we can assume that three members of the same family have a higher chance of persuading a fourth member to change their opinion than three friends of the same person who do not know each other, and hence do not belong to the same community. Conversely, in a...
Structural Equation Models (SEMs) are primarily employed as a confirmatory approach to validate research theories. SEMs operate on the premise that a theoretical model, defined by structural relationships among unobserved constructs, can be tested against empirical data by comparing the observed covariance matrix with the implied covariance matrix derived from the model parameters....
Lightning is a chaotic atmospheric phenomenon that is incredibly challenging to forecast accurately and poses a significant threat to life and property. Complex numerical weather prediction models are often used to predict lightning occurrences but fail to provide adequate short-term forecasts, or nowcasts, due to their design and computational cost. In the past decade, researchers have...
Existing control charts for Poisson counts are tailor-made for detecting changes in the process mean as long as the Poisson assumption is not violated. But if the mean changes together with the distribution family, the performance of these charts may deviate considerably from the expected out-of-control behavior. In this research, omnibus control charts for Poisson counts are developed, which are...
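For context, the classical Shewhart c-chart for Poisson counts with known in-control mean $\lambda_0$ uses the limits below; charts of this kind are exactly the ones whose behavior degrades when the distribution family changes along with the mean:

\[
\mathrm{UCL} = \lambda_0 + 3\sqrt{\lambda_0}, \qquad \mathrm{CL} = \lambda_0, \qquad \mathrm{LCL} = \max\bigl(0,\ \lambda_0 - 3\sqrt{\lambda_0}\bigr).
\]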
The aim of pattern matching is to identify specific patterns in historical time series data in order to predict future values. Many pattern matching methods are non-parametric and based on finding nearest neighbors. This type of method is founded on the assumption that past patterns can repeat and provide information about future trends. Most of the methods proposed in the literature are...
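A minimal Python sketch of the nearest-neighbor idea (our own toy implementation, not one of the specific methods reviewed here): match the most recent window against all historical windows and average the continuations of the k closest ones.

```python
import numpy as np

def nearest_pattern_forecast(series, window, horizon, k=3):
    """Forecast the next `horizon` values by averaging what followed the k
    historical windows closest (in Euclidean distance) to the latest window."""
    series = np.asarray(series, dtype=float)
    query = series[-window:]
    dists, futures = [], []
    for start in range(len(series) - window - horizon + 1):
        dists.append(np.linalg.norm(series[start:start + window] - query))
        futures.append(series[start + window:start + window + horizon])
    nearest = np.argsort(dists)[:k]
    return np.mean([futures[i] for i in nearest], axis=0)

# Toy usage on a noisy periodic series
rng = np.random.default_rng(0)
y = np.sin(0.3 * np.arange(200)) + 0.1 * rng.standard_normal(200)
print(nearest_pattern_forecast(y, window=10, horizon=5))
```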
The shifted (or two-parameter) exponential distribution is a well-known model for lifetime data with a warranty period. Apart from that, it is useful in modelling survival data with some flexibility due to its two-parameter representation. Control charts for monitoring a process that is modeled by a shifted exponential distribution have been studied quite extensively in the recent...
The problem of measuring the size distribution of ultrafine (nano- and submicron-sized) particles is important for determining the physical and chemical properties of aerosols, including their toxicity. We give a quick review of some statistical methods used in the literature to solve this problem, for instance an EM algorithm for the reconstruction of particle size distributions from diffusion battery...
Multivariate Singular Spectrum Analysis (MSSA) is a nonparametric tool for time series analysis widely used across finance, healthcare, ecology, and engineering. Traditional MSSA depends on singular value decomposition that is highly susceptible to outliers. We introduce a robust version of MSSA, named Robust Diagonalwise Estimation of SSA (RODESSA), that is able to resist both cellwise and...
This study examines the relationship between foreign affiliates and labour productivity in the construction and manufacturing sectors. Labour productivity is calculated using the EUKLEMS & INTANProd database of the Luiss Lab of European Economics, while data on foreign affiliates abroad are taken from Eurostat. Using data from 19 EU countries between 2010 and 2019, we demonstrate...
Our previous contribution to ENBIS included an introduction of BAPC ('Before and After correction Parameter Comparison'), a framework for explainable AI time series forecasting, which has previously been applied to logistic regression. An initially non-interpretable predictive model (such as a neural network) is used to improve the forecast of a classical time series 'base model'. Explainability...
This article constructs a control chart for monitoring the ratio of two variances within a bivariate-distributed population. For an in-control process, we assume the two in-control variances and the covariance of the bivariate-distributed population are known. Monitoring the ratio of the two variances is then equivalent to monitoring a difference of the two variances, because the ratio equals its target value exactly when the corresponding weighted difference of the variances equals zero. An unbiased estimator of the difference between the two...
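The equivalence can be written out explicitly (generic symbols; $c$ denotes the target value of the ratio):

\[
\frac{\sigma_1^2}{\sigma_2^2} = c \quad\Longleftrightarrow\quad \sigma_1^2 - c\,\sigma_2^2 = 0,
\]

so a chart for the ratio at target $c$ can be based on an estimator of the weighted difference $\sigma_1^2 - c\,\sigma_2^2$.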
Causality is a fundamental concept in the scientific learning paradigm. For this purpose, deterministic models are always desirable, but they are often infeasible due to a lack of knowledge. In such cases, empirical models fitted on process data can be used instead. Moreover, the advent of Industry 4.0 and the growing popularity of the Big Data movement have caused a recent shift in process...
In the pharmaceutical industry, the use of statistics has been largely driven by clinical development, an area where frequentist statistics have been and remain dominant. This approach has led to numerous successes when considering the various effective treatments available to patients today.
However, over time, Null Hypothesis Significance Testing (NHST) and related Type-I error thinking...
Linear model trees are regression trees that incorporate linear models in the leaf nodes. This preserves the intuitive interpretation of decision trees and at the same time enables them to better capture linear relationships, which is hard for standard decision trees. But most existing methods for fitting linear model trees are time consuming and therefore not scalable to large data sets. In...
In modern manufacturing processes, one may encounter processes composed of two or more critical input blocks having an impact on the Y-space. If these blocks follow a sequential order, any cause of variation in a particular block may be propagated to subsequent blocks. This is frequently observed when a first block of raw material properties entering a production process influences the performance...
Hyperspectral imaging is an instrumental method that yields images where each pixel contains information in a specific range of the electromagnetic spectrum. Initially used for military and satellite applications, hyperspectral imaging has expanded to agriculture, pharmaceuticals, and the food industry. In recent decades, there has been an increasing focus on such analytical...
In today’s fast-paced industrial landscape, the need for faster and more cost-effective research and development cycles is paramount. As experiments grow increasingly complex, with more factors to optimize, tighter budgetary and time constraints, and limited resources, the challenges faced by industry professionals are more pressing than ever before.
Although the optimal design of...
This work formulates model selection as an infinite-armed bandit problem, namely, a problem in which a decision maker iteratively selects one of an infinite number of fixed choices (i.e., arms) when the properties of each choice are only partially known at the time of allocation and may become better understood over time, via the attainment of rewards.
Here, the arms are machine learning...
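As a rough sketch of the framing (an epsilon-greedy toy under our own simplifying assumptions, not the paper's allocation policy; `sample_new_arm` and `evaluate` are hypothetical stand-ins for drawing and scoring a model configuration):

```python
import random

def infinite_armed_search(sample_new_arm, evaluate, budget, explore_prob=0.3):
    """With probability explore_prob draw a brand-new arm (model configuration);
    otherwise re-evaluate the current best arm to refine its estimated reward."""
    arms = []  # each arm: [configuration, total_reward, n_pulls]
    for _ in range(budget):
        if not arms or random.random() < explore_prob:
            arms.append([sample_new_arm(), 0.0, 0])
            arm = arms[-1]
        else:
            arm = max(arms, key=lambda a: a[1] / a[2])
        arm[1] += evaluate(arm[0])
        arm[2] += 1
    return max(arms, key=lambda a: a[1] / a[2])[0]

# Hypothetical usage: arms are regularization strengths, reward is noisy accuracy.
best = infinite_armed_search(
    sample_new_arm=lambda: random.uniform(0.01, 10.0),
    evaluate=lambda alpha: 1.0 / (1.0 + abs(alpha - 1.0)) + random.gauss(0.0, 0.05),
    budget=200,
)
print(best)
```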
Online outlier detection in multivariate settings is a topic of high interest in several scientific fields, with Hotelling's T² control chart being probably the most widely used method in practice. The problem becomes challenging, though, when we lack the ability to perform a proper phase I calibration, as in short runs or in cases where online inference is requested from the...
With the routine collection of energy management data at the organisational level comes a growing interest in using data to identify opportunities to improve energy use. However, changing organisational priorities can result in data streams which are typically very messy, with missing periods, poor resolution, and structures that are challenging to contextualise. Using operational...
Two-level designs are widely used for screening experiments where the goal is to identify a few active factors which have major effects. We apply the model-robust Q_B criterion for the selection of optimal two-level designs without the usual requirements of level balance and pairwise orthogonality. We provide a coordinate exchange algorithm for the construction of Q_B-optimal designs for the...
Process Analytical Technologies have been the key technology for quality maintenance and improvement in the process industry. Quality is, however, only one indicator of process excellence: Safety, Cost, Delivery, Maintenance and specifically Environment are strongly complementary determinants of process value. The rising societal demands on the sustainability of the contemporary process industry have made...
Industrial experiments often have a budget which directly translates into an upper limit on the number of tests that can be performed. However, in situations where the cost of the experimental tests is unequal, there is no one-to-one relation between the budget and the number of tests. In this presentation, we propose a design construction method to generate optimal experimental designs for...
AdDownloader is a Python package for downloading advertisements and their media content from the Meta Online Ad Library. With a valid Meta developer access token, AdDownloader automates the process of downloading relevant ads data and storing it in a user-friendly format. Additionally, AdDownloader uses individual ad links from the downloaded data to access each ad's media content (i.e. images...
Industry 4.0 contexts generate large amounts of data holding potential value for advancing product quality and process performance. Current research already uses data-driven models to refine theoretical models, but integrating mechanistic understanding into data-driven models is still overlooked. This represents an opportunity to harness extensive data alongside fundamental principles.
We...
Squats are localized material failures of railway tracks which can lead to critical effects when not detected or removed in time. Investigations in recent years (cf., e.g., [1]–[4]) pointed out the severity of this problem, although relevant questions about root causes remain open. A main reason for this situation may be the challenging detectability of squat genesis as well as...
We discuss the problem of active learning in regression scenarios. In active learning, the goal is to provide criteria that the learning algorithm can employ to improve its performance by actively selecting data that are most informative.
Active learning is usually thought of as being a sequential process where the training set is augmented one data point at a time. Additionally, it is...
Process stability is usually defined via an iid assumption about the data. A violation of stability, however, requires some concrete model, such as a changepoint, a linear trend, outliers, distributional models, or positive or negative autocorrelation. These violations are often tested separately, and not all of the possible modes of instability can always be taken into account. We suggest a likelihood-based...
Different data difficulty factors (e.g., class imbalance, class overlap, the presence of outliers and noisy observations, and difficult border decisions) make classification tasks challenging in many practical applications and are hot topics in the domains of pattern recognition, machine learning and deep learning. Data complexity factors have been widely discussed in the specialized literature from...
Anomaly detection identifies cases that deviate from a common behavior or pattern in data streams. It is of great interest in a variety of fields, e.g., from biology recognizing uncommon observations in genetic data, to financial sectors identifying frauds through unusual economic activities. Detection of anomalies can be formulated as a binary classification problem, distinguishing between...
In various global regions, In Vitro Diagnostic Medical Devices (IVDs) must adhere to specific regulations in order to be marketed. To obtain approval from entities such as the U.S. Food and Drug Administration (FDA), the In Vitro Diagnostic Medical Devices Regulation (IVDR) in Europe, Health Canada, or Japanese regulatory bodies, manufacturers are required to submit Technical Documentation to...
Bioprinting is an innovative set of technologies derived from additive manufacturing, with significant applications in tissue engineering and regenerative medicine. The quality of printed constructs is commonly measured in terms of shape fidelity through a procedure known as printability assessment. However, the cost of experimental sampling and the complexity of various combinations of...
The concepts of null space (NS) and orthogonal space (OS) have been developed in independent contexts and with different purposes.
The former arises in the inversion of Partial Least Squares (PLS) regression models, as first proposed by Jaeckle & MacGregor [1], and represents a subspace in the latent space within which variations in the inputs do not affect the prediction of the outputs. The...
When analyzing sensor data, it is important to distinguish between environmental effects and actual defects of the structure. Ideally, sensor data behavior can be explained and predicted by environmental effects, for example via regression. However, this is not always the case, and explicit formulas are often unavailable. Then, comparing the behavior of environmental and sensor data can help to...
Forecasting is of the utmost importance to the integration of renewable energy into power systems and electricity markets. Wind power fluctuations at horizons of a few minutes ahead particularly affect the system balance and are most significant offshore. Therefore, we focus on short-term forecasting of offshore wind energy.
Since forecasts characterize but do not eliminate uncertainty,...
Our research addresses the industrial challenge of minimising production costs in an undiscounted, continuing, partially observable setting. We argue that existing state-of-the-art reinforcement learning algorithms are unsuitable for this context. We introduce Clipped Horizon Average Reward (CHAR), a method tailored for undiscounted optimisation. CHAR is an extension applicable to any...
An important axiom in innovation is “Fail early, fail often, but learn from the failures.” This talk discusses an academic-industrial statistical engineering project that initially had good prospects for success but ultimately provided virtually no benefit to the industrial partner although it did produce a nice dissertation for the PhD student assigned to the project. It is crucial to note...
In this presentation, we present a case study that results from a multi-stage project supported by NASA’s Engineering Safety Center (NESC) where the objective was to assess the safety of composite overwrapped pressure vessels (COPVs). The analytical team was tasked with devising a test plan to model stress rupture failure risk in carbon fiber strands that encase the COPVs with the goal of...
The aim of AI based on machine learning is to generalize information about individuals to an entire population. And yet...
- Can an AI leak information about its training data?
- Since the answer to the first question is yes, what kind of information can it leak?
- How can it be attacked to retrieve this information?
To emphasize AI vulnerability issues, Direction Générale de l’Armement...
In data-driven Structural Health Monitoring (SHM), a key challenge is the lack of availability of training data for developing algorithms which can detect, localise and classify the health state of an engineering asset. In many cases, it is additionally not possible to enumerate the number of operational or damage classes prior to operation, so the number of classes/states is unknown. This...
Manufacturing processes are systems composed of multiple stages that transform input materials into final products. Drawing inferences about the behavior of these systems for decision-making requires building statistical models that can define the flow from input to output. In the simplest scenario, we can model the entire process as a single-stage relationship from input to output. In the...
Structural Health Monitoring (SHM) is increasingly applied in civil engineering. One of its primary purposes is detecting and assessing changes in structure conditions to reduce potential maintenance downtime. Recent advancements, especially in sensor technology, facilitate data measurements, collection, and process automation, leading to large data streams. We propose a function-on-function...
In this presentation, we provide an overview of deep learning applications in electricity markets, focusing on several key areas of forecasting. First, we discuss state-of-the-art methods for forecasting electricity demand, including Generalised Additive Models (GAMs), which inspired the work that follows. Second, we look at multi-resolution forecasting, which uses data at high- and...
The International Statistical Engineering Association (ISEA) defines statistical engineering as "the discipline dedicated to the art and science of solving complex problems that require data and data analysis." Statistical Engineering emphasizes the importance of understanding the problem and its context before developing a problem-solving strategy. While this step may appear obvious, it is...
Multi-way data extend two-way matrices to a higher-dimensional tensor. In many fields, it is relevant to pursue the analysis of such data by keeping it in its initial form without unfolding it into a matrix. Often, multi-way data are explored by means of dimensional reduction techniques. Here, we study the Multilinear Principal Component Analysis (MPCA) model, which expresses the multi-way...
It is well-known that real data often contain outliers. The term outlier typically refers to a case, corresponding to a row of the $n \times d$ data matrix. More recently, cellwise outliers have also been considered. These are suspicious cells (entries) that can occur anywhere in the data matrix. Even a relatively small proportion of outlying cells can contaminate over half the rows, which is...
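A back-of-the-envelope calculation shows why: if each cell is outlying independently with probability $\varepsilon$, a row of $d$ cells contains at least one outlying cell with probability

\[
1 - (1 - \varepsilon)^d,
\]

so, for example, $\varepsilon = 0.05$ and $d = 20$ already give $1 - 0.95^{20} \approx 0.64$, i.e., roughly 64% of the rows are contaminated.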
Project and problem-based learning is becoming increasingly important in teaching. In statistics courses in particular, it is important not only to impart statistical knowledge, but also to keep an eye on the entire process of data analysis. This can best be achieved with case studies or data analysis projects. In the IPPOLIS project, we are developing a software learning tool that allows...
In this talk, the problem of selecting a set of design points for universal kriging, which is a widely used technique for spatial data analysis, is further investigated. We are interested in optimal designs for prediction and present a new design criterion that aims at simultaneously minimizing the variation of the prediction errors at various points. This optimality criterion is based on...
Reinforcement learning proposes a flexible framework for tackling problems where data are gathered in a dynamic context: actions have an influence on future states. The classical reinforcement learning paradigm depends on a Markovian hypothesis: the observed states depend on past states only through the last state and action. This condition may be too restrictive for real-world...
The "DOE Marble Tower" is a modular 3D-printed experiment system for teaching Design of Experiments. I designed it to solve one primary weakness of most DOE exercises, namely to prevent the ability of the experimenter to simply look at the system to figure out what each factor does. By hiding the mechanics, the DOE Marble Tower feels much more like real processes where the only way to know the...
Finding an optimal experimental design is computationally challenging, especially in high-dimensional spaces. To tackle this, we introduce the NeuroBayes Design Optimizer (NBDO), which uses neural networks to find optimal designs for high-dimensional models, by reducing the dimensionality of the search space. This approach significantly decreases the computational time...
Reinforcement Learning (RL) has emerged as a pivotal tool in the chemical industry, providing innovative solutions to complex challenges. RL is primarily utilized to enhance chemical processes, improve production outcomes, and minimize waste. By enabling the automation and real-time optimization of control systems, RL aims to achieve optimal efficiency in chemical plant operations, thereby...
Over the years I've seen diverse examples of fun elements in teaching statistics at ENBIS: paper helicopters and catapults, of course, or candle and water-bead projects for hands-on experience with DoE. But I also vividly remember a Lego assembly competition used to explain control charts and process control.
Fun parts boost motivation and serve as anchors to remember...
The integration of multimodal artificial intelligence (AI) in warehouse monitoring offers substantial improvements in efficiency, accuracy, and safety. This approach leverages diverse data sources, including visual and speech sensors, to provide comprehensive monitoring capabilities. Key challenges include the fusion of heterogeneous data streams, which requires sophisticated algorithms to...
Accelerated degradation tests (ADTs) are widely used to assess lifetime information under normal use conditions for highly reliable products. For the accelerated tests, two basic assumptions are that changing stress levels does not affect the underlying distribution family and that there is stochastic ordering for the life distributions at different stress levels. The acceleration invariance...
Storage of spare parts is one of the basic tasks faced by industry. Mathematical models, such as Crow-AMSAA (known in the statistical literature as the power-law nonhomogeneous Poisson process), allow us to estimate demand based on incoming data. Unfortunately, the amount of data is limited in the case of parts with high reliability, which is why the estimation is inaccurate. Bayesian...
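For reference, the power-law NHPP underlying Crow-AMSAA has mean function and intensity

\[
\Lambda(t) = \alpha t^{\beta}, \qquad \lambda(t) = \alpha \beta\, t^{\beta - 1}, \qquad \alpha, \beta > 0,
\]

so that the expected number of events (here, demands) by time $t$ is $\Lambda(t)$; $\beta < 1$ corresponds to a decreasing and $\beta > 1$ to an increasing rate.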
Multispectral imaging, enhanced by artificial intelligence (AI), is increasingly applied in industrial settings for quality control, defect detection, and process optimization. However, several challenges hinder its widespread adoption. The complexity and volume of multispectral data necessitate advanced algorithms for effective analysis, yet developing these algorithms is resource-intensive....
Multivariate EWMA control charts were introduced by Lowry et al. in 1992 and became a popular and effective tool for monitoring multivariate data. Multi-stream data are closely related to this framework: in both cases, the correlation between the components or the respective streams is considered. However, whereas the multivariate EWMA chart deploys a (Mahalanobis) distance in the...
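For reference, the MEWMA scheme of Lowry et al. (1992) smooths the $p$-variate observations and charts a Mahalanobis-type distance:

\[
\mathbf{Z}_t = \lambda \mathbf{X}_t + (1 - \lambda)\,\mathbf{Z}_{t-1}, \qquad 0 < \lambda \le 1, \qquad
T_t^2 = \mathbf{Z}_t^{\top} \boldsymbol{\Sigma}_{\mathbf{Z}_t}^{-1} \mathbf{Z}_t,
\qquad
\boldsymbol{\Sigma}_{\mathbf{Z}_t} = \frac{\lambda\bigl[1 - (1-\lambda)^{2t}\bigr]}{2 - \lambda}\,\boldsymbol{\Sigma},
\]

with a signal whenever $T_t^2$ exceeds a control limit $h$.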
Acceptance sampling plays a vital role in quality control. It is a common technique employed across various industries to assess the quality of products. The decision to accept or reject a lot depends upon the inspection of a random sample from that lot. However, traditional approaches often overlook valuable prior knowledge of product quality. Moreover, the existing Bayesian literature...
In the pharmaceutical industry, there are strict requirements on the presence of contaminants inside single-use syringes (so-called unijects). Quality management systems include various methods such as measuring weight, manual inspection or vision techniques. Automated and accurate techniques for quality inspection are preferred, reducing the costs and increasing the speed of production.
...
In modern industrial settings, the complexity of quality characteristics necessitates advanced statistical methods using functional data. This work extends the traditional Exponentially Weighted Moving Average (EWMA) control chart to address the statistical process monitoring (SPM) of multivariate functional data, introducing the Adaptive Multivariate Functional EWMA (AMFEWMA). The AMFEWMA...
A repairable system can be reused after repairs, and data from such systems often exhibit cyclic patterns. However, as seen in the charge-discharge cycles of a battery, where capacity decreases with each cycle, the system's performance may not fully recover after each repair. To address this issue, the trend renewal process (TRP) transforms periodic data using a trend function to ensure the...
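The TRP construction can be stated compactly (standard formulation; $F$ denotes the renewal distribution): event times $T_1 < T_2 < \dots$ follow a TRP with trend function $\Lambda(\cdot)$ if the transformed interarrival times are iid,

\[
\Lambda(T_1),\; \Lambda(T_2) - \Lambda(T_1),\; \Lambda(T_3) - \Lambda(T_2),\; \dots \;\overset{\text{iid}}{\sim}\; F,
\]

so a suitable $\Lambda$ absorbs the imperfect-repair trend and reduces the cyclic data to a renewal process.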
Counting processes occur very often in scientific and technological problems. The concept of numerosity and, consequently, the counting of a number of items are at the basis of many high-level measurements, in fields such as time and frequency, optics, ionizing radiation, microbiology and chemistry. Also, in conformity assessment and industrial quality control, as well...
At a time when Artificial Intelligence and Machine Learning algorithms are taking over the analysis of our needs in product development, it is still important to remember where humans have to question and control how new product designs handle the aspects of variation and uncertainty.
One part is the mapping of the variation of all aspects of design and production...
Pest insects threaten agriculture, reducing global crop yields by 40% annually and causing economic losses exceeding $70 billion, according to the FAO. Increasing pesticide use not only affects pest species but also beneficial ones. Consequently, precise insect population monitoring is essential to optimize pesticide application and ensure targeted interventions.
In today's AI-driven era,...
Statistical process monitoring is of vital importance in various fields such as biosurveillance, data streams, etc. This work presents a non-parametric monitoring process aimed at detecting changes in multidimensional data streams. The non-parametric monitoring process is based on the use of convex hulls for constructing appropriate control charts. Results from applying the proposed method are...
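A minimal sketch of the core geometric test (our illustration with SciPy and synthetic reference data, not the full control-charting procedure): new points falling outside the convex hull of the Phase I sample are flagged.

```python
import numpy as np
from scipy.spatial import ConvexHull, Delaunay

# Phase I: build the convex hull of in-control reference data (synthetic here).
rng = np.random.default_rng(1)
reference = rng.multivariate_normal([0.0, 0.0], np.eye(2), size=500)
hull = Delaunay(reference[ConvexHull(reference).vertices])

# Phase II: a point is flagged when it lies outside the hull
# (find_simplex returns -1 for points outside the triangulated region).
new_points = np.array([[0.5, -0.2], [4.0, 4.0]])
print(hull.find_simplex(new_points) < 0)  # [False  True]
```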
Quality by Design (QbD) has emerged as a pivotal framework in the pharmaceutical industry, emphasizing proactive approaches to ensure product quality. Central to QbD is the identification of a robust design space, encompassing the range of input variables and process parameters that guarantee pharmaceutical product quality. In this study, we present a comparative analysis of random walk...
The use of composite materials has been increasing in all production industries, including the aviation industry, due to their strength, lightness, and design flexibility. The manufacturing of composite materials concludes with their curing in autoclaves, which are heat and pressure ovens. The autoclave curing cycle, in which a batch of materials is cured in the autoclave, consists of three...
As a collaborative statistician you have been charged with completing a complicated toxicology analysis regarding levels of harmful chemicals in groundwater. At the conclusion of your presentation, an audience member asks, "So, should my cows drink the water?" At least half the audience nods and comments that they, too, would like to know the answer to that question. Clearly, something went...
Online experimentation is a way of life for companies involved in information technology and e-commerce. These experiments allocate visitors to a website to different experimental conditions to identify conditions that optimize important performance metrics. Most online experiments are simple two-group comparisons with complete randomization. However, there is great potential for improvement...
Chemical and physical stability of drug substances and drug products are critical in the development and manufacturing of pharmaceutical products. Classical stability studies, conducted under defined storage conditions of temperature and humidity and in the intended packaging, are resource intensive and are a major contributor to the development timeline of a drug product. To provide support...
Mixture choice experiments investigate people's preferences for products composed of different ingredients. To ensure the quality of the experimental design, many researchers use Bayesian optimal design methods. Efficient search algorithms are essential for obtaining such designs, yet research in the field of mixture choice experiments is still not extensive. Our paper pioneers the use of a...
Waste lubricant oil (WLO) is a hazardous residue that is preferably recovered through a regeneration process, promoting a sustainable circular economy. WLO regeneration is only viable if the WLO does not coagulate in the equipment. Thus, to prevent process shutdowns, the WLO’s coagulation potential is assessed offline in a laboratory through an alkaline treatment. This procedure is...
This talk ties in with the previous two talks in the session: the story and data are from one of the series of cases discussed by Froydis Bjerke, from Animalia, Norway, and the communication focus follows guidelines provided by Jennifer Van Mullekom.
The issues that arise in the case study itself include industrial statistics classics: “is the expensive external laboratory test really better...
In the fishing industry, maintaining the quality of fish such as the Peruvian anchovy (Engraulis ringens), used primarily for fishmeal and oil, is critical. The condition and freshness of the fish directly influence production outcomes and the final product's quality. Traditional methods for assessing fish freshness, though precise, are often too costly and time-consuming for frequent...
Powders are ubiquitous in the chemical industry, from pharmaceutical powders for tablet production to food powders like sugar. In these applications, powders are often stored in silos, where the powder builds up stress under its own weight. The Janssen model describes this build-up, but the model has unknown parameters that must be estimated from experimental data. This parameter estimation...
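For reference, the Janssen model for a cylindrical silo of diameter $D$ gives the vertical stress at depth $z$ in the familiar saturating form (standard formulation; $\mu$ is the wall friction coefficient and $K$ the lateral pressure ratio, typical unknowns in the estimation problem):

\[
\sigma(z) = \frac{\rho g D}{4 \mu K}\left(1 - e^{-4 \mu K z / D}\right),
\]

so the stress saturates at $\rho g D / (4\mu K)$ rather than growing linearly with depth.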
As businesses increasingly rely on machine learning models to make informed decisions, developing accurate and reliable models is critical. Obtaining curated and annotated data is essential for the development of these predictive models. However, in many industrial contexts, data annotation represents a significant bottleneck to the training and deployment of predictive models. Acquiring...
The popular ENBIS LIVE session is again on the program!
ENBIS LIVE 2024 will be hosted by Christian Ritter and Jennifer Van Mullekom.
This is a session in which three volunteers present open problems and the audience discusses them. It's a special occasion where we can all work together, either by providing useful suggestions or by gaining a deeper understanding. In this session,...
Version 18 of JMP and JMP Pro are being released in Spring 2024, bringing a host of new features useful to scientists and engineers in industry and academia. This presentation will focus on some key extensions and improvements: Besides an improved user experience based on a new Columns Manager for easier data management or Platform Presets for creating and reusing customized report templates,...
There is a common perception that bringing statistical innovation into highly regulated industries, such as pharmaceutical companies, is a hard mission. Often, due to legal constraints, the statistical innovation in the nonclinical space is not obvious to the outer world. In our discussion panel we would like to discuss the challenges we face as industrial statisticians working in...
In stratified designs, restricted randomization is often due to budget or time constraints. For example, if a factor is difficult to change and changing its level is expensive, the tests in a design are grouped into blocks so that within each block the level of the difficult factor is kept constant. Another example appears in agriculture, where some factors may need to be applied to larger...
The rapid progress in artificial intelligence models necessitates the development of innovative real-time monitoring techniques with minimal computational overhead. Particularly in machine learning, where artificial neural networks (ANNs) are commonly trained in a supervised manner, it becomes crucial to ensure that the learned relationship between input and output remains valid during the...
The online quality monitoring of a process with low-volume data is a very challenging task, and attention is most often placed on detecting when some of the underlying (unknown) process parameter(s) experience a persistent shift. Self-starting methods, both in the frequentist and the Bayesian domain, aim to offer a solution. Adopting the latter perspective, we propose a general closed-form...
Flow cytometry is a technique used to analyze individual cells or particles contained in a biological sample. The sample passes through a cytometer, where the cells are irradiated by a laser, causing them to scatter and emit fluorescent light. A number of detectors then collect and analyze the scattered and emitted light, producing a wealth of quantitative information about each cell (cell...
Many measurement system capability studies investigate two components of the measurement error, namely repeatability and reproducibility. Repeatability is used to denote the variability of measurements due to gauge, whereas reproducibility is the variability of measurements due to different conditions such as operators, environment, or time. A gauge repeatability and reproducibility (R&R)...
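In equation form, the usual gauge R&R decomposition reads

\[
\sigma^2_{\text{gauge}} = \sigma^2_{\text{repeatability}} + \sigma^2_{\text{reproducibility}}, \qquad
\sigma^2_{\text{total}} = \sigma^2_{\text{part}} + \sigma^2_{\text{gauge}},
\]

and an R&R study estimates these components, typically from a crossed design of parts, operators, and repeated measurements.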
During the past few decades, it has become necessary to develop new tools for exploiting and analysing the ever-increasing volume of data. This is one of the reasons why Functional Data Analysis (FDA) has become very popular in a constantly growing number of industrial, societal and medical applications. FDA is a branch of statistics that deals with data that can be represented as functions....
The Advanced Manufacturing Research Centre has invested heavily in AI for manufacturing and has seen success in many applications, including process monitoring, knowledge capture and defect detection. Despite the success in individual projects, the AMRC still has few experts in data science and AI and currently has no framework in place to enable wider adoption of AI nor to ensure the quality...
In order to evaluate the performance of companies, the focus is shifting from purely quantitative (financial) information to qualitative (textual) information. Corporate annual reports are comprehensive documents designed to inform investors and other stakeholders about a company's performance in the past year and its goals for the coming years. We have focused on the corporate sustainability...
Traffic flow estimation plays a key role in the strategic and operational planning of transport networks. Although the amplitude and peak times of the daily traffic flow profiles change from location to location, some consistent patterns emerge within urban networks. In fact, the traffic volumes of different road segments are correlated with each other from spatial and temporal perspectives....
The management of the COVID-19 pandemic, especially during the years 2020 and 2021, highlighted a serious shortage at all levels and in the majority of countries around the world.
Some countries reacted slightly better, having faced similar epidemics in their recent past, but obviously this was not enough, since the flows of people worldwide are now so huge that it makes little sense to make...
Kernel Principal Component Analysis (KPCA) extends linear PCA from a Euclidean space to data provided in the form of a kernel matrix. Several authors have studied its sensitivity to outlying cases and have proposed robust alternatives, as well as outlier detection diagnostics. We investigate the behavior of kernel ROBPCA, which relies on the Stahel-Donoho outlyingness in feature space...
We formulate a semiparametric regression approach to short-term prediction (48- to 72-hour-ahead horizons) of electricity prices in the Czech Republic. It is based on a complexity-penalized spline implementation of GAM; hence it allows for flexible modeling of the dynamics of the process and of important details of the hourly + weekly periodic components (which are salient for both point prediction and its...
In recent years, significant progress has been made in setting up decision support systems based on machine learning exploiting very large databases. In many research or production environments, the available databases are not very large, and the question arises as to whether it makes sense to rely on machine learning models in this context.
Especially in the industrial sector, designing...
In the semiconductor industry it is required that high-tech equipment has a large uptime due to large costs of production losses. As a consequence, it is important to have accurate reliability predictions of parts of such equipment, so that there are sufficient spare parts available. This is not a trivial task since high-tech equipment may consist of thousands of parts.
It is common in the...
Prognostics of cutting tools health is an important and challenging task in manufacturing industry. The main objective of prognostics is to examine the ability of the cutting tool to perform its function throughout its expected life and determine its remaining useful life (RUL). An accurate estimate of RUL will aid in maximizing the utilization of the cutting tool, improve quality performance,...
In Industry 4.0 factories, innovative prediction tools are adopted so that data can be systematically processed into information that can explain uncertainties and support decisions. Predictive manufacturing systems begin with acquiring data from monitored assets using appropriate sensors to extract various signals. These signals can then be integrated with historical data into extensive...
This paper compares measurements from a regular track measurement car and an onboard measurement system mounted on a regular passenger train car. The measurement systems were compared as an experimental instrument to assess a maintenance action. The experiment involved frequent pre- and post-maintenance measurements from onboard mounted equipment to assess short-term effects, while more...
Tony Greenfield said, ‘That is my challenge: Tell the world, outside your circle, of work you have done, and done successfully because you used statistics.’
Often, outside our circle, if you mention the word 'statistics', the reply is 'There are three kinds of lies: lies, damned lies, and statistics.'
We use social media to communicate; this includes LinkedIn, a network for...
Count data with excess zeros are commonly encountered in various scientific fields such as public health, insurance, economics, and engineering. To handle this issue, zero-inflated count models such as the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models are widely used.
In the context of regression models, it can be beneficial to incorporate uncertain prior...
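For reference, the ZIP model mixes a point mass at zero with a Poisson component (standard formulation; $\pi$ is the zero-inflation probability):

\[
\Pr(Y = 0) = \pi + (1 - \pi)\, e^{-\lambda}, \qquad
\Pr(Y = k) = (1 - \pi)\, \frac{e^{-\lambda} \lambda^{k}}{k!}, \quad k = 1, 2, \dots
\]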