Project monitoring practices have significantly evolved over the past decades. Initially grounded in traditional methodologies such as Earned Value Management (EVM), these practices have advanced to incorporate control charts and sophisticated techniques utilizing Artificial Intelligence (AI) and Machine Learning (ML) algorithms to predict final project costs and durations. Despite these...
Retrieval-Augmented Generation (RAG) offers a robust way to enhance large language models (LLMs) with domain-specific knowledge via external information retrieval. In banking—where precision, compliance, and accuracy are vital—optimizing RAG is crucial. This study explores how various document parsing, chunking, and indexing techniques influence the performance of RAG systems in banking...
Purpose: Industrial applications increasingly rely on complex predictive models for process optimization and quality improvement. However, the relationship between statistical model performance and actual operational benefits remains insufficiently characterized. This research investigates when model complexity provides genuine business value versus statistical...
Timely crop yield estimation is a key component of Smart Agriculture, enabling proactive decision-making and optimized resource allocation under the constraints of climate variability and sustainability goals. Traditional approaches based on manual sampling and empirical models are constrained by labour intensity, limited spatial coverage, and sensitivity to within-block (-site) heterogeneity....
The IFP group is a leader in research and training in the energy and environmental sector, particularly in the development and commercialization of catalysts. Building accurate predictive models for these catalysts usually requires expensive and time-consuming experiments. To make this process more efficient, it’s helpful to leverage existing data from previous generations of catalysts. This...
Manual welding is an important manufacturing process in several industries, including marine, automotive, and furniture. Despite its widespread use, welding still causes a significant percentage of rework in many companies, especially small and medium-sized ones. The objective of this project is to develop an economical online monitoring method for detecting defective welds using...
Bayesian Optimization (BO) has received tremendous attention for optimizing deterministic functions and tuning ML parameters. There is increasing interest in applying BO to physical measurement data in industrial settings as a recommender system for product/process design. In this context, multiple responses of interest are the norm, but "basic" BO is only defined for minimization/maximization of a...
In this work, we address the problem of binary classification under label uncertainty in settings where both feature-based and relational data are available. Motivated by applications in financial fraud detection, we propose a Bayesian Gaussian Process classification model that leverages covariate similarities and multilayer network structure. Our approach accounts for uncertainty in the...
Early process development in biopharma traditionally relies on small-scale experimentation, e.g. microtiter plates. At this stage, most catalyst candidates (clones) are discarded before process optimisation is conducted in bioreactors at larger scales, which differ significantly in their feeding strategies and process dynamics. This disconnect limits the representativeness of small-scale...
PRIM is a bump-hunting algorithm traditionally used in a supervised learning setting to find, with guidance from the data analyst, regions of the input variable subspace that are associated with the highest or lowest occurrence of a target label of a class variable.
We present in this work a non-parametric PRIM-based algorithm that involves all the relevant attributes for rule generation...
Data privacy is a growing concern in real-world machine learning (ML) applications, particularly in sensitive domains like healthcare. Federated learning (FL) offers a promising solution by enabling model training across decentralized, private data sources. However, both traditional ML and FL approaches typically assume access to fully labeled datasets, an assumption that rarely holds in...
Real-time monitoring systems play a crucial role in detecting and responding to changes and anomalies across diverse fields such as industrial automation, finance, healthcare, cybersecurity, and environmental sensing. Central to many of these applications is multivariate statistical process monitoring (MSPM), which enables the concurrent analysis of multiple interrelated data streams to...
Real-world datasets frequently include not only vast numbers of observations but also high-dimensional feature spaces. Exhaustively gathering and examining every variable to uncover meaningful insights can be time-consuming, costly, or even infeasible. To build robust, reliable, and efficient regression models, feature selection techniques have therefore become indispensable. Yet many...
Extracting meaningful insights from vast amounts of unstructured textual data presents significant challenges in text mining, particularly when attempting to separate valuable information from noise. This research introduces a novel deep learning framework for text mining that identifies latent structures within comprehensive text corpora. The proposed methodology incorporates an initial...
A major challenge in the chemicals industry is coordinating decisions across different levels, such as individual equipment, entire plants, and supply chains, to enable more sustainable, autonomous operations. Multi-agent systems, based on large language models (LLMs), have shown potential for managing complex, multi-step problems in software development (Qian et al., 2023). This work...