ENBIS-26 Conference

Name: ENBIS-26 Conference
Start: 2026-09-06T09:00:00+02:00
End: 2026-09-10T18:00:00+02:00
Location: Centro Didattico Morgagni

Europe/Rome timezone

Statistical Evaluation of CPU-Based Offline LLMs for Industrial Text Processing on Low-Cost Edge Hardware

Time: Sep 7, 2026, 12:00 PM
Duration: 20 minutes
Room: Conference Room 103

Trustworthy and Explainable AI

Speaker: Dr Guido Moeser (masem research institute)

Recent advances in generative AI have enabled powerful language models for industrial applications. However, most solutions rely on cloud-based infrastructure or GPU-accelerated environments, which raises concerns about data privacy, latency, and operational cost, particularly in industrial settings that handle sensitive internal documents.

In this study, we investigate the feasibility of deploying offline large language models (LLMs) on CPU-only edge hardware, such as standard notebooks and low-cost mini PCs. In particular, we evaluate the performance of highly quantized models, including emerging architectures such as BitNet, within a real-world industrial use case.

The application scenario is based on a production system developed using n8n, where incoming customer emails are processed locally without external data transfer. Two core tasks are considered:

Structured information extraction (classification task): Extraction of machine identifiers and request types from customer emails.
Text summarization and interpretation (generation task): Generation of concise summaries and actionable insights from unstructured text.

For the classification task, performance is evaluated using classical statistical metrics derived from the confusion matrix, including precision, recall, and F1-score. For the generative task, we apply G-Eval-based metrics to assess dimensions such as correctness, completeness, and consistency.
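
As a minimal sketch of the confusion-matrix metrics named above, the following computes per-class precision, recall, and F1 from true and predicted labels. The label names (`repair`, `spare_part`, `other`) and the example data are hypothetical illustrations, not taken from the study, and the macro-averaging step is an assumption about how per-class scores might be aggregated.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} from confusion-matrix counts."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class received a false positive
            fn[truth] += 1   # true class received a false negative
    metrics = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


def macro_f1(metrics):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for _, _, f1 in metrics.values()) / len(metrics)


# Hypothetical request-type labels for six emails (illustration only).
y_true = ["repair", "repair", "spare_part", "spare_part", "other", "repair"]
y_pred = ["repair", "spare_part", "spare_part", "spare_part", "other", "repair"]

metrics = per_class_metrics(y_true, y_pred)
for label, (p, r, f1) in metrics.items():
    print(f"{label}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
print(f"macro F1 = {macro_f1(metrics):.3f}")
```

Macro averaging weights every class equally, which may be preferable when rare request types matter as much as frequent ones; a support-weighted average would be the alternative.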

Beyond output quality, we introduce a comprehensive set of system-level performance indicators, including:

processing latency
tokens generated per second
total token count per task
CPU utilization and memory footprint
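
The first three indicators in the list above could be collected with a thin timing wrapper, sketched below. The `generate` callable is a hypothetical stand-in for whatever local inference runtime is used, and `fake_generate` only simulates token output so that the sketch runs without a model. CPU utilization and memory footprint require OS-level tooling and are deliberately left out.

```python
import time


def measure_generation(generate, prompt):
    """Time one generation call and derive throughput figures.

    `generate` is assumed to be any callable that yields tokens for a prompt.
    """
    start = time.perf_counter()
    tokens = list(generate(prompt))          # consume the full token stream
    latency = time.perf_counter() - start    # wall-clock seconds for the task
    count = len(tokens)
    return {
        "latency_s": latency,
        "total_tokens": count,
        "tokens_per_second": count / latency if latency > 0 else float("inf"),
    }


def fake_generate(prompt):
    """Hypothetical stand-in for a local model: echoes words with a delay."""
    for word in prompt.split():
        time.sleep(0.001)  # simulate per-token compute time
        yield word


stats = measure_generation(fake_generate, "machine 4711 reports a spindle fault")
print(stats)
```

Repeating the measurement over many emails and reporting a distribution (for example median and upper percentiles) rather than a single run would give a more stable comparison between devices.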

These metrics are analyzed under different deployment configurations, comparing execution on a high-performance notebook (64 GB RAM) and a low-cost edge device (Intel N95 mini PC, 12 GB RAM).

Our results demonstrate that CPU-based offline LLMs can achieve competitive performance for industrial text processing tasks, while significantly reducing infrastructure complexity and enabling fully local data processing. The study highlights that, with appropriate model selection and evaluation metrics, cost-efficient edge deployments without GPUs are a viable alternative for industrial AI applications.

Classification: Mainly application
Keywords: Edge Computing, Quantized Language Models (BitNet), Offline-LLMs

Author: Dr Guido Moeser (masem research institute)

Co-authors: Andrea Ahlemeyer-Stubbe (Data Mining & More), Igor Dub (Wiesbaden Smart City Department), Rainer Bumm (PHOENIX)
