Speaker
Description
In a client project focused on asset registration, we transitioned from a publicly available LLM (Gemini) to a locally hosted LLM to prevent the potential leakage of sensitive manufacturing and customer data. Due to hardware constraints (GPUs with at most 12 GB of VRAM), the performance of the local LLMs was initially inferior to that of the hosted model. Prompt optimization therefore became crucial for achieving comparable output quality.
We explored the application of LLM-specific statistical evaluation metrics (G-Eval, Answer Relevance, Prompt Alignment, Sensitivity, and Consistency) using the Langfuse framework deployed on-premises. We evaluated three quantized models (Llama 3.1 8B, Mistral 7B, DeepSeek-LLM 7B) on a dataset of 200 simulated asset registration requests in multiple languages (German, English, Turkish), reflecting real-world operational complexity.
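As a rough illustration of how such metrics can be recorded per request, the sketch below logs one extraction run to a self-hosted Langfuse instance and attaches metric scores to it. It assumes the Langfuse Python SDK v2-style trace/generation/score interface; the host, keys, model tag, example texts, and score values are placeholders rather than details from the actual project.

```python
# Minimal sketch: log one asset-registration extraction to a self-hosted
# Langfuse instance and attach per-request metric scores.
# Host, keys, model tag, example texts, and score values are placeholders.
from langfuse import Langfuse

langfuse = Langfuse(
    host="https://langfuse.internal.example",  # on-premises deployment
    public_key="pk-lf-...",
    secret_key="sk-lf-...",
)

request_text = "Bitte registrieren Sie die neue CNC-Fräse, Seriennummer 4711, Halle 3."
model_output = '{"asset": "CNC-Fräse", "serial_number": "4711", "location": "Halle 3"}'

trace = langfuse.trace(name="asset-registration", input=request_text)
trace.generation(
    name="field-extraction",
    model="llama3.1:8b",  # quantized local model
    input=request_text,
    output=model_output,
)

# In practice the values would come from the metric implementations
# (G-Eval judge, answer-relevance check, etc.); fixed numbers here.
for metric, value in {
    "g_eval": 0.85,
    "answer_relevance": 0.90,
    "prompt_alignment": 1.0,
}.items():
    langfuse.score(trace_id=trace.id, name=metric, value=value)

langfuse.flush()  # ensure events are sent before the script exits
```

Scores attached this way can then be aggregated in the Langfuse UI to compare models and prompt variants across the 200 requests.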
Reference values for evaluation were manually extracted by human coders. Prompt engineering techniques, including Zero-shot, One-shot, and Few-shot strategies, were applied systematically. Only information present in the registration prompts was extracted; no hallucinations occurred (temperature was fixed at 0). However, significant effort was needed to maximize information extraction through tailored prompts.
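To make the few-shot setup concrete, the following sketch shows what an extraction call against one of the quantized local models might look like, with the temperature fixed at 0. It assumes the models are served via Ollama under tags such as llama3.1:8b; the field names and example requests are hypothetical, not the project's actual prompt.

```python
# Sketch of a few-shot extraction call against a local quantized model
# served via Ollama, with temperature fixed at 0 (deterministic decoding).
# Field names and example requests are hypothetical.
import ollama

SYSTEM_PROMPT = (
    "Extract asset registration data from the user request. "
    "Return JSON with the keys asset, serial_number and location. "
    "Use null for any field that is not mentioned; do not invent values."
)

FEW_SHOT = [
    {"role": "user", "content": "Please register the new forklift, serial no. FL-220, warehouse 2."},
    {"role": "assistant", "content": '{"asset": "forklift", "serial_number": "FL-220", "location": "warehouse 2"}'},
    {"role": "user", "content": "Bitte die neue Presse mit Seriennummer P-9001 registrieren."},
    {"role": "assistant", "content": '{"asset": "Presse", "serial_number": "P-9001", "location": null}'},
]

def extract(request_text: str, model: str = "llama3.1:8b") -> str:
    """Run the few-shot extraction prompt against a locally hosted model."""
    response = ollama.chat(
        model=model,
        messages=[{"role": "system", "content": SYSTEM_PROMPT},
                  *FEW_SHOT,
                  {"role": "user", "content": request_text}],
        options={"temperature": 0},  # no sampling, as in the study
    )
    return response["message"]["content"]

# Mixed-language input, as described in the abstract (Turkish example).
print(extract("Yeni torna tezgahını kaydedin, seri numarası T-15, Hat 4."))
```

Swapping the FEW_SHOT list for an empty list or a single example yields the zero-shot and one-shot variants of the same prompt.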
This case study demonstrates how statistical metrics can guide and validate prompt optimization, especially under constrained computing conditions and with multilingual input. We invite discussion on alternative metrics, further application domains, and lessons learned from adapting models trained primarily on English data to German and mixed-language tasks.
| Classification | Mainly application |
|---|---|
| Keywords | Statistical Evaluation Metrics; Prompt Optimization; Locally Hosted LLMs |