ORIGINAL RESEARCH article

Front. Bioinform.

Sec. Data Visualization

This article is part of the Research Topic: 15th International Meeting on Visualizing Biological Data (VIZBI 2025)

SHACLens: A Visualization Workflow for SHACL Violation Exploration in Knowledge Graphs

Provisionally accepted
Christian Alexander Steinparz1*, Andreas Hinterreiter1, Labinot Bajraktari2, Vitaly Sedlyarov2, Markus J. Bauer1, Thomas Zichner2, Marc Streit2
  • 1Department of Computer Science, Faculty of Science and Natural Sciences, Johannes Kepler University of Linz, Linz, Austria
  • 2Boehringer Ingelheim RCV GmbH & Co KG, Vienna, Austria

The final, formatted version of the article will be published soon.

Validating large knowledge graphs with the Shapes Constraint Language (SHACL) often yields violation reports that are too large to interpret and trace to root causes, especially in industry-scale datasets such as pharmaceutical omics pipelines. We present SHACLens, an interactive visualization workflow, developed with a major pharmaceutical partner, that links ontology, instance data, and violation reports across multiple coordinated views. We contribute a practitioner-informed workflow co-designed with pharmaceutical data-analysis experts, along with the design lessons distilled from that process: we first analyzed the experts' existing practice and then optimized it into a multiple-coordinated-view system. A Node-Link View combines the ontology with groups of equivalent violations, a Projection View reveals clusters of nodes with similar errors, a LineUp table combines instance data with violation information, a Class Tree offers a class-hierarchy overview, and an integrated LLM assistant provides contextual explanations and can operate the system via natural-language commands. Within this workflow, selections and filters propagate across views, exposing co-occurring errors and their likely upstream causes. Analysts iteratively identify violation clusters, inspect correlations, and trace errors to their detailed causes. We evaluated SHACLens through an iterative expert-in-the-loop design process with the partner team and, with the same experts, a qualitative study on a transcriptomics dataset containing 5,203 violating nodes. In this study, SHACLens efficiently surfaced recurring sets of errors caused by missing objects and schema inconsistencies, supporting both goal-oriented analysis and serendipitous findings.
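
The idea of "groups of equivalent violations" can be illustrated with a minimal Python sketch. It is not the SHACLens implementation. It assumes, for illustration only, that two nodes are equivalent when they share the same set of (result path, constraint component) pairs. The records, the `ex:` identifiers, and the function name `group_by_violation_signature` are all hypothetical; only the two `sh:` constraint component names are taken from the SHACL vocabulary.

```python
from collections import defaultdict

# Hypothetical, hand-written violation records (not data from the study).
# Each record mirrors three fields of a SHACL validation result:
# the focus node, the result path, and the source constraint component.
violations = [
    {"focus_node": "ex:sample1", "path": "ex:hasDonor", "component": "sh:MinCountConstraintComponent"},
    {"focus_node": "ex:sample2", "path": "ex:hasDonor", "component": "sh:MinCountConstraintComponent"},
    {"focus_node": "ex:sample1", "path": "ex:tissue", "component": "sh:ClassConstraintComponent"},
    {"focus_node": "ex:sample2", "path": "ex:tissue", "component": "sh:ClassConstraintComponent"},
    {"focus_node": "ex:sample3", "path": "ex:tissue", "component": "sh:ClassConstraintComponent"},
]


def group_by_violation_signature(records):
    """Group focus nodes that share exactly the same set of violations.

    Assumption: a node's "signature" is the set of (path, component)
    pairs it violates. Other notions of equivalence are possible.
    """
    # Step 1: collect the set of violations for each focus node.
    per_node = defaultdict(set)
    for record in records:
        per_node[record["focus_node"]].add((record["path"], record["component"]))

    # Step 2: invert the mapping so each distinct signature lists its nodes.
    groups = defaultdict(list)
    for node, signature in per_node.items():
        groups[frozenset(signature)].append(node)

    return {signature: sorted(nodes) for signature, nodes in groups.items()}


groups = group_by_violation_signature(violations)

# Print the largest groups first; each line shows the group size,
# its member nodes, and the shared violation signature.
for signature, nodes in sorted(groups.items(), key=lambda kv: -len(kv[1])):
    print(len(nodes), nodes, sorted(signature))
```

In this toy input, `ex:sample1` and `ex:sample2` fall into one group because both miss `ex:hasDonor` and both have an invalid `ex:tissue` value, while `ex:sample3` forms its own group. Collapsing thousands of violating nodes into a few such signatures is one plausible way to make co-occurring errors visible at a glance.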

Keywords: Data curation, Down-Projection, Knowledge graphs, Large Language Models, LLM Interfaces, machine learning, Visual Analytics, visualization

Received: 28 Nov 2025; Accepted: 12 Feb 2026.

Copyright: © 2026 Steinparz, Hinterreiter, Bajraktari, Sedlyarov, Bauer, Zichner and Streit. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Christian Alexander Steinparz

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.