Research Article

Explainable Data Lineage AI Agents: Bridging Technical Pipelines with Human-Centric Narratives

by Thananjayan Kasi
Journal of Advanced Artificial Intelligence
Foundation of Computer Science (FCS), NY, USA
Volume 2 - Issue 3
Published: December 2025
Authors: Thananjayan Kasi
DOI: 10.5120/jaai202555

Thananjayan Kasi. Explainable Data Lineage AI Agents: Bridging Technical Pipelines with Human-Centric Narratives. Journal of Advanced Artificial Intelligence. 2, 3 (December 2025), 17-28. DOI=10.5120/jaai202555

@article{10.5120/jaai202555,
  author    = {Thananjayan Kasi},
  title     = {Explainable Data Lineage AI Agents: Bridging Technical Pipelines with Human-Centric Narratives},
  journal   = {Journal of Advanced Artificial Intelligence},
  year      = {2025},
  volume    = {2},
  number    = {3},
  pages     = {17-28},
  doi       = {10.5120/jaai202555},
  publisher = {Foundation of Computer Science (FCS), NY, USA}
}

%0 Journal Article
%D 2025
%A Thananjayan Kasi
%T Explainable Data Lineage AI Agents: Bridging Technical Pipelines with Human-Centric Narratives
%J Journal of Advanced Artificial Intelligence
%V 2
%N 3
%P 17-28
%R 10.5120/jaai202555
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Traditional data lineage tools trace source-to-destination paths but often lack contextual clarity, creating a disconnect between technical implementation and business interpretation. This paper introduces explainable data lineage AI agents that generate natural language narratives explaining the rationale behind each transformation, covering business logic, risk implications, and data quality impact. By combining metadata intelligence, governance policies, and large language models (LLMs), these agents enable conversational interrogation of data pipelines, with answers tailored to organizational roles. The proposed architecture delivers multi-persona reports from a single unified lineage graph: executive summaries for leadership, compliance narratives for auditors, and technical insights for engineers. Challenges remain in handling ambiguity and incomplete metadata, suggesting directions for future research.
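
The idea of deriving several persona-specific reports from one lineage graph can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names Transformation, LineageGraph, PERSONA_TEMPLATES, and narrate are hypothetical, the sample pipeline is invented for illustration, and fixed string templates stand in for the LLM step the abstract describes.

```python
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class Transformation:
    """One lineage edge, annotated with the context the abstract lists."""
    source: str
    target: str
    operation: str
    business_logic: str
    risk: str
    quality_impact: str


@dataclass
class LineageGraph:
    """Hypothetical unified lineage graph stored as a flat list of edges."""
    edges: list = field(default_factory=list)

    def add(self, edge: Transformation) -> None:
        self.edges.append(edge)

    def upstream(self, target: str) -> list:
        """Return every edge feeding `target`, ordered source-first."""
        ordered, seen = [], set()

        def visit(node: str) -> None:
            for edge in self.edges:
                if edge.target == node and edge not in seen:
                    seen.add(edge)          # guards against cycles
                    visit(edge.source)      # emit parent edges first
                    ordered.append(edge)

        visit(target)
        return ordered


# Assumption: plain templates replace the LLM-generated narrative.
PERSONA_TEMPLATES = {
    "executive": "{target} is derived from {source}. Why: {business_logic}",
    "auditor": "{source} -> {target} via {operation}. Risk: {risk}",
    "engineer": "{source} -> {target} [{operation}]. Quality impact: {quality_impact}",
}


def narrate(graph: LineageGraph, target: str, persona: str) -> list:
    """Render one sentence per upstream edge in the requested persona's voice."""
    template = PERSONA_TEMPLATES[persona]
    return [template.format(**asdict(edge)) for edge in graph.upstream(target)]


# Invented sample pipeline, for illustration only.
graph = LineageGraph()
graph.add(Transformation(
    "raw_orders", "clean_orders", "deduplicate",
    "avoid double-counting revenue", "low", "removes duplicate rows"))
graph.add(Transformation(
    "clean_orders", "revenue_report", "aggregate by month",
    "track monthly revenue", "feeds financial reporting",
    "totals depend on complete months"))

for persona in PERSONA_TEMPLATES:
    print(persona.upper())
    for sentence in narrate(graph, "revenue_report", persona):
        print("  " + sentence)
```

The same two edges yield three different reports, which is the point of keeping one graph and varying only the rendering. In a fuller system the template lookup would presumably be replaced by a prompt to an LLM that receives the edge annotations and the relevant governance policies.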

Index Terms
Computer Science
Information Sciences
Keywords

Data Lineage, Explainable AI, Metadata Intelligence, Governance, Large Language Models
