Generated cultural heritage question-answer dataset: Durga in multi-dimensional perspectives

Authors: Tri Lathif Mardi Suryanto · Aji Prasetya Wibawa · Hariyono · Andrew Nafalski · Gulsun Kurubacak Çakır

February 9, 2026, 19:00

Data Brief. 2026 Jan 20;65:112495. doi: 10.1016/j.dib.2026.112495. eCollection 2026 Apr.

ABSTRACT

This dataset compiles question-answer (QA) pairs derived from cultural texts and sources related to Durga mythology. It contains a total of 21,395 QA pairs drawn from textual materials such as scriptures, ritual narratives, temple inscriptions, and traditional storytelling records. Each entry includes the source reference, the question, and the corresponding answer, provided in a structured, Excel-compatible format for integration into downstream natural language processing (NLP) tasks. Data collection involved manual curation and annotation by domain experts, followed by preprocessing steps including text normalization, duplicate removal, and verification of factual and contextual accuracy. The dataset is designed to support generative QA models, culturally aware chatbots, and the digital preservation of heritage knowledge. It is particularly relevant to research on AI-driven cultural applications, educational tools, and digital humanities initiatives that aim to bridge traditional knowledge with computational methods. Researchers and practitioners may use the dataset to train generative models, build interactive educational platforms, develop culturally sensitive AI agents, and support comparative studies in cross-cultural heritage. This openly accessible resource adheres to ethical standards, with proper attribution to source materials, and provides a foundation for both academic research and applied development in culturally informed artificial intelligence.
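The preprocessing described above (text normalization followed by duplicate removal) can be illustrated with a minimal sketch. This is not the authors' pipeline: the field names `source`, `question`, and `answer`, the choice of Unicode NFKC normalization, the case-insensitive duplicate key, and the sample rows are all assumptions made for illustration.

```python
import unicodedata


def normalize_text(text: str) -> str:
    """Apply Unicode NFKC normalization and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFKC", text).split())


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first occurrence of each normalized (question, answer) pair."""
    seen = set()
    unique = []
    for record in records:
        cleaned = {key: normalize_text(value) for key, value in record.items()}
        # Assumed rule: compare case-insensitively so trivial variants collapse.
        pair = (cleaned["question"].casefold(), cleaned["answer"].casefold())
        if pair not in seen:
            seen.add(pair)
            unique.append(cleaned)
    return unique


# Hypothetical placeholder rows, not taken from the dataset.
sample = [
    {"source": "Source A", "question": "Who is  Durga?", "answer": "A Hindu goddess."},
    {"source": "Source A", "question": "who is Durga?", "answer": "A Hindu goddess."},
    {"source": "Source B", "question": "What is a temple inscription?",
     "answer": "Text carved on a temple."},
]
cleaned_sample = deduplicate(sample)
```

Here the second row differs from the first only in spacing and letter case, so under the assumed rule it is treated as a duplicate and dropped, while the first occurrence is kept in normalized form.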

PMID:41657412 | PMC:PMC12874138 | DOI:10.1016/j.dib.2026.112495