JDHASA (Journal of the Digital Humanities Association of Southern Africa)

Preface to the Proceedings of RAIL 2025

The sixth workshop on Resources for African Indigenous Languages (RAIL) was held on 10 November 2025 at the CSIR International Convention Centre in Pretoria, South Africa. It was co-located with the Digital Humanities Association of Southern Africa (DHASA) 2025 conference, which took place from 11 to 14 November 2025.

Ideational Analysis and Integration of African Folktale in Science, Technology, and Education

31 December 2025, 08:00

Folktales are literary forms that reveal the soul of a society; they express its wishes, desires, hopes, and beliefs about the world. They feature fictional characters and situations, and most were passed down orally before they were written down. According to Cynthia McDaniel (1993), folktales can be used in all disciplines to convey knowledge and communicate ideas; they serve as an inherent vehicle for intergenerational communication that prepares and assigns roles and responsibilities to different generations in their communities. They are more pedagogic devices than literary pieces. They cultivate universal values such as compassion, generosity, and honesty while disapproving of attributes such as cruelty, greed, and dishonesty. To illustrate McDaniel's claims, this paper first uses the ideational metafunctional framework of Systemic Functional Linguistics, which expresses clausal experience and content from a grammatical perspective, coupled with syntagmatic analysis, which describes the text (folktale) in chronological order as reported by the storyteller. Second, the paper uses the textual metafunctional framework, which fulfills the thematic function of the clause, coupled with paradigmatic analysis, in which the folkloristic text's patterns are regrouped more analytically to reveal the text's latent content, or theme. The Voyant Tool, a web-based text reading and analysis environment designed to facilitate the analysis of various text formats, was used to extract and analyze data from a Sesotho folktale to illustrate how folktales may be integrated with technology for research and educational purposes. The paper employs a descriptive research design that incorporates qualitative (content analysis) and quantitative (statistical analysis) methodologies to analyze and interpret the story. The Voyant analysis shows that the story is built out of 191 Sesotho word formations, and the ideational analysis shows that the storyteller employed more material process types than mental process types. Lastly, the textual interpretation indicates the value of oral literature in our daily lives as well as the significant role folktales may play in interpreting sociopolitical events in contemporary communities.

An exploration of the computational identification of English loan words in Sesotho

31 December 2025, 08:00

South Africa, with its twelve official languages, is an inherently multilingual country. As such, speakers of many of the languages have been in direct contact. This has led to a cross-over of words and phrases between languages. In this article, we provide a methodology to identify words that are (potentially) borrowed from another language. We test our approach by trying to identify words that moved from English into Sesotho (or potentially the other way around). To do this, we start with a bilingual Sesotho-English dictionary (Bukantswe).
We then develop a lexicographic comparison method that takes a pair of lexical items (English and Sesotho) and computes a range of distance metrics. These distance metrics are applied to the raw words (i.e., comparing orthography); by first applying the Soundex algorithm, an approximate phonological comparison can be made as well. Unfortunately, Bukantswe does not contain complete annotation of loan words, so a quantitative evaluation is not currently possible. We provide a qualitative analysis of the results, which shows that many loan words can be found, but in some cases lexical items that have a high similarity are not loan words. We discuss different situations related to the influence of orthography, phonology, syllable structure, and morphology. The approach itself is language independent, so it can also be applied to other language pairs, e.g., Afrikaans and Sesotho, or more closely related languages, such as isiXhosa and isiZulu.
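
The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which distance metrics were used, so Levenshtein distance is assumed here as one plausible choice, and the Soundex variant is the standard American one. The function names and the word pairs are hypothetical and are not taken from the paper or from Bukantswe.

```python
# Assumption: Levenshtein distance stands in for the paper's unspecified metrics.
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Map edit distance to [0, 1], where 1.0 means identical spelling."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


_SOUNDEX_CODES = {
    **dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"), "l": "4",
    **dict.fromkeys("mn", "5"), "r": "6",
}


def soundex(word: str) -> str:
    """Standard American Soundex: first letter plus three digits."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "hw":
            continue  # h and w do not separate letters with the same code
        digit = _SOUNDEX_CODES.get(c, "")
        if digit and digit != prev:
            code += digit
        prev = digit  # vowels reset prev, so a repeated code is kept
    return (code + "000")[:4]


def compare(english: str, sesotho: str) -> dict:
    """Hypothetical helper combining an orthographic and a Soundex comparison."""
    return {
        "orthographic_similarity": normalized_similarity(english.lower(), sesotho.lower()),
        "soundex_match": soundex(english) == soundex(sesotho),
    }


# Illustrative pairs chosen for this sketch only.
print(compare("bank", "banka"))
print(compare("school", "sekolo"))
```

In this sketch the second pair does not receive matching Soundex codes, because the vowel inserted into the initial consonant cluster changes the code. That is one way the syllable-structure effects mentioned in the abstract could show up in practice.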

Mafoko: Structuring and Building Open Multilingual Terminologies for South African NLP

The critical lack of structured terminological data for South Africa’s official languages hampers progress in multilingual NLP, despite the existence of numerous government and academic terminology lists. These valuable assets remain fragmented and locked in non-machine-readable formats, rendering them unusable for computational research and development. Mafoko addresses this challenge by systematically aggregating, cleaning, and standardising these scattered resources into open, interoperable datasets. We introduce the foundational Mafoko dataset, released under the equitable, Africa-centered NOODL framework. To demonstrate its immediate utility, we integrate the terminology into a Retrieval-Augmented Generation (RAG) pipeline. Experiments show substantial improvements in the accuracy and domain-specific consistency of English-to-Tshivenda machine translation for large language models. Mafoko provides a scalable foundation for developing robust and equitable NLP technologies, ensuring South Africa’s rich linguistic diversity is represented in the digital age.
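
As a rough illustration of how terminology might be fed into a Retrieval-Augmented Generation pipeline, the sketch below looks up glossary entries that occur in a source sentence and places them in a translation prompt. It is a sketch under stated assumptions, not the Mafoko pipeline: the abstract does not describe the retrieval method, prompt format, or model interface, so `retrieve_terms`, `build_prompt`, the prompt wording, and the placeholder glossary entries are all hypothetical.

```python
import re


def retrieve_terms(source: str, glossary: dict[str, str]) -> list[tuple[str, str]]:
    """Return (English term, target term) pairs whose English term occurs in the source.

    Assumption: simple case-insensitive whole-word matching stands in for
    whatever retrieval step a real pipeline would use.
    """
    hits = []
    for term, target in glossary.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, source, flags=re.IGNORECASE):
            hits.append((term, target))
    # List longer (usually more specific) terms first.
    hits.sort(key=lambda pair: len(pair[0]), reverse=True)
    return hits


def build_prompt(source: str, glossary: dict[str, str],
                 target_language: str = "Tshivenda") -> str:
    """Assemble a translation prompt that includes the retrieved terminology."""
    lines = [f"Translate the following English sentence into {target_language}."]
    hits = retrieve_terms(source, glossary)
    if hits:
        lines.append("Use these terminology entries where they apply:")
        lines.extend(f"- {english} -> {target}" for english, target in hits)
    lines.append(f"Sentence: {source}")
    return "\n".join(lines)


# Placeholder glossary: the target-side strings are NOT real Tshivenda terms.
demo_glossary = {
    "budget": "<target term for 'budget'>",
    "value-added tax": "<target term for 'value-added tax'>",
}
print(build_prompt("The Budget raises value-added tax.", demo_glossary))
```

The resulting prompt would then be passed to a large language model; that call is left out because the abstract does not name a model or API.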

Following Digital Footprints: Researching South African Digital Poetry

31 December 2025, 08:00

Contemporary scholarship increasingly recognises the need to document the growing corpus of African literature being produced and distributed via social media and other online platforms. In "African literature and the future", Ogundipe (2015) declared that: In the search for a viable path for the future of African literature, a well-crafted vision of the future and effective strategies to engender transformation are imperative. This raises the practical application of the digital space, the internet and related innovative technology as new paradigms of knowledge to African literary engagement. But the absence of a critical standard remains a bane of this development. To address this critical imperative and further explore the prevalence of such works, I collected a dataset to identify literary trends and key recent examples of significant works, informed by Moretti's scholarship on distant reading. The dataset focuses on poetry written by younger South African authors from the Born Free Generation, in line with my broader research. The main purpose of this paper is to present my findings and the theoretical and methodological framework that informed them. The paper concludes by briefly proposing some possible means of expanding this research, as well as a large-scale online archival project.

Shedding Light on Loadshedding with Natural Language Processing: A social media case study on public perspectives of the South African electricity crisis in 2022

31 December 2025, 08:00

In times of collective discomfort and dissatisfaction, people often find solace in shared adversity on social media platforms like X (formerly known as Twitter). These platforms offer a unique window into the public’s emotions and viewpoints concerning common challenges. In 2022, South Africa experienced an electricity crisis, during which the country was subjected to rolling blackouts, commonly known as load-shedding, by Eskom, the country’s primary electricity provider, to prevent a national electricity grid shutdown. This study conducted a data-driven exploration of the public discourse surrounding Eskom and loadshedding on X using natural language processing and data science techniques. The dataset utilised for this study comprised tweets containing keywords related to Eskom and loadshedding. The study delved into the topics of discussion by applying topic modelling techniques to uncover latent themes within the discourse. The topics were analysed through a multifaceted lens to unpack and highlight patterns within the sentiments, emotions and biases that underpin conversations related to loadshedding and Eskom. A notable inclusion in the analysis was the incorporation of sarcasm classifications, which enhanced the interpretation of the emotion and sentiment within the topics discussed. The findings uncovered from the analysis were contrasted with loadshedding-related events in 2022 to understand the public discourse as the electricity crisis escalated. The methodology of this study provides a framework for utilising natural language processing techniques to uncover and examine the perspectives of a collective within discourse related to events of shared interest.

Voicing in Ngamambo

25 January 2023, 08:00

This paper describes voicing in Ngamambo, a semi Grassfields Bantu language in the North West Region of Cameroon. The language is classified under the Momo sub-language family (Eberhard, David M., Gary F. Simons and Charles D. Fennig, 2020). Ngamambo is unwritten, and research on the language is scanty. The only available literature on the language is by Asongwed & Hyman (1976), Achiri-Taboh (2014) and Lem Atanga (2020). However, there has been a recent attempt by the Mbu Language Committee (MLC) to study the language. Interest in the study of Ngamambo stems from the imperative of undertaking a comprehensive description of the language. Preliminary research has revealed the existence of voicing in the language. Voicing is a process whereby the pronunciation of a word is influenced by one of the sounds. Data were obtained from Ngamambo native speakers (informants) over six months. The originality of this study resides in the fact that very little research has been carried out on the language. The authors of this paper discuss one aspect of the language and hope that subsequent studies will determine whether voicing is also present in other Grassfields languages, especially those of the Momo sub-language family. The phonological process of voicing in Ngamambo has been observed when a voiceless sound becomes voiced depending on the environment. It is hoped that understanding this phenomenon will lead to a better understanding of voicing as it relates to language learning.

Re-discovering Narratives of South African Defence Force Servicemen through the Informal Digital Archive

19 February 2024, 08:00

The South African Defence Force (SADF) has become a point of contention in post-Apartheid South African public memory. From 1966 until 1989, approximately 600,000 white males were conscripted into the SADF to fight in conflicts around Southern Africa as well as at home in South Africa. While much academic work has been done on the SADF during the latter half of the Apartheid era, it is filled with rampant apologia relating to the actions of the SADF as well as the narratives of those who served within it. The stories and experiences of SADF conscripts and soldiers have essentially been ‘sanitized’ by academics and authors attempting to make them suitable for the post-Apartheid era. Yet, on internet forums and social media websites, many SADF veterans have found a space to express their narratives freely without the input of reconciliation-conscious reviewers. In this informal digital space, a plethora of material has been deposited by these ex-servicemen, which now serves as a digital archive from which researchers can gain valuable insight into the actions and experiences of SADF veterans. The unfiltered narratives found in this informal digital archive shine new light on the current academic understanding of the SADF. Instead of the narrative pushed by many academics and authors of young men filled with remorse for fighting a war they understood little about, this material tells a different story. White supremacism, braggadocio and light-hearted discussion of war crimes committed by the SADF fill these digital spaces, creating a counter-narrative to the apologetic stance of many historians and sociologists who have written extensively on the Border War. This paper will explore some of these informal digital archives and seek to answer not only why SADF veterans feel comfortable expressing their narratives freely in the digital space, but also why they have been largely ignored by mainstream academia.

The Possibility of Using African Languages as Media of Teaching and Learning in South Africa

This study sets out to examine the possibility of using African languages as media of teaching and learning in South African schools. The literature is consistent that (a) language is a crucial means of communication and of gaining access to important knowledge and skills, and (b) the mother tongue is the only language that promotes effective teaching and learning, and any language that is not a mother tongue is a barrier to teaching and learning. In South Africa, there are nine African official languages, but English is the medium of instruction used by South African learners, which is a barrier to teaching and learning. This study revealed that using one or two African languages may improve teaching, learning, and the academic performance of the learners, but the problem is how to implement this, because it will be difficult to use many African languages as media of instruction. The use of nine African languages as media of instruction in South Africa will promote tribalism, which was dominant during the apartheid era, and it will be costly to the government. Therefore, this study supports the use of English as a medium of instruction because it will promote unity in South Africa, it will not be costly, and it is an international language.

Investigating the Role of Digital Arts in Decolonizing Knowledge and Promoting Indigenous Standpoints

19 February 2024, 08:00

Preliminary studies indicate that, before colonial inception, African educational systems reflected their socio-cultural being and fitted the moral, economic and physical development of their generations. Marker (2011) noted that education is one of the significant tools for colonial exploitation in Africa. Even in this post-colonial era, the contemporary African education or knowledge system is predominantly centered on foreign educational structures and standpoints. This undermines or alters the focus of African belief systems and culture. Africans must preserve and promote their traditional knowledge-based system regardless of its co-existence with foreign education in order to sustain and restore their self-respect and total emancipation. In order to elevate the rich cultural heritage of Africans and to promote the indigenous perspective, there must be a paradigm shift from foreign epistemologies to a decolonized knowledge-based system. Decolonizing knowledge is an effort to theorize one's traditional knowledge system and entrench it in the imposed foreign epistemological theories and interpretations in order to promote indigenous standpoints. According to Dreyer (2017), it seeks to construct and legitimize other knowledge systems by exploring alternate epistemologies, ontologies, and methodologies. The purpose of this paper is to explore the role of visual narratives/digital storytelling within Digital Arts in decolonizing knowledge and promoting indigenous African cultures and viewpoints. An exploratory research approach through a narrative literature review was used to arrive at scholarly suggestions from the stance of digital arts researchers. Additionally, an oral interview was conducted to seek views from Digital Arts professionals and researchers.

Exploring ASR fine-tuning on limited domain-specific data for low-resource languages

19 February 2024, 08:00

The majority of South Africa’s eleven official languages are low-resourced, posing a major challenge to Automatic Speech Recognition (ASR) development. Modern ASR systems require an extensive amount of data, which is extremely difficult to find for low-resourced languages. In addition, available speech and text corpora for these languages predominantly revolve around government, political and biblical content. Consequently, ASR systems developed for these languages struggle to perform well, especially when evaluated on data outside of these domains. To alleviate this problem, the Icefall Kaldi II toolkit introduced new transformer model scripts, facilitating the adaptation of pre-trained models using limited adaptation data. In this paper, we explored the technique of taking ASR models pre-trained in a domain where more data is available (government data) and adapting them to an entirely different domain with limited data (broadcast news data). The objective was to assess whether such techniques can surpass the accuracy of prior ASR models developed for these languages. Our results showed that the Conformer connectionist temporal classification (CTC) model obtained lower word error rates by a large margin in comparison to previous TDNN-F models evaluated on the same datasets. This research signifies a step forward in mitigating data scarcity challenges and enhancing ASR performance for low-resourced languages in South Africa.
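
For readers unfamiliar with the metric, word error rate (WER) is conventionally computed as the number of word substitutions, deletions and insertions needed to turn the system output into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal standard-library illustration of that definition. It is not the scoring script used in the paper, and the function name and example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level edit distance, keeping only the previous row of the table.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1] / len(ref)


# Hypothetical transcripts: one of four reference words is misrecognised.
print(word_error_rate("a b c d", "a x c d"))  # 0.25
```

Because insertions are counted against the reference length, WER can exceed 1.0 when the hypothesis is much longer than the reference.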

Digital Archival Preservation and Cultural Heritage

19 February 2024, 08:00

This paper presents a current MA study that addresses the research problem, "What issues and insights about the role of digital archives in the preservation of South African cultural history are raised via the production of an archival documentary and archival website on the life and art of the late sculptor Mr. Bonginkosi Michael Gasa?". This study hopes to show, through the presentation of research and archival material curated thus far, not only the importance of the role archival documentary film and the digital platform play in the preservation of heritage but also how this archival project promotes the idea of an African gaze, which is essential for preserving an authentic cultural voice and heritage. This study follows a practice-led approach, meaning the research primarily leads to new information about the practice. In this case, the practice will investigate, first, the key elements that go into the production of an archival documentary and, second, the digital archiving of the project online. Mr. Bonginkosi Michael Gasa was a sculptor who passed away on the 18th of April 2019 at the age of 55. The film about Mr. Gasa will be reported in a critical reflexive MA dissertation, which will also serve to elucidate the critical, theoretical, and cultural matrix from which the film emerges. The documentary film will be preserved on a website, which will also serve as an online repository, curation, and record of Mr. Gasa’s sculptures. In detailing the study thus far, this paper aims to highlight the potential of digital archives to preserve marginalized voices, such as that of Bonginkosi Michael Gasa, whose life and work would otherwise remain largely unknown. Moreover, this paper and study hope to show that archives exist to be used by present and future generations and, in this way, to preserve our national heritage.
