
JDHASA (Journal of the Digital Humanities Association of Southern Africa)

Preface to the Proceedings of RAIL 2025

The sixth workshop on Resources for African Indigenous Languages (RAIL) was held on 10 November 2025 at the CSIR International Convention Centre in Pretoria, South Africa. It was co-located with the Digital Humanities Association of Southern Africa (DHASA) 2025 conference, which took place from 11 to 14 November 2025.

Traditional Readability Approaches in Sesotho and isiZulu

31 December 2025, 08:00

This paper presents a conceptual overview of traditional readability metrics adapted for two South African Indigenous languages, isiZulu and Sesotho, which differ orthographically with conjunctive and disjunctive writing systems, respectively. Both languages are low-resource, lacking extensive corpora, lexicons, and pretrained models necessary for automatic readability assessment. By critically examining these adaptations, we highlight the challenges of applying English-based metrics to morphologically complex African languages and emphasise the need for language-specific digital resources that reflect local linguistic structures. Our work aligns with ongoing efforts to develop and enhance language resources for under-resourced African Indigenous languages, thereby supporting their evolving presence and accessibility in the digital age, including contexts shaped by large language models.
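To see why orthography matters for traditional readability formulas, it may help to look at the surface features they usually rely on: sentence length in words and word length in characters. The sketch below is illustrative only and is not taken from the paper. The function name `surface_features` is hypothetical, and the two example sentences (both roughly "I love you") were chosen here as an assumption to contrast conjunctive isiZulu with disjunctive Sesotho.

```python
import re


def surface_features(text: str) -> dict:
    """Compute two surface features used by many traditional readability formulas."""
    # Split on sentence-final punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # A "word" here is any run of Unicode letters separated by non-letters.
    words = re.findall(r"[^\W\d_]+", text)
    if not sentences or not words:
        return {"words_per_sentence": 0.0, "chars_per_word": 0.0}
    return {
        "words_per_sentence": len(words) / len(sentences),
        "chars_per_word": sum(len(w) for w in words) / len(words),
    }


# Illustrative sentences (an assumption of this sketch, not data from the paper).
zulu = surface_features("Ngiyakuthanda.")   # conjunctive: one long orthographic word
sotho = surface_features("Ke a o rata.")    # disjunctive: several short words
print(zulu)
print(sotho)
```

The same meaning yields one 13-character word in one orthography and four short words in the other, so a formula calibrated on English word and sentence lengths would rate the two texts very differently.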

An exploration of the computational identification of English loan words in Sesotho

31 December 2025, 08:00

South Africa, with its twelve official languages, is an inherently multilingual country. As such, speakers of many of the languages have been in direct contact. This has led to a cross-over of words and phrases between languages. In this article, we provide a methodology to identify words that are (potentially) borrowed from another language. We test our approach by trying to identify words that moved from English into Sesotho (or potentially the other way around). To do this, we start with a bilingual Sesotho-English dictionary (Bukantswe). We then develop a lexicographic comparison method that takes a pair of lexical items (English and Sesotho) and computes a range of distance metrics. These distance metrics are applied to the raw words (i.e., comparing orthography), but by using the Soundex algorithm, an approximate phonological comparison can be made as well. Unfortunately, Bukantswe does not contain complete annotation of loan words, so a quantitative evaluation is not currently possible. We provide a qualitative analysis of the results, which shows that many loan words can be found, but in some cases lexical items that have a high similarity are not loan words. We discuss different situations related to the influence of orthography, phonology, syllable structure, and morphology. The approach itself is language independent, so it can also be applied to other language pairs, e.g., Afrikaans and Sesotho, or more closely related languages, such as isiXhosa and isiZulu.
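The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes Levenshtein edit distance as one possible orthographic metric (the abstract only says "a range of distance metrics") and uses the standard American Soundex coding. The word pair `book` / `buka` is an illustrative assumption and is not drawn from Bukantswe.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


# Letter-to-digit table for American Soundex.
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word: str) -> str:
    """Four-character American Soundex code, e.g. 'Robert' -> 'R163'."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        # 'h' and 'w' do not separate letters that share a code; vowels do.
        if ch not in "hw":
            prev = digit
    return (code + "000")[:4]


def compare(english: str, sesotho: str) -> dict:
    """Orthographic and approximate phonological comparison of one word pair."""
    return {
        "edit_distance": levenshtein(english.lower(), sesotho.lower()),
        "same_soundex": soundex(english) == soundex(sesotho),
    }


print(compare("book", "buka"))  # illustrative pair, not from Bukantswe
```

A pair that is far apart orthographically but shares a Soundex code is a candidate for manual inspection. As the abstract notes, high similarity alone does not establish that a word is a loan.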

Mafoko: Structuring and Building Open Multilingual Terminologies for South African NLP

The critical lack of structured terminological data for South Africa’s official languages hampers progress in multilingual NLP, despite the existence of numerous government and academic terminology lists. These valuable assets remain fragmented and locked in non-machine-readable formats, rendering them unusable for computational research and development. Mafoko addresses this challenge by systematically aggregating, cleaning, and standardising these scattered resources into open, interoperable datasets. We introduce the foundational Mafoko dataset, released under the equitable, Africa-centered NOODL framework. To demonstrate its immediate utility, we integrate the terminology into a Retrieval-Augmented Generation (RAG) pipeline. Experiments show substantial improvements in the accuracy and domain-specific consistency of English-to-Tshivenda machine translation for large language models. Mafoko provides a scalable foundation for developing robust and equitable NLP technologies, ensuring South Africa’s rich linguistic diversity is represented in the digital age.
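As a rough illustration of how a terminology list might feed a Retrieval-Augmented Generation prompt, the sketch below looks up glossary entries whose English term appears in the source sentence and prepends them to a translation instruction. Everything here is an assumption for illustration: the `GLOSSARY` structure, the function names, and the prompt wording are hypothetical, the Tshivenda side uses placeholders rather than real terms, and no actual Mafoko data or language model call is involved.

```python
import re

# Hypothetical glossary; the "ve" values are placeholders, not real Tshivenda terms.
GLOSSARY = [
    {"en": "inflation", "ve": "<VE_TERM_INFLATION>"},
    {"en": "interest rate", "ve": "<VE_TERM_INTEREST_RATE>"},
    {"en": "budget", "ve": "<VE_TERM_BUDGET>"},
]


def retrieve_terms(sentence: str, glossary: list) -> list:
    """Return glossary entries whose English term occurs as a whole word or phrase."""
    lowered = sentence.lower()
    hits = []
    for entry in glossary:
        pattern = r"\b" + re.escape(entry["en"].lower()) + r"\b"
        if re.search(pattern, lowered):
            hits.append(entry)
    return hits


def build_prompt(sentence: str, glossary: list) -> str:
    """Assemble a translation prompt augmented with the retrieved terminology."""
    hits = retrieve_terms(sentence, glossary)
    lines = ["Translate the following English sentence into Tshivenda."]
    if hits:
        lines.append("Use these terminology equivalents where they apply:")
        lines.extend(f"- {h['en']} -> {h['ve']}" for h in hits)
    lines.append(f"Sentence: {sentence}")
    return "\n".join(lines)


print(build_prompt("The interest rate affects inflation.", GLOSSARY))
```

Exact string matching is the simplest possible retriever. A real pipeline would likely need lemmatisation or embedding-based retrieval to handle inflected forms.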

Multilingual vibes: Visualising linguistic resources and emoji in Southern African online discourse


This article presents Vibes, a prototype interface for visualising multilingual online discourse in Southern Africa. We developed the prototype during a three-day hackathon with a multidisciplinary team. The interface combines computational tools, manual coding and visualisation methods to work with data that standard NLP tools cannot process due to their monolingual design. We tested Vibes on two YouTube datasets: English/isiXhosa comments from the @cmtvsa channel and comments on videos discussing a hair product advertisement controversy. Through this work, we encountered practical challenges, including language identification failures, code-switching within single posts, non-standard orthographies, and multimodal communication through emojis. The challenges led us to propose an interface for collaborative coding that accounts for translanguaging practices. The hackathon development process highlighted the need for context-sensitive tools to study linguistic diversity in the Global South.

Exploring African Digital Humanities Using the Journal of the Digital Humanities Association of Southern Africa

31 December 2025, 08:00

Digital Humanities scholarship is often framed through paradigms developed in the Global North, leaving African-specific practices and epistemologies underexplored. In this article, I use topic modelling and lexical analysis to investigate what constitutes African DH by analysing 41 Southern African DH articles. The findings indicate that the majority of publications in JDHASA engage deeply with language-related topics. The field combines advanced computational methods with a strong grounding in local languages, cultural heritage, and socio-historical realities. It also reflects responsiveness to evolving digital social realities, addressing themes such as online harm, misinformation, and affective communities. This article contributes to the theorisation of African DH by identifying thematic tendencies and methodological patterns specific to the Southern African context. It highlights the dual focus on computational innovation and cultural rootedness, offering an empirically grounded foundation for further critical engagement with what African DH is and what it can become.

A deep mapping prospect:

8 October 2024, 08:00

Urbanisation in South Africa is expanding at an alarming rate. With the ever-expanding growth of urban areas, it is essential to understand how urban sites can function as greenspaces that provide wildlife habitats, biodiversity hotspots, as well as act as movement corridors for birds and mammals. In this study, I investigate the grounds of the Fort England Psychiatric Hospital as a small-scale greenspace. The hospital is located in Makhanda (formerly Grahamstown), in the Eastern Cape Province of South Africa. I propose an investigation of the site that is informed by a branch of digital environmental humanities that uses digital technologies to mobilise collective action in the conservation and stewardship of a greenspace. To this end, the article calls for a deep mapping of the site to achieve a twofold research objective. First, to study the biodiversity of the site and explore the possibility that it could offer a refuge for threatened species. Second, to call for an inclusive management plan for the greening and conservation of the site. To substantiate, the site is of importance for human use, environmental history, as well as for fauna and flora species. Accordingly, the site’s management plan must engage, accommodate and negotiate a diverse set of interests, as well as mobilise action from various members of the community.

Voicing in Ngamambo

25 January 2023, 08:00

This paper describes voicing in Ngamambo, a semi-Grassfields Bantu language spoken in the North West Region of Cameroon. The language is classified under the Momo sub-language family (Eberhard, David M., Gary F. Simons and Charles D. Fennig, 2020). Ngamambo is unwritten, and research on the language is scanty. The only available literature on the language is by Asongwed & Hyman (1976), Achiri-Taboh (2014) and Lem Atanga (2020). However, there has been a recent attempt by the Mbu Language Committee (MLC) to study the language. Interest in the study of Ngamambo stems from the imperative of undertaking a comprehensive description of the language. Preliminary research has revealed the existence of voicing in the language. Voicing is a process whereby the pronunciation of a word is influenced by one of the sounds. Data were obtained from Ngamambo native speakers (informants) over six months. The originality of this study resides in the fact that very little research has been carried out on the language. The authors of this paper discuss one aspect of the language and hope that subsequent studies will determine whether voicing is also present in other Grassfields languages, especially those of the Momo sub-language family. The phonological process of voicing in Ngamambo has been observed when a voiceless sound becomes voiced depending on its environment. It is hoped that understanding this phenomenon will lead to a better understanding of voicing in relation to language learning.

Introduction to the Special Issue: “Digital Humanities for Inclusion”

19 February 2024, 08:00

It is with immense pride and anticipation that we introduce the fifth volume of the Journal of Digital Humanities Association of Southern Africa (JDHASA), centred on the theme “Digital Humanities for Inclusion.”

Advocating for the Digitization of the History of China-Africa Diplomatic Relations

19 February 2024, 08:00

This article examines research on the history of the establishment of diplomatic relations between China and African countries and its intersection with historical digitization, focusing on the combination of oral history and the digital preservation of historical documents. After underscoring the significance of this history, the study reviews the state of historical digitization research in Chinese and African academia, along with each side’s current scholarship on the other’s history and on Sino-African relations, thereby providing a technical and scholarly foundation for the research. The article then considers the prospects and challenges of digitizing the history of the establishment of diplomatic relations between China and Africa. It concludes with recommendations for advancing this digitization at two levels: the official multilateral cooperation mechanism and collaboration among private academic institutions.

Empathic Engagement and Aesthetic Appreciation Between Readers’ Ethnicity and Narratives’ Literary Prestige

19 February 2024, 08:00

Scholars of postcolonial studies have highlighted the role played by identity features in both the production and the reception of literary works. In this paper, we apply computational methods to a corpus of reviews of South-African post-colonial novels, downloaded from the Goodreads platform, in order to assess the influence of sociocultural and intersectional factors on the level of appreciation and identification potential of narratives. In particular, we investigate the effect, on the one hand, of the reader’s ethnicity and, on the other, of the work’s literary prestige on the appreciation and the empathic transportation elicited by narratives in the reader. To operationalize our hypotheses, we collected information on the reviewers’ country of provenance (self-declared by Goodreads users) and on the book’s critical appreciation (via either the award of or the nomination for a literary prize). Such information was compared with: (a) Goodreads star rating scores, indicative of success in the online reading community; (b) usage of empathy lexicon (identified via the Linguistic Inquiry and Word Count tool – in short LIWC), indicative of the reader’s identification in the narrative. Results indicate that readers typically empathize more with works that reflect themes from their own country and tend to award them with slightly higher ratings. Furthermore, we found that critically appreciated books, though collecting higher ratings, elicit a smaller empathic response in the reader than those that did not win or were not nominated for any literary prize.

The Role of Social Media in Xenophobic Attack in South Africa

19 February 2024, 08:00

Xenophobia is a pressing issue in South Africa, with frequent instances of violence against immigrants. With the rise of social media, platforms like Twitter reflect public sentiment on this matter. This study examines tweets from 2017 to 2022 about xenophobia in South Africa, using NLP, sentiment analysis, and machine learning to understand public feelings and predict potential xenophobic incidents. The findings aim to help policymakers devise strategies to enhance social cohesion and promote a more inclusive society.

Investigating the Role of Digital Arts in Decolonizing Knowledge and Promoting Indigenous Standpoints

19 February 2024, 08:00

Preliminary studies indicate that, before colonization, African educational systems reflected their socio-cultural context and fitted the moral, economic and physical development of each generation. Marker (2011) noted that education was one of the significant tools of colonial exploitation in Africa. Even in this post-colonial era, the contemporary African education or knowledge system is predominantly centered on foreign educational structures and standpoints. This undermines or alters the focus of African belief systems and culture. Africans must preserve and promote their traditional knowledge system, regardless of its co-existence with foreign education, in order to sustain and restore their self-respect and achieve total emancipation. To elevate the rich cultural heritage of Africans and to promote the indigenous perspective, there must be a paradigm shift from foreign epistemologies to a decolonized knowledge system. Decolonizing knowledge is an effort to theorize one’s traditional knowledge system and embed it within the imposed foreign epistemological theories and interpretations in order to promote indigenous standpoints. According to Dreyer (2017), it seeks to construct and legitimize other knowledge systems by exploring alternate epistemologies, ontologies, and methodologies. The purpose of this paper is to explore the role of visual narratives and digital storytelling within Digital Arts in decolonizing knowledge and promoting indigenous African cultures and viewpoints. An exploratory research approach based on a narrative literature review was used to derive scholarly suggestions from the stance of digital arts researchers. Additionally, an oral interview was conducted to seek views from Digital Arts professionals and researchers.

Developing a Code-Mixed Sentiment Analysis Dataset of Xitsonga-English Music Reviews

19 February 2024, 08:00

Sentiment analysis is the process of classifying the sentiment of a text as positive, negative or neutral. Code-mixed sentiment analysis refers to classifying the sentiment of text that contains two or more languages. Few studies have addressed sentiment analysis for South African code-mixed languages, owing to the absence of annotated datasets. The purpose of the study was to collect code-mixed text data for the Xitsonga-English language pair. The study collected Xitsonga-English code-mixed comments on music reviews from a YouTube channel. After the data was collected, it was tokenized using the Python library Natural Language Toolkit (NLTK). Subsequently, we analyzed the comments for the presence of code-mixing. The collected Xitsonga-English code-mixed data would be suitable for building a sentiment analysis model.
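The tokenize-then-check step can be sketched as below. This is only an illustration under stated assumptions: the study used NLTK for tokenization, whereas this sketch swaps in a standard-library regular expression so that it runs without extra installs, and the abstract does not say how code-mixing was detected, so the wordlist lookup is a stand-in. The two tiny wordlists are hypothetical toy examples, not the study's resources.

```python
import re

# Toy wordlists (hypothetical and illustrative; real work would need proper lexicons).
XITSONGA_WORDS = {"ndza", "rhandza", "risimu"}
ENGLISH_WORDS = {"i", "love", "this", "song", "so", "much", "the", "beat", "is", "nice"}


def tokenize(text: str) -> list:
    """Lowercase the comment and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def language_tags(tokens: list) -> list:
    """Tag each token as Xitsonga ('ts'), English ('en') or unknown ('unk')."""
    tags = []
    for tok in tokens:
        if tok in XITSONGA_WORDS:
            tags.append("ts")
        elif tok in ENGLISH_WORDS:
            tags.append("en")
        else:
            tags.append("unk")
    return tags


def is_code_mixed(text: str) -> bool:
    """A comment counts as code-mixed if it has at least one token from each language."""
    tags = set(language_tags(tokenize(text)))
    return {"ts", "en"} <= tags


print(is_code_mixed("Ndza rhandza this song so much"))
print(is_code_mixed("I love this song"))
```

Wordlist lookup fails on unseen words and non-standard spellings, which are common in YouTube comments, so manual review of the flagged comments would still be needed.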

Digital Archival Preservation and Cultural Heritage

19 February 2024, 08:00

This paper presents a current MA study that addresses the research problem, "What issues and insights about the role of digital archives in the preservation of South African cultural history are raised via the production of an archival documentary and archival website on the life and art of the late sculptor Mr. Bonginkosi Michael Gasa?". This study hopes to show, through the presentation of research and archival material curated thus far, not only the importance of the role archival documentary film and the digital platform play in the preservation of heritage but also how this archival project promotes the idea of an African gaze, which is essential for preserving an authentic cultural voice and heritage. The study follows a practice-led approach, meaning that the research primarily leads to new information about the practice. In this case, the practice will investigate, first, the key elements that go into the production of an archival documentary and, secondly, the digital archiving of the project online. Mr. Bonginkosi Michael Gasa was a sculptor who passed away on the 18th of April 2019 at the age of 55. The film about Mr. Gasa will be reported in a critical reflexive MA dissertation, which will also serve to elucidate the critical, theoretical, and cultural matrix from which the film emerges. The documentary film will be preserved on a website, which will also serve as an online repository, curation, and record of Mr. Gasa’s sculptures. In detailing the study thus far, this paper aims to highlight the potential of digital archives to preserve marginalized voices, such as that of Bonginkosi Michael Gasa, whose life and work would otherwise remain largely unknown. Moreover, this paper and study hope to show that archives exist to be used by present and future generations, and in this way, to preserve our national heritage.

Unmasking Deception: An Exploratory Study of Viewers’ Attitudes Towards Romantic Betrayal

19 February 2024, 08:00

Although romantic deception is prevalent in many societies, it may not be readily acceptable to publicly acknowledge approval of acts associated with such deception. This article explores the publicly acknowledged sentiments of viewers of two YouTube channels that aim to expose romantic deceit through shows built around a “couple switching phones” game. Specifically, we analyse videos where all participants are caught engaging in extra-relationship affairs. Our study reveals a prevailing trend of neutral comments from viewers, indicating a reluctance to openly acknowledge approval or disapproval of the depicted acts. Interestingly, the discussions primarily revolve around tribal issues (specifically focused on the Xhosa tribe) rather than the subject of romantic deception itself.

Harnessing Google Translations to Develop a Readability Corpus for Sesotho: An Exploratory Study

19 February 2024, 08:00

This article addresses the scarcity of gold-standard annotated corpora for readability assessment in Sesotho, a low-resource language. As a solution, we propose using translated texts to construct a readability-labelled corpus. Specifically, we investigate the feasibility of using Google Translate to translate texts from Sesotho to English and then manually post-editing the texts. We then evaluate the effectiveness of the Google translations by comparing them to the human-post-edited versions. We utilised the Ghent University readability demo to extract the readability levels of both the Google translations and the human-post-edited translations. The translations are then evaluated using three evaluation metrics, namely, BLEU, NIST, and RIBES scores. The translation evaluation results reveal substantial similarities between the machine translations and the corresponding human-post-edited texts. Moreover, the results of the readability assessment and the comparison of text properties demonstrate a high level of consistency between machine translations and human-post-edited texts. These findings suggest that Google Translations show promise in addressing challenges in developing readability-labelled parallel datasets in low-resource languages like Sesotho, highlighting the potential of leveraging machine translation techniques to develop translated corpora for such languages. The evaluation of Google Translations in the context of educational texts in Sesotho and the demonstration of the feasibility and potential of using machine translations for enhancing readability in Sesotho will aid in the quest for developing Sesotho text readability measures.
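Metrics such as BLEU compare a machine translation with a reference (here, the human-post-edited text) by counting shared n-grams. The sketch below shows only the clipped n-gram precision at the core of BLEU. It is not full BLEU (there is no brevity penalty or averaging over n-gram orders) and not the evaluation setup used in the study. The function names and the English example sentences are illustrative assumptions.

```python
from collections import Counter


def ngrams(tokens: list, n: int) -> list:
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate: str, reference: str, n: int) -> float:
    """Fraction of candidate n-grams also found in the reference.

    Each n-gram's count is clipped to its count in the reference, so
    repeating a matching word does not inflate the score.
    """
    cand = Counter(ngrams(candidate.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


machine = "the cat sat on the mat"       # stands in for a machine translation
post_edited = "the cat sat on a mat"     # stands in for the post-edited version
print(clipped_precision(machine, post_edited, 1))
print(clipped_precision(machine, post_edited, 2))
```

High precision between raw and post-edited output indicates that the post-editor changed little, which is the kind of similarity the abstract reports.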
