
Preface to the Proceedings of RAIL 2025

The sixth workshop on Resources for African Indigenous Languages (RAIL) was held on 10 November 2025 at the CSIR International Convention Centre in Pretoria, South Africa. It was co-located with the Digital Humanities Association of Southern Africa (DHASA) 2025 conference, which took place from 11 to 14 November 2025.


Creating Bilingual Corpora for isiZulu: A Case Study from the University of KwaZulu-Natal

Although several bilingual resources exist, there is a lack of domain-specific, institutionally verified parallel corpora focusing on academic and administrative texts. Existing datasets such as the Autshumato English–isiZulu corpus, the UNISA English/Zulu Parallel Corpus, and the WebCrawl African Corpus hosted on GitHub provide valuable material but differ in accessibility, domain coverage, and documentation. To complement these initiatives, the University Language Planning and Development Office (ULPDO) at the University of KwaZulu-Natal has developed a curated isiZulu–English Parallel Corpus comprising 10,000 carefully aligned sentence pairs drawn from institutional and academic texts. This paper outlines the corpus compilation process, including data sourcing, cleaning, alignment, and validation, and discusses the key structural and linguistic challenges encountered. The resource contributes to translation studies, terminology development, and multilingual natural language processing, while supporting ongoing efforts to advance the digital presence and intellectualisation of isiZulu.
