
    Phonemic Representation and Transcription for Speech to Text Applications for Under-resourced Indigenous African Languages: The Case of Kiswahili

    View/Open
    2210.16537.pdf (1.109Mb)
    Publication Date
    2022-10-29
    Author
    Awino, Ebbie
    Wanzare, Lilian
    Muchemi, Lawrence
    Wanjawa, Barack
    Ombui, Edward
    Indede, Florence
    McOnyango, Owen
    Okal, Benard
    Abstract/Overview
    Building automatic speech recognition (ASR) systems is a challenging task, especially for under-resourced languages, which lack sufficient training data and require corpora to be constructed nearly from scratch. Several indigenous African languages, including Kiswahili, are technologically under-resourced. ASR systems are crucial, particularly for hearing-impaired persons, who can benefit from having transcripts in their native languages. However, the absence of transcribed speech datasets has complicated efforts to develop ASR models for these indigenous languages. This paper explores the transcription process and the development of a Kiswahili speech corpus, which includes both read-out texts and spontaneous speech data from native Kiswahili speakers. The study also discusses the vowels and consonants of Kiswahili and provides an updated Kiswahili phoneme dictionary for the ASR model, which was created using CMU Sphinx, an open-source speech recognition toolkit. The ASR model was trained using an extended phonetic set and yielded a word error rate (WER) of 18.87% and a sentence error rate (SER) of 49.5%, an improvement over previous similar research for under-resourced languages.
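The two metrics reported in the abstract can be illustrated with a minimal sketch. It assumes the usual definitions: WER is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, and SER is the fraction of sentences containing at least one error. The function names and the two Kiswahili example sentences are hypothetical and are not taken from the paper or its corpus; the scoring used by the authors' toolkit may differ in details such as text normalisation.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference tokens
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer_and_ser(pairs: List[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (WER, SER) for a list of (reference, hypothesis) transcripts."""
    total_words = 0
    total_errors = 0
    wrong_sentences = 0
    for ref, hyp in pairs:
        ref_tokens, hyp_tokens = ref.split(), hyp.split()
        errors = edit_distance(ref_tokens, hyp_tokens)
        total_errors += errors
        total_words += len(ref_tokens)
        if errors > 0:
            wrong_sentences += 1
    if not pairs or total_words == 0:
        raise ValueError("need at least one non-empty reference transcript")
    return total_errors / total_words, wrong_sentences / len(pairs)


# Illustrative data only (hypothetical, not from the paper's corpus).
EXAMPLE_PAIRS = [
    ("habari ya asubuhi", "habari ya asubuhi"),              # no errors
    ("ninapenda kusoma vitabu", "ninapenda kusoma kitabu"),  # one substitution
]

if __name__ == "__main__":
    wer, ser = wer_and_ser(EXAMPLE_PAIRS)
    print(f"WER = {wer:.2%}, SER = {ser:.2%}")
```

Because a single wrong word marks the whole sentence as incorrect, SER is normally much higher than WER, which is consistent with the gap between the 18.87% and 49.5% figures reported above.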
    Permalink
    https://repository.maseno.ac.ke/handle/123456789/6044
    Collections
    • Department of Linguistics [77]

    Maseno University. All rights reserved | Copyright © 2022 
    Contact Us | Send Feedback

     

     
