Speech to Text by Using the Sindhi Language

  • Naadiya Khuda Bux
  • Ambreen Khan
  • Khuda Bakhsh
Keywords: Speech to Text, Sindhi Natural Language Processing, Native Language, Convolutional Neural Network.

Abstract

We live in an era of rapidly advancing technology. In Pakistan, and especially in Sindh, many people prefer speaking to writing. Many are not well acquainted with computing or with global languages, so they face considerable difficulty typing text and then converting it into Sindhi. This is especially true in offices and organizations where Sindhi is the first language for both speaking and typing: drafting long documents there consumes too much time, and users struggle with problems such as finding the correct spelling of words. People with medical conditions and disabilities would also benefit from such a tool. This project aims to develop a web-based application that addresses these difficulties and overcomes the disadvantages of other available applications. The application is generic in design, meaning it could be adapted to serve speakers of a particular regional language in any country. The main objective of the work presented in this report is to develop an open, enterprise-level platform for the nation. We use a Convolutional Neural Network and an API in the development phase. Advanced Python libraries detect the user's speech, which is then converted into text.

Published
2022-08-29
How to Cite
Bux, N., Khan, A., & Bakhsh, K. (2022). Speech to Text by Using the Sindhi Language. International Journal of Artificial Intelligence & Mathematical Sciences, 1(1), 37-46. https://doi.org/10.58921/ijaims.v1i1.21