Thanks to advancements in speech and natural language processing, there is hope that one day you may be able to ask your virtual assistant what the best salad ingredients are. Currently, it is possible to ask your home gadget to play music or to open on voice command, a feature already found in many devices.

If you speak Moroccan, Algerian, Egyptian, Sudanese, or any of the other dialects of the Arabic language, which vary immensely from region to region and some of which are mutually unintelligible, it is a different story. If your native tongue is Arabic, Finnish, Mongolian, Navajo, or any other language with a high level of morphological complexity, you may feel left out.

These complex constructs intrigued Ahmed Ali and drove him to look for a solution. He is a principal engineer in the Arabic Language Technologies group at the Qatar Computing Research Institute (QCRI), part of Qatar Foundation’s Hamad Bin Khalifa University, and the founder of ArabicSpeech, a “community that exists for the benefit of Arabic speech science and speech technologies.”

Qatar Foundation Headquarters

Ali became captivated by the idea of talking to cars, appliances, and gadgets many years ago while at IBM. “Can we build a machine capable of understanding different dialects—an Egyptian pediatrician to automate a prescription, a Syrian teacher to help children getting the core parts from their lesson, or a Moroccan chef describing the best couscous recipe?” he states. However, the algorithms that power those machines cannot sift through the approximately 30 varieties of Arabic, let alone make sense of them. Today, most speech recognition tools function only in English and a handful of other languages.

The coronavirus pandemic has further fueled an already intensifying reliance on voice technologies, as natural language processing technologies have helped people comply with stay-at-home guidelines and physical distancing measures. However, while we have been using voice commands to aid in e-commerce purchases and manage our households, the future holds yet more applications.

Millions of people worldwide use massive open online courses (MOOCs) for their open access and unlimited participation. Speech recognition is one of the main features of MOOCs: students can search within specific areas of the spoken course content and enable translations via subtitles. Speech technology also makes it possible to digitize lectures, displaying spoken words as text in university classrooms.

Ahmed Ali, Hamad Bin Khalifa University

According to a recent article in Speech Technology magazine, the voice and speech recognition market is forecast to reach $26.8 billion by 2025, as millions of consumers and companies around the globe come to rely on voice bots not only to interact with their appliances or cars but also to improve customer service, drive health-care innovations, and improve accessibility and inclusivity for those with hearing, speech, or motor impediments.

In a 2019 survey, Capgemini forecast that by 2022, more than two out of three consumers would opt for voice assistants rather than visits to stores or bank branches, a share that could justifiably spike, given the home-based, physically distanced life and commerce that the pandemic has forced upon the world for more than a year and a half.

Nonetheless, these devices fail to deliver to vast swaths of the globe. For those 30 varieties of Arabic and the millions of people who speak them, that is a substantial missed opportunity.

Arabic for machines

English- or French-speaking voice bots are far from perfect. Yet, teaching machines to understand Arabic is particularly tricky for several reasons. These are three commonly recognised challenges:

Lack of diacritics. Arabic dialects are vernacular, that is, primarily spoken. Most of the available text is nondiacritized, meaning it lacks accents such as the acute (´) or


By: Ahmed Ali
Title: Machine learning improves Arabic speech transcription capabilities
Published Date: Wed, 24 Nov 2021 14:20:07 +0000