Iqra'Eval is a shared task aimed at advancing automatic assessment of Qur’anic recitation pronunciation by leveraging computational methods to detect and diagnose pronunciation errors. The focus on Qur’anic recitation provides a standardized and well-defined context for evaluating Modern Standard Arabic (MSA) pronunciation.
Participants will develop systems capable of detecting mispronunciations (e.g., substitution, deletion, or insertion of phonemes).
The task is to design a model that detects mispronunciations in Qur’anic recitation and gives detailed feedback on them. Users read vowelized verses; the model predicts the phoneme sequence actually spoken and flags deviations from the reference. Evaluation uses the QuranMB.v2 dataset, which carries human-annotated errors.
Figure: Overview of the Mispronunciation Detection Workflow
System shows a Reference Verse plus its Reference Phoneme Sequence.
Example:
< i n n a SS A f aa w a l m a r w a t a m i n $ a E a a < i r i l l a h i
User recites; system captures and stores the audio waveform.
Model predicts the phoneme sequence—deviations from reference indicate mispronunciations.
Example of Mispronunciation:
< i n n a SS A f aa w a l m a r w a t a m i n $ a E a a < i r i l l a h i
< i n n a SS A f aa w a l m a r w a t a m i n s a E a a < i r u l l a h i
< i n n a SS A f aa w a l m a r w s a E a a < i r u l l a h i
Here, $ → s and i → u; omission of ta went undetected.
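The comparison between a reference and a predicted phoneme sequence can be sketched as a minimum-edit-distance alignment. This is only an illustration under stated assumptions: the function name `align_phonemes` is hypothetical, unit costs for substitution, deletion, and insertion are an assumption, and the official scoring may align sequences differently. The two sequences are the first and second ones from the example above.

```python
def align_phonemes(ref, hyp):
    """Return the edits that turn ref into hyp as (kind, ref_phoneme, hyp_phoneme)."""
    n, m = len(ref), len(hyp)
    # dist[i][j] = edit distance between ref[:i] and hyp[:j] (unit costs, an assumption)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j - 1] + cost,  # match or substitution
                             dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1)         # insertion
    # Walk back from the bottom-right corner to recover the edits.
    edits = []
    i, j = n, m
    while i > 0 or j > 0:
        cost = 0 if (i > 0 and j > 0 and ref[i - 1] == hyp[j - 1]) else 1
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + cost:
            if cost:
                edits.append(("sub", ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            edits.append(("del", ref[i - 1], None))
            i -= 1
        else:
            edits.append(("ins", None, hyp[j - 1]))
            j -= 1
    edits.reverse()
    return edits


reference = "< i n n a SS A f aa w a l m a r w a t a m i n $ a E a a < i r i l l a h i".split()
predicted = "< i n n a SS A f aa w a l m a r w a t a m i n s a E a a < i r u l l a h i".split()

for kind, ref_ph, hyp_ph in align_phonemes(reference, predicted):
    print(kind, ref_ph, hyp_ph)
```

Each reported edit corresponds to one candidate mispronunciation: a substitution, a deleted reference phoneme, or an inserted phoneme.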
The phoneme set used in this work is based on a specialized phonetizer developed for vowelized MSA by Nawar Halabi. It includes a comprehensive range of 68 phonemes designed to capture key phonetic and prosodic features of Qur’an recitation, such as stress, pausing, intonation, emphaticness, and notably, gemination. Gemination (the doubling of consonant sounds) is explicitly represented by duplicating the consonant symbol (e.g., /b/ becomes /bb/).
While the phonetizer distinguishes vowels following emphatic and non-emphatic consonants, this distinction is merged in our approach to better align with MSA pronunciation norms, where the difference does not affect meaning. This phoneme set provides a detailed yet practical representation of the speech sounds relevant for accurate mispronunciation detection in Qur’anic recitation.
For further details, including the full phoneme inventory, see Phoneme Inventory.
Hosted on Hugging Face:
load_dataset("IqraEval/Iqra_train", split="train")
load_dataset("IqraEval/Iqra_train", split="dev")
Columns:
audio: waveform
sentence: original text (verse)
index: verse ID
tashkeel_sentence: fully diacritized text (verse)
phoneme: phoneme sequence (using phonetizer)
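As a rough sketch of working with the phoneme column, the snippet below counts phoneme symbols across rows, which can help check a system's output vocabulary against the inventory. It assumes the phoneme field is a whitespace-separated string, as in the reference example earlier; the helper name `phoneme_counts` is hypothetical, and the single hand-built row stands in for records returned by load_dataset.

```python
from collections import Counter


def phoneme_counts(rows):
    """Count phoneme symbols over an iterable of rows with a 'phoneme' field."""
    counts = Counter()
    for row in rows:
        # Assumption: phonemes are separated by whitespace.
        counts.update(row["phoneme"].split())
    return counts


# Stand-in row; real rows also carry audio, sentence, index, tashkeel_sentence.
sample_rows = [
    {"phoneme": "< i n n a SS A f aa w a l m a r w a t a m i n $ a E a a < i r i l l a h i"},
]

counts = phoneme_counts(sample_rows)
print(len(counts), "distinct symbols,", sum(counts.values()), "tokens")
```

The same function should accept a dataset split directly, since iterating over a split yields dictionary-like rows.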
Auxiliary high-quality TTS corpus for augmentation:
load_dataset("IqraEval/Iqra_TTS")
Test set (QuranMB.v2): 98 verses × 18 speakers ≈ 2 h, with deliberate errors and human annotations.
load_dataset("IqraEval/Iqra_QuranMB_v2")
Submit a UTF-8 CSV named teamID_submission.csv with two columns:
ID,Labels
0000_0001, i n n a m a a y a …
0000_0002, m a a n a n s a …
…
Note: no extra spaces, single CSV, no archives.
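A submission file in this shape could be produced with Python's standard csv module, as in the minimal sketch below. The helper name `write_submission` and the short phoneme lists are illustrative only; the snippet writes no space after the comma and joins phonemes with single spaces, which is one reading of the "no extra spaces" note rather than a confirmed requirement.

```python
import csv
import io


def write_submission(predictions, out):
    """Write (utterance_id, phoneme_list) pairs as an ID,Labels CSV."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["ID", "Labels"])
    for utt_id, phonemes in predictions:
        # Phonemes are joined with single spaces inside the Labels column.
        writer.writerow([utt_id, " ".join(phonemes)])


# Illustrative predictions, not real system output.
predictions = [
    ("0000_0001", ["i", "n", "n", "a"]),
    ("0000_0002", ["m", "a", "a"]),
]

buffer = io.StringIO()
write_submission(predictions, buffer)
print(buffer.getvalue())
```

To write the actual file, open it with `open(path, "w", encoding="utf-8", newline="")` and pass the file object in place of the in-memory buffer.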
The IqraEval leaderboard is ranked by phoneme-level F1-score. We use a hierarchical evaluation (detection + diagnostic) per MDD Overview.
From these we compute:
Rates:
Plus standard Precision, Recall, F1 for detection:
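For orientation, the sketch below computes these quantities under the definitions conventionally used in the mispronunciation detection and diagnosis (MDD) literature. It is an assumption that the official scorer follows exactly these formulas; the function name `mdd_metrics` and the example counts are hypothetical.

```python
def mdd_metrics(ta, fr, fa, cd, de):
    """Hierarchical MDD metrics from phoneme-level counts (conventional definitions).

    ta: correct phonemes accepted as correct (true accept)
    fr: correct phonemes flagged as errors (false reject)
    fa: mispronounced phonemes accepted as correct (false accept)
    cd: detected mispronunciations with the right diagnosis
    de: detected mispronunciations with the wrong diagnosis
    """
    def ratio(num, den):
        # Return 0.0 rather than raising when a denominator is empty.
        return num / den if den else 0.0

    tr = cd + de  # true rejects: all correctly detected mispronunciations
    precision = ratio(tr, tr + fr)
    recall = ratio(tr, tr + fa)
    return {
        "FRR": ratio(fr, ta + fr),   # false rejection rate
        "FAR": ratio(fa, fa + tr),   # false acceptance rate
        "DER": ratio(de, cd + de),   # diagnostic error rate
        "Precision": precision,
        "Recall": recall,
        "F1": ratio(2 * precision * recall, precision + recall),
    }


# Made-up counts, for illustration only.
example = mdd_metrics(ta=90, fr=10, fa=5, cd=12, de=3)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Detection (Precision, Recall, F1, FRR, FAR) asks whether an error was flagged at all; the diagnostic rate then asks whether a flagged error was labeled with the right phoneme.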
Teams and individual participants must register to gain access to the test set. Please complete the registration form using the link below:
Registration opens on June 10, 2025.
Further details on the open-set leaderboard submission will be posted on the shared task website (June 20, 2025). Stay tuned!
For inquiries and support, reach out to the task coordinators at iqraeval@googlegroups.com.