Gherbal
State-of-the-art language identification (LID) that outperforms models many times its size. The only LID model in the comparison below that identifies all 16 Arabic dialect variants tested, detects Arabizi (Arabic written in Latin script), and covers dozens of African languages that competitors miss.
Benchmark results
How Gherbal compares
| Model | Size | Languages | Avg Accuracy | Arabic Dialects |
|---|---|---|---|---|
| Gherbal v4 | 200 MB | 214 | 0.836 | 16 / 16 |
| OpenLID v2 | 1,230 MB | 201 | 0.824 | 8 / 16 |
| GlotLID | 1,690 MB | 2,102 | 0.803 | 5 / 16 |
| NLLB-LID | 1,180 MB | 218 | 0.711 | 1 / 16 |
| OpenLID v1 | 1,230 MB | 201 | 0.808 | 6 / 16 |
| FastText-176 | 131 MB | 176 | 0.510 | 0 / 16 |
Arabic Dialect Coverage
The only model in the comparison to identify all 16 Arabic dialect variants tested, including Darija, Egyptian, Gulf, Tunisian, and Hassaniya. Competitors identify between 0 and 8.
Arabizi Detection
96–98% accuracy on Latin-script Darija. Every competing model scores exactly 0%. Critical for North African social media analysis.
Lightweight Deployment
Deployable on mobile, serverless, and browser environments. At 200 MB, Gherbal v4 is roughly 6 to 8 times smaller than OpenLID, GlotLID, and NLLB-LID while scoring higher average accuracy, giving production-grade results at a fraction of the resource cost.
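The accuracy-per-MB comparison can be checked directly from the benchmark table. The sketch below assumes the metric is simply average accuracy divided by model size in MB; the page does not define it, so treat that definition, and the helper names `accuracy_per_mb` and `rank_by_efficiency`, as illustrative assumptions.

```python
# Figures copied from the benchmark table: (size in MB, average accuracy).
MODELS = {
    "Gherbal v4": (200, 0.836),
    "OpenLID v2": (1230, 0.824),
    "GlotLID": (1690, 0.803),
    "NLLB-LID": (1180, 0.711),
    "OpenLID v1": (1230, 0.808),
    "FastText-176": (131, 0.510),
}


def accuracy_per_mb(size_mb, accuracy):
    """Assumed metric: average accuracy divided by model size in MB."""
    return accuracy / size_mb


def rank_by_efficiency(models):
    """Return (name, accuracy-per-MB) pairs, most efficient first."""
    scored = [
        (name, accuracy_per_mb(size, acc))
        for name, (size, acc) in models.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_by_efficiency(MODELS):
        print(f"{name:<14} {score:.5f} accuracy per MB")
```

Under this definition Gherbal v4 ranks first. FastText-176 comes second on the ratio because it is small, although its average accuracy is far lower; the models above 1 GB trail both by a wide margin.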
Full coverage
214 Languages, 28 Writing Systems
Every language Gherbal can identify, searchable by name or code and filterable by writing system.
Latin: 131 languages
Arabic: 29 languages
Cyrillic: 12 languages
Devanagari: 10 languages
Try Gherbal now
Send text, get language identification results. Fast inference across Arabic dialects, Arabizi, and African languages.