Sovereign Multilingual AI Infrastructure
The language models no one else will build
Production-grade LLMs, language identification, and embeddings for Arabic dialects, Amazigh, and African languages β all through one API.
Models
What we ship
Sawtone
The universal linguistic layer
A deep word encoder bridging phonetics, orthography, and semantics. Built for search, cross-lingual retrieval, and dialectal resilience.
Sawalni
The first LLM for Moroccan languages
Natively supports Darija in Arabic and Latin scripts, Hassaniya, Tachelhit, Tarifit, and standardized Tamazight in Tifinagh.
Gherbal
State-of-the-art language identification
Covers Arabic dialects, detects Arabizi, and identifies African languages that competitors miss entirely.
Madmon
Multilingual text embeddings
Semantic representations for Arabic dialects, Berber languages, and other underrepresented language families.
Our mission
Close the language gap in AI
500 million Arabic speakers use dialects invisible to frontier AI. Dozens of African and Amazigh languages lack basic NLP tooling. We build the models that close the gap.
Ecosystem
Projects
Sawalni
Chat with the first LLM for Moroccan languages β Darija, Amazigh, Arabic, French, and more.
Wikilangs
Open-source NLP models for 300+ languages β the community initiative behind our language coverage.
Wikipedia Monthly
Monthly-refreshed Wikipedia dumps for 340+ language editions. Used by leading AI labs.
Herd
Browser superpowers and MCP server for AI agents. Operate any website via function-calling.
Start building with our models
LLMs, language identification, and embeddings. One API key. Production-ready.