Independent GenAI R&D Lab

AI for the languages the industry ignores

We build language models, datasets, and tools for underrepresented languages and cultures – starting with the 500 million Arabic speakers whose dialects remain invisible to frontier AI.

340+ Languages supported
1 Peer-reviewed publication
1st LLM for Moroccan Darija

What we do

Three pillars of research, one mission

🌍

Low-Resource Languages

Pre-trained models, tokenizers, and datasets for 340+ underrepresented languages. No GPU required.

🤖

Agentic AI

End-to-end agentic LLM training pipelines, MCP tooling, and broad-coverage agent behaviors.

🎭

Cultural Alignment

Language models that reflect the values, norms, and linguistic realities of non-Western cultures.

Stay connected

Reducing the digital divide, one language at a time

Get updates on our research, open-source releases, and upcoming publications.