AI for Northeast India’s Languages

Research-grade multilingual AI, built for real-world deployment. We develop production-ready language models, ethical data systems, and tools designed for government, enterprise, and community use across Northeast India.

Rooted in Context, Built for the World

At MWirelabs, we advance open research in AI with a focus on inclusivity, safety, and cultural context. Our work begins with low-resource languages of Northeast India and scales toward universal accessibility. We build the complete AI stack: language models, speech, vision, and deployment-ready tools, ensuring that technology serves every community in their own voice.

Languages First

From Khasi and Garo to Mizo and Assamese, our models are trained with deep linguistic and cultural sensitivity, not adapted from English-first systems. We preserve the grammatical structures, cultural context, and nuances that make each language unique.

Efficient and Accessible

Optimized for real-world use: responsive, lightweight, and capable of running in resource-constrained environments across Northeast India. Our models work where connectivity is limited and computing resources are precious.

Research-Driven and Transparent

Every release is grounded in rigorous testing, published research, and open-source transparency. We contribute to AIkosh, share datasets on Kaggle and HuggingFace, and publish our findings so the entire AI community can build on our work.


Our Foundation Models

We create and open-source multilingual foundational models designed to support research, education, and practical applications across Northeast India.

Kren-M

Generative Language Model

Kren-M is a bilingual (Khasi–English) language model developed through extensive continued pre-training and supervised fine-tuning of Gemma 2 (2B). It is designed specifically for Khasi, a low-resource Austroasiatic language spoken in Meghalaya, Northeast India, and retains the English fluency of its base model.

~2.6B params

NE-BERT

Multilingual Foundation Model

NE-BERT is a regional state-of-the-art open-source model for 9 Northeast Indian languages, built on ModernBERT for fast, accurate low-resource NLP.

~149M params

NE-OCR

Text Recognition Model

NE-OCR is a unified OCR model for 10 Northeast Indian languages across 12 language-script pairs and 4 scripts, along with Hindi and English anchors. Developed on the ViTSTR-Base backbone (86M parameters) using 1.34 million text-image pairs from native corpora, it delivers 94.99% mean Character Accuracy and the fastest inference speed of 17.2 ms/image.

~86M params
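
For readers who want to reproduce a figure like the mean Character Accuracy quoted above, here is a minimal Python sketch of one common way to compute it. This page does not state the exact formula used for NE-OCR, so the definition below (1 minus the character-level edit distance divided by the reference length, averaged per image) is an assumption, and the function names are illustrative rather than part of any MWirelabs API.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference char
                            curr[j - 1] + 1,      # insert a predicted char
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def char_accuracy(ref: str, hyp: str) -> float:
    """Assumed definition: 1 - edit_distance / len(ref), floored at 0."""
    if not ref:
        return 1.0 if not hyp else 0.0
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


def mean_char_accuracy(pairs: Sequence[tuple[str, str]]) -> float:
    """Average per-image accuracy over (reference, prediction) pairs."""
    if not pairs:
        raise ValueError("need at least one (reference, prediction) pair")
    return sum(char_accuracy(r, h) for r, h in pairs) / len(pairs)


# Placeholder strings, not real OCR output: one substitution in four
# characters plus one perfect match.
print(mean_char_accuracy([("abcd", "abxd"), ("ab", "ab")]))  # 0.875
```

Python strings are compared here by Unicode code point, so in scripts with combining marks a single visible character may count as several. An evaluation that counts grapheme clusters instead would give slightly different numbers.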

Ready for Deployment

Our models and systems are production-ready. Choose the path that fits your needs:

Production APIs

REST APIs for our models with commercial licensing

On-Premise & Offline

Full model deployment on your infrastructure for sensitive data and governance use

Custom Integration

End-to-end solutions for government departments, enterprises, and NGOs

From Research to Deployed Impact

Our models and systems are designed to power real AI solutions for government departments, NGOs, and enterprises across Northeast India, from citizen services and document workflows to offline tools that work in local languages.

Government Automation

Production-ready multilingual models and automation tools built for citizen services, government workflows, and public service delivery. Designed for scalability, offline use, and cultural context across Northeast India.

Document Intelligence

High-accuracy multilingual document extraction, classification, and processing systems optimised for government and enterprise use. Handles regional scripts and real-world document formats with strong performance.

Offline AI Solutions

Efficient models engineered for on-premise and offline deployment. Ideal for sensitive data, low-connectivity environments, and organisations that require full control and compliance.

Be Part of Northeast India's AI Revolution

Research Collaboration

Access our models, datasets, and technical documentation. Collaborate on cutting-edge NLP research for low-resource languages.

Startup Support

Build AI-driven solutions with our open-source models. Get technical guidance and integration support for your applications.

Government & NGO Partnerships

Deploy proven AI solutions for citizen services, automation, and multilingual communication at scale.

Ethical Data Partnerships

Consent-based data collection, curation, and annotation services tailored for governance projects and low-resource NLP. Strong focus on indigenous data sovereignty and community benefit.

Developer Ecosystem

Integrate KREN APIs into your applications. Commercial licensing and custom development available.

Built in the Open, For Everyone

All our foundational models are open-source on HuggingFace. We believe language AI should be accessible to researchers, developers, and communities across Northeast India and beyond. Our datasets are published on Kaggle, our research is transparent, and we actively contribute to AIkosh, India’s National AI Repository.

Our team works across Northeast India, building AI from authentic community data.

Ready to deploy AI that works in your language and region?

MWirelabs invites researchers, educators, and developers to collaborate in shaping technology that reflects the world’s diversity.