AI for Northeast India’s Languages
Research-grade multilingual AI, built for real-world deployment. We develop production-ready language models, ethical data systems, and tools designed for government, enterprise, and community use across Northeast India.
Rooted in Context, Built for the World
At MWirelabs, we advance open research in AI with a focus on inclusivity, safety, and cultural context. Our work begins with the low-resource languages of Northeast India and scales toward universal accessibility. We build the complete AI stack, from language models, speech, and vision to deployment-ready tools, so that technology serves every community in its own voice.
Languages First
From Khasi and Garo to Mizo and Assamese, our models are trained with deep linguistic and cultural sensitivity, not adapted from English-first systems. We preserve the grammatical structures, cultural context, and nuances that make each language unique.
Efficient and Accessible
Optimised for real-world use: responsive, lightweight, and capable of running in resource-constrained environments across Northeast India. Our models work where connectivity is limited and computing resources are scarce.
Research-Driven and Transparent
Every release is grounded in rigorous testing, published research, and open-source transparency. We contribute to AIkosh, share datasets on Kaggle and HuggingFace, and publish our findings so the entire AI community can build on our work.
Our Foundation Models
We create and open-source multilingual foundational models designed to support research, education, and practical applications across Northeast India.
Kren-M
Generative Language Model
Kren-M is a bilingual (Khasi–English) language model developed through extensive continued pre-training and supervised fine-tuning of Gemma 2 (2B). It is designed specifically for Khasi, a low-resource Austroasiatic language spoken in Meghalaya, Northeast India, and retains English fluency from its base model.
~2.6B params
NE-BERT
Multilingual Foundation Model
NE-BERT is a regional state-of-the-art, open-source model for 9 Northeast Indian languages. It is built on ModernBERT for speed and accuracy in low-resource NLP.
~149M params
NE-OCR
Text Recognition Model
NE-OCR is a unified OCR model for 10 Northeast Indian languages, covering 12 language-script pairs across 4 scripts, with Hindi and English as anchor languages. Built on the ViTSTR-Base backbone (86M parameters) and trained on 1.34 million text-image pairs from native corpora, it delivers 94.99% mean Character Accuracy and the fastest inference time, at 17.2 ms/image.
~86M params
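For readers who want to interpret the Character Accuracy figure, the sketch below shows one common way such a metric is computed: one minus the character error rate, where the error rate is the edit distance between the predicted and reference text divided by the reference length. This is an assumption for illustration only; the exact definition used to evaluate NE-OCR is not stated here, and the function names and sample strings are hypothetical.

```python
from typing import Iterable, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance where each insertion, deletion, or substitution costs 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(prediction: str, reference: str) -> float:
    """Assumed definition: 1 - CER, clamped at 0 for very noisy predictions."""
    if not reference:
        return 1.0 if not prediction else 0.0
    cer = levenshtein(prediction, reference) / len(reference)
    return max(0.0, 1.0 - cer)


def mean_character_accuracy(pairs: Iterable[Tuple[str, str]]) -> float:
    """Average per-sample accuracy over (prediction, reference) pairs."""
    scores = [character_accuracy(p, r) for p, r in pairs]
    if not scores:
        raise ValueError("at least one (prediction, reference) pair is required")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical OCR outputs paired with their ground-truth text.
    samples = [("khublei", "khublei"), ("khublai", "khublei")]
    print(f"mean character accuracy: {mean_character_accuracy(samples):.4f}")
```

Python strings are compared here by Unicode code point, so combining marks in scripts such as Bengali-Assamese each count as a separate character. A grapheme-level comparison could give slightly different numbers.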
Ready for Deployment
Our models and systems are production-ready. Choose the path that fits your needs:
Production APIs
REST APIs for our models with commercial licensing
On-Premise & Offline
Full model deployment on your infrastructure for sensitive data and governance use
Custom Integration
End-to-end solutions for government departments, enterprises, and NGOs
From Research to Deployed Impact
Our models and systems are designed to power real AI solutions for government departments, NGOs, and enterprises across Northeast India, from citizen services and document workflows to offline tools that work in local languages.
Government Automation
Production-ready multilingual models and automation tools built for citizen services, government workflows, and public service delivery. Designed for scalability, offline use, and cultural context across Northeast India.
Document Intelligence
High-accuracy multilingual document extraction, classification, and processing systems optimised for government and enterprise use. They handle regional scripts and real-world document formats.
Offline AI Solutions
Efficient models engineered for on-premise and offline deployment. Ideal for sensitive data, low-connectivity environments, and organisations that require full control and compliance.
Be Part of Northeast India’s AI Revolution
Research Collaboration
Access our models, datasets, and technical documentation. Collaborate on cutting-edge NLP research for low-resource languages.
Startup Support
Build AI-driven solutions with our open-source models. Get technical guidance and integration support for your applications.
Government & NGO Partnerships
Deploy proven AI solutions for citizen services, automation, and multilingual communication at scale.
Ethical Data Partnerships
Consent-based data collection, curation, and annotation services tailored for governance projects and low-resource NLP. Strong focus on indigenous data sovereignty and community benefit.
Developer Ecosystem
Integrate KREN APIs into your applications. Commercial licensing and custom development available.
Built in the Open, For Everyone
All our foundational models are open-source on HuggingFace. We believe language AI should be accessible to researchers, developers, and communities across Northeast India and beyond. Our datasets are published on Kaggle, our research is transparent, and we actively contribute to AIkosh, India’s National AI Repository.
Our team works across Northeast India, building AI from authentic community data.
Ready to deploy AI that works in your language and region?
MWirelabs invites researchers, educators, and developers to collaborate in shaping technology that reflects the world’s diversity.