Research & Publications

Our work focuses on advancing the state of the art in AI for the languages, cultures, and knowledge systems of Northeast India.

KhasiBERT: A Foundational Transformer Language Model for the Khasi Language

Architecture: RoBERTa-base
Parameters: ~110M
Corpus Size: 3.6M sentences (63M tokens)
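
A minimal sketch of how a RoBERTa-style checkpoint like KhasiBERT could be queried for masked-token prediction, assuming the Hugging Face `transformers` library; the Hub ID `mwirelabs/khasibert` and the sample sentence are hypothetical placeholders, not the published release path.

```python
# Sketch only: the Hub ID below is hypothetical, for illustration.
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

model_id = "mwirelabs/khasibert"  # hypothetical Hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Fill-mask inference: ask the model to complete a masked Khasi sentence.
fill = pipeline("fill-mask", model=model, tokenizer=tokenizer)
for candidate in fill(f"Khublei {tokenizer.mask_token}."):
    print(candidate["token_str"], round(candidate["score"], 3))
```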

Why Multilingual Transformers Fail for Khasi: A Linguistic Analysis of Low-Resource Austroasiatic AI Gaps

Multilingual models like mBERT and XLM-R often fail on typologically distinct, low-resource languages such as Khasi: their subword vocabularies, trained overwhelmingly on high-resource languages, over-fragment Khasi words (tokenization bias), and Khasi's Austroasiatic grammar diverges from the structures that dominate pretraining (structural divergence), producing unreliable predictions.
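
The tokenization bias described above can be observed directly. The sketch below, assuming the Hugging Face `transformers` library, counts how many subword pieces XLM-R's tokenizer produces for a short Khasi phrase; the sample phrase and the fertility threshold are illustrative choices, not figures from the paper.

```python
from transformers import AutoTokenizer

# XLM-R's SentencePiece vocabulary was trained with little or no Khasi data.
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

text = "Khublei shibun"  # Khasi: "thank you very much" (illustrative example)
pieces = tokenizer.tokenize(text)

# A high pieces-per-word ratio ("fertility") signals poor vocabulary coverage:
# each word splits into several rare subwords, degrading the representations
# the model can build for the language.
fertility = len(pieces) / len(text.split())
print(pieces)
print(f"{len(pieces)} subword pieces for {len(text.split())} words "
      f"(fertility = {fertility:.2f})")
```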

Join Us in Building Inclusive AI

MWirelabs invites researchers, educators, and developers to collaborate in shaping technology that reflects the world’s diversity.