Northeast India AI Research Fellowship
Building the next generation of AI for Northeast India's languages.
A competitive remote fellowship to work with MWire Labs on cutting-edge NLP research for Northeast India’s indigenous languages.
Applications for the current cohort close on January 31, 2026.
We are currently reviewing applications on a rolling basis. Early applicants will be prioritized for selection calls.
About the Fellowship
The Northeast India AI Research Fellowship is a selective, 8–12-week remote program for students and early-career researchers who want to build language technology for Khasi, Garo, Adi, Ao, Kokborok, Meitei, Assamese, and other Northeast Indian languages. Fellows work directly with MWire Labs researchers on projects in language modeling, datasets, and tools for cultural preservation.
What You'll Work On
- Build and evaluate transformer-based language models for Northeast Indian languages (such as our NE-BERT and Kren-M series)
- Design and curate high-quality text, speech, and annotation datasets
- Develop tools and applications that support language preservation and access
What You'll Gain
Research Impact
- Co-authorship opportunities on papers, reports, or open-source releases
- Hands-on experience with fine-tuning, evaluation, and low-resource NLP methods
- Close mentorship from the MWire Labs research team
Recognition
- Fellowship completion certificate from MWire Labs
- Letters of recommendation for top-performing fellows
- A portfolio of contributions to GitHub, Hugging Face, and academic papers
Community
- Join a network of fellows focused on Northeast Indian languages
- Collaborate with linguists, NGOs, and cultural organizations
- Contribute to models that can ultimately reach millions of speakers
Who Should Apply
- Undergraduate or graduate students in CS, AI, Linguistics, or related fields
- Native or heritage speakers of Northeast Indian languages are strongly encouraged to apply
- Applicants from anywhere are welcome; Northeast India residents and diaspora are especially encouraged
- Prior ML/NLP experience is a plus, but strong motivation and basic data skills can compensate
- Passion for language technology, cultural preservation, and open research
Current Focus Areas
- Language modeling and multilingual NLP for any Northeast Indian language
- Dataset creation for text, speech, and multimodal resources
- Translation systems and cross-lingual models
- Speech recognition, synthesis, and evaluation benchmarks for low-resource languages
Program Structure
- Duration: 8–12 weeks, fully remote
- Time Commitment: 8–12 hours per week, flexible around classes
- Mode: Mentored project work, regular check-ins, and group sessions
- Admissions: Rolling; applications reviewed monthly
How to Apply
- Brief statement: Why this fellowship? What language(s) or problem interests you?
- CV/resume and any prior projects or repos
Apply for the Fellowship
Timeline
- Applications reviewed monthly
- Shortlisted candidates may be invited for a brief call or task
Questions? Email connect@mwirelabs.com