Nagamese ASR: Automatic Speech Recognition for Nagamese Language
An AI-Powered Speech-to-Text Model for Nagamese
Nagamese ASR is an automatic speech recognition (ASR) model that converts spoken Nagamese into written text. It is trained on Latin-script Nagamese data and designed to handle conversational speech, including code-mixing with English.
What is Nagamese ASR?
Nagamese ASR uses neural network architectures to process audio recordings and generate text transcriptions. It supports applications such as transcription services, accessibility tools, and documentation of oral traditions. The model is optimized for Latin-script Nagamese, which is commonly used in Northeast India.
Technology & Architecture
The model is built on OpenAI’s Whisper architecture, a transformer-based encoder-decoder network with 244M parameters. It combines Whisper’s multilingual pre-training with fine-tuning on Nagamese speech data, reaching usable accuracy despite limited training resources.
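As a back-of-envelope sketch of what the architecture consumes (these are standard Whisper front-end numbers, not anything specific to this fine-tune): audio is resampled to 16 kHz, converted to an 80-channel log-mel spectrogram with a 10 ms hop, and processed in 30-second windows.

```python
# Standard Whisper input arithmetic (assumed defaults, not verified
# against this particular checkpoint).
SAMPLE_RATE = 16_000      # Hz
HOP_LENGTH = 160          # samples per mel frame = 10 ms hop
WINDOW_SECONDS = 30       # Whisper's fixed context window

frames = WINDOW_SECONDS * SAMPLE_RATE // HOP_LENGTH
print(frames)             # 3000 mel frames per 30 s window

# The convolutional front-end downsamples by 2 before the encoder.
encoder_positions = frames // 2
print(encoder_positions)  # 1500 encoder positions
```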
Key Technical Features
- End-to-end neural architecture (no phoneme dictionaries required)
- Handles code-mixing with English
- Robust to background noise and varying audio quality
- Supports multiple audio formats (.wav, .mp3, .ogg, .flac, .m4a)
- Optimized for a 16 kHz audio sampling rate
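Because the model expects 16 kHz input, audio recorded at other rates should be resampled first. A minimal numpy-only sketch via linear interpolation (in practice a dedicated resampler such as `librosa.resample` or `torchaudio` gives better quality):

```python
import numpy as np

def resample_to_16k(audio: np.ndarray, orig_sr: int, target_sr: int = 16000) -> np.ndarray:
    """Resample a 1-D float audio signal to target_sr by linear interpolation.

    A rough sketch only; linear interpolation does not band-limit the
    signal, so expect some aliasing compared to a proper resampler.
    """
    if orig_sr == target_sr:
        return audio
    duration = len(audio) / orig_sr
    n_out = int(round(duration * target_sr))
    t_in = np.arange(len(audio)) / orig_sr          # original sample times
    t_out = np.linspace(0.0, duration, n_out, endpoint=False)
    return np.interp(t_out, t_in, audio).astype(np.float32)
```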
Quick Start

```python
from transformers import pipeline

# Load the model from the Hugging Face Hub
asr = pipeline(
    "automatic-speech-recognition",
    model="MWirelabs/nagamese-asr",
)

# Transcribe an audio file (path to a local recording)
result = asr("audio.wav")
print(result["text"])
```
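Besides a file path, the transformers ASR pipeline also accepts raw audio as a dict like `{"raw": array, "sampling_rate": sr}`. A sketch of loading 16-bit PCM WAV files with only the standard-library `wave` module (the file name is illustrative):

```python
import wave
import numpy as np

def load_wav_mono(path: str):
    """Read a 16-bit PCM WAV into a float32 array in [-1, 1], plus its rate."""
    with wave.open(path, "rb") as f:
        sr = f.getframerate()
        pcm = np.frombuffer(f.readframes(f.getnframes()), dtype=np.int16)
        if f.getnchannels() == 2:            # downmix stereo to mono
            pcm = pcm.reshape(-1, 2).mean(axis=1)
        return pcm.astype(np.float32) / 32768.0, sr

# Usage sketch (assumed local file):
# audio, sr = load_wav_mono("audio.wav")
# asr({"raw": audio, "sampling_rate": sr})
```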
Let's Build Together
Are you a researcher, developer, or part of a language community in Northeast India? We are always looking for partners to collaborate on new datasets, fine-tune models, and advance the state of regional AI.