Nagamese ASR: Automatic Speech Recognition for Nagamese Language
An AI-Powered Speech-to-Text Model for Nagamese
Nagamese ASR is an automatic speech recognition (ASR) model that converts spoken Nagamese into written text. It is trained on Latin-script Nagamese data and designed to handle conversational speech, including code-mixing with English.
What is Nagamese ASR?
Nagamese ASR uses neural network architectures to process audio recordings and generate text transcriptions. It supports applications such as transcription services, accessibility tools, and documentation of oral traditions. The model is optimized for Latin-script Nagamese, which is commonly used in Northeast India.
Technology & Architecture
The model is built on OpenAI’s Whisper architecture, a transformer-based encoder-decoder network with 244M parameters. It combines Whisper’s multilingual pre-training with fine-tuning on Nagamese speech data, reaching usable accuracy despite limited training resources.
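As a back-of-envelope sketch of what the architecture consumes (these are standard Whisper front-end numbers, not anything specific to this fine-tune): audio is resampled to 16 kHz, converted to an 80-channel log-mel spectrogram with a 10 ms hop, and processed in 30-second windows.

```python
# Standard Whisper input arithmetic (assumed defaults, not verified
# against this particular checkpoint).
SAMPLE_RATE = 16_000      # Hz
HOP_LENGTH = 160          # samples per mel frame = 10 ms hop
WINDOW_SECONDS = 30       # Whisper's fixed context window

frames = WINDOW_SECONDS * SAMPLE_RATE // HOP_LENGTH
print(frames)             # 3000 mel frames per 30 s window

# The convolutional front-end downsamples by 2 before the encoder.
encoder_positions = frames // 2
print(encoder_positions)  # 1500 encoder positions
```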
Key Technical Features
- End-to-end neural architecture (no phoneme dictionaries required)
- Handles code-mixing with English
- Robust to background noise and varying audio quality
- Supports multiple audio formats (.wav, .mp3, .ogg, .flac, .m4a)
- Optimized for a 16 kHz audio sampling rate
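Because the model expects 16 kHz input, audio recorded at other rates should be resampled first. A minimal numpy-only sketch via linear interpolation (in practice a dedicated resampler such as `librosa.resample` or `torchaudio` gives better quality):

```python
import numpy as np

def resample_to_16k(audio: np.ndarray, orig_sr: int, target_sr: int = 16000) -> np.ndarray:
    """Resample a 1-D float audio signal to target_sr by linear interpolation.

    A rough sketch only; linear interpolation does not band-limit the
    signal, so expect some aliasing compared to a proper resampler.
    """
    if orig_sr == target_sr:
        return audio
    duration = len(audio) / orig_sr
    n_out = int(round(duration * target_sr))
    t_in = np.arange(len(audio)) / orig_sr          # original sample times
    t_out = np.linspace(0.0, duration, n_out, endpoint=False)
    return np.interp(t_out, t_in, audio).astype(np.float32)
```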
Quick Start

```python
from transformers import pipeline

# Load the model from the Hugging Face Hub
asr = pipeline(
    "automatic-speech-recognition",
    model="MWirelabs/nagamese-asr",
)

# Transcribe an audio file (path to a local recording)
result = asr("audio.wav")
print(result["text"])
```
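Besides a file path, the transformers ASR pipeline also accepts raw audio as a dict like `{"raw": array, "sampling_rate": sr}`. A sketch of loading 16-bit PCM WAV files with only the standard-library `wave` module (the file name is illustrative):

```python
import wave
import numpy as np

def load_wav_mono(path: str):
    """Read a 16-bit PCM WAV into a float32 array in [-1, 1], plus its rate."""
    with wave.open(path, "rb") as f:
        sr = f.getframerate()
        pcm = np.frombuffer(f.readframes(f.getnframes()), dtype=np.int16)
        if f.getnchannels() == 2:            # downmix stereo to mono
            pcm = pcm.reshape(-1, 2).mean(axis=1)
        return pcm.astype(np.float32) / 32768.0, sr

# Usage sketch (assumed local file):
# audio, sr = load_wav_mono("audio.wav")
# asr({"raw": audio, "sampling_rate": sr})
```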
Let's Build Together
Are you a researcher, developer, or part of a language community in Northeast India? We are always looking for partners to collaborate on new datasets, fine-tune models, and advance the state of regional AI.