Abstract: Audio feature selection and neural network architecture play crucial roles in speech recognition performance. This paper presents a comparative analysis of Artificial Neural Networks (ANNs) ...
By Atharva Agrawal. Growing up in the Tiger Capital of India, Nagpur, a city surrounded by some of the country’s most eminent wildlife sanctuaries, including Pench National Park, Tadoba-Andhari, Kanha ...
Abstract: The rapid advancement of audio deepfake technologies, which enable the synthesis of highly realistic speech, presents serious challenges to digital media integrity and public trust. In ...
This repo contains code for our proposed method for DCASE 2025 Task 3: stereo sound event localization and detection based on PSELDnet pretraining and BiMamba sequence modeling [1]. For more information, ...
A complete video subtitle translation pipeline with a modern web interface. It uses OpenAI Whisper for speech-to-text transcription and Google Translate for multi-language subtitle generation.
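
The subtitle-generation step of such a pipeline can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes transcription yields a list of segments with `start`, `end` (in seconds) and `text` fields, and the `translate` argument is a hypothetical placeholder for whatever translation call the pipeline uses. The function names `srt_timestamp` and `segments_to_srt` are invented for this example.

```python
from typing import Callable, Dict, Iterable


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(
    segments: Iterable[Dict],
    translate: Callable[[str], str] = lambda text: text,
) -> str:
    """Build SRT subtitle text from timed segments.

    Assumption: each segment is a dict with 'start', 'end' and 'text'.
    `translate` is a stand-in for a real translation call; by default
    it returns the text unchanged.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        text = translate(seg["text"].strip())
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        # One SRT cue: sequence number, time range, subtitle text.
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

Passing a different `translate` callable per target language would produce one subtitle file per language from the same transcription.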