aiOla has released Whisper-NER: an open-source AI model that allows joint speech transcription and entity recognition. This model combines speech-to-text transcription with Named Entity Recognition ...
BanglaSpeech2Text: An open-source offline speech-to-text package for the Bangla language. Fine-tuned on the latest Whisper speech-to-text model for optimal performance.
AI researcher specialized in enhancing and refining generative models across various domains such as image, video, audio, and music synthesis. Explore my portfolio and projects by searching for ...
In cases where Whisper encounters poor-quality audio in medical notes, the AI model will produce what its neural network predicts is the most likely output, even if it is incorrect. And the most ...
In proper noun recognition, Universal-2 demonstrated superior accuracy with a 13.87% proper noun error rate (PNER), outperforming both Whisper large-v3 and turbo. This model also excelled in text formatting, achieving a U-WER ...