Software engineers, developers and academic researchers have serious concerns about transcripts produced by OpenAI’s Whisper, according to a report from the Associated Press.
While there has been no shortage of discussion about generative AI’s tendency to hallucinate (basically, make things up), it is a bit surprising that this is an issue in transcription, where you would expect the output to closely follow the audio being transcribed.
Instead, researchers told the AP that Whisper has inserted everything from racial slurs to imagined medical treatments into transcripts. And that could be particularly disastrous as Whisper is adopted in hospitals and other medical settings.
A University of Michigan researcher who studied public meetings found hallucinations in eight out of 10 audio transcripts. A machine learning engineer studied more than 100 hours of Whisper transcripts and found hallucinations in more than half of them. And one developer reported finding hallucinations in nearly all of the 26,000 transcripts he created with Whisper.
An OpenAI spokesperson said the company is “continually working to improve the accuracy of our models, including reducing hallucinations” and noted that its usage policies prohibit the use of Whisper “in certain high-risk decision-making contexts.”
“We thank the researchers for sharing their findings,” they said.