Reduce background noise and optimize the speech from an audio clip using ffmpeg

audioffmpegnoisespeech recognitionvoice

I extract audio clips from a video file for speech recognition. These videos come from mobile/other handmade devices and hence contain a lot of noise. I want to reduce the background noise of the audio so that the speech that I relay to my speech recognition engine is clear. I am using ffmpeg to do all of this stuff, but am stuck at the noise reduction phase.

Till now I have tried following filters:

ffmpeg-20140324-git-63dbba6-win64-static\bin>ffmpeg -i i nput.wav -filter_complex "highpass=f=400,lowpass=f=1800" out2.wav

ffmpeg -i i nput.wav -af "equalizer=f=1000:width_type=h:width=900:g=-10" output.wav

ffmpeg -i i nput.wav -af "bandreject=f=1200:width_type=h:width=900:g=-10" output.wav

But the results are very disappointing. My reasoning was that since speech comes under 300-3000 hz range I can filter out all other frequencies to suppress any background noise. What am I missing?

Also, I read about weiner filters that could be used for speech enhancements and found this but am not sure how to use it.

Best Answer

If you are looking to isolate audible speech try combining a lowpass filter with a high pass filter. For usable audio I have noticed that filtering out 200hz and below then filter out 3000hz and above does a pretty good job of keeping usable voice audio.

ffmpeg -i <input_file> -af "highpass=f=200, lowpass=f=3000" <output_file>

In this example add the high pass filter first to cut the lower frequencies then use the low pass filter to cut the higher frequencies. If needed you could run your file through this more than once to clean up higher db frequencies within the cut frequency ranges.