Bandwidth Extension
Bandwidth extension is a method for converting a telephony-quality speech signal (8 kHz sampling rate) into a high-quality wideband speech signal (16 kHz sampling rate). A trained model is usually used to reconstruct the missing bandwidth:
[Audio examples]
Extrapolating and interpolating the missing spectrum of audio and speech signals has applications in speech telecommunication and in the restoration of band-limited archived speech recordings. Telephony speech is limited to a bandwidth of less than 4 kHz, normally about 3.4 kHz. In recent years, several methods have been proposed for extending the bandwidth of telephony speech towards broadcast quality. These methods aim to reconstruct the upper-band content of a speech signal by extrapolating the spectral envelope from the spectrum available in the lower band, giving the impression of wider-bandwidth, higher-quality speech. Most bandwidth extension techniques reproduce the extended spectral envelope using codebook mapping [13-15]: the missing spectral envelope in the higher band is obtained from codebooks trained on joint feature vectors of band-limited and full-band speech. The spectral envelope representation used for codebook mapping is often based on line spectral frequency (LSF) parameters derived from a linear prediction model of speech [15]. The estimated spectral envelope is then combined with an estimate of the excitation signal to yield the wideband output speech. The excitation signal can be estimated in different ways, such as spectral folding or Gaussian modulation [13][16]. The main challenge for these algorithms is recovering the parts of the signal whose valuable information resides in the upper band rather than the lower band (e.g. fricatives).
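The two ingredients described above, a codebook lookup for the high-band envelope and a spectral-folding excitation, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the two-dimensional toy feature vectors (stand-ins for LSF-based envelope features), and the hand-written codebooks, which in practice would be trained on paired band-limited and full-band speech. Spectral folding is realised here as zero-insertion upsampling without an anti-imaging filter, which is one common way to do it; the cited methods may differ in detail.

```python
import cmath
import math


def spectral_fold(narrowband):
    """Upsample by 2 via zero insertion, with no anti-imaging filter.

    The unfiltered image above the old Nyquist frequency is a mirror of
    the low band; that mirrored band serves as high-band excitation.
    """
    wideband = [0.0] * (2 * len(narrowband))
    wideband[::2] = narrowband  # odd-indexed samples stay zero
    return wideband


def nearest_index(vector, codebook):
    """Index of the codebook entry closest to `vector` (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vector, codebook[i])),
    )


def map_envelope(nb_features, nb_codebook, hb_codebook):
    """Return the high-band entry paired with the closest narrowband entry."""
    return hb_codebook[nearest_index(nb_features, nb_codebook)]


def dft_magnitude(signal):
    """Naive DFT magnitude spectrum (fine for short demo signals)."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


# Hypothetical paired codebooks: entry i of each list belongs together.
NB_CODEBOOK = [[0.1, 0.2], [0.3, 0.5], [0.6, 0.9]]
HB_CODEBOOK = [[0.70, 0.80], [0.75, 0.90], [0.80, 0.95]]

# A 1 kHz tone sampled at 8 kHz (64 samples = 8 full periods).
tone = [math.cos(2 * math.pi * 1000 * n / 8000) for n in range(64)]
folded = spectral_fold(tone)   # 128 samples, now at 16 kHz
mags = dft_magnitude(folded)   # bins are 125 Hz apart

# Bins 0..64 cover 0 to 8 kHz.
peaks = [k for k in range(65) if mags[k] > 1.0]
print(peaks)  # [8, 56]: the 1 kHz tone and its mirror image at 7 kHz
print(map_envelope([0.31, 0.52], NB_CODEBOOK, HB_CODEBOOK))  # [0.75, 0.9]
```

In a complete system, the folded excitation would be shaped by the looked-up high-band envelope and added to the original narrowband signal. The sketch stops short of that step and also omits the linear prediction analysis that would produce real LSF features.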
Below are the spectrograms of the original wideband signal (top), the narrowband signal (middle), and the reconstructed signal (bottom). Note that these are from a different signal than the audio examples above.