3GPP TS 06.77: Minimum Performance Requirements for Noise Suppresser Application to the AMR Speech Encoder
5.1 Bit Exactness of the Speech Encoder
The Noise Suppression shall be implemented as a separate pre-processing module prior to the speech encoding. The functionality and all internal states, tables and variables of the speech encoder shall remain unaltered by the Noise Suppression function.
The Noise Suppression should be implemented as a stand-alone pre-processing module operating on the 160-sample input speech buffer to the speech encoder, according to Figure 1.
Figure 1: Noise Suppression implementation
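The stand-alone arrangement can be illustrated with a minimal sketch in which the noise suppresser only rewrites the 160-sample input frame and the encoder is called unchanged. The names run_with_ns, noise_suppress and encode are hypothetical placeholders for illustration; they are not part of the AMR reference code.

```python
from typing import Callable, Iterable, Iterator, Sequence

FRAME_LEN = 160  # samples per input speech frame, as stated in the text


def run_with_ns(
    frames: Iterable[Sequence[int]],
    noise_suppress: Callable[[Sequence[int]], Sequence[int]],
    encode: Callable[[Sequence[int]], object],
) -> Iterator[object]:
    """Apply noise suppression to each frame, then pass it to the encoder.

    The encoder is treated as a black box: the NS stage never touches
    its states, tables or variables, it only replaces the input frame.
    """
    for frame in frames:
        if len(frame) != FRAME_LEN:
            raise ValueError(f"expected {FRAME_LEN} samples, got {len(frame)}")
        cleaned = noise_suppress(frame)
        if len(cleaned) != FRAME_LEN:
            raise ValueError("noise suppresser must return a full frame")
        yield encode(cleaned)
```

Because the encoder only ever sees a frame of the usual length, it can be the unmodified reference encoder.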
Alternatively, for implementation in conjunction with the bit-exact fixed-point C reference code [GSM 06.73], the NS module may operate on the pre-processed input speech buffer “old_speech[L_TOTAL]” in the structure “cod_amrState” in the AMR C code [GSM 06.73], after the pre-processing module (sample down-scaling and input high-pass filtering) of the speech encoder. The bit-integrity of the speech encoder for this implementation shall be verified according to Figure 2, where the signals at Test Points 1 and 2 shall be identical for any input signal and the Reference Encoder is the part of [GSM 06.73] after the pre-processing module. Note: implementation in conjunction with the AMR floating-point C code is for further study.
Figure 2: Verification of AMR speech encoder bit-exactness for embedded NS implementations
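The identity check between Test Points 1 and 2 can be sketched as a sample-by-sample comparison. The sketch assumes that the two signals have already been captured as sequences of integers; the capture itself and the function name first_mismatch are hypothetical and not taken from the reference code.

```python
from itertools import zip_longest
from typing import Iterable, Optional


def first_mismatch(tp1: Iterable[int], tp2: Iterable[int]) -> Optional[int]:
    """Return the index of the first differing value, or None if identical.

    A difference in length counts as a mismatch at the first index
    where one signal has ended and the other has not.
    """
    missing = object()  # marks the end of the shorter signal
    pairs = zip_longest(tp1, tp2, fillvalue=missing)
    for index, (a, b) in enumerate(pairs):
        if a is missing or b is missing or a != b:
            return index
    return None
```

A result of None for every test input is consistent with the bit-exactness requirement; any integer result points at the first value to inspect.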
5.2 Bit Exactness of the Speech Decoder
The AMR speech decoder shall remain unaltered by the Noise Suppression function.
5.3 Impact on Speech Path Delay
The one-way algorithmic delay due to the activation of AMR noise suppression shall be no more than 5 ms in excess of the delay inserted by the AMR speech codec. In the handsfree case, this delay is part of the 39 ms delay specified in GSM 03.50.
The total additional delay (comprising algorithmic and processing delays) shall not exceed 10 ms. The processing delay is calculated using the following formula, with E*S*P set to 50.
delay(proc) = WMOPS * 20 / (E*S*P)
where WMOPS is the complexity, in weighted million operations per second, evaluated for the theoretical worst case. (A direct means of measuring the total delay is for further study.)
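As a worked illustration of the delay budget, the sketch below reads the formula as WMOPS multiplied by 20 and divided by E*S*P, with E*S*P = 50 as stated. Treating the factor 20 as the 20 ms frame length, and therefore the result as milliseconds, is an assumption, as are the function names.

```python
FRAME_MS = 20.0            # assumed meaning of the factor 20 in the formula
ESP = 50.0                 # value of E*S*P given in the text
MAX_ALGORITHMIC_MS = 5.0   # limit on the additional algorithmic delay
MAX_TOTAL_MS = 10.0        # limit on algorithmic plus processing delay


def processing_delay_ms(wmops: float, esp: float = ESP) -> float:
    """Processing delay derived from worst-case complexity in WMOPS."""
    return wmops * FRAME_MS / esp


def meets_delay_limits(algorithmic_ms: float, wmops: float) -> bool:
    """Check both the algorithmic limit and the total additional delay."""
    total_ms = algorithmic_ms + processing_delay_ms(wmops)
    return algorithmic_ms <= MAX_ALGORITHMIC_MS and total_ms <= MAX_TOTAL_MS
```

Under these assumptions, a noise suppresser with a worst-case complexity of 5 WMOPS has a processing delay of 5 * 20 / 50 = 2 ms, so with 5 ms of algorithmic delay the total of 7 ms stays within the 10 ms limit.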
5.4 Impact on Channel Activity
The AMR speech codec with noise suppression activated should not significantly increase channel activity when used in conjunction with DTX.
The increase in channel activity is measured using the Voice Activity Factor (VAF), defined as follows.
Let x be the VAF measured by the AMR VAD, averaged over all clean speech signals.
Let y be the VAF measured by the AMR VAD without AMR NS active, averaged over all clean speech + noise signals (where the applicable clean speech signal is the speech signal used in the measurement of x).
Let w be the VAF measured by the AMR VAD with AMR NS active, averaged over all clean speech + noise signals (where the applicable clean speech signal is the speech signal used in the measurement of x). w is required to be not significantly more than the maximum of y and x. Any case where w is greater than y should be further investigated.
These requirements shall apply to both standardized AMR VADs. The values (w, x, y) are determined using one or both VADs and, if both are used, the requirements are checked relative to each AMR VAD independently.
The definition of upper limits on VAF increase and the attendant confidence intervals is for further study.
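The comparison of w against x and y can be sketched as below. Since the upper limit on VAF increase is left for further study, the margin parameter is a hypothetical tolerance that the caller must choose; it is not a value from this specification. The function names are likewise illustrative.

```python
from statistics import mean
from typing import Dict, Sequence


def average_vaf(per_signal_vaf: Sequence[float]) -> float:
    """Average the VAF values measured on the individual test signals."""
    return mean(per_signal_vaf)


def check_channel_activity(
    x: float, y: float, w: float, margin: float
) -> Dict[str, bool]:
    """Evaluate w for one VAD.

    within_limit: w does not exceed max(x, y) by more than `margin`
                  (hypothetical tolerance, not defined by the text).
    investigate:  w is greater than y, which the text flags for
                  further investigation.
    """
    return {
        "within_limit": w <= max(x, y) + margin,
        "investigate": w > y,
    }
```

When both standardized VADs are used, the check would be run once per VAD with that VAD's own (w, x, y) values.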