5.1.6 Bandwidth detection

3GPP TS 26.445, Codec for Enhanced Voice Services (EVS); Detailed algorithmic description, Release 15

A detection algorithm is applied to detect the actual input audio bandwidth for input sampling rates greater than 8 kHz. This bandwidth information is used to run the codec in its optimal mode, tailored for a particular bandwidth (BW) rather than for a particular input sampling frequency. For example, if the input sampling frequency is 32 kHz but there is no "energetically" meaningful spectral content above 8 kHz, the codec is operated in the WB mode. The following bandwidths/modes are used throughout the EVS codec: NB (0-4 kHz), WB (0-8 kHz), SWB (0-16 kHz) and FB (0-20 kHz).

The detection algorithm is based on computing energies in spectral regions and comparing them to certain thresholds. The bandwidth detector operates on the CLDFB values (see subclause 5.1.2). In the AMR-WB IO mode, the bandwidth detector uses a DCT transform to determine the signal bandwidth.

5.1.6.1 Mean and maximum energy values per band

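The CLDFB energy vector $E_C$, computed per 400 Hz frequency bin (see subclause 5.1.2.2), is further aggregated as described below. Each value of $E_{bin}(i)$ represents a 1600 Hz band consisting of four CLDFB energy bins summed up from $j_{start}(i)$ to $j_{stop}(i)$.

$$E_{bin}(i) = \sum_{j=j_{start}(i)}^{j_{stop}(i)} E_C(j), \quad i = 0, \ldots, 8 \qquad (27)$$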

Depending on the input sampling frequency, up to nine CLDFB bands are calculated using the above equation. The start and stop bins of each band are given in Table 3:

Table 3: CLDFB bands for energy calculation

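| Band | Start bin | Stop bin | Bandwidth in kHz | Bandwidth index |
|------|-----------|----------|------------------|-----------------|
| 0    | 3         | 6        | 1.2 – 2.8        | NB              |
| 1    | 11        | 14       | 4.4 – 7.2        | WB              |
| 2    | 14        | 17       |                  |                 |
| 3    | 23        | 26       | 9.2 – 15.6       | SWB             |
| 4    | 27        | 30       |                  |                 |
| 5    | 31        | 34       |                  |                 |
| 6    | 35        | 38       |                  |                 |
| 7    | 42        | 45       | 16.8 – 20.0      | FB              |
| 8    | 46        | 49       |                  |                 |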
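The aggregation of 400 Hz CLDFB bins into the bands of Table 3 can be sketched as follows. This is a minimal illustration, not the reference implementation: the names `CLDFB_BANDS` and `aggregate_cldfb_bands` are hypothetical, the start and stop bins are assumed to be inclusive, and a band is assumed to be computed only when all of its bins exist at the given input sampling frequency.

```python
# (start bin, stop bin) per band, copied from Table 3; both ends assumed inclusive.
CLDFB_BANDS = [
    (3, 6),    # band 0, NB
    (11, 14),  # band 1, WB
    (14, 17),  # band 2, WB (bin 14 also belongs to band 1)
    (23, 26),  # band 3, SWB
    (27, 30),  # band 4, SWB
    (31, 34),  # band 5, SWB
    (35, 38),  # band 6, SWB
    (42, 45),  # band 7, FB
    (46, 49),  # band 8, FB
]


def aggregate_cldfb_bands(bin_energies):
    """Sum 400 Hz CLDFB bin energies into 1600 Hz bands.

    bin_energies holds one linear energy per 400 Hz bin, starting at 0 Hz.
    Bands whose stop bin lies beyond the available bins are skipped, so the
    number of returned bands depends on the input sampling frequency.
    """
    bands = []
    for start, stop in CLDFB_BANDS:
        if stop >= len(bin_energies):
            break  # this band and all higher ones are not available
        bands.append(sum(bin_energies[start:stop + 1]))
    return bands
```

Under these assumptions, 20 bins (0 to 8 kHz, i.e. a 16 kHz input) yield three bands, 40 bins yield seven, and 60 bins yield all nine. For instance, `aggregate_cldfb_bands([1.0] * 20)` returns `[4.0, 4.0, 4.0]`.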

The values in CLDFB bands are converted to the log domain and scaled by

(28)

where the scaling factor is set according to the input sampling frequency as follows: 88.293854 for 8 kHz, 88.300926 for 16 kHz, 88.304118 for 32 kHz and 88.028412 for 48 kHz.

The per-band CLDFB energy is then used to calculate the mean energy values per bandwidth:

(29)

and the maximum energy values per bandwidth:

(30)
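The grouping of bands into bandwidths can be illustrated with a short sketch. It assumes the band-to-bandwidth assignment of Table 3 (band 0 for NB, bands 1 to 2 for WB, bands 3 to 6 for SWB, bands 7 to 8 for FB) and takes a plain arithmetic mean and maximum over each group. Any additive constant that equations (29) and (30) may contain is left out, and the names `BANDWIDTH_GROUPS` and `bandwidth_stats` are hypothetical.

```python
# Band indices belonging to each bandwidth, as read from Table 3.
BANDWIDTH_GROUPS = {
    "NB": [0],
    "WB": [1, 2],
    "SWB": [3, 4, 5, 6],
    "FB": [7, 8],
}


def bandwidth_stats(log_band_energies):
    """Return {bandwidth: (mean, maximum)} of the log-domain band energies.

    Only bandwidths whose bands are all present in log_band_energies are
    reported; higher bandwidths are dropped for lower sampling rates.
    """
    stats = {}
    for name, indices in BANDWIDTH_GROUPS.items():
        if indices[-1] >= len(log_band_energies):
            break  # not enough bands for this bandwidth or any above it
        values = [log_band_energies[i] for i in indices]
        stats[name] = (sum(values) / len(values), max(values))
    return stats
```

With seven band values (a 32 kHz input under the assumptions above), the result contains entries for NB, WB and SWB only.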

In case of the DCT based detector, the DCT values are computed by first applying a Hanning window on the 320 samples of the input audio signal sampled at the input sampling rate. The windowed signal is then transformed to the DCT domain and finally decomposed into several bands as shown in Table 3a.

Table 3a: DCT bands for energy calculation

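| Band | Bandwidth in kHz | Bandwidth index |
|------|------------------|-----------------|
| 0    | 1.5 – 3.0        | NB              |
| 1    | 4.5 – 7.5        | WB              |
| 2    |                  |                 |
| 3    | 9.0 – 15.0       | SWB             |
| 4    |                  |                 |
| 5    |                  |                 |
| 6    |                  |                 |
| 7    | 16.5 – 19.5      | FB              |
| 8    |                  |                 |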
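The DCT based band decomposition can be sketched as below. Several details are assumptions made for illustration only: a symmetric Hann window, an unnormalised DCT-II, band energy taken as the sum of squared coefficients, and equal 1.5 kHz bands within each range of Table 3a (which gives the lower edges in `BAND_START_HZ`). All function and constant names are hypothetical, and the energies are returned in the linear domain, before any log conversion.

```python
import math

FRAME_LEN = 320        # number of input samples, as stated in the text
BAND_WIDTH_HZ = 1500   # assumed width of a single band in Table 3a
# Assumed lower edge of bands 0..8, derived from the ranges in Table 3a.
BAND_START_HZ = [1500, 4500, 6000, 9000, 10500, 12000, 13500, 16500, 18000]


def hann_window(n):
    """Symmetric Hann window of length n (assumed window definition)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def dct_ii(x):
    """Unnormalised DCT-II, computed directly in O(n^2) for clarity."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi / n * (t + 0.5) * k) for t in range(n))
        for k in range(n)
    ]


def dct_band_energies(frame, sample_rate_hz):
    """Window one frame, transform it and return linear energies per band."""
    if len(frame) != FRAME_LEN:
        raise ValueError("expected a frame of 320 samples")
    window = hann_window(FRAME_LEN)
    coeffs = dct_ii([s * w for s, w in zip(frame, window)])
    # Each DCT coefficient covers sample_rate / (2 * FRAME_LEN) Hz.
    hz_per_coeff = sample_rate_hz / (2.0 * FRAME_LEN)
    nyquist_hz = sample_rate_hz / 2.0
    energies = []
    for start_hz in BAND_START_HZ:
        if start_hz + BAND_WIDTH_HZ > nyquist_hz:
            break  # band lies above the representable spectrum
        lo = round(start_hz / hz_per_coeff)
        hi = round((start_hz + BAND_WIDTH_HZ) / hz_per_coeff)
        energies.append(sum(c * c for c in coeffs[lo:hi]))
    return energies
```

With these assumptions a band spans 60 coefficients at 16 kHz, 30 at 32 kHz and 20 at 48 kHz, and a 2 kHz tone sampled at 16 kHz concentrates its energy in band 0.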

The values in DCT bands are converted to the log domain by

(33a)

and the mean and maximum energies per bandwidth are computed using (29) and (30), except that the constant 1.6 in these equations is omitted in case of the DCT based detector.