4.1.11 Multimode gain vector quantization

3GPP TS 06.20: Half rate speech transcoding

A separate GSP0 vector quantizer is derived for each of the four voicing modes. Once the frame voicing mode is selected, the vector quantizer corresponding to that mode is searched to select the excitation gains at each subframe of the frame.

Although the interpretation of what the excitation sources are differs between MODE=0 and the remaining MODE values, the procedure for searching the gain vector quantizer is identical. In each case, the P0 term specifies the relative contribution of the first of the two excitation vectors to the total excitation energy at the subframe, where the first excitation vector is the long term prediction vector for MODE=1, 2 or 3, while the vector selected from the first of the two VSELP codebooks is used in the MODE=0 case.

4.1.11.1 Coding GS and P0

Define ex(n) to be the excitation function at a given subframe. For MODE=1, 2 or 3, ex(n) is a linear combination of the pitch prediction vector scaled by β, the long term predictor coefficient, and of the codevector scaled by γ, its gain. In equation form:

$ex(n) = \beta\, c_0(n) + \gamma\, c_1(n), \quad 0 \le n \le N_s - 1$ (127)

where for MODE≠0

c0(n) is the unweighted long term prediction vector, bL(n)

c1(n) is the unweighted codevector selected, uI(n)

and for MODE=0

c0(n) is the unweighted codevector selected from the first VSELP codebook, uI,1(n)

c1(n) is the unweighted codevector selected from the second VSELP codebook, uH,2(n)
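The mode-dependent choice of the two excitation vectors can be sketched as a small selector. This is only an illustration of the definitions above; the function name and the parameter names (`b_l`, `u_i`, `u_i1`, `u_h2`) are hypothetical stand-ins for bL(n), uI(n), uI,1(n) and uH,2(n).

```python
def select_excitation_vectors(mode, b_l=None, u_i=None, u_i1=None, u_h2=None):
    """Return (c0, c1) for the given voicing mode.

    All names are illustrative assumptions, not taken from the specification.
    """
    if mode == 0:
        # MODE=0: both vectors come from the two VSELP codebooks.
        return u_i1, u_h2
    if mode in (1, 2, 3):
        # MODE=1, 2 or 3: long term prediction vector plus one codevector.
        return b_l, u_i
    raise ValueError("mode must be 0, 1, 2 or 3")
```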

The variable c’j(n) is a weighted version of cj(n). The power in each excitation vector is given by:

$R_x(k) = \sum_{n=0}^{N_s-1} c_k^2(n), \quad 0 \le k \le 1$ (128)

Let R be the total power in the coder subframe excitation:

$R = \beta^2 R_x(0) + \gamma^2 R_x(1)$ (129)

P0, the power contribution of the pitch prediction vector as a fraction of the total excitation power at a subframe, is:

$\mathrm{P0} = \dfrac{\beta^2 R_x(0)}{R}$, where 0 ≤ P0 ≤ 1 (130)
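As a numerical illustration of these quantities, the following sketch computes the vector powers, the total power R and P0. It assumes that the power of each vector is the sum of its squared samples and that the total power is the sum of the two scaled vector powers; the function and argument names are hypothetical.

```python
def excitation_powers(c0, c1, beta, gamma):
    """Return (rx0, rx1, r, p0) for one subframe.

    Assumptions (labelled, not quoted from the specification):
    Rx(k) is the sum of squared samples of ck(n), and
    R = beta^2 * Rx(0) + gamma^2 * Rx(1).
    """
    rx0 = sum(x * x for x in c0)          # power of the first vector
    rx1 = sum(x * x for x in c1)          # power of the second vector
    r = beta * beta * rx0 + gamma * gamma * rx1
    # P0 is the first vector's share of the total power; guard against R = 0.
    p0 = (beta * beta * rx0) / r if r > 0.0 else 0.0
    return rx0, rx1, r, p0
```

For example, with c0 = [1, 0, 0, 0], c1 = [0, 2, 0, 0], beta = 2 and gamma = 1, both scaled vectors carry a power of 4, so R = 8 and P0 = 0.5.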

Define R’q(0) to be the quantized value of R(0) to be used for the current subframe and Rq(0) to be the quantized value of R(0). Then:

R’q(0) = Rq(0) of the previous frame, for subframe 1 (131a)

R’q(0) = Rq(0) of the current frame, for subframes 2, 3, 4 (131b)
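The subframe rule in (131a) and (131b) amounts to a simple selection, sketched below. The function and argument names are hypothetical; subframes are numbered 1 to 4 as in the text.

```python
def select_rq0(subframe, rq0_previous_frame, rq0_current_frame):
    """Return R'q(0) for the given subframe (1..4).

    Subframe 1 uses the quantized R(0) of the previous frame (131a);
    subframes 2, 3 and 4 use that of the current frame (131b).
    """
    if subframe not in (1, 2, 3, 4):
        raise ValueError("subframe must be 1, 2, 3 or 4")
    return rq0_previous_frame if subframe == 1 else rq0_current_frame
```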

Let RS be

(132)

The term GS is the energy tweak parameter defined as:

$\mathrm{GS} = \dfrac{R}{\mathrm{RS}}$ (133)

P0 represents the fraction of the total subframe excitation energy which is due to the first codebook vector, and GS is the energy tweak factor which bridges the gap between R, the actual energy in the coder excitation, and RS, its estimated value.

The gain bias factor , formulated to force a better energy match between p(n) and the weighted synthetic excitation, is given below where:

(134)

The weighted error equation is:

(135)

where:

(136)

(137)

(138)

(139)

(140)

k=0,1 (141)

k=0,1, j=k,1 (142)

(143)

(144)

Four separate vector quantizers for jointly coding P0 and GS are defined, one for each of the four voicing modes. The first step in quantizing P0 and GS consists of calculating the parameters required by the error equation:

Rcc(k,j), k = 0, 1; j = k, 1

Rx(k), k = 0, 1

RS

Rpc(k), k = 0, 1

a, b, c, d, e

Next, equation (135) is evaluated for each of the 32 vectors in the {P0,GS} codebook corresponding to the selected voicing mode, and the vector which minimizes the weighted error is chosen. Note that in conducting the code search, the term in equation (135) that does not depend on the codebook vector may be ignored, since it is a constant. βq, the quantized long term predictor coefficient, and γq, the quantized gain, are reconstructed from

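$\beta_q = \sqrt{\dfrac{\mathrm{RS} \cdot \mathrm{GS}_{vq} \cdot \mathrm{P0}_{vq}}{R_x(0)}}$ (145)

$\gamma_q = \sqrt{\dfrac{\mathrm{RS} \cdot \mathrm{GS}_{vq} \cdot (1 - \mathrm{P0}_{vq})}{R_x(1)}}$ (146)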

where P0vq and GSvq are the elements of the vector chosen from the {P0,GS} codebook.
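The search and reconstruction steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the normative procedure: `weighted_error` is a hypothetical callable standing in for equation (135), which is not reproduced here, and the gain reconstruction assumes that GS is the ratio R/RS and that P0 is the first vector's share of R, as described earlier in this subclause.

```python
import math


def search_gain_codebook(codebook, weighted_error):
    """Return (index, (p0, gs)) of the entry with the smallest weighted error.

    codebook: sequence of (P0, GS) pairs for the selected voicing mode
              (32 entries in the text).
    weighted_error: hypothetical stand-in for equation (135); takes (p0, gs).
    """
    best = min(range(len(codebook)), key=lambda i: weighted_error(*codebook[i]))
    return best, codebook[best]


def reconstruct_gains(p0_vq, gs_vq, rs, rx0, rx1):
    """Recover (beta_q, gamma_q) from the chosen codebook vector.

    Assumes R = GS * RS, beta^2 * Rx(0) = P0 * R and
    gamma^2 * Rx(1) = (1 - P0) * R; rx0 and rx1 must be non-zero.
    """
    r = gs_vq * rs                                  # quantized total power
    beta_q = math.sqrt(p0_vq * r / rx0)
    gamma_q = math.sqrt((1.0 - p0_vq) * r / rx1)
    return beta_q, gamma_q
```

With a toy three-entry codebook and a toy error function, the search returns the entry closest to the target; the reconstruction then maps P0vq = 0.5, GSvq = 2, RS = 4, Rx(0) = 1 and Rx(1) = 4 to βq = 2 and γq = 1.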

A special case occurs when the long term predictor is disabled for a certain subframe, but voicing MODE≠0. This will occur when the state of the long term predictor is populated entirely by zeroes.

For that case, the following error equation is used:

(147)

For this case the quantized codevector gains are:

(148)

(149)