1. Description

3GPP TS 06.10 Full Rate Speech Transcoding

1.1 void

1.2 Outline description

The present document is structured as follows:

Subclause 1.3 contains a functional description of the audio parts including the A/D and D/A functions. Subclause 1.4 describes the conversion between 13 bit uniform and 8 bit A‑law samples. Subclauses 1.5 and 1.6 present a simplified description of the principles of the RPE‑LTP encoding and decoding process respectively. In subclause 1.7, the sequence and subjective importance of encoded parameters are given.

Clause 2 deals with the transmission characteristics of the audio parts that are relevant for the performance of the RPE‑LTP codec.

Some transmission characteristics of the RPE‑LTP codec are also specified in clause 2. Clause 3 presents the functional description of the RPE‑LTP coding and decoding procedures, whereas clause 4 describes the computational details of the algorithm. Procedures for the verification of the correct functioning of the RPE‑LTP are described in clause 5.

Performance and network aspects of the RPE‑LTP codec are contained in annex A.

1.3 Functional description of audio parts

The analogue‑to‑digital and digital‑to‑analogue conversion will in principle comprise the following elements:

1) Analogue to uniform digital:

‑ microphone;

‑ input level adjustment device;

‑ input anti‑aliasing filter;

‑ sample‑hold device sampling at 8 kHz;

‑ analogue‑to‑uniform digital conversion to a 13 bit representation.

The uniform format shall be represented in two’s complement.

2) Uniform digital to analogue:

‑ conversion from 13 bit/8 kHz uniform PCM to analogue;

‑ a hold device;

‑ reconstruction filter including x/sin x correction;

‑ output level adjustment device;

‑ earphone or loudspeaker.

In the terminal equipment, the A/D function may be achieved either:

‑ by direct conversion to 13 bit uniform PCM format;

‑ or by conversion to 8 bit A‑law or μ‑law (PCS 1900) companded format, based on a standard A‑law or μ‑law (PCS 1900) codec/filter according to ITU‑T Recommendations G.711 and G.714, followed by the 8 bit to 13 bit conversion according to the procedure specified in subclause 1.4.

For the D/A operation, the inverse operations take place.

In the latter case it should be noted that the specifications in ITU‑T Recommendation G.714 (superseded by G.712) are concerned with PCM equipment located in the central parts of the network. When used in the terminal equipment, these specifications do not on their own ensure sufficient out‑of‑band attenuation.

The specification of out‑of‑band signals is defined in clause 2 between the acoustic signal and the digital interface, to take into account that the filtering in the terminal can be achieved by both electronic and acoustical design.

1.4 PCM Format conversion

The conversion between 8 bit A‑law or μ‑law (PCS 1900) companded format and the 13 bit uniform format shall be as defined in ITU‑T Recommendation G.721 (superseded by G.726), subclause 4.2.1, sub‑block EXPAND and subclause 4.2.7, sub‑block COMPRESS. The parameter LAW = 1 should be used for A‑law and LAW = 0 should be used for μ‑law (PCS 1900).
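As an illustration of the 8 bit to 13 bit mapping, the following is a minimal sketch of the plain G.711 A‑law expansion and compression, written in the style of commonly used reference implementations. It is not the EXPAND or COMPRESS sub‑block itself, which remains the normative procedure; the function names `alaw_expand` and `alaw_compress` are hypothetical, and μ‑law is not covered.

```python
def alaw_expand(code: int) -> int:
    """Expand one 8 bit A-law code to a 13 bit two's complement value."""
    code ^= 0x55                     # undo the alternate-bit inversion of A-law
    segment = (code & 0x70) >> 4     # 3 bit segment number
    step = code & 0x0F               # 4 bit position inside the segment
    if segment == 0:
        magnitude = 2 * step + 1
    else:
        magnitude = (2 * step + 33) << (segment - 1)
    return magnitude if code & 0x80 else -magnitude


def alaw_compress(sample: int) -> int:
    """Compress a 13 bit two's complement value to one 8 bit A-law code."""
    mask = 0xD5 if sample >= 0 else 0x55
    magnitude = sample if sample >= 0 else -sample - 1
    magnitude = min(magnitude, 0xFFF)          # clip to the largest segment
    segment = 0
    while magnitude > (1 << (segment + 5)) - 1:
        segment += 1
    shift = 1 if segment < 2 else segment
    code = (segment << 4) | ((magnitude >> shift) & 0x0F)
    return code ^ mask


if __name__ == "__main__":
    # Every code survives an expand/compress round trip.
    print(all(alaw_compress(alaw_expand(c)) == c for c in range(256)))
```

The expanded values stay within the 13 bit two's complement range, so they can be fed directly to an encoder that expects 13 bit uniform PCM.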

1.5 Principles of the RPE‑LTP encoder

A simplified block diagram of the RPE‑LTP encoder is shown in figure 1.1. In this diagram the coding and quantization functions are not shown explicitly.

The input speech frame, consisting of 160 signal samples (uniform 13 bit PCM samples), is first pre‑processed to produce an offset‑free signal, which is then subjected to a first order pre‑emphasis filter. The 160 samples obtained are then analysed to determine the coefficients for the short term analysis filter (LPC analysis). These parameters are then used for the filtering of the same 160 samples. The result is 160 samples of the short term residual signal. The filter parameters, termed reflection coefficients, are transformed to log. area ratios (LARs) before transmission.
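The LPC analysis step can be sketched in floating point as follows. This is only an illustration under stated assumptions: it uses the textbook Levinson‑Durbin recursion as a stand‑in for the fixed point procedure of clauses 3 and 4, the sign convention of the reflection coefficients may differ from the one used there, and the names `autocorrelation`, `reflection_coefficients` and `log_area_ratio` are hypothetical. The order of 8 follows from the eight LARs listed in table 1.1.

```python
import math

ORDER = 8  # eight LARs are transmitted per frame (table 1.1)


def autocorrelation(frame, order=ORDER):
    """Autocorrelation of one frame for lags 0..order."""
    return [sum(frame[k] * frame[k - lag] for k in range(lag, len(frame)))
            for lag in range(order + 1)]


def reflection_coefficients(acf, order=ORDER):
    """Levinson-Durbin recursion (floating point stand-in, an assumption)."""
    coeffs = [1.0] + [0.0] * order   # prediction polynomial, coeffs[0] == 1
    error = float(acf[0])
    reflection = []
    for i in range(1, order + 1):
        if error <= 0.0:             # silent or degenerate frame
            reflection.append(0.0)
            continue
        acc = acf[i] + sum(coeffs[j] * acf[i - j] for j in range(1, i))
        k = -acc / error
        reflection.append(k)
        previous = coeffs[:]
        for j in range(1, i):
            coeffs[j] = previous[j] + k * previous[i - j]
        coeffs[i] = k
        error *= 1.0 - k * k
    return reflection


def log_area_ratio(r):
    """Log area ratio of one reflection coefficient, valid for |r| < 1."""
    return math.log10((1.0 + r) / (1.0 - r))


if __name__ == "__main__":
    acf = [1.0, 0.5, 0.25]           # autocorrelation of a first order process
    refl = reflection_coefficients(acf, order=2)
    print([log_area_ratio(r) for r in refl])
```

The logarithmic transform spreads out coefficients near ±1, which is where the synthesis filter is most sensitive to quantization error.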

For the following operations, the speech frame is divided into 4 sub‑frames with 40 samples of the short term residual signal in each. Each sub‑frame is processed blockwise by the subsequent functional elements.

Before the processing of each sub‑block of 40 short term residual samples, the parameters of the long term analysis filter, the LTP lag and the LTP gain, are estimated and updated in the LTP analysis block, on the basis of the current sub‑block and a stored sequence of the 120 previous reconstructed short term residual samples.
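The LTP analysis can be pictured as a cross‑correlation search over the stored history. The sketch below is a floating point illustration only: the lag search range of 40 to 120 samples is an assumption chosen to match the 120 sample history, the gain is left unquantized although table 1.1 allots it 2 bits, and the function name `ltp_parameters` is hypothetical.

```python
SUB_LEN = 40               # samples per sub-block
HISTORY_LEN = 120          # stored reconstructed short term residual samples
MIN_LAG, MAX_LAG = 40, 120 # assumed search range for the LTP lag


def ltp_parameters(current, history):
    """Return (lag, gain) maximising the cross-correlation with the history."""
    assert len(current) == SUB_LEN and len(history) == HISTORY_LEN
    best_lag, best_corr = MIN_LAG, None
    for lag in range(MIN_LAG, MAX_LAG + 1):
        start = HISTORY_LEN - lag        # history index aligned with current[0]
        corr = sum(current[k] * history[start + k] for k in range(SUB_LEN))
        if best_corr is None or corr > best_corr:
            best_lag, best_corr = lag, corr
    start = HISTORY_LEN - best_lag
    energy = sum(history[start + k] ** 2 for k in range(SUB_LEN))
    gain = best_corr / energy if energy > 0 else 0.0
    return best_lag, gain


if __name__ == "__main__":
    history = [0.0] * HISTORY_LEN
    history[70] = 1.0                    # single pulse in the past
    current = [0.0] * SUB_LEN
    current[7] = 0.5                     # same pulse, 57 samples later, halved
    print(ltp_parameters(current, history))
```

In the usage example the pulse in the current sub‑block lies 57 samples after the pulse in the history, so the search settles on that lag with a gain of one half.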

A block of 40 long term residual signal samples is obtained by subtracting 40 estimates of the short term residual signal from the short term residual signal itself. The resulting block of 40 long term residual samples is fed to the Regular Pulse Excitation analysis which performs the basic compression function of the algorithm.

As a result of the RPE analysis, the block of 40 input long term residual samples is represented by one of 4 candidate sub‑sequences of 13 pulses each. The sub‑sequence selected is identified by the RPE grid position (M). The 13 RPE pulses are encoded using Adaptive Pulse Code Modulation (APCM) with estimation of the sub‑block amplitude, which is transmitted to the decoder as side information.
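The grid selection and block‑adaptive coding described above can be sketched as follows. Several points are assumptions made for illustration: the candidate with the highest energy is selected, any filtering applied before the selection is omitted, and a plain uniform 3 bit quantizer stands in for the quantizers defined in clauses 3 and 4. The function names are hypothetical.

```python
SUB_LEN = 40   # long term residual samples per sub-block
N_GRIDS = 4    # candidate sub-sequences
N_PULSES = 13  # pulses per sub-sequence


def select_rpe_grid(residual):
    """Pick the candidate grid with the highest energy (assumed criterion)."""
    assert len(residual) == SUB_LEN
    # Candidate m holds samples m, m + 3, ..., m + 36.
    candidates = [residual[m:m + 3 * N_PULSES:3] for m in range(N_GRIDS)]
    energies = [sum(p * p for p in c) for c in candidates]
    grid = max(range(N_GRIDS), key=lambda m: energies[m])
    pulses = candidates[grid]
    block_max = max(abs(p) for p in pulses)   # sub-block amplitude
    return grid, block_max, pulses


def quantize_pulses(pulses, block_max, bits=3):
    """Uniform quantization of pulses normalised by the block amplitude."""
    levels = 1 << bits
    if block_max == 0:
        return [levels // 2] * len(pulses)
    codes = []
    for p in pulses:
        normalized = p / block_max            # lies in [-1, 1]
        codes.append(min(levels - 1, int((normalized + 1.0) * levels / 2)))
    return codes


if __name__ == "__main__":
    residual = [0.0] * SUB_LEN
    for i in range(N_PULSES):
        residual[2 + 3 * i] = 1.0 if i % 2 == 0 else -1.0
    grid, block_max, pulses = select_rpe_grid(residual)
    print(grid, block_max, quantize_pulses(pulses, block_max))
```

Only the grid position, the block amplitude and the 13 pulse codes need to be transmitted for each sub‑block, which is where most of the compression comes from.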

The RPE parameters are also fed to a local RPE decoding and reconstruction module which produces a block of 40 samples of the quantized version of the long term residual signal.

By adding these 40 quantized samples of the long term residual to the previous block of short term residual signal estimates, a reconstructed version of the current short term residual signal is obtained.

The block of reconstructed short term residual signal samples is then fed to the long term analysis filter, which produces the new block of 40 short term residual signal estimates to be used for the next sub‑block, thereby completing the feedback loop.
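The feedback loop for one sub‑block can be summarised in a short floating point sketch. The RPE coding and decoding stage is reduced to a pluggable `quantize` callable that defaults to the identity, which is an assumption made to keep the example small; the function name `process_sub_block` is hypothetical.

```python
SUB_LEN = 40       # samples per sub-block
HISTORY_LEN = 120  # stored reconstructed short term residual samples


def process_sub_block(residual, history, lag, gain, quantize=lambda e: list(e)):
    """Run one pass of the encoder loop for a single sub-block.

    residual: 40 short term residual samples of the current sub-block
    history:  120 previously reconstructed short term residual samples
    quantize: stand-in for RPE encoding followed by local RPE decoding
    """
    start = HISTORY_LEN - lag
    # Estimates of the short term residual from the delayed, scaled history.
    estimate = [gain * history[start + k] for k in range(SUB_LEN)]
    # Long term residual: short term residual minus its estimate.
    long_term_residual = [d - est for d, est in zip(residual, estimate)]
    quantized = quantize(long_term_residual)
    # Reconstructed short term residual, as the decoder will also obtain it.
    reconstructed = [q + est for q, est in zip(quantized, estimate)]
    new_history = (history + reconstructed)[-HISTORY_LEN:]
    return long_term_residual, reconstructed, new_history


if __name__ == "__main__":
    history = [float(i) for i in range(HISTORY_LEN)]
    residual = [1.0] * SUB_LEN
    e, d_rec, history = process_sub_block(residual, history, lag=40, gain=0.5)
    print(e[0], d_rec[0], len(history))
```

Because the history is built from reconstructed rather than original samples, the encoder and a decoder running the same loop stay in step as long as no transmission errors occur.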

1.6 Principles of the RPE‑LTP decoder

The simplified block diagram of the RPE‑LTP decoder is shown in figure 1.2. The decoder includes the same structure as the feed‑back loop of the encoder. In error‑free transmission, the output of this stage will be the reconstructed short term residual samples. These samples are then applied to the short term synthesis filter followed by the de‑emphasis filter, resulting in the reconstructed speech signal samples.
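To illustrate how the de‑emphasis filter of the decoder mirrors the first order pre‑emphasis filter of the encoder, the following floating point sketch applies both in sequence. The coefficient value `BETA` is an illustrative assumption and is not taken from this clause; the normative filters are defined in clauses 3 and 4. The function names are hypothetical.

```python
BETA = 28180 / 32768  # assumed first order coefficient, for illustration only


def pre_emphasis(samples, beta=BETA):
    """First order pre-emphasis: y(k) = x(k) - beta * x(k - 1)."""
    out, previous = [], 0.0
    for s in samples:
        out.append(s - beta * previous)
        previous = s
    return out


def de_emphasis(samples, beta=BETA):
    """First order de-emphasis: z(k) = y(k) + beta * z(k - 1)."""
    out, previous = [], 0.0
    for s in samples:
        previous = s + beta * previous
        out.append(previous)
    return out


if __name__ == "__main__":
    speech = [0.0, 100.0, -50.0, 25.0, 300.0]
    restored = de_emphasis(pre_emphasis(speech))
    print(max(abs(a - b) for a, b in zip(speech, restored)))
```

With identical coefficients and zero initial state the two filters cancel, so any remaining difference between input and output speech comes from the quantization inside the codec, not from the emphasis stages.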

1.7 Sequence and subjective importance of encoded parameters

As indicated in figure 1.1, three different groups of data are produced by the encoder:

‑ the short term filter parameters;

‑ the Long Term Prediction (LTP) parameters;

‑ the RPE parameters.

The encoder will produce this information in a unique sequence and format, and the decoder shall receive the same information in the same way. In table 1.1, the sequence of output bits b1 to b260 and the bit allocation for each parameter is shown.
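The bit allocation of table 1.1 can be checked mechanically. The sketch below rebuilds the sequence of parameters from the field widths given in the table and derives each parameter's first and last bit number; the function names `frame_layout` and `pack_frame` are hypothetical, and the dictionary keys simply reuse the variable names of the table.

```python
LAR_BITS = [6, 6, 5, 5, 4, 4, 3, 3]  # widths of LAR 1 to LAR 8 (table 1.1)


def frame_layout():
    """Return (parameter number, name, width, first bit, last bit) tuples."""
    fields = [(f"LAR {i + 1}", w) for i, w in enumerate(LAR_BITS)]
    for j in range(1, 5):                         # four sub-frames
        # Note: "b{j}" is the LTP gain variable, not an output bit number.
        fields += [(f"N{j}", 7), (f"b{j}", 2), (f"M{j}", 2), (f"Xmax{j}", 6)]
        fields += [(f"x{j}({i})", 3) for i in range(13)]
    layout, first = [], 1
    for number, (name, width) in enumerate(fields, start=1):
        layout.append((number, name, width, first, first + width - 1))
        first += width
    return layout


def pack_frame(values):
    """Serialise parameter values into a list of bits, LSB of each field first."""
    bits = []
    for _, name, width, _, _ in frame_layout():
        value = values[name]
        bits += [(value >> i) & 1 for i in range(width)]
    return bits


if __name__ == "__main__":
    layout = frame_layout()
    print(len(layout), layout[-1])
    zeros = {name: 0 for _, name, _, _, _ in layout}
    print(len(pack_frame(zeros)))
```

The layout ends at bit 260 with parameter number 76, matching the frame size of 260 bits per 20 ms stated in the table caption.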

The different parameters of the encoded speech and their individual bits have unequal importance with respect to subjective quality. Before being submitted to the channel encoding function the bits have to be rearranged in the sequence of importance as given in GSM 05.03. The ranking has been determined by subjective testing and the procedure used is described in annex A, subclause A.2.

Table 1.1: Encoder output parameters in order of occurrence and
bit allocation within the speech frame of 260 bits/20 ms

========================================================================
            Parameter  Parameter           Var.      Number   Bit no.
            number     name                name      of bits  (LSB-MSB)
========================================================================
            1                              LAR 1     6        b1 - b6
            2                              LAR 2     6        b7 - b12
FILTER      3          Log. Area           LAR 3     5        b13 - b17
            4          ratios              LAR 4     5        b18 - b22
PARAMETERS  5          1 - 8               LAR 5     4        b23 - b26
            6                              LAR 6     4        b27 - b30
            7                              LAR 7     3        b31 - b33
            8                              LAR 8     3        b34 - b36
========================================================================
Sub-frame no.1
========================================================================
LTP         9          LTP lag             N1        7        b37 - b43
PARAMETERS  10         LTP gain            b1        2        b44 - b45
------------------------------------------------------------------------
            11         RPE grid position   M1        2        b46 - b47
RPE         12         Block amplitude     Xmax1     6        b48 - b53
PARAMETERS  13         RPE-pulse no.1      x1(0)     3        b54 - b56
            14         RPE-pulse no.2      x1(1)     3        b57 - b59
            ..         ...                 ...
            25         RPE-pulse no.13     x1(12)    3        b90 - b92
========================================================================
Sub-frame no.2
========================================================================
LTP         26         LTP lag             N2        7        b93 - b99
PARAMETERS  27         LTP gain            b2        2        b100 - b101
------------------------------------------------------------------------
            28         RPE grid position   M2        2        b102 - b103
RPE         29         Block amplitude     Xmax2     6        b104 - b109
PARAMETERS  30         RPE-pulse no.1      x2(0)     3        b110 - b112
            31         RPE-pulse no.2      x2(1)     3        b113 - b115
            ..         ...                 ...
            42         RPE-pulse no.13     x2(12)    3        b146 - b148
========================================================================
Sub-frame no.3
========================================================================
LTP         43         LTP lag             N3        7        b149 - b155
PARAMETERS  44         LTP gain            b3        2        b156 - b157
------------------------------------------------------------------------
            45         RPE grid position   M3        2        b158 - b159
RPE         46         Block amplitude     Xmax3     6        b160 - b165
PARAMETERS  47         RPE-pulse no.1      x3(0)     3        b166 - b168
            48         RPE-pulse no.2      x3(1)     3        b169 - b171
            ..         ...                 ...
            59         RPE-pulse no.13     x3(12)    3        b202 - b204
========================================================================
Sub-frame no.4
========================================================================
LTP         60         LTP lag             N4        7        b205 - b211
PARAMETERS  61         LTP gain            b4        2        b212 - b213
------------------------------------------------------------------------
            62         RPE grid position   M4        2        b214 - b215
RPE         63         Block amplitude     Xmax4     6        b216 - b221
PARAMETERS  64         RPE-pulse no.1      x4(0)     3        b222 - b224
            65         RPE-pulse no.2      x4(1)     3        b225 - b227
            ..         ...                 ...
            76         RPE-pulse no.13     x4(12)    3        b258 - b260
========================================================================

Figure 1.1: Simplified block diagram of the RPE‑LTP encoder

Figure 1.2: Simplified block diagram of the RPE‑LTP decoder