8 Sequences for finding the 20 ms framing of the GSM half rate speech encoder

3GPP TS 06.07, Half Rate Speech: Test Sequence for GSM Half Rate Speech Codec

When testing the decoder, the test sequences are aligned to the decoder framing by the air interface (testing of the MS), or alignment can easily be reached on the Abis interface (testing on the network side).

When testing the encoder, there is usually no information available about where the encoder starts its 20 ms segments of input speech.

In the following, a procedure is described to find the 20 ms framing of the encoder using special synchronization sequences. This procedure can be used for the MS as well as for the network side.

Synchronization can be achieved in two steps. First, bit synchronization has to be found. In a second step, frame synchronization can be determined. This procedure takes advantage of the codec homing feature of the half rate codec, which puts the codec in a defined home state after the reception of the first homing frame. On the reception of further homing frames, the output of the codec is predefined and can therefore be used as a trigger.

8.1 Bit synchronization

The input to the speech encoder is a series of 13 bit words (104 kbit/s, 13 bit linear PCM). When testing of the speech encoder starts, nothing is known about bit synchronization, i.e. where the encoder expects its least significant bits and where it expects its most significant bits.

The encoder homing frame consists of 160 samples in which every bit is zero except the least significant bit, which is set to one (0 0000 0000 0001 binary, or 0x0008 hex if written left justified into 16 bit words). If two such encoder homing frames are input to the encoder consecutively, the decoder homing frame is expected at the output in response to the second encoder homing frame.
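As an illustration of the sample value quoted above, the following minimal sketch builds one encoder homing frame as a list of left justified 16 bit words. The names `left_justify` and `encoder_homing_frame` are hypothetical helpers introduced here, not part of the test sequence package.

```python
from typing import List

FRAME_LEN = 160    # samples per 20 ms frame
SAMPLE_BITS = 13   # linear PCM word length
WORD_BITS = 16     # storage word length used by the test sequence files


def left_justify(sample_13bit: int) -> int:
    """Place a 13 bit sample in the upper 13 bits of a 16 bit word."""
    return (sample_13bit << (WORD_BITS - SAMPLE_BITS)) & 0xFFFF


def encoder_homing_frame() -> List[int]:
    """Return 160 left justified samples, each with only the LSB set."""
    return [left_justify(0x0001)] * FRAME_LEN


frame = encoder_homing_frame()
print(len(frame), hex(frame[0]))  # 160 0x8
```

The three least significant bits of every 16 bit word stay zero, which matches the left justified input format described for the input test sequences.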

Since there are only 13 possibilities for bit synchronization, bit synchronization can be reached after at most 13 trials. In each trial, three consecutive encoder homing frames are input to the encoder. If the decoder homing frame is not detected at the output, the relative bit position of the three input frames is shifted by one and another trial is performed. As soon as the decoder homing frame is detected at the output, bit synchronization is found, and the first step can be terminated.
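The trial loop can be sketched as follows. This is only an outline under stated assumptions: a "shift by one" is modelled as moving the single set bit through the 13 positions of the 13 bit word, and `run_trial` is a hypothetical callback standing in for the real test set-up (it feeds the samples to the encoder under test and reports whether the decoder homing frame was detected at the output).

```python
from typing import Callable, List, Optional

FRAME_LEN = 160
SAMPLE_BITS = 13


def homing_triplet(bit_pos: int) -> List[int]:
    """Three frames of 13 bit words with a single bit set at bit_pos."""
    return [1 << bit_pos] * (3 * FRAME_LEN)


def find_bit_sync(run_trial: Callable[[List[int]], bool]) -> Optional[int]:
    """Try each of the 13 bit positions; return the first that homes the encoder."""
    for bit_pos in range(SAMPLE_BITS):
        if run_trial(homing_triplet(bit_pos)):
            return bit_pos
    return None  # no trial produced the decoder homing frame


def fake_trial(samples: List[int]) -> bool:
    """Stand-in for the real set-up: pretends the encoder sees its LSB at bit 5."""
    return all(s == 1 << 5 for s in samples)


print(find_bit_sync(fake_trial))  # 5
```

In a real test, `run_trial` would be replaced by whatever mechanism plays the samples into the encoder and monitors its output; the loop itself never needs more than 13 iterations.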

The reason why three consecutive encoder homing frames are needed is that frame synchronization is not known at this stage. To be sure that the encoder reads two complete homing frames, three frames have to be input. Wherever the encoder has its 20 ms segmentation, it will always read at least two complete encoder homing frames.
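The argument can be checked numerically. The sketch below counts, for every possible offset between the input and the encoder framing, how many encoder frames fall entirely inside a burst of homing samples. The function name `complete_frames` is a hypothetical helper, and the encoder is assumed to cut its frames at multiples of 160 samples.

```python
FRAME_LEN = 160


def complete_frames(offset: int, n_input_frames: int = 3) -> int:
    """Count encoder frames lying entirely inside the input burst.

    The encoder cuts frames at multiples of FRAME_LEN; the burst occupies
    samples [offset, offset + n_input_frames * FRAME_LEN).
    """
    end = offset + n_input_frames * FRAME_LEN
    first_boundary = -(-offset // FRAME_LEN) * FRAME_LEN  # round offset up
    return max(0, (end - first_boundary) // FRAME_LEN)


worst_three = min(complete_frames(d, 3) for d in range(FRAME_LEN))
worst_two = min(complete_frames(d, 2) for d in range(FRAME_LEN))
print(worst_three, worst_two)  # 2 1
```

With three input frames the worst case is two complete homing frames, whereas two input frames guarantee only one, which would not be enough to provoke the decoder homing frame.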

An example of the 13 different frame triplets is given in sequence BITSYNC.INP (see table 7).

8.2 Frame synchronization

Once bit synchronization is found, frame synchronization can be found by inputting one special frame that delivers 160 different output frames, depending on the 160 different positions that this frame can possibly have with respect to the encoder framing.

This special synchronization frame was found by taking one input frame and shifting it through the positions 0 to 159. The corresponding 160 encoded speech frames were calculated and it was verified that all 160 output frames were different. When shifting the input synchronization frame, the samples at the beginning were set to 0x0008 hex, which corresponds to the samples of the encoder homing frame.
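The shifting step can be illustrated with a short sketch. It assumes that a shift by n samples means the encoder reads n homing samples (0x0008) followed by the first 160 - n samples of the synchronization frame; what the encoder reads in the following frame is not covered here. The frame content below is a placeholder, not the actual synchronization frame, and `shifted_frame` is a hypothetical helper.

```python
from typing import List, Sequence

FRAME_LEN = 160
HOMING_SAMPLE = 0x0008  # left justified encoder homing sample


def shifted_frame(sync_frame: Sequence[int], shift: int) -> List[int]:
    """Return the 160 samples the encoder reads when the input is delayed by shift."""
    if len(sync_frame) != FRAME_LEN:
        raise ValueError("sync_frame must hold exactly 160 samples")
    if not 0 <= shift < FRAME_LEN:
        raise ValueError("shift must be in the range 0..159")
    return [HOMING_SAMPLE] * shift + list(sync_frame[:FRAME_LEN - shift])


# Placeholder data only: left justified words with the low three bits zero.
placeholder = [((i + 1) * 16) & 0xFFF8 for i in range(FRAME_LEN)]

view = shifted_frame(placeholder, 37)
print(len(view), hex(view[0]), hex(view[37]))  # 160 0x8 0x10
```

Running each of the 160 shifted inputs through a reference encoder and comparing the outputs pairwise is how the uniqueness of the output frames would be verified.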

Before this special synchronization frame is input to the encoder, the encoder again has to be reset by one encoder homing frame. A second encoder homing frame is needed to provoke a decoder homing frame at the output, which can be used as a trigger. Since the framing of the encoder is not known at that stage, three encoder homing frames have to precede the special synchronization frame to ensure that the encoder reads at least two homing frames and that at least one decoder homing frame is produced at the output, serving as a trigger for recording.

The special synchronization frame, preceded by the three encoder homing frames, is given in SEQSYNC.INP. The corresponding 160 different output frames are given in SYNC000.COD through SYNC159.COD. The three digit number in the filename indicates the number of samples by which the input was delayed with respect to the encoder framing. By a corresponding shift in the opposite direction, alignment with the encoder framing can be reached.
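Identifying the delay then amounts to a lookup of the recorded output frame among the 160 reference frames. The sketch below assumes the recorded frame is stored in the same byte layout as the reference files, so that a plain byte comparison is valid; `reference_name` and `find_frame_shift` are hypothetical helpers, and the reference contents used in the demonstration are fake data rather than the real SYNCxxx.COD contents.

```python
from typing import Dict, Optional


def reference_name(shift: int) -> str:
    """File name of the reference output frame for a delay of shift samples."""
    return f"SYNC{shift:03d}.COD"


def find_frame_shift(captured: bytes, references: Dict[int, bytes]) -> Optional[int]:
    """Return the delay whose reference frame equals the captured frame."""
    for shift, ref in references.items():
        if ref == captured:
            return shift
    return None  # no match: bit sync or trigger position is probably wrong


# Fake reference data for demonstration: 160 distinct 40 byte records.
references = {n: n.to_bytes(2, "big") * 20 for n in range(160)}

shift = find_frame_shift(references[42], references)
print(shift, reference_name(shift))  # 42 SYNC042.COD
```

In practice the dictionary would be filled by reading the 160 reference files from disk, keyed by the three digit number in each file name.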

8.3 Formats and sizes of the synchronization sequences

BITSYNC.INP:

This sequence consists of 13 frame triplets. It has the format of the speech encoder input test sequences (13 bit left justified with the three least significant bits set to zero).

The size of it is therefore:

SIZE (BITSYNC.INP) = 13 * 3 * 160 * 2 bytes = 12480 bytes.

SEQSYNC.INP:

This sequence consists of 3 encoder reset frames and the special synchronization frame. It has the format of the speech encoder input test sequences (13 bit left justified with the three least significant bits set to zero).

The size of it is therefore:

SIZE (SEQSYNC.INP) = 4 * 160 * 2 bytes = 1280 bytes.

SYNCXXX.COD:

These sequences consist of 1 encoder output frame each. They have the format of the speech encoder output test sequences (16 bit words right justified). The values of the VAD and SP flags are set to one in these files.

Their size is therefore:

SIZE (SYNCXXX.COD) = (18 + 2) * 2 bytes = 40 bytes.
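The three size formulas can be reproduced with a few lines of arithmetic. In this sketch, reading the term (18 + 2) as 18 parameter words plus 2 flag words (VAD and SP) is an interpretation of the formula above; the function names are hypothetical.

```python
FRAME_LEN = 160        # samples per input frame
BYTES_PER_WORD = 2     # every sample or parameter is stored as a 16 bit word


def input_size(n_frames: int) -> int:
    """Size in bytes of an encoder input sequence of n_frames frames."""
    return n_frames * FRAME_LEN * BYTES_PER_WORD


def output_frame_size(param_words: int = 18, flag_words: int = 2) -> int:
    """Size in bytes of one encoder output frame (assumed: parameters plus flags)."""
    return (param_words + flag_words) * BYTES_PER_WORD


bitsync = input_size(13 * 3)   # 13 triplets of homing frames
seqsync = input_size(3 + 1)    # 3 homing frames plus the synchronization frame
syncxxx = output_frame_size()
print(bitsync, seqsync, syncxxx)  # 12480 1280 40
```

A quick check of a delivered file against these values (for example with `os.path.getsize`) is a cheap way to detect a truncated or wrongly converted sequence.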

Table 7 summarizes this information.

Table 7: Location, size and justification of synchronization sequences

Disk No.  Purpose of Sequence             Name of Sequence  No. of Frames  Size in Bytes  Justification
--------  ------------------------------  ----------------  -------------  -------------  -------------
5         Bit Synchronization             BITSYNC.INP       39             12480          Left
5         Frame Synchronization (input)   SEQSYNC.INP       4              1280           Left
5         Frame Synchronization (output)  SYNC000.COD       1              40             Right
                                          SYNC001.COD       1              40             Right
                                          SYNC002.COD       1              40             Right
                                          ...               ...            ...            ...
                                          SYNC159.COD       1              40             Right