6 Individual parameters

3GPP TS 03.38: Alphabets and language-specific information

6.1 General principles

6.1.1 General notes

Except where otherwise indicated, the following shall apply to all alphabet tables:

1: The characters marked "1)" are not used but are displayed as a space.

2: The characters of this set, when displayed, should approximate to the appearance of the relevant characters specified in ISO 1073 and the relevant national standards.

3: Control characters:

Code Meaning

LF Line feed: Any characters following LF which are to be displayed shall be presented as the next line of the message, commencing with the first character position.

CR Carriage return: Any characters following CR which are to be displayed shall be presented as the current line of the message, commencing with the first character position.

SP Space character.

4: The display of characters within a message is achieved by taking each character in turn and placing it in the next available space from left to right and top to bottom.

6.1.2 Character packing

6.1.2.1 SMS Point-to-Point Packing

6.1.2.1.1 Packing of 7-bit characters

If the bits of character number n are noted in the following way:

b7 b6 b5 b4 b3 b2 b1

na nb nc nd ne nf ng

then the 7-bit characters are packed into octets, and any unused bits of the last octet are filled with zeros on the left.

For example, packing:

– one character in one octet:

– bits number:

7 6 5 4 3 2 1 0

0 1a 1b 1c 1d 1e 1f 1g

– two characters in two octets:

– bits number:

7 6 5 4 3 2 1 0

2g 1a 1b 1c 1d 1e 1f 1g

0 0 2a 2b 2c 2d 2e 2f

– three characters in three octets:

– bits number:

7 6 5 4 3 2 1 0

2g 1a 1b 1c 1d 1e 1f 1g

3f 3g 2a 2b 2c 2d 2e 2f

0 0 0 3a 3b 3c 3d 3e

– seven characters in seven octets:

– bits number:

7 6 5 4 3 2 1 0

2g 1a 1b 1c 1d 1e 1f 1g

3f 3g 2a 2b 2c 2d 2e 2f

4e 4f 4g 3a 3b 3c 3d 3e

5d 5e 5f 5g 4a 4b 4c 4d

6c 6d 6e 6f 6g 5a 5b 5c

7b 7c 7d 7e 7f 7g 6a 6b

0 0 0 0 0 0 0 7a

– eight characters in seven octets:

– bits number:

7 6 5 4 3 2 1 0

2g 1a 1b 1c 1d 1e 1f 1g

3f 3g 2a 2b 2c 2d 2e 2f

4e 4f 4g 3a 3b 3c 3d 3e

5d 5e 5f 5g 4a 4b 4c 4d

6c 6d 6e 6f 6g 5a 5b 5c

7b 7c 7d 7e 7f 7g 6a 6b

8a 8b 8c 8d 8e 8f 8g 7a

The bit number zero is always transmitted first.

Therefore, in 140 octets, it is possible to pack (140×8)/7=160 characters.
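The packing shown above amounts to writing each 7-bit character into a bit stream that starts at bit 0 of the first octet. The following is a minimal Python sketch of that rule, assuming the characters are already available as integer code values; the names `pack_septets` and `unpack_septets` are illustrative and are not defined by this specification.

```python
def pack_septets(septets):
    """Pack 7-bit values into octets, filling each octet from bit 0 upwards."""
    acc = 0      # bit accumulator; bit 0 is the next bit to be emitted
    nbits = 0    # number of valid bits currently held in acc
    out = bytearray()
    for s in septets:
        if not 0 <= s <= 0x7F:
            raise ValueError("not a 7-bit value: %r" % (s,))
        acc |= s << nbits
        nbits += 7
        while nbits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc)  # last octet; its unused high bits are zero
    return bytes(out)


def unpack_septets(octets, count):
    """Recover `count` 7-bit values from packed octets."""
    acc = int.from_bytes(octets, "little")
    return [(acc >> (7 * i)) & 0x7F for i in range(count)]
```

For instance, `pack_septets([0x68, 0x65, 0x6C, 0x6C, 0x6F])` (the code values of "hello" in the default alphabet) yields the five octets E8 32 9B FD 06, and 160 characters always pack into exactly 140 octets. The receiver needs the character count separately, since seven trailing zero bits cannot otherwise be told apart from a character with code value 0.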

6.1.2.2 SMS Cell Broadcast Packing

6.1.2.2.1 Packing of 7-bit characters

If the bits of character number n are noted in the following way:

b7 b6 b5 b4 b3 b2 b1

na nb nc nd ne nf ng

then the 7-bit characters are packed into octets as follows:

                bit number
octet number    7    6    5    4    3    2    1    0

1               2g   1a   1b   1c   1d   1e   1f   1g
2               3f   3g   2a   2b   2c   2d   2e   2f
3               4e   4f   4g   3a   3b   3c   3d   3e
4               5d   5e   5f   5g   4a   4b   4c   4d
5               6c   6d   6e   6f   6g   5a   5b   5c
6               7b   7c   7d   7e   7f   7g   6a   6b
7               8a   8b   8c   8d   8e   8f   8g   7a
8               10g  9a   9b   9c   9d   9e   9f   9g
.
.
81              93d  93e  93f  93g  92a  92b  92c  92d
82              0    0    0    0    0    93a  93b  93c

The bit number zero is always transmitted first.

Therefore, in 82 octets, it is possible to pack (82×8)/7 = 93.7, that is 93 characters. The 5 remaining bits are set to zero as stated above.
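The capacity figures quoted for the 140-octet and 82-octet cases follow from integer division of the available bits by 7. A small Python sketch of that arithmetic is given here for illustration; the function name `septet_capacity` is an assumption of this example and not part of the specification.

```python
def septet_capacity(octets):
    """Return (characters, spare_bits) for a field of `octets` octets."""
    bits = octets * 8
    chars = bits // 7              # whole 7-bit characters that fit
    return chars, bits - chars * 7


print(septet_capacity(140))  # SMS point-to-point user data
print(septet_capacity(82))   # Cell Broadcast page
```

With these inputs the function returns (160, 0) for 140 octets and (93, 5) for 82 octets, matching the character counts and the 5 spare bits stated above.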

6.1.2.3 USSD packing

6.1.2.3.1 Packing of 7 bit characters

If the bits of character number n are noted in the following way:

b7 b6 b5 b4 b3 b2 b1

na nb nc nd ne nf ng

then the 7-bit characters are packed into octets, and any unused bits of the last octet are filled with zeros on the left.

For example, packing: 

– one character in one octet:

– bits number:

7 6 5 4 3 2 1 0

0 1a 1b 1c 1d 1e 1f 1g

– two characters in two octets:

– bits number:

7 6 5 4 3 2 1 0

2g 1a 1b 1c 1d 1e 1f 1g

0 0 2a 2b 2c 2d 2e 2f

– three characters in three octets:

– bits number:

7 6 5 4 3 2 1 0

2g 1a 1b 1c 1d 1e 1f 1g

3f 3g 2a 2b 2c 2d 2e 2f

0 0 0 3a 3b 3c 3d 3e

– six characters in six octets:

– bits number:

7 6 5 4 3 2 1 0

2g 1a 1b 1c 1d 1e 1f 1g

3f 3g 2a 2b 2c 2d 2e 2f

4e 4f 4g 3a 3b 3c 3d 3e

5d 5e 5f 5g 4a 4b 4c 4d

6c 6d 6e 6f 6g 5a 5b 5c

0 0 0 0 0 0 6a 6b

– seven characters in seven octets:

– bits number:

7 6 5 4 3 2 1 0

2g 1a 1b 1c 1d 1e 1f 1g

3f 3g 2a 2b 2c 2d 2e 2f

4e 4f 4g 3a 3b 3c 3d 3e

5d 5e 5f 5g 4a 4b 4c 4d

6c 6d 6e 6f 6g 5a 5b 5c

7b 7c 7d 7e 7f 7g 6a 6b

0 0 0 1 1 0 1 7a

The bit number zero is always transmitted first.

– eight characters in seven octets:

– bits number:

7 6 5 4 3 2 1 0

2g 1a 1b 1c 1d 1e 1f 1g

3f 3g 2a 2b 2c 2d 2e 2f

4e 4f 4g 3a 3b 3c 3d 3e

5d 5e 5f 5g 4a 4b 4c 4d

6c 6d 6e 6f 6g 5a 5b 5c

7b 7c 7d 7e 7f 7g 6a 6b

8a 8b 8c 8d 8e 8f 8g 7a

– nine characters in eight octets:

– bits number:

7 6 5 4 3 2 1 0

2g 1a 1b 1c 1d 1e 1f 1g

3f 3g 2a 2b 2c 2d 2e 2f

4e 4f 4g 3a 3b 3c 3d 3e

5d 5e 5f 5g 4a 4b 4c 4d

6c 6d 6e 6f 6g 5a 5b 5c

7b 7c 7d 7e 7f 7g 6a 6b

8a 8b 8c 8d 8e 8f 8g 7a

0 9a 9b 9c 9d 9e 9f 9g

– fifteen characters in fourteen octets:

– bits number:

7 6 5 4 3 2 1 0

2g 1a 1b 1c 1d 1e 1f 1g

3f 3g 2a 2b 2c 2d 2e 2f

4e 4f 4g 3a 3b 3c 3d 3e

5d 5e 5f 5g 4a 4b 4c 4d

6c 6d 6e 6f 6g 5a 5b 5c

7b 7c 7d 7e 7f 7g 6a 6b

8a 8b 8c 8d 8e 8f 8g 7a

10g 9a 9b 9c 9d 9e 9f 9g

11f 11g 10a 10b 10c 10d 10e 10f

12e 12f 12g 11a 11b 11c 11d 11e

13d 13e 13f 13g 12a 12b 12c 12d

14c 14d 14e 14f 14g 13a 13b 13c

15b 15c 15d 15e 15f 15g 14a 14b

0 0 0 1 1 0 1 15a

– sixteen characters in fourteen octets:

– bits number:

7 6 5 4 3 2 1 0

2g 1a 1b 1c 1d 1e 1f 1g

3f 3g 2a 2b 2c 2d 2e 2f

4e 4f 4g 3a 3b 3c 3d 3e

5d 5e 5f 5g 4a 4b 4c 4d

6c 6d 6e 6f 6g 5a 5b 5c

7b 7c 7d 7e 7f 7g 6a 6b

8a 8b 8c 8d 8e 8f 8g 7a

10g 9a 9b 9c 9d 9e 9f 9g

11f 11g 10a 10b 10c 10d 10e 10f

12e 12f 12g 11a 11b 11c 11d 11e

13d 13e 13f 13g 12a 12b 12c 12d

14c 14d 14e 14f 14g 13a 13b 13c

15b 15c 15d 15e 15f 15g 14a 14b

16a 16b 16c 16d 16e 16f 16g 15a

The bit number zero is always transmitted first.

Therefore, in 160 octets, it is possible to pack (160×8)/7 = 182.8, that is 182 characters. The remaining 6 bits are set to zero as stated above.

Packing of 7 bit characters in USSD strings is done in the same way as for SMS (subclause 6.1.2.1). The character stream is bit padded to octet boundary with binary zeroes as shown above.

If the total number of characters to be sent equals (8n‑1) where n=1,2,3 etc. then there are 7 spare bits at the end of the message. To avoid the situation where the receiving entity mistakes 7 binary zero pad bits for the @ character, the carriage return or <CR> character (defined in subclause 6.1.1) shall be used for padding in this situation, just as for Cell Broadcast.

If <CR> is intended to be the last character and the message (including the wanted <CR>) ends on an octet boundary, then another <CR> must be added together with a padding bit 0. The receiving entity will perform the carriage return function twice, but this will not result in misoperation as the definition of <CR> in subclause 6.1.1 is identical to the definition of <CR><CR>.

The receiving entity shall remove the final <CR> character where the message ends on an octet boundary with <CR> as the last character.

Under certain circumstances, a pre-Phase 2+ MS will perform the carriage return function after displaying the last USSD character received.
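The <CR> padding rules for USSD can be illustrated with a short Python sketch. It assumes characters are handled as integer code values and that "ends on an octet boundary" means the packed characters fill the octets with no spare bits; the function names are illustrative only and are not defined by this specification.

```python
CR = 0x0D  # code value of <CR> in the default alphabet


def pack_septets(septets):
    """Pack 7-bit values into octets, starting at bit 0 of the first octet."""
    acc = 0
    for i, s in enumerate(septets):
        acc |= (s & 0x7F) << (7 * i)
    return acc.to_bytes((7 * len(septets) + 7) // 8, "little")


def ussd_pack(septets):
    """Apply the <CR> padding rules, then pack."""
    s = list(septets)
    if len(s) % 8 == 7:
        s.append(CR)   # 7 spare bits: pad with <CR> rather than zeros
    elif s and len(s) % 8 == 0 and s[-1] == CR:
        s.append(CR)   # wanted <CR> on an octet boundary: add a second one
    return pack_septets(s)


def ussd_unpack(octets):
    """Unpack, dropping a final <CR> that ends exactly on an octet boundary."""
    bits = 8 * len(octets)
    acc = int.from_bytes(octets, "little")
    s = [(acc >> (7 * i)) & 0x7F for i in range(bits // 7)]
    if s and bits % 7 == 0 and s[-1] == CR:
        s.pop()
    return s
```

Under these assumptions a 7-character message is sent as 7 octets whose last octet carries <CR> in bits 7 to 1, and the receiver recovers the original 7 characters. A message of 8 characters ending in a wanted <CR> is sent as 8 octets and is received with two <CR> characters at the end, as described above.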

6.2 Alphabet tables

This section provides tables for all the alphabets to be supported by SMS. The default alphabet is mandatory. Additional alphabets are optional. Irrespective of support of an individual alphabet, an MS shall have the ability to store a short message coded in any alphabet on the SIM.

6.2.1 Default alphabet

Bits per character: 7

SMS User Data Length meaning: Number of characters

CBS/USSD pad character: CR

Character table:

               b7   0    0    0    0    1    1    1    1
               b6   0    0    1    1    0    0    1    1
               b5   0    1    0    1    0    1    0    1
b4 b3 b2 b1         0    1    2    3    4    5    6    7

0  0  0  0     0    @    Δ    SP   0    ¡    P    ¿    p
0  0  0  1     1    £    _    !    1    A    Q    a    q
0  0  1  0     2    $    Φ    "    2    B    R    b    r
0  0  1  1     3    ¥    Γ    #    3    C    S    c    s
0  1  0  0     4    è    Λ    ¤    4    D    T    d    t
0  1  0  1     5    é    Ω    %    5    E    U    e    u
0  1  1  0     6    ù    Π    &    6    F    V    f    v
0  1  1  1     7    ì    Ψ    '    7    G    W    g    w
1  0  0  0     8    ò    Σ    (    8    H    X    h    x
1  0  0  1     9    Ç    Θ    )    9    I    Y    i    y
1  0  1  0     10   LF   Ξ    *    :    J    Z    j    z
1  0  1  1     11   Ø    1)   +    ;    K    Ä    k    ä
1  1  0  0     12   ø    Æ    ,    <    L    Ö    l    ö
1  1  0  1     13   CR   æ    -    =    M    Ñ    m    ñ
1  1  1  0     14   Å    ß    .    >    N    Ü    n    ü
1  1  1  1     15   å    É    /    ?    O    §    o    à

1) This code is an escape to an extension of the 7 bit default alphabet table. A receiving entity which does not understand the meaning of this escape mechanism shall display it as a space character.

6.2.1.1 GSM 7bit default alphabet extension table

               b7   0    0    0    0    1    1    1    1
               b6   0    0    1    1    0    0    1    1
               b5   0    1    0    1    0    1    0    1
b4 b3 b2 b1         0    1    2    3    4    5    6    7

0  0  0  0     0                        |
0  0  0  1     1
0  0  1  0     2
0  0  1  1     3
0  1  0  0     4         ^
0  1  0  1     5                                  2)
0  1  1  0     6
0  1  1  1     7
1  0  0  0     8              {
1  0  0  1     9              }
1  0  1  0     10   3)
1  0  1  1     11        1)
1  1  0  0     12                  [
1  1  0  1     13                  ~
1  1  1  0     14                  ]
1  1  1  1     15             \

If an MS receives a code for which no symbol is defined in the above table, the MS shall display the character shown in the main default 7 bit alphabet table in section 6.2.1.

1) This code value is reserved for the extension to another extension table. On receipt of this code, a receiving entity shall display a space until another extension table is defined.

2) This code represents the EURO currency symbol. The code value is that used for the character ‘e’. Therefore a receiving entity which is incapable of displaying the EURO currency symbol will display the character ‘e’ instead.

3) This code is defined as a Page Break character and may be used for example in compressed CBS messages. Any mobile which does not understand the 7 bit default alphabet table extension mechanism will treat this character as Line Feed.

6.2.2 8 bit data

8 bit data is user defined

SMS User Data Length meaning: Number of octets

Padding: CR in the case of an 8 bit character set; otherwise user defined

Character table: User Specific

6.2.3 UCS2

Bits per character: 16

SMS User Data Length meaning: Number of octets

CBS/USSD pad character: CR

Character table: ISO/IEC 10646 [10]
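Because UCS2 uses 16 bits per character, the SMS User Data Length counts octets rather than characters. A minimal Python sketch follows; it assumes the most significant octet of each character is sent first (big-endian order), which is not stated in this section, and the function name is illustrative only.

```python
def ucs2_user_data(text):
    """Return (octets, user_data_length) for a UCS2-coded message."""
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("character outside the 16-bit UCS2 range")
    data = text.encode("utf-16-be")  # assumed octet order: high octet first
    return data, len(data)
```

For example, `ucs2_user_data("Hi")` returns four octets (00 48 00 69) and a User Data Length of 4.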

Annex A (Informative):
Document change history

SMG#  TDoc    SPEC   VERS   NEW_VERS  CR    REV  PHASE  CAT  WORKITEM              SUBJECT
s25   096/98  03.38  5.6.0  6.0.0     A015       R97         SIM toolkit security  Class 2 SIM Data download message handling
s26   291/98  03.38  6.0.0  7.0.0     A016       R98         TEI                   7 bit default alphabet extensions
s28   99-061  03.38  7.0.0  7.1.0     A017       R98         TEI                   changes for CBS 8 bit data and CBS compression
s29   99-482  03.38  7.1.0  7.2.0     A018       R98    C    MExE R98              Data Coding Scheme for WAP over USSD and CB