5 Cell Broadcast Data Coding Scheme
3GPP TS 03.38 Alphabets and language-specific information
The Cell Broadcast Data Coding Scheme indicates the intended handling of the message at the MS, the alphabet/coding, and the language (when applicable). Any reserved codings shall be assumed to be the GSM default alphabet (the same as codepoint 00001111) by a receiving entity. The octet is used according to a coding group which is indicated in bits 7..4. The octet is then coded as follows:
Coding Group Bits 7..4 | Use of bits 3..0 |
0000 | Language using the default alphabet |
Bits 3..0 indicate the language: | |
0000 German | |
0001 English | |
0010 Italian | |
0011 French | |
0100 Spanish | |
0101 Dutch | |
0110 Swedish | |
0111 Danish | |
1000 Portuguese | |
1001 Finnish | |
1010 Norwegian | |
1011 Greek | |
1100 Turkish | |
1101 Hungarian | |
1110 Polish | |
1111 Language unspecified | |
0001 | 0000 Default alphabet; message preceded by language indication. |
The first 3 characters of the message are a two-character representation of the language encoded according to ISO 639 [12], followed by a CR character. The CR character is then followed by 90 characters of text. A Pre-Phase 2+ MS will overwrite the start of the message up to the CR and present only the text. | |
0001 UCS2; message preceded by language indication. | |
The message starts with a two 7-bit default alphabet character representation of the language encoded according to ISO 639 [12]. This is padded to the octet boundary with two bits set to 0 and then followed by 40 characters of UCS2-encoded message. An MS not supporting UCS2 coding will present the two-character language identifier followed by improperly interpreted user data. | |
0010..1111 Reserved for European languages | |
0010 | 0000 Czech |
0001..1111 Reserved for European Languages using the default alphabet, with unspecified handling at the MS | |
0011 | 0000..1111 Reserved for European Languages using the default alphabet, with unspecified handling at the MS |
01xx | General Data Coding indication Bits 5..0 indicate the following: |
Bit 5, if set to 0, indicates the text is uncompressed | |
Bit 5, if set to 1, indicates the text is compressed using the GSM standard compression algorithm (see GSM TS 03.42) | |
Bit 4, if set to 0, indicates that bits 1 to 0 are reserved and have no message class meaning | |
Bit 4, if set to 1, indicates that bits 1 to 0 have a message class meaning: | |
Bit 1 Bit 0 Message Class: | |
0 0 Class 0 | |
0 1 Class 1 Default meaning: ME-specific. | |
1 0 Class 2 SIM specific message. | |
1 1 Class 3 Default meaning: TE-specific (see GSM TS 07.05 [8]) | |
Bits 3 and 2 indicate the alphabet being used, as follows: | |
Bit 3 Bit 2 Alphabet: | |
0 0 Default alphabet | |
0 1 8 bit data | |
1 0 UCS2 (16 bit) [10] | |
1 1 Reserved | |
1000..1101 | Reserved coding groups |
1110 | Defined by the WAP Forum [15] |
1111 | Data coding / message handling |
Bit 3 is reserved, set to 0. | |
Bit 2 Message coding: | |
0 Default alphabet | |
1 8 bit data | |
Bit 1 Bit 0 Message Class: | |
0 0 No message class. | |
0 1 Class 1 user defined. | |
1 0 Class 2 user defined. | |
1 1 Class 3 Default meaning: TE-specific (see GSM TS 07.05 [8]) | |
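The table above can be read as a decoding procedure. The following Python sketch shows one possible reading of it and is not a normative implementation. The function name `decode_cbs_dcs`, the dictionary keys, and the label strings (including "wap-defined") are illustrative assumptions. Reserved codings fall back to the default alphabet with no language, mirroring the rule that they are treated like codepoint 00001111.

```python
# Illustrative decoder for the Cell Broadcast Data Coding Scheme octet.
# All names (decode_cbs_dcs, the dict keys, the label strings) are hypothetical.

LANGUAGES = [
    "German", "English", "Italian", "French", "Spanish", "Dutch",
    "Swedish", "Danish", "Portuguese", "Finnish", "Norwegian", "Greek",
    "Turkish", "Hungarian", "Polish", None,  # 1111 = language unspecified
]
ALPHABETS = ["default", "8bit", "ucs2", None]  # None = reserved (bits 3..2 = 11)


def decode_cbs_dcs(octet: int) -> dict:
    if not 0 <= octet <= 0xFF:
        raise ValueError("DCS must be a single octet")
    group, low = octet >> 4, octet & 0x0F
    # Start from the fallback for reserved codings: default alphabet,
    # language unspecified, no message class.
    info = {
        "alphabet": "default", "language": None, "language_prefix": False,
        "compressed": False, "message_class": None, "reserved": False,
    }
    if group == 0b0000:
        info["language"] = LANGUAGES[low]
    elif group == 0b0001 and low in (0b0000, 0b0001):
        info["language_prefix"] = True        # language is carried in the message
        if low == 0b0001:
            info["alphabet"] = "ucs2"
    elif group == 0b0010 and low == 0b0000:
        info["language"] = "Czech"
    elif group >> 2 == 0b01:                  # 01xx: general data coding
        info["compressed"] = bool(octet & 0x20)      # bit 5
        if octet & 0x10:                              # bit 4: class bits are valid
            info["message_class"] = octet & 0x03
        alphabet = ALPHABETS[(octet >> 2) & 0x03]     # bits 3..2
        if alphabet is None:
            info["reserved"] = True
        else:
            info["alphabet"] = alphabet
    elif group == 0b1110:
        info["alphabet"] = "wap-defined"      # coding defined outside this clause
    elif group == 0b1111:
        info["alphabet"] = "8bit" if octet & 0x04 else "default"   # bit 2
        info["message_class"] = (octet & 0x03) or None             # 00 = no class
    else:
        info["reserved"] = True               # every remaining coding is reserved
    return info
```

For example, `decode_cbs_dcs(0x59)` reports uncompressed UCS2 text with message class 1, and `decode_cbs_dcs(0x90)` is flagged as reserved and handled as default alphabet.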
These codings may also be used for Unstructured SS Data and MMI/display purposes.
See GSM 04.90 [11] for specific coding values applicable to Unstructured SS Data for MS originated USSD messages and MS terminated USSD messages. USSD messages using the default alphabet are coded with the 7-bit alphabet given in subclause 6.2.1. The message can then consist of up to 182 user characters.
Cell Broadcast messages using the default alphabet are coded with the 7-bit alphabet given in subclause 6.2.1. The message then consists of 93 user characters.
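The figure of 93 characters follows from the 82-octet message size: 82 × 8 = 656 bits hold 93 septets, with 5 bits left over as padding. The sketch below illustrates this, assuming the usual GSM 7-bit packing order (the first character occupies the least significant bits of the first octet). The function name `unpack_septets` is hypothetical.

```python
def unpack_septets(data: bytes) -> list[int]:
    """Unpack GSM 7-bit packed octets into a list of septet values."""
    septets = []
    acc = 0      # bit accumulator; the oldest bits are the least significant
    nbits = 0    # number of valid bits currently held in acc
    for octet in data:
        acc |= octet << nbits
        nbits += 8
        while nbits >= 7:
            septets.append(acc & 0x7F)
            acc >>= 7
            nbits -= 7
    return septets  # any leftover bits (fewer than 7) are padding


page = bytes(82)                                     # an all-zero 82-octet message
print(len(unpack_septets(page)))                     # 93
print(unpack_septets(bytes.fromhex("E8329BFD06")))   # [104, 101, 108, 108, 111]
```

The second call unpacks the five septets that correspond to "hello" in the default alphabet.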
If the 7-bit default alphabet extension mechanism is used, the number of displayable characters is reduced by one for every instance where the 7-bit default alphabet extension table is used.
Cell Broadcast messages using 8-bit data have user-defined coding, and will be 82 octets in length.
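The effect of the extension mechanism on the displayable character count can be sketched as follows. This assumes that the escape to the extension table is the septet value 0x1B and that each escape is followed by exactly one septet taken from the extension table. The function name `displayable_count` is hypothetical.

```python
ESCAPE = 0x1B  # assumed escape to the 7-bit default alphabet extension table


def displayable_count(septets: list[int]) -> int:
    """Count displayable characters in a list of 7-bit septet values."""
    count = 0
    i = 0
    while i < len(septets):
        # An escape and the septet that follows it render as one character.
        if septets[i] == ESCAPE and i + 1 < len(septets):
            i += 2
        else:
            i += 1
        count += 1
    return count
```

A 93-septet message containing two escape sequences therefore displays 91 characters.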
UCS2 alphabet indicates that the message is coded in UCS2 [10]. The General notes specified in subclause 6.1.1 override any contrary specification in UCS2, so for example even in UCS2 a <CR> character will cause the MS to return to the beginning of the current line and overwrite any existing text with the characters which follow the <CR>. Messages encoded in UCS2 consist of 41 characters.
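The two UCS2 layouts (41 characters, or a language indication followed by 40 characters) can be illustrated with a short sketch. It assumes that the two language characters use the usual 7-bit packing order within the first two octets, that the language letters coincide with ASCII, and that UCS2 is carried most significant octet first, so it can be decoded as big-endian UTF-16. The function name `decode_ucs2_message` is hypothetical.

```python
def decode_ucs2_message(payload: bytes, language_prefixed: bool = False):
    """Return (language, text) for a UCS2-coded Cell Broadcast message."""
    language = None
    if language_prefixed:
        # Two 7-bit characters packed into 16 bits; the top two bits are padding.
        packed = payload[0] | (payload[1] << 8)
        language = chr(packed & 0x7F) + chr((packed >> 7) & 0x7F)
        payload = payload[2:]
    # Assumption: big-endian UTF-16 decoding, which matches UCS2 for
    # characters in the Basic Multilingual Plane.
    return language, payload.decode("utf-16-be")


plain = ("A" * 41).encode("utf-16-be")                          # 82 octets
prefixed = bytes([0x65, 0x37]) + ("A" * 40).encode("utf-16-be") # "en" + 40 characters
print(len(plain), len(prefixed))                                # 82 82
print(decode_ucs2_message(prefixed, language_prefixed=True)[0]) # en
```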
Class 1 and Class 2 messages may be routed by the ME to user-defined destinations, but the user may override any default meaning and select their own routing.
Class 3 messages will normally be selected for transfer to a TE, in cases where a ME supports an SMS/CBS interface to a TE, and the TE requests "TE-specific" cell broadcast messages (see GSM 07.05 [8]). The user may be able to override the default meaning and select their own routing.
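The routing behaviour described in the last two paragraphs might be modelled as below. This is only a sketch: the function name `route_message`, the destination labels, and the choice of "ME" as the fallback destination are assumptions, not part of the specification.

```python
from typing import Optional


def route_message(message_class: Optional[int],
                  te_requested: bool = False,
                  user_routing: Optional[dict] = None) -> str:
    """Pick a destination for a message with the given class (hypothetical)."""
    user_routing = user_routing or {}
    # A user-selected routing overrides any default meaning.
    if message_class in user_routing:
        return user_routing[message_class]
    # Class 3 goes to the TE only when the TE has requested such messages.
    if message_class == 3 and te_requested:
        return "TE"
    return "ME"  # assumed fallback: handle the message in the ME
```

For instance, `route_message(3, te_requested=True)` yields "TE", while `route_message(3, te_requested=True, user_routing={3: "ME"})` keeps the message in the ME.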