Thai Industrial Standard 620-2533

Thai Industrial Standard 620-2533, commonly referred to as TIS-620, is the most common character set and character encoding for the Thai language. The standard is published by the Thai Industrial Standards Institute (TISI), an organ of the Ministry of Industry under the Royal Thai Government, and is the sole official standard for encoding Thai in Thailand. The descriptive name of the standard is "Standard for Thai Character Codes for Computers" (Thai: รหัสสำหรับอักขระไทยที่ใช้กับคอมพิวเตอร์). "2533" refers to year 2533 of the Buddhist Era (1990), the year the present version of the standard was published; a previous revision, TIS 620-2529 (1986), is now obsolete.

"TIS-620" is also the IANA preferred charset name for the standard, and the same charset name is used for ISO/IEC 8859-11 (which adds a no-break space character at 0xA0, a position left unassigned in TIS-620). When the IANA name is used, the codes are supplemented with the C0 and C1 control codes from ISO/IEC 6429.

Structure

TIS-620 is a conventionally structured Extended ASCII national character set that retains full compatibility with 7-bit ASCII and uses the 8-bit range hex A1 to FB for encoding the Thai alphabet. Due to the complex combining nature of Thai vowels and diacritics, TIS-620 is intended for information interchange only, and an additional display engine is required to compose characters correctly.
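
As a rough illustration of this layout, the following sketch uses Python's built-in tis_620 codec (an assumption about the reader's environment; the standard itself prescribes no implementation). It checks that ASCII text is encoded byte for byte unchanged and that Thai characters become single bytes in the range hex A1 to FB. The helper name encode_tis620 is hypothetical.

```python
# Sketch only: relies on the "tis_620" codec shipped with CPython.

def encode_tis620(text: str) -> bytes:
    """Encode text to TIS-620 bytes using Python's standard codec."""
    return text.encode("tis_620")

# ASCII characters keep their 7-bit values.
ascii_bytes = encode_tis620("Hello")

# U+0E01 and U+0E5B are the first and last Thai characters in the code page.
thai_bytes = encode_tis620("\u0e01\u0e5b")

# Every Thai character should land in the 8-bit range 0xA1-0xFB.
thai_in_range = all(0xA1 <= b <= 0xFB for b in thai_bytes)

print(ascii_bytes, thai_bytes.hex(), thai_in_range)
```

The codec only maps bytes to characters; composing vowels and diacritics for display is still left to the rendering layer, as noted above.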

Variants

A nearly identical version of TIS-620 was adopted as ISO/IEC 8859-11 in 2001, the sole difference being that ISO/IEC 8859-11 defines hex A0 as a non-breaking space, while TIS-620 leaves it undefined but reserved. (In practice, this small distinction is usually ignored.)

The ISO/IEC 8859-11 set has also been registered as ISO-IR-166 by Ecma International, but this variation adds explicit escape codes for signaling the beginning and end of Thai character sequences.

The TIS-620 character set ordering has been used essentially as is within Unicode (ISO/IEC 10646) as well. Unicode's Thai block is U+0E00 through U+0E7F, and a TIS-620 Thai byte can be converted to its Unicode code point (and hence to a single UTF-16 code unit) by subtracting hex A0 from the byte and prefixing the result with 0E. For example, byte A1 becomes U+0E01 and byte FB becomes U+0E5B.
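
The offset rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference decoder. The function names are hypothetical, bytes 00 to 7F are assumed to pass through as ASCII (including control codes), and bytes that TIS-620 leaves unassigned are rejected.

```python
# Sketch of the TIS-620 -> Unicode offset rule: code point = 0x0E00 + (byte - 0xA0).

def tis620_byte_to_codepoint(byte: int) -> int:
    """Map one TIS-620 byte to a Unicode code point."""
    if 0x00 <= byte <= 0x7F:
        # Assumption: the 7-bit half is treated as plain ASCII.
        return byte
    if 0xA1 <= byte <= 0xDA or 0xDF <= byte <= 0xFB:
        # Thai characters: subtract 0xA0 and place the result in the 0E00 block.
        return 0x0E00 + (byte - 0xA0)
    raise ValueError(f"byte 0x{byte:02X} is not assigned in TIS-620")

def decode_tis620(data: bytes) -> str:
    """Decode a TIS-620 byte string one byte at a time."""
    return "".join(chr(tis620_byte_to_codepoint(b)) for b in data)

# Six bytes spelling the Thai greeting "sawatdi".
greeting = decode_tis620(b"\xca\xc7\xd1\xca\xb4\xd5")
print(greeting)
```

Because every Thai code point lies in the Basic Multilingual Plane, each decoded character occupies exactly one UTF-16 code unit.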

Code page layout

TIS-620
Each cell gives the character, its Unicode code point (hex), and its decimal byte value.

    _0    _1    _2    _3    _4    _5    _6    _7    _8    _9    _A    _B    _C    _D    _E    _F

0_

1_

2_  SP    !     "     #     $     %     &     '     (     )     *     +     ,     -     .     /
    0020  0021  0022  0023  0024  0025  0026  0027  0028  0029  002A  002B  002C  002D  002E  002F
    32    33    34    35    36    37    38    39    40    41    42    43    44    45    46    47

3_  0     1     2     3     4     5     6     7     8     9     :     ;     <     =     >     ?
    0030  0031  0032  0033  0034  0035  0036  0037  0038  0039  003A  003B  003C  003D  003E  003F
    48    49    50    51    52    53    54    55    56    57    58    59    60    61    62    63

4_  @     A     B     C     D     E     F     G     H     I     J     K     L     M     N     O
    0040  0041  0042  0043  0044  0045  0046  0047  0048  0049  004A  004B  004C  004D  004E  004F
    64    65    66    67    68    69    70    71    72    73    74    75    76    77    78    79

5_  P     Q     R     S     T     U     V     W     X     Y     Z     [     \     ]     ^     _
    0050  0051  0052  0053  0054  0055  0056  0057  0058  0059  005A  005B  005C  005D  005E  005F
    80    81    82    83    84    85    86    87    88    89    90    91    92    93    94    95

6_  `     a     b     c     d     e     f     g     h     i     j     k     l     m     n     o
    0060  0061  0062  0063  0064  0065  0066  0067  0068  0069  006A  006B  006C  006D  006E  006F
    96    97    98    99    100   101   102   103   104   105   106   107   108   109   110   111

7_  p     q     r     s     t     u     v     w     x     y     z     {     |     }     ~
    0070  0071  0072  0073  0074  0075  0076  0077  0078  0079  007A  007B  007C  007D  007E
    112   113   114   115   116   117   118   119   120   121   122   123   124   125   126

8_

9_

A_        ก     ข     ฃ     ค     ฅ     ฆ     ง     จ     ฉ     ช     ซ     ฌ     ญ     ฎ     ฏ
          0E01  0E02  0E03  0E04  0E05  0E06  0E07  0E08  0E09  0E0A  0E0B  0E0C  0E0D  0E0E  0E0F
          161   162   163   164   165   166   167   168   169   170   171   172   173   174   175

B_  ฐ     ฑ     ฒ     ณ     ด     ต     ถ     ท     ธ     น     บ     ป     ผ     ฝ     พ     ฟ
    0E10  0E11  0E12  0E13  0E14  0E15  0E16  0E17  0E18  0E19  0E1A  0E1B  0E1C  0E1D  0E1E  0E1F
    176   177   178   179   180   181   182   183   184   185   186   187   188   189   190   191

C_  ภ     ม     ย     ร     ฤ     ล     ฦ     ว     ศ     ษ     ส     ห     ฬ     อ     ฮ     ฯ
    0E20  0E21  0E22  0E23  0E24  0E25  0E26  0E27  0E28  0E29  0E2A  0E2B  0E2C  0E2D  0E2E  0E2F
    192   193   194   195   196   197   198   199   200   201   202   203   204   205   206   207

D_  ะ     ◌ั     า     ำ     ◌ิ     ◌ี     ◌ึ     ◌ื     ◌ุ     ◌ู     ◌ฺ                             ฿
    0E30  0E31  0E32  0E33  0E34  0E35  0E36  0E37  0E38  0E39  0E3A                          0E3F
    208   209   210   211   212   213   214   215   216   217   218                           223

E_  เ     แ     โ     ใ     ไ     ๅ     ๆ     ◌็     ◌่     ◌้     ◌๊     ◌๋     ◌์     ◌ํ     ◌๎     ๏
    0E40  0E41  0E42  0E43  0E44  0E45  0E46  0E47  0E48  0E49  0E4A  0E4B  0E4C  0E4D  0E4E  0E4F
    224   225   226   227   228   229   230   231   232   233   234   235   236   237   238   239

F_  ๐     ๑     ๒     ๓     ๔     ๕     ๖     ๗     ๘     ๙     ๚     ๛
    0E50  0E51  0E52  0E53  0E54  0E55  0E56  0E57  0E58  0E59  0E5A  0E5B
    240   241   242   243   244   245   246   247   248   249   250   251

In the table above, 20 is the regular SPACE character. Code values 00-1F, 7F, 80-9F, A0, DB-DE and FC-FF are not assigned to characters by TIS-620.
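
The unassigned ranges listed above lend themselves to a simple validity check. The sketch below is one possible way to flag bytes that TIS-620 does not assign; the names UNASSIGNED and find_unassigned are hypothetical. Note that a decoder following the IANA charset definition would accept the C0 and C1 control codes, so this check reflects the bare standard only.

```python
from typing import List

# Code values not assigned by TIS-620: 00-1F, 7F, 80-9F, A0, DB-DE, FC-FF.
UNASSIGNED = (
    set(range(0x00, 0x20))
    | {0x7F}
    | set(range(0x80, 0xA0))
    | {0xA0}
    | set(range(0xDB, 0xDF))
    | set(range(0xFC, 0x100))
)

def find_unassigned(data: bytes) -> List[int]:
    """Return the offsets of bytes that TIS-620 leaves unassigned."""
    return [i for i, b in enumerate(data) if b in UNASSIGNED]

# Offsets 1, 3 and 5 hold A0, DB and FF, none of which is assigned.
offsets = find_unassigned(b"A\xa0\xa1\xdb\xdf\xff")
print(offsets)
```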

Code values D1, D4-DA, E7-EE are combining characters.
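
To make the combining set concrete, the following sketch marks those code values and counts the remaining (non-combining) bytes in a string, a rough proxy for how many base characters a display engine has to lay out. It is not a rendering algorithm, and the names COMBINING, base_character_count and combining_per_unicode are hypothetical. The last function cross-checks the set against Unicode's nonspacing-mark category (Mn) via Python's unicodedata module, using the A0 offset mapping to the Thai block.

```python
import unicodedata

# Combining code values in TIS-620: D1, D4-DA, E7-EE.
COMBINING = {0xD1} | set(range(0xD4, 0xDB)) | set(range(0xE7, 0xEF))

def is_combining(byte: int) -> bool:
    """True if the TIS-620 byte is a combining vowel or diacritic."""
    return byte in COMBINING

def base_character_count(data: bytes) -> int:
    """Count bytes that are not combining marks."""
    return sum(1 for b in data if not is_combining(b))

def combining_per_unicode() -> set:
    """Bytes A1-FB whose Unicode counterpart is a nonspacing mark (Mn)."""
    return {
        b for b in range(0xA1, 0xFC)
        if unicodedata.category(chr(0x0E00 + b - 0xA0)) == "Mn"
    }

# Six bytes, two of which (D1 and D5) are combining marks.
count = base_character_count(b"\xca\xc7\xd1\xca\xb4\xd5")
print(count, combining_per_unicode() == COMBINING)
```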

This article is issued from Wikipedia - version of the 11/18/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.