The Keyboard Layouts and Input Method of the Thai Language

Thaweesak Koanantakool, Ph.D

Information Processing Institute for Education and Development
Thammasat University
Bangkok 10200, Thailand


Published in "Proceedings of the Symposium on Natural Language Processing in Thailand 1993"


Internet Editor : Chularat Tanprasert

Homepage Developer : Pojaporn Pinrod




1. Abstract

The computer keyboard inherits its user interface from the typewriter, and the Thai keyboard layout is no exception: it closely follows the most popular layout of the Thai typewriter. This paper gives an overview of the development of the Thai keyboard layout on typewriters. It then discusses the development of the Thai Industrial Standard for Layout of Thai Character Keys on Computer Keyboards (TIS 820-2531) and the associated input method (WTT 2.0 draft standard). Further development trends of the keyboard layout (draft TIS 820-2536) and some research directions towards improved data entry of Thai script are given.

2. The Thai Typewriter

According to Sarit Pattajoti (1966), the first English typewriters imported for use in Thailand were those made by Hammond and Remington. Not until 1891 was the first Thai typewriter invented by Edwin Hunter, the second son of Samuel Gamble Macfarland, an American missionary despatched to Bangkok. Edwin Hunter was born in Bangkok in 1866 and later served in the Ministry of Education. He modified a typewriter made by the Smith Premier Company while on vacation in the USA in 1891.
Edwin died five years after his invention, leaving his rights to the typewriter to his younger brother, Dr. George Bradley Macfarland (อำมาตย์เอก พระอาจวิทยาคม). The first lot of Thai typewriters was shipped to Thailand in 1896.
The first keyboard layout consisted of seven rows of keys, totalling 84 keys. This type of layout is now called a double keyboard because it has twice as many keys as a modern typewriter, where each key represents two symbols: one as lower-case and another as upper-case. One such typewriter is on show at the Olympia (Thai) Typewriter museum in Bangkok. After the invention of the shift-key, as used in contemporary typewriters, the double keyboard went out of fashion because it was unsuitable for touch-typing (i.e., it had a poor user interface). No information is available on how the Thai characters were laid out on Edwin's original double-keyboard typewriter.
The four-row typewriter for the Thai language was subsequently developed jointly by the Smith Premier Store (นายวิริยะ ณ ศีลวันต์) and Mr. Plueng Suthikham (อาจารย์เปลื้อง สุทธิคำ), a teacher at the Bangkok Christian School, at the request of Dr. George Macfarland. This type of keyboard, invented in the 1920s, was the basis of the contemporary layout, with perhaps only minor variations.

3. The Ketmanee Layout - แป้นแบบเกษมณี

There is no clear evidence of the origin of Ketmanee except in Chawalit (1970), where it was stated that the layout was named after its designer, Suwanprasert Ketmanee (สุวรรณประเสริฐ เกษมณี). This layout had always been called the traditional layout standard because it has been used since the early days of Thai typewriting. The name Ketmanee was adopted by the Ketmanee supporters after Pattajoti promoted his new layout. The Ketmanee layout consists of a minimum of 42 keys (typically 46 keys for English compatibility). The layout, as illustrated in Figure 1, consists of four rows of character keys (not including the space bar). The top row is used mainly for numerals. The other three rows consist of a mixture of consonants, vowels, tonemarks and punctuation marks.


Figure 1. The Ketmanee Layout


4. Pattajoti's Research - ผลงานวิจัยของสฤษดิ์ ปัตตะโชติ

Sarit Pattajoti (1966), then the Chief of the Photographic and Printing Section of the Royal Irrigation Department, made a statistical analysis of Thai texts with the aim of improving the Ketmanee layout. Pattajoti took a typical distribution of Thai characters from 50 sample texts (1,000 characters each) and performed a statistical analysis of the finger-load distribution of the Ketmanee layout.
The details of Pattajoti's statistics, as applied to the Ketmanee layout, are summarised in Table 1 below. Pattajoti rightly concluded that typists on the Ketmanee layout would find that their right little fingers tired easily.

Table 1. Typing load distribution for the Ketmanee layout.
           LEFT HAND                                    RIGHT HAND
           little   ring     middle   index    thumb    thumb    index    middle   ring     little
normal     .00128   .00140   .00270   .00799   -        -        .03022   .00612   .00516   .00722
shifted    .00796   .05518   .06948   .15984   -        -        .18198   .17164   .11056   .18184
TOTAL      .00924   .05658   .07218   .16783   (space bar)       .21220   .17776   .11572   .18870
TOTAL      .30583                                       .69438
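
The hand totals in the last row of Table 1 follow directly from the per-finger totals. The short Python sketch below redoes that arithmetic; the dictionary layout and the function name are illustrative choices made here and do not come from Pattajoti's report.

```python
# Per-finger load for the Ketmanee layout, copied from the per-finger
# TOTAL row of Table 1. The structure and names are illustrative only.
KETMANEE_LOAD = {
    "left": {"little": 0.00924, "ring": 0.05658, "middle": 0.07218, "index": 0.16783},
    "right": {"index": 0.21220, "middle": 0.17776, "ring": 0.11572, "little": 0.18870},
}


def hand_totals(load):
    """Sum the per-finger loads of each hand."""
    return {hand: sum(fingers.values()) for hand, fingers in load.items()}


totals = hand_totals(KETMANEE_LOAD)
print(round(totals["left"], 5), round(totals["right"], 5))

# How much more work the right little finger does than the left one.
ratio = KETMANEE_LOAD["right"]["little"] / KETMANEE_LOAD["left"]["little"]
print(round(ratio, 1))
```

The two sums agree with the .30583 and .69438 printed in the table, and the ratio shows the right little finger carrying roughly twenty times the load of the left one.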

5. The Pattajoti Layout

The goals of the new keyboard layout attempted by Pattajoti are:
Pattajoti used the statistics of Thai keystroke distributions to design a new keyboard layout using the following principles: (i) put the most often used symbols on the home row, with priority given to positions towards the center, (ii) place all other symbols except numerals on the two adjacent rows, and (iii) place the numerals on the top row. He then verified that the design balanced the total load at 47%:53% (left : right), taking into account the fact that the left hand has to perform the carriage return in addition to the typing.
The outcome of the research on the new layout is shown in Figure 2. Note that there are two versions of the Pattajoti layout: in the original design, the positions of the symbols sara-i ( อิ ) and mai-tho ( อ้ ) were reversed from those shown in Figure 2, which is the final version. The statistics of the finger-load distribution are shown in Table 2 and Figure 3 below.


Figure 2. The Pattajoti Layout.


Table 2. Finger-load distribution for the Ketmanee and Pattajoti layouts.
                   LEFT HAND                                    RIGHT HAND
                   little   ring     middle   index    thumb    thumb    index    middle   ring     little
Ketmanee           .00924   .05658   .07218   .16783   (space bar)       .21220   .17776   .11572   .18870
Pattajoti          .05200   .06900   .11200   .23300   (space bar)       .24500   .13500   .08400   .06870
TOTAL (Pattajoti)  .4600                                        .53270




Figure 3. Finger load comparison between Ketmanee and Pattajoti.

From Figure 3, it is obvious that the load distribution of Pattajoti (light-grey bars) puts the highest load on the index fingers, with monotonically decreasing loads towards the little fingers. Ketmanee, on the other hand, has an unreasonably high load on the right little finger.
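
The observation about loads decreasing from the index finger towards the little finger can be checked directly against the figures in Table 2. The following is a minimal Python sketch; the data structure and the function name are illustrative and not taken from the original study.

```python
# Per-finger loads from Table 2, listed for each hand from the index
# finger outwards to the little finger.
LOADS = {
    "Ketmanee": {
        "left": [0.16783, 0.07218, 0.05658, 0.00924],
        "right": [0.21220, 0.17776, 0.11572, 0.18870],
    },
    "Pattajoti": {
        "left": [0.23300, 0.11200, 0.06900, 0.05200],
        "right": [0.24500, 0.13500, 0.08400, 0.06870],
    },
}


def decreases_outwards(loads):
    """True if every finger carries less load than its neighbour nearer the index finger."""
    return all(inner > outer for inner, outer in zip(loads, loads[1:]))


for layout, hands in LOADS.items():
    for hand, loads in hands.items():
        print(layout, hand, decreases_outwards(loads))
```

With the Table 2 values, the only hand that fails the check is the right hand under Ketmanee, where the little finger carries more load than the ring finger.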

6. Other layouts

Only two post-Pattajoti research papers exist: Borwornwit (1977) and Thanakan (1986). Borwornwit, a student of Pattajoti, proposed a complete Thai-English (dual-language) typewriter design using the Pattajoti layout. Thanakan proposed a modified Pattajoti layout in which the numeric keys run in sequence and coincide between the Thai and English layouts. No performance improvement was claimed against the Pattajoti layout. The adaptation was apparently a logical one, considering that Pattajoti's layout places the numerals on the top row as "2 3 4 5 อู 7 8 9 0 1 6". The placement of "1 6" after the "0" was illogical, but was probably due to mechanical constraints. Thanakan's layout placed the "1" where it should be, before "2", using an additional key. The repositioning of "6" between "5" and "7" caused the key [sara-u sara-uu] to be moved to the position originally assigned to "6" in Figure 2. Thanakan claimed that his improvement helped dual-language typists.
The implementation of Thai mechanical typewriters relied heavily on the so-called dead-key mechanism, which causes the carriage not to advance when a dead-key is typed. Dead-keys are those associated with the Thai characters which must be placed above or below the previous base-line character. In the Ketmanee layout, the dead-keys have to be placed next to one another so that they all occupy consecutive positions on the type-bar assembly. In the Pattajoti layout, dead-keys are spread into two groups: one at the left and one in the middle of the keyboard. This includes one dead-key that sits between the keys for "5" and "7". Thus "6" was moved to the rightmost position on the top row.

7. Implementation of the Pattajoti Standard

The Pattajoti layout was made the official standard for Thai typewriters by the Thai Cabinet upon a proposal by the National Research Council. The standard was enforced in several ways: a training school was set up at the Royal Irrigation Department; typewriter manufacturers were encouraged to produce the new typewriters; and government institutions were urged to use the Pattajoti layout.
The firmly installed base of the old Ketmanee layout and people's resistance to change were the main causes of a double standard in the country, where the private sector still preferred the traditional layout. In the end, a critical mass of Pattajoti users was never reached and the layout faded from the market, leaving a few users who are still keen on Pattajoti keyboards even today. According to some Pattajoti advocates, typing on the Pattajoti layout is easier than on Ketmanee. The placement of numerals in lower case (no shift), despite the mispositioned "6", is valuable for office use, where typing numbers together with Thai text is common. Entering large amounts of numeric data through the Ketmanee layout is laborious.
Pattajoti's layout suffered a final blow when a political disagreement arose between some politicians and the National Research Council. A member of parliament (Chawalit (1970)) argued that the cost of 400 baht for modifying an existing typewriter was too high. A counter-argument was that the MP was the owner of a typing school with hundreds of typewriters, and thus faced high modification expenses. The dispute was never resolved.
After the student unrest of 14 October 1973, the cabinet's order for the use of the Pattajoti layout in all governmental procurements seemed to be ignored for good.
The failure to make Pattajoti a de facto standard probably had several causes. One crucial reason is the mechanical constraints of the typewriter, which limit it to a single layout: users have no choice of keyboard layout once they have purchased the machine, so the choice must be made before purchasing. Another strong reason is probably that touch typing on the Pattajoti layout is only marginally faster (about 27%), with only 8.5% less finger movement, at the cost of more frequent use of the shift keys (three additional shifts for every 100 characters typed) when compared to Ketmanee. (Note that if the tests included a higher percentage of numeric typing, the Pattajoti layout would certainly require less shifting than the Ketmanee layout.) This gain in performance was not sufficient to make the new layout popular on its own merits.

8. TISI Standards for Computer Keyboards

In 1986, the Thai Industrial Standards Institute (TISI) announced TIS 620-2529, the Thai standard character code for computers. Two years later, TISI announced the Ketmanee layout as the standard layout for computers (TIS 820-2531).
At the time of this article (1993), the Thai computer industry does produce keytops that follow this industrial standard, but they are not fully compliant with it. Some manufacturers even make mistakes with respect to the classic Ketmanee layout. An example can be seen on a Chicony keyboard at the position corresponding to the letter Y (on the QWERTY layout). When TIS 820 is implemented on an extended keyboard (known as the 101-key keyboard for PCs), most manufacturers assign the undefined keys according to their own choice. Most users find these extensions inconsistent with one another, which is another cause of confusion.
In 1990 and 1991, the Thai API Consortium (TAPIC), a group of computer experts in Thailand from the private sector and universities, explored the possibility of setting up a common specification for handling the Thai language on computers, called the WTT 2.0 specifications (see Thaweesak et al (1991)). One of the many results was a proposal to extend TIS 820-2531 to cover the undefined positions that are usually available on most computer keyboards. The proposed extensions are shown shaded in Figure 4.
TAPIC was sponsored by the National Electronics and Computer Technology Center (NECTEC) and was later appointed by TISI to be its sub-committee on Thai character set and software (TC536/SC2). The WTT 2.0 specifications were then submitted to TISI as a draft proposed standard.




Figure 4. TIS 820-2531
(shaded keys are WTT 2.0 extensions for TIS 820-2536)

9. The Thai Input Method

In addition to the Thai Keyboard Layout standard draft proposal, WTT 2.0 also defined the Thai Input Method to be used with the keyboard. The specifications (draft WTT 2.0, part 2) classify the Thai characters into six classes: (1) control characters, (2) consonants, (3) vowels, (4) tonemarks, (5) diacritics, and (6) non-composibles. The six classes are further divided into a total of 17 subclasses (Thaweesak et al (1991)), as shown in Table 3 below.


Table 3. WTT 2.0 classification of Thai characters in TIS 620-2533.
1. CTRL: control characters (the ASCII control characters and delete);
2. NON: non-composible characters (all English letters; TIS 620-2533 punctuation marks such as paiyannoi ( ฯ ), baht sign ( ฿ ), maiyamok ( ๆ ), khomut ( ๛ ), fongman ( ๏ ) and angkhankhu ( ๚ ); and the numerals ( ๐ ๑ ๒ ๓ ๔ ๕ ๖ ๗ ๘ ๙ ));
3. CONS: Thai consonants, consisting of the forty-four consonants of the Thai alphabet;
4. LV: leading vowels, consisting of five symbols: เ แ โ ใ ไ ;
5. FV1: following vowels type 1, consisting of อะ อา อำ ;
6. FV2: following vowel type 2, lakkhangyao ( ๅ );
7. FV3: following vowels type 3, consisting of ฤ ฦ ;
8. BV1: below vowel type 1, sara-u ( อุ );
9. BV2: below vowel type 2, sara-uu ( อู );
10. BD: below diacritic (underdot), pinthu ( อฺ );
11. TONE: the four tonemarks: อ่ อ้ อ๊ อ๋ ;
12. AD1: above diacritics type 1, consisting of nikhahit ( อํ ) and thanthakhat ( อ์ );
13. AD2: above diacritic type 2, maitaikhu ( อ็ );
14. AD3: above diacritic type 3, yamakkan ( อ๎ );
15. AV1: above vowel type 1, sara-i ( อิ );
16. AV2: above vowels type 2, consisting of mai han-akat ( อั ) and sara-ue ( อึ );
17. AV3: above vowels type 3, consisting of sara-ee ( อี ) and sara-uee ( อื ).
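
The classification in Table 3 lends itself to a simple lookup table. The Python sketch below is one possible encoding and makes several assumptions that are not part of WTT 2.0: it works on Unicode code points (the Unicode Thai block follows the TIS 620 ordering) rather than on 8-bit TIS 620 codes, it treats every printable ASCII character as NON, and the names `THAI_CLASSES` and `wtt_class` are made up for this illustration.

```python
# Map Unicode code points in the Thai block to the class names of Table 3.
THAI_CLASSES = {}


def _assign(name, code_points):
    for cp in code_points:
        THAI_CLASSES[cp] = name


_assign("CONS", range(0x0E01, 0x0E2F))   # ko kai .. ho nokhuk (includes ru and lu for now)
_assign("FV3", [0x0E24, 0x0E26])         # ru and lu are reclassified as following vowels
_assign("NON", [0x0E2F, 0x0E3F, 0x0E46, 0x0E4F, 0x0E5A, 0x0E5B])  # punctuation, baht sign
_assign("NON", range(0x0E50, 0x0E5A))    # Thai digits
_assign("LV", range(0x0E40, 0x0E45))     # leading vowels
_assign("FV1", [0x0E30, 0x0E32, 0x0E33]) # sara-a, sara-aa, sara-am
_assign("FV2", [0x0E45])                 # lakkhangyao
_assign("BV1", [0x0E38])                 # sara-u
_assign("BV2", [0x0E39])                 # sara-uu
_assign("BD", [0x0E3A])                  # pinthu
_assign("TONE", range(0x0E48, 0x0E4C))   # the four tonemarks
_assign("AD1", [0x0E4C, 0x0E4D])         # thanthakhat, nikhahit
_assign("AD2", [0x0E47])                 # maitaikhu
_assign("AD3", [0x0E4E])                 # yamakkan
_assign("AV1", [0x0E34])                 # sara-i
_assign("AV2", [0x0E31, 0x0E36])         # mai han-akat, sara-ue
_assign("AV3", [0x0E35, 0x0E37])         # sara-ee, sara-uee


def wtt_class(ch):
    """Return the Table 3 class of one character, or None if it is not covered."""
    cp = ord(ch)
    if cp < 0x20 or cp == 0x7F:
        return "CTRL"
    if cp < 0x7F:
        return "NON"  # assumption: all printable ASCII is non-composible
    return THAI_CLASSES.get(cp)


print([wtt_class(ch) for ch in "A\u0e1b\u0e39\u0e48"])
```

The final line classifies the letter A followed by the three symbols of the cell ปู่ (po pla, sara-uu, mai-ek).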


TAPIC makes use of the character classification in Table 3 to define the Thai input method for a particular cell (column) of Thai text. A cell in the Thai language may be composed of one symbol (at the base level), two symbols (a base-level character and another above or below it) or three symbols (a base-level character with two above it, or with one below and one above it). A simple definition of Thai cell composition is illustrated in Figure 5a. The formal and rigorous definition of the input method for Thai is summarised in Figure 5b, with some actual samples in Figure 6. Figure 5b was derived from an analysis of all possible combinations within a cell (see Phraya Uphakit Silapasan (1918)).
Figure 5. State transition diagram for the Thai input method of an independent cell: (a) basic checking, (b) rigorous checking.


Rendition of the string A ดุ ตี ปู่ พี่ (11 bytes, 5 cells).


Figure 6. Example of Thai Text.
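
The byte and cell counts quoted for the sample string can be verified with a few lines of Python. This is only a sketch: it assumes the string contains no spaces between the cells, relies on Python's built-in `tis_620` codec for the one-byte-per-character encoding, and uses a hand-written set of the above and below characters; the names `NON_SPACING` and `count_cells` are illustrative.

```python
# The sample string: A, do dek + sara-u, to tao + sara-ee,
# po pla + sara-uu + mai-ek, pho phan + sara-ee + mai-ek.
TEXT = "A\u0e14\u0e38\u0e15\u0e35\u0e1b\u0e39\u0e48\u0e1e\u0e35\u0e48"

# Characters drawn above or below the base line; they do not start a new cell.
NON_SPACING = {0x0E31} | set(range(0x0E34, 0x0E3B)) | set(range(0x0E47, 0x0E4F))


def count_cells(text):
    """Count display cells: every character on the base line opens one cell."""
    return sum(1 for ch in text if ord(ch) not in NON_SPACING)


n_bytes = len(TEXT.encode("tis_620"))
n_cells = count_cells(TEXT)
print(n_bytes, n_cells)  # 11 5
```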


TAPIC's WTT 2.0 specifications for the input method also describe an easily implementable method for detecting violations of the input sequence. It is expected that all Thai-capable computer systems will have to implement such an error filtering scheme to preserve data integrity.
The cell-oriented approach is easy to implement and capable of eliminating all illegal character combinations within a particular cell. Since it does not operate beyond one cell, there are no restrictions on entering any "spelling" of Thai words into the system. This gives users the same freedom in the input stream as the English input method. For example, any computer will allow the user to input a meaningless string of English characters like "aArrdVark9", and likewise the Thai input method allows any cell-to-cell combination. Within a cell, however, combinations such as one base-line character plus one below-vowel and one above-vowel, which violate the flow of Figure 5b, are never allowed. The only exception is probably a deliberate illustration of such a combination, for which the input checking mechanism must be turned off first!
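
To make the idea of cell-level checking concrete, here is a small Python state machine. It is a deliberately simplified sketch and not the transition table of Figure 5b: it assumes that a cell is one base-line character, optionally followed by one above or below vowel (or pinthu), optionally followed by one tonemark or above diacritic. The set names and `is_valid_cell` are invented for this example.

```python
# Assumed grouping of the above/below characters into two "slots".
FIRST_SLOT = {0x0E31, 0x0E34, 0x0E35, 0x0E36, 0x0E37,  # above vowels
              0x0E38, 0x0E39, 0x0E3A}                  # below vowels and pinthu
SECOND_SLOT = set(range(0x0E47, 0x0E4F))               # maitaikhu, tonemarks, other above diacritics


def is_valid_cell(cell):
    """Check one cell against the simplified rule: base [vowel] [mark]."""
    state = "start"
    for ch in cell:
        cp = ord(ch)
        if state == "start":
            if cp in FIRST_SLOT or cp in SECOND_SLOT:
                return False      # a cell must begin on the base line
            state = "base"
        elif state == "base":
            if cp in FIRST_SLOT:
                state = "vowel"
            elif cp in SECOND_SLOT:
                state = "mark"
            else:
                return False      # a second base-line character belongs to the next cell
        elif state == "vowel":
            if cp in SECOND_SLOT:
                state = "mark"
            else:
                return False      # e.g. a below-vowel followed by an above-vowel
        else:
            return False          # the cell is already full
    return state != "start"


print(is_valid_cell("\u0e1b\u0e39\u0e48"))  # po pla + sara-uu + mai-ek
print(is_valid_cell("\u0e01\u0e38\u0e34"))  # ko kai + sara-u + sara-i
```

Under these assumptions the first call accepts the cell ปู่, while the second rejects a base-line character carrying both a below-vowel and an above-vowel. A real implementation would replace the two sets with the 17 classes of Table 3 and the transitions of Figure 5b.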

10. Suggestions and Conclusion

In conclusion, the author has addressed the historical background of the Thai keyboard layout for typewriters. The computer industry, as later endorsed by the TIS standard, carries the usual typewriter interface over to computer users directly. With computers, it is software that defines the meaning of each key on the keyboard. It is thus easy to implement both the Ketmanee and Pattajoti layouts on a single machine and let the user select the preferred layout. The Pattajoti layout suffered political problems and was unable to gather momentum when it was initially introduced. It is unlikely that the large installed base of typists will switch again, despite the fact that Pattajoti proved to be more efficient.
This does not mean that the Thai keyboard layout is settled and final. Due to the popularity of desktop computers, there seems to be another keyboard crisis: executives cannot type! Only young people who trained themselves at school and college feel at home with the Thai keyboard. Others find the present (Ketmanee) keyboard difficult (if not impossible) to learn. Hunting and pecking on the Ketmanee layout gives a data entry speed of only some 10-15 wpm (words per minute). With an electronic keyboard and a numeric keypad, the advantage of the Pattajoti layout over the Ketmanee layout is reduced. All keystrokes require only a small force to operate. Numbers are better entered via the numeric keypad. There is no more lifting of the left hand to return the carriage. To improve the input speed significantly, perhaps we need a breakthrough (a revolution?) in the keyboard input method.
It may be possible to enhance the Thai keyboard input sequence altogether by assigning a different symbolic system to the keyboard, possibly a phonetic scheme. The desktop computer can then process the input sequence, with some assistance from the logical relations between sounds, to recreate the actual spelling of the words. There may be many other possibilities which need real study. Whatever the solution may be, the new approach cannot be made popular unless there are significant gains (say, a 200% relative improvement over Ketmanee for both beginners and experts) in input speed and ease of learning. Pen input, handwriting input and voice recognition are also possibilities. But from the author's point of view, the mechanical keyboard still holds strong promise and potential for research.
Alternative methods may involve an easily customisable Pattajoti-style layout (or its variants), or a user-defined layout. The sustained improvement of more than 27% would thus be available to everyone. Perhaps a new way of keyboard training (Chanawangsa (1993)), making use of no-nonsense drills designed from meaningful sequences, can help shorten the typing course and improve the typing speed of the average person. Typing drills can be generated by computer, sampling from real texts to satisfy an individual keyboarding practice criterion of three to seven keystrokes per word.
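
As a rough illustration of such computer-generated drills, the Python sketch below samples practice words from a text. It rests on assumptions that are not in the original proposal: the text has already been segmented into words (Thai is written without spaces between words), the keystroke count of a word is approximated by its number of characters with shift strokes ignored, and the function name `make_drill` and the sample word list are made up.

```python
import random


def make_drill(words, n_items, min_keys=3, max_keys=7, seed=None):
    """Sample n_items practice words whose length is within the keystroke range."""
    candidates = sorted({w for w in words if min_keys <= len(w) <= max_keys})
    if not candidates:
        return []
    rng = random.Random(seed)  # a fixed seed makes the drill reproducible
    return [rng.choice(candidates) for _ in range(n_items)]


# A tiny, hand-made word list standing in for a segmented real text.
sample_words = ["การ", "ไทย", "แป้น", "ภาษา", "ก็", "พิมพ์ดีด", "คอมพิวเตอร์"]
drill = make_drill(sample_words, n_items=10, seed=1)
print(" ".join(drill))
```

Words that are too short or too long for the three-to-seven keystroke criterion are filtered out before sampling, so every item in the drill satisfies it.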

11. Bibliography (All references are in Thai).

  1. Phraya Uphakit Silapasan [1918]: "Thai Grammar". (อำมาตย์เอก พระยาอุปกิตศิลปสาร "หลักภาษาไทย -- อักขรวิธี วจีภาค วากยสัมพันธ์ ฉันทลักษณ์" พ.ศ. ๒๔๖๑)
  2. Pattajoti, Sarit [1966]: "The Evolution of the Typewriter", National Research Council, The Office of the Prime Minister, 4 November 1966. (สฤษดิ์ ปัตตะโชติ "วิวัฒนาการของเครื่องพิมพ์ดีด" สภาวิจัยแห่งชาติ สำนักนายกรัฐมนตรี 4 พฤศจิกายน ๒๕๐๙)
  3. Pattajoti, Sarit [ca. 1966]: "A New Keyboard Layout System" (research final report), National Research Council, The Office of the Prime Minister, no date. (สฤษดิ์ ปัตตะโชติ "รายงานผลการวิจัยระบบการวางแป้นพิมพ์ดีดใหม่" สภาวิจัยแห่งชาติ สำนักนายกรัฐมนตรี ไม่มีวันที่พิมพ์)
  4. Pattajoti, Sarit [ca. 1967]: "Thai Typewriter Practicing Manual", The Royal Irrigation Department, no date. (สฤษดิ์ ปัตตะโชติ "ตำราพิมพ์ดีดภาษาไทย แบบ ปัตตะโชติ" อภินันทนาการจากกรมชลประทาน ไม่มีปีที่พิมพ์)
  5. Panyalak, Chawalit [1970]: "The Typewriter Keyboard", Wattayasan paritat, Vol. 21, No. 27, 20 July 1970. (ชวลิต ปัญญาลักษณ์ "แป้นอักษรพิมพ์ดีด" วิทยาสารปริทัศน์ ปีที่ ๑ ฉบับที่ ๗, ๒๐ กรกฎาคม ๒๕๑๓)
  6. Natsiri, Borwornwit [1977]: "Thai-English Typewriter", a Master's thesis on Industrial Arts, Faculty of Architecture, King Mongkut's Institute of Technology at Lad Krabang, 1977. (บวรวิชญ์ นิติสิริ "เครื่องพิมพ์ดีดไทยอังกฤษ" วิทยานิพนธ์หลักสูตรสถาปัตยกรรมศาสตร์บัณฑิต ภาควิชาศิลปอุตสาหการ คณะสถาปัตยกรรมศาสตร์ สถาบันเทคโนโลยีพระจอมเกล้า วิทยาเขตเจ้าคุณทหารลาดกระบัง ๒๕๒๐)
  7. Pattarakan, Thanakan [1986]: "A Modern Thai Keyboard", Science, Vol. 40, No. 9, September 1986. (ธนกาญจน์ ภัทรากาญจน์ "แป้นพิมพ์ดีดภาษาไทยยุคใหม่" วิทยาศาสตร์ ปีที่ ๔๐ ฉบับที่ ๙ กันยายน ๒๕๒๙)
  8. Koanantakool, Thaweesak and TAPIC [1991]: "Computers and the Thai Language", A Draft Proposal WTT 2.0 Standards, and a reprint of existing TISI standards on IT, NECTEC, October 1991. (ทวีศักดิ์ กออนันตกูล และ คณะทำงานร่างข้อกำหนดร่วมเพื่อการเขียนโปรแกรมซึ่งแสดงผลเป็นภาษาไทย "คอมพิวเตอร์กับภาษาไทย" ศูนย์เทคโนโลยีอิเล็กทรอนิกส์และคอมพิวเตอร์แห่งชาติ ตุลาคม ๒๕๓๔)
  9. Chanawangsa, Somseen [1993]: private communication. (สมศีล ฌานวังศะ การติดต่อส่วนตัว พ.ศ. ๒๕๓๖)