Tamil Unicode By Sinnathurai Srivas Avarangal. Tamil Unicode pages. Author: Sinnathurai Srivas.
Tamil Unicode
Request For Suggestion 4000: Tamil Unicode Definitions
Request For Suggestion 5000: Sequence of Input Method for diacritics.
JPG graphic images of major Tamil characters/glyphs
Win2000 Keyboard Layout
Test for correct display of Tamil Characters
TabAvarangal2, TSC_Avarangal and AvarangalUni have now been released. They cover Tamil Unicode, the TAB encoding recommended by the Tamil Nadu government (TabAvarangal), and the Avarangal aytham letters, which continue to be included.
If you have not already installed the latest version of the TabAvarangal2 font, download and install it to read further. The TabAvarangal2 and TSC_Avarangal fonts now include a test version of a Tamil Unicode font as well as the normal recommended 8-bit versions. Tab and TSC also continue to support the additional Tamil diacritics used for advanced purposes.
____________________________________________________
This is a list that I have drawn up. Could others please add to it and comment? sisrivas@hotmail.com Sinnathurai Srivas.
0.0.1 Displaying and Printing of Rendered Tamil Characters (Normal usage) to be supported.
0.0.2 Displaying and Printing of Linear Tamil Characters (un-rendered) to be supported via secondary fonts. Hence the OS should allow unrendered display and printing, if a font is configured to do so.
0.1 Tamil Diacritics
(To be defined)
0.2 Tamil based International Diacritics (for common use)
(To be defined)
0.3 International Diacritics
(Usage for Tamil to be defined) (Consider placing diacritics in front of alphabets and rear of alphabets and their usage for various languages too)
1.0 Tamil: Independent Vowels
1.1 Tamil: Leading Vowels
(These characters are already defined. (Note: Both Independent and Leading vowels are denoted by same character.))
2.0 Tamil: Trailing Vowels
(Note: not the same as lowercase)
2.0.1 Already defined: Trailing Vowels (aa, i, ii, u, uu, e, ee, o, oo, ai, ou, au)
2.0.2 Partially Defined: Trailing vowels (from above: u, uu, e, ee, o, oo)
2.0.3 To Define: Trailing vowel
2.0.3.1 Manathil "a" ("a" in mind as inherent descriptor)
(Real slot allocation but represents the imaginary/inherent "a" for descriptive displays)
2.0.3.2 The " e, ee, ai, o, oo,au " will use otai-combu, itaddai-combu, ai marker and au marker to render ligatures.
3.0 Tamil Aythams
3.0.1 aHenum (Primary aytham)
3.0.2 meypuLLi
3.0.2 timing Indicator (duplicates as aa)
3.0.3 otaicombu
3.0.4 itaddaicombu
3.0.5 ai marker
3.0.6 au marker
4.0 Tamil Consonants (in fact: Consonant+ahara)
4.0.0 Primary Tamil consonants:
k, ng, ch, ing, d, n2, th, nth, p, m, y, t, l, v, L2, L, r, n
4.0.1 Phase1 Extensions: Tamil consonants:
s, sh, h, j, ksh
4.0.2 Phase2 extensions: Tamil consonants:
Allow free usage of aHenum/Aytham. There should not be
any restrictions on when and where an aytham can be
used. Examples of abnormal usage: f=aHenum+P,
b=aHenum+P. Normal usage is for glottalisation.
(Note: Description: The Primary Aytham/aHenum should not have any restrictions on usage. Note: There may be restrictions on a similar character in Thevanagari.)
5.0 Tamil Numerals
5.1.0
5.1.1 Include Tamil numeral "0"
5.1.2 Consider absolute and relative zero symbols
10.1 Tamil Calendrical Symbols
10.2 International Calendrical Symbols
11.1 Tamil Accounting Symbols:
12.1 Tamil Monetary Symbol:
12.2 International Monetary Symbols: Rupees and Bisa
13.1 Tamil Religious Symbols
13.2 International Religious and Atheistic symbols
14.0 Tamil: Historic ligatures.
(Such as lai, raa, etc..)
14.1 Private use area and Historic characters.
A vast number of undefined symbols are to be made
available via the private use area, and the retrieval/rendering
of these code points is to be defined for this purpose.
15.0 Identify and name Reserved code points
(Such as Indic equivalents, etc..)
20.0 Tamil Ligatures
20.1 Consonants (True consonants by including the dot above.) (Treated as ligatures to avoid kerning necessities.)
20.2 Ahara-mey Ahara-consonants
Not considered ligature. Not Applicable
20.3 Aahaara-mey Aahara-consonants
No need to treat as ligature. Not applicable
20.4 ihara-mey
(Treated as ligatures to avoid kerning necessities)
20.4.1 "di" is also treated as ligature
20.5 iihara-mey
(Treated as ligatures to avoid kerning necessities)
20.5.1 "dii" is also treated as ligature
20.6 uhara-mey
20.6.0 (See 4.0.0 Primary and 4.0.2: Treated as ligatures.)
20.6.1 (See 4.0.1 Phase1 Extensions: ) Should " s, sh, h, j, ksh" be treated as ligatures?
20.7 uuhara-mey
as in 20.6 with uu
20.8 ehara-mey
Treat as ligature?
20.9 eehara mey
Treat as ligature?
20.10 aihara mey
Treat as ligature?
20.11 ohara mey
Treat as ligature?
20.12 oohaara mey
Treat as ligature?
20.13 auhaara mey
Treat as ligature?
12.1 Tamil Monetary Symbol: Rupees and Bisa
12.2 International Monetary Symbols: Rupees and Bisa
Request For Suggestion 5000: Sequence of Input Method for diacritics.
To facilitate the use of diacritics, a standardised input sequence for their code points needs to be established. The suggestions should treat Unicode processing as the primary data processing system.
Request 5000-v1: Should the code points for diacritics be input before or after the code point of an alphabet? As Unicode relies on linear processing of standard alphabets, all related processing is designed around the linear order of these code points.
As an example, in Tamil this linearity takes the form of consonant + trailing vowel, which together form the consonant-vowel ligature. If the code point of a descriptive diacritic is placed in front of a consonant, all other standard processing can be left untouched. However, placing the code point of a descriptive diacritic in front of a vowel has the opposite effect: a trailing vowel combines with the consonant before it, and the diacritic would come between the two.
Solution 1: Con-diacritic + Consonant + vow-diacritic + vowel
` + k + ^ + i    `k^i (ki)
Solution 2: Con-diacritic + Consonant + vowel + vow-diacritic
` + k + i + ^    `ki^ (ki)
Solution 3: Consonant + Con-diacritic + vowel + vow-diacritic
k + ` + i + ^    k`i^ (ki)
While a uniform method of inputting diacritics seems ideal, it may place an unmanageable burden on Unicode itself. Suggestions, comments and recommendations are welcome for this RFS 5000.
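The three orderings can be compared with a minimal sketch. It uses the same ASCII stand-ins as the text ("`" for a consonant diacritic, "^" for a vowel diacritic, "k" and "i" for the consonant and trailing vowel). The function names are hypothetical and chosen only for illustration. The check at the end tests the one property discussed above: whether the consonant and its trailing vowel stay adjacent in the stored sequence.

```python
# ASCII stand-ins taken from the text; they are not encoded Tamil marks.
CON_DIACRITIC = "`"
VOW_DIACRITIC = "^"


def solution1(consonant: str, vowel: str) -> str:
    """Con-diacritic + Consonant + vow-diacritic + vowel."""
    return CON_DIACRITIC + consonant + VOW_DIACRITIC + vowel


def solution2(consonant: str, vowel: str) -> str:
    """Con-diacritic + Consonant + vowel + vow-diacritic."""
    return CON_DIACRITIC + consonant + vowel + VOW_DIACRITIC


def solution3(consonant: str, vowel: str) -> str:
    """Consonant + Con-diacritic + vowel + vow-diacritic."""
    return consonant + CON_DIACRITIC + vowel + VOW_DIACRITIC


def base_pair_is_adjacent(sequence: str, consonant: str, vowel: str) -> bool:
    """True if the consonant is immediately followed by its trailing vowel."""
    return (consonant + vowel) in sequence


if __name__ == "__main__":
    for name, build in (("Solution 1", solution1),
                        ("Solution 2", solution2),
                        ("Solution 3", solution3)):
        seq = build("k", "i")
        code_points = " ".join(f"U+{ord(ch):04X}" for ch in seq)
        print(name, seq, code_points, base_pair_is_adjacent(seq, "k", "i"))
```

In this sketch only Solution 2 keeps the consonant and trailing vowel adjacent, so existing consonant + vowel processing would see the pair unchanged.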
===
Thanks Michka,
Actually, I'm concerned about the order of code points within text. Calling this encoding or an input method may be wrong. What is the standard term for sequences of code points within text?
What effect does this have on, for example, the rendering of characters or the sorting of text? How do we establish a standard procedure?
Sinnathurai
====
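The sorting question can be illustrated with a small sketch. It assumes the same ASCII stand-ins used earlier on this page ("`" and "^" for diacritics) and sorts by raw code point value, which is what a naive sort does. The helper names are hypothetical. Real collation systems generally use weighted sort keys rather than raw code point order, so this shows only how the position of a diacritic code point can change a naive result.

```python
# ASCII stand-ins for diacritic code points, as used in the examples on this page.
MARKS = "`^"


def strip_marks(word: str) -> str:
    """Remove the stand-in diacritics, leaving only base letters."""
    return "".join(ch for ch in word if ch not in MARKS)


def naive_sort(words):
    """Sort by raw code point order, diacritics included."""
    return sorted(words)


def mark_insensitive_sort(words):
    """Sort on base letters first; use the full string only to break ties."""
    return sorted(words, key=lambda w: (strip_marks(w), w))


if __name__ == "__main__":
    leading = ["ku", "`ki", "ka"]    # diacritic stored before the consonant
    trailing = ["ku", "ki^", "ka"]   # diacritic stored after the vowel
    print(naive_sort(leading))
    print(naive_sort(trailing))
    print(mark_insensitive_sort(leading))
```

With a leading diacritic the naive sort moves the marked word ahead of every unmarked word starting with the same consonant, while a trailing diacritic leaves it between its neighbours.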
JPG graphic images of major Tamil characters/glyphs.
0.1/
0.2/
1/ Leading and independent vowels
2/ Trailing vowels
3/ Type 1, type 2 and type 3 aytham letters
5/
6/ Absolute zero, relative zero, zero and numerals
8/ Reserved
10/
11/ Consonants
12/
13/
14/
______________________________
Stored values and rendering:
Rendering, Glyph substitution, and other shaping necessities
Many believe that Tamil should not be put through the complex rendering process, as it does not use a resource-consuming complex writing system. In addition, a mechanism for using aytham/diacritics with Tamil needs Unicode support.
As I am involved in research on Tamil writing systems and Tamil phonemes, I have had to get involved in the art of font creation. I was initially involved with single-byte coding to provide the additional computing facility for Tamil research.
As double-byte Unicode is gradually taking prominence, it has become necessary to adapt to Unicode.
The following is something I am currently investigating: whether the apparently compulsory rendering (by which I mean the minimum functionality of an operating system, not a font's internal substitution/rendering system) is going to be a hindering influence on the languages in question. This compulsory part can severely limit the growth of languages, as it is apparently not independent of the operating system.
Notice the differences in,
ke=ek=e'k=(கெ) 
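One reading of the line above is that the stored order (consonant, then vowel sign) differs from the displayed order, where the vowel sign for "e" is drawn to the left of the consonant. The sketch below uses Python's standard `unicodedata` module to list the stored code points of கெ, and to show that the two-part vowel sign in கொ has a canonical decomposition into a left part and a right part. The variable and function names are illustrative only.

```python
import unicodedata

# Stored (logical) order: consonant first, vowel sign second.
KE = "\u0b95\u0bc6"  # TAMIL LETTER KA + TAMIL VOWEL SIGN E
KO = "\u0b95\u0bca"  # TAMIL LETTER KA + TAMIL VOWEL SIGN O


def describe(text: str):
    """Return (code point, character name) pairs in stored order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


if __name__ == "__main__":
    for code_point, name in describe(KE):
        print(code_point, name)

    # NFD splits the two-part vowel sign O into its two component signs.
    for code_point, name in describe(unicodedata.normalize("NFD", KO)):
        print(code_point, name)
```

Whether the vowel sign is displayed before the consonant, or shown unrendered after it, is decided by the rendering layer and the font, not by the stored sequence.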
_______________
MS Supplied Win2000 Tamil Keyboard Layout

________________
Test for correct display of Tamil Characters
Look carefully to check whether the Unicode text is displayed correctly.
Is it displayed as traditional, present-day Tamil?
Is it displayed as the new Tamil intended for future research?
Or is it displayed incorrectly?
Is it displayed as Linear Unicode or as Normal Tamil Unicode, or is it displayed wrongly?
அ ஆ இ ஈ உ ஊ எ ஏ ஐ ஒ ஓ ஔ ஃ
கஃ ஃக க்
க கா கி கீ கு கூ
கெ கே கை கொ கோ கௌ
ங
ச
ஞ
ட
ண
த
ந
ப
ம
ய
ர
ல
வ
ழ
ள
ற
ன
ஹ
ஸ
ஷ
க்ஷ (ksh/x)
ஸ்ரீ (stii)
ஸ்றீ (srii)
ஸ்றி (sri)
ஸ்ரி (sti)
ஸ்ரீ (stii)
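A test row like the ones above can be generated from stored code points, which also makes it easy to see which sequence a font is being asked to render. The sketch below is a minimal illustration: it builds the consonant + vowel sign row for one consonant from the Unicode Tamil block, and lists the code points behind a conjunct such as ksh. The function names are hypothetical.

```python
# Tamil vowel signs in the order aa, i, ii, u, uu, e, ee, ai, o, oo, au.
# The leading empty string gives the bare consonant (with inherent "a").
VOWEL_SIGNS = ["", "\u0bbe", "\u0bbf", "\u0bc0", "\u0bc1", "\u0bc2",
               "\u0bc6", "\u0bc7", "\u0bc8", "\u0bca", "\u0bcb", "\u0bcc"]

KA = "\u0b95"     # TAMIL LETTER KA
PULLI = "\u0bcd"  # TAMIL SIGN VIRAMA (the dot above, meypuLLi)
SSA = "\u0bb7"    # TAMIL LETTER SSA


def uyirmey_row(consonant: str) -> str:
    """Return the consonant combined with each vowel sign, space separated."""
    return " ".join(consonant + sign for sign in VOWEL_SIGNS)


def code_points(text: str) -> str:
    """List the stored code points of a string in U+XXXX form."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    print(uyirmey_row(KA))
    print(KA + PULLI, code_points(KA + PULLI))
    print(KA + PULLI + SSA, code_points(KA + PULLI + SSA))
```

If a row generated this way appears with the vowel signs detached or placed after the consonant, the text is being shown in linear (unrendered) form.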
___________________________
___________________________-
Please report any technical problems in displaying and printing TabAvarangal2 and TSC_Avarangal (including TamilUnicode) Font and AvarangalUni Font to