From tut@eng.sun.com Fri Apr 5 22:50:24 1991
Received: from Sun.COM by dkuug.dk via EUnet with SMTP (5.64+/8+bit/IDA-1.2.8)
	id AA28119; Fri, 5 Apr 91 22:50:24 +0200
Received: from Eng.Sun.COM (zigzag-bb.Corp.Sun.COM) by Sun.COM (4.1/SMI-4.1)
	id AA18553; Fri, 5 Apr 91 12:49:49 PST
Received: from cairo.Eng.Sun.COM by Eng.Sun.COM (4.1/SMI-4.1)
	id AA28582; Fri, 5 Apr 91 12:49:47 PST
Received: by cairo.Eng.Sun.COM (4.1/SMI-4.1)
	id AA11684; Fri, 5 Apr 91 12:48:37 PST
Date: Fri, 5 Apr 91 12:48:37 PST
From: tut@eng.sun.com (Bill "Bill" Tuthill)
Message-Id: <9104052048.AA11684@cairo.Eng.Sun.COM>
To: erik@sra.co.jp, unicode@Sun.COM, wg15rin@dkuug.dk
Subject: Re: shortcomings in XPG locale (was Sort sequence)
X-Charset: ASCII
X-Char-Esc: 29

> > 1. There is no LC_BIDI database to store direction information.
> >    Did X/Open ever think about Hebrew and Arabic?
>
> While we're at it, we might as well consider vertical printing, as is
> sometimes used in Japan.  So maybe we should call it LC_DIRECTION, or
> LC_TEXTDIRECTION, or maybe just include it in LC_CTYPE?

Yes, LC_DIRECTION sounds like a good name.  And perhaps text direction
really is related to character type.

> > 2. Input methods for disambiguating Japanese, Korean etc. are not
> >    codified in any LC_INPUT file.

I'm talking about typing in Hiragana and having the system convert to
Kanji when necessary, giving you a menu of possible choices when the
choice of Kanji character is ambiguous.  Maybe this is so complicated
that it must be handled by software input modules.

> > 3. LC_FONT (or whatever) for supporting extended character sets
> >    and typography is missing.

If I switched to ISO-Arabic or ISO-Cyrillic, where would my fonts come
from?  Is this another thing to throw in LC_CTYPE?

> > 4. For Indic languages (and any heavily composed script), there
> >    is no LC_COMPOSE database.

There are examples in Indic scripts where characters appear in input
several positions after their actual physical location.  Depending on
intervening characters, the shape of a base character can change.
Should this kind of thing be encoded in LC_CTYPE as well?

Bill