From erik@sran8.sra.co.jp Wed May 22 05:28:21 1991
Received: from mcsun.EU.net by dkuug.dk via EUnet with SMTP (5.64+/8+bit/IDA-1.2.8)
        id AA24165; Wed, 22 May 91 05:28:21 +0200
Received: from srawgw.sra.co.jp by mcsun.EU.net with SMTP;
        id AA19706 (5.65a/CWI-2.87); Wed, 22 May 91 05:28:19 +0200
Received: from srava.sra.co.jp by srawgw.sra.co.jp (5.64WH/1.4)
        id AA17775; Wed, 22 May 91 12:28:08 +0900
Received: from sran8.sra.co.jp by srava.sra.co.jp (5.64b/6.4J.6-BJW)
        id AA24434; Wed, 22 May 91 12:27:19 +0900
Received: from localhost by sran8.sra.co.jp (4.0/6.4J.6-SJ)
        id AA13791; Wed, 22 May 91 12:27:50 JST
Return-Path:
Message-Id: <9105220328.AA13791@sran8.sra.co.jp>
Reply-To: erik@sra.co.jp
From: erik@sra.co.jp (Erik M. van der Poel)
To: wg15rin@dkuug.dk
Subject: Re: (wg15rin 116) paper
Date: Wed, 22 May 91 12:27:48 +0900
Sender: erik@sran8.sra.co.jp
X-Charset: ASCII
X-Char-Esc: 29

Hi, Donn!

Thanks for posting the paper on POSIX i18n. Some minor comments:

> LC_CTYPE: The characteristics of the characters making up
> the language: which are alphabetic and which are
> not even part of that language.

What does one use to find out whether or not a given character is part
of the current language? Would it be correct for iswgraph() to return
false for a graphic character that is not part of the current language?
If the system uses 10646 for its wchar_ts, I think iswgraph() should
return true for all graphic characters, no matter what the language is.

> In addition, Japanese tends to use the character set native
> to the subject language for inclusions from languages other than Japanese.
> In effect, for proper Japanese usage they also require at least the full
> Western European and Cyrillic character set in addition to their own.

I haven't seen many inclusions of Western European (e.g. Latin-1) or
Cyrillic in Japanese. ASCII is often included, however.

> Current technology makes it very difficult for an
> application to deal with characters where the size of the basic
> character varies dynamically (as opposed to where two or more basic
> characters are used to represent a single element of the character set).
> However, the underlying systems do in fact vary the size of the basic
> character.

Which systems do you have in mind? Do you have any examples?

> Typically it would be 8 in Europe,
> where Asian languages are not frequently processed, and 16 in Asia,
> where larger character sets are the norm.

I don't know what the other Asian countries will specify, but the
current Japanese national profile says that CHAR_BIT shall be 8. We
imagine that people will use wchar_t, instead of large chars.

Regards,
Erik
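
To illustrate the wchar_t and iswgraph() points above, here is a minimal
C sketch. It assumes a hosted C implementation that provides <wchar.h>
and <wctype.h>. The function names wide_char_bits and count_graphic are
made up for this example and do not come from the paper or from the
Japanese national profile.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: width of wchar_t in bits.
 * char stays CHAR_BIT bits wide (8 under the Japanese profile),
 * while wchar_t is free to be wider. */
size_t wide_char_bits(void)
{
    return sizeof(wchar_t) * CHAR_BIT;
}

/* Hypothetical helper: count the wide characters in s that the
 * current LC_CTYPE locale classifies as graphic. */
size_t count_graphic(const wchar_t *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != L'\0'; ++s) {
        if (iswgraph((wint_t)*s))
            ++n;
    }
    return n;
}
```

In the default "C" locale, count_graphic(L"a b") returns 2, since the
space is not graphic. A program would normally call
setlocale(LC_CTYPE, "") first so that classification follows the user's
locale. Whether iswgraph() then returns true for graphic characters
outside the current language is left to the implementation, which is
the open question raised above.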