From erik@sran8.sra.co.jp Wed Nov 28 08:51:53 1990
Received: from mcsun.EU.net by dkuug.dk via EUnet with SMTP (5.64+/8+bit/IDA-1.2.8)
	id AA26779; Wed, 28 Nov 90 08:51:53 +0100
Received: by mcsun.EU.net with SMTP; Wed, 28 Nov 90 08:54:29 +0100
Received: from srava.sra.co.jp by srawgw.sra.co.jp (5.64WH/1.4)
	id AA15290; Wed, 28 Nov 90 16:53:42 +0900
Received: from sran8.sra.co.jp by srava.sra.co.jp (5.64b/6.4J.6-BJW)
	id AA19892; Wed, 28 Nov 90 16:53:31 +0900
Received: from localhost by sran8.sra.co.jp (4.0/6.4J.6-SJ)
	id AA02415; Wed, 28 Nov 90 16:51:53 JST
Return-Path:
Message-Id: <9011280752.AA02415@sran8.sra.co.jp>
Reply-To: erik@sra.co.jp
From: Erik M. van der Poel
To: seki@sysrap.cs.fujitsu.co.jp
Cc: wg15rin@dkuug.dk, XoTGinter@xopen.co.uk, erik@sra.co.jp
Subject: Re: Japanese Profile
Date: Wed, 28 Nov 90 16:51:52 +0900
Sender: erik@sran8.sra.co.jp
X-Charset: ASCII
X-Char-Esc: 29

Sekiguchi-san,

Thank you very much for forwarding the Japanese locale definition to
WG15 RIN. We were trying to write a profile for Japan. Your
contribution is very welcome at this stage.

> # Based on POSIX.2 D10 syntax with X/Open extension.

I hope you have proposed these "X/Open extensions" to the Posix
people. It would be better for X/Open and Posix to be compatible with
each other.

> # This definition implicitly assume that underlying encoding
> # is UJIS (EUC-JIS) or similar one. (Although characters in
> # G2 and G3 are completely ignored.) The definition may not
> # work if the systems uses other encoding.

Which parts of your definition depend on the encoding? I believe that,
in general, locale definitions should be independent of encodings.
That's why we have charmaps. The charmap provides the mapping between
the symbolic names of the characters and the codepoints. The locale
definitions should only contain references to the symbolic names, and
are therefore independent of the encoding. At least, this is my
understanding of the current Posix draft.
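
The separation described above can be sketched roughly as follows, in
Python rather than in localedef syntax. This is only an illustration
under stated assumptions: the symbolic names <j0333> and <j0334>, the
two charmap tables, and the helper compile_class are hypothetical and
are not taken from the Posix draft or from the forwarded definition.
The byte values are the EUC-JP and Shift_JIS encodings of the JIS X
0208 characters 0x2341 and 0x2342 (full-width A and B).

```python
# Hypothetical charmaps: each maps a symbolic character name to the
# bytes that represent that character in one particular encoding.
CHARMAP_EUC_JP = {
    "<j0333>": b"\xa3\xc1",  # JIS X 0208 0x2341, full-width A
    "<j0334>": b"\xa3\xc2",  # JIS X 0208 0x2342, full-width B
}
CHARMAP_SHIFT_JIS = {
    "<j0333>": b"\x82\x60",
    "<j0334>": b"\x82\x61",
}

# Hypothetical locale fragment: it refers to symbolic names only,
# so it contains nothing that depends on the encoding.
LOCALE_UPPER = ["<j0333>", "<j0334>"]


def compile_class(names, charmap):
    """Resolve the symbolic names of a character class through a charmap."""
    return [charmap[name] for name in names]


# The same locale fragment yields different byte sequences per charmap.
upper_euc = compile_class(LOCALE_UPPER, CHARMAP_EUC_JP)
upper_sjis = compile_class(LOCALE_UPPER, CHARMAP_SHIFT_JIS)
```

Supporting another encoding in this sketch means supplying another
charmap table; LOCALE_UPPER itself does not change.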

On the other hand, I have heard rumors that some people have commented
that most implementations will probably only support one or a few
encodings, and the full generality of the charmap system will probably
be compromised. Perhaps the first implementations will not support the
charmap system completely. This is understandable, since it takes some
time to implement this new system. However, if people do not think
that the charmap system will ever be fully implemented, then I find
the very existence of this concept in the Posix draft highly
questionable. I urge the WG15 RIN members responsible for the
above-mentioned rumors to respond.

> upper <A>;<B>;<C>;<D>;<E>;<F>;<G>;<H>;<I>;<J>;<K>;<L>;<M>;\
> <N>;<O>;<P>;<Q>;<R>;<S>;<T>;<U>;<V>;<W>;<X>;<Y>;<Z>;\
> <2341>;...;<235A>;\

Japanese is not the only language that uses two bytes for the
representation of its characters. For example, Chinese also uses two
bytes. So the names of the Japanese characters should contain
something that distinguishes them from the names of other characters.
Keld has suggested that we use names like "j1625" for the Japanese
characters. The numbers are in decimal, so that it is easy to compare
the names with the numbers that appear in the JIS table.

> # Era year definition: THIS IS AN X/OPEN EXTENSION
> # This definition handles these 4 era only, i.e., HEISEI,
> # SHOWA, TAISHO and MEIJI. Years befor MEIJI are printed
> # as SEIREKI (which is ``A.D.'') or KIGENZEN (which is ``B.C.'')
> era "+:2:1990/01/01:+*:<4A3F><402E>:%N%o<472F>";\
> "+:1:1989/01/08:1989/12/31:<4A3F><402E>:%N<3835><472F>";\

This is all very well for Japan, but what if some African tribe wants
to define their locale and decides that they also need some kind of
"era year" system, but finds that their requirements are slightly
different and are not met by this proposal? I don't mean to offend
anyone by comparing the Japanese with the Africans; I just want to
make my point absolutely clear by giving an extreme example. (Also, I
don't mean to offend the Africans by saying that this example is
extreme. :-)

If it is possible that a country other than Japan may want to have a
slightly different way of defining their era year, then I think that
this keyword should not be called "era". It is unfair for any one
country to reserve a general word like "era". Perhaps it would be
better to take "era" out of the general LC_COLLATE rules, and add a
hook to the rules for defining locale-specific rules.

I can hear all of you saying "But how can you internationalize
programs then?" Well, I think that it is likely that only Japanese
programs will use %E and %o (for the era year), so in some sense,
these programs would be localized rather than internationalized.

Erik