From greger@iuk Fri Sep 13 21:07:27 1991
Received: from ism.isc.com by dkuug.dk via EUnet with SMTP (5.64+/8+bit/IDA-1.2.8)
    id AA14504; Fri, 13 Sep 91 21:07:27 +0200
Received: by ism.isc.com (Sendmail5.65/1.35)
    id AA18204; Fri, 13 Sep 91 12:09:07 -0700
Received: from friherr by iuk.isc.com (5.65/smail2.2/11-14-88)
    id AA13468; Fri, 13 Sep 91 18:30:58 GMT
Received: by (5.65/1.35/jcb-s)
    id AA15264; Fri, 13 Sep 91 18:48:14 +0100
Date: Fri, 13 Sep 91 18:48:14 +0100
Message-Id: <9109131748.AA15264@>
To: keld%dkuug.dk@ism
Cc: wg15rin%dkuug.dk@ism, hlj@posix
From: greger@ism.isc.com ("greger@ism.isc.com (Greger Leijonhufvud, ISC, High Wycombe, U.K.)")
Subject: Re: (wg15rin 134) Re: Ballot resolution
X-Charset: ASCII
X-Char-Esc: 29

In reply to your message of Thu Sep 5 22:00:32 1991
-------
>Some comments to Gregers ballot resolutions:

>> Chapter 2.2:
>> ===========
>>
>> Replace line 367 with:
>> "The character order, as defined for the LC_COLLATE category
>> in the current locale (see 2.5.2.2), defines the relative order
>> of all collating elements, such that each element occupies
>> a unique position in the order. In addition, one or more
>> collation weights may be assigned for each collating element;
>> these weights are used to determine the relative order or
>> strings in e.g. the sort utility."

>Does this mean that you cannot define a collating sequence,
>which is indeterministic? We had trouble in DS defining such
>an indeterministic collation order when we started out some three years
>ago. Some in the DS i18n group said that they wanted an indeterministic
>order.
>How will the standard ensure that the collating order is deterministic?
>Will there be an error code?

The above change of text does not imply any substantive change
compared with the previous text, but is intended to be clearer.
It attempts to differentiate between the character order, which is
used by regular expressions, and the weights allocated to
collating elements.
The default weights are such that they imply the character order
(i.e., if no weights are defined, then the character order is also
the collating order). In the latter case the order is deterministic,
and in fact a "level 1" sort (using your terminology).
Because characters can be ignored, and theoretically not participate
at all in collation (i.e., sorting), the collation order may not be
deterministic. In practice, however, the sort program will, if two
strings are equal based on the actual weights, finally compare them
on character values....

>> Chapter 2.4:
>> ===========
>> Replace the last sentence on lines 1303-1304 with:
>> "The default character shall be the number sign (#).
>> This declaration shall only be specified if the coded
>> character set does not contain the number sign character."

>One of the main issues in RIN is that we advocate coded character
>set independent specifications of locales and charmaps.
>It then seems inappropiate to speak about specific coded
>character set specifications. Most charmaps and locales will
>hopefully be specified for a lot of character sets in the same
>source.
>It seems overly restrictive to hace to two last lines here, and I
>suggest that they be removed.

>> Insert after the first sentemce on line 1606:
>> This declaration shall only be specified if the coded
>> character set does not contain the number sign character."

>Same comment as above.

Hal will discuss this issue with the balloteer this came from.

>> Change line 1622-1625 to:
>> "(1) A character can be represented via a symbolic name, enclosed
>> within angle brackets (< and >). The symbolic name, including
>> the angle brackets, shall exactly match a symbolic name
> angle
>> defined in the charmap file specified via the localedef -f
>> option, and shall be replaced by the corresponding value
>> from the charmap file."
>>

>I am not sure this is the only - nor best - way to do it - replacing
>the symbolic name with the corresponding value from the charmap.
>I have ideas on having all of the charmaps translate into some kind
>of widechar value (with widechar maybe defined by the locale appearance).
>This may define some tables which are a lot smaller than prescribed
>by the above specifications.
>I suggest that the last sentence (and shall be replaced... ) be removed.
>It is too implementation oriented.

I don't think so. The use of the charmap is to allow code set
independent locale definitions, not necessarily to make more complex
schemes. However, I suggest (Hal!!!!!) the following text instead:

    "...option; and shall be replaced by a character value
    determined from the value associated with the symbolic name
    in the charmap file."

>> Add after line 1662:
>> "If a charmap file is present, only characters defined
>> in the charmap shall be specified."

>I do not have the draft present, but does that eliminate the possibility
>of defining charset independendent locales. That would indeed
>be a pity.

You are right, and that was not intended. What the balloteer wanted
to stop was the ability to define characters not existing in the
character set. A better text would be (Hal, note!!!!!!)

    "If a charmap file is present, only characters defined in the
    charmap shall be specified using octal, decimal, or hexadecimal
    constants. Symbolic names not present in the charmap file may be
    specified and shall be ignored, as specified under (1) above."

>> Delete lines 1958-1959. (substitute)
>>
>> Delete lines 2137-2157. (substitute)
>>
>> Delete lines 2180-2183. (substitute)

>Does that mean that the substitute command is eliminated?
>Is that OK with the japanese?

It seems so... or I will get other ballots!

>> Replace lines 2333-2334 in draft 11 with the following:
>> "The directives that can be specified in an operand
>> to the order_start keyword are based on the requirements
>> specified in several proposed standards and in customary
>> use.
>> The following is a rephrase of rules defined for
>> "lexical ordering in English and French" by the Canadian
>> Standards Association (text is brackets is re-phrased):
>> 1. Once special characters ([punctuation]) have been removed
>> from original strings, the ordering is determinded by
> > determined
>> scanning forward (left to right) [disregarding case and
>> diacriticals].
>> 4. If there is still an ordering equivalence after rules 1
>> through 3 have been applied, then only special characters
>> and the position they accupy in the string are considered to
> occupy
>> determine ordering. The string that has a special character
>> in the lowest position comes first. If two strings have a
>> special character in the same position, the character [with
>> the lowest collation value] comes first. In case of equality,
>> the other special characters are considered until there is a
>> difference or all special characters have been exhausted."
>>
>> Delete lines 2344-2364 (Draft 11 line numbers).
>>
>> Lines 2530-2531:
>> The currency_symbol does not appear in the LC_MONETARY
>> category definition in the POSIX locale because it is
>> not defined in the C Standard's {7} C locale.
>> The C Standard {7} limits the size of decimal points
>> and thousands delimiters to single-byte values. In
>> locales based on multi-byte coded character sets this
>> cannot be enforced, obviously; this standard does not
>> prohibit such characters but makes the behavior
>> unspecified.

>I do not think we should limit ourselves to what C has limited
>itself to. The C standard is under revision and faults can be
>corrected there. Please do not remove "currency_symbol" as this is
>very much needed for i18n purposes. I cannot see how i18n
>applications dealing with monetary amounts can be localized
>without having a currency_symbol specification. In Denmark we
>could not have applications using "DKK " all over the place,
>when "kr. " is the much more accepted notation for ordinary amounts.

We are not eliminating currency_symbol; we are saying that the C locale
doesn't have any value there (see ANSI C standard).

-greger-

>Keld
-------
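
The tie-break described in the reply on Chapter 2.2 (compare on collation weights first, then fall back to character values when the weights leave two strings equal) can be sketched roughly as follows. This is only an illustration under stated assumptions, not text from the draft and not how any particular sort utility is implemented: the weight table, the choice of ignored character, the fallback for unweighted characters, and the function names are all hypothetical.

```python
# Hypothetical single-level weight table; not taken from any real locale.
# A weight of None means the character is ignored during collation.
WEIGHTS = {"a": 1, "A": 1, "b": 2, "B": 2, "-": None}


def weight_key(s):
    """Return the collation weights of s, skipping ignored characters."""
    key = []
    for ch in s:
        # Assumption: characters with no entry sort after all weighted
        # ones, ordered among themselves by code point.
        w = WEIGHTS.get(ch, 1000 + ord(ch))
        if w is not None:
            key.append(w)
    return key


def sort_key(s):
    """Weights decide first; raw character values break any remaining tie."""
    return (weight_key(s), [ord(ch) for ch in s])


words = ["ab", "a-b", "AB", "Ab"]
print(sorted(words, key=sort_key))
```

With this table all four strings have the same weights, so the weights alone do not determine their order; the final comparison on character values is what gives a single, repeatable result (here AB, Ab, a-b, ab).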