ISO/IEC SC22/WG15 SD3 1996-06-25
2. Documenting National Profiles. (Open)
3. Disposition of Canadian Comment, 9945-1. (Open)
7. WG15 base standards subsetting (Open)
8. localedef user-specified collation weight names (Open)
9. Japanese proposal for LC_CTYPE extension (Open)
10. Range expression dependency (Open)
1. RIN input to WG20 (Resolved)
4. Publication of responses to Ballot comments (Resolved)
5. CD 9945-2.2 Document not identified (Resolved)
6. Language interaction pertaining to real time. (Resolved)
National Profiles, Profiles, internationalization
Description:
Where should National Profiles be documented:
- as ISPs?
- as entries in a registry?
- in a separate TR?
- or a combination thereof?
Originator:
WG15 (06/90)
Alternatives:
Ad-hoc: (05/91) Need a separate document, as all national profiles would be too large a set.
Arguments:
None Recorded, but see history for background.
Resolution:
Open. Pending the outcome of the RIN report N273, and the planned guideline on national profiles.
History:
The issue was revisited at the Rotterdam meeting, where the following action item was approved:
9010-9: RIN & convenor: POSIX National Profiles. Seek guidance from Rapporteur Group on Internationalization on improvements to the practice of WG15's handling of POSIX National Profiles. (9006-37)
Open pending start of activity by SC22/WG20. Becomes 9105-2.
An ad-hoc group on (Profile?) Coordination also fed into the Rotterdam WG15 Plenary, where Arnie Powell reported that:
"The break-out group was of the opinion that standards documents should not contain copies of all applicable national profiles; there was need of guidance from JTC1/SGFS on the format of profile standards. (See resolution 117.) Jim Isaak commented that "they will not have any idea what [a national profile] is", so they would have to be fully briefed before being able to deliver useful guidance. Although no specific action arose, Willem Wakker is to coordinate with SGFS. (See also [Rotterdam minutes] 6.3, and actions 9105-7 and 9105-30.)"
[Rotterdam also passed resolution 117 on visiting this item in the Issues list, but this is concerned largely with SGFS coordination.]
Rotterdam resolution 156 also refers to this issue:
156 Synchronizing National Profiles
Whereas ISO/IEC JTC1/SC22/WG15 understands that synchronization problems may arise if National Profiles are included in multiple POSIX standards documents,
Therefore ISO/IEC JTC1/SC22/WG15 instructs its Convenor to review this issue and seek comment from the IEEE Standards Department as to how this might be addressed and to report back the findings to ISO/IEC JTC1/SC22/WG15 at the next ISO/IEC JTC1/SC22/WG15 meeting.
Orlando meeting 1995-10:
It was noted that it is not feasible to have the national profiles in separate sections of the standards, as the profiles are too big and diverse. This possible solution was thus removed from the description. It was also noted that the forthcoming CEN cultural registry has provisions for registering national profiles, and that there is a possibility, given appropriately defined entries in the taxonomy, that national profiles could be given ISP status according to TR 10000-3. Furthermore it was noted that the planned "Guidelines for national profiles and locales" TR would address the problem fully.
9945-1, Canada, internationalization, interchange.
Description:
Tar and cpio interchange formats are not adequate for international work.
Originator:
WG15 (Canada) (06/90)
Alternatives:
WG15 (6/90): hold as open issue until changes can be addressed.
Arguments:
None recorded.
Resolution:
Open, pending technical resolution.
History:
The disposition of comments (N085) item 6 addressed Canadian comments by suggesting additional text to 9945-1.2 section 10 lines 80-81 and 102-104 (tar) and lines 270-272 (cpio).
However, the changes were considered to be normative and therefore could not be endorsed for inclusion in the 9945-1 standard at the time the issue was entered.
The proposed resolution was to include the proposed changes in the next amendment to 9945-1.
It is unknown whether this solution was formally adopted.
Action on the US Development Body: Query the status of this issue and provide a report by the Oct 94 Meeting.
Orlando meeting 1995-10: A solution satisfying the Canadian member body is being worked on in the planned amendment to ISO/IEC 9945-2, covering IS 10646 usage in the pax utility.
Profiles, subsetting, SGFS
Description:
What approach, if any, is appropriate for subsetting of WG15 base standards?
Originator:
RGCPA
Alternatives:
No subsetting allowed
Arguments:
None recorded
Resolution:
none
History:
Original description: There are no standards for how standards should be subsetted. The "no subsets allowed" principle established by SGFS may not be sufficient.
PASC has an ad-hoc group working on this subject, to report back to PASC in 01/94.
P1003.13 will define how the pieces should be broken out. This will then be fed back to 1003.1, who will then find a way to subset 1003.1. 1003.13 will then point to 1003.1.
Recorded 1995-10-25: PASC granted 1003.13 a waiver to do subsetting as an exceptional case, and 1003.13 revised the way pieces should be broken out from 9945-1. These are included in P1003.13/D7, August 1995.
localedef, collation, weight, LC_COLLATE
Description:
A mechanism for the specification of named collation weights in the LC_COLLATE section of locales, particularly to support non-latin character scripts to manage a number of sorting algorithms.
Originator:
J
Alternatives:
None
Documents:
N245 Summary of voting & comments on 2nd CD 9945-2: Shell & Utilities
N281 Disposition of comments on CD 9945-2.2
N330 Japanese comments on Posix .2b/D4
RIN N106 Japanese Proposal to POSIX 1003.2b
N602 Japanese Action Item Report to WG15, October 1995
N640r US TAG N573, N587: AI 9510-14, Report on POSIX.2b Issues
Solution:
None as yet. The proposal has been accepted in principle. The US development body has asked for specific wording to be supplied by Japan for inclusion in a revision to the standard.
Status:
Open. Awaiting input from the Japanese MB to 9945-2Amd2b.
History:
From WG15 Hamilton, May 1992:
N245, the comments on CD 9945-2, and N281, the disposition of those comments, contained the Japanese MB objection <ITSCJ.30> relating to collation weight names; a similar later version (below) was recorded at the WG15 Reading meeting. The proposed disposition of <ITSCJ.30> is contained in N281 as:
We believe that this change, or something similar to accomplish the same objective, should be studied for inclusion in the POSIX.2b revision and the full international standard.
From WG15 Reading, October 1992:
N330 contained the Japanese MB comments on POSIX.2b D4; they included:
<ITSCJ.2b.9> Sect 2.5.2.2.3 (LC_COLLATE) PROPOSAL
Problem: In most cases of ideographic characters, it is a requirement that a user be able to specify collation weights as he/she wants. In the case of Japanese characters (Kanji), for example, there are five possible collation weights for supporting Japanese SORT. The five weights are On-yomi (pseudo-Chinese pronunciation), Kun-yomi (Japanese pronunciation), number of strokes, radical (components of Kanji), and Kanji character code. There could be more weights. The LC_COLLATE part of localedef specifications should allow a user to describe these weights and give names to the weights. Any combination of the defined weights should be able to be specified by the user at run-time.
Proposal:
LC_COLLATE extension for specifying weight name
=> 2.5.2.2.3 order_start Keyword. Add the following directive description and the Example.
It is implementation defined whether the following optional directive shall be recognised. If they are not supported, but present in a localedef source, they shall be ignored.
name specifies the name of a collation weight by a string. An order of weights may be specified by using the name at run time. The syntax for the name directive shall be:
"name = \"%s\"", <weight-name>
Example:
order_start forward,name="kunyomi";forward,name="radical"
If an operand has a name directive, the definition of the primary, secondary, or subsequent weights for the collation element may be different from the order of operands to the order_start keyword.
=> 2.5.3.2 Locale Grammar. Modify the opt_word description as follows:
opt_word : 'forward' | 'backward' | 'position' | 'name' '=' weight_name
weight_name : '"' char_list '"'
Rationale: Users' requirements for character collation in Asia are diverse. Ideographic characters have several sorting rules, such as by pronunciation, strokes, etc., and combinations of these rules are used for sorting. Properties of a character such as pronunciation can be assigned as weights for a character element. However, no standard primary weight, secondary weight and so on exists for these weights (properties). The weight name extension for LC_COLLATE allows the order of multiple weights to be defined at run time in a different order than the order of operands to the order_start keyword. To make the different order effective, the weight names can be specified in the setting of the LC_COLLATE category.
order_start forward,name="kunyomi";forward,name="radical"
When a ja_JP.eucJP locale has the above definition in the LC_COLLATE part, the order of sorting rules can be specified as follows by using the weight names:
LC_COLLATE = ja_JP.eucJP@weights=radical,kunyomi
This means that the sort-rule "radical" is used as the primary weight and "kunyomi" is used as the secondary weight.
From WG15 RIN Heidelberg, May 1993:
3.1.3 user-specified collation weight names based upon phonetic, character based (radical), or code based. Dynamic based control of collation based upon sort key. The ability to switch pointer dynamically to bring collation tables into correct sequence. Japanese delegation has submitted two written requests without supporting material.[?] Next version would be submitted by June 18, 1993.
From WG15 RIN Annapolis, October 1993:
Action Item reports: The action list was lost. The minutes of the previous meeting were scanned to recover as many action items as possible; these were determined to be as follows:
9305-01 Requirement for user-specified collation weights. MDR-02 contains the Japanese proposal on collation weights. (Closed)
MDR-02 -> RIN N106: Japanese Proposal to POSIX 1003.2b
3.1 I18N in POSIX.2b
Specific actions were taken in Annex H to address Danish and Japanese concerns for the May 93 Heidelberg meeting. Japan needs feedback for a timeline to produce material for coordination with 1003.2b. Resolution to be produced asking for a timeline for national body contributions. The rest of 3.1 [including N106] was postponed to the next meeting, due to lack of knowledge of the current status of .2b and lack of input papers received in time.
9310-09 Lead Rapporteur: distribute documents N105, N106, N109 and N113 to the RIN mailing list together with a cover note indicating that these documents will be discussed at the next WG15 RIN meeting, May 1994, and also indicating which agenda items will be touched by the documents.
From WG15 RIN Vancouver, October 1994:
9405-05 Member Bodies to review N105 (Japanese comments on .1a), N106 (Japanese comments on .2b), N109 (SC22/WG20 guidelines for the use of extended identifiers in programming languages), N113 (CEN standard for string ordering) for determination of appropriate action prior to Oct. Meeting 10/94: OPEN: Prof. Saito noted they are preparing a Japanese standard for character ordering.
The above action item was carried through from May 1994 to the May 1995 meeting.
From WG15 RIN Twente, May 1995:
3.1.3 localedef user-specified collation weight names--Japan making proposal for Annex H--removed to issues list
From 9945-2:1993 Annex H.1:
(4) The LC_COLLATE (2.5.2.2) locale definition should be enhanced to allow user-specified names for collation weights. A proposal from Japan is expected in this area.
This text has been removed from P1003.2b Draft 11, May 1995.
From WG15 RIN Orlando, October 1995:
N158 [WG15 N602] includes new input to this item; Japan is still working on this item; solutions to some of the problems are not yet obvious. Japan needs discussion of their paper to help them go forward.
[N602 includes the following:] LC_COLLATE extension for user-specific names of collation weights
Title: Japanese proposal to POSIX.2b on LC_COLLATE extension for user-specified names of collation weights
Status: Japanese position
Short description: Japan proposes to extend LC_COLLATE locale definition in POSIX.2b so that names can be assigned to collation weights. This proposal is the response to the item (4) of ISO/IEC 9945-2:1993 Annex H.1 in which a proposal from Japan is expected.
Text of contribution: [Note: The page numbers refer to the ones of P1003.2/D10.]
Sect 2.5.2.2.3 (LC_COLLATE) PROPOSAL. page 10:
Problem: 1. General Requirements
In most cases of ideographic characters, it is a requirement that a user be able to specify the combination of collation weights as he/she wants. Japanese kanji characters, for example, have five (or more) typical collation weights to support Japanese SORT. The five weights are On-yomi (pseudo-Chinese pronunciation), Kun-yomi (Japanese pronunciation), Number of strokes, Radical (components of Kanji), and Kanji character code. There are many possible combinations of these weights and the requirements for them (number and order of weights) may change according to the type of data sorted, the purpose of sorting, user's preference, etc. Users (or applications) want to specify the method of sorting by specifying the primary weight and the secondary weight, and so on. Because no names are available for the combination of multiple weights, it is reasonable requirement that users can use the name of each collation weight for specifying the method of collation. That is the way in which most sorting utilities existing in Japan are implemented.
The concept of each weight for kanji characters mentioned above are common knowledge for Japanese. However, there are no standards for the weights of Japanese kanji characters. So the detail of assigning weights can be slightly different among implementations depending on which information source (dictionary, etc.) is used for making the weights. It is difficult to handle such difference by using pre-defined sorting method. If each weight can be handled independently, it will be easier to manage.
ISO 10646 (UCS) is now a standard. UCS can be used as a codeset for any locale whose character sets are included in it. Even if UCS can be used for many different countries, the requirements for sorting characters differ country by country. The size of locale databases is a concern when using UCS. It is a requirement that there should be no problem in providing solutions to the above kanji sorting requirements when UCS is used as a codeset.
2. Problem in using current POSIX.2 standards specification
The current locale model seems to assume a well-defined collation definition for each locale. However, this does not match the requirements for sorting ideographic characters. There is an opinion that the current .2 specification could allow an implementation that satisfies most (but not all) of the above requirements: producing locales for all possible combinations of weights, and naming each locale, is a possible solution based on the existing specification. Besides being incomplete, this approach seems impractical on the following points.
a. Size of locale databases
There are about 12,000 kanji characters defined in JIS standards (JIS X0208 + JIS X0212). Because each possible combination of available weights needs its own database, the total size of locale databases containing such a large number of characters cannot be ignored (for example, 12,000 characters x 20 databases). When a locale for the ISO 10646 code set is defined, the problem will be more serious.
b. Identification of each collation method
"Onyomi", "Kunyomi", etc. are well-known names as methods of sorting kanji characters. However, the problem is that no names are available for the combinations of the primitive methods. Implementors need to invent new names for the methods (for example, onyomi_strokes_radical, kanji0102, etc.). The possibility of a standard or de facto standard for the names of these combinations is very low. Hence, this approach will not be portable.
Considering these problems, without extending current specification of LC_COLLATE, standard collation API such as wcscoll can support only limited ways of collation for kanji data, for example JIS code values. In this situation, applications which handle character orderings (for example, database applications) cannot rely on locale databases to sort kanji data. Some applications will support several collating methods by having their own ordering databases. Some applications will simply neglect the various sorting requirements for Kanji.
3. Overview of LC_COLLATE proposal
By extending the LC_COLLATE specification, a single locale database can define multiple definitions of weights for kanji with their names. It is envisioned that the order of multiple weights can be specified at run time in a different order than the order of operands to the order_start keyword. To make the different order effective, extension of another part of the POSIX standards may be necessary. The weight names specified in the database should be referenced by a user or an application, and the behavior of the collation API needs to be modified according to the specified sorting method.
The proposal for allowing users to specify collation methods is expected to work as follows.
a. Define collation weights with names in LC_COLLATE
Define collation weights with names in the locale database.
EXAMPLE
order_start forward,name="kunyomi";forward,name="radical"
<char-1> <kunyomi weight for char-1>;<radical weight for char-1>
<char-2> <kunyomi weight for char-2>;<radical weight for char-2>
:
:
order_end
b. Specify sorting methods
There are two possible extensions to specify preferred collation. One is to introduce new environment variable (b.1), and the other is to use LC_COLLATE (b.2).
b.1 Set the environment variable COLLWEIGHTS to preferred collation combination using names defined in the locale database.
EXAMPLE COLLWEIGHTS=radical,kunyomi
(Primary weight=radical, Secondary weight=kunyomi)
b.2 Alternatively, existing LC_COLLATE environment variable can be used to specify user's preference. The weight names are specified after the string "@weights=" modifier.
EXAMPLE LC_COLLATE=ja_JP.eucJP@weights=radical,kunyomi
c. Initialize collation data
There are two possible extensions to set collation methods at run time. One is to introduce new API (c.1), and the other is to use setlocale() (c.2).
c.1 The call to setweights() initializes the collation method from the setting of the COLLWEIGHTS environment variable. The setweights function can be used to change the method of collation at run time.
c.2 The call to setlocale(LC_ALL, "") initializes the collation method from the setting of the COLLWEIGHTS (or LC_COLLATE) environment variable. The setlocale function can be used to change the method of collation at run time.
d. API behavior
Collation APIs such as wcscoll work depending on the current setting of collation method.
The details of the proposal for extended use of environment variables and the initialization by API are not decided yet. The proposed extension to locale definition file is described below. The detail proposals for other parts are not ready yet.
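As an illustration only, the b.2 convention above (weight names following an "@weights=" modifier) could be split as in the Python sketch below. The function name parse_collate_setting is hypothetical, and since the proposal leaves the details undecided, the handling of whitespace and of a missing modifier is an assumption rather than part of N602.

```python
def parse_collate_setting(value):
    """Split 'locale@weights=a,b' into (locale_name, [weight names]).

    Assumption: a value without an '@weights=' modifier yields an empty
    list, meaning "use the order given to order_start".
    """
    locale_name, sep, modifier = value.partition("@")
    weights = []
    prefix = "weights="
    if sep and modifier.startswith(prefix):
        # Tolerate stray blanks around names; this leniency is an assumption.
        weights = [w.strip() for w in modifier[len(prefix):].split(",") if w.strip()]
    return locale_name, weights


if __name__ == "__main__":
    print(parse_collate_setting("ja_JP.eucJP@weights=radical,kunyomi"))
```

The same function would serve the b.1 variant if COLLWEIGHTS were passed as "locale@weights=" plus its value; that pairing is also only a guess at how an implementation might share code.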
4. Proposal for POSIX.2b LC_COLLATE locale definition file
Proposal: [LC_COLLATE extension for specifying weight name]
The LC_COLLATE part of localedef specifications should allow a user to give names to the weights.
=> 2.5.2.2.3 order_start Keyword. Add the following directive description and the Example.
It is implementation defined whether the following optional directive shall be recognized. If they are not supported, but present in a localedef source, they shall be ignored.
name specifies the name of a collation weight by a string. An order of weights may be specified by using the name at run time. The syntax for the name directive shall be:
"name = \"%s\"", <weight-name>
Example:
order_start forward,name="kunyomi";forward,name="radical"
If an operand has a name directive, the definition of the primary, secondary, or subsequent weights for the collation element may be different from the order of operands to the order_start keyword.
=> 2.5.3.2 Locale Grammar. Modify the opt_word description as follows:
opt_word : 'forward' | 'backward' | 'position' | 'name' '=' weight_name ;
weight_name : '"' char_list '"'
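A minimal sketch of how the extended opt_word grammar might be read, assuming weight names contain no ';', ',' or '"' characters (the proposal does not say). The function parse_order_start and its return shape are hypothetical and are not taken from any localedef implementation.

```python
import re

# Directives already defined for order_start; 'name' is the proposed addition.
DIRECTIVES = {"forward", "backward", "position"}
NAME_RE = re.compile(r'name\s*=\s*"([^"]*)"$')


def parse_order_start(line):
    """Return one (directives, weight_name) pair per collation level."""
    keyword, _, operands = line.strip().partition(" ")
    if keyword != "order_start":
        raise ValueError("not an order_start line")
    levels = []
    for operand in operands.strip().split(";"):
        directives, name = [], None
        for word in operand.split(","):
            word = word.strip()
            match = NAME_RE.match(word)
            if match:
                name = match.group(1)
            elif word in DIRECTIVES:
                directives.append(word)
            else:
                raise ValueError(f"unknown opt_word: {word!r}")
        levels.append((directives, name))
    return levels


if __name__ == "__main__":
    print(parse_order_start('order_start forward,name="kunyomi";forward,name="radical"'))
```

An implementation that does not support the directive would, per the proposed text, ignore the name operand instead of raising an error; the sketch rejects only words that are neither a known directive nor a name.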
[Attachment : Example]
Possible LC_COLLATE definition
==============================
# Stroke
collating-symbol <3stroke>
collating-symbol <4stroke>
collating-symbol <6stroke>
collating-symbol <7stroke>
collating-symbol <10stroke>
# Onyomi
collating-symbol <a>
collating-symbol <i>
collating-symbol <ka>
collating-symbol <san>
# Radical
collating-symbol <ninben>
collating-symbol <kuchi>
collating-symbol <yama>
order_start forward,name="stroke";forward,name="onyomi";\
            forward,name="radical";forward,name="JISnumber"
<j1602> <10stroke>;<a>;<kuchi>;<j1602>
<j1643> <6stroke>;<i>;<ninben>;<j1643>
<j1644> <7stroke>;<i>;<ninben>;<j1644>
<j1829> <4stroke>;<ka>;<ninben>;<j1829>
<j2719> <3stroke>;<san>;<yama>;<j2719>
Changing the order by assigning values to LC_COLLATE (b.2 method)
====================================================
LC_COLLATE=ja_JP.eucJP@weights=stroke,onyomi,radical,JISnumber
Behavior of collation functions
===============================
Output from weights=stroke,onyomi,radical,JISnumber (default)
<j2719> < <j1829> < <j1643> < <j1644> < <j1602>
Output from weights=radical,onyomi,stroke,JISnumber
<j1643> < <j1644> < <j1829> < <j1602> < <j2719>
From WG15 Copenhagen, May 1996:
PASC WG has captured this issue and has emailed an awk script (in N640r) which solves the problem. Japan would like to take the proposed solution back to Technical Experts to ensure it answers their concerns. The US DB would like comments ASAP to ensure it hits the .2b ballot window. Action on Denmark and Japan to ensure the script works for them. The issue remains open - the US DB believes their solution will not be changed.
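The run-time reordering described in the N602 attachment for this issue can be sketched in Python as follows. This is not the awk script from N640r, which is not reproduced in this document. The names RANK, TABLE and collate are hypothetical; the table data is copied from the attachment, and ranking JISnumber by ascending code is an assumption.

```python
# Rank of each collating symbol within its named weight, in declaration order.
RANK = {
    "stroke": ["3stroke", "4stroke", "6stroke", "7stroke", "10stroke"],
    "onyomi": ["a", "i", "ka", "san"],
    "radical": ["ninben", "kuchi", "yama"],
    "JISnumber": ["j1602", "j1643", "j1644", "j1829", "j2719"],  # assumed ascending
}

# Named weights per character, as in the attachment's order_start section.
TABLE = {
    "j1602": {"stroke": "10stroke", "onyomi": "a", "radical": "kuchi", "JISnumber": "j1602"},
    "j1643": {"stroke": "6stroke", "onyomi": "i", "radical": "ninben", "JISnumber": "j1643"},
    "j1644": {"stroke": "7stroke", "onyomi": "i", "radical": "ninben", "JISnumber": "j1644"},
    "j1829": {"stroke": "4stroke", "onyomi": "ka", "radical": "ninben", "JISnumber": "j1829"},
    "j2719": {"stroke": "3stroke", "onyomi": "san", "radical": "yama", "JISnumber": "j2719"},
}


def collate(chars, weights):
    """Sort chars using the named weights in the order given at run time."""
    def key(char):
        return tuple(RANK[w].index(TABLE[char][w]) for w in weights)
    return sorted(chars, key=key)


if __name__ == "__main__":
    print(collate(TABLE, ["stroke", "onyomi", "radical", "JISnumber"]))
    print(collate(TABLE, ["radical", "onyomi", "stroke", "JISnumber"]))
```

With the attachment's data, the two calls give the same two orderings that N602 lists under "Behavior of collation functions".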
locale, char, character, character map, LC_CTYPE, wctrans(), towctrans(), charconv, charclass
Description:
Japan proposes that LC_CTYPE locale definition should be extended to allow locale-specific character mappings to be specified. This extension is necessary to implement wctrans() and towctrans() functions in ISO C amendment on a POSIX conforming system.
Originator:
J
Alternatives:
Documents:
N602 RIN N158: Japanese Action Item report to WG15
N657 Data specification format for transliteration and transcription
N664 Proposal for culturally dependent fallback: Response
Solution:
Status:
Open.
History:
From WG15 RIN Orlando, October 1995:
N602 proposed the following extension to 1003.2b:
[Note: The page numbers refer to the ones of P1003.2/D10.]
Sect 2.5 (Locale) PROPOSAL. Page 8-9,12:
Problem: The LC_CTYPE (2.5.2.1) locale definition should be enhanced to allow user-specified additional character mapping, similar in the concept to the user-specified additional character class. In the Amendment of ISO C standard, extended character mapping functions (wctrans/towctrans) are specified. The following proposed extension will serve for the machinery to define locale specific character mappings used by the functions. Without having this extension, POSIX conforming systems need to have their own extensions to implement ISO C Amendment specifications.
Proposal:[LC_CTYPE extension for specifying character mapping]
The proposed extension for character mapping is similar to the extension of character class, which is already specified in the .2b draft. A new keyword 'charconv' is introduced to define locale-specific character mappings, in the same way that the 'charclass' keyword is used for character classes. The way of defining a character mapping is not extended by this proposal. The same specification as for the toupper/tolower mappings can be used for locale-specific character mappings.
EXAMPLE:
LC_CTYPE
# define the names of locale-specific character mappings
charconv tojkata;tojhira
# tojkata: hiragana => katakana mapping
tojkata (<j0401>,<j0501>);(<j0402>,<j0502>);\
        .....definition.....
# tojhira: katakana => hiragana mapping
tojhira (<j0501>,<j0401>);(<j0502>,<j0402>);\
        .....definition.....
END LC_CTYPE
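To illustrate how named mappings such as tojkata and tojhira might be consumed, here is a small Python sketch modelled loosely on the wctrans()/towctrans() pair. It is an assumption-laden illustration: the tables use Unicode code points (hiragana U+3041 to U+3096, katakana 0x60 higher) rather than the JIS symbolic names of the example, and the Python functions only mimic the shape of the C interfaces.

```python
# Assumption: Unicode code points stand in for the <jNNNN> symbolic names.
HIRAGANA_FIRST, HIRAGANA_LAST = 0x3041, 0x3096
KANA_OFFSET = 0x60  # distance from a hiragana letter to its katakana counterpart

MAPPINGS = {
    "tojkata": {cp: cp + KANA_OFFSET
                for cp in range(HIRAGANA_FIRST, HIRAGANA_LAST + 1)},
}
# tojhira is the inverse table, as in the locale source example.
MAPPINGS["tojhira"] = {kata: hira for hira, kata in MAPPINGS["tojkata"].items()}


def wctrans(name):
    """Look up a named mapping; None plays the role of an invalid descriptor."""
    return MAPPINGS.get(name)


def towctrans(ch, desc):
    """Map one character through desc; unmapped characters are returned unchanged."""
    if desc is None:
        return ch
    return chr(desc.get(ord(ch), ord(ch)))


if __name__ == "__main__":
    print(towctrans("\u3042", wctrans("tojkata")))  # hiragana A mapped to katakana A
```

A mapping that is declared but has no pairs is simply an empty table here, so towctrans() would return every character unchanged.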
[Proposed extension to .2b text]
[Page 8] => 2.5.2.1 LC_CTYPE. Add the following keyword items after the item labeled tolower:
charconv Define one or more locale-specific character mapping names as strings separated by semicolons. Each named character mapping can then be defined subsequently in the LC_CTYPE definition. A character mapping name shall consist of at least one and at most fourteen bytes of alphanumeric characters from the portable filename character set. The first character of a character mapping name cannot be a digit. The name cannot match any of the LC_CTYPE keywords defined in this standard.
charconv-name Define the named locale-specific character mapping. In the POSIX Locale, the locale-specific named character mapping need not exist.
If a mapping name is defined by a charconv keyword, but no character mappings are subsequently assigned to it, this is not an error; it shall represent a mapping without any character pairs belonging to it.
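The naming rules for charconv can be checked mechanically. The sketch below is one possible reading: "alphanumeric characters from the portable filename character set" is taken to mean ASCII letters and digits, and the RESERVED set is a partial, assumed list of LC_CTYPE keywords, not the normative one. The function name is hypothetical.

```python
import re

# Assumed, incomplete list of LC_CTYPE keywords that a mapping name may not match.
RESERVED = {
    "upper", "lower", "alpha", "digit", "space", "cntrl", "punct", "graph",
    "print", "xdigit", "blank", "toupper", "tolower", "charclass", "charconv",
}

# One leading letter, then up to thirteen more letters or digits (1 to 14 bytes).
NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9]{0,13}")


def is_valid_charconv_name(name):
    """Return True if name satisfies the proposed charconv naming rules."""
    return NAME_RE.fullmatch(name) is not None and name not in RESERVED


if __name__ == "__main__":
    for candidate in ("tojkata", "1kana", "toupper"):
        print(candidate, is_valid_charconv_name(candidate))
```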
[Page 12] => 2.5.3.1 Locale Lexical Conventions. Add the following token description:
CHARCONV A string of alphanumeric characters from the portable character set, the first of which shall not be a digit, consisting of at least one and at most fourteen bytes, and optionally surrounded by double-quotes.
[Page 12] => 2.5.3.2 Locale Grammar. Modify the ctype_keyword and charconv_keyword descriptions as follows:
ctype_keyword : charclass_keyword charclass_list EOL
              | charwidth_keyword charclass_list EOL
              | defwidth_keyword defwidth_value EOL
              | charconv_keyword charconv_list EOL
              | 'charclass' charclass_namelist EOL
              | 'charconv' charconv_namelist EOL
              ;
charconv_namelist : charconv_namelist ';' CHARCONV
                  | CHARCONV
                  ;
charconv_keyword : 'toupper'
                 | 'tolower'
                 | CHARCONV
                 ;
From WG15 Copenhagen, May 1996:
N657 and N664 refer. N657 is an expert contribution from Denmark, N664 is not an official US response - it comes direct from the .2b group. The US development body asked for clarification of the Japanese proposal: does it require just character-to-character translation, or character-to-string, which is a much larger problem. WG15 actioned KS to provide details of existing implementations of the proposal in N657:
| 9605-23 Keld Simonsen - supply a table of information about
| research and products that support the functionality
| of the LC_TRANS extension to the IEEE 1003.2 working
| group by June 15.
WG15 further actioned KS to respond to the queries raised in N664 by 1-July for consideration by the IEEE 1003.2b DB.
| From: keld@dkuug.dk (Keld Jørn Simonsen)
| Date: Sun, 7 Jul 1996 17:17:11 +0200
| In-Reply-To: Yoichi Suehiro <suehiro@jrd.dec-j.co.jp>
| "(SC22WG15.849) Comments on .2b Japanese proposals" (Jul 1, 8:28)
| Subject: Re: (SC22WG15.849) Comments on .2b Japanese proposals
| The reference is the C3 system for coded character information conversion,
| for further information refer to http://www.nada.kth.se/i18n/c3/
collation, element, regular, expression, pattern, LC_COLLATE, localedef
Description:
The user-defined ordering of collation elements in an LC_COLLATE table is inadequately specified. Different but equally valid tables can produce differing results when used as the basis of regular expressions, pattern matching, etc.
Originator:
DK
Alternatives:
None.
Documents:
N605 RIN N160: DS Additional comments on P1003.2b/D11
Solution:
Status:
Open.
History:
From WG15 RIN Orlando, October 1995:
@ 2.8 o 5
line 379: The range expression should not be dependent on the collation element order, but rather the result of the comparison using the relevant collation. Using the collating element order is not proper, and confusing to users that only have expectations as defined by the collation rules.
From WG15 Copenhagen, May 1996:
1003.2 is ambiguous on this point and 1003.2b will not be able to fix the problem. There are two fairly simple solutions, but they are mutually exclusive, and the proponents of each solution do not readily admit to the possibility that the alternative solution may be valid.
This issue remains open.
Additional historical notes:
This request was forwarded to IEEE from X/Open at the end of 1993 for interpretation.
Section 2.5.2.2, LC_COLLATE: "User-defined ordering of collating elements. Each collating element shall be assigned a collation value defining its order in the character (or basic) collation sequence. This ordering is used by regular expressions and pattern matching and, unless collation weights are explicitly specified, also as the collation weight to be used in sorting."
Given this passage, assume there are two similar LC_COLLATE fragments. The fragments include lowercase letters only to simplify the examples. Here is the first fragment:
<a>        <a>;<a>;<a>
<a-grave>  <a>;<a-grave>;<a-grave>
<a-acute>  <a>;<a-acute>;<a-acute>
<b>        <b>;<b>;<b>
<c>        <c>;<c>;<c>
<d>        <d>;<d>;<d>
. . .
<z>        <z>;<z>;<z>
. . .
Here is the second fragment:
<a>        <a>;<a>;<a>
<b>        <b>;<b>;<b>
<c>        <c>;<c>;<c>
<d>        <d>;<d>;<d>
. . .
<z>        <z>;<z>;<z>
<a-grave>  <a>;<a-grave>;<a-grave>
<a-acute>  <a>;<a-acute>;<a-acute>
. . .
Suppose a user wanted to find all words that begin with a letter in the range a-c. An XoJIG meeting agreed that a locale built using the first fragment returns words that begin with <a>, <a-grave>, <a-acute>, <b>, and <c>. However, there were varying opinions about whether the second fragment would return the same results, or would exclude <a-grave> and <a-acute>. So the question is this:
Should an RE run against a locale built using the second fragment include the accented a's in the range because they are defined as being in the same equivalence class as <a>, or should it exclude the accented a's because they are listed outside the range of a-c?
A preliminary response was obtained from IEEE in Feb 1994:
The standard is unclear on this issue, and as such no conformance distinction can be made between alternative implementations based on this. This is being referred to the Sponsors of the standard for clarifying wording in the next amendment.
This response will be incorporated in an IEEE interpretations publication, and will also be made available on-line on the IEEE SPA system.
IEEE Interpretation for 1003.2-1992
-----------------------------------
The standard is ambiguous in this area, since it is not clear what the phrase "collation sequence order" means or is. The two possibilities are "the order in locale file", or "the order determined by the weights in the locale file". The standard allows either behavior. Concern over the wording of this area has been forwarded to the Sponsors of the standard.
Rationale for Interpretation:
-----------------------------
None.
________________________________________________________________
(c) 1994 The Institute of Electrical and Electronic Engineers, Inc. Not to be published without prior written permission of the IEEE.
Andrew Josey | PASC Vice-Chair Interpretations
------
DS finds it unnecessarily complex to introduce two levels for comparisons, one that is related to the comparison functions, and one that is related to the order in which the weights appear in a localedef definition file. The latter is normally not part of the definition of the collation order, but becomes significant if this interpretation is favoured. The first interpretation should be favoured, as the algorithm is already known by the user, and gives the less unexpected result.
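The two readings discussed for this issue can be compared with a short Python sketch. It models each fragment as a list of collating elements in file order plus the three-level weights shown for lowercase letters; the function names are hypothetical, and comparing full weight tuples is only one way to realise "the order determined by the weights".

```python
# Three-level weights, identical in both fragments.
WEIGHTS = {
    "a": ("a", "a", "a"),
    "a-grave": ("a", "a-grave", "a-grave"),
    "a-acute": ("a", "a-acute", "a-acute"),
    "b": ("b", "b", "b"),
    "c": ("c", "c", "c"),
    "d": ("d", "d", "d"),
    "z": ("z", "z", "z"),
}

# File order of the collating elements in each fragment (elided parts omitted).
FIRST = ["a", "a-grave", "a-acute", "b", "c", "d", "z"]
SECOND = ["a", "b", "c", "d", "z", "a-grave", "a-acute"]


def range_by_file_order(order, lo, hi):
    """Reading 1: membership follows position in the locale file."""
    pos = {elem: i for i, elem in enumerate(order)}
    return [e for e in order if pos[lo] <= pos[e] <= pos[hi]]


def range_by_weights(order, lo, hi):
    """Reading 2: membership follows comparison of the assigned weights."""
    pos = {elem: i for i, elem in enumerate(order)}

    def key(elem):
        return tuple(pos[w] for w in WEIGHTS[elem])

    return [e for e in order if key(lo) <= key(e) <= key(hi)]


if __name__ == "__main__":
    for label, fragment in (("first", FIRST), ("second", SECOND)):
        print(label, range_by_file_order(fragment, "a", "c"),
              range_by_weights(fragment, "a", "c"))
```

Under these assumptions both readings agree for the first fragment, while for the second fragment only the weight-based reading keeps the accented letters in the range a-c, which is the divergence the interpretation request describes.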
RIN, Internationalization, WG20, multi-lingual.
Description:
Proposed WG on internationalization should have input from RIN, but it isn't formed yet. How to address liaison statements.
Originator:
RIN (6/90)
Alternatives:
WG15, 6/90: Wait until group is formed, then send liaison statement. Keep as issue until completed.
Arguments:
None.
Resolution:
WG15, 06/91: Send liaison statement to WG20 now that it is formed.
History:
This issue was raised at the WG15 Paris meeting when the Rapporteur Group on Internationalisation requested WG15 (RIN 027 refers) to advise the new SC22 working group on Internationalisation about multi-lingual and multi-directional presentation problems, e.g. numbers in Hebrew text.
There was uncertainty at the time about how to proceed on this issue.
This issue was closed at the Rotterdam meeting, where the issues list was reviewed. It was noted that the problem can be forwarded to SC22/WG20, now that it has been 'formed'.
See WG15 resolution 133:
133 Recommendation to WG20
Whereas ISO/IEC JTC1/SC22/WG20 is newly formed for internationalization under ISO/IEC JTC1/SC22, and
Whereas a Rapporteur Group for Internationalization (RIN) has been formed in ISO/IEC JTC1/SC22/WG15 with active participation from the National Bodies, and
Whereas ISO/IEC JTC1/SC22/WG15 work such as ISO/IEC 9945-2 has introduced a suitable set of internationalization features including, but not limited to, the following:
- date and time format
- numeric and monetary representation
- collation
- character map
- character classification
- national locales
- national profiles, and
Whereas ISO/IEC JTC1/SC22/WG15 believes that the RIN work also would be useful for the work of ISO/IEC JTC1/SC22/WG20,
Therefore ISO/IEC JTC1/SC22/WG15 requests that ISO/IEC JTC1/SC22/WG20 work with ISO/IEC JTC1/SC22/WG15/RIN to agree how to best move work in these areas forward.
This action was reviewed at the WG15 meeting, Stockholm, as 9105-08:
"Convenor: Forward WG15RIN-N027 to WG20, Internationalisation, and to SC2, SC18, and SC20. (Arises from discussion of issues list item, "multi-lingual, multi-directional presentation", under agenda item 2.2.) Status: Done."
ballot responses, NPs
Description:
There is concern about the visibility of responses to ballots returned to SC22; they should be circulated to WG15 as they become available.
Originator:
WG15 (05/91)
Alternatives:
None proposed.
Arguments:
Resolution:
Resolved by SC22 disposition of NPs.
History:
This issue was first raised at WG15's Rotterdam meeting:
Comments in the Rotterdam minutes on this item noted that: "in order that WG15 member's concerns about the visibility of responses to ballots returned to SC22, these responses will be circulated to WG15 as they become available in future."
CD 9945-2.2
Description:
When will 9945-2.2 become available?
Originator:
WG15 (05/91)
Alternatives:
US: (05/92) When the IEEE 1003.2 document becomes sufficiently stable.
WG15: (05/92) Move to issues list.
Arguments:
None recorded.
Resolution:
Closed by SC22 N1063.
History:
This issue was first raised at WG15's Rotterdam meeting:
This was formerly action 9010-7, but was moved to the issues list pending a decision by the US member body that 1003.2 has become stable enough for submission. As well as being required for CD registration, it is also required for liaison with SC2 on character set issues.
real time, concurrency, France
Description:
There is a need to investigate potential conflict between the real time extension of POSIX and the real time concurrency features of programming languages.
Originator:
WG15 (10/92)
Alternatives:
None proposed.
Arguments:
None recorded.
Resolution:
The threads amendment ISO/IEC 9945-1 AM2 contains specifications that resolve the issue to the satisfaction of the French member body.
History:
This issue was first raised at WG15's Reading meeting:
Liaison between AFNOR POSIX WG, and AFNOR Ada Real Time WG (N295) led to discussions about potential overlap and inconsistencies between the real time amendment to 9945-1 (currently at CD stage) and the NP on real time extension to Ada.
A future work item for WG13 is the development of a multiple processor real time facility. This facility may conflict with POSIX services in this area.
At the time this issue originated, only Ada and Modula-2 were known to be of concern.
At the Orlando meeting (1995-10), the issue was recorded as resolved and closed, as noted above.