WG15 Issues List

                                                        ISO/IEC SC22/WG15 SD3

ISO/IEC JTC1/SC22/WG15 Issues List


2.   Documenting National Profiles.                     (Open)    
3.   Disposition of Canadian Comment, 9945-1.           (Open)    
7.   WG15 base standards subsetting                     (Open)    
8.   localedef user-specified collation weight names    (Open)    
9.   Japanese proposal for LC_CTYPE extension           (Open)    
10.  Range expression dependency                        (Open)    

1.   RIN input to WG20                                  (Resolved)
4.   Publication of responses to Ballot comments        (Resolved)
5.   CD 9945-2.2 Document not identified                (Resolved)
6.   Language interaction pertaining to real time.      (Resolved)


Open Issues:

2. Title: Documenting National Profiles. [Open]

        National Profiles, Profiles, internationalization

        Where should National Profiles be documented:
        - as ISPs?
        - as entries in a registry?
        - in a separate TR?
        - or a combination thereof?
        WG15 (06/90)

        Ad-hoc: (05/91) Need a separate document, as all national profiles
           would be too large a set.
        None Recorded, but see history for background.
        Open.  Pending the outcome of the RIN report N273, and the planned
        guideline on national profiles.
        The issue was revisited at the Rotterdam meeting,  where the
        following action item was approved:
        9010-9:  RIN & convenor:  POSIX National Profiles.  Seek
                 guidance from Rapporteur Group on Internationalization
                 on improvements to the practice of WG15's handling of
                 POSIX National Profiles.  (9006-37)
                 Open pending start of activity by SC22/WG20.
                 Becomes 9105-2.
        An ad-hoc group on (Profile?) Coordination also fed into the
        Rotterdam WG15 Plenary, where Arnie Powell reported that:
        "The break-out group was of the opinion that standards
         documents should not contain copies of all applicable
         national profiles; there was need of guidance from JTC1/SGFS
         on the format of profile standards.  (See resolution
         117.)  Jim Isaak commented that "they will not have any idea
         what [a national profile] is", so they would have to be fully
         briefed before being able to deliver useful guidance.
         Although no specific action arose, Willem Wakker is to
         coordinate with SGFS. (See also [Rotterdam minutes] 6.3, and
         actions 9105-7 and 9105-30.)"
       [Rotterdam also  passed resolution 117  on visiting this item
        in the Issues list,  but this is concerned largely with SGFS
        Rotterdam resolution 156 also refers to this issue:
        156   Synchronizing National Profiles
              Whereas ISO/IEC JTC1/SC22/WG15 understands that synchronization
              problems may arise if National Profiles are included in multiple
              POSIX standards documents,
              Therefore ISO/IEC JTC1/SC22/WG15 instructs its Convenor
              to review this issue and seek comment from the IEEE
              Standards Department as to how this might be addressed
              and to report back the findings to ISO/IEC JTC1/SC22/WG15
              at the next ISO/IEC JTC1/SC22/WG15 meeting.
        Orlando meeting 1995-10:
        It was noted that it is not feasible to have the national
        profiles in separate sections of the standards, as the profiles
        are too big and diverse. This possible solution was thus removed
        from the description. It was also noted that the forthcoming CEN
        cultural registry has provisions for registering national
        profiles, and that there is a possibility, given appropriately
        defined entries in the taxonomy, that national profiles could
        be given ISP status according to TR 10000-3. Furthermore it was
        noted that the planned "Guidelines for national profiles and
        locales" TR would address the problem fully.

3. Title: Disposition of Canadian Comment, 9945-1. [Open]

        9945-1, Canada, internationalization, interchange.

        Tar and cpio interchange formats are not adequate for international
        WG15 (Canada) (06/90)

        WG15 (6/90): hold as open issue until changes can be addressed.
        None recorded.
        Open, pending technical resolution.
        The disposition of comments (N085) item 6 addressed Canadian
        comments by suggesting additional text to 9945-1.2 section 10
        lines 80-81 and 102-104 (tar) and lines 270-272 (cpio).
        However, the changes were considered to be normative and
        therefore could not be endorsed for inclusion in the 9945-1
        standard at the time the issue was entered.
        The proposed resolution was to include the proposed changes in
        the next amendment to 9945-1.
        It is unknown whether this solution was formally adopted.
        Action on the US Development Body:
                Query the status of this issue and provide a report by the
                Oct 94 Meeting.
        Orlando meeting 1995-10: A solution satisfying the Canadian member
        body is being worked on in the planned amendment to ISO/IEC 9945-2,
        covering IS 10646 usage in the pax utility.

7. Title: WG15 base standards subsetting [Open]

        Profiles, subsetting, SGFS
        What approach, if any, is appropriate for subsetting of WG15 base
        No subsetting allowed
        None recorded
        Original description: There are no standards for how
        standards should be subsetted. The "no subsets allowed"
        principle established by SGFS may not be sufficient.
        PASC has an ad-hoc working on this subject to report back
        to the PASC in 01/94.
        P1003.13 will define how the pieces should be broken out,
        this will then be fed back to 1003.1, who will then find
        a way to subset 1003.1. 1003.13 will then point to 1003.1.
        Recorded 1995-10-25: PASC granted 1003.13 a waiver to do
        subsetting as an exceptional case, and 1003.13 revised
        the way pieces should be broken out from 9945-1. These
        are included in P1003.13/D7, August 1995.

8. Title: localedef user-specified collation weight names [Open]

                localedef, collation, weight, LC_COLLATE
                A mechanism for the specification of named collation
                weights in the LC_COLLATE section of locales,
                particularly to support non-Latin character scripts to
                manage a number of sorting algorithms.
        N245    Summary of voting & comments on 2nd CD 9945-2: Shell & Utilities
        N281    Disposition of comments on CD 9945-2.2
        N330    Japanese comments on Posix .2b/D4
    RIN N106    Japanese Proposal to POSIX 1003.2b
        N602    Japanese Action Item Report to WG15, October 1995
        N640r   US TAG N573, N587: AI 9510-14, Report on POSIX.2b Issues
                None as yet.  The proposal has been accepted in
                principle.  The US development body has asked for
                specific wording to be supplied by Japan for inclusion
                in a revision to the standard.
                Open.  Awaiting input from the Japanese MB to 9945-2Amd2b.

From WG15 Hamilton, May 1992:

        N245, the comments on CD 9945-2, and N281, the disposition of
        those comments, contained the Japanese MB objection <ITSCJ.30>
        relating to collation weight names; a similar later version
        (below) was recorded at the WG15 Reading meeting.  The proposed
        disposition of <ITSCJ.30> is contained in N281 as:
        We believe that this change, or something similar to accomplish
        the same objective, should be studied for inclusion in the
        POSIX.2b revision and the full international standard.
From WG15 Reading, October 1992:
        N330 contained the Japanese MB comments on POSIX.2b D4; they
        <ITSCJ.2b.9>  Sect  (LC_COLLATE)  PROPOSAL
        In most cases of ideographic characters, it is a requirement
        that a user be able to specify collation weights as he/she
        wants.  In case of Japanese characters (Kanji), for example,
        there are five possible collation weights for supporting
        Japanese SORT.  The five weights are On-yomi (pseudo-Chinese
        pronunciation), Kun-yomi (Japanese pronunciation), number of
        strokes, radical (components of Kanji), and Kanji character
        code.  There could be more weights.  The LC_COLLATE part of
        localedef specifications should allow a user to describe these
        weights and give names to the weights.  Any combinations of the
        defined weights should be able to be specified by the user at
        LC_COLLATE extension for specifying weight name
        => order start Keyword.  Add the following directive
        description and the Example.
            It is implementation defined whether the following optional
            directive shall be recognised.  If they are not supported,
            but present in a localedef source, they shall be ignored.
            name    specifies the name of a collation weight by a
                    string.  An order of weights may be specified by
                    using the name at run time.
                    The syntax for the name directive shall be:
                    "name =
            If an operand has a name directive, the definition of the
            primary, secondary, or subsequent weights for the collation
            element may be different from the order of operands to the
            order_start keyword.
        => Locale Grammar.  Modify the opt_word description as
            opt_word            : 'forward' | 'backward' | 'position'
                                  | 'name' '=' weight_name
            weight_name         : '"' char_list '"'
        User's requirements for character collation in Asia are diverse.
        Ideographic characters have several rules to sort such as by
        pronunciations, strokes, etc. and the combination of the rules
        are used for their sorting.  Those properties for a character
        such as pronunciation can be assigned as weights for a character
        element.  However, no standard primary weight, secondary weight
        and so on exists for the weights (properties).  The weight name
        extension for LC_COLLATE allows the order of multiple weights to
        be defined at run time in a different order than the order of
        operands to the order_start keyword.  To make the
        different order effective, the weight names can be specified in
        the setting of LC_COLLATE category.
                order_start forward,name="kunyomi";forward,name="radical"
        When a ja_JP.eucJP locale has the above definition in the
        LC_COLLATE part, the order of sorting rules can be specified as
        follows by using the weight names:
                LC_COLLATE = ja_JP.eucJP@weights=radical,kunyomi
        This means that the sort-rule "radical" is used as the primary
        weight and "kunyomi" is used as the secondary weight.
From WG15 RIN Heidelberg, May 1993:
        3.1.3 user-specified collation weight names based upon phonetic,
        character based(radical), or code based.  Dynamic based control
        of collation based upon sort key.  The ability to switch pointer
        dynamically to bring collation tables into correct sequence.
        Japanese delegation has submitted two written requests without
        supporting material.[?]  Next version would be submitted by June
        18, 1993.
From WG15 RIN Annapolis, October 1993:
        Action Item reports:
        The action list was lost.  The minutes of the previous meeting
        were scanned to recover as many action items as possible;  these
        were determined to be as follows:
        9305-01 Requirement for user-specified collation weights.
                MDR-02 contains the Japanese proposal on collation
                weights.  (Closed)
                MDR-02  ->    RIN N106: Japanese Proposal to POSIX 1003.2b
        3.1    I18N in POSIX.2b
        Specific actions were taken in Annex H to address Denmark and
        Japanese concerns for May 93 Heidelberg meeting.  Japan needs
        feedback for timeline to produce material for coordination with
        1003.2b.  Resolution to be produced asking for timeline for
        national body contributions.  The rest of 3.1 [including N106]
        was postponed to the next meeting, due to lack of knowledge of
        the current status of .2b and lack of input papers received in
        9310-09 Lead Rapporteur:  distribute documents N105, N106, N109
                and N113 to the RIN mailing list together with a cover
                note indicating that these documents will be discussed
                at the next WG15 RIN meeting, May 1994, and also
                indicating which agenda items will be touched by the
From WG15 RIN Vancouver, October 1994:
        9405-05 Member Bodies to review N105 (Japanese comments on .1a),
        N106 (Japanese comments on .2b), N109 (SC22/WG20 guidelines for
        the use of extended identifiers in programming languages), N113
        (CEN standard for string ordering) for determination of
        appropriate action prior to Oct. Meeting 10/94:  OPEN:  Prof.
        Saito noted they are preparing a Japanese standard for character
        The above action item was carried through from May 1994 to the
        May 1995 meeting.
From WG15 RIN Twente, May 1995:
        3.1.3  localedef user-specified collation weight names--Japan
                making proposal for Annex H--removed to issues list
From 9945-2:1993 Annex H.1:
     (4)  The LC_COLLATE locale definition should be enhanced to
          allow user-specified names for collation weights.  A proposal
          from Japan is expected in this area.
        This text has been removed from P1003.2b Draft 11, May 1995.
From WG15 RIN Orlando, October 1995:
        N158 [WG15 N602] includes new input to this item; Japan is still
        working on this item; solutions to some of the problems are not
        yet obvious.  Japan needs discussion of their paper to help them
        go forward.
        [N602 includes the following:]
        LC_COLLATE extension for user-specific names of collation weights
        Title:    Japanese proposal to POSIX.2b on LC_COLLATE extension for
                  user-specified names of collation weights
        Status: Japanese position
        Short description:
        Japan proposes to extend LC_COLLATE locale definition in 
        POSIX.2b so that names can be assigned to collation 
        weights. This proposal is the response to the item (4) of 
        ISO/IEC 9945-2:1993 Annex H.1 in which a proposal from
        Japan is expected.
        Text of contribution:
        [Note: The page numbers refer to the ones of P1003.2/D10.]
        Sect (LC_COLLATE)  PROPOSAL.                  page 10:
        1. General Requirements
        In most cases of ideographic characters, it is a requirement that
        a user be able to specify the combination of collation weights as
        he/she wants. Japanese kanji characters, for example, have five
        (or more) typical collation weights to support Japanese SORT.
        The five weights are On-yomi (pseudo-Chinese pronunciation),
        Kun-yomi (Japanese pronunciation), Number of strokes, Radical
        (components of Kanji), and Kanji character code. There are many
        possible combinations of these weights and the requirements for
        them (number and order of weights) may change according to the
        type of data sorted, the purpose of sorting, user's preference,
        etc. Users (or applications) want to specify the method of
        sorting by specifying the primary weight and the secondary
        weight, and so on. Because no names are available for the
        combination of multiple weights, it is a reasonable requirement
        that users can use the name of each collation weight for
        specifying the method of collation. That is the way in which most
        sorting utilities existing in Japan are implemented.
        The concepts of the weights for kanji characters mentioned above
        are common knowledge in Japan. However, there are no
        standards for the weights of Japanese kanji characters. So the
        details of assigning weights can differ slightly among
        implementations depending on which information source
        (dictionary, etc.) is used for making the weights. It is
        difficult to handle such differences by using a pre-defined
        sorting method. If each weight can be handled independently, it
        will be easier to manage.
        ISO 10646 (UCS) is now a standard. UCS can be used as a codeset
        for any locale whose character sets are included in it. Even
        though UCS can be used in many different countries, the
        requirements for sorting characters differ country by country.
        The size of locale databases is also a concern when using UCS.
        It is a requirement that the above kanji sorting requirements
        can be met without problems when UCS is used as a codeset.
        2. Problem in using current POSIX.2 standards specification
        The current locale model seems to assume a well-defined
        collation definition for each locale. However, this does not
        match the requirements for sorting ideographic characters. There
        is an opinion that the current .2 specification does allow an
        implementation to satisfy most (not all) of the above
        requirements: producing locales for all possible combinations of
        weights, and naming each locale, is the possible solution based
        on the existing standards specification. Besides not being a
        complete solution, this approach seems impractical on the
        following points.
            a. Size of locale databases
            There are about 12,000 kanji characters defined in JIS standards
            (JIS X0208 + JIS X0212). Because each possible combination of
            available weights needs to have a database, the total size of
            locale databases containing such a large number of characters
            cannot be ignored (for example, 12,000 characters x 20 databases).
            When a locale for the ISO 10646 code set is defined, the problem
            will be even more serious.
            b. Identification of each collation method
            "Onyomi", "Kunyomi", etc. are well-known names as methods of
            sorting kanji characters. However, the problem is that no names
            are available for the combinations of the primitive methods.
            Implementors need to invent new names for the methods. (for
            example, onyomi_strokes_radical, kanji0102, etc.) The possibility
            of establishing a standard or de facto standard for the names of
            these combinations is very low. Hence, this approach will not be
            Considering these problems, without extending current
            specification of LC_COLLATE, standard collation API such as
            wcscoll can support only limited ways of collation for kanji
            data, for example JIS code values. In this situation,
            applications which handle character orderings (for example,
            database applications) cannot rely on locale databases to sort
            kanji data. Some applications will support several collating
            methods by having their own ordering databases. Some applications
            will simply neglect the various sorting requirements for Kanji.
        3. Overview of LC_COLLATE proposal
        By extending the LC_COLLATE specification, a single locale
        database can define multiple definitions of weights for kanji
        with their names. It is envisioned that the order of multiple
        weights can be specified at run time in a different order than
        the order of operands to the order_start keyword. To make the
        different order effective, extension of another part of the
        POSIX standards may be necessary. The weight names specified in
        the database should be referenced by a user or an application,
        and the behavior of the collation API needs to be modified
        according to the specified sorting method.
        The proposal for allowing users to specify collation methods is
        expected to work as follows.
            a. Define collation weights with names in LC_COLLATE
            Define collation weights with names in the locale database.
             order_start forward,name="kunyomi";forward,name="radical"
             <char-1>   <kunyomi weight for char-1>;<radical weight for char-1>
             <char-2>   <kunyomi weight for char-2>;<radical weight for char-2>

            b. Specify sorting methods
            There are two possible extensions to specify preferred collation.
            One is to introduce new environment variable (b.1), and the other
            is to use LC_COLLATE (b.2).
            b.1 Set the environment variable COLLWEIGHTS to preferred
               collation combination using names defined in the locale database.
                (Primary weight=radical, Secondary weight=kunyomi)
            b.2 Alternatively, existing LC_COLLATE environment variable
                can be used to specify user's preference. The weight
                names are specified after the string "@weights=" modifier.
                LC_COLLATE=ja_JP.eucJP@weights=radical,kunyomi
            c. Initialize collation data
                There are two possible extensions to set collation methods
                at run time. One is to introduce new API (c.1), and the
                other is to use setlocale() (c.2).
            c.1 The call to setweights() initializes the collation method
                from the setting of the COLLWEIGHTS environment variable. The
                setweights function can be used to change the method of
                collation at run time.
            c.2 The call to setlocale(LC_ALL, "") initializes the collation
                method from the setting of the COLLWEIGHTS (or LC_COLLATE)
                environment variable.  The setlocale function can be used
                to change the method of collation at run time.
            d. API behavior
               Collation APIs such as wcscoll work depending on the current
               setting of collation method.
        The details of the proposal for extended use of environment
        variables and the initialization by API are not decided yet. The
        proposed extension to the locale definition file is described below.
        Detailed proposals for the other parts are not ready yet.
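
The b.2 method above can be pictured with a minimal sketch, assuming the
"@weights=" modifier carries a comma-separated list of weight names. The
function names parse_collate_setting and weight_order are hypothetical and
appear in no draft; the rule that unrequested weights follow in their
declared order is likewise an assumption made only for illustration.

```python
def parse_collate_setting(value):
    """Split 'ja_JP.eucJP@weights=radical,kunyomi' into (locale, [names])."""
    base, sep, modifier = value.partition("@")
    prefix = "weights="
    if not sep or not modifier.startswith(prefix):
        # No weight override present: leave the setting untouched.
        return value, []
    names = [n.strip() for n in modifier[len(prefix):].split(",") if n.strip()]
    return base, names


def weight_order(declared, requested):
    """Return indexes into the declared weights, requested names first.

    Assumption: weights not named by the user keep their declared order
    after the requested ones; unknown names are rejected.
    """
    unknown = [n for n in requested if n not in declared]
    if unknown:
        raise ValueError("undeclared weight names: %s" % ", ".join(unknown))
    rest = [n for n in declared if n not in requested]
    return [declared.index(n) for n in list(requested) + rest]
```

With the locale of step a (weights declared as kunyomi, then radical), the
setting of b.2 would select the radical weight first and the kunyomi weight
second.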
        4. Proposal for POSIX.2b LC_COLLATE locale definition file
        Proposal: [LC_COLLATE extension for specifying weight name]
        The LC_COLLATE part of localedef specifications should allow a
        user to give names to the weights.
        => order_start Keyword. Add the following directive
           description and the Example.
                It is implementation defined whether the following optional
                directive shall be recognized. If they are not supported, but
                present in a localedef source, they shall be ignored.
                name    specifies the name of a collation weight by a string.
                        An order of weights may be specified by using the name
                        at run time.
                        The syntax for the name directive shall be:
                                "name = \"%s\"", <weight-name>
                    order_start forward,name="kunyomi";forward,name="radical"
                If an operand has a name directive, the definition of the
                primary, secondary, or subsequent weights for the collation
                element may be different from the order of operands to the
                order_start keyword.
        => Locale Grammar. Modify the opt_word description as follows:
                opt_word        : 'forward' | 'backward' | 'position'
                                | 'name' '=' weight_name
                weight_name     : '"' char_list '"'
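
As a rough illustration of the modified opt_word grammar, the sketch below
reads the operands of an order_start line into one entry per weight level.
It is not taken from any draft: parse_order_start is a hypothetical name,
and the sketch assumes that weight names contain no ';', ',' or '"'
characters so that simple splitting is sufficient.

```python
import re

# opt_word alternatives from the grammar above, plus the proposed name form.
_DIRECTIVES = {"forward", "backward", "position"}
_NAME = re.compile(r'name\s*=\s*"([^"]+)"$')


def parse_order_start(operands):
    """Return a list of (directives, weight_name) tuples, one per level."""
    levels = []
    for operand in operands.split(";"):
        directives, name = [], None
        for word in operand.split(","):
            word = word.strip()
            match = _NAME.match(word)
            if match:
                name = match.group(1)
            elif word in _DIRECTIVES:
                directives.append(word)
            else:
                raise ValueError("unknown opt_word: %r" % word)
        levels.append((directives, name))
    return levels
```

Levels without a name directive come back with None as the name, which
corresponds to the existing, unnamed behaviour.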
        [Attachment : Example]
        Possible LC_COLLATE definition
        # Stroke
        collating-symbol <3stroke>
        collating-symbol <4stroke>
        collating-symbol <6stroke>
        collating-symbol <7stroke>
        collating-symbol <10stroke>
        # Onyomi
        collating-symbol <a>
        collating-symbol <i>
        collating-symbol <ka>
        collating-symbol <san>
        # Radical
        collating-symbol <ninben>
        collating-symbol <kuchi>
        collating-symbol <yama>

        order_start     forward,name="stroke";forward,name="onyomi";\
        <j1602>         <10stroke>;<a>;<kuchi>;<j1602>
        <j1643>         <6stroke>;<i>;<ninben>;<j1643>
        <j1644>         <7stroke>;<i>;<ninben>;<j1644>
        <j1829>         <4stroke>;<ka>;<ninben>;<j1829>
        <j2719>         <3stroke>;<san>;<yama>;<j2719>
        Changing the order by assigning values to LC_COLLATE (b.2 method)
        Behavior of collation functions
        Output from weights=stroke,onyomi,radical,JISnumber (default)
                <j2719> < <j1829> < <j1643> < <j1644> < <j1602>
        Output from weights=radical,onyomi,stroke,JISnumber
                <j1643> < <j1644> < <j1829> < <j1602> < <j2719>
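
The two orderings in the attachment can be reproduced with a small sketch.
It assumes that each collating symbol ranks in the order in which it is
declared, and that the fourth weight (the element itself) ranks by its JIS
number; both are assumptions for illustration, as are the names TABLE,
weights_for and sort_elements.

```python
# Collating symbols in declaration order; list position stands in for weight.
STROKE = ["3stroke", "4stroke", "6stroke", "7stroke", "10stroke"]
ONYOMI = ["a", "i", "ka", "san"]
RADICAL = ["ninben", "kuchi", "yama"]

# Rows of the attachment: element -> (stroke, onyomi, radical).
TABLE = {
    "j1602": ("10stroke", "a", "kuchi"),
    "j1643": ("6stroke", "i", "ninben"),
    "j1644": ("7stroke", "i", "ninben"),
    "j1829": ("4stroke", "ka", "ninben"),
    "j2719": ("3stroke", "san", "yama"),
}


def weights_for(element):
    """Return the named weights of one collating element."""
    stroke, onyomi, radical = TABLE[element]
    return {
        "stroke": STROKE.index(stroke),
        "onyomi": ONYOMI.index(onyomi),
        "radical": RADICAL.index(radical),
        "JISnumber": int(element[1:]),  # assumed: ordered by JIS number
    }


def sort_elements(elements, order):
    """Sort elements using the weights in the order the user named them."""
    return sorted(elements, key=lambda e: tuple(weights_for(e)[n] for n in order))


default = sort_elements(TABLE, ["stroke", "onyomi", "radical", "JISnumber"])
by_radical = sort_elements(TABLE, ["radical", "onyomi", "stroke", "JISnumber"])
```

Here default is j2719, j1829, j1643, j1644, j1602 and by_radical is j1643,
j1644, j1829, j1602, j2719, matching the two outputs listed in the
attachment.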
From WG15 Copenhagen, May 1996:
        PASC WG has captured this issue and has emailed an awk script
        (in N640r) which solves the problem.  Japan would like to take
        the proposed solution back to Technical Experts to ensure it
        answers their concerns.  The US DB would like comments ASAP to
        ensure it hits the .2b ballot window.  Action on Denmark and
        Japan to ensure the script works for them.  The issue remains
        open - the US DB believes their solution will not be changed.

9. Title: Japanese proposal for LC_CTYPE extension [Open]

                locale, char, character, character map, LC_CTYPE,
                wctrans(), towctrans(), charconv, charclass
                Japan proposes that LC_CTYPE locale definition should be
                extended to allow locale-specific character mappings to
                be specified. This extension is necessary to implement
                wctrans() and towctrans() functions in ISO C amendment
                on a POSIX conforming system.


        N602    RIN N158: Japanese Action Item report to WG15
        N657    Data specification format for transliteration and transcription
        N664    Proposal for culturally dependent fallback: Response



From WG15 RIN Orlando, October 1995:

        N602 proposed the following extension to 1003.2b:
        [Note: The page numbers refer to the ones of P1003.2/D10.]
        Sect 2.5 (Locale) PROPOSAL.                             Page 8-9,12:
         The LC_CTYPE locale definition should be enhanced to allow
         user-specified additional character mappings, similar in concept
         to the user-specified additional character class. In the Amendment
         to the ISO C standard, extended character mapping functions
         (wctrans/towctrans) are specified. The following proposed extension
         will serve as the machinery to define the locale-specific character
         mappings used by the functions. Without this extension,
         POSIX conforming systems need to have their own extensions to
         implement the ISO C Amendment specifications.
        Proposal:[LC_CTYPE extension for specifying character mapping]
         The proposed extension for character mapping is similar to the
         extension of character class, which is already specified in .2b
         draft.  New keyword 'charconv' is introduced to define locale-
         specific character mappings instead of 'charclass' keyword for
         character class.  The way of defining character mapping is not
         extended with this proposal.  The same specification for toupper/
         tolower mapping can be used for locale-specific character mappings.
             # define the names of locale-specific character mappings
             charconv tojkata;tojhira

             # tojkata: hiragana => katakana mapping
             tojkata (<j0401>,<j0501>);(<j0402>,<j0502>);\

             # tojhira: katakana => hiragana mapping
             tojhira (<j0501>,<j0401>);(<j0502>,<j0402>);\

             END LC_CTYPE
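
A minimal sketch of how such a mapping definition might be read and applied
follows. It works on symbolic names only and is not part of the proposal;
parse_mapping and apply_mapping are hypothetical names, and passing
unmapped symbols through unchanged is an assumption modelled on the
toupper/tolower behaviour.

```python
import re

# One "(<from>,<to>)" pair, as written in the tojkata/tojhira lines above.
_PAIR = re.compile(r"\(\s*(<[^<>]+>)\s*,\s*(<[^<>]+>)\s*\)")


def parse_mapping(body):
    """Turn '(<a>,<b>);(<c>,<d>)' into {'<a>': '<b>', '<c>': '<d>'}."""
    return dict(_PAIR.findall(body))


def apply_mapping(mapping, symbols):
    """Map each symbol; symbols without an entry pass through unchanged."""
    return [mapping.get(s, s) for s in symbols]


tojkata = parse_mapping("(<j0401>,<j0501>);(<j0402>,<j0502>)")
converted = apply_mapping(tojkata, ["<j0402>", "<j1602>"])
```

In this sketch converted holds <j0502> followed by <j1602>, since only the
first symbol has an entry in the tojkata mapping.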
        [Proposed extension to .2b text]
        [Page 8]
        => LC_CTYPE. Add the following keyword items after the item
           labeled tolower:
        charconv  Define one or more locale-specific character mapping names as
                  strings separated by semicolons. Each named character mapping
                  can then be defined subsequently in the LC_CTYPE definition.
                  A character mapping name shall consist of at least one and at
                  most fourteen bytes of alphanumeric characters from the
                  portable filename character set. The first character of a
                  character mapping name cannot be a digit. The name cannot
                  match any of the LC_CTYPE keywords defined in this standard.
                  Define the named locale-specific character mapping.
                  In the POSIX Locale, the locale-specific named character
                  mapping need not exist.
                  If a mapping name is defined by a charconv keyword, but no
                  character mappings are subsequently assigned to it, this
                  is not an error; it shall represent a mapping without any
                  character pairs belonging to it.
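
The naming rule for charconv can be checked mechanically. The sketch below
is illustrative only: is_valid_charconv_name is a hypothetical name, and
the keyword set is a partial list (the character class and case mapping
keywords, plus the two keywords named in this proposal), not the complete
set defined by the standard.

```python
import re

# One letter followed by up to thirteen letters or digits (1 to 14 bytes).
_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]{0,13}")

# Partial, illustrative list of LC_CTYPE keywords a name must not match.
_LC_CTYPE_KEYWORDS = {
    "upper", "lower", "alpha", "digit", "space", "cntrl", "punct",
    "graph", "print", "xdigit", "blank", "toupper", "tolower",
    "charclass", "charconv",
}


def is_valid_charconv_name(name):
    """Apply the length, first-character and keyword rules stated above."""
    return bool(_NAME.fullmatch(name)) and name not in _LC_CTYPE_KEYWORDS
```

Under these rules tojkata and tojhira are acceptable names, while a name
starting with a digit, a name longer than fourteen bytes, or toupper itself
is rejected.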
        [Page 12]
        => Locale Lexical Conventions. Add the following token
        CHARCONV  A string of alphanumeric characters from the portable
                  character set, the first of which shall not be a digit,
                  consisting of at least one and at most fourteen bytes,
                  and optionally surrounded by double-quotes.
        [Page 12]
        => Locale Grammar. Modify the ctype_keyword and
                   charconv_keyword descriptions as follows:
           ctype_keyword        : charclass_keyword charclass_list EOL
                                | charwidth_keyword charclass_list EOL
                                | defwidth_keyword defwidth_value EOL
                                | charconv_keyword charconv_list EOL
                                | 'charclass' charclass_namelist EOL
                                | 'charconv' charconv_namelist EOL
           charconv_namelist    : charconv_namelist ';' CHARCONV
                                | CHARCONV
           charconv_keyword     : 'toupper' | 'tolower'
                                | CHARCONV
From WG15 Copenhagen, May 1996:
        N657 and N664 refer.  N657 is an expert contribution from
        Denmark, N664 is not an official US response - it comes direct
        from the .2b group.
        The US development body asked for clarification of the Japanese
        proposal: does it require just character-to-character translation,
        or character-to-string, which is a much larger problem.
        WG15 actioned KS to provide details of existing implementations
        of the proposal in N657:
   |    9605-23   Keld Simonsen  - supply a table of information about
   |      research and products that support the functionality
   |      of the LC_TRANS extension to the IEEE 1003.2 working
   |      group by June 15.
        WG15 further actioned KS to respond to the queries raised in N664
        by 1-July for consideration by the IEEE 1003.2b DB.
   |    From: (Keld Jørn Simonsen)
   |    Date: Sun, 7 Jul 1996 17:17:11 +0200
   |    In-Reply-To: Yoichi Suehiro <>
   |       "(SC22WG15.849) Comments on .2b Japanese proposals" (Jul  1,  8:28)
   |    Subject: Re: (SC22WG15.849) Comments on .2b Japanese proposals

   |    The reference is the C3 system for coded character information conversion,
   |    for further information refer to

10. Title: Range expression dependency [Open]

collation, element, regular, expression, pattern,
LC_COLLATE, localedef
The user-defined ordering of collation elements in an
LC_COLLATE table is inadequately specified.  Different
but equally valid tables can produce differing results
when used as the basis of regular expressions, pattern
matching, etc.
        N605    RIN N160: DS Additional comments on P1003.2b/D11



From WG15 RIN Orlando, October 1995:

        @ 2.8 o 5
        line 379: The range expression should not be dependent on the
        collation element order, but rather the result of the
        comparison using the relevant collation. Using the collating
        element order is not proper, and confusing to users that only
        have expectations as defined by the collation rules.
From WG15 Copenhagen, May 1996:
1003.2 is ambiguous on this point and 1003.2b will not be able
to fix the problem.  There are two fairly simple solutions,
but they are mutually exclusive, and the proponents of each
solution do not readily admit to the possibility that the
alternative solution may be valid.
This issue remains open.
Additional historical notes:
This request was forwarded to IEEE from X/Open end 1993 for
(Section, LC_COLLATE, 
"User-defined ordering of collating elements. Each collating
element shall be assigned a collation value defining its order
in the character (or basic) collation sequence. This ordering
is used by regular expressions and pattern matching and, unless
collation weights are explicitly specified, also as the collation
weight to be used in sorting."
Given this passage, assume there are two similar LC_COLLATE
fragments. The fragments include lowercase letters only to
simplify the examples. Here is the first fragment:
<a>     <a>;<a>;<a>
<b>     <b>;<b>;<b>
<c>     <c>;<c>;<c>
<d>     <d>;<d>;<d>
. . .
<z>     <z>;<z>;<z>
. . .

Here is the second fragment:

<a>     <a>;<a>;<a>
<b>     <b>;<b>;<b>
<c>     <c>;<c>;<c>
<d>     <d>;<d>;<d>
. . .
<z>     <z>;<z>;<z>
. . .

Suppose a user wanted to find all words that begin with a letter
in the range a-c. An XoJIG meeting agreed that a locale
built using the first fragment returns words that begin with <a>,
<a-grave>, <a-acute>, <b>, and <c>. However, there were varying
opinions about whether the second fragment would return the same
results, or would exclude <a-grave> and <a-acute>. So the
question is this:
Should an RE run against a locale built using the second fragment
include the accented a's in the range because they are defined as
being in the same equivalence class as <a>, or should it exclude
the accented a's because they are listed outside the range of a-c?
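Illustration (not part of the X/Open request): a minimal Python sketch
of the two readings of the range a-c.  The entry lists are hypothetical
and model primary weights only.  FIRST is assumed to list the accented
a's directly after <a>.  SECOND is assumed to list them after <z>, with
the primary weight of <a>.

```python
# Hypothetical LC_COLLATE entries in file order, as pairs of
# (collating element, primary weight).
LETTERS = [(ch, ch) for ch in 'abcdefghijklmnopqrstuvwxyz']
ACCENTED = [('a-grave', 'a'), ('a-acute', 'a')]
FIRST = LETTERS[:1] + ACCENTED + LETTERS[1:]   # accented a's after <a>
SECOND = LETTERS + ACCENTED                    # accented a's after <z>


def range_by_file_order(entries, start, end):
    """Members of start-end if the range follows the order in the file."""
    symbols = [symbol for symbol, _ in entries]
    lo, hi = symbols.index(start), symbols.index(end)
    return symbols[lo:hi + 1]


def range_by_weight(entries, start, end):
    """Members of start-end if the range follows the assigned weights."""
    # A weight is ranked by the file position of the element naming it.
    rank = {symbol: i for i, (symbol, _) in enumerate(entries)}
    weight = dict(entries)
    lo, hi = rank[weight[start]], rank[weight[end]]
    return [symbol for symbol, w in entries if lo <= rank[w] <= hi]
```

Under these assumptions both readings agree for FIRST.  For SECOND the
file-order reading excludes the accented a's and the weight reading
includes them, which is the disagreement described above.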
A preliminary response was obtained from IEEE in Feb 1994:
The standard is unclear on this issue, and as such no conformance
distinction can be made between alternative implementations based
on this. This is being referred to the Sponsors of the standard
for clarifying wording in the next amendment.
This response will be incorporated in an IEEE interpretations
publication, and will be also made available on-line on the IEEE
IEEE Interpretation for 1003.2-1992
The standard is ambiguous in this area, since it is not clear
what the phrase "collation sequence order" means or is. The two
possibilities are "the order in locale file", or "the order
determined by the weights in the locale file". The standard
allows either behavior. Concern over the wording of this area
has been forwarded to the Sponsors of the standard.
Rationale for Interpretation:
(c) 1994 The Institute of Electrical and Electronics Engineers, Inc.
Not to be published without prior written permission of the IEEE.
Andrew Josey | PASC Vice-Chair Interpretations
DS finds it unnecessarily complex to introduce two levels for
comparisons, one that is related to the comparison functions,
and then one that is related to the order the weights appear in
a localedef definition file. The latter is normally not part of
the definition of the collation order, but becomes significant
if this interpretation is favoured. The first interpretation
should be favoured, as the algorithm is already known to the user
and gives the less surprising result.


Resolved Issues:

1. Title: RIN input to WG20 [Resolved]

        RIN, Internationalization, WG20, multi-lingual.
        The proposed WG on internationalization should have input from
        RIN, but the WG is not formed yet.  How should liaison
        statements be addressed?
        RIN (6/90)

        WG15, 6/90: Wait until group is formed, then send liaison
        statement.  Keep as issue until completed.
        WG15, 06/91: Send liaison statement to WG20 now that it is formed.
        This issue was raised at the WG15 Paris meeting when the Rapporteur
        Group on Internationalisation requested WG15 (RIN 027 refers) to
        advise the new SC22 working group on Internationalisation about
        the problems of multi-lingual and multi-directional presentation
        issues, e.g. numbers in Hebrew text.
        There was uncertainty on how to proceed with this issue at the time.
        This issue was closed at the Rotterdam meeting, where the issues
        list was reviewed.  It was noted that the problem can be
        forwarded to SC22/WG20, now that it has been 'formed'.
        See WG15 resolution 133:
        133   Recommendation to WG20
          Whereas ISO/IEC JTC1/SC22/WG20 is newly formed for
          internationalization under ISO/IEC JTC1/SC22, and
          Whereas a Rapporteur Group for Internationalization
          (RIN) has been formed in ISO/IEC JTC1/SC22/WG15 with
          active participation from the National Bodies, and
          Whereas ISO/IEC JTC1/SC22/WG15 work such as ISO/IEC
          9945-2 has introduced a suitable set of
          internationalization features including, but not
          limited to, the following:
          -    date and time format
          -    numeric and monetary representation
          -    collation
          -    character map
          -    character classification
          -    national locales
          -    national profiles, and
          Whereas ISO/IEC JTC1/SC22/WG15 believes that the RIN
          work also would be useful for the work of ISO/IEC
          JTC1/SC22/WG20,
          Therefore ISO/IEC JTC1/SC22/WG15 requests that ISO/IEC
          JTC1/SC22/WG20 work with ISO/IEC JTC1/SC22/WG15/RIN to
          agree how to best move work in these areas forward.
        This action was reviewed at the WG15 meeting, Stockholm, as 9105-08:
        "Convenor:  Forward WG15RIN-N027 to WG20, Internationalisation,
         and to SC2, SC18, and SC20.  (Arises from discussion of issues
         list item, "multi-lingual, multi-directional presentation",
         under agenda item 2.2.)
        Status:  Done."

4. Title: Publication of responses to Ballot comments [Resolved]

        ballot responses, NPs
        A concern was raised about the visibility of responses to ballots
        returned to SC22; they should be circulated to WG15 as they
        become available.
        WG15 (05/91)

        None proposed.


        Resolved by SC22 disposition of NPs.
        This issue was first raised at WG15's Rotterdam meeting:
        Comments in the Rotterdam minutes on this item noted that:  "in
        order that WG15 member's concerns about the visibility of
        responses to ballots returned to SC22, these responses will be
        circulated to WG15 as they become available in future."

5. Title: CD 9945-2.2 Document not identified [Resolved]

        CD 9945-2.2
        When will 9945-2.2 become available?
        WG15 (05/91)

        US: (05/92) When the IEEE 1003.2 document becomes sufficiently stable.
        WG15: (05/92) Move to issues list.
        None recorded.
        Closed by SC22 N1063
        This issue was first raised at WG15's Rotterdam meeting:
        This was formerly action 9010-7, but was moved to the issues
        list pending a decision by the US member body that 1003.2 has
        become stable enough for submission.  As well as being required
        for CD registration, it is also required for liaison with SC2 on
        character set issues.

6. Title: Language interaction pertaining to real time. [Resolved]

        real time, concurrency, France
        There is a need to investigate potential conflict with the real
        time extension of POSIX and the real time concurrency features of
        programming languages.
        WG15 (10/92)

        None proposed.
        None recorded.
        The threads amendment ISO/IEC 9945-1 AM2 contains specifications
        that resolve the issue satisfactorily to the French member body.
        This issue was first raised at WG15's Reading meeting:
        Liaison between AFNOR POSIX WG, and AFNOR Ada Real Time WG (N295)
        led to discussions about potential overlap and inconsistencies
        between the real time amendment to 9945-1 (currently at CD stage)
        and the NP on real time extension to Ada.
        A future work item for WG13 is the development of a multiple
        processor real time facility.  This facility may conflict with
        the POSIX services in this area.
        At the time of origination of this issue, only Ada and Modula-2
        were known to be of concern.
        At the Orlando meeting 1995-10 the issue was recorded as resolved
        and closed, as noted above.

SD3 1996-6-25: WG15 Issues List