From suehiro@jrd.dec-j.co.jp Tue Sep 5 23:32:24 1995
Received: from gatekeeper.dec-j.co.jp (gatekeeper.dec-j.co.jp [202.34.226.2]) by dkuug.dk (8.6.12/8.6.12) with ESMTP id XAA12012; Tue, 5 Sep 1995 23:32:13 +0200
Received: by gatekeeper.dec-j.co.jp (8.6.12+usagi/JNET-GW-940327.1); id GAA02650; Wed, 6 Sep 1995 06:32:11 +0900
Received: from cobra.jrd.dec.com by garfield.jrd.dec.com (8.6.12+usagi/JULT-4.4-gar) id GAA01080; Wed, 6 Sep 1995 06:31:53 +0900
Received: from localhost by cobra.jrd.dec.com (5.65v3.0/JOSF-3.0-cobra) id AA21500; Wed, 6 Sep 1995 06:32:34 +0900
Message-Id: <9509052132.AA21500@cobra.jrd.dec.com>
To: sc22wg15rin@dkuug.dk
Cc: sc22wg15@dkuug.dk
Subject: LC_CTYPE extension proposal to POSIX.2b for wctrans support
Date: Wed, 06 Sep 1995 06:32:34 +0900
From: Yoichi Suehiro

This is an input from Japan to the October RIN meeting for discussion.
The topic has been discussed several times in the I18N community but
has never been recorded as an official input to the POSIX standards.

regards,
Yoichi Suehiro

===========================================================================
Source: Japan
Title: Japanese proposal to POSIX.2b on LC_CTYPE extension for
       locale-specific character mapping
Status: Japanese position

Short description:

Japan proposes that the LC_CTYPE locale definition be extended to allow
locale-specific character mappings to be specified. This extension is
necessary to implement the wctrans() and towctrans() functions of the
ISO C Amendment on a POSIX-conforming system.

Text of contribution:
----------------------------------------------------------------------------
[Note: The page numbers refer to those of P1003.2/D10.]

Sect 2.5 (Locale) PROPOSAL. Page 8-9,12:

Problem:

The LC_CTYPE (2.5.2.1) locale definition should be enhanced to allow
user-specified additional character mappings, similar in concept to the
user-specified additional character classes.
The Amendment to the ISO C standard specifies extended character mapping
functions (wctrans/towctrans). The extension proposed below provides the
mechanism for defining the locale-specific character mappings used by
these functions. Without this extension, POSIX-conforming systems would
each need their own extensions to implement the ISO C Amendment
specifications.

Proposal: [LC_CTYPE extension for specifying character mapping]

The proposed extension for character mapping is similar to the extension
for character classes that is already specified in the .2b draft. A new
keyword, 'charconv', is introduced to declare locale-specific character
mappings, in the same way that the 'charclass' keyword declares
character classes.

This proposal does not change the way a character mapping is defined.
The same specification used for the toupper/tolower mappings can be used
for locale-specific character mappings.

EXAMPLE:

    LC_CTYPE
    # define the names of locale-specific character mappings
    charconv  tojkata;tojhira
    # tojkata: hiragana => katakana mapping
    tojkata   (,);(,);\
              .....definition.....
    # tojhira: katakana => hiragana mapping
    tojhira   (,);(,);\
              .....definition.....
    END LC_CTYPE

[Proposed extension to .2b text]

[Page 8]
=> 2.5.2.1 LC_CTYPE. Add the following keyword items after the item
   labeled tolower:

charconv        Define one or more locale-specific character mapping
                names as strings separated by semicolons. Each named
                character mapping can then be defined subsequently in
                the LC_CTYPE definition. A character mapping name shall
                consist of at least one and at most fourteen bytes of
                alphanumeric characters from the portable filename
                character set. The first character of a character
                mapping name cannot be a digit. The name cannot match
                any of the LC_CTYPE keywords defined in this standard.

charconv-name   Define the named locale-specific character mapping.
                In the POSIX Locale, the locale-specific named
                character mapping need not exist.
                If a mapping name is defined by a charconv keyword, but
                no character mappings are subsequently assigned to it,
                this is not an error; the name shall represent a mapping
                that contains no character pairs.

[Page 12]
=> 2.5.3.1 Locale Lexical Conventions. Add the following token
   description:

CHARCONV        A string of alphanumeric characters from the portable
                character set, the first of which shall not be a digit,
                consisting of at least one and at most fourteen bytes,
                and optionally surrounded by double-quotes.

[Page 12]
=> 2.5.3.2 Locale Grammar. Modify the ctype_keyword and charconv_keyword
   descriptions, and add the charconv_namelist description, as follows:

    ctype_keyword     : charclass_keyword charclass_list EOL
                      | charwidth_keyword charclass_list EOL
                      | defwidth_keyword defwidth_value EOL
                      | charconv_keyword charconv_list EOL
                      | 'charclass' charclass_namelist EOL
                      | 'charconv' charconv_namelist EOL
                      ;

    charconv_namelist : charconv_namelist ';' CHARCONV
                      | CHARCONV
                      ;

    charconv_keyword  : 'toupper'
                      | 'tolower'
                      | CHARCONV
                      ;

----------------------------------------------------------------------------
[END]
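
For illustration only, here is a minimal C sketch of how an application
might use a mapping declared with charconv through the wctrans() and
towctrans() functions of the ISO C Amendment. The helper name map_wcs and
its return convention are assumptions made for this sketch and are not
part of the proposal. Whether a mapping such as tojkata is available
depends on the locale in effect.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper (not part of the proposal): map every wide
 * character of `s` in place through the mapping called `name` in the
 * current LC_CTYPE locale.
 * Returns 0 on success, or -1 when the locale defines no mapping of
 * that name, in which case `s` is left unchanged. */
static int map_wcs(wchar_t *s, const char *name)
{
    /* wctrans() returns zero when `name` is not a valid mapping
     * in the current locale. */
    wctrans_t desc = wctrans(name);
    if (desc == (wctrans_t)0)
        return -1;

    for (; *s != L'\0'; ++s)
        *s = (wchar_t)towctrans((wint_t)*s, desc);
    return 0;
}
```

With a locale built from a definition like the EXAMPLE above, a call such
as map_wcs(buf, "tojkata") would be expected to convert hiragana in buf
to katakana. In a locale that lacks the mapping, the call returns -1 and
the caller can fall back to leaving the text as it is. The standard names
"toupper" and "tolower" are accepted by wctrans() in every locale.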