May 12 13:51 1993  Action 9210-32 - Investigation of I18N Guidelines

Message-Id: <9305061155.AA08879@tsbome.ome.toshiba.co.jp>
To: wg15rin@dkuug.dk
Cc: martin@xopen.co.uk (Martin Kirk, as WG15 9210 meeting secretary),
    isaak@decvax.dec.com
From: ynk@ome.toshiba.co.jp (Yasushi Nakahara)
Date: Thu, 06 May 93 20:46:49 JST
Subject: Action 9210-32 - Investigation of I18N Guidelines

Hi all RIN people,

At the request of the RIN Lead Rapporteur, Mr. Keld Simonsen, I'm sending
preliminary input for WG15 action item 9210-32, which was identified at
the last meeting in Reading. Please take it into consideration at the
RIN 9305 meeting or in the RIN email discussion.

| Date: Fri, 06 Nov 92 17:00:14 GMT
| Subject: (posix 1025) (SC22WG15.157) WG15 Minutes (ASCII 1 of 3)
|
| ISO/IEC JTC 1/SC22/WG15 N326
|
| ISO/IEC JTC 1/SC22/WG15
|
| Minutes of Meeting
| 27-30 October 1992
| Reading, UK

| 3.0 Actions Arising from Reports

| Action 9210-32: RIN Lead Rapporteur: Investigate the production of guidelines
| for standards developers for the usage of the terms character and byte in the
| definition of interfaces, with especial attention to the internationalisation
| issues arising from character-based interfaces.

Some background may help in understanding this action item. If I remember
correctly, the action was derived from my comments at the plenary session,
so I'm adding some explanation. See the excerpt from the Reading minutes
and my comments below.

| 2.8 Rapporteur Group report/status
|
| 2.8.1 Security
| The RGSec report is WG15/N320. The report was presented by Jon Spencer.
|
| It is required to transmit D13 of P1003.6 to SC21 and SC27. A resolution was
| developed to address this issue.
|
| RGSec recommends that it should continue in existence as there is outstanding
| current work, notably X.400 and X.500. In addition there is increasing
| activity in this whole area of security and there is therefore a continuing
| need for coordination activities.
|
| It was also felt that the Danish No vote on N304 could be addressed in one of
| the small groups by the production of a suitable Disposition of Comments.
|
| Japan noted that their was discussion on issues related to
                   ^^^^^ typo --> there
| internationalisation and audit logs. The need for a resolution was identified,
| but at the current time it was felt that further information was required
| before that resolution could be identified.
|
| Japan further identified problems in the usage of the terms "character" and
| "byte" in the P1003.6 document. RIN should be requested to provide guidance
| to standards developers in order to avoid such problems in the future. The
| specification of character-oriented interfaces require careful consideration
| of internationalisation issues that do not affect interfaces specified in
| terms of bytes.

The last paragraph is an actual (partial) record of that discussion. At
the time, in conjunction with Jon's comment on I18N issues, I added that
not only P1003.6 but almost all the P1003.x documents may have I18N
issues wherever "character" interfaces are specified.

More specifically, I explained that the recent P1003.4 and P1003.7(.x)
drafts have I18N issues similar to those the Japanese POSIX WG has been
actively commenting on in the POSIX.1 and POSIX.2 specifications since
1989, in terms of I18N/L10N features and "character vs. byte" issues,
and that Japan has had to send similar comments again and again on each
POSIX.n draft, which may be neither effective nor productive.

So I suggested that, rather than such patchwork, the concerned National
Bodies and/or RIN should develop design/review guidelines (or an
appropriate template) for I18N/L10N specifications, in order to make the
ballot/disposition process for each POSIX.n draft more productive and
consistent (in terms of I18N/L10N specifications).
In fact, the Japanese ballot comments on CD 9945-2 pointed out these
cross-functional aspects of I18N/L10N issues and introduced some proposed
design/review guidelines for I18N/L10N specifications.

With this in mind, I'm enclosing draft proposed design/review guidelines
for I18N/L10N specifications. Please send your comments to the RIN
mailing list.

______________________________________________________________________

             Draft Proposed I18N/L10N Guidelines
       for (POSIX) Standard Interface Design and Review

1. Take the following aspects into account:

   - Character counts != byte counts
   - Character counts != display width
   - Byte counts != display width
   - Only the "wchar_t" type in the C language (known as a "wide
     character") corresponds to the concept of a character.

2. Do not use the term "character" to mean "byte", "display width" or
   "column position".

3. Determine which interfaces are character-oriented (arguments or
   operands, input data, output data, I/O format, etc.).

   If the interface in question is byte-oriented, use the term "byte"
   or other appropriate wording carefully, so that the specification
   cannot be confused with the concept (definition) of a character.
   Then skip the following guidelines, which apply only to
   character-oriented interfaces.

4. Carefully study the features of character-oriented interfaces and
   give appropriate specifications (or, when reviewing, check the
   proposed specifications) in terms of the following aspects:

   - Character boundary recognition
     [This shall be generic "character" based.]

   - Limit checks & truncation in various units; in particular, make
     clear which units (byte, character, column, width, etc.) shall
     be applied.

   - Character/string width recognition
     [This shall be generic "character" based.]

   - Character/string parsing & manipulation
     [This shall be generic "character" based.]
     Also, locale dependencies such as LC_CTYPE and LC_COLLATE shall
     be well defined.
   - Language dependency of text data, including message data
     [Make clear what natural language dependencies are (explicitly/
     implicitly) included in the target text.]

   - Culture dependency of representations
     [Make clear what (other) locale dependencies are covered by the
     specification via suitable LC_XXX (such as LC_TIME, LC_NUMERIC,
     LC_MONETARY, LC_MESSAGES) and LANG variables.]
______________________________________________________________________

That's all for the moment. Thanks.

Best Regards,
- ynk

Yasushi Nakahara
TOSHIBA Corp.
Phone: +81 428-33-1346|1347
Fax:   +81 428-32-0018
Email: ynk@ome.toshiba.co.jp
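
The distinctions in guidelines 1 and 4 above (byte count vs. character
count vs. display width, and truncation on a character boundary) can be
sketched in C. This is only an illustration under stated assumptions:
the input is assumed to be valid UTF-8, the function names
(utf8_char_count, utf8_display_width, utf8_prefix_len) are hypothetical,
and the table of double-width ranges is deliberately simplified. A
locale-aware implementation would more likely rely on mbrtowc() and
wcwidth() under the current LC_CTYPE setting.

```c
#include <stddef.h>
#include <string.h>

/* Decode one UTF-8 sequence at *sp and advance *sp past it.
 * Assumption: the input is valid UTF-8; no error reporting is done. */
static unsigned long utf8_decode(const char **sp)
{
    const unsigned char *p = (const unsigned char *)*sp;
    unsigned long cp;
    int extra;

    if (p[0] < 0x80)      { cp = p[0];        extra = 0; }
    else if (p[0] < 0xE0) { cp = p[0] & 0x1F; extra = 1; }
    else if (p[0] < 0xF0) { cp = p[0] & 0x0F; extra = 2; }
    else                  { cp = p[0] & 0x07; extra = 3; }
    p++;
    /* Consume continuation bytes (10xxxxxx); stops safely at NUL. */
    while (extra-- > 0 && (*p & 0xC0) == 0x80)
        cp = (cp << 6) | (unsigned long)(*p++ & 0x3F);
    *sp = (const char *)p;
    return cp;
}

/* Simplified, hypothetical rule: these ranges occupy two columns. */
static int is_wide(unsigned long cp)
{
    return (cp >= 0x1100 && cp <= 0x115F) ||
           (cp >= 0x2E80 && cp <= 0xA4CF) ||
           (cp >= 0xAC00 && cp <= 0xD7A3) ||
           (cp >= 0xF900 && cp <= 0xFAFF) ||
           (cp >= 0xFF00 && cp <= 0xFF60) ||
           (cp >= 0xFFE0 && cp <= 0xFFE6);
}

/* Number of characters, as opposed to strlen(), which counts bytes. */
size_t utf8_char_count(const char *s)
{
    size_t n = 0;
    while (*s != '\0') {
        (void)utf8_decode(&s);
        n++;
    }
    return n;
}

/* Number of display columns under the simplified width rule above. */
size_t utf8_display_width(const char *s)
{
    size_t w = 0;
    while (*s != '\0')
        w += is_wide(utf8_decode(&s)) ? 2 : 1;
    return w;
}

/* Longest prefix length in bytes that is <= max_bytes and does not
 * split a character (limit expressed in bytes, cut on a boundary). */
size_t utf8_prefix_len(const char *s, size_t max_bytes)
{
    size_t len = strlen(s);
    if (len <= max_bytes)
        return len;
    len = max_bytes;
    /* Back up while the byte at the cut point is a continuation byte. */
    while (len > 0 && ((unsigned char)s[len] & 0xC0) == 0x80)
        len--;
    return len;
}
```

For the 9-byte string "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E" (three
kanji), strlen() gives 9, utf8_char_count() gives 3 and
utf8_display_width() gives 6, while a 4-byte limit passed to
utf8_prefix_len() yields a 3-byte prefix. A halfwidth katakana such as
"\xEF\xBD\xB1" is 3 bytes, 1 character and 1 column, so none of the
three units can stand in for another.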