From ynk@ome Thu Nov 15 16:41:51 1990
Return-Path:
Message-Id: <9011150346.AA06229@tis1.tis.toshiba.co.jp>
To: wg15rin@dkuug.dk
Subject: Re: Questionnaire....
Date: Thu Nov 15 10:50:28 JST 1990
From: ynk@ome.toshiba.co.jp (Yasushi Nakahara)
X-Sequence: wg15rin@dkuug.dk 11
Errors-To: wg15rin-request@dkuug.dk
X-Charset: ASCII
X-Char-Esc: 29

Hi Donn and RIN members,

On Donn's request (rin email #7) and Erik's suggestions (rin email #8),
I'm sending my personal raw answer to the questionnaire (not comments on
the questionnaire itself), although I know that it's lengthy, which may
waste resources such as network transmission costs and members' disk
space.

The answer was also submitted to the Japanese POSIX Committee (SSI/POSIX)
in September for discussion, but we haven't discussed it sufficiently yet.
The answer is expected to be completed later as one of the National
Bodies' answers. The Japanese members (myself included) are continuing
to work on it.

I hope it will encourage other people to comment on the questionnaire
itself :-) and to try to prepare their own answers.

Also, your comments on my answer are certainly welcome. Please send them
to the WG15RIN mailing list. It would be greatly appreciated.

Regards,

Yasushi Nakahara
TOSHIBA Corp.
Phone: +81-472-77-8670
Fax:   +81-472-79-2628
Email: ynk@ome.toshiba.co.jp

P.S.
Erik,
How was the PortSoft meeting in Beijing (China) last week? I had to miss
the meeting because of other urgent business. I would suggest that the
report from the PortSoft Beijing meeting be posted not only to the
"portsoft@jus.or.jp" mailing list but also to "wg15rin@dkuug.dk", so
that people concerned with I18N can understand what is happening in Asia
under the name of "PortSoft".

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Message-Id: <9006052304.AA08177@hpfcrn.HP.COM>
> To: wg15rin@dkuug.dk
> Subject: Questionnaire
> Date: Tue, 05 Jun 90 17:04:29 MDT
> From: Donn Terry
>
> As promised at the last RIN meeting, here is the initial draft of the
> questionnaire to shake out the unknown problems in internationalization.
>
> We can go over it in detail in Paris next week.
>
> Donn
>
>
> **** DRAFT ****
> For comment on the questionnaire itself
>
> To: National Standards Bodies, ISO member countries.
>
> From: Internationalization Rapporteur group
>       SC22/WG15
>
> Subj: National conventions.
>
> As computers become more prevalent, they must deal with local and national
> cultural conventions, rather than reflecting the conventions of limited
> populations. To do this, much information must be gathered so that the
> mechanisms can deal successfully with all the conventions rather
> than finding that some were omitted and cannot be easily retrofitted.
>
> The technology is not ready to deal with issues such as natural-language
> translation, but issues such as time and date, currency, and timezones are
> ready to be considered.
>
> The issue of character sets is being addressed in SC2. We presume that
> the necessary characters can be represented.
>
> Attached is a questionnaire that we would like to have filled out by as
> many nations, representing as many cultures, as possible. Within a
> nation that has more than one culture or set of conventions, please fill
> it out for each culture or set of conventions. The viewpoint reflected
> should be that of the culture, rather than responding from the viewpoint
> of a computer expert who is able to deal with the representations that
> do not match the culture.
>
> The questionnaire first explains what the issue is, and then shows examples
> of what the current technology can handle. This is to give you an idea
> of what the problem is, as it is currently perceived.
> We would ask you to answer several questions in each area:
>
>    1) Is the current technology (as represented in the questionnaire,
>       not in terms of actual products) minimally acceptable; can you
>       operate successfully in your culture, for computer use only, with
>       what is available?
>
>    2) Is the current technology adequate for most computer usage?
>       Does it meet all your national or cultural needs when
>       computers are being used as data processing devices? If not,
>       please describe the problem, and how that information should
>       be represented to meet local needs.
>
>    3) Is the current technology adequate for non-expert usage? Are
>       there situations where people who do not normally use computers
>       would be presented with information in an unfamiliar form if
>       the current technology were not extended? Again, we would ask
>       for descriptions of the problem if the needs are not fully met.
>
>    4) We realize that there are also historical usages, such as
>       obsolete currencies, that would need to be represented in textual
>       documents. If those would also be used by computers, in terms
>       of manipulating them, please describe them. If, however, a
>       computer would not have to deal with them (except possibly as
>       uninterpreted text) they are not within the goals of this
>       questionnaire.
>
> Please use the examples as a guideline both to understand the questions we
> are asking, and also to help us understand your response. In no case can
> the examples be complete. If you are unsure whether the needs of your culture
> are met, indicate that, and we can evaluate the situation to see if the
> technology can already do it.
>
> Because of the diversity of cultures, it may not be possible to represent
> every concept in all possible ways at a reasonable cost. However, by
> knowing of the issues, we can hope to do a better job than otherwise.
> It remains up to the programmer to actually use these facilities, so they
> will not automatically be present in programs even when they are available.
>
> Where we suspect that there might be a problem, a list of "possible issues",
> to start thinking about the problems, is mentioned.
>
> Please provide us with a contact person for each culture or set of conventions
> so that we may ask further questions.
>
>
> Date and time.
>
> Dates and times can be converted from an internal representation (representing
> UCT) to external forms with the following rules:
>
> The month can be represented as:
>    - one or two digit number
>    - a two digit number
>    - a month name abbreviation
>    - an arbitrary length month name
>    - Capitalization of the month name can be varied.

Form 1: In Japan, a notation "99M" is normally used.
        (Japanese modern)
        Where "99" is a one- or two-digit number and "M" may be a Kanji
        character denoting "month". It looks like a numerical
        quantity (99) plus a unit quantifier (M). The "M" never
        precedes the "99" month number.

Form 2: A notation "QQM" is sometimes used.
        (Japanese classic)
        This is the same as Form 1 except that "QQ" is one or two
        numerical Kanji character(s). The "M" character and its order
        are as described above.

Form 3: A notation "XYZ" is occasionally used.
        (Japanese ancient)
        Although it is rare in computer processing, we have an
        old-fashioned month name for each "99M" or "QQM". It mainly
        consists of two Kanji characters, but you can think of it as an
        arbitrary-length string of Kanji characters.

Form 4: Also, a western format is sometimes used.
        (western)
        In this case, ordinary western (or American) rules should be
        applied.

Fact:   In Japan, capitalization is meaningless for month names written
        in Kanji characters.

> The day of month can be represented as:
>    - a one or two digit number
>    - a two digit number

Form 1: In Japan, a notation "99D" is normally used.
        (Japanese modern)
        Where "99" is a one- or two-digit number and "D" may be a Kanji
        character denoting "day". It looks like a numerical
        quantity (99) plus a unit quantifier (D). The "D" never
        precedes the "99" day number.

Form 2: A notation "QQD" is sometimes used.
        (Japanese classic)
        This is the same as Form 1 except that "QQ" is one or two
        numerical Kanji character(s). The "D" character and its order
        are as described above.

Form 3: A notation "XYZ" is occasionally used.
        (Japanese ancient)
        Although it is rare in computer processing, we have an
        old-fashioned day name for certain special "99D" or "QQD" days.
        It mainly consists of two or three Kanji characters, but you can
        think of it as an arbitrary-length string of Kanji characters.

Form 4: Also, a western format is sometimes used.
        (western)
        In this case, ordinary western (or American) rules should be
        applied.

Fact:   In Japan, the one- or two-digit day number is always followed by
        the "D" Kanji character. Except in an abbreviated format
        together with a month number or name, the day number is never
        used alone.

> The year can be represented as:
>    - The four digits of the Western era
>    - the last two digits of the Western era
>    - Other eras:
>       + Dates can be started from other bases than the Western era
>       + Names of eras can be attached
>       + The first year of an era can be named, rather than numbered.

Form 1: In Japan, a notation "9999Y" is popular.
        (Japanese modern)
        Where "9999" is the four digits of the Western era and "Y" is
        the Kanji character denoting "year".

Form 2: A notation "EE99Y" is another popular format.
        (Japanese official)
        Where "EE" is the name of the "emperor" era in Kanji characters
        and "99" is the one- or two-digit year of the era, except for
        the first year. The first year has another Kanji expression
        (like a "name" of the year from the western point of view, but
        it is just a special numerical expression in Kanji characters).
        The "Y" is the same as described above.
        The most important remark is that this format is officially
        supported by the Japanese Government and by local governments
        in Japan.

Form 3: A notation "QQQQY" or "EEQQY" is sometimes used.
        (Japanese classic)
        This is the same as Form 1 or 2 except that "QQQQ" or "QQ" is a
        numerical Kanji character representation. The "Y" and "EE" and
        their order are the same as described above.

Form 4: Also, a western format is sometimes used.
        (western)
        In this case, ordinary western (or American) rules should be
        applied.

> The day of the week can be represented as:
>    - A day of week name abbreviation
>    - An arbitrary length week day name
>    - Numeric day of the week (0=Sunday)
>    - Capitalization of the day name can be controlled.

Form 1: In Japan, a notation "NWD" is most popular.
        (Japanese)
        Where "N" is one Kanji character for each day of the week, and
        "WD" is a fixed expression of two Kanji characters. Therefore,
        "NWD" may be a week day name and "N" an abbreviation for a day
        of the week.

Form 2: Also, a western format is sometimes used.
        (western)
        In this case, ordinary western (or American) rules should be
        applied.

Fact:   In Japan, capitalization is meaningless for week day names
        written in Kanji characters.

> Hours can be represented as:
>    - One or two decimal digits
>    - Two decimal digits.
>    - In 12 or 24 hour time, with or without AM/PM notation.
>       + The AM and PM notation can be changed.

Form 1: In Japan, a notation "99H" is normally used.
        (Japanese modern)
        Where "99" is a one- or two-digit number and "H" may be a Kanji
        character denoting "hour". The "H" never precedes the hour
        number "99".

Form 2: A notation "QQH" is sometimes used.
        (Japanese classic)
        This is the same as Form 1 except that "QQ" is one or two
        numerical Kanji character(s).

Form 3: Also, a western format is sometimes used.
        (western)
        In this case, ordinary western (or American) rules should be
        applied.

Fact 1: Regarding AM/PM notation, the Japanese notation is as follows.
            "XX99H..."
        where "XX" is a two-Kanji-character representation corresponding
        to AM or PM, such as "XY" or "XZ". This raises two problems.

        Problem 1: Alternatives to "AM"/"PM" of arbitrary character
                   length are required.

        Problem 2: An ability to specify the order of the AM/PM (or
                   similar) modifier is necessary.
                   [ Some Japanese people feel that this is also needed
                     even for the western format, because in Japan both
                     formats "99:99 AM" and "AM 99:99" are equally used. ]

        Does the current "algorithm" proposed in POSIX support
        positioning the AM/PM modifier as either a prefix or a suffix?

Fact 2: In AM/PM notation, for exact noon or midnight we Japanese never
        use "XY12" (for noon) or "XZ12" (for midnight). Rather, we use
        entirely different Kanji character representations without hour
        digits: "AB" (for noon) and "AC" (for midnight).

        Does the current "algorithm" proposed in POSIX solve this issue?

Fact 3: Another Japanese concern about 12 hour time notation.
        Most people prefer the "0 - 11" notation to the "1 - 12"
        notation. For example, we often use "PM 0:30" rather than
        "PM 12:30".

        Does the current "algorithm" proposed in POSIX support this
        variation? Are there other variations in the world?
        [ This discussion may also apply to 24 hour time notation,
          i.e. the "0 - 23" versus "1 - 24" problem. Which is your
          normal form, "0:30 AM" or "24:30 AM"? I think most Japanese
          people use the "0:30 AM" notation. ]

> Minutes and seconds can be represented as:
>    - Two decimal digits
>

Form 1: In Japan, a notation "99M99S" is normally used.
        (Japanese modern)
        Where "99" is a one- or two-digit number and "M" and "S" may be
        Kanji characters denoting "minutes" and "seconds" respectively.
        The "M" and "S" never precede the digits "99".

Form 2: A notation "QQMQQS" is potentially used.
        (Japanese classic)
        This is the same as Form 1 except that "QQ" is one or two
        numerical Kanji character(s).

Form 3: Also, a western format is sometimes used.
        (western)
        In this case, ordinary western (or American) rules should be
        applied.

> Weeks can be represented as the week number of the year.  Either Sunday
> or Monday can be used as the first day of the week.

Fact:   In Japan, this notation is scarcely used.

> The current timezone name can be printed.

Fact:   The timezone is needed only in an overseas context, because
        there is a single timezone in Japan.

Issue:  To satisfy the general Japanese public, the timezone sometimes
        needs to be printed as a name in Kanji characters, instead of
        an abbreviation or symbolic name in western alphabet characters.

        Does the current "algorithm" proposed in POSIX satisfy this
        requirement?

> The above elements can be combined in arbitrary order, with any fixed
> punctuation between them.

See the above comments.

> Some example dates that can be generated include
>
>       Feb 28, 1990
>       february 28, 1990
>       HH2Y2M28D  (Where HH Y M and D would be Kanji)
>       Wednesday 28 February, 1990
>       02/28/1990
>       28/02/90
>       28 II, 1990 (The month name would be a Roman Numeral)
>
>       10:01 PM
>       2201
>       10:01:02
>       22:01:02
>       10:10 PM EST
>
> Possible issues: solar time, calendars that do not align with the Western one.

Yes, there are other calendars in Japan and other Asian countries, such
as lunar time and calendars, and Japanese local calendars derived from
ancient Chinese calendars (based on the Chinese zodiac system), and so
on. However, I do not know enough about them to comment further.

> Is some combination of the elements above minimally acceptable?

I would say "No". See the comments above.

> Is there some capability missing for normal computer usage?  If so, what?

Yes.
    1. Several issues on AM/PM notations. See the above.
    2. Numerical Kanji character conversion. See the above.

> Is there some capability missing for non-computer usage?  If so, what?

Yes.
    1. Name of day. See the above.
    2. Alternative timezone names (in native characters). See the above.
    3. Lunar and/or zodiac calendars. See the above.
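
The era-year form (year Form 2) and the AM/PM points (Facts 1 to 3) above
can be illustrated with a minimal sketch in Python. It is only an
illustration under stated assumptions, not the POSIX mechanism: the
function names, the era table and its start date, and the "G" marker for
a named first year are all hypothetical, and the ASCII placeholders
("HH", "Y", "M", "D", "H", "XY", "XZ", "AB", "AC") stand in for Kanji as
in the rest of this message.

```python
from datetime import date

# Hypothetical era table: (placeholder era name, first day of that era).
# "HH" is the ASCII placeholder used in the example date above; its start
# date of 8 January 1989 is an assumption added for this sketch.
ERAS = [("HH", date(1989, 1, 8))]


def era_date(d, first_year="G", year_mark="Y", month_mark="M", day_mark="D"):
    """Format d in the "EE99Y" year form followed by "99M" and "99D"."""
    # Try the most recent era first.
    for name, start in sorted(ERAS, key=lambda era: era[1], reverse=True):
        if d >= start:
            n = d.year - start.year + 1
            # The first year of an era is named, not numbered.
            year = first_year if n == 1 else str(n)
            return f"{name}{year}{year_mark}{d.month}{month_mark}{d.day}{day_mark}"
    raise ValueError("date precedes every era in the table")


def twelve_hour(hour, minute, am="XY", pm="XZ", noon="AB", midnight="AC",
                hour_mark="H", minute_mark="M"):
    """Format a 24-hour clock time in the "XX99H99M" form."""
    if minute == 0 and hour == 12:
        return noon        # Fact 2: exact noon has its own expression
    if minute == 0 and hour == 0:
        return midnight    # Fact 2: exact midnight likewise
    prefix = am if hour < 12 else pm   # Fact 1: the modifier is a prefix
    # Fact 3: hours run 0 - 11, so 12:30 becomes hour "0" after the PM prefix.
    return f"{prefix}{hour % 12}{hour_mark}{minute}{minute_mark}"


print(era_date(date(1990, 2, 28)))
print(twelve_hour(12, 30))
```

With these placeholders the first call yields "HH2Y2M28D", the same
string as the example date quoted above, and the second yields
"XZ0H30M". Every marker is a parameter, so strings of arbitrary length
can be substituted for any of them.
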
>
> Timezones:
>
> The timezone in which a date or time needs to be represented needs to
> be represented as an offset from GMT.  Timezones can be represented
> in terms of:
>
>    - Offset from UCT, in hours, minutes and seconds, + or - 24 hours.
>
>    - The start and end of daylight/summer time:
>       + In terms of a day number of the year
>       + In terms of a particular day of week, week of month, and
>         month number
>       + At a specified time.
>
>    - The offset (in hours, minutes, and seconds) of daylight/summer
>      time from the normal time.
>
>    - The names of the normal and summer timezones
>
> Examples:
>
>    7 hours west of GMT, with one hour for summer time on the first
>    Sunday of April, ending on the last Sunday of October, both at
>    0200.  Names MST and MDT.
>
>    One hour east of GMT, one hour for summer time, starting on the last
>    Sunday in March and ending the last Sunday in September.  Names
>    MEZ and MESZ.  (Or MET and METDST.)
>
>    Nine and 1/2 hours east of GMT, one hour for summer time, starting
>    on the first Sunday in October, ending the first Sunday in March.
>    Names CST, CDT (Australia.)
>
>    Five Hours west of GMT.  No summer time, but the timezone name
>    changes in the summer by the same rules as the first example.
>    Names of EST and CDT (Indiana).
>
> Is some combination of the elements above minimally acceptable?

Yes, because JST is the only Japanese Standard Time, 9 hours east of
GMT, with no summer time.

> Is there some capability missing for normal computer usage?  If so, what?

None.

> Is there some capability missing for non-computer usage?  If so, what?

None.

>
> Character set characteristics:
>
> Character sets can be classified into the following classes:
>    Upper Case
>    Lower Case
>    Numeric
>    Punctuation
>    White space
>    (Plus several that are primarily for computer usage).
>
> Translation of characters between upper and lower case can be done
> with or without loss of accent marks.
>
> These concepts need not be applied to languages which do not have the
> concepts of case or other character classes.
>
> Examples:
>    The character a-accent-grave can be translated to either A or
>    A-accent-grave.
>
>    The three Russian characters that never occur in upper case can
>    be left alone during translation.
>
> Is some combination of the elements above minimally acceptable?

I would say "no".
(Examples and other descriptions are to be completed)

> Is there some capability missing for normal computer usage?  If so, what?

Yes!
(Examples and other descriptions are to be completed)

> Is there some capability missing for non-computer usage?  If so, what?

Yes.
(Examples and other descriptions are to be completed)

>
> Collation:
>
> Collation is the ordering of textual material into some predefined order.
>
> The rules which can be used to determine the collation of text include:
>
>    - The specification of a collation order different from that
>      which occurs naturally in the computer character set.
>      (French and Canadian French use the same character codes, but
>      collate in different orders.)
>
>    - Certain characters do not participate in collation decisions.
>
>      For example, as required on page 10 of Webster's Ninth New
>      Collegiate Dictionary:
>
>        The main entries follow one another in alphabetical order letter
>        by letter without regard to intervening spaces or hyphens:
>
>    - Certain characters should collate equally even if they are
>      different characters.  (E.g. in some languages the accented
>      vowels are all equal and do not participate in collation
>      decisions.)
>
>    - Certain characters should collate equally until they are the
>      only difference, and then collate in a specified order.  (As
>      in the example above, only when two strings differ only by
>      accent marks, the order is specified.)
>
>    - Certain pairs of characters should be treated as a single
>      character.  (E.g. ll and ch in Spanish.)
>
>    - Certain characters should collate as if they were two characters.
>      (S-zed in German, the ae diphthong.)
>
>    - Collation can be done either with upper and lower case characters
>      distinct, or with the upper and lower case characters treated
>      equivalently.  The upper to lower translations mentioned for
>      character collation can be done.
>
>
> Examples:
>
>    German requires the following:
>
>       - the ability to process a single character as two distinct
>         collation elements each of which is distinct from all other
>         collation elements.  An example is the character
>         which looks similar to the Greek beta and is also referred to
>         as . is collated as two identical collation
>         elements which are ordered between and .
>
>    Experts understand the issues of Chinese "character collation",
>    French collation concerns, and Japanese "word collation".  They
>    are too long to give as examples here.
>
> Due to the complexity of collation issues, a reference to a standard work
> on collation for your culture or language would be very useful.
>
> Is some combination of the elements above minimally acceptable?

I would say "no".

> Is there some capability missing for normal computer usage?  If so, what?

Yes.
An ability to define the collation between several character sets or
character classes, instead of listing all character/collation elements.
(Examples and other descriptions are to be completed)

> Is there some capability missing for non-computer usage?  If so, what?

Yes.
(Examples and other descriptions are to be completed)

>
> Numbers:
>
> Numbers can be represented with or without thousands separators
> (every three digits), and with either . or , as the radix point.
>
> Examples:
>
>       123456.7890
>       123 456.789 0
>       123,456.7890
>       123.456,7890
>
> Possible issue: Some countries use Hindi digits.  Are there other digit
> systems in use; are there other patterns of "thousands separators"?
>
> Is some combination of the elements above minimally acceptable?

In Japan, we may have several problems.

Prob 1: The Japanese classic representation would require ten-thousands
        separators (every four digits) rather than thousands separators.

Prob 2: The Japanese classic representation sometimes requires Kanji
        numeric characters rather than Western numeric characters.

Prob 3: (to be completed)

> Is there some capability missing for normal computer usage?  If so, what?

I would say "yes".
(Examples and other descriptions are to be completed)

> Is there some capability missing for non-computer usage?  If so, what?

Yes.
(Examples and other descriptions are to be completed)

>
> Currency.
>    Currency can be represented using any of the numeric formats, but
>    can be separately identified from the numeric formats.  (That is,
>    numeric formats could use a different thousands separator from
>    monetary formats.)
>
>    Separate local and international currency symbols are maintained.
>
>    The currency symbol can be placed at the beginning or the end,
>    and can be multiple characters.  It can be separated from the
>    amount by a space.  The decimal delimiter can be specified.
>
>    Specific strings can be used for specific signs.
>
>       $123456.45
>       $ 123 456.45
>       ( 123 456.45 )
>       123 456.45 CR
>       123 456$45
>
>
>
> Is some combination of the elements above minimally acceptable?

I think so.

> Is there some capability missing for normal computer usage?  If so, what?

The problems in Japan are the same as for numerical expressions.
In particular, the Japanese classic notation of currency values uses
special Kanji numerical characters.
(Examples and other descriptions are to be completed)

> Is there some capability missing for non-computer usage?  If so, what?

Yes.
(Examples and other descriptions are to be completed)

>
> Messages and Responses:
>
> Messages, and the strings that the user uses to respond to messages, can
> be kept separate from the program, and can be separately translated
> (not automatically, however) to any supported language.
> The order in which substitutions (such as amounts or names) appear can
> be controlled.
>
> The text of message responses can be stored in the same way.  A single
> string for "yes", and a single string for "no" is always available.
>
> Is some combination of the elements above minimally acceptable?

Messages, yes.
Responses, I would say "yes", as far as a positive question is concerned.

> Is there some capability missing for normal computer usage?  If so, what?

Generic comment.
When the user's nationality/language environment is uncertain, an
ability to print multilingual messages and to get a response in one
specific language may be required. Alternatively, an ability to provide
a menu interface may be required, asking the user to select the number
that corresponds to the yes/no response instead of typing the yes/no
response string.

> Is there some capability missing for non-computer usage?  If so, what?

I have no idea.

>
> Text presentation:
>
> Not all natural languages are read and written in the European
> left-to-right, top-to-bottom order of presenting characters.
>
> Presentation in either right-to-left, top-to-bottom or top-to-bottom,
> left-to-right order is currently available.
>
> Possible issues: varying directions, other major direction patterns;
> in a computer environment, displays typically "scroll".  Does this
> present problems?

Yes.

> Is there some capability missing for normal computer usage?  If so, what?

Yes.
(Examples and other descriptions are to be completed)

> Is there some capability missing for non-computer usage?  If so, what?

Yes.
(Examples and other descriptions are to be completed)

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
EOF