From keld@dkuug.dk Fri Sep 27 21:32:44 1991
Received: by dkuug.dk (5.64+/8+bit/IDA-1.2.8)
	id AA17537; Fri, 27 Sep 91 21:32:44 +0200
Date: Fri, 27 Sep 91 21:32:44 +0200
From: Keld J|rn Simonsen
Message-Id: <9109271932.AA17537@dkuug.dk>
To: donn@hpfcrn.fc.hp.com
Subject: Re: (wg15rin 155) Re: Ballot resolution
Cc: greger@ism.isc.com, hlj@posix, wg15rin@dkuug.dk
X-Charset: ASCII
X-Char-Esc: 29

> However... what will those implementations do for sh and awk and all the
> other things that need #?  They're going to have to solve the problem in
> some way.  (I'll take that as a given.)

For the C language, where the same problem exists, they use trigraphs -
to my knowledge. Poor bastards :-)

> They *will* have to substitute something for #.  Whether when doing that
> the implementation is strictly standard conformant I don't want to get into.
> (Or they absolutely insist that there's only one translation, which is
> to the 8859-like EBCDIC, which always has #.)

Yes, but in different places...

>
> Whatever character they end up using for # is # in shell, # in awk, and
> should be # in localedef!  Thus, I don't see that EBCDIC has the problem
> in the same sense as 646.  (At least they have a place to put it and
> the national characters; 646 just doesn't have room!)

They may not have a feasible place to put it in the code. There may be
all kinds of hardware and software restrictions: you cannot generate
the character from the keyboard, you cannot display it on the screen,
you cannot print it on the printer, you cannot get it through the
terminal cluster, you cannot get the shell to handle it, etc.

> I believe that any reasonable POSIX on a EBCDIC implementation would
> insist that the 8859-like EBCDIC be the only character set supported.

I believe that not many EBCDIC systems will be sold as UNIX systems.
Rather, the UNIX system will coexist with other IBM systems running some
EBCDIC, and UNIX will run the same EBCDIC as the other system, to be
able to access files.
This would indicate using older EBCDICs, something like 2nd generation
or 3rd generation EBCDICs, which both have the problem with #. And I
would expect more 2nd than 3rd generation EBCDICs. Well, that is
guessing; maybe we could have an opinion from an IBM person...

> (On a hosted system, it might support other variations in the non-POSIX
> environment, but within POSIX...)  The historical ugliness of 646
> (because of its 7-bit nature) should not affect an inherently 8-bit
> character set.

Some EBCDICs are restricted to using only a subset of the 8-bit
characters, resembling a national 7-bit character set. All IBM machines
in my neighborhood are doing that! ... Oh well, some even use the very
old non-ISO-compatible codes. It is my impression that the Danish
environment is one of the more active in going full 8-bit, so I expect
things to be worse in other places. Even in the 8859-like EBCDICs, the #
is at different code positions in quite a few of these character sets.
So it is still a mess with #.

> I don't see any reason why, in an EBCDIC environment, localedef can't
> count upon having # available.

You could say the same about using C in EBCDIC environments. Still, my
experience says something different.

Keld
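
A minimal sketch of how one might check the point about # sitting at
different code positions. It assumes Python and the EBCDIC codecs bundled
with its standard library (cp037, cp273, cp500, cp1026), which are not
necessarily the character sets meant in the message above. The helper
name hash_positions is hypothetical. The script only reports what each
codec says; it does not assume the positions differ.

```python
# Assumption: these EBCDIC code pages ship with CPython's standard
# library; they stand in for the "8859-like EBCDICs" of the message.
EBCDIC_CODECS = ["cp037", "cp273", "cp500", "cp1026"]


def hash_positions(codecs=EBCDIC_CODECS):
    """Return {codec name: byte value encoding '#'}, or None if unknown."""
    # ASCII is included as the reference point.
    positions = {"ascii": "#".encode("ascii")[0]}
    for name in codecs:
        try:
            encoded = "#".encode(name)
        except (LookupError, UnicodeEncodeError):
            # Codec missing in this Python build, or '#' not representable.
            positions[name] = None
            continue
        positions[name] = encoded[0]
    return positions


if __name__ == "__main__":
    for name, pos in hash_positions().items():
        shown = "unavailable" if pos is None else f"0x{pos:02X}"
        print(f"{name:8} {shown}")
```

Any codec whose reported position differs from the others would
illustrate the "yes, but in different places" remark: a script that
hard-codes one byte value for # is not portable across such systems.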