From: ajosey@rdg.opengroup.org (Andrew Josey)
Date: Tue, 18 Nov 1997 13:56:31 +0000
Reply-To: ajosey@rdg.opengroup.org (Andrew Josey)
To: sc22wg15@dkuug.dk
Subject: Defect Reports IS 9945-2

Attached are three recent defect reports for IS 9945-2 (references
IS9945-2#155, #156, #157). Please note that these predate the switch to
using Form 14 for PASC interps, but I have attached a WG15 status block
to each one to assist. The status block will need completing by the
WG15 PE.

best regards
Andrew

WG15 Status Block:
------------------------------------------------------------------------
 1 Defect report number: IS9945-2#155
 2 Submitter: IEEE PASC November 18 1997
 3 Addressed to: JTC1/SC22 /WG15 editor's group on ISO 9945-2
 4 WG secretariat:
------------------------------------------------------------------------
 5 Date circulated by WG secretariat:
 6 Deadline on response from editor:
 7 Defect Report concerning (number and title of International Standard
   or DIS final text): IS 9945-2 1993
 8 Qualifier (e.g. error, omission, clarification required):
   clarification required
 9 References in document (e.g. page, clause, figure, and/or table
   numbers):
   Relevant Sections: 5.10.7.3 (Regular Expression)
                      2.8.3.2 (RE Bracket Expression)
10 Nature of defect (complete, concise explanation of the perceived
   problem): See attached
------------------------------------------------------------------------
11 Solution proposed by the submitter (optional): See attached
------------------------------------------------------------------------
12 Editor's response (any material proposed for processing as a
   technical corrigendum to, an amendment to, or a commentary on the
   International Standard or DIS final text is attached separately to
   this completed report): See interpretation response attached
_____________________________________________________________________________
PASC Interpretation reference 1003.2-92 #155
_____________________________________________________________________________

Interpretation Number: xxxx
Topic: ex behaviour
Relevant Sections: 5.10.7.3 (Regular Expression)
                   2.8.3.2 (RE Bracket Expression)

Interpretation Request:
-----------------------

From: Shalini S Dikshit
Date: Fri, 18 Jul 1997 18:21:14 IST

Clarification requested for Section 5.10.7.3 of POSIX standard - ex

The contents on page 535, lines 1793-1799, state that '~' should be
expanded to the replacement part of the last substitute command. The
section appears to exclude the case when '~' appears at the starting
point or at the end point of a range within a bracket expression.

The section also mentions that '~' loses its special meaning if it is
preceded by a '\'. But in a bracket expression the '\' itself loses its
special meaning (Ref: Section 2.8.3.2, pages 79-80, lines 2873-2876).

So if '~' appears in a range in a bracket expression, should '~' be
expanded to the replacement part of the last substitute command?
Proposed Interpretation for Section 5.10.7.3:

The interpretation is based on Section 5.10.7.3 of the POSIX Standard,
which states (in part):

    1793  Match the replacement part of the last substitute command. The
    1794  tilde(~) character can be escaped in a BRE to become a normal character
    1795  with no special meaning.

As per the standard, '~' should be expanded to the replacement part of
the last substitute command. The '~' loses its special meaning if it is
preceded by a '\'.

Proposed Rationale for Interpretation:

As is pointed out in the interpretation request, the standard is not
clear in the case when '~' appears in a range in a bracket expression.
If tilde(~) appears at the starting point or at the end point of a
range in a bracket expression and tilde(~) is expanded, then the range
is changed or may become an invalid range. This behaviour confuses
users.

For example, consider the following set of vi commands:

1. If the user intends to substitute all occurrences of "aa" by "bb",
   then he can use the following substitute command:

       :s/aa/bb

2. Now if the user intends to search for characters which fall in the
   range 'a' to '~', then he can use the command:

       /[a-~]

   /* Note: here '~' appears in a range within a bracket expression */

Here vi treats '~' as a metacharacter and expands it to the replacement
text of the last substitute command. So in effect it searches for
characters in the range 'a' to 'bb', which is not what the user
intended.

Moreover, according to lines 1794 and 1795, tilde(~) can be escaped to
become a normal character. But Section 2.8.3.2 (Special Character),
lines 2873-2876, says:

    2873  The special characters
    2874      . * [ \
    2875  (period, asterisk, left bracket, and backslash, respectively) shall
    2876  lose their special meaning within a bracket expression.

The descriptions in lines 1794-1795 and in lines 2873-2876 of the
standard contradict each other. The standard should clearly state such
an inconsistency when specific behaviour overrides general behaviour.
According to the standard, the regular expression definition in vi/ex
may be different from the general regular expression definition. But
there should be a standard definition of regular expression, and it
should not change depending on the utility.

According to Section 5.10.7.3, a range expression whose starting point
is, say, 'a' and whose end point is '~' should actually be written
[a-\~]. But according to Section 2.8.3.2 the backslash (\) has no
special meaning in a bracket expression, so [a-\~] does not denote that
range.

If the following substitution command is given in ex(1), then the
actual strings deleted may be different from the intended ones:

    :%s/[ -~]//g

For informational purposes, our analysis of current vendor
implementations like Sun's Solaris 5.5.1 and IBM AIX version 2 shows
that the historical behavior for this situation is that '~' is not
expanded when it appears in a range in a bracket expression.

Is it the intention of the standard to diverge from historical practice
in this case?

The tilde(~) should not be expanded when it appears in bracketed
regular expressions, as the expansion confuses users.

------------------------------------------------------------------------------

Interpretation response
--------------------------------

The standard is unclear on this issue. The standard states (pg 535,
ll 1793-1795) that the tilde character can be escaped in a BRE but does
not describe the escape mechanism. As such, no conformance distinction
can be made between alternative implementations based on this. This is
being referred to the sponsor.

Rationale
-----------

There are at least two levels of parsing, one of them in ex/vi before
the expression is passed to the regular expression parsing routines;
therefore saying that the tilde can be escaped in ex is not in conflict
with the other statement.

Most existing versions of ex/vi at the time the standard was written
had their own RE parsers, and it was expected that existing practice
would change.
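The general bracket-expression rules of Section 2.8.3.2 that the request relies on can be observed outside ex/vi. The following is only a sketch: it uses sed and grep rather than ex, forces the C locale, and the helper names strip_printable and count_backslash_lines are hypothetical. It says nothing about how any particular ex/vi treats '~'.

```shell
# Sketch only: sed and grep (not ex/vi) in the C locale, showing general
# POSIX BRE behaviour inside a bracket expression.

# strip_printable (hypothetical helper): delete every character in the
# range space..tilde, mirroring the request's ":%s/[ -~]//g" example.
strip_printable() {
    printf '%s\n' "$1" | LC_ALL=C sed 's/[ -~]//g'
}

# count_backslash_lines (hypothetical helper): count the arguments that
# contain a literal backslash; inside [...] the backslash is ordinary.
count_backslash_lines() {
    printf '%s\n' "$@" | LC_ALL=C grep -c '[\]'
}

strip_printable 'a~b'             # tilde is an ordinary range end point here
count_backslash_lines 'a\b' 'ab'  # only the first argument has a backslash
```

In these tools the tilde is never a metacharacter, which is why the question only arises for the extra parsing level in ex/vi described in the rationale.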
Forwarded to Interpretations group: Jul 22 1997
Proposed resolution: Oct 14 1997
Finalised: Nov 18 1997

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

WG15 Status Block:
------------------------------------------------------------------------
 1 Defect report number: IS9945-2#156
 2 Submitter: IEEE PASC November 18 1997
 3 Addressed to: JTC1/SC22 /WG15 editor's group on ISO 9945-2
 4 WG secretariat:
------------------------------------------------------------------------
 5 Date circulated by WG secretariat:
 6 Deadline on response from editor:
 7 Defect Report concerning (number and title of International Standard
   or DIS final text): IS 9945-2 1993
 8 Qualifier (e.g. error, omission, clarification required):
   clarification required
 9 References in document (e.g. page, clause, figure, and/or table
   numbers):
   Relevant Sections: 3.14.13
10 Nature of defect (complete, concise explanation of the perceived
   problem): See attached
------------------------------------------------------------------------
11 Solution proposed by the submitter (optional): None.
------------------------------------------------------------------------
12 Editor's response (any material proposed for processing as a
   technical corrigendum to, an amendment to, or a commentary on the
   International Standard or DIS final text is attached separately to
   this completed report): See interpretation response attached
_____________________________________________________________________________
PASC Interpretation reference 1003.2-92 #156
_____________________________________________________________________________

Interpretation Number: xxxx
Topic: shell
Relevant Sections: 3.14.13

Interpretation Request:
-----------------------

Date: Wed, 13 Aug 1997 16:32:31 +0200 (MET DST)
From: Wilhelm Mueller

(1) If the shell accepts as an extension the conditions with a leading
    SIG prefix, will it be permitted to print this prefix when trap has
    no operands?
    An example is GNU's bash:

        (input)> trap xxx ALRM
        (input)> trap
        trap -- 'xxx' SIGALRM

    and bash allows that last form to be used as input, too.

(2) Does the wording ``..., so that it is suitable for re-input to the
    shell as commands that achieve the same trapping results.'' (lines
    1754-1756) mean that traps set to their default value must be
    listed?

    As an example, consider a function which sets its own traps:

        f ()
        {
            oldtraps="$(trap)"
            trap 'xxx' USR1
            ....
            eval "$oldtraps"
        }

        trap - USR1
        f

    Will SIGUSR1 be handled now with the command xxx--as is the case
    with historic sh and bash--or must it be reset to its default
    behaviour?

W. Müller
IEEE-CS Member No. 40112058

Interpretation response
---------------------------------

Question 1:

The standard clearly states that the condition should not have the
leading SIG. While extensions might allow implementations to accept a
leading SIG, the output would be less portable if this was produced.
The condition never has SIG; strings that have SIG can be interpreted
as a condition by removing SIG. Therefore bash is non-conforming with
respect to output (though input is fine).

Question 2:

The standard is unclear on this issue. Section 3.14.13, line 1752,
talks about "commands" rather than "actions". If commands are the
subset of actions that consist of a non-null string that is not "-",
then it is impossible to meet the conditions given on lines 1754-1756.
Nearly all (or all known) implementations fail to meet the conditions
on lines 1754-1756. No conformance distinction can be made between
alternative implementations based on this. This is being referred to
the sponsor.

Rationale
-------------
None.

Notes to the editor (not part of this interpretation):
----------------------------------------------------

Change Trap 3.14.13 "a list of commands" on line 1752 to "a list of
actions".
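The response to Question 1 says that a string carrying the SIG prefix can be read as a condition by removing SIG. A minimal sketch of that normalization, using only standard parameter expansion, follows; normalize_condition is a hypothetical helper name and is not part of any shell or of the standard.

```shell
# Sketch only: normalize_condition is a hypothetical helper. It strips
# one leading "SIG" so that an extension-style name such as SIGALRM
# becomes the portable condition name ALRM.
normalize_condition() {
    printf '%s\n' "${1#SIG}"
}

normalize_condition SIGALRM   # prints ALRM
normalize_condition USR1      # prints USR1 (no prefix to strip)
```

A script that post-processes trap output from a shell that prints the prefix could apply such a step before re-using the condition names with another shell.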
Forwarded to Interpretations group: Aug 26 1997
Proposed resolution: Oct 14 1997
Finalised: Nov 18 1997

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

WG15 Status Block:
------------------------------------------------------------------------
 1 Defect report number: IS9945-2#157
 2 Submitter: IEEE PASC November 18 1997
 3 Addressed to: JTC1/SC22 /WG15 editor's group on ISO 9945-2
 4 WG secretariat:
------------------------------------------------------------------------
 5 Date circulated by WG secretariat:
 6 Deadline on response from editor:
 7 Defect Report concerning (number and title of International Standard
   or DIS final text): IS 9945-2 1993
 8 Qualifier (e.g. error, omission, clarification required):
   clarification required
 9 References in document (e.g. page, clause, figure, and/or table
   numbers):
   Relevant Sections: 3.9.1
10 Nature of defect (complete, concise explanation of the perceived
   problem): See attached
------------------------------------------------------------------------
11 Solution proposed by the submitter (optional): None.
------------------------------------------------------------------------
12 Editor's response (any material proposed for processing as a
   technical corrigendum to, an amendment to, or a commentary on the
   International Standard or DIS final text is attached separately to
   this completed report): See interpretation response attached
_____________________________________________________________________________
PASC Interpretation reference 1003.2-92 #157
_____________________________________________________________________________

Interpretation Number: xxxx
Topic: shell
Relevant Sections: 3.9.1

Interpretation Request:
-----------------------

From: Joseph Myers
October 6th 1997

This is a request for an official, binding interpretation of IEEE Std
1003.2-1992 (POSIX.2). Page and line references herein are to the joint
IEEE/ISO/IEC version (joint publication of IEEE Std 1003.2-1992 and
ISO/IEC 9945-2: 1993 (E)).
A) 3.9.1 (Simple Commands) has the text (page 135, lines 725 to 727):

       When a given simple command is required to be executed (i.e.
       when any conditional construct such as an AND-OR list or a case
       statement has not bypassed the simple command) ...

   Does this present a requirement on the time of execution of such a
   command in relation to tokenization and parsing, i.e. that it
   happens as soon as a simple command has been identified and it is
   known that it is the next simple command to be executed?

   Specifically, consider a script containing

       alias foo=bar; foo

   where foo is not previously defined as a function or alias. Which
   (if either) of the following behaviours are conforming?

   (a) The command foo is executed, the alias command not having been
       executed by the time the token foo was delimited;

   (b) The command bar is executed, the alias command having been
       executed by the time the token foo was delimited?

B) 3.2.3 (Double Quotes) (page 118, lines 55 to 59) says, concerning
   the $( ... ) construct

       The tokenizing rules in 3.3 shall be applied recursively to find
       the matching ).

   Does the reference to the tokenizing rules in 3.3 include alias
   substitution (3.3.1)?

   If so, then:

   (a) Is the expansion of an alias encountered included in the text of
       the command to be substituted (in place of the original text),
       despite the provision to the contrary in 3.3(5) (page 120, line
       125)?

   (b) May a ) in the value of the alias terminate the $( ... )
       construct?

   (c) Is alias expansion also applied when and if command substitution
       is applied?

   (d) Since alias substitution requires recognition of reserved words
       in correct grammatical context, its application here requires
       parsing of the command whose output shall be substituted to
       occur in the process of finding the matching ). Is the behaviour
       in the event of a grammatical error defined (since recovery is
       necessary to continue to determine whether aliases should be
       substituted)? For example, may errors or diagnostics occur from
       the following?
           true || $(if; foo)

   If not, then:

   Is the reference to parsing of the command whose output is to be
   substituted in E.3.2 (page 823, line 2975) in error?

C) 3.3 (Token Recognition) allows the formation of a token that is
   delimited by the end of input.

   (a) This allows a token with an opening quote (single or double) but
       no closing quote. Must the end of input be considered to end the
       quoted text, or may an implementation consider this a syntax
       error? Specifically, must

           sh -c "echo '\$foo"

       output the four characters $foo (quote removal having been
       applied to the single ') without error, and may a conforming
       application use this construct?

   (b) If the end of input leaves a ${, $(, $(( or ` construct
       unterminated, is the effect defined?

D) 3.2.3 (page 119, lines 71 to 74) provides that the constructs

       "`echo hello"

   and

       `echo "foo`

   produce undefined results. Are the results undefined if these
   constructs are encountered during token recognition, or only if
   expansions need to be performed on them?

E) 3.3.1 (Alias substitution) does not clearly state what is to be done
   with the value of an alias that is substituted. Is it required that
   the resulting text itself be tokenized? If so, is the value of the
   alias treated as a separate input source, so that any final token is
   delimited by the end of the value of the alias; if not, what are the
   effects of the following definitions and uses of aliases?

       alias foo="'"
       foo bar'

       alias bar="echo >"
       bar>baz

Interpretation response
-------------------------------

A. The standard clearly states (page 120, lines 149-150) that
   tokenization occurs before any grammatical rules are applied. All
   compound commands are tokenized in their entirety before they are
   executed. The grammar rules show that the whole line shall be read
   before tokenization. Therefore the command shall be "foo" and never
   "bar".
   This also means that a function

       f() {
           alias foo=bar
           foo
       }

   will execute "foo" and not "bar", since the entire function is
   tokenized before alias substitution takes place.

B. The standard appears to mandate that alias substitution should
   occur; however, concerns have been raised about this which are being
   referred to the sponsor. Historically shells have not done this, and
   the standard would appear to be in error.

C. a. The standard does not speak to this issue of whether the
      single-quoted (or double-quoted, or back-quoted) string is
      correctly terminated by this end of token, and therefore it is
      acceptable for an implementation to treat this as an error. As
      such, no conformance distinction can be made between alternative
      implementations based on this.

   b. For ${, $( or $(( the standard clearly states that the effect is
      defined as a syntax error -- the tokenization completes, but the
      resulting grammar is in error. For ', the effect is as above
      (C subsection a). For `, the effect should also be as above (the
      back-quoted string is a single token).

D. The standard clearly states that the constructs are undefined in all
   cases.

E. The standard clearly states in Section 3.3.1 (alias substitution),
   along with Section 5.1, the definition for the alias utility, that
   the alias provides a string, not a set of tokens, and that this
   string is interpreted/parsed when the alias substitution occurs.
   There is nothing in the standard that states that there is an end of
   token at the end of an alias.

Rationale
-------------
None.

Notes to the editor (not part of this interpretation):
-----------------------------------------------------

For item B above, add the phrase "other than alias substitution" on
lines 58-59 after "recursively", worded appropriately.

Forwarded to Interpretations group: October 8 1997
Proposed resolution: Oct 14 1997
Finalised: Nov 18 1997
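Response A of interpretation #157 above (a whole line is read and tokenized before any of it is executed) can be tried with a one-line script. This is only a sketch: foo and bar are hypothetical marker functions, same_line_alias is a hypothetical wrapper, and the shell that sh resolves to is an assumption about the local system.

```shell
# Sketch only: foo and bar are hypothetical marker functions. The child
# shell receives a single line, so the token "foo" is delimited before
# the alias command on that same line has been executed.
same_line_alias() {
    sh -c 'foo() { echo foo; }; bar() { echo bar; }; alias foo=bar; foo'
}

same_line_alias   # a shell following response A prints foo, not bar
```

Putting the alias command and the use of foo on separate lines of a script is the case where the alias would be expected to take effect on shells that expand aliases in scripts.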