ISO/IEC JTC1/SC22/WG15 RIN N092

Title: reorder_after and reorder_end keywords in LC_COLLATE
Source: DS
Date: 1992-10-14

The following text is to be inserted in the description of LC_COLLATE keywords in POSIX.2 D11.3 section 2.5.5.2.

Line 2037: change:
"no other keyword shall be present."
to
"only the keywords 'reorder_after' and 'reorder_end' may be present."

Line 2047 add:

'reorder_after'
    Specify a position in the previously defined collation order, to be used as a starting point for reordering collating elements. This statement is followed by one or more collation reorder statements, reassigning character collation weights to collating elements. This keyword is optional.

'reorder_end'
    Specify the end of the collation reordering statements. This keyword is required when a 'reorder_after' statement has been used.

Line 2376 add:

2.5.2.2.6 'reorder_after' keyword

The 'reorder_after' keyword specifies a starting point for reordering collating elements. It is followed by one or more collation reorder statements, reassigning character collation weights to collating elements. The syntax is:

"reorder_after %s\n",

2.5.2.2.7 Collation Reordering

Each 'reorder_after' statement shall be followed by one or more collation element reordering entries. Collation element reordering entries are defined in the same way as the collating element entries in 2.5.2.2.4, specifying collation elements and their associated weights. The collation element reordering entries are terminated by a 'reorder_after' keyword or a 'reorder_end' keyword.

Each collation element specified via a collation element reordering entry is removed from the current collating sequence, if present, and inserted in the collating sequence after the previous reordering collation element. The collating element specified on the preceding 'reorder_after' statement acts as the first reordering collation element.
The last reordering collation element is followed by the successor of the collation element specified on the 'reorder_after' statement.

Example:

order_start
order_end
reorder_after
reorder_after
reorder_end

The resulting order is then:

2.5.2.2.8 'reorder_end' keyword

The collating reorder entries shall be terminated with a 'reorder_end' keyword.

2.5.2.2.9.1 'reorder_after' rationale

Much work is being done on locales to make them applicable to large character sets, and to numerous character sets via a character set independent specification. The WG15 RIN programme of work includes harmonizing locales as far as is feasible.

POSIX.2 Draft 11 introduced a copy command for all sections of the locale. This is useful for many purposes, and it ensures that two locales are equivalent for the category copied. The 'reorder_after' specification facilitates a more refined way of building on previous work.

The collating sequences of characters vary a bit from country to country, but generally much of the collating sequence is the same. For example, the Danish collating sequence is much the same as the German, English or French, but it differs for about a dozen letters. The same can be said for Swedish or Spanish: generally the Latin collating sequence is the same, but a few characters are different.

With the advent of quite general coded character set independent locales, like the example Danish locale in POSIX.2 draft 11 annex F, it would be convenient if the few differences could be specified just as changes to an existing locale. This also makes it easier to see what the changes really are.

2.5.2.2.9.2 'awk' script for 'reorder_after'

A script has been written in the 'awk' language defined in POSIX.2 to implement the 'reorder_after' construct.
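
# Lines of the LC_COLLATE category are kept in a doubly linked list:
#   cont[n]   text of stored line n
#   follow[n] number of the line after n (0 or unset at the end)
#   back[n]   number of the line before n
#   symb[s]   line currently defining collating symbol s
BEGIN { comment = "%"; back[0] = follow[0] = 0 }

/LC_COLLATE/ { coll = 1 }

# At the end of the category, print the stored lines in list order.
/END LC_COLLATE/ {
    coll = 0
    for (lnr = 1; lnr; lnr = follow[lnr])
        print cont[lnr]
}

{
    if (coll == 0)
        print $0
    else {
        if ($1 == "copy") {
            # Append the LC_COLLATE body of the copied locale file.
            file = $2
            while ((getline < file) > 0)
                if ($1 == "LC_COLLATE")
                    copy_lc = 1
                else if ($1 == "END" && $2 == "LC_COLLATE")
                    copy_lc = 0
                else if (copy_lc) {
                    lnr++
                    follow[lnr-1] = lnr
                    back[lnr] = lnr-1
                    cont[lnr] = $0
                    symb[$1] = lnr
                }
            close(file)
            # Continue after the last copied line, so that lines between
            # "copy" and the first "reorder_after" do not cut the list.
            after = lnr
        } else if ($1 == "reorder_after") {
            ra = 1
            after = symb[$2]
        } else if ($1 == "reorder_end")
            ra = 0
        else {
            # Store the line and link it in after line "after".
            lnr++
            if (ra) {
                follow[lnr] = follow[after]
                back[follow[after]] = lnr
            }
            follow[after] = lnr
            back[lnr] = after
            cont[lnr] = $0
            if (ra && $1 != comment && $1 != "") {
                # Unlink the previous definition of this symbol.
                old = symb[$1]
                follow[back[old]] = follow[old]
                back[follow[old]] = back[old]
                symb[$1] = lnr
            }
            after = lnr
        }
    }
}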
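
As an illustration of the Collation Reordering rules (each reordered element is removed from the current sequence, if present, and inserted after the previous reordering element), the following Python sketch applies them to a plain list of collating symbols. It is not part of the proposal: the function name apply_reorder, the (anchor, entries) representation of a 'reorder_after' block and the sample symbols are assumptions made for this example, and collation weights are ignored.

```python
def apply_reorder(order, blocks):
    """Return a new collating sequence with the reorder blocks applied.

    order  -- list of collating symbols in their current order
    blocks -- list of (anchor, entries) pairs; each pair stands for one
              'reorder_after <anchor>' statement and the symbols of the
              reordering entries that follow it (hypothetical layout)
    """
    result = list(order)
    for anchor, entries in blocks:
        if anchor not in result:
            raise ValueError("unknown collating symbol: " + anchor)
        # The symbol named on 'reorder_after' is the first "previous" element.
        previous = anchor
        for symbol in entries:
            if symbol == previous:
                # Assumption: an element cannot be placed after itself.
                raise ValueError("cannot reorder " + symbol + " after itself")
            # Remove the element from the current sequence, if present ...
            if symbol in result:
                result.remove(symbol)
            # ... and insert it directly after the previous reordering element.
            result.insert(result.index(previous) + 1, symbol)
            previous = symbol
    return result


if __name__ == "__main__":
    # Hypothetical symbols, chosen only for illustration.
    base = ["<a>", "<b>", "<c>", "<d>", "<e>"]
    blocks = [("<a>", ["<d>", "<e>"]), ("<b>", ["<z>"])]
    print(apply_reorder(base, blocks))
    # prints ['<a>', '<d>', '<e>', '<b>', '<z>', '<c>']
```

In this sketch the element that originally followed the anchor (here <b> after <a>) ends up directly after the last reordered element of the block, which matches the rule that the last reordering collation element is followed by the successor of the element named on 'reorder_after'.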