RFC 4661, "An Extensible Markup Language (XML)-Based Format for Event Notification Filtering", September 2006
Source of RFC: simple (rai)
Errata ID: 925
Status: Held for Document Update
Publication Format(s) : TEXT
Reported By: Alfred Hoenes
Date Reported: 2006-11-09
Held for Document Update by: Robert Sparks
Significant issues with filter syntax (ABNF in RFC 4661)

Many of the filter examples in RFC 4661 and RFC 4660 do not conform to the ABNF presented in Section 5, on page 11, of RFC 4661. Further, that ABNF allows unexpected, strange instantiations of 'elem-path', and there is at least one significant semantic ambiguity in the syntax described.

Because I am a complete XML and XPath layman, I am not in a position to diagnose exactly what is wrong, or to quickly propose suitable corrections in a way that keeps the specification as close to XPath as possible. I just present some of the strange outcomes of strict application of the RFC 4661 ABNF, and a few pointers to examples in the RFC text that do not conform to that syntax.

a) ambiguous semantics / insufficient syntax

The ABNF production,

   expression = "[" (elem-expr / attr-expr)
                1*[oper (elem-expr / attr-expr)] "]"

together with

   oper = "and" / "or"

allows expressions to be built from multiple terms of the form '(elem-expr / attr-expr)' separated by "and" and/or "or" operators. No operator precedence rules are specified, and there is no way to insert parentheses to form sub-expressions / groups that enforce the desired operator precedence.

Thus, it remains unclear whether, e.g., an expression of the form

   [ <expr1> or <expr2> and <expr3> ]

means

   [ <expr1> or ( <expr2> and <expr3> ) ]

(corresponding to commonly used precedence rules), or

   [ ( <expr1> or <expr2> ) and <expr3> ]

(corresponding to simple left-to-right operator evaluation).

b) strange productions (#1)

The ABNF production,

   elem-path = (element / "*") 1*["/" / "*" / element]
               ["*" / element]

together with

   element = [ns] string
   ns      = string ":"

admits 'elem-path' values of strange and unexpected forms, e.g.,

   ****
   *<ns_string1>:<elem_string1><elem_string2><ns_string3>:<elem_string3>

without any intervening separator characters. I am quite sure that this was not intended.
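One way to see how permissive the 'elem-path' production is: transcribe it literally into a regular expression and try the strings quoted above. The sketch below is only an illustration and not part of RFC 4661. The names NAME, ELEMENT, ELEM_PATH and accepts_elem_path are hypothetical, the placeholders are replaced by made-up names such as ns1 and elem1, and 'string' is narrowed to a simple ASCII name pattern as a simplifying assumption.

```python
import re

# Assumption: 'string' is modelled as a simple ASCII name. The rule in the
# RFC is broader; this narrowing is only for illustration.
NAME = r"[A-Za-z_][A-Za-z0-9_.-]*"

# element = [ns] string ; ns = string ":"
ELEMENT = rf"(?:{NAME}:)?{NAME}"

# elem-path = (element / "*") 1*["/" / "*" / element] ["*" / element]
# A repetition of an optional group, 1*[ ... ], may match nothing at all,
# so it is transcribed as ( ... )*.
ELEM_PATH = re.compile(
    rf"(?:{ELEMENT}|\*)"      # (element / "*")
    rf"(?:/|\*|{ELEMENT})*"   # 1*["/" / "*" / element]
    rf"(?:\*|{ELEMENT})?"     # ["*" / element]
)


def accepts_elem_path(text: str) -> bool:
    """Return True if the literal transcription of 'elem-path' matches text."""
    return ELEM_PATH.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ("****", "*ns1:elem1elem2ns3:elem3", "a/b", "/a"):
        print(candidate, accepts_elem_path(candidate))
```

Under these assumptions, the transcription accepts both strange strings as well as an ordinary separated path such as a/b, so the production by itself does not tell the intended forms apart from the unintended ones.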

Looking at the 'elem-path' production, it can be observed: The construction '1*[ ... ]' does not appear to make much sense; since a group in brackets, '[ ... ]', means "optional", this construction is equivalent to the simpler '*( ... )'. Perhaps the '1*[ ... ]' group should look similar to the '1*( ... )' group in

   elem-reference = "/" 1*("/" / "/*" / ("/" element))

including the required "/" in all alternatives. This also makes the final, optional group questionable.

c) strange productions (#2)

The ABNF productions,

   reference      = elem-reference / attr-reference
   attr-reference = reference attribute

together with

   attribute = "@" [ns] string
   ns        = string ":"

admit 'reference' values with multiple attribute references, like

   <elem-reference>@<ns1>:<attr1>@<attr2>@<attr3>@<ns4>:<attr4>

I am quite sure that this was not intended.

d) front end of examples not matching the syntax

Successive substitution of the ABNF productions of RFC 4661 leads to the following observations:

* each 'elem-reference' must start with a double slash, "//";
* each 'reference' starts with an 'elem-reference', and hence it must start with a double slash;
* each 'selection' starts with a 'reference', and hence it must start with a double slash.

Many examples do not conform to this restriction, from the short examples at the bottom of page 11 to many of the detailed XML examples in both RFCs.

e) back end of examples not matching the syntax

Further observations from the ABNF:

* (un-escaped) square brackets, "[" and "]", appear only in the 'expression' production, forming its leading and trailing characters, respectively;
* each 'selection' contains at most one 'expression', and this 'expression' is the trailing part of the 'selection';
* hence, each 'selection' contains at most one matching pair of square brackets, and if it does, the "]" must be its last character.

There are examples given, e.g.
in Section 6.1 of RFC 4661, at the top of page 13, and in Section 6.6 of RFC 4661, on page 15, where this restriction is not adhered to!

Final conclusion:

Apparently, all (or most) of the examples presented are crafted as intended, i.e., they are meant to be syntactically valid. Hence, the ABNF needs to be substantially reworked so that it can produce these examples, and so that it produces exactly what it should. Otherwise, implementations based on the ABNF will fail drastically, will not conform to the intended behaviour, and will not be interoperable with implementations not based on the ABNF of RFC 4661.
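The observations about the front end and back end of a 'selection' amount to necessary conditions that can be checked mechanically. The following sketch is only an illustration under stated assumptions: the function name selection_shape_problems and the sample strings are hypothetical (they are not taken from either RFC), escaped brackets are not handled, and passing the check does not imply that a string conforms to the full ABNF.

```python
from typing import List


def selection_shape_problems(selection: str) -> List[str]:
    """List violations of the two shape constraints derived from the ABNF.

    Only necessary conditions are checked:
      * a 'selection' must start with "//";
      * it may contain at most one "[" ... "]" pair, and if it does,
        the "]" must be the last character.
    Assumption: every bracket is treated as un-escaped.
    """
    problems: List[str] = []

    if not selection.startswith("//"):
        problems.append("does not start with '//'")

    opens = selection.count("[")
    closes = selection.count("]")
    if opens > 1 or closes > 1:
        problems.append("more than one square bracket pair")
    elif opens != closes:
        problems.append("unbalanced square brackets")
    elif opens == 1 and not selection.endswith("]"):
        problems.append("']' is not the last character")

    return problems


if __name__ == "__main__":
    # Hypothetical filter strings, chosen only to exercise each check.
    samples = [
        '//a/b[@id="1"]',
        '/a/b',
        '//a[@x="1"]/b',
        '//a[@x="1"]/b[@y="2"]',
    ]
    for sample in samples:
        print(sample, selection_shape_problems(sample))
```

A check of this kind could be run over the filter examples in both RFCs to list the ones that the current ABNF cannot produce; it says nothing about which side, the examples or the grammar, reflects the intended syntax.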