[rfc-i] Open issue: semantically tagging data

On 21 Jun 2012, at 17:03 , Heather Flanagan (RFC Series Editor) wrote:

> Want the ability to semantically tag some document info, at least
> authors' names and references

> Any discussion on this?  Is this just a "nice to have" that may just
> be a part of some of the formats we'll see proposed and not a
> requirement for a new format?

We really want the ability to extract information from RFCs in ways that are simple to implement in tools and that work reliably. The names of the authors are the most obvious example. Note, though, that this doesn't have to mean that wherever an author's name appears there is a tag saying "this is the name of an author". A different way to go would be to encode this information separately from the human-readable text. Also note that a name can appear as an author, as a contributor, or possibly in some other role. (And of course there are various ambiguities with names, which we may or may not want to resolve.)
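To make the "encoded separately" option concrete, here is a minimal sketch. It assumes a hypothetical JSON block that travels with the document; the layout and the field names ("people", "name", "role") are made up for illustration and are not part of any proposed format.

```python
import json

# Hypothetical sidecar metadata, kept apart from the human-readable text.
# The layout and the field names ("people", "name", "role") are assumptions.
SIDECAR = """
{
  "people": [
    {"name": "A. Author", "role": "author"},
    {"name": "B. Helper", "role": "contributor"},
    {"name": "C. Writer", "role": "author"}
  ]
}
"""


def names_with_role(sidecar_json, role):
    """Return the names recorded with the given role, in document order."""
    people = json.loads(sidecar_json)["people"]
    return [p["name"] for p in people if p["role"] == role]


authors = names_with_role(SIDECAR, "author")
contributors = names_with_role(SIDECAR, "contributor")
print(authors)
print(contributors)
```

The point is only that a tool never has to parse the rendered text to tell an author from a contributor; the role is explicit in the separate data.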

However, there is some information that we would probably want to tag directly as being of a particular kind, because maintaining separate machine-extractable and human-readable versions of it is undesirable. What I'm thinking of here is ASCII art and code. (For code, we would also want to indicate the type of code.)
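As a rough illustration of tagging such blocks in place, the sketch below pulls typed blocks out of a plain-text document. The `[[block type=...]]` and `[[end]]` markers are invented for this example and are not an existing or proposed RFC convention.

```python
import re

# Hypothetical marker syntax, invented for this sketch only.
DOC = """\
Some prose.
[[block type=ascii-art]]
+---+    +---+
| A |--->| B |
+---+    +---+
[[end]]
More prose.
[[block type=c]]
int main(void) { return 0; }
[[end]]
"""

# A block starts at a marker line naming its type and runs to the next [[end]] line.
BLOCK = re.compile(
    r"^\[\[block type=([\w-]+)\]\]\n(.*?)^\[\[end\]\]$",
    re.MULTILINE | re.DOTALL,
)


def typed_blocks(text):
    """Return (type, body) pairs for every tagged block, in document order."""
    return [(m.group(1), m.group(2)) for m in BLOCK.finditer(text)]


blocks = typed_blocks(DOC)
for kind, body in blocks:
    print(kind, len(body.splitlines()))
```

Here the same characters serve both the human reader and the tool, so there is no second copy that could drift out of sync.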

As for references: adding machine-extractable information about third-party documents to our documents seems like a waste of time to me. The human-readable information must be there for references to work the old-fashioned way. Beyond that, a reference should if possible include a link, both for the convenience of readers and because tools can then consult the canonical location of a document for information about that document.

