===== Notes on RFC metadata in the v3 era =====

  * New XSD is in place (10 Sept 2019): https://www.rfc-editor.org/in-notes/rfc-index.xsd
  * rfc-index.txt, .html, .xml (and similar files) are slightly different from before. This is due to:
    * added new publication formats and source format.
    * removed file sizes. (Byte count is no longer included as metadata for each RFC.)
  * There is only one page count per RFC, even if it is available in multiple file formats.
    * As a result, in rfc-index.xml, page-count has been "pulled up" a level.
  * For page count, the source of data will change.
    * For pre-v3 docs, it is the page count of the .txt file (except for a handful of old RFCs where there is no .txt file).
    * For v3 docs, it will be the page count of the PDF file.

=== PDF naming conventions ===

  * Externally, PDFs are listed simply as "PDF" (whether v3 or otherwise).
  * Internally, the db holds "v3PDF" to mean v3 output; "PDF" is used for pre-v3.
  * As before, .txt.pdf files are not listed in index files.

=== TEXT naming conventions ===

  * Externally, .txt format (whether pre-v3 or not) is listed as "TEXT".\\ Exception: rfc-index.xml will display "ASCII" (for pre-v3) and "TEXT" (afterwards). Rationale: the other index files were already using "TEXT" instead of "ASCII", so they were not changed to start differentiating.
  * Internally, the db holds "TEXT" to mean v3 output; "ASCII" is used for pre-v3.

=== New resource: JSON files of RFC metadata ===

  * One JSON file per RFC was made available 11 September 2019.
  * Each file contains data similar to an entry in rfc-index.xml.
  * Sample file: https://www.rfc-editor.org/rfc/rfc5234.json

=== Examples ===

Example: RFC4254

  * OLD: ASCII 50338 24 HTML
  * NEW: ASCII HTML 24

For comparison, the JSON record includes (among other data): "format":["ASCII","HTML"],"page_count":"24"

----

Example: RFC8888 (v3 era)

  * NEW: TEXT HTML PDF XML 48

For comparison, the JSON record would include (among other data): "format":["TEXT","HTML","PDF","XML"],"page_count":"48"
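The JSON fragments in the examples can be read with a few lines of Python. The following is only a minimal sketch: the keys "format" and "page_count" are taken from the fragments shown here, the rest of each per-RFC file is not modeled, and ''summarize_record'' is a hypothetical helper name. It also assumes that "TEXT" versus "ASCII" in the "format" list separates v3 from pre-v3 documents, as the two examples suggest.

```python
import json

# Fragments copied from the examples on this page, wrapped in braces to form
# valid JSON. A real per-RFC file contains other fields that are not shown here.
RFC4254_SAMPLE = '{"format":["ASCII","HTML"],"page_count":"24"}'
RFC8888_SAMPLE = '{"format":["TEXT","HTML","PDF","XML"],"page_count":"48"}'


def summarize_record(raw_json):
    """Return (formats, page_count, looks_v3) for one metadata record.

    Hypothetical helper for illustration; not part of any RFC Editor tool.
    """
    record = json.loads(raw_json)
    formats = list(record.get("format", []))
    # page_count is a quoted string in the samples, so convert it explicitly.
    page_count = int(record["page_count"])
    # Assumption: "TEXT" marks v3 output and "ASCII" marks pre-v3 output.
    looks_v3 = "TEXT" in formats
    return formats, page_count, looks_v3


if __name__ == "__main__":
    for sample in (RFC4254_SAMPLE, RFC8888_SAMPLE):
        print(summarize_record(sample))
```

Note that the page count arrives as a string in both samples, so code that compares or sums page counts should convert it first.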