Format specification compliance

Last updated on August 1st, 2017 at 01:53 pm

To enable automatic processing and preservation of archived content, it is advisable to comply with format specifications.

All digital content must follow a format specification so that computers can process it automatically and present the information it contains in a form people can understand. For instance, a file encoded according to the JPEG format specification can be interpreted by a browser and shown to a person as a photo on the screen.

  • If content does not comply with a format specification, its information may become inaccessible.
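
As a rough illustration of what compliance means for the JPEG example, the sketch below checks only that a file's bytes start with the JPEG start-of-image marker (FF D8) and end with the end-of-image marker (FF D9). The function name looks_like_jpeg is hypothetical, and this is a superficial test rather than a full validation against the specification: a file can pass it and still be malformed, and some otherwise usable files carry trailing bytes that would make it fail.

```python
def looks_like_jpeg(data: bytes) -> bool:
    """Superficial JPEG check (illustrative only, not a full validator).

    A JPEG stream begins with the SOI marker 0xFFD8 and ends with the
    EOI marker 0xFFD9. Nothing between the two markers is inspected.
    """
    return (
        len(data) >= 4
        and data[:2] == b"\xff\xd8"   # start-of-image marker
        and data[-2:] == b"\xff\xd9"  # end-of-image marker
    )
```

A file on disk could be checked by reading it in binary mode, for example looks_like_jpeg(open("photo.jpg", "rb").read()), where photo.jpg is a placeholder name.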

Respecting format specifications enables preservation of, and access to, the content’s information over time, even after the format becomes obsolete. However, many pages on the web do not comply with any format specification.

Note that, besides passing syntactic validation, a page should also use its tags with the correct semantics. For instance, heading text should be marked up in the source code with the tags H1, H2, H3, …, H6, according to its relative importance on the page. Heading text should not be published using other tags.
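
One way to inspect how a page uses heading tags is to list the headings it actually declares. The sketch below uses Python's standard html.parser module; the names HeadingCollector and heading_outline are hypothetical, and the code does not perform syntactic validation. It only collects the text found inside H1 to H6 tags, so a page whose visible titles are missing from the result is probably publishing them with other tags.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Collects (level, text) pairs for every H1..H6 element."""

    def __init__(self):
        super().__init__()
        self.headings = []   # collected (level, text) pairs
        self._level = None   # level of the heading currently open, if any
        self._text = []      # text fragments of the open heading

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lowercase.
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        # Keep text only while a heading is open (nested inline tags included).
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def heading_outline(html):
    """Return the headings declared in an HTML string, in document order."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings
```

In this sketch, a title published as a DIV with a styling class is not reported, which is exactly the kind of semantic misuse described above.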