Crawler-friendly homepage

Last updated on August 1st, 2017 at 01:52 pm

For a site to be archived, it is essential that the site has a crawler-friendly homepage.

The Portuguese Web Archive crawler archives the web by first crawling the homepages of sites and then following links to the remaining content.
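
The crawl order described above can be illustrated with a minimal sketch. It is not the Portuguese Web Archive's actual crawler: the in-memory `SITE` map, the page paths and the breadth-first order are all assumptions chosen for illustration. It shows that a page with no link path from the homepage is never reached.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
SITE = {
    "/": ["/about", "/news"],
    "/about": ["/"],
    "/news": ["/news/2017"],
    "/news/2017": [],
    "/orphan": [],  # no other page links to this one
}

def crawl(site, start="/"):
    """Return pages in the order a breadth-first crawl from `start` visits them."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        # Follow every link on the page that has not been seen yet.
        for link in site.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

visited = crawl(SITE)
unreached = sorted(set(SITE) - set(visited))
print(visited)
print(unreached)
```

In this sketch `/orphan` ends up in `unreached` because no page links to it, so a crawler that starts at the homepage has no way to discover it.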

If the crawler cannot process the homepage of a site, it will not be able to find the links to the rest of the site's content. Therefore, to create a crawler-friendly homepage:

  • Prefer the HTML format;
  • Ensure that all content can be reached by following links from the homepage;
  • Do not create homepages composed exclusively of images or animations (e.g. Flash). If you must create a homepage of this kind, provide an alternative version of the homepage in HTML format.
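
One way to check a homepage against these recommendations is to count the links that a simple HTML parser can extract from it. The sketch below uses Python's standard `html.parser` module. The `LinkCollector` class, the `homepage_links` function and the two sample pages are hypothetical, and a real crawler may discover links by other means as well.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def homepage_links(html):
    """Return the list of link targets found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links

# Hypothetical homepages: one made only of an image, one with HTML links.
IMAGE_ONLY = '<html><body><img src="splash.png"></body></html>'
WITH_LINKS = (
    '<html><body><img src="splash.png">'
    '<a href="/about">About</a><a href="/news">News</a>'
    '</body></html>'
)

print(homepage_links(IMAGE_ONLY))
print(homepage_links(WITH_LINKS))
```

A homepage for which this check finds no links, such as the image-only sample, gives a link-following crawler nothing to continue from.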