Last updated on August 1st, 2017 at 01:52 pm
For a site to be archived, it is essential that it has a crawler-friendly homepage.
The Portuguese Web Archive crawler archives the web by first crawling the homepages of sites (e.g. http://www.fccn.pt) and then following links to the remaining content.
If the crawler cannot process the homepage of a site, it will not be able to find the links to the rest of the site's content. Therefore, to create crawler-friendly homepages:
- Prefer the HTML format;
- Ensure that all content can be reached by following links from the homepage;
- Do not create homepages composed exclusively of images or animations (e.g. Flash). If you must create a homepage of this kind, provide an alternative version of the homepage in HTML format.
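
The second recommendation can be checked before a site is published. The following is a minimal sketch, not the Portuguese Web Archive crawler itself: it assumes the pages of a site are available as an in-memory mapping from URL to HTML source, and the names `site`, `LinkExtractor`, `reachable_pages` and the example.org URLs are hypothetical. It walks the links starting at the homepage, as a crawler would, and reports the pages that were never reached.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def reachable_pages(site, homepage):
    """Breadth-first walk over `site`, a dict mapping URL -> HTML source.

    Returns the set of URLs that can be reached from `homepage` by links.
    """
    seen = {homepage}
    queue = deque([homepage])
    while queue:
        url = queue.popleft()
        parser = LinkExtractor()
        parser.feed(site.get(url, ""))
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            target, _ = urldefrag(urljoin(url, href))
            if target in site and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical three-page site held in memory (no network access).
site = {
    "http://example.org/": '<a href="/about.html">About</a>',
    "http://example.org/about.html": '<a href="/">Home</a>',
    "http://example.org/orphan.html": "<p>No page links here.</p>",
}
unreachable = set(site) - reachable_pages(site, "http://example.org/")
print(sorted(unreachable))  # ['http://example.org/orphan.html']
```

A homepage made only of images or Flash would yield no `<a>` links at all in this sketch, so every other page would be reported as unreachable, which is the situation the third recommendation is meant to avoid.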