Last updated on September 28th, 2017 at 01:26 pm
Several organizations around the world are struggling to archive information from the web before it vanishes. However, users demand efficient and effective search mechanisms to access the already vast collections of historical information held by web archives. The Portuguese Web Archive is the largest full-text searchable web archive publicly available. It supports search over 1.2 billion files archived from the web since 1996.
The paper "Creating a Billion-Scale Searchable Web Archive" was presented at the Temporal Web Analytics Workshop 2013 in Rio de Janeiro, Brazil.