We issued a press release about Arquivo.pt

On 12 January 2016 we issued a press release about Arquivo.pt, in which we explain our service, its history, its present state, its collaborations and its future challenges.

Arquivo.pt is a free service that allows searching web data archived since 1996. The research infrastructure of Arquivo.pt focuses mainly on preserving content relevant to the Portuguese community.

There are currently about 2,700 million archived files, which correspond to 95 TB of information. Anyone can suggest interesting sites to be archived by going to https://sobre.arquivo.pt/collaborate/suggest-a-site.

In 2015, Arquivo.pt collected 580 million files for preservation, and the search service registered an average of 3,692 users per month, 90% of whom were new users.

In 2016, Arquivo.pt will make archived web pages from the .eu domain available, and the Arquivo.pt team will work on a prototype that will allow searching and viewing archived images.

View full text (in Portuguese)


Ask us for free dissemination materials!

Arquivo.pt – the Portuguese Web Archive is a nonprofit public service that requires dissemination.

We produced the following dissemination materials:

Your help in disseminating Arquivo.pt is essential for this service to become useful to an increasing number of people.

Ask us for materials to disseminate Arquivo.pt at your institution or event. We will send them to you free of charge.

Thank you.

Learn more:


Fénix: the new release of Arquivo.pt!

We are pleased to announce that we have launched a new release of Arquivo.pt – the Portuguese Web Archive, after a two-year suspension of the service's development.

This release was aptly named Fénix (Phoenix). We resolved 20 issues, which resulted in the following improvements:

Informative site (sobre.arquivo.pt):

  • Review and update of all content (152 pages in Portuguese and English);
  • Reorganization of the information architecture;
  • Fixes for functional and usability errors;
  • Fixes for inconsistencies in graphical style.

Search and access (www.arquivo.pt):

  • Interoperability improvements (e.g. URLs following the Wayback syntax, OpenSearch API fixes);
  • Information updates;
  • Error corrections on the user interface;
  • Introductory video included on the homepage.
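
As an illustration of the Wayback URL syntax mentioned above, the following minimal Python sketch builds a replay URL in the common form prefix/YYYYMMDDhhmmss/original-URL. The function name wayback_url and the replay prefix https://arquivo.pt/wayback are assumptions made for this example and are not taken from the release notes; the actual prefix used by the service may differ.

```python
from datetime import datetime


def wayback_url(base: str, captured_at: datetime, original_url: str) -> str:
    """Build a Wayback-style replay URL: <base>/<YYYYMMDDhhmmss>/<original URL>.

    `base` is a hypothetical replay prefix supplied by the caller.
    """
    # Wayback timestamps are 14 digits: year, month, day, hour, minute, second.
    timestamp = captured_at.strftime("%Y%m%d%H%M%S")
    return f"{base.rstrip('/')}/{timestamp}/{original_url}"


if __name__ == "__main__":
    # Assumed prefix and example page, for illustration only.
    print(wayback_url("https://arquivo.pt/wayback",
                      datetime(1996, 10, 13, 14, 56, 50),
                      "http://www.fccn.pt/"))
```

Keeping the timestamp and the original URL in the path is what makes such links interoperable: a tool that understands this syntax can build a link to an archived page without any service-specific parameters.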

We have also migrated our open source project to GitHub (pwa_technologies), where you can find more details about the Fénix release.

Comments and suggestions are most welcome!


A first attempt to archive the .EU domain

The Portuguese Web Archive attempts to archive .EU web sites.

The .EU domain is commonly used for sites related to Europe. The strategy adopted to archive the World Wide Web has been to delegate responsibility for each domain to the respective national archiving institution. However, the .EU domain does not fit this model because it covers multiple nations. Thus, the preservation of .EU sites has not yet been assigned to or undertaken by any institution.

RESAW is a European network that aims to create a Research Infrastructure for the Study of Archived Web Materials (resaw.eu).

The Portuguese Web Archive made a first attempt to crawl and preserve web sites hosted under the .EU domain, within the scope of RESAW activities. This first crawl began on 21 November 2014 and finished on 16 December 2014.

As future work, we intend to perform two more crawls of the .EU domain. Each .EU crawl will be indexed and become searchable through archive.pt one year after its finish date.

Collaborations with researchers interested in studying the collected web data sets or crawl logs are welcome.

Resources


PhD “Information Search in Web Archives”: slides and video

Our former colleague Miguel Costa defended his PhD thesis at the University of Lisbon on 4 November 2014. The slides and video are available!
