Arquivo.pt improves advanced search

On June 23, 2017, Arquivo.pt launched a new version, called Venus, in which the Advanced Search operators were improved.

A notable improvement is the site search operator, which restricts searches to specific preserved sites. For example:

The reproduction of preserved pages containing error messages was also reduced.

We archived the Web pages of the Portuguese Parliamentary Elections of 2015!

Last updated on January 17th, 2018 at 10:45 am

Arquivo.pt performed 4 crawls of Web pages related to the Portuguese Parliamentary Elections of 2015.

We asked the community to contribute by suggesting Web pages related to the Parliamentary Elections of 2015 so that they could be archived.

The 4 crawls took place during and after the election campaign period and used the list of 127 Web pages suggested by the community. In total, they archived 2,802,407 Web resources, which occupy 274 GB.

The collected Web pages include those of the running political parties, media news about the elections, blogs, opinion articles, and satirical political Web pages.

Arquivo.pt respects an embargo period of 1 year, so the archived collection will only become available by the end of 2016.

However, you can already consult some archived Web pages from previous Portuguese Parliamentary Elections, such as:

We would like to thank all the volunteers who helped with this initiative.
Now we need your help in suggesting Web pages about the Portuguese Presidential Elections.
Can we count on you?

Scientific study presents a search log analysis of a search engine

Last updated on December 20th, 2019 at 03:54 pm

This research presents a characterization of the information-seeking behaviour of the users of a Portuguese web search engine, based on the analysis of its logs.

The paper A Search Log Analysis of a Portuguese Web Search Engine, by Miguel Costa and Mário J. Silva, was presented at INForum 2010 – Simpósio de Informática, in Braga, Portugal.

Paper presented at EPIA 2009

Last updated on October 2nd, 2017 at 10:51 am

An Updated Portrait of the Portuguese Web presented at EPIA 2009

The paper An Updated Portrait of the Portuguese Web, by João Miranda and Daniel Gomes, was presented at the 14th Portuguese Conference on Artificial Intelligence (EPIA 2009) in Aveiro.

This paper presents a characterization of the Portuguese Web derived from a crawl performed by the Portuguese Web Archive in March 2008, comprising 48 million documents and 2.5 TB of data.

Session at ISCTE “Arquivo.pt as an infrastructure for research in Social Sciences and Humanities”

Last updated on September 28th, 2017 at 11:13 am

Session at ISCTE (Lisbon) “Arquivo.pt as an infrastructure for research in Social Sciences and Humanities”

Did you miss it?

No problem. Here are all the presentations:

Portuguese Web Archive – a Memory Infrastructure @DLM2014

Last updated on December 20th, 2019 at 05:18 pm

Presentation about the Arquivo.pt service and the importance of web archiving for preserving the memory of Humanity.

The presentation takes place on Thursday, 13 November, at 17:15 in Lisbon, at the DLM Forum – Making the Information Governance Landscape in Europe.

The Forum will be held at Instituto Superior Técnico.

WWW 2013: Search the Past with the Portuguese Web Archive

Last updated on September 28th, 2017 at 01:29 pm

The Portuguese Web Archive (PWA) is at the World Wide Web Conference (WWW 2013) in Rio de Janeiro, Brazil, with a demo session.

The demo at WWW 2013 presents the Portuguese Web Archive, which enables search over 1.6 billion files archived from 1996 to 2012.

New video: “The Portuguese Web Archive and the open access to scientific knowledge”

Last updated on December 20th, 2019 at 05:31 pm

Web archiving helps to empower open access to science.

There is a growing amount of open access scientific knowledge published on the Web.

This video discusses the importance of web archiving in empowering open access to science.

Technical report documents the creation of a searchable web archive

Last updated on September 29th, 2017 at 02:17 pm

This report presents some of the work developed to create an efficient and effective web archive service, from data acquisition to user interface design.

The results of this research were applied to create the Portuguese Web Archive, which has been publicly available since January 2010. It supports full-text search over 1 billion files archived from 1996 to 2010. The developed software is available as an open source project.

I. P. Santarém, 7th and 8th Feb.: learn more about the Portuguese Web Archive

Last updated on September 29th, 2017 at 02:22 pm

Come and meet the Archive’s team.

The Portuguese Web Archive will be presented at Jornadas FCCN on the 7th and 8th of February 2012, with the following activities (in Portuguese):