The information collected from the Web during 2019 is now available in Arquivo.pt (in accordance with the 1-year embargo period).
Screenshot of www.politico.eu preserved by Arquivo.pt, collected on June 18, 2019. Article about the Notre Dame fire in Paris, “Notre Dame fire ‘fully extinguished’ as fundraising starts”.
Remember and research historical events in 2019, such as
training in preserving open data published online.
AMA is the public organisation responsible for promoting digital means in Public Administration and aims to modernise and simplify citizens’ access to State services.
Arquivo.pt is a service operated by the Fundação para a Ciência e Tecnologia I.P. that preserves data published on the Web between 1996 and the present day, making them accessible to any citizen for memory and research purposes.
EU open data directive includes documents on websites
“(30) This Directive lays down the definition of the term ‘document’ and that definition should include any part of a document. The term ‘document’ should cover any representation of acts, facts or information — and any compilation of such acts, facts or information — whatever its medium (paper, or electronic form or as a sound, visual or audiovisual recording).
…
(34) To facilitate re-use, public sector bodies should, where possible and appropriate, make documents, including those published on websites, available through an open and machine-readable format and together with their metadata, at the best level of precision and granularity, in a format that ensures interoperability
…
(35) A document should be considered to be in a machine-readable format if it is in a file format that is structured in such a way that software applications can easily identify, recognise and extract specific data from it. Data encoded in files that are structured in a machine-readable format should be considered to be machine-readable data. A machine-readable format can be open or proprietary. They can be formal standards or not.
…
(60) The Commission should facilitate the cooperation among Member States and support the design, testing, implementation and deployment of interoperable electronic interfaces that enable more efficient and secure public services.
…
Arquivo.pt is a public service that has the mission of preserving documents published on Internet sites to enable their long-term open access and provides interoperable electronic interfaces (APIs) for their automatic processing.
Any citizen can access the open data resulting from these historical archives and, for example, search for official information published on the websites of successive governments.
In 2021, Arquivo.pt provided open access to over 10 billion files (721 TB) from 27 million websites. The open data preserved by Arquivo.pt can be explored through the search interface, automatically through the API (https://arquivo.pt/api), or by reusing derived datasets.
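As a rough illustration of automatic access, the Python sketch below builds a full-text search request limited to pages collected during 2019. The endpoint path (`/textsearch`) and the parameter names (`q`, `from`, `to`, `maxItems`) are assumptions based on the public API documentation at https://arquivo.pt/api and should be checked there before use. The helper names `build_search_url` and `search` are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; confirm it against https://arquivo.pt/api.
TEXTSEARCH_ENDPOINT = "https://arquivo.pt/textsearch"


def build_search_url(query, date_from=None, date_to=None, max_items=10):
    """Build a search URL; dates are assumed to use the YYYYMMDDHHMMSS form."""
    params = {"q": query, "maxItems": max_items}
    if date_from:
        params["from"] = date_from
    if date_to:
        params["to"] = date_to
    return f"{TEXTSEARCH_ENDPOINT}?{urlencode(params)}"


def search(query, **kwargs):
    """Send the request and return the decoded JSON response (needs network access)."""
    with urlopen(build_search_url(query, **kwargs), timeout=30) as response:
        return json.load(response)


# Only the URL is built here; call search(...) to actually query the service.
url = build_search_url(
    "Notre Dame",
    date_from="20190101000000",
    date_to="20191231235959",
)
print(url)
```

Keeping URL construction separate from the network call makes it easy to inspect the request before sending it.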
Derived datasets available on the Open Data Portal
Besides the original web artefacts preserved at Arquivo.pt, this service has generated open datasets derived from its activities, which are now available in open access so that they can be reused:
Web Archiving Conference 2021 – the most important meeting in the field of Web preservation, where experts share new knowledge and experiences
RESAW Conference – meeting of the European RESAW network (Research Infrastructure for the Study of Archived Web Materials) this year in its 4th edition, mainly addressed to the community of researchers from non-technological scientific areas, such as Social Sciences, Arts and Humanities.
Contributions of Arquivo.pt to the international community
Arquivo.pt presented some results of the work developed over the last year, with emphasis on the functionalities that improve the reproduction of archived content, such as the “Complete the page” feature.
Two historical collections were integrated into Arquivo.pt: Geocities and the Internet Memory Foundation. Arquivo.pt also created special collections on the 2019 European Elections and Covid-19.
The contents of Arquivo.pt are accessible to any researcher regardless of the country they are in and therefore it is a useful service to the international community.
Presentations
Arquivo.pt updates 2021: presentation at the IIPC – General Assembly, by Daniel Gomes (Video)
Complete the page. 1 minute drop in (presentation at the IIPC – General Assembly “complete the page”), by Daniel Gomes (Slide, Video)
A transnational and cross-lingual crawl of the European Parliamentary Elections 2019, by Ivo Branco (Slides, Video)
Enhancing access to research the Geocities historical collection, by Pedro Gomes (Slides, Video)
Complete the page – demo. Slide used in the IIPC 1 minute presentation, at the IIPC General Assembly 2021
Campaign websites are historically relevant. However, they are difficult to identify because they appear and disappear quickly. Moreover, they are often exclusively referenced through printed media (e.g. posters).
That’s why your collaboration is essential!
To help, simply add addresses of pages or sites related to the Municipal Elections of 2017 through the following link:
The winners of the Arquivo.pt 2020 Award were announced by the Público newspaper, the official media partner of this year’s edition, which granted an honorable mention to the best work based on the contents of the newspaper. A total of 26 candidate works were received.
The 2nd prize in the amount of 3,000 euros was awarded to the work “Politiquices” developed by David Batista.
“Politiquices” is a Web application that allows users to search for relations of support or opposition between political personalities and parties, as expressed in news headlines preserved at Arquivo.pt.
This interface makes it possible to analyse the relationship of support or opposition between two political personalities or organisations.
3rd place – “Primeiras páginas de jornais online portugueses”
The 3rd prize, in the amount of 2,000 euros, was awarded to the work “Primeiras páginas de jornais online portugueses”, developed by Susana Parreira under the supervision of Ana Sabino, Ana Boavida and Penousal Machado.
“Primeiras páginas de jornais online portugueses” (Front pages of Portuguese online newspapers) presents an interactive graphical analysis of the front pages of Portuguese online newspapers. For this study, specific items within the newspaper design were analysed, thus allowing trends to be observed over time.
The result is a Web interface for interactively visualising, for example, the space occupied by images on the Público newspaper front page.
Público newspaper, official partner of the 4th edition of the Arquivo.pt Prize, awarded its Honorable Mention to the work “Primeiras páginas de jornais online portugueses”.
The historical collection of web content generated during the Internet Memory Foundation’s (IMF) activity has been donated to Arquivo.pt and is now searchable!
The IMF, a European organization dedicated to preserving web content, was wound up in 2018.
In 2010, Julien Masanès, the “father” of Web archives in Europe, created the IMF.
Examples of pages from the collection donated by the IMF
The collection donated by the IMF has now been integrated in the Arquivo.pt collection to be preserved for posterity.
This collection is composed of 142 million files that total 6.3 TB of historical information whose texts or images can now be searched through Arquivo.pt.
This new collection has been named “InternetMemory” in the Arquivo.pt collections list.
Searches can be made on this collection using the collection search parameter or through the custom search page available at arquivo.pt/InternetMemory.
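A minimal sketch of such a restricted search in Python follows. It assumes that the full-text search endpoint is `https://arquivo.pt/textsearch` and that the filter is passed as a `collection` query parameter, so both should be verified against the API documentation. The function name `build_collection_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint and parameter names; verify them in the Arquivo.pt API documentation.
TEXTSEARCH_ENDPOINT = "https://arquivo.pt/textsearch"


def build_collection_search_url(query, collection, max_items=10):
    """Build a search URL restricted to a single named collection."""
    params = {"q": query, "collection": collection, "maxItems": max_items}
    return f"{TEXTSEARCH_ENDPOINT}?{urlencode(params)}"


# "InternetMemory" is the collection name given in the text above.
imf_url = build_collection_search_url("web archive", "InternetMemory")
print(imf_url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client.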
Colectiva de Artistas. 2008.04.19 to 2008.06.07. Galeria Quadrado Azul. Porto. Composition from a Web page preserved on Arquivo.pt: www.quadradoazul.pt, 22nd October 2008.
On April 29, May 27 and July 1, from 3 to 4:30 pm, webinars geared to the community of artists, curators, gallerists and event producers will be held, open also to anyone interested in learning more about preserving art websites.
Throughout the sessions, participants will learn in detail about the functionalities of Arquivo.pt in order to take advantage of this public Web preservation service. They will receive technical information, in the form of recommendations and best practices, for creating preservable websites. Finally, they will learn how to use available tools to save their websites in a standardized format so that their contents are not lost.
This cycle of Webinars is an initiative of the “Forever” Project, a collaboration between the Calouste Gulbenkian Foundation Art Library and Arquivo.pt under the ROSSIO infrastructure.
In this 4th edition of the annual Arquivo.pt Award, € 15,000 will be awarded to the 3 best works (1st place: € 10,000).
The deadline for submissions is May 4, 2021.
Works may be developed individually or in groups on any topic, as long as they use the information provided by Arquivo.pt as the main source of information.
The Público Newspaper is the official media partner of the Arquivo.pt Award in 2021. It was one of the first newspapers to become available online.
Público will award an Honorable Mention to one of the works that focus on the historical web archive of Público online.