“Major Minors” on World Digital Preservation Day 2021

Last updated on December 13th, 2021 at 12:03 pm

The winners of the Arquivo.pt Award 2021 were the guests of the Arquivo.pt online session on World Digital Preservation Day 2021.

As in previous years, Arquivo.pt joined this international initiative by holding an open session, where useful knowledge was shared with the community.

Paulo Martins, Leandro Costa and José Carlos Ramalho, who guided this work, spoke about the “Major Minors” project and how they used the contents preserved by Arquivo.pt.

The “Major Minors” project is an ontology of press clippings from Portuguese newspapers with reference to social minorities. It aims to map and study the representation of minorities in the Portuguese journalistic context over the first two decades of the 21st century.

Please share the slides and the video.

Agenda

November 4th

3:00 pm – Welcome and news by Daniel Gomes (slides PDF, 3MB)
3:10 pm – Major Minors by Paulo Martins, Leandro Costa and José Carlos Ramalho (slides PDF, 5MB)
3:40 pm – Questions and answers
4:00 pm – End

Session video

Create automatic narratives about any topic!

Arquivo.pt provides a new function that allows you to automatically create temporal narratives on any topic.

The “Narrative” functionality, integrated into Arquivo.pt in September 2021, is the result of the collaboration between “Conta-me Histórias”, winner of the Arquivo.pt Award 2018, and Arquivo.pt.

The “Conta-me Histórias” (Tell me Stories) project was developed by researchers from the Laboratory of Artificial Intelligence and Decision Support (LIAAD – INESCTEC) who are affiliated with the Instituto Politécnico de Tomar – Center for Research in Smart Cities (CI2), the University of Porto and the University of Innsbruck.

How does it work?

When a user enters a set of words about a topic in the Arquivo.pt search box and clicks on the “Narrative” button, the user is directed to the “Conta-me Histórias” service, which automatically analyzes the news from 25 websites archived by Arquivo.pt over time and presents a chronology of news related to the topic.

For example, if we search for “Justin Bieber” and click on the “Narrative” button (Figure 1), we will be directed to the “Conta-me Histórias” service, where we will automatically obtain a narrative of archived news (Figure 2).

Figure 1: Search results for pages about “Justin Bieber”.

Figure 2: Narrative of news about “Justin Bieber” from Portuguese news sites preserved by Arquivo.pt generated by the “Conta-me Histórias” service.
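
To give a rough idea of what "a chronology of news related to the topic" means in practice, here is a minimal Python sketch that groups headlines by year. It is not the algorithm used by “Conta-me Histórias”. The function name `build_chronology`, the sample headlines and the 14-digit timestamp format (YYYYMMDDhhmmss) are assumptions made only for illustration.

```python
from collections import defaultdict
from datetime import datetime


def build_chronology(items):
    """Group (timestamp, headline) pairs by year, oldest first.

    Timestamps are assumed to be 14-digit strings (YYYYMMDDhhmmss).
    """
    by_year = defaultdict(list)
    # Fixed-width digit strings sort chronologically as plain strings.
    for tstamp, headline in sorted(items):
        year = datetime.strptime(tstamp, "%Y%m%d%H%M%S").year
        by_year[year].append(headline)
    return dict(by_year)


# Hypothetical sample data, not real archived headlines.
sample = [
    ("20150310120000", "Headline C"),
    ("20100105093000", "Headline A"),
    ("20100820181500", "Headline B"),
]

chronology = build_chronology(sample)
for year, headlines in chronology.items():
    print(year, headlines)
```

The real service also selects and summarises the most relevant news. This sketch only shows the ordering step.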

Create your narrative now!

“Conta-me Histórias” searches, analyzes and aggregates thousands of results to generate each narrative about a topic. We recommend choosing descriptive words about well-defined themes, personalities or events to obtain good narratives.

Creating a narrative is useful for researchers, journalists or citizens who want to quickly get an overview of how a topic evolved over time, saving them a lot of time and effort.

Go to Arquivo.pt and try to create a narrative about a theme of your choice.

Tell us about your experience so we can improve the service!

Book “The Past Web: Exploring Web Archives” available in Green Open access!

Last updated on March 3rd, 2022 at 09:49 am

Meet the new book The Past Web: Exploring Web Archives at arquivo.pt/book

The deadline to freely download the book was extended to 20th March!

Since 2006, no book had been published that reflects the state of the art in web preservation and the research that has been conducted on web archives.

The main goal of the new book The Past Web: Exploring Web Archives was to create a new, up-to-date resource to educate more people in the field of web preservation and to make web archives known to researchers and academics.

As such, the book is primarily aimed at the academic and scientific communities, and presents the most innovative methods for exploring information from the past preserved by web archives.

Daniel Gomes, manager of Arquivo.pt, led the book’s editorial team, which also included the field specialists Elena Demidova, Jane Winters and Thomas Risse. In total, the book resulted from the contributions of 40 authors from around the world who are experts in web archiving.

The book is divided into 6 parts where we find various resources for exploring pages archived from the Internet since the 1990s.

We can also learn how to preserve our collective memory in the Digital Era, which strategies to use when selecting online content, and what impact web archives have on preserving historical information.

The book aims to support professors in their mission to transmit innovative and adequate knowledge for the digital literacy required to train professionals for the 21st century.

The manager of Arquivo.pt stresses the need to include web archives in teaching plans and emphasizes that this knowledge brings a great competitive advantage, especially for students of Humanities and Social Sciences.

An innovative detail of this book is that all its cited links have been preserved by Arquivo.pt in order to ensure that the references remain valid over time.

The book is available for free to be downloaded from Portuguese higher education institutions (b-On member entities) until March 6th 2022!

If you do not belong to a Portuguese higher education institution, you can download a pre-print version of the book (Green Open Access).

Links

Image gallery

Presentation of the book “The Past Web” at the Museu de Leiria during the Jornadas FCCN


2019 websites available and Arquivo.pt surpasses 10 billion files

Last updated on December 16th, 2021 at 06:43 pm

The information collected from the Web during 2019 is now available in Arquivo.pt (respecting the embargo period of 1 year).

Screenshot from www.politico.eu preserved by Arquivo.pt, collected on June 18, 2019. Article about the Notre Dame fire in Paris, “Notre Dame fire ‘fully extinguished’ as fundraising starts”.

Remember and research historical events in 2019, such as

Arquivo.pt has visited 2 million sites and collected 1.7 billion files, 131 TB in total, so that you can access the memory of past events.

In 2021, Arquivo.pt provides open access to more than 10 billion files (721 TB) from 27 million websites.

Arquivo.pt certified as an open data provider

Last updated on September 16th, 2021 at 09:45 am

Arquivo.pt has been collaborating with Agência Modernização Administrativa (AMA) with the aim of improving the preservation of Public Administration websites.

Collaboration is based on three action points:

AMA is the public organisation responsible for promoting digital means in Public Administration and aims to modernise and simplify citizens’ access to State services.

Arquivo.pt is a service operated by the Fundação para a Ciência e Tecnologia I.P. that preserves data published on the Web between 1996 and the present day, making them accessible to any citizen for memory and research purposes.

EU open data directive includes documents on websites

The Directive (EU) 2019/1024 of the European Parliament and of the Council of 20 June 2019 on open data and the re-use of public sector information stipulates the following:

“(30) This Directive lays down the definition of the term ‘document’ and that definition should include any part of a document. The term ‘document’ should cover any representation of acts, facts or information — and any compilation of such acts, facts or information — whatever its medium (paper, or electronic form or as a sound, visual or audiovisual recording).

(34) To facilitate re-use, public sector bodies should, where possible and appropriate, make documents, including those published on websites, available through an open and machine-readable format and together with their metadata, at the best level of precision and granularity, in a format that ensures interoperability

(35) A document should be considered to be in a machine-readable format if it is in a file format that is structured in such a way that software applications can easily identify, recognise and extract specific data from it. Data encoded in files that are structured in a machine-readable format should be considered to be machine-readable data. A machine-readable format can be open or proprietary. They can be formal standards or not.

(60) The Commission should facilitate the cooperation among Member States and support the design, testing, implementation and deployment of interoperable electronic interfaces that enable more efficient and secure public services.”

Arquivo.pt is a public service that has the mission of preserving documents published on Internet sites to enable their long-term open access and provides interoperable electronic interfaces (APIs) for their automatic processing.

The Portuguese Law No. 68/2021 of 2021-08-26 approves the general principles on open data and transposes the European Directive.

Arquivo.pt was certified as a Public Administration open data provider

The AMA recognized Arquivo.pt as a public service and open data provider and awarded its certification seal on the Open Data Portal.

Arquivo.pt collects general information published on the Web of interest to the Portuguese community. However, it is also responsible for the preservation of Public Administration websites, such as the Portal do Governo, in collaboration with the Management Center for the Government Electronic Network (CEGER).

Any citizen can access the open data resulting from these historical archives and, for example, search for official information published on the websites of successive governments.

In 2021, Arquivo.pt provided open access to over 10 billion files (721 TB) from 27 million websites. The open data preserved by Arquivo.pt can be explored through the search interface, automatically through the API (https://arquivo.pt/api) or by reusing derived datasets.
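
As an illustration of automatic access, the following Python sketch builds a full-text search request restricted to a date range. The endpoint `https://arquivo.pt/textsearch` and the parameter names `q`, `from`, `to` and `maxItems` are assumptions that should be checked against the documentation at https://arquivo.pt/api. The helper `build_search_url` is hypothetical and only assembles the URL, without sending any request.

```python
from urllib.parse import urlencode

# Assumed endpoint; confirm it at https://arquivo.pt/api before relying on it.
TEXTSEARCH_ENDPOINT = "https://arquivo.pt/textsearch"


def build_search_url(query, start=None, end=None, max_items=10):
    """Assemble a search URL; parameter names are assumptions."""
    params = {"q": query, "maxItems": max_items}
    if start:
        params["from"] = start  # assumed format: YYYYMMDDhhmmss
    if end:
        params["to"] = end
    return f"{TEXTSEARCH_ENDPOINT}?{urlencode(params)}"


url = build_search_url(
    "Portal do Governo",
    start="20110101000000",
    end="20151231235959",
)
print(url)
```

The resulting URL could then be fetched with any HTTP client, for example `urllib.request.urlopen(url)`.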

Derived datasets available on the Open Data Portal

Besides the original web artefacts preserved at Arquivo.pt, this service has generated open datasets derived from its activities, which are now available in open access so that they can be reused:

Resources list

Presentations in the IIPC Web Archiving Conference and RESAW 2021

Last updated on August 17th, 2021 at 07:35 pm

During the week of 14 to 18 June, three international meetings were held by videoconference with the participation of Arquivo.pt:

    • International Internet Preservation Consortium (IIPC) – General Assembly – general assembly of the consortium that gathers the Web archiving initiatives around the world
    • Web Archiving Conference 2021 – the most important meeting in the field of Web preservation, where experts share new knowledge and experiences
    • RESAW Conference – meeting of the European RESAW network (Research Infrastructure for the Study of Archived Web Materials) this year in its 4th edition, mainly addressed to the community of researchers from non-technological scientific areas, such as Social Sciences, Arts and Humanities.

Contributions of Arquivo.pt to the international community

Arquivo.pt presented some results of the work developed over the last year, with emphasis on the functionalities that improve the reproduction of archived contents, such as “Complete the page”.
Two historical collections were integrated into Arquivo.pt: Geocities and the Internet Memory Foundation. Arquivo.pt also created special collections about the 2019 European Elections and Covid-19.
The contents of Arquivo.pt are accessible to any researcher regardless of the country they are in, which makes it a useful service for the international community.

Presentations

  • Arquivo.pt updates 2021: presentation at the IIPC – General Assembly, by Daniel Gomes (Video)
  • Complete the page. 1 minute drop in (presentation at the IIPC – General Assembly “complete the page”), by Daniel Gomes (Slide)
  • A transnational and cross-lingual crawl of the European Parliamentary Elections 2019, by Ivo Branco (Slides, Video)
  • Enhancing access to research the Geocities historical collection, by Pedro Gomes (Slides, Video)
Complete the page – demo. Slide used in the IIPC 1 minute presentation, at the IIPC General Assembly 2021

2021 Local Elections: We Need Your Help!

Last updated on August 2nd, 2021 at 02:22 pm

We have been emphasizing during our presentations that Arquivo.pt requires your collaboration to preserve information published on the Web related to Elections.

Campaign websites are historically relevant. However, they are difficult to identify because they appear and disappear quickly. Moreover, they are often exclusively referenced through printed media (e.g. posters).

That’s why your collaboration is essential!

To help, simply add addresses of pages or sites related to the Municipal Elections of 2021 through the following link:

Suggesting only 1 address related to your location will make a valuable contribution.

Can you help?

If you have any questions, please contact us.

Meet the winners of the Arquivo.pt Award 2021!

Last updated on February 18th, 2022 at 12:39 pm


The winners of the Arquivo.pt Award 2021 were announced by the Público newspaper, the official media partner of this year’s edition, which granted an honorable mention to the best work based on the contents of the newspaper. A total of 26 candidate works were received.

The award ceremony took place during Science 2021 – Meeting with Science and Technology, on June 30, at the Lisbon Congress Center.

1st place – “Major Minors”

The winner of the 10,000 euro prize was the work “Major Minors” by Paulo Martins and Leandro Costa.

“Major Minors” is an ontology of press clippings from Portuguese newspapers with reference to social minorities.

2nd place – “Politiquices”

The 2nd prize in the amount of 3,000 euros was awarded to the work “Politiquices” developed by David Batista.

“Politiquices” is a Web application that allows searching for relations of support or opposition between political personalities and parties expressed in news headlines preserved at Arquivo.pt.

This interface makes it possible to analyse the relationship of support or opposition between two political personalities or organisations.

3rd place – “Primeiras páginas de jornais online portugueses”

The 3rd prize, in the amount of 2,000 euros, was awarded to the work “Primeiras páginas de jornais online portugueses”, developed by Susana Parreira under the supervision of Ana Sabino, Ana Boavida and Penousal Machado.

“Primeiras páginas de jornais online portugueses” (Front pages of Portuguese online newspapers) presents an interactive graphical analysis of the front pages of Portuguese online newspapers. For this study, specific items within the newspaper design were analysed, thus allowing trends to be observed over time.

As a result we have a Web interface that allows interactively visualising, for example, the space occupied by the images on the Público newspaper front page.

Público Honorable Mention

Público newspaper, official partner of the 4th edition of the Arquivo.pt Award, awarded its Honorable Mention to the work “Primeiras páginas de jornais online portugueses”.

Videos

Gallery

Arquivo.pt Award 2021 ceremony

Photos by Valter Gouveia – FCT | FCCN | Arquivo.pt

Internet Memory Foundation collection available in Arquivo.pt

Last updated on September 15th, 2021 at 09:29 am

The historical collection of web content generated during the Internet Memory Foundation’s (IMF) activity has been donated to Arquivo.pt and is now searchable!

The IMF was a European organization dedicated to preserving web content. It was wound up in 2018.

The 1st web archiving project in Europe (2004-2010) was led by Julien Masanès (who was guest of honour at the celebration of 10 years of Arquivo.pt) and was called European Archive Foundation.

In 2010, Julien Masanès, the “father” of Web archives in Europe, created the IMF.

Examples of pages from the collection donated by the IMF

The collection donated by the IMF has now been integrated in the Arquivo.pt collection to be preserved for posterity.

This collection is composed of 142 million files that total 6.3 TB of historical information whose texts or images can now be searched through Arquivo.pt.

Life Science Competence in Europe portal, 2009.

LIMES project homepage (Land and Sea Monitoring for Environment and Security), 2009.

Project Intelligence-territoriale homepage, 2009.

European Parliament news page on the 20th anniversary of the fall of the Berlin Wall, 2009.

Le Figaro on the French presidential election, 2012.

Reuters news item about WikiLeaks, 2011.

Internet Memory Foundation homepage, 2014.

Search this new collection!

This new collection has been named “InternetMemory” in the Arquivo.pt collections list.

Searches can be made on this collection using the collection search parameter or through the custom search page available at arquivo.pt/InternetMemory.
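
A search restricted to this collection might be built as in the minimal Python sketch below. It assumes that the search parameter is literally named `collection` and that it is accepted by a `https://arquivo.pt/textsearch` endpoint. Both are assumptions to verify in the Arquivo.pt API documentation, and `collection_search_url` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def collection_search_url(query, collection="InternetMemory"):
    """Build a search URL limited to one collection.

    The endpoint and the "collection" parameter name are assumptions.
    """
    params = {"q": query, "collection": collection}
    return "https://arquivo.pt/textsearch?" + urlencode(params)


url = collection_search_url("European Parliament")
# Decode the query string again to inspect what would be sent.
sent_params = parse_qs(urlsplit(url).query)
print(url)
print(sent_params)
```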


“Art Forever on the Web”: Cycle of Webinars

Last updated on July 6th, 2021 at 01:23 pm

Colectiva de Artistas. 2008.04.19 a 2008.06.07. Galeria Quadrado Azul. Porto. Composition from a Webpage preserved on Arquivo.pt: www.quadradoazul.pt, 22nd October 2008.

On April 29, May 27 and July 1, from 3 to 4:30 pm, webinars geared to the community of artists, curators, gallerists and event producers will be held. They are also open to anyone interested in learning more about preserving art websites.

Throughout the sessions, participants will learn in detail about the functionalities of Arquivo.pt in order to take advantage of this public Web preservation service. They will have technical information, in the form of recommendations and best practices, to create preservable websites. Finally, they will learn how to use available tools to save their websites in a standardized format so that their contents are not lost.

This cycle of Webinars is an initiative of the “Forever” Project, a collaboration between the Calouste Gulbenkian Foundation Art Library and Arquivo.pt under the ROSSIO infrastructure.

For more details and sharing, please see the program (PDF) (in Portuguese).

Sign up!

April 29 – Arquivo.pt and the preservation of digital memory
May 27 – Recommendations for creating preservable websites for the future
July 1 – Archiving the Web: do-it-yourself!

Presentations from past sessions