World Digital Preservation Day celebrated at Portuguese National Archive Torre do Tombo

Last updated on November 18th, 2024 at 11:23 am

Let’s talk about preservation and access!

On November 7, 2024, the New Paths to Information Preservation and Access Meeting was held, organised jointly by Arquivo.pt and the Arquivo de Ciência e Tecnologia, the former located on Avenida do Brasil and the latter on Avenida D. Carlos I, in Lisbon, both services of the Fundação para a Ciência e a Tecnologia (FCT).

The aim of this joint FCT team was precisely to bring about the meeting and sharing of experiences between various institutions that inevitably have to manage information, both in traditional formats such as paper and in digital formats.

The meeting had 243 participants and 29 speakers throughout the day. Nine of the twenty-seven presentations were submitted for a session called ‘Community Space’.

The Portuguese Association of Librarians, Archivists, Information and Documentation Professionals APBAD made an important contribution to publicising the event to the community and was present with an information stand.

An international day dedicated to digital preservation

World Digital Preservation Day was celebrated on the same day. It is an initiative of the Digital Preservation Coalition (DPC), with which Arquivo.pt has been associated since the first edition in 2017. Jane Winters, Chair of the DPC, sent a video message to join this initiative in Portugal.

Digital information was the main theme of the speeches. At the opening, the Head of the DGLAB – Direção Geral do Livro, dos Arquivos e das Bibliotecas (Directorate for Books, Archives and Libraries), Silvestre Lacerda, recalled that the DGLAB was a pioneer among public organisations in tackling the issue of digital preservation. FCT vice-president Francisco Santos emphasised the economic value of data for scientific research.

Digital preservation is not just about technology, as Henrique São Mamede, Professor at Universidade Aberta, INESC TEC, said at the opening conference. It’s also about people, the human factor, the environment outside organisations and new sensibilities such as sustainability and ecology. Hence the importance of creating bridges, of using Artificial Intelligence, for example, in conjunction with ethics.

Throughout the day, four panels brought together presentations on various preservation contexts such as the digitisation of sound, image and video, research data, regulatory frameworks, management systems for digitised or born-digital information, dissemination and access, and use in academic research.

Panel 1: Digital preservation initiatives and realities

The first panel was moderated by João Gomes, Director of Advanced Services at FCT, and brought to the table the diversity of contexts in which the issue of preservation and access arises. Here we highlight one aspect of each presentation and invite you to follow the links to learn more about these initiatives.

Moisés Rockembach, Professor at the University of Coimbra and co-author of the book Arquivamento da web e preservação digital (Web archiving and digital preservation), spoke about the first initiatives carried out in Brazil to preserve content published on the Web. The websites of the candidates in the Brazilian elections, for example, are ephemeral by nature but have become material for historiographical research by being preserved in a web archive. From a more theoretical perspective, he addressed the issue of memory. Preserving the web allows us to bring to light events that were only broadcast on digital media such as the web and, in this sense, postpones the end of history expressed in the metaphor of the ‘Dark Age’, a time of darkness, empty of information.

Pedro Penteado, Director of Archival and Standardisation Services, presented a set of instruments that the DGLAB has developed, such as the Macro Estrutura Funcional (MEF) (Macro Functional Structure), the Avaliação Suprainstitucional da Informação Arquivística (ASIA) (Super-institutional Assessment of Archival Information) project and the Lista Consolidada na Plataforma CLAV (Consolidated List on the CLAV Platform), which allow the different public administration bodies to comply with legislation and standardise classification and assessment practices. He recalled that these tools are flexible enough to meet the specific needs of organisations.

Pedro Príncipe, Head of the Documentation Services Division at the University of Minho, spoke about research data. The preservation of and access to data is fundamental to the production of science. To achieve this, it is necessary to combine initiatives, work in networks and create communities of practice. The GDI Forum is an example of how useful it is for professionals to meet. Certification is highly recommended, as demonstrated by the University of Minho, which has certified its repository: it makes the repository more robust and helps to achieve the FAIR (Findable, Accessible, Interoperable, and Reusable) objectives.

Hilário Lopes, RTP’s Deputy Director of Institutional Relations and Archive, described the path to digitalisation that has completely changed the way we access the archive of RTP (Portuguese Radio and Television). Until 2001, digitisation was done on request; from that year onwards, the contents were digitised on a massive scale. Since 2007, the contents have been accessible in digital format, which has facilitated access and use. RTP Memória and Portal RTP are two examples of access to the audiovisual heritage of public radio and television.

Panel 2: Preserving and reusing Web information

The theme of web archiving was highlighted in the second panel, moderated by Daniel Gomes, manager of Arquivo.pt and its initiator on 8 November 2007.

Ricardo Basílio, digital curator at Arquivo.pt, presented the online exhibition ‘Memories of 25 April on the Internet’, created in collaboration with the 50 Years of 25 April Commemorative Commission, based on preserved web pages. Selected pages about the 25 April celebrations across the country were highlighted through a guided tour of the exhibition.

Joana Paulino, historian and researcher at the Faculdade de Ciências Sociais e Humanas da Universidade Nova de Lisboa, showed how technologies contribute to the development of studies in areas traditionally far removed from technologies, based on her experience at the Digital Humanities Laboratory.

António Campos and Hélder Mestre, from the Arquivo Municipal de Sines (Sines City Council Archive), showed how, since 2020, they have been preserving web content of local interest in collaboration with Arquivo.pt. They record web pages with ArchiveWeb.page, a Webrecorder tool, send a copy of the files to Arquivo.pt, transcribe images and videos verbatim, and also use PDF as the most traditional format for archiving news. The issue of accessibility to content for people with special needs is fundamental in the preservation process.

António Ramiro and Carmen Fonseca, winners of the Arquivo.pt 2024 Award, presented their work Noticioso.pt. It’s a project that reuses information from Arquivo.pt to challenge citizens’ critical capacity.

Finally, Daniel Gomes emphasised how much has been done in the last 17 years in the field of web preservation, to the point where we now have a functional service that everyone can use. As a testimony to those early days, we found a page from Diário Digital newspaper from November 2006.

Panel 3: Preserving the present and safeguarding the future

The third panel was moderated by Paula Meireles, Coordinator of the Archive, Documentation and Information service at the Foundation for Science and Technology (FCT) and brought four other realities to the table.

Filipe Guimarães Silva, Executive Director of the Fundação Mário Soares e Maria Barroso, and António Coelho, Digital Reproduction Coordinator, delved into the technical issues related to digitisation, based on the case of the foundation’s collection, which is also accessible on the Casa Comum portal. Quality control is the most important factor in obtaining a preservable digital version. You don’t always need expensive technology to get good results. It is essential to follow standards and ensure that quality metadata is generated.

Fernanda Gonçalves, Director of Archives at the São João Local Health Unit, showed how the São João Digital Clinical Repository is transforming access to clinical files with advantages in terms of both speed and quality of information. The information management model at this huge institution poses immense challenges for preservation and continued access, as it involves creating interoperability between multiple systems. What’s more, this is sensitive data with different levels of access. This is where the archive comes in as an asset. The archive service must rise to the challenges of any organisation in order to serve all its ‘clients’.

Augusto Ribeiro, head of the Documentation and Information Management Service at UPdigital, University of Porto, explained how the university collection is being preserved. From the treatment of paper documents to their digitisation and inclusion in the digital repository, it’s important to guarantee their robustness. This work has been progressive and systematic, i.e. it follows a plan where all the pieces fit together as the work is carried out.

Pedro Penteado (DGLAB) presented the ‘Digital Preservation Guide’ project that is being developed in collaboration with the Asociación Latinoamericana de Archivos (ALA). This initiative will structure content on digital preservation in a pragmatic way. Soon, professionals will have a knowledge base to consult whenever they carry out digital preservation activities.

Panel 4: Community space

The fourth panel, moderated by Paula Carvalho, from FCT’s Science and Technology Archive, included 9 short presentations submitted by the community. Below, we present the abstracts submitted by the authors:

Celebrating the 50th anniversary of 25 de Abril at the closing session

Maria Inácia Rezola, Executive Commissioner of the Mission Structure for the Commemorations of the 50th Anniversary of the Revolution of 25 de Abril 1974, presented a historical perspective of the impact of 25 April on Portuguese society, namely through the way it is commemorated throughout the country.

She showed the work that the Commission has been doing to identify archives, documentation centres and collections of all kinds with material about 25 April. There are public collections that are practically unknown, and others that are in private hands. Inventorying and publicising them is therefore the first step in promoting study and knowledge about 25 de Abril.

Finally, Maria Inácia Rezola announced the award of the Honourable Mention ‘25 de Abril and Democracy’, together with a prize of 5,000 euros, in the Arquivo.pt Award 2025, to the best work on 25 April that uses Arquivo.pt.

Image gallery

World Digital Preservation Day Meeting 2024 #WDPD2024

Carmen Fonseca, Noticioso.pt
António Ramiro, Noticioso.pt
Panel 2: António Ramiro and Carmen Fonseca, Noticioso.pt
Ricardo Basílio, Arquivo.pt (FCT)
Hélder Mestre and António Campos, Arquivo Municipal de Sines
Joana Paulino, NOVA-FCSH
New Paths to Information Preservation and Access Meeting: general views and Panels 1, 2 and 3
Arquivo.pt stand at the New Paths to Information Preservation and Access Meeting
Pedro Príncipe, University of Minho
Moisés Rockembach, University of Coimbra
Moisés Rockembach, University of Coimbra, and Ricardo Basílio, Arquivo.pt
Hilário Lopes, RTP Archive
Pedro Penteado, DGLAB
Henrique São Mamede, Universidade Aberta, INESC TEC
Opening session: Silvestre Lacerda, Head of the DGLAB, and Francisco Santos, Vice-President of FCT
Opening session: João Gomes, Director of Advanced Services at FCT
Opening session: Jane Winters, Digital Preservation Coalition (DPC)
Panel 3: Paula Meireles, FCT
Augusto Ribeiro, University of Porto, UPdigital

Credits: photos by Leonor Arrimar (FCT). Some images taken on mobile devices and sent in by participants are also included.

Video


Video by Leonor Arrimar (FCT)

Know more

Previous editions of World Digital Preservation Day with Arquivo.pt

Save websites before they disappear with the Browsertrix Crawler tool


Last updated on September 14th, 2024 at 10:07 pm

September marks the beginning of a new working year and also the end of many websites, which are lost for good. Historic websites are lost unnecessarily when they are remodelled or shut down without a good copy of their content being made.

There are tools that allow the organisations that manage websites to save them immediately. In addition, Arquivo.pt provides a high-quality on-demand website archiving service to partner organisations or in occasional collaborations.

This article aims to highlight the Browsertrix Crawler used by Arquivo.pt, without excluding other tools, which can be useful to information managers and IT departments.

Use of Browsertrix Crawler by Arquivo.pt for high-quality collections

Browsertrix Crawler is a tool that lets you record entire websites and lists of web pages automatically and in a format compatible with web archives.

Arquivo.pt uses the Browsertrix Crawler to make high-quality collections (RAQs) of sites at the request of the community: for example, when a site is about to be shut down or remodelled, or periodically, to maintain a good history of a particular site.

An illustrative case is the Almada City Council website, recorded in April 2021 at the request of the Municipal Archive. Another case is the website of the newspaper Notícias de Leiria, which was recorded before its closure in December 2023.

Requests for high-quality collections (RAQs) to Arquivo.pt are increasingly frequent: 77 requests from January to September 2024. This is a sign that there is greater concern about the preservation of web content.

What you need to use Browsertrix Crawler locally

The group that developed the Browsertrix Crawler, Webrecorder.net, led by Ilya Kreymer, has the motto ‘web archiving for all’. Its tools make it possible to record the Internet in a decentralised way and on a small scale.

The Browsertrix Crawler is available and can be installed on your computer for small collections.

The basic version of Browsertrix that Arquivo.pt is using requires basic command line knowledge, which is the only barrier for non-experts.

From Arquivo.pt’s own experience, using the Browsertrix Crawler is easy in multidisciplinary teams, where there is always someone who knows enough about Linux commands to provide occasional support.

Demonstration of recording entire websites on your own computer

To promote the preservation of sites in Web archive format, Arquivo.pt presents a use case for the Browsertrix Crawler. It’s useful for anyone who wants to deepen their knowledge and practice of saving sites in a local environment.
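As a rough illustration of what a local recording involves, the sketch below assembles the Docker command that starts a Browsertrix Crawler crawl. It assumes Docker is installed and uses the public `webrecorder/browsertrix-crawler` image with its `--url`, `--collection` and `--generateWACZ` options. The site address, the collection name and the `crawls` output folder are placeholders chosen for this example, not values used by Arquivo.pt.

```python
import shlex
from pathlib import Path


def build_crawl_command(url, collection, output_dir="crawls"):
    """Return the argument list for a single-site Browsertrix Crawler run.

    The crawler runs inside Docker and writes its results to /crawls in the
    container, so a local folder is mounted there to keep the files.
    """
    host_dir = Path(output_dir).resolve()
    return [
        "docker", "run", "--rm",
        "-v", f"{host_dir}:/crawls/",      # local folder that receives the crawl
        "webrecorder/browsertrix-crawler", "crawl",
        "--url", url,                      # starting page of the site to record
        "--collection", collection,        # name of the resulting collection
        "--generateWACZ",                  # also package the crawl as a WACZ file
    ]


if __name__ == "__main__":
    # Placeholder site and collection name, for illustration only.
    command = build_crawl_command("https://example.org/", "example-site")
    # Print the command so it can be reviewed before being run in a terminal.
    print(shlex.join(command))
```

The script only prints the command; pasting it into a terminal starts the crawl. Building the command as a list keeps each option separate, which makes it easier to add further crawler options later after checking them against the Browsertrix Crawler documentation.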

Other tools used by Arquivo.pt to record content

Brozzler: a tool for improving the history of daily and monthly collection sites

Brozzler is a tool similar to Browsertrix Crawler in that it also bases its recording on a browser. It is used and maintained by the Internet Archive.

Arquivo.pt has been using Brozzler since at least 2018 to record web pages with interactive content and for high-quality collections (RAQs).

Lists of up to 200 sites are successfully recorded by Brozzler. For example, the 125 daily collection sites (FAWP) are recorded with Brozzler at the beginning of each month. During the month, another list of 75 monthly collection sites (MAWP) is recorded using Brozzler.

At the end of 2023, Arquivo.pt compared Brozzler and Browsertrix Crawler and chose to keep these two tools.

Heritrix, pywb and ArchiveWeb.page: tools for thousands of sites or one page

The Heritrix crawler is Arquivo.pt’s main recording tool. It is used on huge lists of websites, such as the .PT domain sites, to which other Portuguese sites are added, totalling more than half a million.

On the opposite side is the ArchiveWeb.page extension, which Arquivo.pt uses for short page-by-page recordings and also for the Web archiving: do-it-yourself! training course.

To complete the list of recording tools used by Arquivo.pt, mention should be made of pywb, which comes into play, for example, when an Arquivo.pt user uses the ‘Complete the page’ functionality or the SavePageNow service.

Portuguese at the 2024 Olympics and Paralympics in IIPC’s international collection of websites


Last updated on September 11th, 2024 at 04:23 pm

Paralympic Games. Miguel Monteiro, gold medallist, returns to Lisbon (News on the RTP website, 2 September, selected for international collection)

Arquivo.pt has contributed to the international collection of web pages on the Summer Olympic Games, held in Paris from 26 July to 11 August 2024, and is doing the same for the Summer Paralympics, held from 28 August to 8 September.

The initiative to create the “2024 Summer Olympics/Paralympics IIPC CDG” collection is the responsibility of the International Internet Preservation Consortium (IIPC), the world’s leading organisation in the field of Internet preservation, through its Content Development Working Group.

The IIPC’s collaborative collections aim to promote the creation of thematic collections and collections based on international events. The web pages are recorded and then made available on the Archive-it service.

The pages of this collection will also be available on Arquivo.pt for those who want to carry out studies on sport and Olympism.

How the pages about Portuguese athletes were selected

At the Olympic Games 73 athletes represented Portugal in 15 sports, and at the Paralympic Games 27 athletes in 10 sports.

The criterion for selecting pages for the international collection was news about the athletes. For each athlete, pages were selected about their expectations before the games, their performance in the competition and their comments during and after the competition.

Some athletes have more news selected than others, and the same goes for the sites from which the news comes. The selection of pages was not limited to the first results presented by the search engine. We looked for a variety of channels and news from regional and local sites, some from the region or city where the athletes came from.

More than 500 pages to remember the Portuguese presence in Paris

The contribution of Arquivo.pt, as you can see in the table, already has more than 500 web pages.

Portuguese Seeds – 2024 Summer Olympics and Paralympics, International Internet Preservation Consortium – Content Development Working Group (IIPC CDG)

Collaborate in the collection via the IIPC form

Helena Byrne, curator of web archives at the British Library and main curator of this collection, invites everyone to send in interesting pages to record: And we’re off – Get Involved in Web Archiving the Summer Games – Paris 2024.

The following public form is available to contribute:

2024 Summer Olympics & Paralympics

Exhibition of old websites to mark International Museum Day


May 18, International Museum Day, was celebrated all over the country with free admission, guided tours, entertainment and exhibitions related to memory and heritage.

Arquivo.pt contributed with an exhibition of old web pages, entitled “Digital Memory through the Internet of the Past”, which was on display at one of the stands at the National Coach Museum in Lisbon.

The pages were selected to show different aspects of the Alentejo over time. From 2016, pages relating to the Heritales project were selected.

Heritales and Crowd-Recycling drew attention to the preservation of the Internet’s memory

Heritales is a project based in Évora that aims to study and disseminate heritage in all its manifestations. It is known for its main event created in 2016, HERITALES – International Heritage Film Festival.

Crowd-Recycling is a project focused on good practices for sustainability.

Heritales, Crowd-Recycling and Arquivo.pt carried out this action in collaboration with the aim of giving visibility to content published on the web over time. Preserving and giving access to digital content is fundamental to enhancing heritage.

Why an exhibition of old websites is a good idea

Making an exhibition of websites over time is relatively easy: all you have to do is come up with a theme, which can also be the history of an institution, and choose pages preserved on Arquivo.pt.

An exhibition of old websites is an original idea for the target audience. It often features texts and images that only existed on the web.

By drawing attention to the websites, we realize that many things were left unrecorded and this changes our view of the content we publish today. We start taking more care to save important pages, for example by taking action or saving them on the spot with SavePageNow.

Heritales, Crowd-Recycling and Arquivo.pt on International Museum Day at the National Coach Museum

World Internet Day was on May 17th

The day before International Museum Day was World Internet Day (May 17). The proximity of the two commemorations ties in with the theme of preserving memory.

Portugal connected to the Internet for the first time in 1991, with the FCCN project “RCCN IP Service”.

To remember how it all happened, here are the three suggestions that FCCN published on social media for this day:

Arquivo.pt is a finalist for the DPC Awards 2024

Last updated on August 12th, 2024 at 11:50 am

The Digital Preservation Coalition Awards

The Digital Preservation Coalition (DPC) is dedicated to promoting digital preservation and associated best practices.

The DPC Awards promote exemplary and innovative digital preservation use cases from all over the world.

The Arquivo.pt team submitted two applications to the DPC Awards 2024 in the categories of “Safeguarding the Digital Legacy” and “Research and Innovation”.

The Award for Safeguarding the Digital Legacy celebrates the practical application of preservation tools to protect at-risk digital objects.

The Award for Research and Innovation recognizes excellence in practical research and innovation activities.

Arquivo.pt applications to the DPC Awards

#1 Arquivo.pt catalog of tools for digital preservation

Information that rules modern-day lives is born-digital and disseminated online. However, invaluable digital objects published online have been continuously lost.

Arquivo.pt is a public infrastructure which supports the preservation of digital objects published online to safeguard this digital legacy for future generations.

Thus, in October 2023 after 15 years of research and development, Arquivo.pt released a Catalog of 13 innovative tools to support the preservation of at-risk online content, from acquisition to dissemination (e.g. search and access, APIs, training, open data sets, exhibitions).

Arquivo.pt safeguards online digital objects of worldwide interest for research and education.

The Arquivo.pt Catalog was selected as a finalist for the Safeguarding the Digital Legacy Award.

#2 Searching preserved web-images

Images published online are precious digital assets that document contemporary times for future generations.

This initiative describes the research and development of an innovative image search system that enables the discovery and access to billions of preserved images acquired from the web since the 1990s.

This research was applied to enhance the Arquivo.pt web archive with an image search service publicly available to any Internet user, officially launched in August 2022.

The resulting scientific and technical publications are available in open-access and the developed software is available as free open-source to be reused and enhanced by the community.

This work on searching images preserved in web archives was submitted for the Research and Innovation Award.

Know more

Commemoration of the 50th anniversary of April 25 – the Portuguese revolution of 1974


Arquivo.pt joined the celebrations of the 50th anniversary of April 25, the Portuguese Revolution of 1974, as part of the initiatives promoted by the Fundação para a Ciência e a Tecnologia (FCT) in partnership with the Estrutura de Missão – Comissão Comemorativa 50 anos 25 de Abril.

The initiatives were as follows: a journey through time, a special collection on the theme “April 25”, a presentation at the “50 years of April International Congress” and the inclusion of a special mention in the 2025 edition of the Arquivo.pt Award.

Memories of April 25 on the Internet exhibition

The exhibition Memories of April 25 on the Internet presents a selection of web pages about the celebrations of April 25 in various regions of the country, since the beginning of the web in the 1990s.

The criteria for choosing the pages for the exhibition were as follows:

  • Pages relating to the April 25 commemorations;
  • Pages found on Arquivo.pt on dates close to the anniversary each year;
  • Diversity to include different areas of the country;
  • Popular demonstrations and official ceremonies.
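The second criterion can be sketched in code. The example below builds one search address per year, restricted to a window of days around April 25, using the Arquivo.pt text search API. It is an illustration only and not the procedure the curators followed: the endpoint `https://arquivo.pt/textsearch` and the `q`, `from`, `to` and `maxItems` parameters are given as understood from the public API documentation and should be checked there, while the 7-day window, the query text and the year range are assumptions made for this example.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# Assumed endpoint of the Arquivo.pt text search API; verify against the docs.
TEXTSEARCH_ENDPOINT = "https://arquivo.pt/textsearch"


def anniversary_window(year, days=7):
    """Return (from, to) timestamps covering `days` before and after April 25."""
    anniversary = date(year, 4, 25)
    start = anniversary - timedelta(days=days)
    end = anniversary + timedelta(days=days)
    # Timestamps use the 14-digit YYYYMMDDHHMMSS form common in web archives.
    return start.strftime("%Y%m%d000000"), end.strftime("%Y%m%d235959")


def build_queries(query, first_year, last_year, days=7):
    """Map each year to a search URL limited to that year's anniversary window."""
    urls = {}
    for year in range(first_year, last_year + 1):
        start, end = anniversary_window(year, days)
        params = {"q": query, "from": start, "to": end, "maxItems": 50}
        urls[year] = f"{TEXTSEARCH_ENDPOINT}?{urlencode(params)}"
    return urls


if __name__ == "__main__":
    # Illustrative query and year range; adjust to the collection being studied.
    for year, url in build_queries("25 de Abril", 1996, 2024).items():
        print(year, url)
```

Each printed address can be opened in a browser or fetched by a script. Choosing which pages actually go into an exhibition remains a manual, curatorial step.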

A historical memory without web archives is incomplete. The aim of this journey through time is to invite citizens to travel back in time, browsing through old web pages and reliving recent episodes in our life as a democracy.

See the exhibition: arquivo.pt/50anos25abril

Special collection on April 25 – the Portuguese Revolution of 1974

To mark the anniversary, Arquivo.pt carried out a special collection on the topic of “April 25” and made the results available in an open dataset, published on the Dados.gov portal.

The dataset contains a list of keywords put into a search engine in order to obtain results on the topic of “April 25”. The search considered names of people, places, political, social and cultural aspects, as well as words associated with the event.

The searches were carried out on March 22, 2024 using the Bing Search API, an automatic search service that returns results ranked by the relevance criteria of the Bing service itself, combined with additional criteria configured by the Arquivo.pt team.

A total of 12,650 unique web page addresses were obtained. It is hoped that the recording of these pages will be useful for the organizations that produced this content, for researchers who want to study our history and for citizens who cultivate a sense of memory and democracy.

Participation in the 50 years of April International Congress

João Gomes, Director of Advanced Services, FCCN-FCT presenting the Arquivo.pt Memorial service at the 50 years of April International Congress

On May 2, 2024, João Gomes, Director of Advanced Services at the FCCN Scientific Computing Unit of the Foundation for Science and Technology I.P., presented Arquivo.pt to the participants of the 50 years of April International Congress, as a distinctive service, open to citizens and useful for organizations.

This event, organized by the Estrutura de Missão – Comissão Comemorativa 50 anos 25 de Abril and the University of Lisbon, included a presentation of two FCT services for citizens: Arquivo.pt and NAU’s massive open online courses.

Arquivo.pt is a web preservation service available to all citizens who want to search for old content published on the web.

Using Arquivo.pt contributes to a better understanding of our history. It also provides useful services for cybersecurity, such as the Arquivo.pt Memorial, which is able to maintain institutions’ old websites, preventing attacks and saving them resources.

Special mention for “April 25 and Democracy” at the Arquivo.pt Awards 2025

The Arquivo.pt Award is held annually and honors works that use Arquivo.pt.

In 2025, as part of the celebrations for the 50th anniversary of April 25, a special mention will be made of work on the theme “April 25 and Democracy”.

We therefore challenge researchers and interested citizens to create innovative works using Arquivo.pt.

If you have any questions about the Arquivo.pt Award, please contact us.

Arquivo.pt reaches 1 PetaByte of preserved information!

Reaching 1 PetaByte of content, predominantly in Portuguese and accessible to both researchers and ordinary citizens, is a milestone worth celebrating in the month of Arquivo.pt’s 16th anniversary.

At Arquivo.pt you can search for information published on the Web in the past, such as:

  • The first European page
  • News from The New York Times in 2008
  • European Film Awards 2014

Discover more pages through the selected pages in the Arquivo.pt Online Exhibitions.

Purpose and mission of the Portuguese Web Archive

Arquivo.pt was created on November 8, 2007 with the aim of preserving content from the Portuguese Web.

In 2013, as a service operated by the Fundação para a Ciência e a Tecnologia (FCT), its mission was formulated as follows: “To promote the preservation of content available on the national Internet, ensuring that it is made available to the scientific community and the general public” (Decreto-Lei no. 55/2013).

In recent years, Arquivo.pt has created new services: CitationSaver, which allows researchers to record references to web content in their scientific articles, and Memorial and Complete page, which facilitate access to content scattered throughout the huge 1 PetaByte block of data.

Where did so much information come from?

In order to reach the 1 PetaByte volume, Arquivo.pt periodically recorded content from websites in the .PT domain and from Portuguese websites in other domains.

In addition, frequent daily and monthly collections were made from a small number of government sites and the main news sites in Portugal.

As part of international collaborations, content was collected from sites in various languages, for example on the 2019 European Elections.

Content prior to 2008 came from the Internet Archive and donations, such as a collection made by the National Library and INESC on the 2005 Legislative Elections.

The largest Portuguese-language dataset available to researchers

By making 1 PetaByte of information available, in open access and through the use of APIs (Application Programming Interfaces), Arquivo.pt is a useful tool for research.

For example, a researcher who wants to do a study on elections in Portugal can use the entire Arquivo.pt collection. Better still, they can focus on just a few special collections dedicated to the elections, choosing the ones that interest them and downloading just a few Terabytes to process automatically with the APIs.
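As a rough illustration of this kind of automated access, the sketch below builds a full-text search request for a given period. It assumes the public Arquivo.pt TextSearch endpoint (https://arquivo.pt/textsearch), the parameter names q, from, to and maxItems, timestamps in YYYYMMDDHHMMSS format and a JSON response with a response_items list. None of these details come from this article, so check them against the current API documentation before relying on them. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Arquivo.pt full-text search API; verify in the API docs.
TEXTSEARCH_ENDPOINT = "https://arquivo.pt/textsearch"


def build_search_url(query, start=None, end=None, max_items=50):
    """Build a search URL; start/end are assumed to be YYYYMMDDHHMMSS strings."""
    params = {"q": query, "maxItems": max_items}
    if start:
        params["from"] = start
    if end:
        params["to"] = end
    return f"{TEXTSEARCH_ENDPOINT}?{urlencode(params)}"


def search(query, **kwargs):
    """Run the query and return the list of archived-page records (network call)."""
    with urlopen(build_search_url(query, **kwargs), timeout=30) as response:
        payload = json.load(response)
    # "response_items" is the assumed name of the result list in the JSON payload.
    return payload.get("response_items", [])
```

A study on elections could, for instance, call search("eleições legislativas", start="20050101000000", end="20051231235959") and then process the returned records automatically.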

Contributions from the various teams and friends of Arquivo.pt

The development of Arquivo.pt is more than a technological matter: it owes much to the dedication and persistence of the various teams that have worked on it since 2007.

It also owes much to the contribution of the many friends of Arquivo.pt, who were always on hand to help improve the service, and to the response of the user community.

Congratulations to all! Thank you.

World Digital Preservation Day dedicated to Justice

Last updated on November 13th, 2023 at 08:59 am

The Instituto de Gestão Financeira e Equipamentos da Justiça (IGFEJ) and Secretaria Geral do Ministério da Justiça (SGMJ), in collaboration with BAD, organized the event “Digital Preservation in Justice” to mark World Digital Preservation Day on November 2, 2023.

The event, which took place in the auditorium of the Polícia Judiciária in Lisbon, was attended by representatives from the government’s justice department and professionals from the archives, communications and IT departments.

How to use Arquivo.pt to preserve institutional websites

Arquivo.pt took part in the presentation “Preserve your website”, which addressed the issue of preserving institutional websites and critical aspects such as cybersecurity.

Justice entities can benefit from Arquivo.pt and its various services to ensure good preservation of their websites, mitigate cybersecurity threats and provide historical content to citizens.

The presentation concluded with the following recommendations:

  • Inventory and publicize your current and historical websites
  • Use Arquivo.pt services collaboratively
  • Save content in a standardized format with ArchiveWeb.page

University of Lisbon preserved over 100 historical websites in the Arquivo.pt Memorial

Last updated on March 27th, 2024 at 11:17 am

More than 100 historical websites from the Faculty of Sciences of the University of Lisbon (FCUL) are now accessible through the Memorial service of Arquivo.pt.

FCUL’s IT Department sent to Arquivo.pt a list of old websites hosted on its servers that were no longer updated, but whose historical content continues to be interesting to the community (e.g. websites of research projects or scientific events).

Arquivo.pt preserved these websites in collaboration with their owners, seeking to maintain a faithful representation of the published content for the future.

FCUL redirected the domain of each website to Arquivo.pt and was then able to disconnect the respective servers and save the resources spent on their maintenance (e.g. electricity, data center space, human resources).

The MiNEMA showcase

Landing page of www.minema.di.fc.ul.pt at Memorial do Arquivo.pt.

The MiNEMA scientific program website was the first that FCUL integrated into the Memorial. This website stopped being updated in 2009, when the project ended. FCUL invested resources in maintaining the website for another 10 years, until it became necessary to shut it down for cybersecurity reasons.

The Memorial of Arquivo.pt emerged as an option and, since 2020, FCUL only needs to maintain the domain www.minema.di.fc.ul.pt while Arquivo.pt preserves the information contained on the website.

Please note that the website’s content continues to be displayed in search engine results.

Follow FCUL and preserve your historical websites in the Memorial!

An increasing number of institutions are turning to the Memorial of Arquivo.pt to safely preserve the content of their historical websites. For example, FCUL preserved 116 websites, the Government IT Network Management Center preserved 23 and the Foundation for Science and Technology preserved 40.

Public institutions have priority to benefit from this service. However, other entities can also request it as long as they own the website domain.

Identify the historical websites that are candidates for integration into the Memorial of Arquivo.pt and contact us!

Completing webpages from the past: it is possible!

Last updated on October 16th, 2023 at 06:59 pm

Some web-archived pages are reproduced incompletely due to problems that occurred during the archiving process (e.g. broken formatting or missing embedded images).

Complete page is a function of Arquivo.pt that allows users to recover missing elements of web-archived pages from other web archives or from the original websites.

When viewing a page archived in Arquivo.pt, a user just needs to open the Options menu in the top right corner and choose Complete page.

This process is performed automatically.

How does Complete page work?

If you open a web-archived page that appears incomplete, try the Complete page option and wait.

Arquivo.pt will search for missing elements on the Internet and in other web archives using the Memento protocol. If it succeeds, the obtained elements will be immediately displayed on the web-archived page.

Later, these recovered elements are integrated into the Arquivo.pt collection, so that the web-archived page will appear more complete whenever any user accesses it in the future.
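The article does not describe how Complete page is implemented, so the following is only a sketch of the general idea behind the Memento protocol (RFC 7089): a web archive can publish a TimeMap listing its captures of a resource, and a client can pick the capture closest to a desired date. The archive host name, the sample TimeMap and the function names are hypothetical.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# One TimeMap entry looks like: <url>;rel="memento";datetime="Sat, 01 Jan 2005 00:00:00 GMT"
LINK_RE = re.compile(r'<([^>]+)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM_RE = re.compile(r'([\w-]+)="([^"]*)"')

# Hypothetical TimeMap in link format, as another web archive might return it.
SAMPLE_TIMEMAP = """
<http://example.com/>;rel="original",
<http://archive.example/timemap/link/http://example.com/>;rel="self";type="application/link-format",
<http://archive.example/20050101000000/http://example.com/>;rel="first memento";datetime="Sat, 01 Jan 2005 00:00:00 GMT",
<http://archive.example/20060315000000/http://example.com/>;rel="memento";datetime="Wed, 15 Mar 2006 00:00:00 GMT",
<http://archive.example/20210101000000/http://example.com/>;rel="last memento";datetime="Fri, 01 Jan 2021 00:00:00 GMT"
"""


def parse_timemap(text):
    """Return (capture datetime, URL) pairs for every memento listed in a TimeMap."""
    mementos = []
    for url, raw_params in LINK_RE.findall(text):
        params = dict(PARAM_RE.findall(raw_params))
        # rel may hold several space-separated values, e.g. "first memento".
        if "memento" in params.get("rel", "").split() and "datetime" in params:
            mementos.append((parsedate_to_datetime(params["datetime"]), url))
    return mementos


def closest_memento(mementos, target):
    """Pick the memento captured nearest to the target date, or None if there are none."""
    if not mementos:
        return None
    return min(mementos, key=lambda item: abs(item[0] - target))
```

Calling closest_memento(parse_timemap(SAMPLE_TIMEMAP), datetime(2005, 6, 1, tzinfo=timezone.utc)) selects the capture from January 2005, which is the one nearest in time to the requested date.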

Using Complete page on the home page of artist Cristina Guerra’s website recovered a missing image.

For example, the website of artist Cristina Guerra archived in 2005 had a missing image. By using Complete page, it was possible in 2021 to obtain this missing image from another web archive which preserved it.

Participate in collaborative curation to improve the quality of Arquivo.pt!

Due to the high number of web-archived pages, it is not possible for Arquivo.pt to complete them all automatically. Therefore, the collaboration of users to identify important pages with missing elements and try to complete them is important.

By using Complete page, users help to improve the quality of the historical webpages preserved in Arquivo.pt!

Always try to complete web-archived pages that look incomplete. If you detect any problem, contact us.

Spread the word about the Arquivo.pt Complete page!