FCCN presents Arquivo.pt at the “File Not Found” event in Lisbon

 


From March 23 to 26, Lisbon hosted the File Not Found event, organized by the Goethe-Institut. Over the course of four days, the initiative brought together national and international experts to explore the role of archives in the digital age, particularly their cultural, social, and political value in a constantly evolving digital world. The discussion highlighted practices, challenges, and responsibilities associated with the preservation of information heritage in this context of increasing digitization.

On the final day of the conference, March 26, João Gomes, area director at FCCN, the digital services unit of FCT, participated in the panel “Archiving Online: Power, Risk, and Digital Care Practices.” His presentation focused on Arquivo.pt, the public service for preserving Portuguese web content, developed by FCCN.

João Gomes presented the mission and progress of Arquivo.pt, emphasizing the importance of ensuring that information published online can be preserved and reused by researchers, journalists, public entities, and citizens. He also highlighted the service’s role in promoting digital literacy and advocating for open access to information.

Learn more about collaborations with Arquivo.pt

Arquivo.pt participated in the International Digital Curation Conference in Zagreb

IDCC 2026 Zagreb

Last updated on March 16th, 2026 at 12:38 pm


Arquivo.pt participated in the International Digital Curation Conference with a presentation entitled “How Arquivo.pt is preserving scientific research project websites and promoting data reuse,” delivered by Ricardo Basílio, digital curator.

IDCC 2026 took place in Zagreb, Croatia, between February 16 and 18. This annual event is organized by the Digital Curation Centre (DCC), a leading consortium in the field of data management and curation for scientific research. It had 219 attendees from 30 countries, including 5 from Portugal.

The same panel, moderated by Mikala Narlock of Indiana University, featured the following presentations: “Organizing a community to survive research ecosystem instability,” by Lauren Phegley of the University of Pennsylvania; “What should be saved? The impact of austerity on data rescue,” by Shona Jane Ferguson of the UK Centre for Ecology & Hydrology; and “How do you calculate the carbon footprint of your digital preservation activities?” by Jenny Mitcham of the Digital Preservation Coalition.

Contemporary challenges in digital curation

The theme of this year’s conference was “AI, austerity, and authoritarianism: contemporary challenges in digital curation.”

At the opening, Antica Čulina, from the Ruđer Bošković Institute, addressed the issue of the reliability of science, which requires transparent, scrutinized processes and well-documented, unbiased data.

In parallel sessions, other current challenges were addressed, such as carbon footprint, the use of AI, successful cases of data management, and community engagement.

In the closing session, the topic of web preservation was highlighted with a presentation on the Data Rescue Project by Mikala Narlock of Indiana University and Linda Kellam of the University of Pennsylvania.

Urgency is a determining factor in web preservation, especially when scientific research results are involved.

Tribute to Kevin Ashley

The conference closed with a tribute to Kevin Ashley, director of the DCC since April 2010. Since the 1990s, he has worked on developing and providing digital preservation services as head of digital archives at the University of London Computer Centre (ULCC). As leader of the DCC and a great communicator, he has played a charismatic role in the development of data management planning, advice, guidance, and training.

In Portugal, we have records of two presentations by Kevin Ashley at the 5ª Conferência Luso-Brasileira sobre Acesso Aberto (CONFOA) at the Universidade de Coimbra in 2014, which we recall here:

Contribution of Arquivo.pt to the preservation of scientific research results

Arquivo.pt, a digital service provided by FCT, has among its priorities the preservation of all types of information published on the Web related to research projects, such as project websites, abstracts of scientific publications, news in the media related to projects and, in general, all information on the Web referenced in scientific publications.

For example, in the case presented to conference participants, Arquivo.pt identified and collected 17 terabytes of information in 2021 related to projects funded by the European Commission’s H2020 program. Until then, 46% of H2020 projects did not mention their websites or project pages in the data published on the European data portal CORDIS.

Based on this successful initiative, Arquivo.pt has been systematically collecting content related to the projects, in collaboration with RCAAP, PTCRIS, and Ciência Vitae, from which URLs of publications available on the Web are obtained.

Use of Arquivo.pt by researchers

As Arquivo.pt took the initiative to record web content produced by researchers, the number of Arquivo.pt use cases also increased year on year: more researchers are making use of the data and testing methodologies. Examples include LLMs for the Portuguese language, such as GlórIA and AmálIA, and the works competing for the Arquivo.pt Award.

For example, in 2025, a group of researchers from CIDEHUS – Centro Interdisciplinar de História, Culturas e Sociedades da Universidade de Évora, used Arquivo.pt to create the work Narrative Monitoring: Analysis of Conspiracy Theories of Population Replacement in the Portuguese Web Archive (1996-2021).

The aim was to show the audience that the preservation of scientific research results requires the involvement of the researchers themselves. Once they are familiar with and use Arquivo.pt, they are also better prepared to take care of the preservation of their publications.

Learn more

Special collection of web content on the Presidential Elections. We need your help!


Last updated on March 13th, 2026 at 11:30 am

The 2026 Portuguese presidential election took place between January 18 and February 15. Arquivo.pt collected 2.3 terabytes of electoral content and now provides data on the entire process, such as search terms, identified content, and archived content.

The 2026 Presidential Elections took place in two rounds, the first on January 18 and the second on February 8, followed by second-round voting at a later date in 20 parishes, in the wake of the storms that ravaged the country. The collection is therefore expected to include news about the affected areas as well as the political interventions of the presidential candidates.

Call for community participation in identifying and archiving election-related content

On January 15, Arquivo.pt invited the community to participate in collecting information about the elections: “Candidates’ websites, news articles, opinion columns, or social media posts—everything is useful for representing our life in democracy. Have you found interesting election-related content? Participate in identifying and archiving election-related content.”

Two modalities were suggested:

Arquivo.pt methodology for thematic coverage of the elections

Following the practice adopted in previous elections, the procedure consisted of the following steps:

  • definition of search terms
  • identification of search engine results pages (SERP)
  • phased recording of seeds (starting addresses for crawler use)
  • integration into Arquivo.pt
  • availability of data set
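
As a rough illustration of the step from identified results to seeds, the sketch below turns a list of result URLs into a deduplicated seed list. The normalization rules (lowercasing the scheme and host, dropping fragments) and the function names are assumptions made for this example, not a description of the tooling Arquivo.pt actually uses.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Lowercase scheme and host, drop the fragment, default an empty path to '/'."""
    parts = urlsplit(url.strip())
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
    )


def to_seeds(urls):
    """Return normalized URLs in first-seen order, without duplicates."""
    seen = set()
    seeds = []
    for url in urls:
        normalized = normalize(url)
        if normalized not in seen:
            seen.add(normalized)
            seeds.append(normalized)
    return seeds


# Placeholder URLs, not real election content.
example_results = [
    "https://Example.org/news/a#top",
    "https://example.org/news/a",
    "https://example.org",
]
seeds = to_seeds(example_results)
```

Deduplicating before the crawl keeps the seed list short when the same page shows up under several search terms.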

A search term is a combination of words used in a search engine. For example: candidate_name+presidential_elections 2026+Portugal.
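
A minimal sketch of how such search terms could be generated in bulk is shown below. The candidate names are placeholders, and the way the parts are joined (with spaces) is an assumption. The source only gives the schematic pattern above.

```python
from itertools import product


def build_search_terms(candidates, topics, country="Portugal"):
    """Combine every candidate name with every topic phrase and the country."""
    return [
        " ".join((name, topic, country))
        for name, topic in product(candidates, topics)
    ]


# Placeholder names: the real candidate list is not part of this example.
terms = build_search_terms(
    ["Candidate A", "Candidate B"],
    ["presidential elections 2026"],
)
```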

Google was used to identify electoral content, and the Google Rank Checker, Keyword SERP Ranking Tool were also used to extract the results. The search engine recently limited simple manual browsing of results by a user to 10 results at a time, which makes this method less efficient.

The recording was phased as follows: before and after the first round, on January 12 and 23, before and after the second round on February 5 and 12, and a final recording of all seeds on February 18.
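
The schedule can be checked with a short sketch that confirms each round is bracketed by a recording before and after it. The dates come from the text above, while the data layout and function name are assumptions made for illustration.

```python
from datetime import date

# Round dates and recording dates as stated in the text.
ROUNDS = {
    "first round": date(2026, 1, 18),
    "second round": date(2026, 2, 8),
}
RECORDINGS = [
    date(2026, 1, 12),
    date(2026, 1, 23),
    date(2026, 2, 5),
    date(2026, 2, 12),
    date(2026, 2, 18),  # final recording of all seeds
]


def bracketing_recordings(round_day, recordings):
    """Return the closest recording before and the closest one after a round."""
    before = max(d for d in recordings if d < round_day)
    after = min(d for d in recordings if d > round_day)
    return before, after


brackets = {
    name: bracketing_recordings(day, RECORDINGS) for name, day in ROUNDS.items()
}
```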

The result was 2.3 terabytes of data, comprising 11.4 million files, obtained from approximately 34,000 seeds using Heritrix and Browsertrix-crawler.
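
To put these figures in perspective, the sketch below derives two rough averages from them. It assumes decimal terabytes (10^12 bytes), which the text does not specify, and uses the rounded figures as given, so the results are only indicative.

```python
# Figures as stated in the text (rounded); decimal units are an assumption.
total_bytes = 2.3e12   # 2.3 terabytes
total_files = 11.4e6   # 11.4 million files
total_seeds = 34_000   # approximately 34,000 seeds

avg_file_kb = total_bytes / total_files / 1_000   # average file size in kilobytes
files_per_seed = total_files / total_seeds        # average files collected per seed

print(f"average file size: about {avg_file_kb:.0f} kB")
print(f"files per seed: about {files_per_seed:.0f}")
```

Under these assumptions, the average file is around 200 kB and each seed led to a few hundred archived files.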

The contents are archived in the collection with the ID EAWP51 and will be accessible on the Arquivo.pt interface after one year. For now, information about the search for and identification of content is available.

2026 Presidential Election Data Set

Available on the open data platform Dados.gov:

Find out more about electoral collections from previous years