News – sobre.arquivo.pt

Meet the winners of the Arquivo.pt Award 2025!

June 28, 2025 by Ricardo Basílio

Last updated on July 1st, 2025 at 09:58 am

The winners of the Arquivo.pt 2025 Award were announced by the Público newspaper, the official media partner, on 28 June 2025.

A total of 36 entries were received and validated.

The award ceremony will take place at the closing session of Encontro Ciência, at the NOVA SBE Campus in Carcavelos on 11 July.

1st place – “Minha Região – O Teu Portal Autárquico”

The winner of the 10,000 euro prize was the work ‘Minha Região – O Teu Portal Autárquico’ developed by Rúben Almeida, Ricardo Campos and Sérgio Nunes.

The result of this work is a platform available on the web that gathers municipal electoral information between 1976 and 2021. Through the minharegiao.pt website, anyone can find information by district, municipality and parish.

For example, a search for the Braga district shows the rise in the number of voters over time. It also shows that 2013 was the year with the highest abstention rate.

2nd place – “Memor.pt – Explore a Memória Digital Portuguesa”

The 2nd prize of 3,000 euros was awarded to the work ‘Memor.pt – Explore Portugal’s Digital Memory’, by Joaquim Matoso.

Memor.pt is an interactive platform that uses content preserved by Arquivo.pt to make Portugal’s digital memory accessible. Through a conversational AI, a themed quiz and a daily article, users can explore thousands of archived pages on topics such as housing, democracy, culture and the labour market.

For example, if you choose to ‘play’, you’ll have a choice of five topics to test your knowledge with a quiz. So, what year was the new Democratic Constitution approved after 25 April?

3rd place – “Narrative Monitoring”

The third prize of 2,000 euros was awarded to the work ‘Narrative Monitoring: Analysis of conspiracy theories of population replacement’, developed by Erik Bran Marino, Rafael Prezado, Ana Sofia Ribeiro and Renata Vieira.

The work ‘Narrative Monitoring’ is a systematic and comprehensive analysis of the emergence and evolution of conspiracy theories of population replacement (PRCT, Comparative analysis of conspiracy theories in Europe), in the Portuguese digital space, between 1996 and 2021. Using Arquivo.pt as a primary source, it develops a methodology that combines web archiving techniques, natural language processing and statistical analysis to identify, classify and analyse 36,621 documents related to migration and demographic issues.

The website www.narrativemonitor.uevora.pt presents the results through interactive visualisations, a dynamic timeline and an educational quiz on the keywords most associated with conspiratorial discourse, for example ‘Multicultural’, ‘Kalergi Plan’ and ‘Refugee’. Which of these keywords is most associated with conspiratorial population content?

Honorable mention AMCC – Aveiro Media Competence Center: “Estudos Arquivados”

The Aveiro Media Competence Centre (AMCC) has awarded its Honourable Mention to the work ‘Estudos Arquivados’, by Filipe Oliveira João and Paulo Cabrita.

‘Estudos Arquivados’ is a platform that uses Arquivo.pt for pedagogical and teaching purposes. It organises a user’s searches by subject and school year, and users can register and save their own searches for later use. In contexts where schools use technology for teaching, ‘Estudos Arquivados’, being based on a public and open collection, aims to promote study and digital inclusion.

For example, a search on ‘25 de Abril’ for 8th grade students (high school) in the subject of History returns dated web pages preserved on Arquivo.pt, training students to use this new type of material.

Honorable mention .PT: “ArchiveChain”

The DNS.PT Association has awarded the .PT Honourable Mention to the professor who encouraged the submission of ‘ArchiveChain’. This work was developed as part of Bruno Cotrim’s Master’s dissertation in Computer Engineering at the Faculty of Sciences of the University of Lisbon, with scientific guidance from professors Bernardo Ferreira (Faculty of Sciences) and Miguel Matos (Instituto Superior Técnico).

The concept is explained by its authors as follows: ‘ArchiveChain’ is a blockchain that democratises the mission of archiving the Portuguese historical web. In ArchiveChain, all citizens are invited to save pages from Arquivo.pt, and whoever saves the most pages receives the most rewards in the form of cryptocurrencies. At the same time, the pages saved by participants are used as ‘fuel’ in its internal workings, making it possible to implement smart contracts in a sustainable way and avoiding the massive energy consumption of other blockchains such as Bitcoin.

Start exploring the ArchiveChain platform and contribute to the better preservation of the Portuguese web.

Work website
Work description
Application submission video
Video presentation of the work to the jury
Slides presentation to the jury
Draft report
Project repositories on GitHub:

Honorable mention 25 de Abril e a Democracia: “Arquivo 25 de Abril”

The Comissão Comemorativa 50 anos 25 de Abril (Commemorative Commission for the 50th anniversary of 25 de Abril) awarded an Honourable Mention called ‘25 de Abril e a Democracia’ to the work ‘Arquivo 25 de Abril’, developed by Miguel Garcia.

The ‘Arquivo 25 de Abril’ website provides an online archive of journalistic articles about various personalities, events and movements that were relevant before and during 25 April 1974. For each of these elements, articles published online by various leading media organisations were collected, taking this historical context into account.

For example, by choosing ‘artists’ related to 25 April, a user gets the name and a photo of singer Adriano Correia de Oliveira, the first in a list, and then related news published in the media and preserved by Arquivo.pt.

Press

News in the Público newspaper, media partner (official announcement of the winners)

Find out more

RESAW 2025 conference had the participation of Arquivo.pt

June 12, 2025 by Ricardo Basílio

Arquivo.pt was present at the 6th RESAW Conference for researchers in the Digital Humanities, Media and Communication and other areas, on the theme of ‘The Datafied Web’, which took place at the University of Siegen, Germany, from 4 to 6 June 2025.

RESAW (Research Infrastructure for the Study of Archived Web Materials) is an informal initiative that brings together researchers who use web archives in their research. RESAW’s first conference was in 2015, and it is now held every two years.

Initially, RESAW brought together European researchers, but now it brings together researchers from all over the world and has become a unique forum of its kind. In 2025, it had more than 100 participants. It brings together the best in the field of using web archives in research.

Niels Brügger, Professor of Media and Communication at Aarhus University in Denmark, has been its main driving force for 10 years.

Other leading researchers with studies on web archives are: Valerie Schafer from the University of Luxembourg, Jane Winters from the University of London, Anne Helmond from the University of Utrecht, Susan Aasman from the University of Groningen, Sophie Gebeil from Aix-Marseille University and Ian Milligan from the University of Waterloo.

This year’s theme, “The Datafied Web”, addressed the datafication of the Web, from its beginnings in the 1990s to the present day, marked by massive data processing and the use of Artificial Intelligence.

Why would a web archive take part in an academic meeting?

Arquivo.pt has been a regular participant in RESAW since 2019, as it wants to make itself increasingly known as a service for national and international researchers.

Thanks to participation in international events such as RESAW, several publications have appeared that use and refer to Arquivo.pt. Any researcher with Internet access can search the information preserved on Arquivo.pt, use the APIs, process information or train their models.

We invite Portuguese researchers to take part in this meeting, as we have been the only Portuguese presence in several editions. We have an accessible web archive, ready to use, which is not the case in other countries. We would like to see researchers in the fields of Digital Humanities and Media and Communication in Portugal using Arquivo.pt more often and actively participating in meetings like RESAW.

Arquivo.pt’s contribution to RESAW 2025

Arquivo.pt contributed two presentations to the 2025 edition of the RESAW meeting, held at the University of Siegen. The first was about the Arquivo.pt APIs and their application in a research context, by Vasco Rato. The second was about the open datasets and lists of websites on topics and events that Arquivo.pt has prepared to help researchers start exploring archived information in greater depth.

Image gallery

RESAW 2025 at the University of Siegen

Portuguese Legislative Elections 2025 had a special collection by Arquivo.pt

May 26, 2025 by Ricardo Basílio

Last updated on May 28th, 2025 at 09:04 am

Arquivo.pt carried out a special collection of content published online in connection with the Legislative Elections of 18 May 2025.

More than 8,000 unique pages were recorded, before and after the elections, resulting in around 250 Gigabytes of information.

This collection includes news items from the media, party websites and other citizen publications documenting this important event in Portuguese life.

The data collected is available for researchers to use in their work and projects.

Methodology for collecting the electoral event

The collection was carried out using a semi-automatic methodology that allows information to be identified and collected quickly and saves resources. The steps were as follows:

preparation of a list of search terms;
automatic search with the Bing Search API;
extraction of a list of page addresses or URLs;
recording (using Browsertrix-crawler);
integration into Arquivo.pt;
making the dataset available for research.

The starting point for identifying pages for this electoral event was a list of search terms, including words, names, dates, website addresses and also words in other languages. For example, we used ‘eleições’, ‘legislativas’, ‘2025’, candidate names, party websites, newspaper websites and ‘eleições Portugal’ in other European languages to find foreign media pages that referred to the Portuguese elections. A total of 384 search terms were used.

The extracted addresses or URLs are then recorded as they are, accepting that some pages will miss the target. This favours speed, an important factor in this type of event.

One search to identify web pages was carried out before the elections and two more in the following week, each followed by the corresponding recording, in order to add new content to the collection.
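The first steps of this methodology (search terms, automatic search, URL extraction) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used by Arquivo.pt: the `search` callable is a hypothetical stand-in for a real search API client, `fake_search` returns canned results so the example runs offline, and the normalisation rule is one simple choice among many.

```python
from typing import Callable, Iterable
from urllib.parse import urlsplit, urlunsplit


def normalise(url: str) -> str:
    """Lower-case scheme and host and drop the fragment so duplicates collapse."""
    parts = urlsplit(url.strip())
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
    )


def build_seed_list(terms: Iterable[str], search: Callable[[str], Iterable[str]]) -> list[str]:
    """Run every search term and return the unique URLs in first-seen order."""
    seen: set[str] = set()
    seeds: list[str] = []
    for term in terms:
        for url in search(term):
            key = normalise(url)
            if key not in seen:
                seen.add(key)
                seeds.append(key)
    return seeds


def fake_search(term: str) -> list[str]:
    """Hypothetical stand-in for a web search API call (canned results only)."""
    canned = {
        "eleições legislativas 2025": ["https://Example.pt/noticia#top", "https://example.org/a"],
        "eleições Portugal": ["https://example.pt/noticia", "https://example.com/b?x=1"],
    }
    return canned.get(term, [])


seeds = build_seed_list(["eleições legislativas 2025", "eleições Portugal"], fake_search)
print("\n".join(seeds))
```

The resulting list plays the role of the seed file handed to the recording step. Nothing is filtered for relevance here, which mirrors the choice described above of favouring speed over precision.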

Finally, all the data from this special collection was published. Researchers are invited to use this information for projects or studies and to compete for the annual Arquivo.pt Award.

Legislative Elections 2025 data set

The dataset Legislative elections 2025: list of web pages with electoral content for preservation at Arquivo.pt was published on the open data portal Dados.gov.pt.

Find out more about electoral collections from previous years

MOOC on Arquivo.pt and web archives launched and open to the community

May 16, 2025 by Ricardo Basílio

Last updated on May 21st, 2025 at 11:42 am

The online training programme on Arquivo.pt, entitled The Web of the Past: Preservation and Research, has been launched and is open free of charge on the NAU platform to anyone who wants to deepen their knowledge of web archiving and Arquivo.pt services.

Daniel Gomes, manager of Arquivo.pt, who developed this training programme, announced it first-hand at the Faculdade de Letras da Universidade de Coimbra, during the workshop Digital preservation: tools and practices, held on May 7, 2025.

Registration open on the NAU platform for the MOOC web archiving

NAU – Sempre a Aprender is the e-learning platform of FCCN, the digital services unit of the Foundation for Science and Technology (FCT). The NAU initiative focuses on supporting the publication and promotion of content in the Massive Open Online Course (MOOC) format in Portuguese.

The aim of this programme is to develop skills in searching the Web’s digital memory, with an emphasis on using Arquivo.pt both in everyday life and in the context of studies and research.

The programme is divided into four courses:

Preservação da web e arquivos (Preserving the web and archives)
Pesquisar e aceder ao passado com o Arquivo.pt (Search and access the past with Arquivo.pt)
Bem publicar para bem preservar (Publish well to preserve well)
Casos de uso do Arquivo.pt (Arquivo.pt use cases)

No special requirements are needed, apart from a computer with Internet access and a browser such as Google Chrome or Internet Explorer.

Spread the word: arquivo.pt/mooc

Know more

Interview Internet Day, May 17, published on NAU website

Arquivo.pt at the University of Coimbra to talk about digital preservation

May 10, 2025 by Ricardo Basílio

Last updated on May 22nd, 2025 at 06:45 pm

Arquivo.pt took part in the workshop entitled “Digital preservation: tools and practices”, promoted by the Faculty of Letters of the University of Coimbra, on the afternoon of May 7, 2025. Moderated by Inês Santos, we highlight the initial panel with excellent speeches by Moisés Rockembach (University of Coimbra), Humberto Innarelli (Unicamp, Brazil) and Daniel Gomes (Arquivo.pt, digital service of FCCN-FCT).

The aim of the meeting was to offer the community a critical reflection on new trends in digital preservation tools and practices.

Digital preservation is a cross-cutting issue for organizations, as they all produce and generate information in digital format. There is a growing range of tools and solutions that promise greater efficiency in information processing. Many are labeled Artificial Intelligence. Such an abundance of products and frameworks calls for greater discussion and a critical approach. And this was achieved brilliantly by the panel of speakers.

Three approaches to Artificial Intelligence and Digital Preservation

This meeting brought together three authors of works on digital preservation at the Amphitheatre III of the Faculty of Letters of the University of Coimbra and discussed different approaches.

Moisés Rockembach, co-author with Caterina Pavão of Arquivamento da Web e preservação digital (Archiving the Web and Digital Preservation), the first work in Portuguese on web archives, focused his presentation on the impact of Artificial Intelligence on digital preservation systems, namely on searching for and accessing information, in classification and indexing processes, for example. With regard to the impact of the new tools that digital technology offers us, he referred to a phrase by Demi Gretscko: “The process of searching for and capturing information described in the text could certainly be improved in the future, especially when considering the contribution of new tools, such as those of Artificial Intelligence”.

There are Artificial Intelligence tools that offer interesting new ways and formats for accessing information. Archiving must take this reality into account and test the extent to which it can transform the way in which many types of content are disseminated and accessed. One example to illustrate this idea was the presentation of a Podcast generated by Artificial Intelligence, based on chapter 2 of the book on Web Archives, which deals with digital preservation policies.

Link to Podcast generated by Artificial Intelligence (published on Instagram, in Portuguese)

Humberto Innarelli, author of Criptex da preservação digital (Digital preservation cryptex), coordinator of the Arquivo Edgard Leuenroth (AEL) and specialist archival researcher at Unicamp, São Paulo and PhD professor at the Paula Souza Centre, São Paulo, posed the question of the future of digital preservation. Until now, the practice for preserving dynamic digital content has been to convert it into static documents. On the other hand, information is increasingly given to us dynamically, from databases or algorithms and Artificial Intelligence. What’s the next step? Archival practice needs to look not only at metadata, as it has done in recent years, but also at what explains how the information was generated (what we might call paradata). This is the only way to put archives and digital preservation in the long-term perspective. A hundred or two hundred years from now we should still be able to access the digital information produced today.

Daniel Gomes, editor of the book The Past Web and founder of Arquivo.pt, discussed the issue of Artificial Intelligence as it relates to non-artificial, human-produced content. What added value do tools that generate text, images, audio or video bring? If we consider, for example, that a Podcast on digital preservation used a book written by a human author as its basis, what new knowledge did it generate? Little or none. So, what has come to be called Artificial Intelligence can be considered a way of presenting human knowledge and in no way exempts humanity from continuing to think, research and produce new knowledge.

Arquivo.pt preserves content that has been published by individuals and organizations and in this sense is a unique source of its kind. Information published on the web is important for reporting and better understanding recent history, since the 1990s. Any Artificial Intelligence tool will have to go back to the point where the information was created by people. The human origin of the content preserved by Arquivo.pt, and the same can be said of traditional archives, makes them of enormous value, even considering their economic value. How much is the information stored in a web archive worth?

New MOOC (Massive Open Online Course) about web archiving

Daniel Gomes, Manager of Arquivo.pt, has announced first-hand the online course on the NAU platform: The Web of the Past: Preservation and Research (in Portuguese).

The online course or MOOC (Massive Open Online Course) is available for those who want to deepen their knowledge of web preservation.

The short link for dissemination is arquivo.pt/mooc

Preserved Arquivo.pt data and its automatic processing by APIs

Vasco Rato, developer of Arquivo.pt, showed how the automatic processing interfaces, Application Programming Interfaces (APIs), work.

Arquivo.pt data can be processed by Artificial Intelligence. The works competing for the Arquivo.pt Award have already demonstrated this, as have projects such as GlórIA, a Large Language Model developed at NOVA-FCT.

Finally, Ricardo Basílio, digital curator, showed how anyone can save a page or an entire website on their own computer in a standardized format, compatible with web archives. ArchiveWeb.page and browsertrix-crawler were used for this, as training tools. This practice allows the community to be increasingly active in preserving institutional information published on the Web.

Agenda

14h30 Panel – Moderator: Inês Santos, University of Coimbra

Digital Preservation and Artificial Intelligence – Moisés Rockembach, University of Coimbra – Slides
Cryptex for Digital Preservation: The Next Step – Humberto Innarelli, Unicamp – Slides
Arquivo.pt and Web Preservation – Daniel Gomes, FCCN-FCT – Slides

16h00 Break

Open Data for Research. Automatic information processing through APIs – Vasco Rato, FCCN-FCT – Slides
Demo – Archiving the Web: do-it-yourself – Ricardo Basílio, FCCN-FCT – Slides
- Manual recording demo with ArchiveWeb.page
- Automatic recording demo with Browsertrix-crawler

17h00 – Final

Image gallery

Images on the Coimbra University social media

Video of some moments from the event (published on Facebook)

Workshop at the Faculty of Letters of the University of Coimbra

Arquivo.pt in Coimbra at scientific computing event Jornadas FCCN 2025

May 10, 2025 by Ricardo Basílio

Last updated on May 17th, 2025 at 12:52 pm

The Arquivo.pt team was in Coimbra between 6 and 8 May, at Jornadas FCCN to promote the preservation of the Portuguese Internet, as dissemination and promotion are an important part of its mission.

The Jornadas FCCN event is the responsibility of FCT’s digital services and annually brings together hundreds of participants from higher education institutions and other entities linked to science and technology.

On Tuesday morning, Pedro Gomes presented the highlights of the FCCN Zapping session and in the afternoon, from 4.30pm to 6pm, there was the Arquivo.pt session, Hands on for archiving the Web.

On Wednesday 7th, at 2.30pm, the Arquivo.pt team went to the University of Coimbra to take part in a meeting organised by the Faculty of Arts and Humanities (FLUC) entitled Digital preservation: tools and practices (Amphitheatre III, Floor 4).

Late on Wednesday afternoon, Daniel Gomes took part in the session Democratising AI: making Artificial Intelligence accessible to all on the contribution of Arquivo.pt to LLM AMÁLIA.

Arquivo.pt highlights at FCCN’s Zapping session

Pedro Gomes, who is in charge of Arquivo.pt’s collections, showed the oldest image archived on Arquivo.pt, which is on the old University of Coimbra website. He highlighted the new functionality that allows Flash content to be played, Arquivo.pt’s statistics, the awards and the datasets.

Hands-on web archiving

This session, led by Ricardo Basílio, digital curator at Arquivo.pt, showed how to save web pages in standardised format using your own computer.

We believe that a ‘do-it-yourself!’ training is part of Arquivo.pt’s mission to promote the preservation of the Internet. By showing how website recording works, we’re also strengthening the community’s connection to Arquivo.pt.

For those who need to save high-quality copies of websites, this session will help. Participants were challenged to record static pages and others with interactive content, videos and social networks. Based on the questions that arose during the practical exercises, we clarified doubts and showed that archiving web content is very easy.

We used the ArchiveWeb.page extension, a tool from Webrecorder.net, which the participants could obtain free of charge and install on their own computers.

If you are a computer scientist or advanced IT user

For those who need to save entire websites automatically, we’ll briefly mention Browsertrix-crawler, an advanced tool that runs in a Docker container on Linux. Computer scientists and advanced IT users are all invited to try their hand at recording and archiving websites.

The demonstrations and exercises we propose using ArchiveWeb.page or Browsertrix-crawler also apply to advanced use cases and respond to organizations’ day-to-day web archiving needs.

Materials for the “hands-on” session

Democratising AI: making Artificial Intelligence accessible to everyone

On the second day of the FCCN Conference, 8 May 2025, in the session dedicated to Artificial Intelligence, Daniel Gomes, from FCCN-FCT, and João Magalhães, from NOVA-FCT, presented “AMÁLIA: Automatic Multimodal Language Assistant with AI”.

Daniel Gomes explained how Arquivo.pt is used for large-scale processing, specifically through the Arquivo.pt Application Programming Interfaces (APIs).

APIs allow researchers to access information from Arquivo.pt automatically and develop various applications in research projects. For example, projects such as Conta-me Histórias, the Portuguese language model GlórIA LLM and, currently, AMÁLIA LLM have used APIs.
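To give an idea of what automatic access looks like, here is a minimal Python sketch of a full-text query. It rests on assumptions that should be checked against the Arquivo.pt API documentation: the endpoint `https://arquivo.pt/textsearch`, the parameters `q`, `from`, `to` and `maxItems`, and the response fields `response_items`, `tstamp`, `title` and `linkToArchive`. The sample response is hand-written for illustration and is not real archive data, so the example runs without network access.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint and parameter names; confirm against the Arquivo.pt API docs.
TEXTSEARCH_ENDPOINT = "https://arquivo.pt/textsearch"


def build_query_url(query, start, end, max_items=10):
    """Build a full-text search URL restricted to a capture-date window."""
    params = {"q": query, "from": start, "to": end, "maxItems": max_items}
    return TEXTSEARCH_ENDPOINT + "?" + urlencode(params)


def summarise(response_text):
    """Return (timestamp, title, archived link) for each item in a JSON response."""
    data = json.loads(response_text)
    return [
        (item.get("tstamp", ""), item.get("title", ""), item.get("linkToArchive", ""))
        for item in data.get("response_items", [])
    ]


url = build_query_url("eleições legislativas", "20250501000000", "20250531235959", max_items=5)
print(url)

# Illustrative, hand-written response (not real archive data).
sample = json.dumps(
    {
        "response_items": [
            {
                "tstamp": "20250519120000",
                "title": "Example page",
                "linkToArchive": "https://arquivo.pt/wayback/20250519120000/https://example.pt/",
            }
        ]
    }
)
for row in summarise(sample):
    print(row)
```

In a real project the URL would be fetched with an HTTP client and the response body passed to `summarise`. Keeping the two steps separate makes the parsing easy to test offline.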

Presentation slides

Images gallery

Jornadas FCCN

Arquivo.pt Links Dataset: Unveiling the Web’s Hidden Structure

April 30, 2025 by Ricardo Basílio

Last updated on May 13th, 2025 at 02:29 pm

The interconnected nature of the World Wide Web has long fascinated researchers and technologists alike. Today, we are thrilled to announce the release of the Arquivo.pt Links dataset, a comprehensive collection that opens new possibilities for understanding and analyzing web connectivity patterns.

The dataset encompasses more than 139 million webpage URLs, each accompanied by crucial metadata about their incoming links – both the source URLs and their corresponding anchor texts, i.e., visible and clickable text in hyperlinks. This rich collection of interconnection data provides researchers with a unique window into the web’s underlying structure.

The importance of hyperlinks in web architecture cannot be overstated. They serve as the fundamental building blocks of web navigation and discovery, enabling both users and automated systems to traverse the vast landscape of online content.

Links formed the foundation of Google’s revolutionary PageRank algorithm, which transformed our approach to information retrieval and web search. PageRank’s fundamental insight – that a page’s importance could be measured by analyzing its incoming links – revolutionized search technology and remains influential in modern information retrieval systems.
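As a rough illustration of that insight, the sketch below runs a textbook power-iteration PageRank over a small map of incoming links. It is a simplified teaching version under stated assumptions, not Google’s production algorithm and not code shipped with the dataset: the damping factor of 0.85, the fixed number of iterations and the three-page toy graph are all choices made for this example.

```python
def pagerank(inlinks, damping=0.85, iterations=50):
    """Power-iteration PageRank over a map of page -> list of pages linking to it."""
    pages = set(inlinks)
    for sources in inlinks.values():
        pages.update(sources)

    # Count outgoing links per page from the incoming-link map.
    out_degree = {page: 0 for page in pages}
    for sources in inlinks.values():
        for source in sources:
            out_degree[source] += 1

    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Pages without outgoing links share their rank evenly with every page.
        dangling = sum(rank[p] for p in pages if out_degree[p] == 0)
        new_rank = {}
        for page in pages:
            incoming = sum(rank[s] / out_degree[s] for s in inlinks.get(page, []))
            new_rank[page] = (1 - damping) / n + damping * (incoming + dangling / n)
        rank = new_rank
    return rank


# Toy graph: A links to B and C, B links to C, C links to A.
toy_inlinks = {"B": ["A"], "C": ["A", "B"], "A": ["C"]}
ranks = pagerank(toy_inlinks)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Page C, which receives links from both other pages, ends up ranked above page B, which receives only half of A’s outgoing links.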

By making this dataset publicly available, Arquivo.pt enables researchers to explore similar innovative approaches to web analysis and search engine development. The dataset opens up numerous exciting research possibilities across multiple domains:

Researchers can implement and experiment with various ranking algorithms, from classic approaches like PageRank to modern machine learning-based techniques. The inclusion of anchor texts provides valuable semantic context that can enhance search relevance and document classification.
The dataset enables deep analysis of web topology and link structures. Researchers can investigate questions about web connectivity patterns, identify clusters of related content, and study how information spreads across the web through link networks.
The anchor text associated with each link offers a rich source of human-generated descriptions of web content. This data can be particularly valuable for developing and testing document summarization algorithms, semantic analysis tools, and automated classification systems.
For web archiving researchers, this dataset provides insights into how web pages are connected and referenced over time, offering valuable data for studying web preservation strategies and digital heritage maintenance.

Methodology

The process begins with a temporal snapshot of web pages from a specific time period (collection). During this initial phase, our systems analyze each captured page, extracting all outgoing hyperlinks along with their associated anchor texts and capture timestamps. This creates a preliminary mapping of how pages connect to one another within our captured timeframe.

What makes this dataset particularly valuable is its inverted link structure. Rather than organizing the data around source pages and their outgoing links, we’ve created an inverted map that centers on destination pages and their incoming links. This approach is particularly useful for analyzing a page’s importance or authority within the web’s structure, as it provides immediate access to all pages that reference or point to a given URL.

Consider a traditional link structure where Page A links to Pages B, C, and D. In our inverted structure, we instead see entries for Pages B, C, and D, each listing Page A as a source of incoming links. This reorganization of the data facilitates more efficient analysis of page authority and influence, making it particularly valuable for researchers working on ranking algorithms or studying information flow patterns across the web.
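The inversion described above can be sketched in a few lines of Python. The record layout here (page names as keys, `(target, anchor text)` pairs as values) is an assumption made for illustration and is not the dataset’s actual schema.

```python
from collections import defaultdict


def invert_links(outlinks):
    """Turn source -> [(target, anchor)] into target -> [(source, anchor)]."""
    inlinks = defaultdict(list)
    for source, targets in outlinks.items():
        for target, anchor in targets:
            inlinks[target].append((source, anchor))
    return dict(inlinks)


# Page A links to B, C and D; a second page E also links to B.
outlinks = {
    "pageA": [("pageB", "about B"), ("pageC", "see C"), ("pageD", "D docs")],
    "pageE": [("pageB", "B again")],
}
inlinks = invert_links(outlinks)
for target, sources in inlinks.items():
    print(target, sources)
```

With the inverted map, finding every page that points to `pageB`, together with the anchor texts used, is a single dictionary lookup instead of a scan over all source pages.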

The Arquivo.pt links dataset combines three distinct web collections:

PWA9609 (1996-2009): 89 million pages capturing early Internet evolution, focused on the .pt domain. This historical collection provides insights into early web linking patterns.
AWP38 (Oct-Nov 2021): 44 million pages offering a contemporary snapshot of web connectivity, with emphasis on the .pt domain while including broader Internet content.
FAWP47 (Oct-Dec 2021): 8 million pages from daily captures of .pt domain content, designed to track short-term changes in link patterns.

Getting Started with the Dataset

Researchers can access the complete dataset. The data is provided in a format that supports efficient processing and analysis, making it suitable for both large-scale studies and focused investigations.

Conclusion

The release of the Arquivo.pt links dataset represents a significant contribution to the web science research community. By making this rich collection of web connectivity data freely available, we hope to facilitate innovative research and deepen our understanding of the web’s complex structure.

We encourage researchers to explore this dataset and look forward to seeing the novel insights and applications that emerge from its analysis. Whether you’re interested in developing new search algorithms, studying web topology, or investigating content relationships, this dataset provides a robust foundation for your research.

Arquivo.pt took part in the IIPC Web Archiving Conference in Oslo

April 15, 2025 by Ricardo Basílio

Last updated on July 4th, 2025 at 08:32 am

Four members of the Arquivo.pt team were in Oslo, Norway, to take part in the General Assembly of the International Internet Preservation Consortium and the Web Archiving Conference, from 8 to 10 April 2025.

The National Library of Norway was the host institution for this international event. The Norwegian Web Archive is part of the Library’s mission and is held in a second location specialising in digital preservation, in the city of Mo i Rana, in the centre of the country.

The first day, 8 April, was dedicated to the General Assembly, exclusively for members of the consortium, and to the working groups in which Arquivo.pt plays an active role. The Content Development Working Group is dedicated to the creation of thematic collections and has the participation of Arquivo.pt in the ‘Street Art’ collection. The Training Working Group creates training content and training actions, such as IIPC webinars and face-to-face workshops.

The Web Archiving Conference was held on 9 and 10 April, an event open to all entities and initiatives related to web preservation and archiving.

Arquivo.pt’s contribution

Arquivo.pt presented its services and initiatives for interacting with the community, such as its collaboration with the Sines Municipal Archive in preserving content of local interest. The concern with access to content, both for researchers and for citizens in general, is an aspect that is highly appreciated by the IIPC community.

Arquivo.pt toolkit for web archiving – Lightning talk session 1 – Daniel Gomes – Slides, video
Arquivo.pt Query Logs – Lightning talk session 3 – Pedro Gomes – Slides, video
Collaborative collections at Arquivo.pt: four years of recordings from the city of Sines (Portugal) – Lightning talk session 4 – Ricardo Basílio – Slides, notes, video
API/Bulk access and its usage – Poster slam – Vasco Rato – Poster
Arquivo.pt annual awards: a glimpse since 2018 – Poster slam – Daniel Gomes – Slides

Image gallery

IIPC Web Archiving Conference 2025, Oslo

Arquivo.pt training with APDSI. Sign up!

March 13, 2025 by Ricardo Basílio

Arquivo.pt Webinar Cycle with APDSI

Last updated on April 5th, 2025 at 01:10 pm

APDSI – Associação para a Promoção e Desenvolvimento da Sociedade da Informação (Association for the Promotion and Development of the Information Society) promoted a Cycle of Webinars on Arquivo.pt, held between March 20 and April 1, 2025.

This Webinar Cycle, dedicated to the preservation of cultural memory published on the Web, is a collaboration between APDSI and Arquivo.pt, the FCCN digital services of the Fundação para a Ciência e a Tecnologia.

Luís Vidigal, Founding Partner of APDSI, Filipa Fixe and João Tavares, Board Members, introduced the theme of each session and the Arquivo.pt team showed how the preservation of web content works, allowing organizations and citizens to access the web of the past.

The four sessions had a total of 121 participants.

Program

Webinar 1 – March 20 – Arquivo.pt: a new tool for researching the past. Daniel Gomes, Head of Arquivo.pt – Video, slides
Webinar 2 – March 25 – To publish well, to preserve well. Pedro Gomes, Arquivo.pt Collections Manager – Video, slides
Webinar 3 – March 27 – Access and automatic processing of information preserved from the Web through APIs. Vasco Rato, Web developer – Video, slides
Webinar 4 – April 1 – Archiving the Web: do-it-yourself! Ricardo Basílio, Digital Curator – Video, slides

Registration (free but required)

Learn more

Arquivo.pt took part in the eArchiving Portugal workshop

March 3, 2025 (updated March 11, 2025) by Ricardo Basílio

Professor José Borbinha, eArchiving workshop, 25 February 2025, at the Instituto Superior Técnico in Lisbon (José Tribolet Room)

Last updated on March 11th, 2025 at 04:22 pm

Arquivo.pt took part in the eArchiving Portugal workshop, held at the Instituto Superior Técnico on 25 February 2025, at the invitation of Professor José Borbinha. He was one of the first people to do web archiving in Portugal, when he worked at the Biblioteca Nacional in the 1990s.

Professor José Borbinha can tell in the first person, better than anyone, the small, almost epic episodes and the actions of the first ‘heroes’ that led to the creation of a web archive in Portugal. He sees Arquivo.pt as an essential service for digital preservation and for safeguarding organisations’ communication heritage.

The event had a hybrid format, with 50 in-person and 270 online participants, and was open to all public and private organisations concerned with digital preservation and the management of information of any type or format. This includes the content of websites and social networks!

Heads of municipalities and local government organisations took part in the event, responding to the call from the Direção-Geral do Livro, dos Arquivos e das Bibliotecas (DGLAB). Their participation was an opportunity to show how Arquivo.pt can help preserve institutional websites and comply with Portaria n.º 112/2023, de 27 de abril.

eArchiving, a European initiative born in Portugal

The eArchiving Initiative, whose main focus is digital cultural heritage, was created at a meeting of European partners in Lisbon.

‘It was precisely in this room (the José Tribolet room at the Instituto Superior Técnico) that eArchiving began eleven years ago, on 29 May 2014,’ recalled José Borbinha (INESC-ID), host and organiser of the workshop.

The eArchiving initiative is managed on behalf of the European Commission by the E-ARK Consortium, which includes Portuguese partners KEEP Solutions LDA and INESC-ID. The consortium also includes the AIT Austrian Institute of Technology GmbH, the lead partner, and the DLM Forum MTÜ.

Janet Anderson, manager of eArchiving, showed the progress made in eleven years in the field of digital preservation. The projects funded by the European Union within the consortium have resulted in the development of specifications, software, training and knowledge about digital preservation.

This was followed by a presentation of contributions to digital preservation in Portugal: DGLAB, by Pedro Penteado; Centro Hospitalar São João, by Fernanda Gonçalves; Ministério da Justiça, by Alexandra Lourenço and Cristina Soares; and Arquivo.pt, by digital curator Ricardo Basílio.

Finally, Miguel Ferreira spoke on behalf of DLM Forum MTÜ, a community in which KEEP Solutions LDA participates by developing software. Taking a more technical approach, he showed how the metadata in the E-ARK packaging specifications is structured to fulfil the requirements of digital preservation.

How to use Arquivo.pt to preserve institutional websites

Digital preservation requires collaboration, both within and between organisations, and this workshop served that purpose: sharing good practices, disseminating tools and services, and connecting people.

Arquivo.pt highlighted three services from its catalogue for preserving content published on the web:

High-quality archive of websites (on-demand)
Memorial: preserves your old website's information before it is deactivated
Training on web preservation

Arquivo.pt services can be used, for example, by municipalities to preserve content published on institutional websites.

Arquivo.pt training, such as webinars or face-to-face sessions, is useful for empowering organisations to take care of institutional content, including social media content, which requires an alternative strategy.

Arquivo.pt presentation

Learn more

Video of all speakers: coming soon at E-ARK