Municipality of Sines and Arquivo.pt together on the International Archives Day

Last updated on June 27th, 2022 at 08:40 am

The Municipal Archive of the Municipality of Sines and Arquivo.pt celebrated the International Archives Day, June 9, at the Salão Nobre dos Paços do Concelho, with a Workshop on preserving the digital memory of Sines (Portugal).

The meeting was broadcast online to share this experience of collaborative curation of Web content with the community of archivists.

Collaboration between a municipal archive and a web archive

This meeting continued a collaboration that the two teams developed during the pandemic.

The Arquivo Municipal de Sines made a selective and systematic collection of Web content related to the Municipality of Sines, with the collaboration of local media, such as Rádio Miróbriga and Rádio Sines.

In turn, Arquivo.pt contributed training on tools such as Webrecorder.net, which records in a standardized format, and prepared useful services such as SavePageNow, which allows pages to be recorded on the fly directly on Arquivo.pt.

Local history is better with preserved Web pages

This collaboration resulted in the preservation of thousands of Web pages (about 200 gigabytes of information) about the experience of the pandemic in the geographical area of Sines and Santiago do Cacém.

The copies of the Web archive files (WARCs) sent to Arquivo.pt have been integrated so that they can become available.

Presentations

Cryptocurrencies and web curation on the 15th anniversary in Viseu

Last updated on August 2nd, 2024 at 12:22 pm

Session of Arquivo.pt at the Jornadas 2022

Arquivo.pt was at the annual meeting Jornadas de Computação Científica 2022, held from May 31st to June 2nd, at the Instituto Politécnico de Viseu.

Cryptocurrencies and web curation were the starting point for sharing the news of the service and talking about the work developed since the last edition of the Jornadas.

Zapping session remembered the 15 years of Arquivo.pt

Arquivo.pt was created in 2007 with the goal of collecting the Portuguese Web. After fifteen years it continues its mission, collecting, but mainly facilitating the access to preserved contents, both for the researcher and the common citizen.

In the Zapping session at the conference, in which each FCCN service presented its work, Arquivo.pt was highlighted for its long-standing activity in Web preservation.

Training with the Library of the Escola Superior de Tecnologia e Gestão

The Arquivo.pt team was in the Library of the School of Technology and Management (ESTGV) in an extra session of the conference dedicated to digital preservation, mainly to institutional content published on the Web.

The training was promoted by the Library team, especially Dr. Rosa Silva, Coordinator of the service, and had the participation of the community. Besides the presentations, there was an opportunity to share ideas and point out future collaborations.

Paulo Medeiros, responsible for the service of Culture, Communication and Documentation, presented the institutional channels of the Instituto Politécnico de Viseu. These channels are increasingly present on the Web, such as the magazine Polistécnica that went digital in 2012, the scientific journal Millenium and the video channel Politécnico TV.

Arquivo.pt showed how any person or institution can have their Web contents preserved in an adequate format. To save contents directly on Arquivo.pt you can use the new SavePageNow recording service. To make a local Web archive you can use ArchiveWeb.page – Webrecorder.net.

Arquivo.pt APIs presented to Internet technologies students

The Arquivo.pt team was in the classroom, thanks to the excellent welcome given by Prof. Dr. Valter Alves, director of the Design and Multimedia Technology course. Vasco Rato, Web developer at Arquivo.pt, presented the Arquivo.pt APIs (Application Programming Interfaces) for the automatic processing of preserved information.

By using the Arquivo.pt APIs, the students can complete assignments for their technology subjects and compete for the Arquivo.pt Award.

Image gallery

Daniel Gomes at the Arquivo.pt session at Jornadas FCCN 2022 in Viseu

Pedro Gomes at the Arquivo.pt session at Jornadas FCCN 2022 in Viseu

Ricardo Basílio at the Arquivo.pt session at Jornadas FCCN 2022 in Viseu

Arquivo.pt session at Jornadas FCCN 2022 in Viseu

Training session at the ESTGV Library

Class in the Design and Multimedia Technology course at ESTGV

Web pages for the history of the Instituto Politécnico de Viseu

In 2018, the library team developed a project with the participation of young students that resulted in a documentary short film where memories of old web pages, preserved by Arquivo.pt, were included.

Arquivo404 presents web-archived pages instead of “pages not found”

Last updated on November 14th, 2023 at 02:46 pm

Does your website present “Error 404 – Page not found” messages to your users?

Arquivo.pt offers a solution for this problem through Arquivo404.

Just insert a single line of code in the page that generates the 404 error message on your website and web-archived pages will be presented to your users instead of pages not found.

See these examples on websites that installed arquivo404.

How does Arquivo404 work?

When a page is no longer on a website, Arquivo404 checks if a preserved version exists.

When a user tries to access a page that is no longer available on a website, Arquivo404 automatically checks if there is a version of that page preserved in Arquivo.pt.

If the page exists in Arquivo.pt, a link is presented so that the user may visit this version. If it does not exist, the normal error page is displayed.
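The decision described above can be sketched in a few lines. This is only an illustration of the logic, not the actual arquivo404 code: the function name `not_found_message`, the message wording and the `find_preserved_version` lookup are hypothetical, and the lookup is passed in as a parameter so that no particular Arquivo.pt interface is assumed.

```python
from typing import Callable, Optional


def not_found_message(
    requested_url: str,
    find_preserved_version: Callable[[str], Optional[str]],
) -> str:
    """Return the text a "page not found" page could show for requested_url.

    find_preserved_version is a hypothetical lookup: it returns the address
    of a preserved version of the page, or None when there is none.
    """
    default_message = "Error 404 - Page not found"
    archived_url = find_preserved_version(requested_url)
    if archived_url is None:
        # No preserved version: show the normal error page.
        return default_message
    # A preserved version exists: add a link the user may visit.
    return f"{default_message}\nVisit a preserved version of this page: {archived_url}"
```

Keeping the lookup separate from the message makes it easy to test both outcomes with a stand-in function before wiring in a real check.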

See Arquivo404 at work in this example of an error page that presents a link automatically generated by Arquivo404.

How to install arquivo404 on your website?

The simplest implementation of arquivo404 is to insert the following JavaScript into the HTML code that generates the “Page not found” message:

<script type="text/javascript" src="https://arquivo.pt/arquivo404.js" async defer onload="ARQUIVO_NOT_FOUND_404.call();"></script>

The code in Arquivo404 can easily be adapted. You can for example create a customised error message.

Hint for WordPress websites: When editing the 404 error page and inserting the arquivo404 script inside the <body>, you must put the <!-- wp:html --> tag at the beginning and the <!-- /wp:html --> tag at the end, otherwise the script will be deleted.

If you have any questions or issues, please contact us!

Know more

Short link to this page: arquivo.pt/arquivo404en

SavePageNow to record webpages immediately on Arquivo.pt

Last updated on August 23rd, 2022 at 11:51 am

Arquivo.pt launched a new version, called Francisco, on the 19th of January 2022.

The SavePageNow function stands out, allowing anyone to save a Web page to be preserved by Arquivo.pt.  It is only necessary to enter a page’s address and browse through its contents.

Arquivo.pt SavePageNow was inspired by the Internet Archive's Save Page Now and implemented using Webrecorder pywb.

For example, a publication on the FCCN blog marking the 30th anniversary of the Internet in Portugal was saved with SavePageNow and preserved at Arquivo.pt. In this way, anyone using SavePageNow helps prevent content published on the Internet from being lost.

 

Help us to improve!

The user interfaces have been recoded and optimized, so we need your help to test them on different devices of various brands (e.g. mobile phones, tablets, laptops).

If you detect any problems, please contact us!

Remember to always send the address of the page where you detected the problem.

To Know more

 

How to preserve Web references from Wikipedia?

Last updated on May 19th, 2022 at 07:05 pm

Wikimedia Portugal has started a collaboration with Arquivo.pt that aims at raising the community’s attention to the preservation of contents published on Wikipedia.

Eighty percent of the pages published on the Web disappear or are changed just one year after their publication. At the same time, Wikipedia articles rely mostly on information published on the Web. The disappearance of reference information undermines the reliability of Wikipedia articles.

Webinar cycle “Cultural Heritage on the Web: how to preserve references in Wikipedia?”

The cycle of Webinars, promoted by Wikimedia Portugal, includes educational content that enriches the training of information and communication professionals but also the digital literacy of any citizen.

Arquivo.pt and the preservation of digital memory (1st Webinar)

Gonçalo Themudo, President of Wikimedia Portugal, introduced the 1st webinar of the cycle entitled Cultural heritage on the Web: how to preserve references in Wikipedia?. He stressed the importance of preserving the references (URLs) used by authors when publishing articles in Wikipedia. Daniel Gomes, Manager of Arquivo.pt, showed how Arquivo.pt preserves Web contents and how the community of Wikipedia authors can contribute to the effective preservation of those contents.

  • Held on February 22, 2022
  • Speaker: Daniel Gomes, Arquivo.pt
  • Slides
  • Video

Automatic access and processing of preserved information from the Web through APIs (2nd Webinar)

This webinar presents the Arquivo.pt APIs (Application Programming Interfaces), which enable the automatic processing of historical information preserved from the Web in order to develop innovative and useful applications for organizations. It is mainly intended for IT professionals (e.g. Web developers, Web designers, Web marketers).

  • Date: 22 Mar. 2022 15:00 – 16:30
  • Speaker: Vasco Rato, Arquivo.pt
  • Slides
  • Video
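As a rough idea of what automatic access looks like, the sketch below only builds the address of a full-text search request and does not contact any server. The endpoint `https://arquivo.pt/textsearch` and the parameter names `q`, `maxItems`, `from` and `to` are assumptions that are not stated in this text and should be checked against the Arquivo.pt API documentation; the function name `build_textsearch_url` is hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed endpoint; verify it in the Arquivo.pt API documentation.
TEXTSEARCH_ENDPOINT = "https://arquivo.pt/textsearch"


def build_textsearch_url(
    query: str,
    max_items: int = 10,
    from_ts: Optional[str] = None,
    to_ts: Optional[str] = None,
) -> str:
    """Build a search request URL (parameter names are assumptions)."""
    params = {"q": query, "maxItems": max_items}
    if from_ts is not None:
        params["from"] = from_ts  # assumed format: YYYYMMDDhhmmss
    if to_ts is not None:
        params["to"] = to_ts
    # urlencode escapes spaces and special characters in the query.
    return f"{TEXTSEARCH_ENDPOINT}?{urlencode(params)}"


print(build_textsearch_url("Sines pandemia", max_items=5))
```

The resulting address could then be fetched with any HTTP client and the response processed by the application.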

Web archiving: do it yourself! (3rd Webinar)

Webinar that presents how to preserve cultural information of a municipal and national nature published on the Web. It demonstrates through practical cases how anyone can archive information published on the web in a proper format that will allow its preservation for the future using free tools. This Webinar is intended for any Internet user but is particularly useful for those responsible for communication and information management in organisations.

  • Date: 19 Apr. 2022 15:00 – 16:30
  • Speaker: Daniel Gomes, Arquivo.pt
  • Slides
  • Video

Online Cafe with Arquivo.pt continues

Last updated on August 6th, 2024 at 02:11 pm

Share this page: arquivo.pt/onlinecafe

Welcome to the third season of the Online Cafe with Arquivo.pt

Talk directly to the Arquivo.pt team and get answers to all your questions! Arquivo.pt has launched a new cycle of online sessions in which the team chats with you. Brief introductory presentations will be given, leaving time to ask all your questions about how to get more out of Arquivo.pt or how to apply to the Arquivo.pt Awards.

Sessions

February 17, 2022 – Primeiras páginas de jornais online portugueses

“Primeiras páginas de jornais online portugueses” (Front pages of Portuguese online newspapers) presents an interactive graphical analysis of the front pages of Portuguese online newspapers. For this study, specific items within the newspaper design were analysed, thus allowing trends to be observed over time.

Susana Parreira explains how she developed this work as part of her Masters, with the collaboration and guidance of Ana Boavida (Universidade de Coimbra), Ana Sabino (Instituto Politécnico de Castelo Branco) and Penousal Machado (Universidade de Coimbra).

22nd session –  January 20, 2022 – Politiquices

Politiquices.pt lets users explore relations of support or opposition between political personalities and parties, as expressed in news headlines. This application uses information preserved in Arquivo.pt to create an ontology of relations. It uses Natural Language Processing technology. David Batista, who won 2nd place in the Arquivo.pt Awards 2021, will explain how he developed his work and demonstrate the applications for researchers and citizens in general.

Special session – World Digital Preservation Day 2021 – Major minors project – November 5

In November, World Digital Preservation Day is broadly celebrated and, to mark this international initiative, Arquivo.pt held an online session open to the community. Special guests of this session were the winners of the Arquivo.pt Award 2021, Leandro Costa, Paulo Martins and José Carlos Ramalho.

Previous seasons

Presentation at the IIPC Web Archiving Conference 2022

Portuguese municipal elections 2021 preserved by Arquivo.pt

Last updated on May 8th, 2023 at 05:09 pm

Thousands of pages about the elections to preserve before they disappear

On 26 September 2021 the local elections were held in Portugal, an event marked by the Covid-19 pandemic. The communication of the candidates was mainly based on the media and publications through the Web.

Electoral websites are of manifest historical importance. However, they are difficult to identify because they appear and disappear quickly. In the case of municipal elections, the number of candidates and the variety of channels used make the task even more challenging.

Arquivo.pt, as in previous elections, launched a special collection to preserve contents concerning the municipal elections.

How was the electoral content published on the Web identified

The first step was the manual identification of election-related content by municipality and parish. For this purpose help was requested from people and organisations with the following initiatives:

  • collaborative list “Municipal Elections 2021: we need your help!”;
  • request for collaboration from the archive services of the 308 municipalities in the identification of electoral sites and candidates of the respective municipality;
  • request to the Parties to send the names of their lead candidates.

The Eyedata – Social Data Lab site was used, which made the names of candidates from all over the country available on the Web.  The Wikipedia page Eleições autárquicas portuguesas de 2021 was also used as a source of information.

This manual identification process resulted in a list of 255 addresses documenting the candidacies for the 2021 Municipal Elections. Note that 61% of the identified addresses pointed to private social media platforms (54% facebook.com, 5% instagram.com and 2% twitter.com).
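The breakdown by platform (54% + 5% + 2% = 61%) is the kind of figure that can be computed directly from a list of addresses. The following is a minimal sketch of that calculation, not the procedure actually used; the function name `platform_shares` is hypothetical and the addresses in the usage example are invented placeholders.

```python
from collections import Counter
from urllib.parse import urlparse

PLATFORMS = ("facebook.com", "instagram.com", "twitter.com")


def platform_shares(urls, platforms=PLATFORMS):
    """Return the percentage of addresses hosted on each platform."""
    counts = Counter()
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        for platform in platforms:
            # Match the domain itself and subdomains such as www.facebook.com.
            if host == platform or host.endswith("." + platform):
                counts[platform] += 1
                break
    total = len(urls)
    if total == 0:
        return {platform: 0 for platform in platforms}
    return {platform: round(100 * counts[platform] / total) for platform in platforms}


# Invented placeholder addresses, for illustration only.
sample = [
    "https://www.facebook.com/candidate-a",
    "https://facebook.com/candidate-b",
    "https://instagram.com/candidate-c",
    "https://example.org/campaign",
]
print(platform_shares(sample))
```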

Much of this content of national interest could not be preserved because these foreign private companies do not allow it.

The list with names of candidates by county, party or coalition was used to create automatic searches in Bing that identified the most relevant electoral contents.

For instance, by combining the term “autárquicas 2021” with the name of a candidate and the respective municipality, one obtains results related to that candidate, such as news, initiatives of his/her campaign or the official page of his/her electoral campaign.
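The combination described above can be sketched as a small query builder. This is an illustration under assumptions: the quoting of names, the function name `build_election_queries` and the candidate in the usage example are hypothetical, and sending the queries to a search service is left out.

```python
from typing import Iterable, List, Tuple


def build_election_queries(
    candidates: Iterable[Tuple[str, str]],
    term: str = "autárquicas 2021",
) -> List[str]:
    """Combine the election term with each (candidate, municipality) pair."""
    queries = []
    for name, municipality in candidates:
        # Quoting the name and municipality is an assumption, meant to keep
        # multi-word names together in the search.
        queries.append(f'{term} "{name}" "{municipality}"')
    return queries


# Invented placeholder candidate, for illustration only.
print(build_election_queries([("Maria Exemplo", "Sines")]))
```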

This methodology was also applied in the Presidential Elections 2021 and in the European Elections 2019. The technical report A transnational crawl of the European Parliamentary Elections 2019 details the applied methodology.

Content collection and availability in Arquivo.pt

Between 22nd August and 8th October 2021, the Arquivo.pt gathered, in an exhaustive manner, pages related to the Local Government Elections 2021.

The resulting collection, called “Municipal Elections 2021” (EAWP39), gathers 31 million files that total 2.7 terabytes of information and will be available one year later.

Researchers who want to study the 2021 Local Elections and need early access to the collected contents can contact Arquivo.pt.

To know more

Memory of events and festivals of art: PARA SEMPRE

Last updated on February 8th, 2022 at 10:57 am

The exhibition Memória de festivais e eventos de arte proposes a look at the Portuguese art scene present on the Web and includes a chronology of these events.

This online information product presents the results of the PARA SEMPRE project in a systematic and structured way.

Online exhibition – arteparasempre.wordpress.com

The project’s second online product will be a directory of references of artists, galleries and projects in the area of contemporary Portuguese art to be made available during 2022, at the Gulbenkian Art Library webpage.

Cycle of Webinars “Art forever on the web”

A cycle of Webinars entitled “Art forever on the web” was held, between April and July 2021, oriented to artists, curators, gallerists and event producers, among others.

On average, 58 people attended each session, and they rated their satisfaction at 4.6 on a scale from 1 to 5. The three sessions aimed at disseminating knowledge about digital preservation of information on the web and requirements for publishing preservable information.

Identification of artists, galleries and projects

The first step was to identify relevant artists, galleries and projects in the contemporary Portuguese art scene. We started from an initial set of 63 agents (artists, galleries and projects), to which 573 artists belonging to the Modern Collection of the Calouste Gulbenkian Foundation and the BAA – FCG Collection of Artist Books and Independent Publishing were added.

Throughout these months, 636 elements were thus identified (social networks and websites active in 2020), which were subsequently analysed.

The conclusions of the analysis carried out within the project were presented in the last webinar, held on July 1, 2021 :

Special feature on art websites and blogs

In April 2021, Arquivo.pt made a special collection based on the initial identification of artists, galleries and projects and obtained 2.8 terabytes of preserved information.

New contents about art websites were recorded, using tools that allow higher quality collections, such as Brozzler and Webrecorder.

A collaborative project of digital curation

“PARA SEMPRE” (forever) is a digital curatorial project applied to the information made available on the web by the several agents of the contemporary Portuguese art scene (artists, galleries and hybrid sites).

Its main purpose is to contribute to the preservation/reuse of past and future pages, to ensure the preservation of the digital memory of current Portuguese art available at Arquivo.pt, and to promote knowledge on this theme by presenting it in a systematized and structured way.

It was created where the missions of two organizations meet: Arquivo.pt, which aims to ensure the preservation of the Portuguese web, and the Calouste Gulbenkian Foundation Art Library, which positions itself as an agent in the development of knowledge about contemporary Portuguese art. The project is part of ROSSIO (Research Infrastructure in the Social Sciences, Arts and Humanities).

Training in collaboration with the City Council of Lisboa

Last updated on December 13th, 2021 at 12:02 pm

A cycle of webinars was held between October and December 2021, organised by the Department of Development and Training of the Municipality of Lisbon, within the digital skills programme Passaporte Competências Digitais of the Câmara Municipal de Lisboa, in collaboration with Centro Qualifica +ValorLx, the ROSSIO Infrastructure and Arquivo.pt (Fundação para a Ciência e a Tecnologia, I.P.).

The aim of this initiative was to present the services of Arquivo.pt and disseminate their use so that the historical heritage published on the web can be preserved and exploited by any citizen.

The sessions were open by registration and had a total of 126 participants (average of 31 per session).

The speakers’ presentations were recorded and can now be accessed, along with the slides from each session.

Sessions held

September 15 – Arquivo.pt. What is it? What is it for?

Daniel Gomes, manager of Arquivo.pt, the public Web preservation service operated by the Fundação para a Ciência e a Tecnologia, I.P., explains how any citizen can use it to consult Web pages from the past in the most diverse cases and talks about the importance of preserving digital memory.

November 11 – API Arquivo.pt: automatic access to preserved Web information

Vasco Rato, web developer of Arquivo.pt, presented the Arquivo.pt’s APIs (Application Programming Interface). These enable the development of innovative and useful applications for organizations through the automatic processing of historical information preserved from the Web.

November 25 – Archive the Web: do-it-yourself!

Ricardo Basílio, digital curator at Arquivo.pt, presented a tutorial on using the Webrecorder.net tools to record Web pages in a standardized format on one's own computer, which allows a person or an organization to build a small-scale Web archive of their own.

December 9 – Publish on the Web: best practices  by Arquivo.pt

Pedro Gomes, the engineer responsible for the Arquivo.pt crawls, addressed the issue of publishing preservable Web content. How much content is in formats that make future access difficult or impossible? These situations were illustrated with practical cases and recommendations on how to avoid them. In short, it all comes down to publishing well in order to preserve well.

Know more about Arquivo.pt training

Arquivo.pt is open to collaborations aiming at training professionals in organizations or common citizens on Web preservation.

Learn about the training modules and contact us.

 

H2020 projects preserved by Arquivo.pt

Last updated on August 5th, 2024 at 04:50 pm

The main objective of Arquivo.pt is to preserve online information for research and education purposes.

Previously, Arquivo.pt identified and preserved Research & Development project websites funded by the European Union during the FP4, FP5, FP6 and FP7 programmes (1994-2013).

Now, Arquivo.pt has contributed to preserving online information that documents R&D projects funded by the Horizon 2020 programme (2014-2021). It preserved 197 million web files (17 TB) related to science for future access.

H2020 projects publish valuable information online but it is being lost

Websites about Research and Development (R&D) projects are increasingly being used to publish and disseminate important scientific information that complements published literature (e.g. data sets, documentation or software).

However, after projects end, the corresponding websites usually disappear, causing a permanent loss of unique and valuable scientific information.

Arquivo.pt automatically identified URLs that document H2020 Research and Development projects

The European Union’s Open Data Portal published a data set from the Community Research and Development Information Service (CORDIS) that documents H2020 research projects. However, of the 31 129 projects listed, only 46% presented a project URL.

Arquivo.pt developed a low-cost methodology that automatically identifies URLs related to R&D projects to be systematically preserved. This automatic identification is achieved by combining open data sets with web search services. The methodology is detailed in a scientific article published at the International Conference on Digital Preservation 2016.

In sum, we extracted 106 300 unique URLs from the following open data sets:

Then, we extracted the acronym and title of the projects from the data sets and automatically searched the web for additional URLs using the Bing Search API.
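The step above can be sketched as follows. This is not the software Arquivo.pt published: the function name `find_additional_urls` is hypothetical, the query format is an assumption, and the web search is passed in as a `search` function so that no particular search API or signature is presumed.

```python
from typing import Callable, Dict, Iterable, List


def find_additional_urls(
    projects: Iterable[Dict[str, str]],
    known_urls: Iterable[str],
    search: Callable[[str], List[str]],
) -> List[str]:
    """Search the web for each project and keep URLs not seen before.

    search is a stand-in for a web search service: it takes a query
    string and returns a list of result URLs.
    """
    seen = set(known_urls)
    additional = []
    for project in projects:
        # Assumed query format: acronym followed by the project title.
        query = f'{project["acronym"]} {project["title"]}'
        for url in search(query):
            if url not in seen:
                seen.add(url)
                additional.append(url)
    return additional


# Usage with a fake search function returning fixed, partly invented results.
projects = [{"acronym": "EXTMOS", "title": "Extended Model of Organic Semiconductors"}]
fake_search = lambda query: ["http://extmos.eu/", "http://example.org/extmos"]
print(find_additional_urls(projects, ["http://extmos.eu/"], fake_search))
```

Tracking the URLs already extracted from the data sets avoids counting the same address twice when the search returns it again.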

All the data sets and tools developed have been made publicly available in open access so that they can be reused and collaboratively enhanced. In particular, you can access the software developed to automatically identify additional URLs about H2020 projects.

197 million web files related to science were preserved

Arquivo.pt identified and preserved 197 million web files (17 TB) that document R&D projects funded by Horizon 2020.

In 2021, we can already witness project websites that are no longer available online, such as the Extended Model of Organic Semiconductors (EXTMOS) project (http://extmos.eu/). However, it was preserved and can be accessed at Arquivo.pt:

Archived version at Arquivo.pt (https://arquivo.pt/wayback/20170427182603/http://extmos.eu/) of the home page of the EXTMOS Research and Development project (http://extmos.eu/) funded by H2020.
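The archived address above has a visible structure: a prefix, a 14-digit number and the original address. The sketch below composes and splits addresses of that shape. Reading the number as a capture date and time in the form YYYYMMDDhhmmss is an assumption based on that example, and the function names are hypothetical.

```python
import re
from datetime import datetime
from typing import Tuple

WAYBACK_PREFIX = "https://arquivo.pt/wayback"


def archived_url(timestamp: str, original_url: str) -> str:
    """Compose an archived address from a 14-digit timestamp and a URL."""
    if not re.fullmatch(r"\d{14}", timestamp):
        raise ValueError("timestamp must have 14 digits (assumed YYYYMMDDhhmmss)")
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


def split_archived_url(url: str) -> Tuple[datetime, str]:
    """Split an archived address into capture time and original URL."""
    prefix = WAYBACK_PREFIX + "/"
    if not url.startswith(prefix):
        raise ValueError("not an Arquivo.pt wayback address")
    timestamp, _, original_url = url[len(prefix):].partition("/")
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S"), original_url


print(archived_url("20170427182603", "http://extmos.eu/"))
print(split_archived_url("https://arquivo.pt/wayback/20170427182603/http://extmos.eu/"))
```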

Contributions to complement the European Open Data Sets

All the resulting data sets were made publicly available so that they can be improved and reused by other organizations also interested in preserving this digital heritage:

To learn more about this collection, you can watch the video Preservation of web content related to Horizon 2020.

References

Are you a researcher?