Arquivo.pt has contributed to preserving online information that documents R&D projects funded by the Horizon 2020 programme (2014-2021). It preserved 197 million web files (17 TB) related to science for future access.
H2020 projects publish valuable information online, but it is being lost
Then, we extracted the acronym and title of the projects from the data sets and automatically searched the web for additional URLs using the Bing Search API.
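As an illustration of this step, the following is a minimal sketch of how search queries could be built from a project's acronym and title. It is not the software released by Arquivo.pt. The query heuristics, the function names and the sample project are hypothetical, and the endpoint and header name are those documented for the Bing Web Search API v7, which should be checked against the current documentation. The sketch only prepares the request and does not send it.

```python
from typing import Dict, List, Tuple
from urllib.parse import urlencode

# Assumption: endpoint of the Bing Web Search API v7; verify before use.
BING_ENDPOINT = "https://api.bing.microsoft.com/v7.0/search"


def build_queries(acronym: str, title: str) -> List[str]:
    """Build search queries from a project's acronym and title.

    The three query patterns are a hypothetical heuristic, not the
    patterns used by the original tools.
    """
    acronym = acronym.strip()
    title = " ".join(title.split())  # collapse repeated whitespace
    queries = []
    if acronym and title:
        queries.append(f'"{acronym}" "{title}"')
    if acronym:
        queries.append(f'"{acronym}" H2020 project')
    if title:
        queries.append(f'"{title}"')
    return queries


def build_request(query: str, api_key: str, count: int = 10) -> Tuple[str, Dict[str, str]]:
    """Return the URL and headers for one search request, without sending it."""
    url = f"{BING_ENDPOINT}?{urlencode({'q': query, 'count': count})}"
    headers = {"Ocp-Apim-Subscription-Key": api_key}
    return url, headers


if __name__ == "__main__":
    # Hypothetical project record, for illustration only.
    for q in build_queries("EXAMPLE", "An Example Research Project"):
        url, _ = build_request(q, api_key="YOUR-KEY")
        print(url)
```

The URLs returned by such searches would still need to be filtered before archiving, since a search on a short acronym can match unrelated pages.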
All the data sets and tools developed have been made publicly available in open access so that they can be reused and collaboratively enhanced. In particular, you can access the software developed to automatically identify additional URLs about H2020 projects.
197 million web files related to science were preserved
Arquivo.pt identified and preserved 197 million web files (17 TB) that document R&D projects funded by Horizon 2020.
When a user enters a set of words about a topic in the Arquivo.pt search box and clicks on the “Narrative” button, the user is directed to the “Conta-me Histórias” service, which automatically analyzes the news from 25 websites archived by Arquivo.pt over time and presents a chronology of news related to the topic.
Figure 1: Search results for pages about “Justin Bieber”.
Figure 2: Narrative of news about “Justin Bieber” from Portuguese news sites preserved by Arquivo.pt generated by the “Conta-me Histórias” service.
Create your narrative now!
“Conta-me Histórias” searches, analyzes and aggregates thousands of results to generate each narrative about a topic. It is recommended to choose descriptive words about well-defined themes, personalities or events to obtain good narratives.
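To give a rough idea of what arranging archived news into a chronology involves, here is a minimal sketch that groups timestamped headlines by year. It is not the method used by “Conta-me Histórias”. The function name and the sample headlines are hypothetical, and timestamps are assumed to be 14-digit strings in the form YYYYMMDDhhmmss.

```python
from collections import defaultdict
from datetime import datetime


def build_chronology(items):
    """Group (timestamp, headline) pairs by year, oldest first.

    Assumption: each timestamp is a 14-digit string (YYYYMMDDhhmmss).
    """
    by_year = defaultdict(list)
    for tstamp, headline in items:
        when = datetime.strptime(tstamp, "%Y%m%d%H%M%S")
        by_year[when.year].append((when, headline))

    chronology = []
    for year in sorted(by_year):
        # Order the headlines within each year by capture time.
        entries = sorted(by_year[year], key=lambda entry: entry[0])
        chronology.append((year, [headline for _, headline in entries]))
    return chronology


if __name__ == "__main__":
    # Placeholder headlines, for illustration only.
    sample = [
        ("20150301120000", "Second headline of 2015"),
        ("20100101000000", "Headline from 2010"),
        ("20150101000000", "First headline of 2015"),
    ]
    for year, headlines in build_chronology(sample):
        print(year, headlines)
```

A real narrative service would also have to select the most relevant headlines and remove near-duplicates, which this sketch does not attempt.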
Creating a narrative is useful for researchers, journalists or citizens who want a quick overview of how a topic evolved over time, saving them considerable time and effort.
Go to Arquivo.pt and try to create a narrative about a theme of your choice.
RESAW Conference – meeting of the European RESAW network (Research Infrastructure for the Study of Archived Web Materials), this year in its 4th edition, mainly addressed to the community of researchers from non-technological scientific areas, such as Social Sciences, Arts and Humanities.
Contributions of Arquivo.pt to the international community
Arquivo.pt presented some results of the work developed over the last year, with emphasis on the functionalities that improve the reproduction of archived content, such as “Complete the page”.
Two historical collections were integrated into Arquivo.pt: Geocities and the Internet Memory Foundation. Arquivo.pt also created special collections about the 2019 European Elections and Covid-19.
The contents of Arquivo.pt are accessible to any researcher, regardless of the country they are in, which makes it a useful service for the international community.
Arquivo.pt updates 2021: presentation at the IIPC – General Assembly, by Daniel Gomes (Video)
Complete the page. 1 minute drop in (presentation at the IIPC – General Assembly “complete the page”), by Daniel Gomes (Slide, Video)
A transnational and cross-lingual crawl of the European Parliamentary Elections 2019, by Ivo Branco (Slides, Video)
Enhancing access to research the Geocities historical collection, by Pedro Gomes (Slides, Video)
This external service is useful for research use cases in areas such as Web design, Art, Communication or History, where it is necessary to access the original visual appearance of a page from the past as faithfully as possible.
The Arquivo.pt Award aims to annually promote innovative works based on historical information preserved by Arquivo.pt. Submissions closed on May 3 and we received works in areas such as media studies, education, design, information technology, health, and cultural and historical heritage.