Cross-lingual collection about the 2019 European Elections is available

Last updated on October 1st, 2021 at 09:06 am

Screenshot of an archived page on Arquivo.pt: https://www.european-elections.eu

The special collection of web pages about the 2019 European Elections is available for search at Arquivo.pt.

To compile this collection, pages written in 24 European languages were identified through automatic searches on the Bing search engine and through suggestions from 17 European countries.

We emphasize the collaboration of the Publications Office of the European Union, which reviewed the list of search terms in the different languages of the European Union.

Between May and July 2019, Arquivo.pt exhaustively collected pages related to the European Elections in several countries.

The resulting collection, named “European Elections 2019”, comprises 99 million web files totalling 4.8 terabytes of information.

The technical report “A transnational crawl of the European Parliamentary Elections 2019” details the methodology applied. This methodology has since been used to generate other thematic collections, such as the one about Covid-19.

We invite all citizens, especially researchers, to try this service, created specifically to search the cross-lingual and international collection about the 2019 European Elections: https://arquivo.pt/ee2019

Video “A transnational and cross-lingual crawl of the European Parliamentary Elections 2019”

A transnational and cross-lingual crawl of the European Parliamentary Elections 2019, Ivo Branco, IIPC Web Archiving Conference and RESAW 2021 (slides)

To know more:

Collection about Covid-19 in Portugal

Last updated on June 18th, 2021 at 08:26 am

Suggest web pages about Covid-19

Arquivo.pt invites everyone to suggest web pages that document the Covid-19 pandemic so they can be preserved for future access. Help us keep a complete memory of Portuguese life during this period.

Suggest pages using this form: https://tinyurl.com/arquivopt-covid19

Thousands of web pages to tell the story of the pandemic in Portugal

Arquivo.pt has been carrying out special collections of web pages related to the Covid-19 pandemic since March 2020.

“Future academics, scientists and journalists who are studying the Portuguese response to the Covid-19 pandemic will want to read first-hand testimonies of those affected, official records of the number of victims, and recommendations from doctors, politicians and scientists at the time”, Público newspaper, May 1, 2020 edition.

Content was collected daily from a set of 106 sites on the theme of Covid-19. This set includes, for example, media, government, association and university websites.

Another set comprises Twitter pages (108 identified in May), YouTube videos (815 identified in May) and pages from Reddit and GitHub.

Suggestions from the community were included. For example, archivists from Sines (Portugal) collected local news related to Covid-19 (9 GB). The Revisionista.pt project also contributed by identifying pages from newspapers. People also sent suggestions through the public form.

Collaboration with IIPC for international collection

In February 2020, the International Internet Preservation Consortium (IIPC), the main organization on Web preservation, proposed to its members a collection about the Novel Coronavirus (Covid-19) outbreak.

Arquivo.pt contributed 1,237 seeds, mainly in Portuguese. With successive contributions from other countries, the IIPC collection reached over 7,000 pages in July 2020.

A form is also available for anyone to suggest content for this international collection.

The IIPC collection “Novel Coronavirus (COVID-19)” is accessible via the Internet Archive's Archive-It service.

Arquivo.pt crawled the international collection compiled by the IIPC three times, the first on March 23, the second on June 15 and the third in late August, thus gathering international content useful for researchers worldwide.

Methodology for the selection of pages for the Covid-19 collection

We started by identifying terms related to the Coronavirus theme that included health, economic, political, geographic or organizational aspects.

Then, the Bing Azure service was used to automatically obtain, through a script, the following information for the first 10 results for each term: the page address, the title and the position in the results list.
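The step above can be sketched in a few lines of Python. This is only an illustration, not the script Arquivo.pt actually used: the function `collect_top_results`, the `search` callable and the stand-in `fake_search` are hypothetical names introduced here, and the real call to the Bing service is deliberately left out. The sketch only shows how the page address, title and position of the first 10 results could be recorded for each term.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical signature: a search function takes a term and a result count
# and returns a list of hits, each with at least a "url" and a "title".
SearchFn = Callable[[str, int], List[Dict[str, str]]]


def collect_top_results(
    terms: Iterable[str], search: SearchFn, top_n: int = 10
) -> List[Dict[str, object]]:
    """Record term, position, address and title of the first top_n results."""
    rows: List[Dict[str, object]] = []
    for term in terms:
        # Trim defensively in case the search backend returns more than asked.
        hits = search(term, top_n)[:top_n]
        for position, hit in enumerate(hits, start=1):
            rows.append(
                {
                    "term": term,
                    "position": position,  # 1-based rank in the results list
                    "url": hit["url"],
                    "title": hit["title"],
                }
            )
    return rows


def fake_search(term: str, count: int) -> List[Dict[str, str]]:
    """Stand-in for a real search service, used only to exercise the sketch."""
    return [
        {"url": f"https://example.org/{term}/{i}", "title": f"{term} result {i}"}
        for i in range(1, count + 1)
    ]
```

Passing the search function as a parameter keeps the bookkeeping independent of any particular search provider, so the same loop can be run against a real service or against a stub such as `fake_search`.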

Considering the list of results, it was decided which software would be used and which settings would be the best to collect the pages.

For example, in the case of a newspaper section dedicated to Covid-19, it was necessary to decide whether to record just one page or whether it made sense to collect the entire site exhaustively.

Various types of software were used to collect the pages. Heritrix was used for the daily collections from the 106 sites, Brozzler was chosen to capture the 108 Twitter accounts, and videos were captured manually using Webrecorder and Browsertrix.

Know more

Meet the winners of the Arquivo.pt Award 2020!

Last updated on February 18th, 2022 at 12:33 pm

The winners of the Arquivo.pt 2020 Award were announced by the Público newspaper, the official media partner of this year’s edition, which granted an honorable mention to the best work based on the contents of the newspaper. 29 candidate works were received.

The award ceremony took place on November 4, during Science 2020 – Meeting with Science and Technology, at the Lisbon Congress Center.

1st place – “Desarquivo”

The winner of the 10,000 euro prize was the work “Desarquivo”, developed by Miguel Ramalho.

“Desarquivo” is a website that enables searching for named entities (e.g. people, organizations and places) and identifying relationships among them, based on news published in online newspapers over time.

The search results are presented in the form of a graph or network of relationships that enables a journalist, researcher or any citizen to dynamically explore the relationships among historical information preserved from the Web by Arquivo.pt.

For example, a user can explore the ideological proximity among political parties over time.

2nd place – “Arquivo.pt Extension”

The 2nd prize, in the amount of 3,000 euros, was awarded to the work “Arquivo.pt Extension”, a browser extension developed by Rodrigo Marques and Hugo Silva.

This extension enables users to perform advanced searches on Arquivo.pt directly from the browser, without having to leave the page they are currently viewing.

The “Arquivo.pt Extension” is available for download in the Chrome Web Store.

3rd place – “Arquivo Económico .pt”

The 3rd prize, in the amount of 2,000 euros, was awarded to the work “Arquivo Económico .pt” by Nuno Bragança.

“Arquivo Económico .pt” organizes and presents information preserved by Arquivo.pt about the prices of products since the time of the escudo, the former Portuguese currency.

The result is a website that enables searching for the price of consumer goods on given dates, by category, such as supermarket or transportation.

For example, users can easily find out how much a trip from Lisbon to Porto or a cell phone call cost in 1999.

Honorable Mention granted by Público newspaper

Jornal Público, official partner of the 3rd edition of the Arquivo.pt Prize, awarded its Honorable Mention to the work “Jornal do Passado”, developed by Bruno Galhardo.

“Jornal do Passado” is a game for all ages, developed for Android, in which users test their knowledge about news or events by guessing the date on which they occurred.

As a result, we have an app that enables searching the historical information preserved by Arquivo.pt in a pedagogical and fun way.

Image gallery

Award ceremony at the closing session of Encontro Ciência 2020, in the main auditorium of the Lisbon Congress Center

Time travel with Público Newspaper

On the 30th anniversary of Público Newspaper, we are invited to visit web pages of the electronic version of the newspaper through a time travel.

By clicking on each link we can visit an old web page that maintains the appearance and functionality it had at the time it was published.

The pages were selected by Público Newspaper in collaboration with Arquivo.pt, which presented them in the form of a timeline.

In the 2020 edition of the Arquivo.pt Award, Público Newspaper will grant an Honorable Mention to works based on the versions preserved by Arquivo.pt.

Applications are welcome until 4 May: Arquivo.pt 2020 Award.

We believe that a journey into the past will inspire ideas and topics for study based on the preserved contents.

More time travel

To have a time travel created for your organization, please contact us.

Online Cafe with Arquivo.pt

Last updated on November 24th, 2020 at 05:18 pm

Welcome to the Arquivo.pt Online Cafe!

Talk directly to the Arquivo.pt team and get answers to all your questions!

The Arquivo.pt team chats with you through online sessions.

Brief introductory presentations will be given, leaving time to ask all your questions about how to get more out of Arquivo.pt or how to apply to the Arquivo.pt Awards.

Sessions held in the 1st season

1st session, 27 March – Website Preservation: Do It Yourself!

The 1st session (in Portuguese) was about Website Preservation: Do It Yourself! and featured Ricardo Basílio (Digital Curator of Arquivo.pt) and Daniel Gomes (Manager of Arquivo.pt).

2nd session, April 3 – meuParlamento.pt

The app meuParlamento.pt was the winner of the Arquivo.pt Award 2019. Nuno Moniz presented the relevance of this app to citizen participation in politics. Arian Pasquali and Tomás Amaro, also authors of this work, were present. The session continued with questions related to the development of works based on Arquivo.pt.

3rd session, April 17 – Arquivo.pt Award and News on Arquivo.pt

After the Easter break, the Arquivo.pt Online Café was back, presented by Daniel Gomes. This session was dedicated to answering questions from those finalizing their work to compete for the Arquivo.pt Award. Finally, the new interface of Arquivo.pt was presented.

4th session, April 24 – Revisionista.PT – Uncovering the News

Flávio Martins and André Mourão, creators of Revisionista.pt, talked about this tool, which uses Arquivo.pt to show the revisions made to a news item after its publication in newspapers.

5th session, April 30 – Public speeches about violence in private

Zélia Teixeira, Professor at Fernando Pessoa University and psychologist, brought us an analysis of 217 news items on domestic violence, collected in Arquivo.pt from the three main daily newspapers.

6th session, May 8 – Arquivo.pt API – How to process data at large scale?

André Mourão, R&D engineer, explained the Arquivo.pt APIs (Application Programming Interfaces) through examples and cases, in the session held on 8 May. One doesn’t need to be an IT expert to see the potential of the API when used in research or new tools.
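As a rough illustration of the kind of use discussed in this session, the sketch below builds a full-text search request and reduces a JSON response to a few fields. It is not taken from the session itself. The endpoint `https://arquivo.pt/textsearch`, the parameters `q`, `maxItems`, `from` and `to`, and the response fields `response_items`, `title`, `tstamp` and `linkToArchive` are assumptions based on the public Arquivo.pt API and should be checked against the official documentation. The helper names `build_search_url` and `summarize_items` are hypothetical.

```python
from typing import Any, Dict, List, Optional
from urllib.parse import urlencode

# Assumed full-text search endpoint; verify against the official API docs.
TEXTSEARCH_ENDPOINT = "https://arquivo.pt/textsearch"


def build_search_url(
    query: str,
    max_items: int = 10,
    date_from: Optional[str] = None,
    date_to: Optional[str] = None,
) -> str:
    """Build a search URL; dates are assumed to be YYYYMMDDHHMMSS strings."""
    params: Dict[str, Any] = {"q": query, "maxItems": max_items}
    if date_from:
        params["from"] = date_from
    if date_to:
        params["to"] = date_to
    return f"{TEXTSEARCH_ENDPOINT}?{urlencode(params)}"


def summarize_items(payload: Dict[str, Any]) -> List[Dict[str, str]]:
    """Keep only title, capture timestamp and archived link of each result."""
    rows: List[Dict[str, str]] = []
    # Field names below are assumptions about the JSON response layout.
    for item in payload.get("response_items", []):
        rows.append(
            {
                "title": item.get("title", ""),
                "timestamp": item.get("tstamp", ""),
                "archived_url": item.get("linkToArchive", ""),
            }
        )
    return rows
```

To run a real query, the URL returned by `build_search_url` could be fetched with `urllib.request.urlopen` and decoded with `json.load` before being passed to `summarize_items`. Keeping the network call separate makes the two helpers easy to test offline.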

7th session, May 15 – Website Preservation: Do It Yourself!

Ricardo Basílio, Arquivo.pt’s web curator, presented a tutorial dedicated to Webrecorder and Browsertrix. These tools are useful for capturing websites locally on a small scale. By demonstrating how they work, Arquivo.pt wants to encourage the community: anyone can make a selection of pages or websites and preserve them in a standardized format.

8th session, May 22 – The history of video games on the Portuguese web

Miguel Costa, a web developer passionate about the Web, technologies and video games, talked about the main figures of the national video game business and about the first Portuguese video game. In Arquivo.pt he found archived video game files and a lot of information.

9th session, May 29 – Straight Edge in the metropolitan area of Lisbon

In the 9th session of the Café, we got a closer look at Straight Edge and its presence in the punk/hardcore scene of the metropolitan area of Lisbon in the 90s. Diogo Duarte, anthropologist and researcher at the Contemporary History Institute of Universidade Nova de Lisboa, talked about his work dedicated to the theme and about the importance of Arquivo.pt for studying this movement and other expressions of popular culture.

10th session, June 5 – Health and Internet: an evolution

Health and Internet was the topic of the 10th session of the Arquivo.pt Café, presented by Rita Espanha, professor and researcher at ISCTE (University Institute of Lisbon) and CIES (Centre for Research and Studies in Sociology). The Internet has become the privileged medium where citizens seek information and build their own knowledge in all areas of their lives, including health. State agencies in turn have developed services that use the Internet. Part of the population, however, has been left out, not having followed this change. The other part, which has easy access to information, does not always have the critical sense to evaluate that information and use it to its advantage. All of these issues became more evident during the Covid-19 pandemic.

11th session, June 19 – Creating and managing preservable websites

The Arquivo.pt team presented a set of good practices for publishing information on the Web in a way that supports its preservation.

12th session, June 26 – “Tell me Stories”, “Conta-me Histórias”

“Tell me Stories”, “Conta-me Histórias” is a service that creates temporal narratives based on the contents preserved by Arquivo.pt. This application was the winner of the Arquivo.pt Award 2018. One of its authors, Ricardo Campos (IPT; INESC TEC), talked about the development of the service. Arian Pasquali, member of the development team, also participated in the discussion.

13th session, July 3 – Arquivo de Opinião

Researchers in NLP (Natural Language Processing) will find in this session an excellent use case explained in detail by its author. Miguel Won, researcher at INESC-ID (Lisbon), talked about the opinion sections of the media. How do commentators read events and how does this reflect their political position? Based on this question, he developed the Web application Arquivo de Opinião, awarded in 2018, which presents a history of the opinion columns of Portuguese newspapers, built from the pages of Arquivo.pt. In this session we got to know the latest developments of the project, which now also collects pages from social networks.

14th session, July 10 – Museum of Portuguese Web Design

Sandra Antunes, Professor at the School of Technology and Management of Viseu (ESTGV) spoke about virtual spaces for the memory of Portuguese Web design and showed the importance of a museum to fill gaps in the areas of preservation, exhibition and history of Portuguese Web design.

Sessions of the 2nd season

Arquivo.pt is back to its origins at FCUL 20 years later

FCUL exhibition

The exhibition of Arquivo.pt is being displayed at the library of the Faculty of Sciences of the University of Lisbon (FCUL) until April 30.

Eight posters with old web pages invite students, researchers and professors to use Arquivo.pt in their work and to apply to the Arquivo.pt 2020 Award. There will be training at FCUL on March 12, from 4:30 p.m. to 6:00 p.m., in room 1.3.15.

This exhibition has toured several higher education institutions, but in the case of FCUL it is a return to its origins.

In 2000, the XLDB Research Group of the FCUL Department of Informatics, supported by FCCN (the agency that supports computing for science, research and technology), started a project on searching and accessing Web content, an optimised search engine called TUMBA! (an acronym meaning “We have an alternative search engine!”).

Between 2002 and 2006, as part of a new project (TOMBA, a reference to the national archive, Torre do Tombo), a prototype for a national web archive was developed. The research group received a best article award from DELOS, the Network of Excellence on Digital Libraries.

Arquivo.pt was officially launched at FCCN in November 2007, with the aim of collecting and preserving Portuguese Web content using specific technologies similar to those of the Internet Archive.

Three researchers from FCUL were part of the core team. They developed the Arquivo.pt service in its early years. In 2010, they presented a prototype of a search service, a Google for the past, which was innovative in the context of web archiving.

Currently, Arquivo.pt also has an image search and an API (Application Programming Interface). It maintains the perspective followed by the first project at FCUL, which is based on creating useful services for the community.

Arquivo.pt Memorial is the most recent service, created for institutions that want to keep old sites accessible even after disconnecting them from their servers. As an example, you can visit the Minema project (finished years ago) and see how this service works.

Image gallery

Arquivo.pt exhibition at FCUL, 2020

Credits to Valter Gouveia

Arquivo.pt Award 2020 launched at Público Newspaper

Last updated on March 24th, 2020 at 12:21 pm

The Arquivo.pt Award 2020 was officially launched on January 16th, at the Público Newspaper in Lisbon. Público is one of the most well-known newspapers in Portugal.

The event featured talks by the Director of Público Newspaper, Manuel Carvalho, the President of the Foundation for Science and Technology, Helena Pereira, and the manager of Arquivo.pt, Daniel Gomes.

Participants in this open event were given a guided tour of the newsroom, where they saw first-hand how the contents of a newspaper are edited and produced.

Público’s website is crawled daily by Arquivo.pt, an important contribution to the future access and use of its contents.

In the 2020 edition of the Arquivo.pt Award, Público Newspaper will grant an Honorable Mention to works based on the newspaper’s content over its 20 years.

Find out how to apply by 4 May: arquivo.pt/award2020

Photo gallery

Launch of the Arquivo.pt Award 2020

Photos by: Valter Gouveia, FCT

Arquivo.pt celebrates the World Digital Preservation Day with free training

Last updated on November 25th, 2019 at 10:34 am

November 7 was the World Digital Preservation Day 2019 (#WDPD2019).

To celebrate the date, Arquivo.pt gave a free training session on web preservation. The objective of the training was to maximize users’ productivity in exploring the service.

The event began at 13:30 (see full schedule below), in the Pequeno Auditório at FCT-FCCN (Avenida do Brasil, 101 – Lisbon).

Schedule

Target Audience

  • Information professionals (e.g. librarians, archivists, and documentalists)
  • Website managers (e.g. Communication and Design bureaus)
  • Web content authors (e.g. bloggers)
  • Professors, students, and researchers interested in Digital Preservation

Image gallery

World Digital Preservation Day 2019

Photos by: Valter Gouveia, FCT

Results

The training was attended by 43 participants, who rated it as good.

Do not miss the next WDPD in 2020: subscribe to the Arquivo.pt mailing list.

Sites crawled in 2017 are available through Arquivo.pt

Last updated on July 2nd, 2019 at 03:58 pm

The information collected from the Web during 2017 is available in Arquivo.pt.

Fires in Pedrógão Grande, Portugal, June 2017. Photo published in the newspaper As Beiras

Remember and investigate historical events of 2017 such as:

Arquivo.pt visited 3 million sites and collected 900 million files, 75 TB in total, so you can access the memory of past events.

Librarians as web curators in universities

The 4th Meeting of Libraries of Higher Education, for librarians in Portuguese universities, was held at Coimbra University on 4 and 5 June 2019 and included two presentations about Arquivo.pt.

The first suggested a workflow for the preservation of institutional websites (Fig. 1) as a guide that anyone can use to care for and preserve, with quality, the pages of their own organization.

The second presented the new Arquivo.pt “Memorial” service, which preserves sites at the end of their life cycle, keeping them accessible for institutional memory and research.

About 170 librarians participated in the meeting and were challenged to collaborate with Arquivo.pt by adopting simple but decisive practices to preserve the Web with quality.

Fig. 1 – Workflow to preserve institutional websites (PDF, in Portuguese)