Tutorial: how to explore Arquivo.pt using Python

Last updated on July 17th, 2023 at 01:44 pm

The Programming Historian aims to develop digital skills among Humanities researchers through the publication of practical lessons in several languages.

The call “Computational analysis skills for large-scale humanities data” resulted in 7 new lessons.

One of them was the tutorial “Timeline summarization for large-scale past-web events with Python: the case of Arquivo.pt” developed by Daniel Gomes and Ricardo Campos.

It shows how to explore the Arquivo.pt user interface and its Application Programming Interface (API) to execute advanced queries, process large amounts of data or build new services, such as Tell me stories.

All the developed resources are freely available in open access.
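As a rough illustration of what querying the API from Python might look like, the sketch below builds a full-text search URL and extracts a few fields from a decoded JSON response. It is not taken from the tutorial: the endpoint address, the parameter names (`q`, `from`, `to`, `maxItems`) and the response fields (`response_items`, `tstamp`, `title`, `linkToArchive`) are assumptions that should be checked against the official Arquivo.pt API documentation, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of the Arquivo.pt full-text search endpoint (verify in the official docs).
TEXTSEARCH_ENDPOINT = "https://arquivo.pt/textsearch"


def build_search_url(query, from_ts=None, to_ts=None, max_items=10):
    """Build a search URL; timestamps are assumed to use the YYYYMMDDHHMMSS format."""
    params = {"q": query, "maxItems": max_items}
    if from_ts:
        params["from"] = from_ts
    if to_ts:
        params["to"] = to_ts
    return f"{TEXTSEARCH_ENDPOINT}?{urlencode(params)}"


def fetch_results(url, timeout=30):
    """Download the URL and decode the JSON body (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def summarize_items(payload):
    """Return (timestamp, title, archive link) tuples from a decoded response.

    The field names used here are assumptions about the response format.
    """
    return [
        (item.get("tstamp"), item.get("title"), item.get("linkToArchive"))
        for item in payload.get("response_items", [])
    ]
```

With network access, `summarize_items(fetch_results(build_search_url("Jorge Sampaio", max_items=5)))` would list the first few archived pages matching the query, provided the assumptions above hold.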

Open-access resources of the tutorial “Timeline summarization for large-scale past-web events with Python: the case of Arquivo.pt”

Open dataset about cryptocurrency

Image: cryptocurrency chart

Last updated on August 17th, 2022 at 09:19 am

(Photo: QuoteInspector)

Since 2008, the cryptocurrency market has revolutionised the world by innovating and expanding into other areas (e.g., finance and art). However, with this rapid expansion, many projects are created every day, giving rise to a wide and varied range of websites, technologies and scams. Markets follow financing stages, and it is during the initial stage of euphoria that most projects are created.

We believe that as the cryptocurrency market stabilises, projects and websites are disappearing because their funding diminishes or runs out.

Arquivo.pt started a new web archive collection that preserves web content documenting cryptocurrency activities.

This work produced a new open dataset with information documenting each cryptocurrency project, including its original URLs and links to the corresponding web-archived versions in Arquivo.pt. The information sources selected to create this dataset were:

We believe that this new dataset related to cryptocurrencies, together with the preservation of all the corresponding web content, has the potential to give rise to innovative scientific contributions in several areas, such as Economics or Digital Humanities.

Resources

Researchers who want to carry out studies on the Cryptocurrencies dataset and need earlier access to the collected contents can contact Arquivo.pt.

Presentation at the IIPC Web Archiving Conference 2022

Meet the winners of the Arquivo.pt Award 2022!


Last updated on April 28th, 2023 at 03:41 pm

The winners of the Arquivo.pt Award 2022 were announced on 22nd July 2022 by the Público newspaper, the official communication partner of this edition, which also awarded an honorable mention to the best work based on its historical web content.

22 applications were received.

The award ceremony took place during the Commemorative Session of the World Science Day: the excellence of research in Portugal, on November 24th, at the Teatro Thalia, in Lisbon.

1st place – “Arquivo do Parlamento”

The 10 000 euro prize was won by the work “Arquivo do Parlamento” (“Parliament Archive”), developed by Tiago Santos.

“Parliament Archive” is a web application that aggregates news and opinion articles extracted from Arquivo.pt based on parlamento.pt open data.

For example, a user can search for a political figure and get speeches, news and other publications that Arquivo.pt has preserved.

2nd place – “Classificação automática de artigos estigmatizantes de doenças mentais”

The 2nd prize of 3 000 euros was awarded to the work “Automatic classification of stigmatizing articles of mental illness”, authored by Alina Yanchuk, Alina Trifan, Olga Fajarda and José Luís Oliveira.

This work developed a methodology that uses Artificial Intelligence to automatically classify stigmatizing articles about mental illness published in Portuguese online newspapers.

For example, a news article about political life that uses the term schizophrenia is classified as stigmatizing. Using automated processes, this work makes it possible to identify thousands of news items and draw the attention of the media and society to the stigmatization of mental illnesses.

3rd place – “Arquivo Público”

The 3rd prize of 2 000 euros was awarded to the work “Arquivo Público”, developed by Diogo Correia and Ricardo Campos.

“Arquivo Público” is a web application focused on the contents published on the Público newspaper website over time and preserved by Arquivo.pt.

The result is a web interface that displays archived news about a specific subject, along with the number of news items, the most frequent terms and the geographical references.

Honorable Mention granted by Público newspaper

The Público newspaper, official partner of the 5th edition of the Arquivo.pt Award, granted an Honorable Mention to the work “Arquivo Público”, carried out by Diogo Correia and Ricardo Campos.

Photos of the award ceremony

The award ceremony took place during the commemorative session of the National Day of Scientific Culture, on November 24th, 2022, at the Teatro Thalia, in Lisbon.

The awards were presented by the Minister of Science, Technology and Higher Education, Elvira Fortunato, the President of the Board of Directors of FCT, Madalena Alves, and the representative of the media partner, the science editor of Público newspaper, Teresa Firmino.

Image Gallery


Photo credits: Pedro Ferreira – FCT | FCCN | Arquivo.pt

Video of the ceremony

Flash interview videos

Dissemination materials

Press

Short-link to this page: arquivo.pt/winners2022

Participation of Arquivo.pt in the meetings of the International Internet Preservation Consortium


Last updated on August 1st, 2023 at 05:37 pm

IIPC Web Archiving Conference

The International Internet Preservation Consortium (IIPC), a consortium that brings together Web preservation initiatives from around the world, held its General Assembly with its members between May 17 and 19, 2022.

The following week, on May 24 and 25, the IIPC Web Archiving Conference (IIPC WAC) was held online, as in the previous year, due to the constraints of the Covid-19 pandemic.

The 2022 editions of these two events were hosted by the Library of Congress.

Arquivo.pt resources and initiatives presented at the IIPC WAC 2022

The IIPC Web Archiving Conference is an initiative open to the community, in which any person or organisation interested in Web preservation may participate.

Arquivo.pt contributed to the Lightning Talks sessions (session 5 and session 13).

The Arquivo.pt presentations focused on the resources and initiatives that this service has recently developed for the community.