Arquivo.pt at Jornadas de Computação Científica 2023. Register now!

Last updated on October 24th, 2023 at 04:03 pm

Jornadas de Computação Científica 2023 was held at the Naval School in Almada from 27 to 29 June 2023.

This event is a meeting for sharing knowledge among the entities that make up the national higher education and research community.

The event brings together institutional decision-makers, heads of IT services and those responsible for libraries and documentation services, among others.

Arquivo.pt presented two 90-minute sessions on June 28th, between 2:30 p.m. and 6 p.m., under the theme “Arquivo.pt services for managing citations and cybersecurity”, and presented the Arquivo.pt Memorial service in the Zapping session.

Agenda

Session 1, June 28, 2:30-4 p.m.: Arquivo.pt: available services and system architecture

Session 2, June 28, 4:30-6 p.m.: Arquivo.pt: a tool for managing citations and cybersecurity

Arquivo.pt Memorial

Register

Jornadas de Computação Científica 2023 registration page

Virtual Museum of Tourism MUVITUR created a collection of preserved Websites

Last updated on February 26th, 2024 at 09:07 am

MUVITUR – Virtual Museum of Tourism is a portal that aggregates digital content about Tourism in Portugal.

The platform is maintained by the Celestino Domingues Library of The Estoril Higher Institute for Tourism and Hotel Studies (ESHTE) and has the participation of institutions from various areas of heritage that are content providers.

The digitized content that can be consulted in the catalog and accessed at the provider institutions included sound, images, photographs and printed material, but no websites.

Thus, the idea for MUVITUR’s new “Web Pages” collection emerged.

Collaboration between MUVITUR and Arquivo.pt

In 2019, Arquivo.pt and MUVITUR began a collaboration aimed at identifying websites related to Tourism in Portugal and disseminating the history of content published on the Web since 1996.

In 2022, a list of about 400 website records was compiled, covering tourism entities, hotels, travel agencies, the tourism pages of municipalities’ websites and others.

This database resulted in the first collection of preserved websites about Tourism in Portugal.

Collection of records in the MUVITUR catalog with webpages preserved at Arquivo.pt. 

How the integration was done

MUVITUR uses Nyron software, which allows content from different sources to be aggregated using the OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) interoperability protocol, which is very common among libraries, archives and museums to provide content to portals such as Europeana.

Arquivo.pt, however, does not make information available through OAI-PMH so it was necessary to find alternative ways to create a record in Nyron with descriptive information from preserved sites.

The procedure for integration was as follows:

  • The XML schema with the fields for the metadata, according to what works in Nyron, was exported to an Excel sheet.
  • The information was entered manually, respecting the format and syntax, in collaboration with the computer technicians.
  • The XML file with the inserted data was validated and imported into Nyron.
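The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sheet is assumed to have been saved as CSV, and the element and column names (`records`, `record`, `denomination`, `organization`, `original_url`, `archived_url`) are hypothetical, since the actual Nyron schema is not described here. The check shown is XML well-formedness only, not validation against a schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical column names; the real Nyron schema is not given in the text.
FIELDS = ["denomination", "organization", "original_url", "archived_url"]


def rows_to_xml(csv_text: str) -> str:
    """Turn spreadsheet rows (exported as CSV) into one XML document."""
    root = ET.Element("records")
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Reject rows with empty or absent required cells before import.
        missing = [f for f in FIELDS if not (row.get(f) or "").strip()]
        if missing:
            raise ValueError(f"row is missing required fields: {missing}")
        record = ET.SubElement(root, "record")
        for field in FIELDS:
            ET.SubElement(record, field).text = row[field].strip()
    return ET.tostring(root, encoding="unicode")


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML (no schema validation)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False
```

Checking each row before it is written keeps the manual data entry step honest: a record with an empty required cell is rejected before the file reaches the import stage.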

Creating records in catalogs is largely a manual task and requires human curation. However, some of the information in the records of the website collection could be filled in automatically. For example, the thumbnail was obtained using the Arquivo.pt API, more specifically the linkToScreenShot, visible in the technical details of a preserved page (see the options menu on the top right of a replayed page).

Other elements, such as the site’s title, could also be obtained automatically through the Arquivo.pt API. However, the quality of that information depends on what the site’s producers inserted, and it may not be accurate. The dates that limit the temporal scope can also be obtained automatically, but the manual method was chosen to control the information presented.
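As a rough illustration of this trade-off, the sketch below takes one result item that has already been decoded from an API response into a Python dictionary and picks out the thumbnail link and the title. The key names `linkToScreenshot` and `title` are assumptions about the response format (note that the exact spelling of the screenshot field may differ), and the helper name `pick_record_fields` is hypothetical. An empty title is flagged for manual review rather than trusted.

```python
from typing import Optional


def pick_record_fields(item: dict, fallback_title: Optional[str] = None) -> dict:
    """Select catalog fields from one (already decoded) API result item.

    The key names used here are assumptions about the response format.
    """
    title = (item.get("title") or "").strip()
    return {
        # Link to the screenshot used as the record thumbnail, if present.
        "thumbnail": item.get("linkToScreenshot"),
        # Titles come from the site's producers and may be empty or wrong.
        "title": title or fallback_title,
        # Flag records whose title must be supplied by a curator.
        "needs_review": not title,
    }
```

Flagging instead of guessing mirrors the choice described above: automation fills in what is reliable, and a person decides the rest.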

As the project continues, the collection will grow with new records, as there are thousands of websites about the Tourism sector.

Description of Web contents in the MUVITUR catalog

In the collection “Paginas Web” the following data are used:

  • Denomination – usually the title of the website
  • Organization – the entity to which the publication belongs
  • Website address on the Internet
  • Address of the preserved version in Arquivo.pt
  • Moment(s) to remember
  • Link to the thumbnail in Arquivo.pt
  • Descriptors
  • Geographical data (location, coordinates, geographical name)

The presentation of the information was adjusted to be aligned with that of other MUVITUR resources and contains links to Arquivo.pt.

For example, in the record for the Turismo do Algarve site, we find a link to a moment to remember in 2011 and another link to the history in Arquivo.pt under “Consultar objecto”.

Detail of the record for the Turismo do Algarve site

Organizations can create collections of Websites from their area

The National Library of Australia, for example, included records of preserved Websites in its catalog. In the Library of Congress there are collections of old Websites alongside traditional resources.

However, websites are rarely included in museum collections.

With this unprecedented project, we can say that preserved websites have earned their place on digital platforms dedicated to cultural heritage.

MUVITUR has paved the way with this project for other entities to create collections of websites of their interest on their own platforms.

Other results of the collaboration

CitationSaver preserves citations to web resources

Last updated on April 20th, 2023 at 09:37 pm

Documents cite web content by referencing their URLs so that readers can later access them.

In the case of scientific articles, the importance of these citations is even greater to maintain the integrity of research works because they often reference essential information to enable the reproducibility of an experiment or analysis.

For example, links in a scientific article may cite the datasets, software or web news that supported the research, which are not included in the text of the article.

To meet the need to preserve the integrity of documents, Arquivo.pt launched CitationSaver.

CitationSaver automatically extracts cited links in a document and preserves their content (e.g. web pages cited in a book) so that they can be retrieved later from Arquivo.pt.

Use CitationSaver to preserve the integrity of your documents

Upload a document and CitationSaver will extract the cited URLs, archive their content and make it available on Arquivo.pt after a short period. There are three ways to submit a document:

  • insert the address (URL) of the PDF or TXT file, if it is published online
  • upload the file in PDF or TXT format
  • paste the text containing the addresses you want to preserve (e.g. References section of an article or Bibliography of a book).
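The extraction step can be illustrated with a short Python sketch. It is not CitationSaver's actual implementation, which is not described here: it is a minimal approximation that assumes cited links appear as plain `http://` or `https://` URLs in the text, and the function name `extract_cited_urls` is hypothetical.

```python
import re

# Match http(s) URLs up to whitespace or a common closing delimiter.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_cited_urls(text: str) -> list[str]:
    """Return the distinct URLs found in the text, in order of appearance."""
    seen = set()
    urls = []
    for match in URL_PATTERN.findall(text):
        # Drop punctuation that usually belongs to the sentence, not the URL.
        url = match.rstrip(".,;:")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

Pasting the References section of an article into such a function would yield the list of addresses to be archived, with repeated citations collapsed into one entry.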

More information

Project “Renascer” brings back old websites

Last updated on April 17th, 2023 at 06:32 pm

Organizations often keep domains that pointed to websites that are no longer in use, either to prevent them from being bought or simply because they were forgotten.

The aim of project Renascer (Reborn) is to bring back historical websites whose content is no longer available online and whose domain continues to be held by their authors.

“Forgotten” domains can cause cybersecurity problems

In May 2023, the domain hmsportugal.pt of the Harvard Medical School-Portugal project referenced just one default web page hosted on an active server and the domain continued to be owned by its author.

In this situation, the original content of the website was inaccessible even though its author still owned the domain.

Furthermore, since the domain was still pointing to an active web server, cybersecurity issues could occur if this server were not properly maintained.

The domain hmsportugal.pt could be reborn to reference the contents of this website preserved by Arquivo.pt.

How are websites Reborn?

The domain owner only has to redirect it to Arquivo.pt, through the Memorial service.

For example, the mctes.pt domain once again references its original contents preserved by Arquivo.pt, so this website has been reborn.

Examples of Reborn domains

Project Renascer identified active domains managed by FCCN that were not referencing any content, and gave them a new life by making them reference their historical contents preserved by Arquivo.pt.

Contact Arquivo.pt to bring back the historical websites of your organization.

See the following examples of Reborn websites:

 

 

 

Free training on digital media – webinars

Last updated on June 2nd, 2023 at 05:35 am

The Aveiro Media Competence Center (AMCC) is a platform to support and promote the European Union (EU) Local News Media sector in the implementation of digital transition projects. The consortium includes the PCI Creative Science Park of Aveiro Region, the Associação Portuguesa de Imprensa and the University of Aveiro.

Arquivo.pt is a free public service that allows searching and accessing Web pages preserved since the 1990s, for example to view an old news story or access an old version of a website.

The collaboration between the AMCC and Arquivo.pt takes the form of a training program entitled Arquivo.pt: Digital Skills for the Media, delivered in four webinars, and of the AMCC Honorable Mention awarded in the Arquivo.pt Award 2023 to work on century-old Portuguese newspapers.

Webinar cycle: Arquivo.pt: digital skills for media

The webinar cycle aims to equip trainees with digital skills that enable them to solve problems caused by the disappearance of digital information and gain competitive advantage in the production of unique and exclusive content.

  • Webinar 1: A tool for quickly searching the past
    • Date: March 24, 2023, Time: 14h00-15h30 (in Portuguese)
  • Webinar 2: Publishing well for preserving well
    • Date: April 6, 2023, Time: 14h00-15h30 (in Portuguese)
  • Webinar 3: Automated access and processing of preserved Web information through APIs
    • Date: May 4, 2023, Time: 14h00-15h30 (in Portuguese)
    • Slides
    • Video
  • Webinar 4: Web archiving: do-it-yourself!
    • Date: June 1, 2023, Time: 14h00-15h30 (in Portuguese)

Prepare a work for the Arquivo.pt Award 2023!

Last updated on January 26th, 2023 at 12:21 pm

Arquivo.pt challenges you to create a work based on historical information preserved from the Web. Submissions are open until May 4th.

In this 6th edition of the Arquivo.pt Award, 15 000 euros will be granted to the three best works (1st place: 10 000 euros).

Works on any subject may be submitted, done individually or in a group. The only condition is that Arquivo.pt is the main source of information.

The Público newspaper will grant an Honorable Mention for works based on the web-archived content of Público online.

The Aveiro Media Competence Center (AMCC) will also grant an Honorable Mention to one of the submitted works that focuses on the archives of the online version of century-old newspapers.

All details at: arquivo.pt/award2023

The Arquivo.pt Award promotes the visibility of the applicants and their institutions.

Help us spread the word about the Arquivo.pt Award 2023 among potential candidates!

 

Arquivo.pt was considered the best Digital Service of 2022

Last updated on December 13th, 2022 at 12:39 pm

Exame Informática, the oldest Portuguese magazine on Information and Communication Technology, distinguished Arquivo.pt with the award for the Best Digital Service of the year 2022.

The prize was delivered during the 16th gala Os Melhores & As Maiores do Portugal Tecnológico, held on November 29th, 2022, in Lisbon.

Daniel Gomes, manager of Arquivo.pt, dedicated the award to the various teams that have worked on Arquivo.pt over the years. In the month in which Arquivo.pt marked 15 years of existence, this distinction is an excellent anniversary gift, he concluded.

He also invited those present to discover the old pages of Exame Informática preserved in Arquivo.pt.

Photos of the event

Seal of Os Melhores & As Maiores do Portugal Tecnológico 2022
Digital Service Award. Os Melhores & As Maiores do Portugal Tecnológico 2022

Videos

Award ceremony

Flash interview

Extract from the programme Exame Informática broadcast by SIC Notícias on 11 December 2022 (obtained by external screen recording).

15 years of Arquivo.pt celebrated in an event promoted by Wikimedia

Last updated on August 18th, 2023 at 03:29 pm

On November 8, 2007, the Portuguese Web Archive was officially created and later named Arquivo.pt.

To celebrate this date, Wikimedia Portugal and Arquivo.pt joined forces to organize an online event dedicated to the preservation of digital heritage.

Agenda

  • Introduction – André Barbosa, Wikimedia Portugal (Video)
  • 15 years of Arquivo.pt – Daniel Gomes, Arquivo.pt (Slides; Video)
  • Wikimedia at the University: Exploration and Projects at NOVA FCSH – Rute Correia, WMPT Residency at NOVA FCSH (Slides; Video)
  • GLAM Wiki: a general introduction – Giovanna Fontenelle, Wikimedia Foundation, Brazil (Slides; Video)
  • Demo of the open-access resources at Arquivo.pt – Daniel Gomes (Video)

More information

Afghanistan Websites and the fall of the regime in August 2021

Last updated on September 26th, 2022 at 03:57 pm

Afghanistan Ministry of Economy website with Karima Faryabi (recorded August 17, 2021)

On August 15, 2021 the presidential palace in Kabul was taken over by the Taliban, consummating the fall of the regime that had been in place for 20 years, following the 9/11 attacks on the United States.

The community of Web archivists, through the Content Development Working Group of the International Internet Preservation Consortium, was challenged to record Afghan websites, given the risk that they would disappear under the new regime.

No time to lose when it comes to preserving the Web

Arquivo.pt reacted quickly, launching an automatic content search focused on .af domain sites and on international media news about the ongoing events.

On August 17, the websites began to be recorded.

The recording started from 1800 website addresses from Afghanistan (ending in .af) and 500 media news stories from around the world.

These addresses (URLs, or “seeds”) were obtained through automated search using the Bing Search API and immediately queued for recording.
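To make the notion of seed selection concrete, here is a minimal Python sketch that keeps only candidate URLs whose host name ends in a given top-level domain and removes duplicates. It does not call any search API: it assumes the candidate URLs have already been collected, and the function name `select_seeds` and the sample hosts in the usage note are hypothetical.

```python
from urllib.parse import urlsplit


def select_seeds(candidates, tld=".af"):
    """Keep URLs whose host ends in the given TLD, without duplicates."""
    seeds = []
    seen = set()
    for url in candidates:
        # hostname is lower-cased by urlsplit and is None for malformed URLs.
        host = urlsplit(url).hostname or ""
        if not host.endswith(tld):
            continue
        if url not in seen:
            seen.add(url)
            seeds.append(url)
    return seeds
```

For instance, given `https://ministry.example.af/` and `https://news.example.com/story`, only the first would be kept as a .af seed. News stories from international media would need a separate list, since they are selected by topic rather than by domain.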

Content available to know Afghanistan’s history

As a result of this collection, more than 400 gigabytes of information became available at Arquivo.pt, which anyone can use for research in a wide range of areas.

The main contribution of Arquivo.pt to the community of Web archivists was the use of automatic search, which allows a quick reaction when recording Web content at imminent risk of being lost.

Learn more

Arquivo.pt open data set (Dados.gov)

Content collected by the Content Development Working Group of the International Internet Preservation Consortium available at the Archive-it service

Meet the winners of the Arquivo.pt Award 2022!

Last updated on April 28th, 2023 at 03:41 pm

The winners of the Arquivo.pt Award 2022 were announced on 22nd July 2022 by the Público newspaper, the official communication partner of this edition, which awarded an honorable mention to the best work based on its historical web content.

22 applications were received.

The award ceremony took place during the Commemorative Session of the World Science Day: the excellence of research in Portugal, on November 24th, at the Teatro Thalia, in Lisbon.

1st place – “Arquivo do Parlamento”

The 1st prize of 10 000 euros was awarded to the work “Arquivo do Parlamento” (Parliament Archive), developed by Tiago Santos.

“Arquivo do Parlamento” is a web application that aggregates news and opinion articles extracted from Arquivo.pt based on parlamento.pt open data.

For example, a user can search for a political figure and get speeches, news and other publications that Arquivo.pt has preserved.

2nd place – “Classificação automática de artigos estigmatizantes de doenças mentais”

The 2nd prize of 3 000 euros was awarded to the work “Automatic classification of stigmatizing articles of mental illness”, authored by Alina Yanchuk, Alina Trifan, Olga Fajarda and José Luís Oliveira.

This work developed a methodology that uses Artificial Intelligence to automatically classify stigmatizing articles about mental illness in Portuguese online newspapers.

For example, a news article about political life that uses the term schizophrenia is classified as stigmatizing. Using automated processes, this work makes it possible to identify thousands of news items and draw the attention of the media and society to the stigmatization of mental illnesses.

3rd place – “Arquivo Público”

The 3rd prize of 2 000 euros was awarded to the work “Arquivo Público”, developed by Diogo Correia and Ricardo Campos.

“Arquivo Público” is a web application focused on the contents published on the Público newspaper website over time and preserved by Arquivo.pt.

The result is a web interface that shows archived news about a specific subject, along with the number of news items, the most frequent terms and geographical references.

Honorable Mention granted by Público newspaper

The Público newspaper, official partner of the 5th edition of the Arquivo.pt Award, granted an Honorable Mention to the work “Arquivo Público”, carried out by Diogo Correia and Ricardo Campos.

Photos of the award ceremony

The award ceremony took place during the commemorative session of the National Day of Scientific Culture, on November 24th 2022, at the Teatro Thalia, in Lisbon.

The awards were presented by the Minister of Science, Technology and Higher Education, Elvira Fortunato, the President of the Board of Directors of FCT, Madalena Alves, and the representative of the media partner, the science editor of Público newspaper, Teresa Firmino.

Image Gallery

Award ceremony 2022

Photo credits: Pedro Ferreira – FCT | FCCN | Arquivo.pt

Video of the ceremony

Flash interview videos

Dissemination materials

Press

Short-link to this page: arquivo.pt/winners2022