European Elections

2024 European and Portuguese elections in special Arquivo.pt collections

Arquivo.pt made special collections on the three elections that took place this year: the Parlamentary elections on 10 March, the elections in Madeira island on 26 May and the European elections on 9 June.

More than 70,000 pages with content related to the elections and political life in Portugal and Europe were identified and around 4 terabytes of information collected.

We would like to thank the people who contributed to the selection of pages. Teachers and students are encouraged to do work using the special collections on elections that Arquivo.pt has produced over the years.

Find out more about the collection procedure and the results obtained.

Portuguese Parlamentary Elections (Legislativas 2024)

The Portuguese Parlamentary Elections  took place on 10 March 2024 to elect the members of the Assembly of the Republic for the 16th Legislature of the Third Portuguese Republic.

We would like to highlight the community’s contribution to this collection with a manual selection of 827 pages, which helped to improve the quality of the collection.

Around 500 compound terms or keywords were used to search for content published on the web about the elections. The service used for the automatic search was the Bing Search API. The results were limited to the top 20.

For example, the compound term ‘head-to-head legislative 2024’ found pages relating to debates between candidates. The term ‘legislative housing 2024’ found pages relating to party proposals for housing. The term ‘legislativas 2024 site:expresso.pt’ identified Expresso pages about the elections. The names of the candidates were also used.

After the elections, search terms specific to that period were used, such as ‘legislative victory 2024’, ‘legislative defeat 2024’ or ‘legislative results 2024’, among others.

The automatic search in the Bing Search API resulted in 34,120 addresses obtained before the elections and 5,803 after the elections.

The websites of political parties, including parties without parliamentary seats, were also collected during the election period.

Not all the content identified could actually be recorded, due to the limitations of the recording tools or the restrictions of the websites themselves.

The tools Heritrix, Brozzler and Browsertrix-cloud (beta version), courtesy of Webrecorder.net, were used for the recording.

The recording took place between 6 and 20 March and resulted in 3.2 Terabytes of information. The contents have been included in the EAWP45 special collection and will be available after one year.

To find out more, consult the open dataset:

Madeira Legislative Assembly elections 2024

The elections for the Legislative Assembly of Madeira took place on 26 May. Arquivo.pt carried out a special collection of content published on the web.

We began by automatically searching for news, election pages and websites related to the elections in Madeira. We used a list of search terms to put into the Bing Search API.

The aim was to obtain as many URLs as possible related to the event or topic in question, i.e. the Madeiran elections. To do this, several limits were set for the results: top 10, top20, top50 and top100. This process was documented, which shows that the more we expand the number of results, the greater the number of pages that are not very relevant and sometimes outside the intended target.

All the addresses (12,656) were recorded on 7 June, together with the pages relating to the European Elections, in the Heritrix crawler.

Find out more by consulting the open dataset:

European elections 2024 in multilingual collection

The European elections took place on 9 June in Portugal. In some countries, such as Estonia, Czechia and Italy, the elections were held on a different date.

Arquivo.pt collected pages relating to the European Elections in the 27 countries of the European Union and in the 24 official languages.

The same methodology was used for the 2019 European Elections collection, i.e. a multilingual and semi-automatic search.

A list of 40 compound terms or keywords was used and translated into the 24 official EU languages. The terms were translated into the various languages in 2019 by the EU Publications Office. This resulted in a multilingual list of 960 terms to put into the Bing Search API.

Before the elections, on 3 June, the first search was carried out, resulting in 8,986 unique addresses, limiting the number of results to the top 20.

After the elections, new search terms were added with the names of the main candidates for the European Parliament in each country of the European Union. This second post-election search yielded 15,371 unique addresses.

The tool used for this collection was Heritrix. The collection was limited to three ‘hops’. In this case, the crawler follows links up to three times. This means that we opted for a certain restraint in the depth of the recording. Three ‘hops’ in the Heritrix crawler is enough to record one page (in other applications also called ‘page’ or ‘single page’ recording).

The content was recorded between 7 and 20 June and included in the EAWP46 special collection. It will be available after 1 year.

Find out more by consulting the open dataset:

Know more about past collections about elections