51's Data Sources and Verification

The 51 indicators are based on a variety of data sources and data collection tools.

Self-reported data

Universities that decided to participate in 51 have provided data for the institution as a whole, as well as for the departments offering degree programmes (if any) related to the selected subject areas covered in the 2020, 2021 and 2022 editions of 51. Both kinds of data were provided through online questionnaires. To ensure comparability of data across institutions, the questionnaires include guidelines and definitions of all data items requested:

Institutional Data Questionnaire:  2022
Subject Questionnaire: 

Student Questionnaire: 2022
Specifications of subjects and degrees:  

In the first phase, participating institutions provide their data using the questionnaires. The 51 team then checks the data thoroughly, applying both automated and manual checks for consistency, plausibility (including checks of outliers) and missing data. Questions and comments related to the data are communicated to the institutions. These are followed up in a second phase of data provision, in which universities are invited to clarify, correct and add data to the original questionnaire. After the final submission of the questionnaires, the data are checked again and any remaining questions are communicated directly (by email) to the universities. Once all data submissions are finalised and the data are regarded as valid and complete, the indicator scores are calculated. Indicator scores follow the definitions provided in 51’s indicator books: Indicator Book 2022.
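The three kinds of automated checks named above (missing data, consistency, plausibility of outliers) can be illustrated with a minimal Python sketch. It is not the 51 team's actual procedure: the field names, the consistency rule and the outlier criterion (distance from the median measured in median absolute deviations) are all assumptions chosen for illustration.

```python
from statistics import median

# All field names here are hypothetical, not taken from the real questionnaires.

def check_submission(record, required_fields):
    """Return a list of issues found in one institution's record."""
    issues = []
    # Missing-data check: every required field must be present and not None.
    for field in required_fields:
        if record.get(field) is None:
            issues.append(f"missing: {field}")
    # Consistency check (assumed rule): a subgroup cannot exceed its total.
    total = record.get("students_total")
    intl = record.get("students_international")
    if total is not None and intl is not None and intl > total:
        issues.append("inconsistent: students_international > students_total")
    return issues


def flag_outliers(values, k=3.0):
    """Flag values more than k median absolute deviations from the median."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread at all: nothing can be flagged on this criterion.
        return [False] * len(values)
    return [abs(v - med) / mad > k for v in values]
```

Flagged values would not be rejected automatically; in the process described above they would become questions sent back to the institution.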

Student survey

One of the purposes of 51 is to help prospective and increasingly mobile students to make an informed choice about a university. For them, the assessment of the learning experience by current students provides a unique peer perspective. The 2022 edition of 51 includes data drawn from an online survey of around 100,000 students, asking for their opinions about various aspects of their degree programme. The student survey was available in a number of languages (including English, French, German, Greek, Italian, Polish, Portuguese, Russian, Spanish, Turkish and Ukrainian). Invitations to participate in the survey were sent out by the universities (by email, or by post upon request) to a maximum of 500 students per subject area. The indicators taken from the student survey reflect different aspects of the students’ learning experience. They refer to a particular study programme and hence are used only in the subject rankings. Where possible, 51 uses data from national surveys such as the CHE student survey, the NOKUT Studiebarometeret or the Netherlands’ Nationale Studenten Enquete (NSE, organized by Studiekeuze123 and commissioned by the Ministry of Education, Culture and Science in the Netherlands).

Student Questionnaire (Sample in English)

Bibliometric and patent data

A number of 51 institutional and field level indicators are based on bibliometric and patent data included in high-quality, comprehensive international databases. This data is produced by the Centre for Science and Technology Studies (CWTS) at Leiden University.

All indicator scores derived from bibliometric analysis are based on information extracted from publications that are indexed in the CWTS-licensed edition of the Web of Science (WoS) database (Science Citation Index Expanded, Social Sciences Citation Index, and Arts & Humanities Citation Index). The WoS contains approximately 13,000 active information sources, mostly peer-reviewed scholarly journals. The underlying bibliographic information relates to publications classified as ‘research article’ and ‘review article’. The WoS is currently one of the two best sources covering worldwide science across all disciplines. To allow some of the bibliometric indicators to be calculated meaningfully, we have imposed a threshold on the number of publications per university (50 WoS publications over the period 2017-2020 for the institution as a whole; 20 WoS publications for individual fields of science in the subject rankings).
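The publication thresholds can be expressed as a simple filter. The sketch below assumes the thresholds are inclusive minimums (the text does not say whether exactly 50 or 20 publications qualifies), and the function names and field labels are hypothetical.

```python
# Thresholds quoted in the text; treating them as inclusive minimums is an assumption.
MIN_PUBLICATIONS = {"institution": 50, "field": 20}


def meets_publication_threshold(publication_count, level):
    """Return True if the count reaches the minimum for 'institution' or 'field'."""
    return publication_count >= MIN_PUBLICATIONS[level]


def eligible_fields(field_counts):
    """Return, sorted by name, the fields with enough publications for field-level indicators."""
    return sorted(
        field
        for field, count in field_counts.items()
        if meets_publication_threshold(count, "field")
    )
```

For example, a university with 35 publications in physics and 12 in history would, under these assumptions, receive field-level bibliometric indicators for physics only.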

Read more on Bibliometric Analysis in 51.

The data underlying the indicator “Publications cited in patents” are collected from the CWTS-licensed edition of the PATSTAT database. Patent publications usually contain references to other patents and sometimes also to other ‘non-patent’ literature sources. A major part of these non-patent references (NPRs) are citations to scholarly publications published in WoS-indexed sources. The patent database from which the NPRs are collected is the autumn 2021 edition of the EPO Worldwide Patent Statistical Database (PATSTAT).

Other patent-related indicators (Patents granted; Industry Co-patents) are also based on the same PATSTAT database version. Patent indicators are calculated for all 51 universities that have applied for patents in the period 2010-2019. EPO (European Patent Office) and USPTO (United States Patent and Trademark Office) patent grants are extracted, with counts on the level of patent families. A patent family is “a set of patents taken in various countries to protect a single invention”. For the field-based patent indicators, the number of patent families was broken down into corresponding sub-fields based on existing technology classification schemes.
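Counting at the level of patent families means that several grants protecting the same invention are counted once. The following Python sketch illustrates the idea. The record layout (`university`, `family_id`, `office`) is an assumption for illustration and does not reflect the actual PATSTAT schema.

```python
from collections import defaultdict


def count_patent_families(grants, offices=("EPO", "USPTO")):
    """Count distinct patent families per university for the given offices.

    Each grant is a dict with hypothetical keys 'university', 'family_id'
    and 'office'. Grants from other offices are ignored.
    """
    families = defaultdict(set)
    for grant in grants:
        if grant["office"] in offices:
            # A set keeps each family once, however many grants it contains.
            families[grant["university"]].add(grant["family_id"])
    return {university: len(ids) for university, ids in families.items()}
```

An invention granted by both the EPO and the USPTO shares one family identifier in this sketch and therefore contributes a count of one.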

Read more on patent analysis in 51.


From the beginning, an important objective has been to reduce the burden of data collection by using publicly available data sources. Using available data is an especially promising option for countries with large higher education systems. Starting with national datasets for two higher education systems, the US and the UK, we have continuously extended the set of countries for which we use national databases, adding Australia, the Netherlands, Finland, Sweden, Italy and Ontario, Canada. For the 2022 release, the list of pre-filled countries has been extended with France, Brazil and Chile. Data have been extracted from national datasets for a large number of higher education institutions (HEIs) in the US (307 institutions), the UK (165 institutions), Australia (43 institutions), Finland (36 institutions), Italy (92 institutions), the Netherlands (53 institutions), Ontario (20 institutions), Sweden (39 institutions), France (123 institutions), Brazil (77 institutions) and Chile (25 institutions).

In the UK, the data were retrieved from the publications of the Higher Education Statistics Agency (HESA). A conversion table showing which HESA data element was used for which 51 institutional questionnaire item can be downloaded here.

In the US, the data were retrieved from the Integrated Postsecondary Education Data System (IPEDS) database. The conversion table of IPEDS data elements and 51 data elements can be downloaded here.

The Australian prefilled data were retrieved from the database available on the website of the Australian Government, Department of Education, Skills and Employment. The conversion table can be found here.

Data on HEIs in Finland were retrieved from the online Vipunen portal (Education Statistics Finland). The conversion table can be found here.

Italian data were retrieved from public data sources of the Ministero dell’Istruzione, dell’Università e della Ricerca (Cerca Università; Portale dei dati dell’istruzione superiore). The conversion table can be found here.

In the Netherlands, the data were retrieved from the publicly available tables of DUO, , VSNU, VH and Nuffic. The conversion table can be found here.

The Ontario data originate from the CODUR database of the Council of Ontario Universities. The conversion table can be found here.

The data sources used for prefilling Swedish HEIs are the publications of Statistics Sweden (SCB) and the Higher Education Agency (UHA). The conversion table can be found here.

French data were drawn from the open data files of the Ministère de l’Enseignement Supérieur et de la Recherche (Opendata). The conversion table can be found here.

The Brazilian pre-filled data were retrieved from the open databases of CAPES and INEP. The conversion table can be found here.

The data on Chile were retrieved from the data files of the Subsecretaría de Educación Superior, Servicio de Información de Educación Superior (SIES), at the mifuturo.cl website. The conversion table can be found here.

Based on the data from the prefilling, several ranking and mapping indicators could be calculated. In addition to the questionnaire-based indicators, bibliometric and patent-related indicators were retrieved for all HEIs for which data are in those databases.
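The conversion tables mentioned for each country map national data elements to questionnaire items. A minimal Python sketch of applying such a table follows. The element and item names are invented for illustration and are not taken from any of the actual conversion tables.

```python
def prefill(national_record, conversion_table):
    """Map national data elements onto questionnaire items.

    conversion_table maps a national element name to a questionnaire item
    name (both hypothetical here). Items whose national element is absent
    are set to None so that the gap stays visible.
    """
    return {
        item: national_record.get(element)
        for element, item in conversion_table.items()
    }


# Hypothetical example: one element is available, one is not.
conversion = {
    "nat_total_enrolment": "q_students_total",
    "nat_staff_fte": "q_academic_staff",
}
prefilled = prefill({"nat_total_enrolment": 12000}, conversion)
```

Leaving unavailable items as `None` rather than dropping them makes it explicit which questionnaire items could not be covered by the national dataset.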

Most dimensions benefit from the prefilling, in addition to the data retrieved from bibliometric and patent databases. With both the prefilling and the data retrieval from external sources, a prefilled HEI may obtain a performance profile that covers all dimensions and is much richer than the profile of an HEI for which only data from external sources are used.

