Data in the Local Database comes from the local systems and portals (also known as CRIS systems) that Danish research institutions use for registering their own publication outputs and activities. In the database, these local data providers are referred to as ‘Danish Data Providers’. Using the filter of the same name, you can limit your results to publications registered by and harvested from one or more of the individual Danish Data Providers.
In addition to the current Danish Data Providers, we are in dialogue with several other Danish research institutions to be included in the portal in the future.
Data is harvested from the individual Danish Data Provider’s local systems via a web service (OAI-PMH) using the national exchange format DDF-MXD (Danish Research Database Metadata Exchange Format), which is used to exchange publication metadata.
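As an illustration of how a harvest over OAI-PMH can work, here is a minimal sketch in Python. It builds `ListRecords` request URLs and reads the `resumptionToken` that OAI-PMH uses for paging. The base URL is hypothetical, and the `ddf-mxd` metadata prefix is an assumption; the prefix actually advertised by a given Danish Data Provider may differ. The sketch makes no network calls and is not the portal's own harvester.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it must not be combined with metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def next_token(response_xml):
    """Return the resumptionToken of a response page, or None on the last page."""
    root = ET.fromstring(response_xml)
    token = root.find(".//oai:resumptionToken", OAI_NS)
    return token.text if token is not None and token.text else None


# Hypothetical endpoint and assumed metadata prefix, for illustration only.
BASE_URL = "https://cris.example.org/oai"
first_url = list_records_url(BASE_URL, metadata_prefix="ddf-mxd")

# A trimmed, made-up response page that announces a further page.
sample_page = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

follow_up_url = list_records_url(BASE_URL, resumption_token=next_token(sample_page))
```

A harvester would request `first_url`, then keep requesting follow-up URLs until `next_token` returns `None`.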
Data from the various Danish Data Providers is harvested around the 20th of each month, and each monthly harvest is a full harvest. This means that changes made locally at an individual research institution are reflected in the Local Database after the next update. After harvesting, the data is processed, quality assured and matched with publication data from the Global Database, so that new data becomes available in the Local Database around the beginning of each month.
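To make the timing concrete, the following Python sketch estimates when a local change is likely to become visible. It assumes, purely for illustration, that the harvest runs exactly on the 20th and that the update is released exactly on the 1st of the following month; the real dates are only approximate ("around"), so treat the result as a rough guide.

```python
from datetime import date

HARVEST_DAY = 20  # assumption: "around the 20th" taken as exactly the 20th


def first_of_next_month(d):
    """Return the 1st of the month following the month of d."""
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)


def next_harvest(change):
    """First harvest date on or after the day a local change was made."""
    if change.day <= HARVEST_DAY:
        return change.replace(day=HARVEST_DAY)
    # The change was made after this month's harvest, so it waits for the next one.
    return first_of_next_month(change).replace(day=HARVEST_DAY)


def expected_release(change):
    """Approximate date the change is visible: the 1st of the month after its harvest."""
    return first_of_next_month(next_harvest(change))


early_change = expected_release(date(2024, 3, 5))   # harvested in March
late_change = expected_release(date(2024, 3, 25))   # misses the March harvest
```

Under these assumptions, a change made just after a harvest can take more than a month to appear in the Local Database.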
In the interim period between two updates, data is further enhanced by an updated mapping of Danish organization names, based on new name variants that appeared in the previous data update.
In the filter ‘Added on’ you can see and limit your search result to the year/month when the individual publication records were added to the database. This can be particularly useful if you want to see only newly added publications.
DDF-MXD (Danish Research Database Metadata Exchange Format) is a nationally developed format used for exchanging publication metadata. In Research Portal Denmark, DDF-MXD is used to systematically harvest the publication data registered in the local systems and portals of Danish research institutions (Danish Data Providers). The format is used for data harvesting in the portal’s Local Database as well as in the Danish Open Access Indicator.
DDF-MXD makes it easier to harvest data from various systems and enables direct conversion of the data into the portal, rather than collecting diverse data and subsequently processing it within the portal before display.
As a national format, DDF-MXD can be continuously modified and expanded with new relevant fields to stay up to date. Proposals for changes and extensions to the format are managed and reviewed by NORA and approved by The Working Group for Data from Danish Research Institutions. Suggestions for modifications and/or expansions can be submitted to nora.info@dtu.dk.
From the Danish Data Providers, only a selection of publication types and categories is harvested:
- Journal Article
- Journal Comment
- Journal Review
- Newspaper Article
- Book
- Book Chapter
- Book Preface / Encyclopedia article
- Report
- Report Chapter
- Conference Paper
- Conference Abstract
- Conference Poster
- Working Paper / Preprint
- Lecture Notes
- Thesis Doctoral
- Thesis PhD
- Other
Because the Local Database also includes publication types such as conference posters and popular science publications, which are typically not indexed by the global commercial data providers, its scope is broader in terms of publication type than that of the Global Database, which focuses exclusively on more traditional peer-reviewed scientific publications.
The research institutions can, to some extent, decide for themselves which data they expose to the Research Portal via their web service. Some Danish Data Providers only expose publications with the status of finally published, while others require publication records to be validated before they can be harvested via their web service. In other words, from some research institutions we receive all the publication data they have registered, while from others we can only harvest a subset of the publication records registered by the individual Danish Data Provider.
Within the scope of the selection of publication types and categories, metadata is retrieved from 2011 onwards from all the Danish Data Providers.
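The harvesting scope described above (a fixed selection of publication types, from 2011 onwards) can be expressed as a simple filter. The Python sketch below is illustrative only: the record layout with `type` and `year` keys is hypothetical, and the type labels are copied from the list above rather than from the codes used in DDF-MXD itself.

```python
# Publication types listed above as being harvested.
HARVESTED_TYPES = {
    "Journal Article", "Journal Comment", "Journal Review", "Newspaper Article",
    "Book", "Book Chapter", "Book Preface / Encyclopedia article",
    "Report", "Report Chapter",
    "Conference Paper", "Conference Abstract", "Conference Poster",
    "Working Paper / Preprint", "Lecture Notes",
    "Thesis Doctoral", "Thesis PhD", "Other",
}
FIRST_YEAR = 2011  # metadata is retrieved from 2011 onwards


def in_scope(record):
    """True if a record falls within the harvested types and years."""
    return record["type"] in HARVESTED_TYPES and record["year"] >= FIRST_YEAR


# Made-up records; "Patent" stands in for any type that is not on the list.
records = [
    {"title": "A", "type": "Conference Poster", "year": 2015},
    {"title": "B", "type": "Journal Article", "year": 2009},
    {"title": "C", "type": "Patent", "year": 2020},
]
harvested = [r["title"] for r in records if in_scope(r)]
```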
NORA-Enhancements is a general term for standardised names used across all databases of Research Portal Denmark. Standardisation is done to ensure consistent and structured data, which makes it both easier to search and forms the basis for groupings of particular analytical interest.
Metadata elements collected from various data providers often contain multiple and different name variants describing the same value.
In the Local Database, the following metadata elements are standardised (mapped or grouped):
Danish affiliations and groupings
All Danish affiliation names are mapped to one standardised name and fall under a specific grouping. For example, both ‘University of Copenhagen’ and ‘Copenhagen University’ are mapped to the standardised affiliation name ‘KU University of Copenhagen’ and categorised under the grouping ‘Universities’ (find these in the Danish Affiliations filter).
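Conceptually, this kind of mapping is a lookup from name variants to one standardised name and one grouping. The Python sketch below shows the idea using only the example from the text; the normalisation step (lower-casing and collapsing whitespace) is an assumption for illustration and not a description of how NORA actually matches name variants.

```python
# Name variant -> (standardised affiliation name, grouping).
# Only the example from the text is included here.
AFFILIATION_MAP = {
    "university of copenhagen": ("KU University of Copenhagen", "Universities"),
    "copenhagen university": ("KU University of Copenhagen", "Universities"),
}


def standardise_affiliation(name):
    """Return (standardised name, grouping) for a variant, or None if unmapped."""
    key = " ".join(name.lower().split())  # assumed normalisation: case and spacing
    return AFFILIATION_MAP.get(key)


mapped = standardise_affiliation("Copenhagen  University")
unmapped = standardise_affiliation("Some Unlisted Institute")
```

Variants that are not yet in the mapping return `None`, which corresponds to new name variants that still need to be mapped.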
Collaborating countries/regions
The other countries with which Danish authors co-publish (collaboration countries) are grouped into regions of analytical interest. For example, ‘Norway’ is grouped into the regions ‘Europe’, ‘Non-EU’, ‘Nordic’ and ‘OECD’ (find these in the Collaboration – Regions filter).
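Unlike the affiliation mapping, the region grouping is one-to-many: a single country can belong to several regions at once. A minimal Python sketch, using only the ‘Norway’ example from the text (the function name and data layout are hypothetical):

```python
# Country -> set of regions. One country can belong to several regions.
COUNTRY_REGIONS = {
    "Norway": {"Europe", "Non-EU", "Nordic", "OECD"},
}


def collaboration_regions(countries):
    """Return every region covered by a publication's collaboration countries."""
    regions = set()
    for country in countries:
        # Countries without a mapping contribute no regions.
        regions |= COUNTRY_REGIONS.get(country, set())
    return regions


regions = collaboration_regions(["Norway"])
```

A publication co-authored with Norway would therefore be found under each of the four regions in the filter.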
Subject classifications and Open Access categories
Unlike the Global Database, neither subject classifications nor Open Access status are enhanced/mapped, as this standardisation is partly already done via the transformation of the data through the DDF-MXD format. However, minor standardisations are applied in several filters in the Local Database, e.g. Keywords and Source Filters.
Read more about NORA-Enhancements in the technical documentation.
Since data are harvested from several Danish Data Providers, the full dataset will contain a number of publication duplicates. The reason for this is that multiple universities or other research institutions co-publish, resulting in several Danish research institutions registering the same publication in their respective local systems/portals.
To ensure that publications registered by two or more of the Danish Data Providers are identified and matched correctly, we have developed a deduplication algorithm that consolidates data from the local Danish Data Providers. The algorithm is based on a series of fine-tuned rules that determine whether two records actually describe the same publication. These rules take various conditions into account, with metadata in the form of PIDs (persistent identifiers such as DOI, PMID, ISSN etc.) playing a central role.
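The rules themselves are not described here, so the following Python sketch only illustrates the general idea of PID-based matching with the simplest possible rule: records that share the same normalised DOI are grouped together. It is not the portal's deduplication algorithm, and the record layout, provider names and DOIs are all made up for the example.

```python
from collections import defaultdict

DOI_PREFIXES = (
    "https://doi.org/", "http://doi.org/",
    "https://dx.doi.org/", "http://dx.doi.org/", "doi:",
)


def normalise_doi(doi):
    """Lower-case a DOI and strip common resolver prefixes."""
    doi = doi.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            return doi[len(prefix):]
    return doi


def group_by_doi(records):
    """Group records that share a DOI; records without a DOI stay on their own."""
    groups = defaultdict(list)
    singletons = []
    for record in records:
        doi = record.get("doi")
        if doi:
            groups[normalise_doi(doi)].append(record)
        else:
            singletons.append([record])
    return list(groups.values()) + singletons


# Made-up records from two hypothetical providers.
records = [
    {"provider": "Provider A", "doi": "https://doi.org/10.1234/EXAMPLE.1"},
    {"provider": "Provider B", "doi": "10.1234/example.1"},
    {"provider": "Provider A", "doi": None},
]
clusters = group_by_doi(records)
```

A real rule set would also need to handle records without any PID, conflicting PIDs and identifiers such as ISSN that identify a journal rather than a single publication.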
If a consolidated record consists of data from several Danish Data Providers, this is always indicated by a symbol, both in the result list and on the full publication record itself.
The display of the publications in the database consolidates metadata from all the individual publication records being merged. For this purpose, a number of display rules have been created, specifying which metadata fields must be displayed and used in the database’s filters.
If you need to use the local data and/or the database without deduplication (e.g. for analytical purposes), there is also a version of the Local Database with duplicates (called Local Data – Raw Data) available.
In addition to matching publications from the various Danish Data Providers, publications from the Local Database are matched with the corresponding publication records in the Global Database, which establishes a link between the two databases. You can find this information on the individual publication record or in the filter ‘Matching Records in’.
Read more about this cross-cutting matching algorithm in the technical documentation.
When you look at a publication record in full view in the Local Database, you can click on ‘Data Provider’, and see which Danish Data Providers have contributed to the merged and consolidated record. In the pop-up box that appears, you will find information about the individual records that have been matched, links to the record in the individual Danish Data Providers’ own systems/portals, as well as links to the record in the DDF-MXD format.
Each publication record has a local object ID (also known as LOI). The LOI is structured so that you can see how many individual records a publication record in the database is consolidated from (even if it is only a single record) and how high the match percentage is between the consolidated records.
We are always in close dialogue with the Danish Data Providers, and the NORA team emphasises thorough testing of each institution’s data before data from a new Danish Data Provider is finally included in the Local Database.
The collaboration with the Danish Data Providers is also anchored in The Working Group for Data from Danish Research Institutions, which represents the various types of research institutions and acts as a sparring partner for NORA.
The Local Database has some challenges that are worth paying attention to:
- Data is harvested as defined and registered by the individual Danish Data Provider.
- The coverage depends on the Danish Data Providers that we are harvesting from. That is, the research institutions that currently are Danish Data Providers, and their collaboration partners, are well covered, while others, such as private companies, are not covered to the same extent by the database.
- The publication data is based on the locally registered publication metadata from several different research institutions’ own systems, which may have specific setups following local needs.
- Updates of the Danish Data Providers’ own local systems do not necessarily happen at the same time. Even if several Danish Data Providers use the same system provider for their local system/portal, new content, extensions of the MXD format and corrections only take effect once the individual research institution has updated its local system to the version containing these fixes or changes.
- Display of data and registration practices may differ from data provider to data provider, so the data we receive from the individual Danish Data Providers may also vary.
In the Operation Status section, you can access a summary of current issues related to updates, such as problems with data harvesting.