Table 2 Considerations for researchers planning to link databases

From: Using linked administrative and disease-specific databases to study end-of-life care on a population level

Topics

Considerations

Exploring relevant databases

Are my research questions clear and well-defined? What data are needed to answer them?

What is/are my study population(s)? What data are needed to identify it?

Which database(s) contain the core data and could thus be selected as a starting point?

When a starting database is chosen, what data are lacking to fully address the research questions? Where can we find them?

How can we establish contact with the administrators of the databases? Obtain approval in principle from all administrators (e.g. by presenting the study to the board of directors).

What is the cost associated with each database?

Variable selection

What specific variables do we need from the selected databases to answer our research questions?

Are the variables we want available and linkable between the different databases?

Does the preferred selection of variables complicate the linking procedure considerably? Balance the gain in information with the increase in complexity and time.

What is the required level of detail for each variable? Balance the preferred level with what is allowed in terms of data protection (e.g. through a small-cell risk analysis to determine the risk of re-identification based on a combination of variables).

Do we have sufficient storage capacity and analysis hardware to store and analyze all the data we want?

Access procedures

What ethical and privacy procedures need to be followed to link and access the selected databases?

What technical procedures need to be followed to link and access the selected databases?

Infrastructure

How will data be stored safely? Is infrastructure provided by researchers or by database administrators? What is the cost for this infrastructure?

How will data be protected? Physical and digital protection need to be guaranteed.

How can data be accessed in a safe and easy way? What hardware and software do we need to access and analyze the requested data?
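
The small-cell risk analysis mentioned under variable selection can be illustrated with a minimal sketch. It counts how many records share each combination of potentially identifying variables and flags combinations that occur fewer times than a chosen threshold. The function name `small_cells`, the variable names (`age_group`, `region`, `sex`), the toy records and the threshold are all hypothetical assumptions for illustration; the actual threshold and variables would be set by the data protection authority or database administrators involved.

```python
from collections import Counter


def small_cells(records, quasi_identifiers, threshold=5):
    """Return value combinations shared by fewer than `threshold` records.

    `records` is a list of dicts; `quasi_identifiers` lists the keys whose
    combined values might allow re-identification. The default threshold
    of 5 is an assumption, not a prescribed standard.
    """
    counts = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return {combo: n for combo, n in counts.items() if n < threshold}


# Hypothetical toy data, not taken from any real database.
records = [
    {"age_group": "80-89", "region": "A", "sex": "F"},
    {"age_group": "80-89", "region": "A", "sex": "F"},
    {"age_group": "80-89", "region": "A", "sex": "F"},
    {"age_group": "90+", "region": "B", "sex": "M"},
]

# Flag combinations that occur fewer than 3 times in the toy data.
risky = small_cells(records, ["age_group", "region", "sex"], threshold=3)
print(risky)  # {('90+', 'B', 'M'): 1}
```

If a combination is flagged, one option is to request a coarser level of detail (for example wider age groups or larger regions) and rerun the check until no small cells remain.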