Biobank characteristics

Some considerations for a successful project collaboration

Biobanks differ widely in the nature of their sample collections and associated data. Below are some factors that are important to consider when establishing a collaboration, illustrated with examples.


Cohort size

Cohorts in biobanks vary widely. Some distinguishing features are:

• Rare disease cohorts.
• Population isolates.
• Deep endophenotype profiles.
• Patient populations with very thorough clinical characterization.


Biological samples

What makes biobanks valuable is the availability of biological samples for research. Registries offer prospective observational data, while biological samples make it possible to search for predictive biomarkers, estimate causality, and develop deeper mechanistic insight into disease pathophysiology.

The most common sample type in research biobanks is blood, whereas clinical biobanks keep almost every type of patient sample.


Molecular profiles

If the main focus areas are understanding disease pathophysiology, finding or developing biomarkers, and drug discovery, then biological data already profiled at the molecular level are of particular interest.
• Molecular profiles offer direct readouts of biological processes and carry substantially more potential intellectual property value than questionnaires or health records alone.
• Very few biobanks have molecular profiles available other than genetic data.
• Industry has a growing interest in incorporating molecular profiles into R&D processes.


Source and depth of phenotype data

Most biobanks are cross-sectional: all available phenotype information was collected during recruitment, either through questionnaires or through objective measurements and experiments. In such settings, most lifestyle and disease data are self-reported.

Biobanks that regularly invite participants to recruitment centers for physical evaluation visits, or that can retrieve lifestyle and health status updates from electronic databases, offer a more comprehensive overview of participants’ life trajectories, which enables insight into more complex phenomena or behavioral patterns. Another way to gather longitudinal data is to send questionnaires to participants at regular intervals. This approach, however, has several limitations, including low response rates and incomplete data, e.g. when participants answer the questions selectively.


Legislation and policies for data sharing

Although European biobanks hold very large numbers of samples, only a small fraction of those samples have been properly consented for research, and an even smaller fraction have longitudinal health data available or the possibility of recalling participants.
Often, participants have given consent for only a specific study.

Of similar importance are the regulations on data access, storage abroad, and sharing, which vary between countries. In some countries, such as China and Russia, taking biological samples out of the country is very complicated. In others, no particular limitations apply, provided appropriate institutional review board (IRB) approvals are in place. In some countries, all data processing must be carried out on local servers, and the use of cloud services is not allowed.

In general, European biobanks accept cloud providers, but the cloud data centers have to be physically located in Europe. For some projects, industry partners prefer to work directly with the raw data. In such cases, if data cannot leave the biobank's servers, the biobank needs to provide secure and scalable IT solutions in order to take part in certain types of projects.