Connecting the Dots Between Proper Sample Annotation and Research Success with a Biospecimen Tracking and Management System

Ensure Proper Sample Annotation with a Biospecimen Tracking and Management System

Biospecimens are invaluable assets in clinical research, offering a window into the complexities of human biology and disease. However, they are not limitless. Even with 30 million human-derived samples housed in the world’s top two biobanks, these resources are finite and precious, which makes their effective use paramount. Every biospecimen represents an opportunity for scientific discovery, but without proper annotation its full potential goes unrealized. Many delays and setbacks in research projects are attributed to insufficiently annotated samples. In this blog post, we explore the link between proper sample annotation and research outcomes.

What is Sample Annotation?

Sample annotation is the systematic process of labeling and providing detailed information about biological specimens stored within biobanks. This information typically includes metadata such as the specimen’s origin, characteristics, processing methods, storage conditions, and associated clinical or research data. Essentially, sample annotation serves as a roadmap, providing researchers with crucial contextual insights into each specimen’s history and relevance to specific research inquiries. 
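To make the idea concrete, here is a minimal sketch of what one annotation record might look like in code. The class name, field names, and example values are all hypothetical and chosen only to mirror the metadata categories described above; a real biobank would define its own schema.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class SampleAnnotation:
    """Hypothetical annotation record; field names are illustrative only."""
    sample_id: str                # unique identifier on the specimen label
    origin: str                   # where the specimen came from, e.g. a collection site
    specimen_type: str            # a basic characteristic, e.g. "plasma"
    processing_method: str        # how the specimen was prepared
    storage_temperature_c: float  # storage condition in degrees Celsius
    collected_at: datetime        # time of collection
    clinical_data: dict = field(default_factory=dict)  # associated clinical or research data

    def summary(self) -> str:
        """One-line description a researcher could scan while searching."""
        return (f"{self.sample_id}: {self.specimen_type} from {self.origin}, "
                f"{self.processing_method}, stored at {self.storage_temperature_c:g} C")


# Made-up example values
example = SampleAnnotation(
    sample_id="BB-000123",
    origin="Site A",
    specimen_type="plasma",
    processing_method="centrifuged",
    storage_temperature_c=-80,
    collected_at=datetime(2024, 3, 5, 9, 30),
    clinical_data={"diagnosis": "ischemic stroke"},
)
print(example.summary())
```

Keeping every category in one structured record, rather than in free-text notes, is what lets the specimen's history travel with it.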

In biobanking, sample annotation serves several purposes. First, it facilitates sample identification, tracking, and retrieval, ensuring that researchers can efficiently locate and access the specimens they need. Second, it enhances data integrity and reproducibility by providing a standardized framework for recording essential information about each sample. Third, it enables data sharing and collaboration, allowing researchers to exchange valuable insights and findings across different studies and institutions.

Why is Proper Sample Annotation More Important Than Ever?

The evolution of data capture methodologies over the last two decades has been monumental, reshaping the landscape of scientific research. A couple of decades ago, for instance, data obtained from nucleic acid electrophoresis was rarely labeled digitally. After electrophoresis, the resulting bands on an agarose gel were visualized under UV light. A picture of the gel was then taken with a physical camera, such as a Polaroid, and taped into a notebook along with the experiment details. At the time of publication, the gel picture was scanned to create a digital image. Because the original record existed only as a photograph taped into a notebook, the data itself was never digitally labeled.

In modern biobanks, nucleic acid gel electrophoresis data is labeled and annotated using digital imaging and analysis software. High-resolution images of gel bands are captured using digital gel documentation systems, which automatically detect and measure the bands. These images are annotated with specific lane and band information, as well as experimental details, and are stored in Electronic Lab Notebooks (ELNs) and Laboratory Information Management Systems (LIMS). This technological shift has fundamentally transformed the way sample annotation is approached. Unlike in the past, when physical photographs of gels were scanned for documentation, today’s research relies heavily on digital data capture and storage. This necessitates robust sample annotation protocols that keep pace with advances in technology and ensure data integrity.

Sample annotation extends beyond mere labeling of physical samples; it involves a comprehensive system of nomenclature that links each experimental analysis back to its original dataset. As computational analyses become more prevalent in research, the need for clear and consistent annotation becomes increasingly critical to interpret results accurately.

The current scientific landscape demands more sophisticated record-keeping systems, moving away from traditional paper notebooks towards electronic databases with linked data. Journals now require comprehensive data repository submissions to facilitate easy access and interpretation of research findings.

The rise of collaborative consortia underscores the importance of linked data in research. When multiple labs contribute datasets to a common problem, effective sample annotation becomes essential for linking disparate analyses back to the original sample population, ensuring the integrity and usefulness of the data generated.
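As a rough illustration of linking analyses from different labs back to the original sample, the sketch below walks a parent-child lineage table. The identifiers, lab names, and the `PARENT_OF` structure are invented for this example and do not describe any particular system.

```python
from collections import defaultdict

# Hypothetical lineage table: each derived item points to its parent.
PARENT_OF = {
    "S001-A1": "S001",          # aliquot 1 of sample S001
    "S001-A2": "S001",          # aliquot 2 of sample S001
    "S001-A1-DNA": "S001-A1",   # DNA extracted from aliquot 1
    "S002-A1": "S002",
}

# Hypothetical analyses contributed by different labs.
ANALYSES = [
    {"lab": "Lab X", "assay": "genotyping", "item": "S001-A1-DNA"},
    {"lab": "Lab Y", "assay": "proteomics", "item": "S001-A2"},
    {"lab": "Lab Y", "assay": "proteomics", "item": "S002-A1"},
]


def root_sample(item_id: str) -> str:
    """Walk up the lineage until an item with no recorded parent is reached."""
    seen = set()
    while item_id in PARENT_OF:
        if item_id in seen:
            raise ValueError(f"cycle in lineage at {item_id}")
        seen.add(item_id)
        item_id = PARENT_OF[item_id]
    return item_id


def analyses_by_sample(analyses):
    """Group (lab, assay) pairs under the original sample they trace back to."""
    grouped = defaultdict(list)
    for analysis in analyses:
        grouped[root_sample(analysis["item"])].append(
            (analysis["lab"], analysis["assay"])
        )
    return dict(grouped)


linked = analyses_by_sample(ANALYSES)
print(linked)
```

With consistent identifiers, results generated in separate labs can be grouped under the same original sample without any manual matching.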

What Type of Data is Annotated to Samples?

Sample annotation encompasses a diverse array of data types that provide crucial contextual information about each specimen stored within a biobank. Typically, this data includes personal and health-related information that offers insights into the sample’s origin, characteristics, and relevance to specific research inquiries. Here are some examples of the types of data that can be associated with samples for proper annotation:

  • Demographics: Information about the age, gender, ethnicity, and other demographic characteristics of the individual from whom the sample was collected.
  • History of Illness: Details about the individual’s medical history, including past illnesses, chronic conditions, and any relevant medical interventions.
  • Family History: Information about the individual’s family history of disease, which can provide insights into genetic predispositions and familial risk factors.
  • Cardiovascular Risk Factors: Data on factors such as blood pressure, cholesterol levels, smoking status, and other cardiovascular risk factors that may influence disease development.
  • Physical Measures: Measurements such as blood pressure, anthropometric measurements (e.g., height, weight, waist circumference), and other physical characteristics.
  • Lifestyle: Information about the individual’s lifestyle factors, including diet, exercise habits, alcohol consumption, and stress levels.
  • Treatments: Details about any medications, therapies, or medical interventions the individual has received.
  • Clinical Outcome: Information about the individual’s clinical outcomes, including disease progression, treatment response, and long-term prognosis.

For example, in stroke research, variables associated with stroke characterization that are typically annotated include the type of stroke, time of blood collection, acute revascularization treatments, brain imaging findings, functional outcomes, and follow-up data. These variables provide valuable insights into the nature and severity of the stroke, as well as the individual’s response to treatment and long-term prognosis. 

It’s important to note that clinical variables should ideally be defined according to established and accepted definitions and criteria to ensure consistency and comparability across studies. In some cases, disease-associated variables may be ascertained through linkage to routinely collected, coded national administrative health datasets. For example, International Classification of Diseases (ICD) codes recorded for stroke and stroke-related hospitalizations provide standardized and comprehensive information for research purposes.
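The sketch below shows one way coded hospitalization records could be turned into a stroke-type annotation. It assumes the WHO ICD-10 three-character categories I60, I61, I63, and I64; national variants of ICD differ, and which codes count as stroke is a study-level decision. The participant IDs and records are made up.

```python
# Assumed mapping from WHO ICD-10 three-character categories to stroke type.
# Which categories to include is a study design choice, not a fixed rule.
STROKE_CATEGORIES = {
    "I60": "subarachnoid haemorrhage",
    "I61": "intracerebral haemorrhage",
    "I63": "ischaemic stroke (cerebral infarction)",
    "I64": "stroke, type unspecified",
}


def stroke_type(icd10_code: str):
    """Return the stroke type for an ICD-10 code, or None if it is not a stroke code."""
    category = icd10_code.strip().upper().replace(".", "")[:3]
    return STROKE_CATEGORIES.get(category)


# Hypothetical hospitalization records linked to participant IDs.
admissions = [
    {"participant": "P01", "code": "I63.9"},
    {"participant": "P02", "code": "E11.9"},  # a diabetes code, not a stroke code
    {"participant": "P03", "code": "i61.0"},  # lower case is normalized
]

# Keep only participants whose admission code maps to a stroke type.
annotations = {}
for admission in admissions:
    kind = stroke_type(admission["code"])
    if kind is not None:
        annotations[admission["participant"]] = kind

print(annotations)
```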

What is the Impact of Sample Annotation on Research Outcomes?

Properly annotated samples provide researchers with vital contextual information that enhances the quality and reliability of their findings. 

In a research study conducted by the Canadian Primary Care Sentinel Surveillance Network (CPCSSN), the importance of sample annotation was vividly underscored. By melding general practitioner (GP) data with electronic medical record (EMR) data, researchers gained access to a wealth of information, revealing profound insights into patient health histories. For instance, in the case of a diagnosed diabetic patient, annotation allowed for a granular exploration of their pre-diabetic phase and body mass index (BMI) trends leading up to diagnosis. This example highlights how detailed sample annotation can facilitate in-depth analyses and uncover subtle patterns that may have otherwise gone unnoticed.

Detailed sample annotation ensures that researchers have access to accurate and comprehensive information about each specimen, reducing the risk of data errors and inaccuracies. This enhances the quality and reliability of research findings. It provides essential context for interpreting research data, allowing researchers to understand underlying characteristics and variables associated with each sample. Moreover, transparent and standardized sample annotation protocols promote reproducibility by allowing other researchers to replicate and verify study findings.

How Does a Biospecimen Tracking and Management System Aid in Annotating Samples Accurately?

A LIMS, also known as a biospecimen management and tracking system, serves as a critical tool for supporting sample annotation within biobanks and research settings. It can provide standardized templates for data entry, ensuring that all information necessary for sample annotation is recorded uniformly. A biobanking LIMS also requires staff to choose from predefined lists and controlled vocabularies, which reduces the chances of incorrect or inconsistent annotations. With a biospecimen tracking and management system, biobanks can manage all sample-related information in one central, accessible database. LIMS platforms also offer robust traceability features that track the complete lifecycle of each sample and record every action taken on it, including annotations, which enhances data transparency and supports compliance with regulatory requirements.
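The following is a simplified sketch of how a controlled vocabulary and an audit trail can work together. It is not the API of any specific LIMS; the vocabulary lists, the `annotate` function, and the log format are assumptions made for illustration.

```python
from datetime import datetime, timezone

# Hypothetical controlled vocabularies; a real system would define its own lists.
CONTROLLED_VOCAB = {
    "specimen_type": {"plasma", "serum", "whole blood", "tissue"},
    "storage_condition": {"-80C freezer", "liquid nitrogen", "4C refrigerator"},
}

audit_log = []  # append-only record of every annotation action


def annotate(sample: dict, field_name: str, value: str, user: str) -> None:
    """Set a field on a sample, rejecting values outside the controlled vocabulary."""
    allowed = CONTROLLED_VOCAB.get(field_name)
    if allowed is not None and value not in allowed:
        raise ValueError(f"{value!r} is not an allowed value for {field_name}")
    old_value = sample.get(field_name)
    sample[field_name] = value
    # Record who changed what, and when, so the change can be traced later.
    audit_log.append({
        "sample_id": sample["sample_id"],
        "field": field_name,
        "old": old_value,
        "new": value,
        "user": user,
        "at": datetime.now(timezone.utc).isoformat(),
    })


sample = {"sample_id": "BB-000123"}
annotate(sample, "specimen_type", "plasma", user="alice")

try:
    annotate(sample, "specimen_type", "plazma", user="bob")  # typo is rejected
except ValueError as error:
    print(error)

print(len(audit_log))  # only the accepted change was logged
```

Rejecting the misspelled value at entry time is what keeps later searches for "plasma" from silently missing samples.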

Furthermore, an interoperable LIMS can automatically capture and annotate analytical data directly from the source, which reduces manual data handling errors. A biospecimen management and tracking system can also integrate with other software systems such as EMRs, further ensuring that annotations are consistent across platforms. Additionally, a LIMS facilitates rapid data retrieval and analysis through powerful query and search functionalities, transforming data into meaningful information. A LIMS can enforce validation rules and checks to ensure data integrity. For example, it can prompt users to complete required fields or flag anomalies for review. A LIMS allows administrators to control who can annotate samples and what kind of data they can access or modify, thereby reducing the risk of unauthorized or erroneous annotations. Moreover, a LIMS can generate automated reports summarizing sample annotations, helping in quick verification and validation of the data. Finally, a cloud platform lets different agencies pool resources and collaborate, and it can also be used to standardize research practices and data dictionaries.
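Validation rules and access control of the kind described above might look roughly like this in code. The required fields, the temperature threshold, and the role names are all assumptions for the sake of the example, not settings from any real product.

```python
# Hypothetical required fields for an annotation record.
REQUIRED_FIELDS = ("sample_id", "specimen_type", "collected_at")

# Hypothetical roles and the actions each one may perform.
PERMISSIONS = {
    "technician": {"read", "annotate"},
    "analyst": {"read"},
}


def can(role: str, action: str) -> bool:
    """Return True if the role is allowed to perform the action."""
    return action in PERMISSIONS.get(role, set())


def validate(record: dict):
    """Return (missing_fields, flags) for an annotation record."""
    missing = [name for name in REQUIRED_FIELDS if not record.get(name)]
    flags = []
    temperature = record.get("storage_temperature_c")
    # Assumed rule: anything warmer than -20 C is flagged for human review.
    if temperature is not None and temperature > -20:
        flags.append("storage temperature above -20 C; review")
    return missing, flags


record = {"sample_id": "BB-000123", "storage_temperature_c": 4}
missing, flags = validate(record)
print(missing)
print(flags)
print(can("analyst", "annotate"))
```

Returning the missing fields and flags, instead of raising an error, lets the interface prompt the user to fix the record while still keeping it for review.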


Today, proper sample annotation has become more crucial than ever for ensuring the quality and reliability of research outcomes. Biospecimens, though invaluable, are finite resources that hold immense potential for scientific discovery. Without proper annotation, these specimens’ full potential remains untapped, hindering progress and impeding research. Sample annotation provides essential context, enhances data integrity, and ensures reproducibility of research findings, making it a foundational element of modern research. By leveraging advanced tools such as biospecimen management and tracking systems, researchers can streamline the annotation process, maintain comprehensive records, and foster meaningful collaboration across studies. Clearly, meticulous annotation of samples is not just a procedural necessity but a critical enabler of scientific progress.
