Data Transformation

Unlock the value of your large-scale health data
with Lifebit’s Data Transformation Suite

The challenge
Data for health research comes from a wide range of sources


With this diversity comes wide variability in how data is described and stored:

  • Formats can vary widely
  • Fields are named differently between datasets
  • Different medical vocabularies are used in different datasets
  • Data is often not cleaned

This diversity is also evident across geographies and countries, as local and national healthcare reporting systems frequently use different data models and ontologies.

These differences in the way that data is stored or described create challenges for researchers preparing data for analyses: 

  • Inefficient and repetitive tasks
    Researchers repeatedly spend time and effort cleaning and harmonising data before analysis.
  • Coding and data expertise needed
    Data cannot easily be used, cleaned or harmonised by people who are unfamiliar with it or who lack the necessary coding expertise.
  • Limited analysis potential
    Analysis queries cannot readily be run on unstructured data or across data held in different file formats.


Types of data graphic

The solution

Adopting a common data model

Lifebit’s Data Transformation Suite is used by leading organisations across healthcare and life sciences to accelerate scientific discoveries. 

It is harnessing the power of data for Genomics England, the UK government’s flagship initiative to improve the genomic health of the population, and is accelerating research insights for academia, healthcare and pharmaceutical companies across the world.

Find out more



Transforming data to OMOP

Lifebit’s platform makes data usable, quickly.

Lifebit is certified by the European Health Data & Evidence Network (EHDEN) to transform data to the Observational Medical Outcomes Partnership (OMOP) common data model.

This involves harmonising disparate data sources: transforming them into a common format and using a standard set of vocabularies, so that they can be analysed using a library of standard analytic pipelines.

Lifebit’s Data Transformation Suite uses a set of pipelines that transform raw data to analysis-ready data. These pipelines are automated yet flexible, and built to accommodate new data types over time.

Through the Data Transformation Suite, data is harmonised, mapped to existing standards, annotations and ontologies, and then interlinked during data ingestion to produce a linked data graph. This process increases the interoperability, reusability and overall actionability of the data.
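
To make the idea of harmonisation concrete, here is a minimal Python sketch of renaming source fields to a common set and mapping local codes to a standard vocabulary. It is an illustration only and not Lifebit’s implementation: the dataset names, source field names, `FIELD_MAPS` and the `harmonise` function are all hypothetical. The target field names follow the OMOP `person` table, and the concept IDs used (8507 for male, 8532 for female, 0 for no matching concept) are standard OMOP values.

```python
# Illustrative sketch only: all dataset names, source field names and the
# harmonise() helper are hypothetical, not part of any Lifebit product.

# Hypothetical mapping specifications: source field name -> common field name.
FIELD_MAPS = {
    "hospital_a": {"pt_id": "person_id", "sex": "gender", "birth_yr": "year_of_birth"},
    "registry_b": {"PatientID": "person_id", "Gender": "gender", "YOB": "year_of_birth"},
}

# Local gender codes mapped to OMOP standard concept IDs
# (8507 = MALE, 8532 = FEMALE). Unmapped values fall back to 0,
# the OMOP convention for "no matching concept".
GENDER_CONCEPTS = {"m": 8507, "male": 8507, "f": 8532, "female": 8532}


def harmonise(record: dict, source: str) -> dict:
    """Rename fields, clean values and map gender to a standard concept ID."""
    field_map = FIELD_MAPS[source]
    renamed = {}
    for src_field, value in record.items():
        target = field_map.get(src_field)
        if target is None:
            continue  # fields without a mapping are dropped in this sketch
        # Basic cleaning: trim stray whitespace from string values.
        renamed[target] = value.strip() if isinstance(value, str) else value

    gender = str(renamed.get("gender", "")).lower()
    return {
        "person_id": int(renamed["person_id"]),
        "gender_concept_id": GENDER_CONCEPTS.get(gender, 0),
        "year_of_birth": int(renamed["year_of_birth"]),
    }


rows = [
    harmonise({"pt_id": "17", "sex": " F ", "birth_yr": "1984"}, "hospital_a"),
    harmonise({"PatientID": 42, "Gender": "Male", "YOB": 1990, "Notes": "n/a"}, "registry_b"),
]
for row in rows:
    print(row)
```

Once both records share field names, types and vocabulary, a single query can run across them. The sketch covers only field renaming and vocabulary mapping; it does not attempt the interlinking step that produces a linked data graph.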

Client Environment graphic

The Data Transformation Suite is fully automated and based on
three key components of Lifebit’s technology


Pipeline Composer

A tool that enables coders and non-coders alike to easily build their own reproducible, complex workflows for clinical and multi-omics data analyses with an intuitive drag-and-drop interface.


ETL (Extraction, Transformation, Loading) Pipelines

Predefined pipelines, pre-configured in the platform, that automatically process and convert raw data to analysis-ready data.


Health Data knowledge base

A proprietary knowledge base of mapping specification files, built on Lifebit’s deep OMOP experience across diverse data types and formats.

Increasing data utility
creating analysis-ready data for research

Lifebit’s data transformation pipelines produce cleaned, harmonised and mapped data that is analysis-ready.
The data is then ingested into the platform database, where it can be queried for research.


Data pipelines graphic



Interested in learning more about
Lifebit’s federated data solution
for genomics research?


At Lifebit, we develop secure federated data analysis solutions for clients including Genomics England, NIHR Cambridge Biomedical Research Centre, Danish National Genome Centre and Boehringer Ingelheim to help researchers turn data into discoveries.

Contact us  Request a demo