Preparing your genomic data files for downstream analysis

Following the identification of a cohort (covered in sessions 1 and 2 of the webinar series), popular tools such as bcftools make it simple to prepare genomics data in a ready-to-analyse format.

This third training session of the Data Science Get Results series will demonstrate how to find or link omics data within the Lifebit Platform, mount it to an interactive session, and use bcftools, PLINK2 and R to manipulate files and prepare the data for downstream analysis.

Note that the data used in this session is synthetic or publicly available and does not include any participant-identifiable information. Only users who are registered to use the Lifebit Platform with any Lifebit client are eligible to attend this course. Any attendees who do not meet this criterion will be unregistered prior to the course.


What you will learn

  1. How to import files into an interactive session
  2. Best practices for loading packages in the Lifebit Platform
  3. How to subset and aggregate files using bcftools
  4. How to convert files from one format to another using PLINK2
  5. How to use R to create file lists for pipelines such as GWAS and Exomiser



Target audience

This course is intended for researchers who need to prepare omics data for downstream analysis. Attendees tend to be academic researchers, data scientists, or bioinformaticians. A basic understanding of the command-line interface is beneficial.