Back to Blogs

BLOG

Targeting cancer through large-scale data analysis

Home / Blog / Targeting cancer through large-scale data analysis

Hadley Sheppard, PhD

September 22, 2023

22 September 2023

Introduction to World Cancer Research Day

September 24th, World Cancer Research Day, is a global international movement dedicated to maintaining the momentum of cancer research. In 2020 alone, there were 18 million new cases of cancer recorded, with a striking 10 million new deaths worldwide.

While there has been tremendous progress in developing cancer therapies, it is only in the last ten years, which coincide with the sequencing of the human genome, that there has been a sharp uptake in targeted drugs. Furthermore, the road to drug approval is long, 10-15 years; in 2022, the US Federal Drug Administration listed 37 approvals for new drugs, with only 31% for cancer treatment.

Given this challenge, it is critical that the greater scientific community remain committed to cancer research and identifying innovative approaches to accelerate findings. Currently, the scientific community is failing - cancer drug discovery as it is (Drug Discovery 1.0) relies on wet lab approaches that take tens of years to complete, cost billions of dollars and patients are left without treatments. The solution to this problem is Drug Discovery 2.0 - harnessing the power of big data to bring life-saving therapies to people.

This blog considers the challenges surrounding targeting cancer and how data, data federation, Trusted Research Environments, and end-to-end solutions are the solution to these problems.

The challenges surrounding targeting cancer

Cancer is fundamentally a problem of genetic dysregulation. In a healthy cell, some genes become active to promote cell growth and those that become active to stop cell growth. Both processes are essential within normal development. For example, cells need to grow to regenerate tissue if one falls and scrapes their knee, but there also are signals to stop cell growth if genetic abnormalities are detected.

In cancer, the normal cellular signals to grow or not grow have gone awry. In genetic terms, identifying the root cause of a cancer can fall into two categories:

Oncogenes: A gene capable of inducing characteristics of cancer cells. Cancers are often addicted to having these genes turned on.
Tumour suppressor genes: A gene whose inactivation leads to tumour development.

Researchers have done considerable work to identify how cancers sustain growth. In 2011, Professors Douglas Hanahan and Robert Weinberg published the foundational Hallmarks of Cancer, highlighting the key ways cancer cells continue to survive, which continue to be updated as new knowledge is gained.

To make the genetics even more complicated, 40% of cancer-causing genes are known to be transcription factors, or genes that control other genes. A stark example is the MITF gene in melanoma (a type of skin cancer)- turned on in the body long after its developmental role, this transcription factor plays a pro-survival role in melanoma.

Given this, there needs to be continued efforts and methods to uncover the complex genetics of cancer biology, and ultimately ways to discover targets and treat the disease.

Drug Discovery 2.0: unravelling cancer’s complexities through big data

As mentioned above, with the sequencing of the human genome, there has been a sharp increase in targeted cancer therapies. The first generation of cancer drugs were designed to stop cell growth, which ultimately included both healthy and cancer cells, classifying these drugs as cytotoxic due to their many side effects.

However, as genomic sequencing data became available, researchers could begin to create molecular maps and identify cancer’s vulnerabilities. However, much of the work still took place in the lab, and Drug Discovery 1.0 was slow to progress.

Drug Discovery 2.0 will revolutionise the speed at which we understand which genes may be turned on or off within a given cancer and the role that they are playing. One of the first stages in drug discovery is target identification, which refers to determining what a drug should attack in a specific disease context. Because Drug Discovery 2.0 uses computational approaches that examine the entire genome. Researchers can compare genomic data from healthy patients and those that have a specific cancer, streamlining how to identify what an effective drug should target. This approach has brought timelines of target identification from years down to months.

The process of drug discovery relies heavily on access to usable large-scale data.

Next in the process of drug discovery, a cancer target is experimentally validated and drugs are developed against it, allowing them to enter preclinical development. Large-scale data is essential in preclinical cancer research because it helps researchers understand if and how early-stage drugs are working, before they enter a patient.

Once evaluated preclinically, cancer drugs will enter clinical development for testing in patients. Large-scale health data serves to understand how the drug is functioning in cancer patients and helps medical professionals identify the appropriate group to be treated with the drug. Drug Discovery 2.0 uses large-scale health data to group patients into disease subgroups, known as patient stratification, accelerating the translation of drugs into the clinic by maximising the likelihood of their success while preserving patient safety.

Considerations in using sensitive data within cancer research

Given the insights large sets of data can bring, it is unsurprising that there are now 2 to 40 billion gigabytes of data generated each year in cancer care and research and beyond. However, there are key considerations to ensuring that Drug Discovery 2.0 can achieve its full potential:

Data needs to be accessible to researchers and clinicians without compromising patient security. Housing data in secure, Trusted Research Environments can maintain security while enabling research at scale.
Employing federated technologies for secure access avoids copying and physically moving the data, allowing it to remain in with its jurisdictional boundaries.
Linking and extracting insights from diverse data types across various sources and modalities (e.g. preclinical, clinical, molecular, imaging) enables researchers and clinicians to create a full biomedical picture for a given patient.
End-to-End Analytical Solutions reduce the time to insights by keeping data access, standardisation and analysis all in one place.
Implementing democratised, user-friendly, no-code solutions empower researchers regardless of a data science background.

There are several key approaches to enable the use of big data within cancer research.

Featured resource: Read our whitepaper on forecasting the future of genomic data management

Concluding remarks

Recognising World Cancer Day not only reminds us of the milestones that have been achieved but also invigorates momentum for cancer research to continue to improve patient outcomes. Secure access to usable health data, with the resources to derive insights, is the solution to accelerate these efforts and bring the scientific community into Drug Discovery 2.0 - a faster, smarter and more strategic approach to developing therapies. Lifebit is committed to doing our part to combat cancer through supporting and enabling large-scale research - ultimately striving for a healthier future for all.

Author: Hadley E. Sheppard, PhD

Contributors: Hannah Gaimster, PhD and Amanda White

About Lifebit

Lifebit provides health data standardisation services for clients, including Genomics England, Boehringer Ingelheim, Flatiron Health and more, to help researchers transform data into discoveries.

Lifebit’s services are making health data usable quickly.

Featured news and events

2025-03-26 11:17:46

Building the Future of European Trusted Research Environments

2025-03-14 15:45:18

Lifebit Powers Global Precision Medicine Breakthroughs

2025-03-05 12:49:53

Creating Compliant and High-Impact Data Products from Real-world Data

2025-02-27 10:00:00

The Application of Data Lakehouses in Life Sciences

2025-02-19 13:30:24

Optimizing Real-World Evidence for Pharma: From Data to Discovery

2025-02-11 08:39:49

Lung Cancer Genetics Study and Lifebit Partner to Advance Groundbreaking Research Funded by Troper Wojcicki Philanthropies

2025-01-30 12:47:38

10 Key Benefits of a Federated Data Lakehouse in Life Sciences

2025-01-28 08:00:00

Cancer Research Horizons partners with Lifebit to maximise the impact of research data

2025-01-23 09:07:20

Lifebit and Psifas Partner to Advance Genomic Research in Israel

2025-01-08 13:58:41

23andMe Launches Discover23 to Help Accelerate Large-Scale Genetics Research, Powered By Lifebit’s Trusted Technology

Targeting cancer through large-scale data analysis

Introduction to World Cancer Research Day

The challenges surrounding targeting cancer

Drug Discovery 2.0: unravelling cancer’s complexities through big data

Considerations in using sensitive data within cancer research

Concluding remarks

About Lifebit

Life Sciences

Healthcare

Software

Use Cases

Learning & Development

Company

Lifebit Federated technology

Lifebit Mission

Lifebit partners with Latin American innovators to help solve global health challenges through genomics research

ASHG Annual Meeting 2023

Bioinformatician (Remote - Nextflow Developer)

Lifebit partners with Flatiron Health

Get in Touch

Lifebit CloudOS

Lifebit REAL

Become a Pioneer in Precision Medicine

Become a Therapeutic Leader

Data Transformation (OMOP)

Federated data analysis

Trusted research environment

Frontiers in Genetics

Secure data, scalable research

Better together: the promise of health data linkage and its challenges

Lifebit CloudOS Documentation

Lifebit Federated Platform

Lifebit Federated Technology

Lifebit Mission

Lifebit partners with Latin American innovators to help solve global health challenges through genomics research

ASHG Annual Meeting 2023

Bioinformatician (Remote - Nextflow Developer)

Lifebit partners with Flatiron Health

Get in Touch

Lifebit CloudOS

Lifebit REAL

Become a Pioneer in Precision Medicine

Become a Therapeutic Leader

Data Transformation (OMOP)

Federated data analysis

Trusted research environment

Frontiers in Genetics

Secure data, scalable research

Better together: the promise of health data linkage and its challenges

Lifebit CloudOS Documentation

Targeting cancer through large-scale data analysis

Introduction to World Cancer Research Day

The challenges surrounding targeting cancer

Drug Discovery 2.0: unravelling cancer’s complexities through big data

Considerations in using sensitive data within cancer research

Concluding remarks

About Lifebit

Featured news and events