Professor Serena Nik-Zainal, Thorben Seeger, Rosanna Fennessy, Eleanor Hall, Parker Moss, Geoff Coles, Dr Maria Chatzou Dunford, Dr Pablo Prieto Barja, Mark Avery, Keiran Raine

Combining data assets for research has historically involved the movement of data between organisations. Trusted Research Environments (TREs) now offer an alternative to data sharing by providing secure environments for approved researchers to access and analyse data. However, data are still held separately. Even where researchers have permission to use data held in two separate TREs, moving data between organisations from one TRE to another can be costly, complicated and time consuming. The purpose of this project is to demonstrate how TREs can be enabled to ‘talk to each other’ to facilitate analysis across separate databases as if they were one, a process known as ‘federation.’

GenomeChronicler: The Personal Genome Project UK Genomic Report Generator Pipeline

Guerra-Assuncao JA, Conde L, Moghul I, Webster AP, Chervova O, Chatzipantsiou C, Prieto P, Beck S and Herrero J
bioRxiv 2020.01.06.873026v2
In recent years, there has been a significant increase in whole genome sequencing data of individual genomes produced by research projects as well as direct to consumer service providers. While many of these sources provide their users with an interpretation of the data, there is a lack of free, open tools for generating similar reports exploring the data in an easy to understand manner. GenomeChronicler was written as part of the Personal Genome Project UK (PGP-UK) project to address this need. PGP-UK provides genomic, transcriptomic, epigenomic and self-reported phenotypic data under an open-access model with full ethical approval. As a result, the reports generated by GenomeChronicler are intended for research purposes only and include information relating to potentially beneficial and potentially harmful variants, but without clinical curation.
Chervova O, Conde L, Guerra-Assuncao JA, Moghul I, Webster AP, Berner A, Larose Cadieux E, Tian Y, Voloshin V, Jesus TF, Hamoudi R, Herrero J, Beck S
Sci Data. 2019 Oct 31;6(1):257
Integrative analysis of multi-omics data is a powerful approach for gaining functional insights into biological and medical processes. Conducting these multifaceted analyses on human samples is often complicated by the fact that the raw sequencing output is rarely available under open access. The Personal Genome Project UK (PGP-UK) is one of few resources that recruits its participants under open consent and makes the resulting multi-omics data freely and openly available. As part of this resource, we describe the PGP-UK multi-omics reference panel consisting of ten genomic, methylomic and transcriptomic data. Specifically, we outline the data processing, quality control and validation procedures which were implemented to ensure data integrity and exclude sample mix-ups. In addition, we provide a REST API to facilitate the download of the entire PGP-UK dataset. The data are also available from two cloud-based environments, providing platforms for free integrated analysis. In conclusion, the genotype-validated PGP-UK multi-omics human reference panel described here provides a valuable new open access resource for integrated analyses in support of personal and medical genomics.

Novel missense variants in the RNF213 gene from a European family with Moyamoya disease

Andrey N Gagunashvili, Louise Ocaka, Daniel Kelberman, Pinki Munot, Chiara Bacchelli, Philip L Beales, Vijeya Ganesan
Hum Genome Var. 2019 Aug 8;6:35.
In this report, we present a European family with six individuals affected with Moyamoya disease (MMD). We detected two novel missense variants in the Moyamoya susceptibility gene RNF213, c.12553A>G (p.(Lys4185Glu)) and c.12562G>A (p.(Ala4188Thr)). Cosegregation of the variants with MMD, as well as a previous report of a variant affecting the same amino acid residue in unrelated MMD patients, supports the role of RNF213 in the pathogenesis of MMD.
Pasolli E, Asnicar F, Manara S, Zolfo M, Karcher N, Armanini F, Beghini F, Manghi P, Tett A, Segata N et al.
Cell. 2019 Jan 24;176(3):649-662.e20
The body-wide human microbiome plays a role in health, but its full diversity remains uncharacterized, particularly outside of the gut and in international populations. We leveraged 9,428 metagenomes to reconstruct 154,723 microbial genomes (45% of high quality) spanning body sites, ages, countries, and lifestyles. We recapitulated 4,930 species-level genome bins (SGBs), 77% without genomes in public repositories (unknown SGBs [uSGBs]). uSGBs are prevalent (in 93% of well-assembled samples), expand underrepresented phyla, and are enriched in non-Westernized populations (40% of the total SGBs). We annotated 2.85 M genes in SGBs, many associated with conditions including infant development (94,000) or Westernization (106,000). SGBs and uSGBs permit deeper microbiome analyses and increase the average mappability of metagenomic reads from 67.76% to 87.51% in the gut (median 94.26%) and 65.14% to 82.34% in the mouth. We thus identify thousands of microbial genomes from yet-to-be-named species, expand the pangenomes of human-associated microbes, and allow better exploitation of metagenomic technologies.
Jesus T, Ribeiro-Gonçalves B, Silva D, Bortolaia V, Ramirez M, Carriço J
Plasmid ATLAS (pATLAS, http://www.patlas.site) provides an easy-to-use web accessible database with visual analytics tools to explore the relationships of plasmids available in NCBI's RefSeq database. pATLAS has two main goals: (i) to provide an easy way to search for plasmids deposited in NCBI RefSeq and their associated metadata; (ii) to visualize the relationships of plasmids in a graph, allowing the exploration of plasmid evolution. pATLAS allows searching by plasmid name, bacterial host taxa, antibiotic resistance and virulence genes, plasmid families, and by sequence length and similarity. pATLAS is also able to represent in the plasmid network, plasmid sets identified by external pipelines using mapping, mash screen or assembly from high-throughput sequencing data. By representing the identified hits within the network of relationships between plasmids, allowing the possibility of removing redundant results, and by taking advantage of the browsing capabilities of pATLAS, users can more easily interpret the pipelines' results. All these analyses can be saved to a JSON file for sharing and future re-evaluation. Furthermore, by offering a REST-API, the pATLAS database and network display are easily accessible by other interfaces or pipelines.
Lamia Mestek-Boukhibar, Emma Clement, Wendy D Jones, Suzanne Drury, Louise Ocaka, Andrey Gagunashvili, Polona Le Quesne Stabej, Chiara Bacchelli, Nital Jani, Shamima Rahman, Lucy Jenkins, Jane A Hurst, Maria Bitner-Glindzicz, Mark Peters, Philip L Beales, Hywel J Williams
J Med Genet. 2018 Nov;55(11):721-728.
Rare genetic conditions are frequent risk factors for, or direct causes of, paediatric intensive care unit (PICU) admission. Such conditions are frequently suspected but unidentified at PICU admission. Compassionate and effective care is greatly assisted by definitive diagnostic information. There is therefore a need to provide a rapid genetic diagnosis to inform clinical management. To date, whole genome sequencing (WGS) approaches have proved successful in diagnosing a proportion of children with rare diseases, but results may take months to report. Our aim was to develop an end-to-end workflow for the use of rapid WGS for diagnosis in critically ill children in a UK National Health Service (NHS) diagnostic setting.
Verity L Hartill, Glenn van de Hoek, Mitali P Patel, Rosie Little, Christopher M Watson, Ian R Berry, Amelia Shoemark, Dina Abdelmottaleb, Emma Parkes, Chiara Bacchelli, Katarzyna Szymanska, Nine V Knoers, Peter J Scambler, Marius Ueffing, Karsten Boldt, Robert Yates, Paul J Winyard, Beryl Adler, Eduardo Moya, Louise Hattingh, Anil Shenoy, Claire Hogg, Eamonn Sheridan, Ronald Roepman, Dominic Norris, Hannah M Mitchison, Rachel H Giles, Colin A Johnson
Hum Mol Genet. 2018 Feb 1;27(3):529-545.
DNAAF1 (LRRC50) is a cytoplasmic protein required for dynein heavy chain assembly and cilia motility, and DNAAF1 mutations cause primary ciliary dyskinesia (PCD; MIM 613193). We describe four families with DNAAF1 mutations and complex congenital heart disease (CHD). In three families, all affected individuals have typical PCD phenotypes. However, an additional family demonstrates isolated CHD (heterotaxy) in two affected siblings, but no clinical evidence of PCD. We identified a homozygous DNAAF1 missense mutation, p.Leu191Phe, as causative for heterotaxy in this family. Genetic complementation in dnaaf1-null zebrafish embryos demonstrated the rescue of normal heart looping with wild-type human DNAAF1, but not the p.Leu191Phe variant, supporting the conserved pathogenicity of this DNAAF1 missense mutation. This observation points to a phenotypic continuum between CHD and PCD, providing new insights into the pathogenesis of isolated CHD. In further investigations of the function of DNAAF1 in dynein arm assembly, we identified interactions with members of a putative dynein arm assembly complex. These include the ciliary intraflagellar transport protein IFT88 and the AAA+ (ATPases Associated with various cellular Activities) family proteins RUVBL1 (Pontin) and RUVBL2 (Reptin). Co-localization studies support these findings, with the loss of RUVBL1 perturbing the co-localization of DNAAF1 with IFT88. We show that RUVBL1 orthologues have an asymmetric left-sided distribution at both the mouse embryonic node and the Kupffer's vesicle in zebrafish embryos, with the latter asymmetry dependent on DNAAF1. These results suggest that DNAAF1-RUVBL1 biochemical and genetic interactions have a novel functional role in symmetry breaking and cardiac development.
Elizabeth Forsythe, Joanna Kenny, Chiara Bacchelli, Philip Beales
Front Pediatr. 2018 Feb 13;6:23.
Bardet–Biedl syndrome is a rare autosomal recessive multisystem disorder caused by defects in genes encoding for proteins that localize to the primary cilium/basal body complex. Twenty-one disease-causing genes have been identified to date. It is one of the most well-studied conditions in the family of diseases caused by defective cilia collectively known as ciliopathies. In this review, we provide an update on diagnostic developments, clinical features, and progress in the management of Bardet–Biedl syndrome. Advances in diagnostic technologies including exome and whole genome sequencing are expanding the spectrum of patients who are diagnosed with Bardet–Biedl syndrome and increasing the number of cases with diagnostic uncertainty. As a result of the diagnostic developments, a small number of patients with only one or two clinical features of Bardet–Biedl syndrome are being diagnosed. Our understanding of the syndrome-associated renal disease has evolved and is reviewed here. Novel interventions are developing at a rapid pace and are explored in this review including genetic therapeutics such as gene therapy, exon skipping therapy, nonsense suppression therapy, and gene editing. Other non-genetic therapies such as gene repurposing, targeted therapies, and non-pharmacological interventions are also discussed.
Prieto Barja P, Pescher P, Bussotti G, Dumetz F, Imamura H, Kedra D, Domagalska M, Chaumeau V, Himmelbauer H, Pages M, Sterkers Y, Dujardin JC, Notredame C, Späth GF
Nature Ecology & Evolution volume 1, pages 1961–1969 (2017)
The parasite Leishmania donovani causes a fatal disease termed visceral leishmaniasis. The process through which the parasite adapts to environmental change remains largely unknown. Here we show that aneuploidy is integral for parasite adaptation and that karyotypic fluctuations allow for selection of beneficial haplotypes, which impact transcriptomic output and correlate with phenotypic variations in proliferation and infectivity. To avoid loss of diversity following karyotype and haplotype selection, L. donovani utilizes two mechanisms: polyclonal selection of beneficial haplotypes to create coexisting subpopulations that preserve the original diversity, and generation of new diversity as aneuploidy-prone chromosomes tolerate higher mutation rates. Our results reveal high aneuploidy turnover and haplotype selection as a unique evolutionary adaptation mechanism that L. donovani uses to preserve genetic diversity under strong selection. This unexplored process may function in other human diseases, including fungal infection and cancer, and stimulate innovative treatment options.
Joanna Kenny, Elizabeth Forsythe, Philip Beales, Chiara Bacchelli
Per Med. 2017 Sep;14(5):447-456.
Personalized medicine is becoming routine in the treatment of common diseases such as cancer, but has lagged behind in the field of rare diseases. It is currently in the early stages for the treatment of Bardet-Biedl syndrome. Advances in the understanding of ciliary biology and diagnostic techniques have opened up the prospect of treating BBS in a patient-specific manner. Owing to their structure and function, cilia provide an attractive therapeutic target and genetic therapies are being explored in ciliopathy treatment. Promising avenues include gene therapy, gene editing techniques and splice-correcting and read-through therapies. Targeted drug design has been successful in the treatment of genetic disease and research is underway in the discovery of known and novel drugs to treat Bardet-Biedl syndrome.

Exome sequencing for the differential diagnosis of ciliary chondrodysplasias: Example of a WDR35 mutation case and review of the literature

Dinu Antony, Narayanan Nampoory, Chiara Bacchelli, Motasem Melhem, Kaman Wu, Chela T James, Philip L Beales, Mike Hubank, Daisy Thomas, Anant Mashankar, Kazem Behbehani, Miriam Schmidts, Osama Alsmadi
Eur J Med Genet. 2017 Dec;60(12):658-666.
Exome sequencing is becoming widely popular and affordable, making it one of the most desirable methods for the identification of rare genetic variants for clinical diagnosis. Here, we report the clinical application of whole exome sequencing for the ultimate diagnosis of a ciliary chondrodysplasia case presented with an initial clinical diagnosis of Asphyxiating Thoracic Dystrophy (ATD, Jeune Syndrome). We have identified a novel homozygous missense mutation in WDR35 (c.206G > A), a gene previously associated with Sensenbrenner Syndrome, Ellis-van Creveld syndrome and Short-rib polydactyly syndrome type V. The genetic findings in this family led to the re-evaluation of the initial diagnosis and a differential diagnosis of Sensenbrenner Syndrome was made after cautious re-examination of the patient. Cell culture studies revealed normal subcellular localization of the mutant WDR35 protein in comparison to wildtype protein, pointing towards impaired protein-protein interaction and/or altered cell signaling pathways as a consequence of the mutated allele. This research study highlights the importance of including pathogenic variant identification in the diagnosis pipeline of ciliary chondrodysplasias, especially for clinically not fully defined phenotypes.
Šošić M, Šikić M
Bioinformatics, Volume 33, Issue 9, 1 May 2017, Pages 1394–1395
We present Edlib, an open-source C/C++ library for exact pairwise sequence alignment using edit distance. We compare Edlib to other libraries and show that it is the fastest while not lacking in functionality and can also easily handle very large sequences. Being easy to use, flexible, fast and low on memory usage, we expect it to be easily adopted as a building block for future bioinformatics tools.
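As a rough illustration of what "edit distance" means in the abstract above, the following is a minimal Python sketch of the textbook dynamic-programming calculation. It is not Edlib's implementation (Edlib is a C/C++ library and its optimized algorithm is not reproduced here); the function name `edit_distance` is hypothetical and chosen only for this example.

```python
def edit_distance(query: str, target: str) -> int:
    """Return the Levenshtein (edit) distance between two sequences.

    Illustrative textbook dynamic programming, NOT Edlib's algorithm.
    Only two rows of the DP table are kept, so memory is O(len(target)).
    """
    # Distance from the empty query prefix to each target prefix.
    previous = list(range(len(target) + 1))
    for i, q in enumerate(query, start=1):
        # Distance from query[:i] to the empty target prefix is i deletions.
        current = [i]
        for j, t in enumerate(target, start=1):
            substitution_cost = 0 if q == t else 1
            current.append(
                min(
                    previous[j] + 1,                      # delete q
                    current[j - 1] + 1,                   # insert t
                    previous[j - 1] + substitution_cost,  # match or mismatch
                )
            )
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

This version runs in time proportional to the product of the two sequence lengths, which is the cost that specialised libraries aim to reduce for very large sequences.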
Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C.
Nat Biotechnol. 2017 Apr 11;35(4):316-319
The increasing complexity of readouts for omics analyses goes hand-in-hand with concerns about the reproducibility of experiments that analyze ‘big data’. When analyzing very large data sets, the main source of computational irreproducibility arises from a lack of good practice pertaining to software and database usage. Small variations across computational platforms also contribute ...
Motasem Melhem, Mohamed Abu-Farha, Dinu Antony, Ashraf Al Madhoun, Chiara Bacchelli, Fadi Alkayal, Irina AlKhairi, Sumi John, Mohamad Alomari, Phillip L Beales, Osama Alsmadi
Eur J Haematol. 2017 Mar;98(3):218-227.
To characterize the underlying genetic and molecular defects in a consanguineous family with lifelong blood disorder manifested with thrombocytopenia (low platelets count) and anemia. Genetic linkage analysis, exome sequencing, and functional genomics were carried out to identify and characterize the defective gene. We identified a novel truncation mutation (p.C108*) in chromosome 6 open reading frame 25 (C6orf25) gene in this family. We also showed the p.C108* mutation was responsible for destabilizing the encoded truncated G6B protein. Unlike the truncated form, wild-type G6B expression resulted in enhanced K562 differentiation into megakaryocytes and erythrocytes. C6orf25, also known as G6B, is an effector protein for the key hematopoiesis regulators, Src homology region 2 domain-containing phosphatases SHP-1 and SHP-2.

COLEC10 is mutated in 3MC patients and regulates early craniofacial development

Mustafa M Munye, Anna Diaz-Font, Louise Ocaka, Maiken L Henriksen, Melissa Lees, Angela Brady, Dagan Jenkins, Jenny Morton, Soren W Hansen, Chiara Bacchelli, Philip L Beales, Victor Hernandez-Hernandez
PLoS Genet. 2017 Mar 16;13(3):e1006679.
3MC syndrome is an autosomal recessive heterogeneous disorder with features linked to developmental abnormalities. The main features include facial dysmorphism, craniosynostosis and cleft lip/palate; skeletal structures derived from cranial neural crest cells (cNCC). We previously reported that lectin complement pathway genes COLEC11 and MASP1/3 are mutated in 3MC syndrome patients. Here we define a new gene, COLEC10, also mutated in 3MC families and present novel mutations in COLEC11 and MASP1/3 genes in a further five families. The protein products of COLEC11 and COLEC10, CL-K1 and CL-L1 respectively, form heteromeric complexes. We show COLEC10 is expressed in the base membrane of the palate during murine embryo development. We demonstrate how mutations in COLEC10 (c.25C>T; p.Arg9Ter, c.226delA; p.Gly77Glufs*66 and c.528C>G p.Cys176Trp) impair the expression and/or secretion of CL-L1 highlighting their pathogenicity. Together, these findings provide further evidence linking the lectin complement pathway and complement factors COLEC11 and COLEC10 to morphogenesis of craniofacial structures and 3MC etiology.

Mutations in EXTL3 Cause Neuro-immuno-skeletal Dysplasia Syndrome

Machteld M Oud, Paul Tuijnenburg, Maja Hempel, Naomi van Vlies, Zemin Ren, Sacha Ferdinandusse, Machiel H Jansen, René Santer, Jessika Johannsen, Chiara Bacchelli, Marielle Alders, Rui Li, Rosalind Davies, Lucie Dupuis, Catherine M Cale, Ronald J A Wanders, Steven T Pals, Louise Ocaka, Chela James, Ingo Müller, Kai Lehmberg, Tim Strom, Hartmut Engels, Hywel J Williams, Phil Beales, Ronald Roepman, Patricia Dias, Han G Brunner, Jan-Maarten Cobben, Christine Hall, Taila Hartley, Polona Le Quesne Stabej, Roberto Mendoza-Londono, E Graham Davies, Sérgio B de Sousa, Davor Lessel, Heleen H Arts, Taco W Kuijpers
Am J Hum Genet. 2017 Feb 2;100(2):281-296.
EXTL3 regulates the biosynthesis of heparan sulfate (HS), important for both skeletal development and hematopoiesis, through the formation of HS proteoglycans (HSPGs). By whole-exome sequencing, we identified homozygous missense mutations c.1382C>T, c.1537C>T, c.1970A>G, and c.2008T>G in EXTL3 in nine affected individuals from five unrelated families. Notably, we found the identical homozygous missense mutation c.1382C>T (p.Pro461Leu) in four affected individuals from two unrelated families. Affected individuals presented with variable skeletal abnormalities and neurodevelopmental defects. Severe combined immunodeficiency (SCID) with a complete absence of T cells was observed in three families. EXTL3 was most abundant in hematopoietic stem cells and early progenitor T cells, which is in line with a SCID phenotype at the level of early T cell development in the thymus. To provide further support for the hypothesis that mutations in EXTL3 cause a neuro-immuno-skeletal dysplasia syndrome, and to gain insight into the pathogenesis of the disorder, we analyzed the localization of EXTL3 in fibroblasts derived from affected individuals and determined glycosaminoglycan concentrations in these cells as well as in urine and blood. We observed abnormal glycosaminoglycan concentrations and increased concentrations of the non-sulfated chondroitin disaccharide D0a0 and the disaccharide D0a4 in serum and urine of all analyzed affected individuals. In summary, we show that biallelic mutations in EXTL3 disturb glycosaminoglycan synthesis and thus lead to a recognizable syndrome characterized by variable expression of skeletal, neurological, and immunological abnormalities.

Mutations in linker for activation of T cells (LAT) lead to a novel form of severe combined immunodeficiency

Chiara Bacchelli, Federico A Moretti, Marlene Carmo, Stuart Adams, Horia C Stanescu, Kerra Pearce, Manisha Madkaikar, Kimberly C Gilmour, Adeline K Nicholas, C Geoffrey Woods, Robert Kleta, Phil L Beales, Waseem Qasim, H Bobby Gaspar
J Allergy Clin Immunol. 2017 Feb;139(2):634-642.e5.
Signaling through the T-cell receptor (TCR) is critical for T-cell development and function. Linker for activation of T cells (LAT) is a transmembrane adaptor signaling molecule that is part of the TCR complex and essential for T-cell development, as demonstrated by LAT-deficient mice, which show a complete lack of peripheral T cells. We describe a pedigree affected by a severe combined immunodeficiency phenotype with absent T cells and normal B-cell and natural killer cell numbers. A novel homozygous frameshift mutation in the gene encoding for LAT was identified in this kindred.

Jochen Kammermeier, Robert Dziubak, Matilde Pescarin, Suzanne Drury, Heather Godwin, Kate Reeve, Sibongile Chadokufa, Bonita Huggett, Sara Sider, Chela James, Nikki Acton, Elena Cernat, Marco Gasparetto, Gabi Noble-Jamieson, Fevronia Kiparissi, Mamoun Elawad, Phil L Beales, Neil J Sebire, Kimberly Gilmour, Holm H Uhlig, Chiara Bacchelli, Neil Shah
J Crohns Colitis. 2017 Jan;11(1):60-69
Inflammatory bowel disease [IBD] presenting in early childhood is extremely rare. More recently, progress has been made to identify children with monogenic forms of IBD predominantly presenting very early in life. In this study, we describe the heterogeneous phenotypes and genotypes of patients with IBD presenting before the age of 2 years and establish phenotypic features associated with underlying monogenicity.

Multiple sequence alignment modeling: methods and applications.

Chatzou M, Magis C, Chang JM, Kemena C, Bussotti G, Erb I, Notredame C.
Brief Bioinform. 2016 Nov;17(6):1009-1023. Epub 2015 Nov 27. Review.
This review provides an overview on the development of Multiple sequence alignment (MSA) methods and their main applications. It is focused on progress made over the past decade. The three first sections review recent algorithmic developments for protein, RNA/DNA and genomic alignments. The fourth section deals with benchmarks and explores the relationship between empirical and simulated data, along with the impact on method developments. The last part of the review gives an overview on available MSA local reliability estimators and their dependence on various algorithmic properties of available methods.
Chiara Bacchelli, Hywel J Williams
Expert Rev Mol Diagn. 2016 Oct;16(10):1073-1082
Rare pediatric diseases are clinically severe with high rates of mortality and morbidity. This paper outlines how next-generation sequencing (NGS) can be used to greatly advance identification of the underlying genetic causes. Areas covered: This manuscript is a blend of evidence obtained from literature searches from PubMed and rare disease related websites, laboratory experience and the author's opinions. The paper covers the current state of the field and identifies where the challenges lie and how they are being overcome, using up-to-date references. Expert commentary: The field of NGS is still relatively new but it has already transformed the field of rare disease research. Technological advances in instrumentation, computational hardware and software have resulted in the identification of many causative genes, but as sequencing moves into population-scale initiatives standardisation and data sharing is going to be of paramount importance to ensure we derive the maximum benefit for patients.
Ribeiro-Gonçalves B, Francisco AP, Vaz C, Ramirez M, Carriço J
High-throughput sequencing methods generated allele and single nucleotide polymorphism information for thousands of bacterial strains that are publicly available in online repositories and created the possibility of generating similar information for hundreds to thousands of strains more in a single study. Minimum spanning tree analysis of allelic data offers a scalable and reproducible methodological alternative to traditional phylogenetic inference approaches, useful in epidemiological investigations and population studies of bacterial pathogens. PHYLOViZ Online was developed to allow users to do these analyses without software installation and to enable easy accessing and sharing of data and analyses results from any Internet enabled computer. PHYLOViZ Online also offers a RESTful API for programmatic access to data and algorithms, allowing it to be seamlessly integrated into any third party web service or software. PHYLOViZ Online is freely available at https://online.phyloviz.net.
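To make the idea of a minimum spanning tree over allelic data concrete, here is a small Python sketch using Prim's algorithm with Hamming distance between allelic profiles. It is an illustration under stated assumptions, not the algorithm or API that PHYLOViZ Online provides; the function names and the toy profiles are hypothetical.

```python
def hamming(profile_a, profile_b):
    """Count the loci at which two allelic profiles differ."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


def minimum_spanning_tree(profiles):
    """Build an MST over strains with Prim's algorithm (illustrative only).

    `profiles` maps a strain name to a tuple of allele numbers, one per locus.
    Returns a list of (strain_in_tree, new_strain, distance) edges.
    """
    names = sorted(profiles)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge joining the tree to a strain outside it.
        # Ties are broken by strain name because whole tuples are compared.
        distance, u, v = min(
            (hamming(profiles[u], profiles[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, distance))
        in_tree.add(v)
    return edges


if __name__ == "__main__":
    # Hypothetical four-locus profiles for four strains.
    toy_profiles = {
        "A": (1, 1, 1, 1),
        "B": (1, 1, 1, 2),
        "C": (1, 1, 2, 2),
        "D": (3, 3, 3, 3),
    }
    for edge in minimum_spanning_tree(toy_profiles):
        print(edge)
```

The quadratic scan over candidate edges keeps the sketch short; a tool handling thousands of strains would need a more efficient edge selection strategy.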
Vlasova A, Capella-Gutiérrez S, Rendón-Anaya M, Hernández-Oñate M, Minoche AE, Erb I, Câmara F, Prieto-Barja P, Corvelo A, Sanseverino W, Westergaard G, Dohm JC, Pappas GJ Jr, Saburido-Alvarez S, Kedra D, Gonzalez I, Cozzuto L, Gómez-Garrido J, Aguilar-Morón MA, Andreu N, Aguilar OM, Garcia-Mas J, Zehnsdorf M, Vázquez MP, Delgado-Salinas A, Delaye L, Lowy E, Mentaberry A, Vianello-Brondani RP, García JL, Alioto T, Sánchez F, Himmelbauer H, Santalla M, Notredame C, Gabaldón T, Herrera-Estrella A, Guigó R
Genome Biol. 2016 Feb 25;17:32
Legumes are the third largest family of angiosperms and the second most important crop class. Legume genomes have been shaped by extensive large-scale gene duplications, including an approximately 58 million year old whole genome duplication shared by most crop legumes. The genome and transcriptome data herein generated for a Mesoamerican genotype represent a counterpart to the genomic resources already available for the Andean gene pool. Altogether, this information will allow the genetic dissection of the characters involved in the domestication and adaptation of the crop, and their further implementation in breeding strategies for this important crop.
Hywel J Williams, John R Hurst, Louise Ocaka, Chela James, Caroline Pao, Estelle Chanudet, Francesco Lescai, Horia C Stanescu, Robert Kleta, GOSgene; Elisabeth Rosser, Chiara Bacchelli, Philip Beales
Eur J Hum Genet. 2016 Feb;24(2):298-301
The success of whole-exome sequencing to identify mutations causing single-gene disorders has been well documented. In contrast whole-exome sequencing has so far had limited success in the identification of variants causing more complex phenotypes that seem unlikely to be due to the disruption of a single gene. We describe a family where two male offspring of healthy first cousin parents present a complex phenotype consisting of peripheral neuropathy and bronchiectasis that has not been described previously in the literature. Due to the fact that both children had the same problems in the context of parental consanguinity we hypothesised illness resulted from either X-linked or autosomal recessive inheritance. Through the use of whole-exome sequencing we were able to simplify this complex phenotype and identified a causative mutation (p.R1070*) in the gene periaxin (PRX), a gene previously shown to cause peripheral neuropathy (Dejerine-Sottas syndrome) when this mutation is present. For the bronchiectasis phenotype we were unable to identify a causal single mutation or compound heterozygote, reflecting the heterogeneous nature of this phenotype. In conclusion, in this study we show that whole-exome sequencing has the power to disentangle complex phenotypes through the identification of causative genetic mutations for distinct clinical disorders that were previously masked.
Emma S Reid, Hywel Williams, Polona Le Quesne Stabej, Chela James, Louise Ocaka, Chiara Bacchelli, Emma J Footitt, Stewart Boyd, Maureen A Cleary, Philippa B Mills, Peter T Clayton
JIMD Rep. 2016;27:79-84
There is increasing evidence that vitamin B6, given either as pyridoxine or pyridoxal 5'-phosphate, can sometimes result in improved seizure control in idiopathic epilepsy. Whole-exome sequencing was used to identify a de novo mutation (c.629G>A; p.Arg210His) in KCNQ2 in a 7-year-old patient whose neonatal seizures showed a response to pyridoxine and who had a high plasma to CSF pyridoxal 5'-phosphate ratio, usually indicative of an inborn error of vitamin B6 metabolism. This mutation has been described in three other patients with neonatal epileptic encephalopathy. A review of the literature was performed to assess the effectiveness of vitamin B6 treatment in patients with a KCNQ2 channelopathy. Twenty-three patients have been reported to have been trialled with B6; in three of which B6 treatment was used alone or in combination with other antiepileptic drugs to control seizures. The anticonvulsant effect of B6 vitamers may be propagated by multiple mechanisms including direct antagonist action on ion channels, antioxidant action on excess reactive oxygen species generated by increased neuronal firing and replenishing the pool of pyridoxal 5'-phosphate needed for the synthesis of some inhibitory neurotransmitters. Vitamin B6 may be a promising adjunctive treatment for patients with channelopathies and the wider epileptic population. This report also demonstrates that an abnormal plasma to CSF pyridoxal 5'-phosphate ratio may not be exclusive to inborn errors of vitamin B6 metabolism.
Polona Le Quesne Stabej, Hywel J Williams, Chela James, Mehmet Tekman, Horia C Stanescu, Robert Kleta, Louise Ocaka, Francesco Lescai, Helen L Storr, Maria Bitner-Glindzicz, Chiara Bacchelli, Gerard S Conway, GOSgene
Eur J Hum Genet. 2016 Jan;24(1):135-8
Primary ovarian insufficiency (POI) is a distressing cause of infertility in young women. POI is heterogeneous with only a few causative genes having been discovered so far. Our objective was to determine the genetic cause of POI in a consanguineous Lebanese family with two affected sisters presenting with primary amenorrhoea and an absence of any pubertal development. Multipoint parametric linkage analysis was performed. Whole-exome sequencing was done on the proband. Linkage analysis identified a locus on chromosome 7 where exome sequencing successfully identified a homozygous two base pair duplication (c.1947_48dupCT), leading to a truncated protein p.(Y650Sfs*22) in the STAG3 gene, confirming it as the cause of POI in this family. Exome sequencing combined with linkage analyses offers a powerful tool to efficiently find novel genetic causes of rare, heterogeneous disorders, even in small single families. This is only the second report of a STAG3 variant; the first STAG3 variant was recently described in a phenotypically similar family with extreme POI. Identification of an additional family highlights the importance of STAG3 in POI pathogenesis and suggests it should be evaluated in families affected with POI.
Di Tommaso P, Palumbo E, Chatzou M, Prieto P, Heuer ML, Notredame C
September 24, 2015, PeerJ 3:e1273
Genomic pipelines consist of several pieces of third party software and, because of their experimental nature, frequent changes and updates are commonly necessary thus raising serious deployment and reproducibility issues. Docker containers are emerging as a possible solution for many of these problems, as they allow the packaging of pipelines in an isolated and self-contained manner. This makes it easy to distribute and execute pipelines in a portable manner across a wide range of computing platforms. Thus, the question that arises is to what extent the use of Docker containers might affect the performance of these pipelines. Here we address this question and conclude that Docker containers have only a minor impact on the performance of common genomic pipelines, which is negligible when the executed jobs are long in terms of computational time.
Silva M, Machado M, Silva D, Rossi M, Moran-Gilad J, Santos S, Ramirez M, Carriço JA
Gene-by-gene approaches are becoming increasingly popular in bacterial genomic epidemiology and outbreak detection. However, there is a lack of open-source scalable software for schema definition and allele calling for these methodologies. The chewBBACA suite was designed to assist users in the creation and evaluation of novel whole-genome or core-genome gene-by-gene typing schemas and subsequent allele calling in bacterial strains of interest. chewBBACA performs the schema creation and allele calls on complete or draft genomes resulting from de novo assemblers. The chewBBACA software uses Python 3.4 or higher and can run on a laptop or in high performance clusters making it useful for both small laboratories and large reference centers. ChewBBACA is available at https://github.com/B-UMMI/chewBBACA.
Winnie Ip, H Bobby Gaspar, Robert Kleta, Estelle Chanudet, Chiara Bacchelli, Alison Pitts, Zohreh Nademi, E Graham Davies, Mary A Slatter, Persis Amrolia, Kanchan Rao, Paul Veys, Andrew R Gennery, Waseem Qasim
J Clin Immunol. 2015 Feb;35(2):147-57
Mutations in RMRP primarily give rise to Cartilage Hair Hypoplasia (CHH), a highly diverse skeletal disorder which can be associated with severe immunodeficiency. Increased availability of RMRP mutation screening has uncovered a number of infants with significant immunodeficiency but only mild or absent skeletal features. We surveyed the clinical and immunological phenotype of children who have undergone allogeneic haematopoietic stem cell transplantation for this condition in the UK.
Schmid M, Smith J, Burt DW, Aken BL, Antin PB, Archibald AL, Ashwell C, Blackshear PJ, Boschiero C, Brown CT, Burgess SC, Cheng HH, Chow W, Coble DJ, Cooksey A, Crooijmans RPMA, Damas J, Davis RVN, de Koning D-J, Delany ME, Derrien T, Desta TT, Dunn IC, Dunn M, Ellegren H, Eöry L, Erb I, Farré M, Fasold M, Fleming D, Flicek P, Fowler KE, Frésard L, Froman DP, Garceau V, Gardner PP, Gheyas AA, Griffin DK, Groenen MAM, Haaf T, Hanotte O, Hart A, Häsler J, Hedges SB, Hertel J, Howe K, Hubbard A, Hume DA, Kaiser P, Kedra D, Kemp SJ, Klopp C, Kniel KE, Kuo R, Lagarrigue S, Lamont SJ, Larkin DM, Lawal RA, Markland SM, McCarthy F, McCormack HA, McPherson MC, Motegi A, Muljo SA, Münsterberg A, Nag R, Nanda I, Neuberger M, Nitsche A, Notredame C, Noyes H, O’Connor R, O’Hare EA, Oler AJ, Ommeh SC, Pais H, Persia M, Pitel F, Preeyanon L, Prieto Barja P, Pritchett EM, Rhoads DD, Robinson CM, Romanov MN, Rothschild M, Roux P-F, Schmidt CJ, Schneider A-S, Schwartz MG, Searle SM, Skinner MA, Smith CA, Stadler PF, Steeves TE, Steinlein C, Sun L, Takata M, Ulitsky I, Wang Q, Wang Y, Warren WC, Wood JMD, Wragg D, Zhou H
Cytogenet Genome Res 2015;145:78-179
Publication of the chicken genome sequence in 2004 (International Chicken Genome Sequencing Consortium 2004) highlighted the beginning of a revolution in avian genomics. Progression of DNA sequencing technologies and data handling capabilities has also meant that genome sequencing and assembly is now a relatively simple, fast and inexpensive procedure. The success seen with the chicken genome was soon followed by the completion of the zebra finch genome ...
Jochen Kammermeier, Suzanne Drury, Chela T James, Robert Dziubak, Louise Ocaka, Mamoun Elawad, Philip Beales, Nicholas Lench, Holm H Uhlig, Chiara Bacchelli, Neil Shah
J Med Genet. 2014 Nov;51(11):748-55
Multiple monogenetic conditions with partially overlapping phenotypes can present with inflammatory bowel disease (IBD)-like intestinal inflammation. With novel genotype-specific therapies emerging, establishing a molecular diagnosis is becoming increasingly important.
Cheng Y, Ma Z, Kim BH, Wu W, Cayting P, Boyle AP, Sundaram V, Xing X, Dogan N, Li J, Euskirchen G, Lin S, Lin Y, Visel A, Kawli T, Yang X, Patacsil D, Keller CA, Giardine B; mouse ENCODE Consortium, Kundaje A, Wang T, Pennacchio LA, Weng Z, Hardison RC, Snyder MP
Nature volume 515, pages 371–375, Nov 2014
To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human-mouse erythroid progenitor, lymphoblast and embryonic stem-cell lines. By combining the genome-wide transcription factor occupancy repertoires, associated epigenetic signals, and co-association patterns, here we deduce several evolutionary principles of gene regulatory features operating since the mouse and human lineages diverged. The genomic distribution profiles, primary binding motifs, chromatin states, and DNA methylation preferences are well conserved for TF-occupied sequences. However, the extent to which orthologous DNA segments are bound by orthologous TFs varies both among TFs and with genomic location: binding at promoters is more highly conserved than binding at distal elements. Notably, occupancy-conserved TF-occupied sequences tend to be pleiotropic; they function in several tissues and also co-associate with many TFs. Single nucleotide variants at sites with potential regulatory functions are enriched in occupancy-conserved TF-occupied sequences.
Rajvinder Karda, Suzanne M K Buckley, Citra N Mattar, Joanne Ng, Giulia Massaro, Michael P Hughes, Manju A Kurian, Julien Baruteau, Paul Gissen, Jerry K Y Chan, Chiara Bacchelli, Simon N Waddington, Ahad A Rahim
Front Mol Neurosci. 2014 Nov 14;7:89
Neurodegenerative monogenic diseases often affect tissues and organs beyond the nervous system. An effective treatment would require a systemic approach. The intravenous administration of novel therapies is ideal but is hampered by the inability of such drugs to cross the blood-brain barrier (BBB), which precludes efficacy in the central nervous system. A number of these early lethal intractable diseases also present devastating irreversible pathology at birth or soon after. Therefore, any therapy would ideally be administered during the perinatal period to prevent, stop, or ameliorate disease progression. The concept of perinatal gene therapy has moved a step further toward being a feasible approach to treating such disorders. This has primarily been driven by the recent discoveries that particular serotypes of adeno-associated virus (AAV) gene delivery vectors have the ability to cross the BBB following intravenous administration. Furthermore, safety has been demonstrated after perinatal administration in mice and non-human primates. This review focuses on the progress made in using AAV to achieve systemic transduction and what this means for developing perinatal gene therapy for early lethal neurodegenerative diseases.
Anna C Thomas, Hywel Williams, Núria Setó-Salvia, Chiara Bacchelli, Dagan Jenkins, Mary O’Sullivan, Konstantinos Mengrelis, Miho Ishida, Louise Ocaka, Estelle Chanudet, Chela James, Francesco Lescai, Glenn Anderson, Deborah Morrogh, Mina Ryten, Andrew J Duncan, Yun Jin Pai, Jorge M Saraiva, Fabiana Ramos, Bernadette Farren, Dawn Saunders, Bertrand Vernay, Paul Gissen, Anna Straatmaan-Iwanowska, Frank Baas, Nicholas W Wood, Joshua Hersheson, Henry Houlden, Jane Hurst, Richard Scott, Maria Bitner-Glindzicz, Gudrun E Moore, Sérgio B Sousa, Philip Stanier
Am J Hum Genet. 2014 Nov 6;95(5):611-21
Intellectual disability and cerebellar atrophy occur together in a large number of genetic conditions and are frequently associated with microcephaly and/or epilepsy. Here we report the identification of causal mutations in Sorting Nexin 14 (SNX14) found in seven affected individuals from three unrelated consanguineous families who presented with recessively inherited moderate-severe intellectual disability, cerebellar ataxia, early-onset cerebellar atrophy, sensorineural hearing loss, and the distinctive association of progressively coarsening facial features, relative macrocephaly, and the absence of seizures. We used homozygosity mapping and whole-exome sequencing to identify a homozygous nonsense mutation and an in-frame multiexon deletion in two families. A homozygous splice site mutation was identified by Sanger sequencing of SNX14 in a third family, selected purely by phenotypic similarity. This discovery confirms that these characteristic features represent a distinct and recognizable syndrome. SNX14 encodes a cellular protein containing Phox (PX) and regulator of G protein signaling (RGS) domains. Weighted gene coexpression network analysis predicts that SNX14 is highly coexpressed with genes involved in cellular protein metabolism and vesicle-mediated transport. All three mutations either directly affected the PX domain or diminished SNX14 levels, implicating a loss of normal cellular function. This manifested as increased cytoplasmic vacuolation as observed in cultured fibroblasts. Our findings indicate an essential role for SNX14 in neural development and function, particularly in development and maturation of the cerebellum.
Daniel Kelberman, Lily Islam, Jörn Lakowski, Chiara Bacchelli, Estelle Chanudet, Francesco Lescai, Aara Patel, Elia Stupka, Anja Buck, Stephan Wolf, Philip L Beales, Thomas S Jacques, Maria Bitner-Glindzicz, Alki Liasis, Ordan J Lehmann, Jürgen Kohlhase, Ken K Nischal, Jane C Sowden
Hum Mol Genet. 2014 May 15;23(10):2511-26
Ocular coloboma is a congenital defect resulting from failure of normal closure of the optic fissure during embryonic eye development. This birth defect causes childhood blindness worldwide, yet the genetic etiology is poorly understood. Here, we identified a novel homozygous mutation in the SALL2 gene in members of a consanguineous family affected with non-syndromic ocular coloboma variably affecting the iris and retina. This mutation, c.85G>T, introduces a premature termination codon (p.Glu29*) predicted to truncate the SALL2 protein so that it lacks three clusters of zinc-finger motifs that are essential for DNA-binding activity. This discovery identifies SALL2 as the third member of the Drosophila homeotic Spalt-like family of developmental transcription factor genes implicated in human disease. SALL2 is expressed in the developing human retina at the time of, and subsequent to, optic fissure closure. Analysis of Sall2-deficient mouse embryos revealed delayed apposition of the optic fissure margins and the persistence of an anterior retinal coloboma phenotype after birth. Sall2-deficient embryos displayed correct posterior closure toward the optic nerve head, and upon contact of the fissure margins, dissolution of the basal lamina occurred and PAX2, known to be critical for this process, was expressed normally. Anterior closure was disrupted with the fissure margins failing to meet, or in some cases misaligning leading to a retinal lesion. These observations demonstrate, for the first time, a role for SALL2 in eye morphogenesis and that loss of function of the gene causes ocular coloboma in humans and mice.
Francesco Lescai, Elena Marasco, Chiara Bacchelli, Philip Stanier, Vilma Mantovani, Philip Beales
Mol Genet Genomic Med. 2014 Jan;2(1):58-63
The choice of an appropriate variant calling pipeline for exome sequencing data is becoming increasingly important in translational medicine projects and clinical contexts. Within GOSgene, which facilitates genetic analysis as part of a joint effort of the University College London and the Great Ormond Street Hospital, we aimed to optimize a variant calling pipeline suitable for our clinical context. We implemented the GATK/Queue framework and evaluated the performance of its two callers: the classical UnifiedGenotyper and the new variant discovery tool HaplotypeCaller. We performed an experimental validation of the loss-of-function (LoF) variants called by the two methods using Sequenom technology. UnifiedGenotyper showed a total validation rate of 97.6% for LoF single-nucleotide polymorphisms (SNPs) and 92.0% for insertions or deletions (INDELs), whereas HaplotypeCaller was 91.7% for SNPs and 55.9% for INDELs. We confirm that GATK/Queue is a reliable pipeline in translational medicine and clinical contexts. We conclude that in our working environment, UnifiedGenotyper is the caller of choice, being an accurate method, with a high validation rate of error-prone calls like LoF variants. We finally highlight the importance of experimental validation, especially for INDELs, as part of a standard pipeline in clinical environments.
Emma A Webb, Angham AlMutair, Daniel Kelberman, Chiara Bacchelli, Estelle Chanudet, Francesco Lescai, Cynthia L Andoniadou, Abdul Banyan, Al Alsawaid, Muhammad T Alrifai, Mohammed A Alahmesh, M Balwi, Seyedeh N Mousavy-Gharavy, Biljana Lukovic, Derek Burke, Mark J McCabe, Tessa Kasia, Robert Kleta, Elia Stupka, Philip L Beales, Dorothy A Thompson, W Kling Chong, Fowzan S Alkuraya, Juan-Pedro Martinez-Barbera, Jane C Sowden, Mehul T Dattani
Brain. 2013 Oct;136(Pt 10):3096-105
We describe a previously unreported syndrome characterized by secondary (post-natal) microcephaly with fronto-temporal lobe hypoplasia, multiple pituitary hormone deficiency, seizures, severe visual impairment and abnormalities of the kidneys and urinary tract in a highly consanguineous family with six affected children. Homozygosity mapping and exome sequencing revealed a novel homozygous frameshift mutation in the basic helix-loop-helix transcription factor gene ARNT2 (c.1373_1374dupTC) in affected individuals. This mutation results in absence of detectable levels of ARNT2 transcript and protein from patient fibroblasts compared with controls, consistent with nonsense-mediated decay of the mutant transcript and loss of ARNT2 function. We also show expression of ARNT2 within the central nervous system, including the hypothalamus, as well as the renal tract during human embryonic development. The progressive neurological abnormalities, congenital hypopituitarism and post-retinal visual pathway dysfunction in affected individuals demonstrates for the first time the essential role of ARNT2 in the development of the hypothalamo-pituitary axis, post-natal brain growth, and visual and renal function in humans.

GPU implementation of epidemiological behaviour in large social networks

Menon SKM, Baruah PK, Sosic M
IEEE International Conference on High Performance Computing (HiPC), 2012
Preventing and controlling epidemics in human or computer networks is a major challenge. This paper presents an implementation of an algorithm that simulates epidemic spreading in networks, using CUDA (Compute Unified Device Architecture) technology. The spread of an epidemic over the network is modeled by a discrete SIR (Susceptible - Infected - Recovered) model. The implementation enables selection of the starting node and monitoring of the epidemic spread in each cycle. In comparison to a common CPU implementation, the CUDA implementation achieves about 10× faster execution time, which is highly significant for running tests on larger networks. The implementation was tested on real social networks consisting of more than 5 million nodes. Hence, we believe this implementation can have practical value in the analysis of epidemic spreading over large networks.

CUDA implementation of the algorithm for simulating the epidemic spreading over large networks

Šošić M, Šikić M
Conference: MIPRO 2012 – 35th International Convention on Information and Communication Technology
Preventing and controlling epidemics in human or computer networks is a major challenge. This paper presents an implementation of an algorithm that simulates epidemic spreading in networks, using CUDA (Compute Unified Device Architecture) technology. The spread of an epidemic over the network is modeled by a discrete SIR (Susceptible - Infected - Recovered) model. The implementation enables selection of the starting node and monitoring of the epidemic spread in each cycle. In comparison to a common CPU implementation, the CUDA implementation achieves about 10× faster execution time, which is highly significant for running tests on larger networks. The implementation was tested on real social networks consisting of more than 5 million nodes. Hence, we believe this implementation can have practical value in the analysis of epidemic spreading over large networks.
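As an illustration of the kind of discrete SIR simulation the abstract describes, the following is a minimal CPU-side sketch in Python. It is not the authors' CUDA code: the function names, the per-contact infection probability `p_infect`, the per-cycle recovery probability `p_recover` and the synchronous update rule are all assumptions made for this example. It shows the two features the abstract mentions, a selectable starting node and per-cycle monitoring of the spread.

```python
import random


def simulate_sir(adjacency, start_node, p_infect, p_recover, max_cycles, seed=0):
    """Run a discrete-time SIR epidemic on a graph.

    adjacency  : dict mapping each node to a list of neighbouring nodes
    start_node : the single initially infected node
    p_infect   : assumed probability that an infected node infects a
                 susceptible neighbour in one cycle
    p_recover  : assumed probability that an infected node recovers in one cycle
    Returns a list of (susceptible, infected, recovered) counts, one per cycle,
    starting with the initial state.
    """
    rng = random.Random(seed)
    state = {node: "S" for node in adjacency}
    state[start_node] = "I"
    history = [_counts(state)]

    for _ in range(max_cycles):
        infected = [node for node, s in state.items() if s == "I"]
        if not infected:
            break  # the epidemic has died out
        # Synchronous update: decisions are based on the state at the
        # start of the cycle and written to a copy.
        new_state = dict(state)
        for node in infected:
            for neighbour in adjacency[node]:
                if state[neighbour] == "S" and rng.random() < p_infect:
                    new_state[neighbour] = "I"
            if rng.random() < p_recover:
                new_state[node] = "R"
        state = new_state
        history.append(_counts(state))
    return history


def _counts(state):
    """Count nodes in each compartment as an (S, I, R) tuple."""
    values = list(state.values())
    return (values.count("S"), values.count("I"), values.count("R"))


if __name__ == "__main__":
    # Hypothetical four-node path graph: 0 - 1 - 2 - 3
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    for cycle, (s, i, r) in enumerate(simulate_sir(path, 0, 0.5, 0.3, 20)):
        print(cycle, s, i, r)
```

In a sketch like this the work inside each cycle is independent per infected node, which is the part a GPU implementation would typically parallelise; how the papers above actually distribute that work is not stated in the abstract.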
Francesco Lescai, Silvia Bonfiglio, Chiara Bacchelli, Estelle Chanudet, Aoife Waters, Sanjay M Sisodiya, Dalia Kasperavičiūtė, Julie Williams, Denise Harold, John Hardy, Robert Kleta, Sebahattin Cirak, Richard Williams, John C Achermann, John Anderson, David Kelsell, Tom Vulliamy, Henry Houlden, Nicholas Wood, Una Sheerin, Gian Paolo Tonini, Donna Mackay, Khalid Hussain, Jane Sowden, Veronica Kinsler, Justyna Osinska, Tony Brooks, Mike Hubank, Philip Beales, Elia Stupka
PLoS One . 2012;7(12):e51292.
Recent advances in genomics technologies have spurred unprecedented efforts in genome and exome re-sequencing aiming to unravel the genetic component of rare and complex disorders. While in rare disorders this allowed the identification of novel causal genes, the missing heritability paradox in complex diseases remains so far elusive. Despite rapid advances of next-generation sequencing, both the technology and the analysis of the data it produces are in their infancy. At present there is abundant knowledge pertaining to the role of rare single nucleotide variants (SNVs) in rare disorders and of common SNVs in common disorders. Although the 1000 Genomes Project has clearly highlighted the prevalence of rare variants and more complex variants (e.g. insertions, deletions), their role in disease is as yet far from elucidated. We set out to analyse the properties of sequence variants identified in a comprehensive collection of exome re-sequencing studies performed on samples from patients affected by a broad range of complex and rare diseases (N = 173). Given the known potential for Loss of Function (LoF) variants to be false positive, we performed an extensive validation of the common, rare and private LoF variants identified, which indicated that most of the private and rare variants identified were indeed true, while common novel variants had a significantly higher false positive rate. Our results indicated a strong enrichment of very low-frequency insertion/deletion variants, so far under-investigated, which might be difficult to capture with low coverage and imputation approaches and for which most study designs would be under-powered. These insertions and deletions might play a significant role in disease genetics, contributing specifically to the underlying rare and private variation predicted to be discovered through next generation sequencing.
Chiara Bacchelli, Karen F Buckland, Sylvie Buckridge, Ulrich Salzer, Pascal Schneider, Adrian J Thrasher, H Bobby Gaspar
J Allergy Clin Immunol. 2011 May;127(5):1253-9.e13
Mutations in TNFRSF13B, the gene encoding transmembrane activator and calcium modulator cyclophilin ligand interactor (TACI), are found in 10% of patients with common variable immunodeficiency. However, the most commonly detected mutation is the heterozygous change C104R, which is also found in 0.5% to 1% of healthy subjects. The contribution of the C104R mutation to the B-cell defects observed in patients with common variable immunodeficiency therefore remains unclear.
Ingrid Slade, Chiara Bacchelli, Helen Davies, Anne Murray, Fatemeh Abbaszadeh, Sandra Hanks, Rita Barfoot, Amos Burke, Julia Chisholm, Martin Hewitt, Helen Jenkinson, Derek King, Bruce Morland, Barry Pizer, Katrina Prescott, Anand Saggar, Lucy Side, Heidi Traunecker, Sucheta Vaidya, Paul Ward, P Andrew Futreal, Gordan Vujanic, Andrew G Nicholson, Neil Sebire, Clare Turnbull, John R Priest, Kathryn Pritchard-Jones, Richard Houlston, Charles Stiller, Michael R Stratton, Jenny Douglas, Nazneen Rahman
J Med Genet. 2011 Apr;48(4):273-8
Constitutional DICER1 mutations were recently reported to cause familial pleuropulmonary blastoma (PPB).
Ulrich Salzer, Chiara Bacchelli, Sylvie Buckridge, Qiang Pan-Hammarström, Stephanie Jennings, Vassilis Lougaris, Astrid Bergbreiter, Tina Hagena, Jennifer Birmelin, Alessandro Plebani, A David B Webster, Hans-Hartmut Peter, Daniel Suez, Helen Chapel, Andrew McLean-Tooke, Gavin P Spickett, Stephanie Anover-Sombke, Hans D Ochs, Simon Urschel, Bernd H Belohradsky, Sanja Ugrinovic, Dinakantha S Kumararatne, Tatiana C Lawrence, Are M Holm, Jose L Franco, Ilka Schulze, Pascal Schneider, E Michael Gertz, Alejandro A Schäffer, Lennart Hammarström, Adrian J Thrasher, H Bobby Gaspar, Bodo Grimbacher
Blood. 2009 Feb 26;113(9):1967-76
TNFRSF13B encodes transmembrane activator and calcium modulator and cyclophilin ligand interactor (TACI), a B cell-specific tumor necrosis factor (TNF) receptor superfamily member. Both biallelic and monoallelic TNFRSF13B mutations were identified in patients with common variable immunodeficiency disorders. The genetic complexity and variable clinical presentation of TACI deficiency prompted us to evaluate the genetic, immunologic, and clinical condition in 50 individuals with TNFRSF13B alterations, following screening of 564 unrelated patients with hypogammaglobulinemia. We identified 13 new sequence variants. The most frequent TNFRSF13B variants (C104R and A181E; n=39; 6.9%) were also present in a heterozygous state in 2% of 675 controls. All patients with biallelic mutations had hypogammaglobulinemia and nearly all showed impaired binding to a proliferation-inducing ligand (APRIL). However, the majority (n=41; 82%) of the patients carried monoallelic changes in TNFRSF13B. Presence of a heterozygous mutation was associated with antibody deficiency (P< .001, relative risk 3.6). Heterozygosity for the most common mutation, C104R, was associated with disease (P< .001, relative risk 4.2). Furthermore, heterozygosity for C104R was associated with low numbers of IgD(-)CD27(+) B cells (P= .019), benign lymphoproliferation (P< .001), and autoimmune complications (P= .001). These associations indicate that C104R heterozygosity increases the risk for common variable immunodeficiency disorders and influences clinical presentation.
Pina-Martins F, Silva D, Fino J, Paulo O
Structure_threader is a program to parallelize multiple runs of genetic clustering software that does not make use of multithreading technology (structure, fastStructure and MavericK) on multicore computers. Our approach was benchmarked across multiple systems and displayed great speed improvements relative to the single-threaded implementation, scaling very close to linearly with the number of physical cores used. Structure_threader was compared to previous software written for the same task (ParallelStructure and StrAuto) and was proven to be the faster wrapper (up to 25% faster) under all tested scenarios. Furthermore, Structure_threader can perform several automatic and convenient operations, assisting the user in assessing the most biologically likely value of 'K' via implementations such as the "Evanno" or "Thermodynamic Integration" tests and automatically drawing the "meanQ" plots (static or interactive) for each value of K (or even combined plots). Structure_threader is written in Python 3 and licensed under the GPLv3. It can be downloaded free of charge at https://github.com/StuntsPT/Structure_threader.
C Bacchelli, S Buckridge, A J Thrasher, H B Gaspar
Clin Exp Immunol. 2007 Sep;149(3):401-9
Common variable immunodeficiency (CVID) is a primary immunodeficiency that typically affects adults and is characterized by abnormalities of quantitative and qualitative humoral function that are heterogeneous in their immunological profile and clinical manifestations. The recent identification of four monogenic defects that result in the CVID phenotype also demonstrates that the genetic basis of CVID is highly variable. Mutations in the genes encoding the tumour necrosis factor (TNF) superfamily receptors transmembrane activator and calcium-modulating ligand interactor (TACI) and B cell activation factor of the TNF family receptor (BAFF-R), CD19 and the co-stimulatory molecule inducible co-stimulator molecule (ICOS) all lead to CVID and illustrate the complex interplay required to co-ordinate an effective humoral immune response. The molecular mechanisms leading to the immune defect are still not understood clearly and particularly in the case of TACI, where a number of heterozygous mutations have been found in affected individuals, the molecular pathogenesis of disease requires further elucidation. Together these defects account for perhaps 10-15% of all cases of CVID and it is highly likely that further genetic defects will be identified.
Ulrich Salzer, Jennifer Birmelin, Chiara Bacchelli, Torsten Witte, Ulrike Buchegger-Podbielski, Sylvie Buckridge, Rita Rzepka, H Bobby Gaspar, Adrian J Thrasher, Reinhold E Schmidt, Inga Melchers, Bodo Grimbacher
J Clin Immunol. 2007 Jul;27(4):372-7
B cell activating factor belonging to the TNF family (BAFF) and a proliferation inducing ligand (APRIL), and their receptors BAFF receptor (BAFFR), B cell maturation antigen (BCMA), and transmembrane activator and CAML interactor (TACI) are involved in the regulation of B cell homeostasis and differentiation. BAFF overexpression leads to systemic lupus erythematosus (SLE) in mice and elevated BAFF levels have been observed in human SLE and mouse models for SLE. Furthermore, genetic inactivation of TACI in mice results in a SLE-like phenotype. Based on our recent finding that TACI is mutated in patients with common variable immunodeficiency, of whom more than 30% suffer from autoimmune conditions, we analyzed TACI in humans with SLE. Sequence analysis of TNFRSF13b/TACI in 119 unrelated SLE patients revealed four variants: R20C in exon 1, R72H in exon 3, the silent variation c.327 G > A in exon 3, and A181E in exon 4. No significant association with any of these variants was found, when compared to the frequencies of the variants in a healthy control cohort. Furthermore, the mutated alleles R20C and R72H did not segregate with the SLE phenotype in familial cases of SLE. Thus, our evaluation of the coding region of TNFRSF13b/TACI did not reveal any deleterious or disease-associated mutations.
Philip L Beales, Elizabeth Bland, Jonathan L Tobin, Chiara Bacchelli, Beyhan Tuysuz, Josephine Hill, Suzanne Rix, Chad G Pearson, Masatake Kai, Jane Hartley, Colin Johnson, Melita Irving, Nursel Elcioglu, Mark Winey, Masazumi Tada, Peter J Scambler
Nat Genet. 2007 Jun;39(6):727-9
Jeune asphyxiating thoracic dystrophy, an autosomal recessive chondrodysplasia, often leads to death in infancy because of a severely constricted thoracic cage and respiratory insufficiency; retinal degeneration, cystic renal disease and polydactyly may be complicating features. We show that IFT80 mutations underlie a subset of Jeune asphyxiating thoracic dystrophy cases, establishing the first association of a defective intraflagellar transport (IFT) protein with human disease. Knockdown of ift80 in zebrafish resulted in cystic kidneys, and knockdown in Tetrahymena thermophila produced shortened or absent cilia.
Qiang Pan-Hammarström, Ulrich Salzer, Likun Du, Janne Björkander, Charlotte Cunningham-Rundles, David L Nelson, Chiara Bacchelli, H Bobby Gaspar, Steven Offer, Timothy W Behrens, Bodo Grimbacher, Lennart Hammarström
Nat Genet. 2007 Apr;39(4):429-30
Tumor necrosis factor (TNF)-like receptors are members of a superfamily of proteins involved in regulating maturation and survival of lymphocytes. One of these receptors, TACI (transmembrane activator and CAML interactor; encoded by TNFRSF13B), binds two ligands, BAFF and APRIL. Deletion of Tnfrsf13b in mice results in an impaired response to thymus-independent antigens and virtually abolishes APRIL-induced switching to IgA, IgE and IgG1. Conversely, lack of APRIL, owing to a targeted inactivation of Tnfsf13 in mice, results in an impaired ability to switch to IgA production.
C Bacchelli, L C Wilson, J A Cook, R M Winter, F R Goodman
Clin Genet. 2003 Sep;64(3):263-5
Brachydactyly type B (BDB, MIM 113000), the most severe of the inherited brachydactylies, is characterized by short second to fifth fingers with hypoplastic or absent nails and distal phalanges, hypoplastic middle phalanges and variable symphalangism (1, 2). The thumbs are frequently spared, but may be broad and flat with duplicated distal phalanges. There may also be central soft-tissue syndactyly. The feet are similarly, but more mildly, affected. In some families, there is also a characteristic facial appearance, including wide-spaced eyes, down-slanting palpebral fissures, a short philtrum and a prominent nose with a bulbous tip (3-5). BDB is dominantly inherited and genetically heterogeneous (4, 5). The underlying gene at the locus on chromosome 9q22, ROR2 (6, 7), encodes an orphan receptor tyrosine kinase that is expressed in a variety of developing tissues, including the primordia of all bones that undergo endochondral ossification, and plays an essential role in cartilage growth and differentiation (8-11). Homozygous loss-of-function mutations in ROR2 cause recessively inherited Robinow syndrome (MIM 268310) (12-14), but heterozygous loss-of-function mutations cause no abnormalities in humans (6, 12, 13) or mice (8). The heterozygous ROR2 mutations responsible for BDB are thus thought to act by a specific gain-of-function mechanism (6, 7). The role of ROR2 in other forms of isolated and syndromic brachydactyly, however, has not yet been examined.
N V Morgan, C Bacchelli, P Gissen, J Morton, G B Ferrero, M Silengo, P Labrune, I Casteels, C Hall, P Cox, D A Kelly, R C Trembath, P J Scambler, E R Maher, F R Goodman, C A Johnson
J Med Genet. 2003 Jun;40(6):431-5
Asphyxiating thoracic dystrophy (ATD), or Jeune syndrome, is a multisystem autosomal recessive disorder associated with a characteristic skeletal dysplasia and variable renal, hepatic, pancreatic, and retinal abnormalities. We have performed a genome wide linkage search using autozygosity mapping in a cohort of four consanguineous families with ATD, three of which originate from Pakistan, and one from southern Italy. In these families, as well as in a fifth consanguineous family from France, we localised a novel ATD locus (ATD) to chromosome 15q13, with a maximum cumulative two point lod score at D15S1031 (Zmax=3.77 at theta=0.00). Five consanguineous families shared a 1.2 cM region of homozygosity between D15S165 and D15S1010. Investigation of a further four European kindreds, with no known parental consanguinity, showed evidence of marker homozygosity across a similar interval. Families with both mild and severe forms of ATD mapped to 15q13, but mutation analysis of two candidate genes, GREMLIN and FORMIN, did not show pathogenic mutations.
P Debeer, C Bacchelli, P J Scambler, L De Smet, J-P Fryns, F R Goodman
J Med Genet. 2002 Nov;39(11):852-6
Hox genes encode a highly conserved family of transcription factors with fundamental roles in body patterning during embryogenesis. Studies in mouse and chick have shown that the 5′ HoxD and HoxA genes are critical for vertebrate limb and urogenital tract development. In humans, mutations in HOXD13 and HOXA13 cause the rare dominantly inherited limb malformation syndromes synpolydactyly (SPD, MIM 186000) and hand-foot-genital syndrome (HFGS, MIM 140000), respectively. SPD is characterised by syndactyly between the third and fourth fingers and between the fourth and fifth toes, with variable digit duplication in the syndactylous web. Most cases result from expansions of a polyalanine tract in the N-terminal region of HOXD13 but frameshifting deletions have been identified in three families with an atypical foot phenotype. HFGS is characterised by short thumbs and halluces, hypospadias in males, Müllerian duct fusion defects in females, and urinary tract malformations in both sexes. Most cases result from nonsense mutations in HOXA13, but two polyalanine tract expansions and one missense mutation have also been described.
Jeffrey W Innis, Frances R Goodman, Chiara Bacchelli, Thomas M Williams, Douglas P Mortlock, Praveen Sateesh, Peter J Scambler, Wendy McKinnon, Alan E Guttmacher
Hum Mutat. 2002 May;19(5):573-4
Guttmacher syndrome, a dominantly inherited combination of distal limb and genital tract abnormalities, has several features in common with hand-foot-genital syndrome (HFGS), including hypoplastic first digits and hypospadias. The presence of features not seen in HFGS, however, including postaxial polydactyly of the hands and uniphalangeal second toes with absent nails, suggests that it represents a distinct entity. HFGS is caused by mutations in the HOXA13 gene. We have therefore re-investigated the original Guttmacher syndrome family, and have found that affected individuals are heterozygous for a novel missense mutation in the HOXA13 homeobox (c.1112A>T; homeodomain residue Q50L), which arose on an allele already carrying a novel 2-bp deletion (-78-79delGC) in the gene's highly conserved promoter region. This deletion produces no detectable abnormalities on its own, but may contribute to the phenotype in the affected individuals. The missense mutation, which alters a key residue in the recognition helix of the homeodomain, is likely to perturb HOXA13's DNA-binding properties, resulting in both a loss and a specific gain of function.
C Bacchelli, F R Goodman, P J Scambler, R M Winter
Clin Genet. 2001 Mar;59(3):203-5
Cenani-Lenz syndrome (CLS; MIM 212780) is a rare congenital limb malformation characterized by 'total' digit syndactyly and extensive metacarpal and carpal fusions, often accompanied by partial or complete radio-ulnar synostosis. The feet are usually only mildly affected, and tibio-fibular synostosis has never been observed. To date, only 22 cases have been described; of these, two also have extreme shortening of the forearms, and may represent a separate condition. Although most cases are sporadic, there are four reports of affected siblings born to unaffected non-consanguineous parents, and three reports of affected children born to unaffected consanguineous parents, strongly suggesting autosomal recessive inheritance. A dominantly-inherited limb malformation originally reported as CLS, and associated with a t(12;22)(p11.2;q13.3) balanced translocation, was subsequently re-classified as a novel type of complex synpolydactyly.
F R Goodman, C Bacchelli, A F Brady, L A Brueton, J P Fryns, D P Mortlock, J W Innis, L B Holmes, A E Donnenfeld, M Feingold, F A Beemer, R C Hennekam, P J Scambler
Am J Hum Genet. 2000 Jul;67(1):197-202
Hand-foot-genital syndrome (HFGS) is a rare, dominantly inherited condition affecting the distal limbs and genitourinary tract. A nonsense mutation in the homeobox of HOXA13 has been identified in one affected family, making HFGS the second human syndrome shown to be caused by a HOX gene mutation. We have therefore examined HOXA13 in two new and four previously reported families with features of HFGS. In families 1, 2, and 3, nonsense mutations truncating the encoded protein N-terminal to or within the homeodomain produce typical limb and genitourinary abnormalities; in family 4, an expansion of an N-terminal polyalanine tract produces a similar phenotype; in family 5, a missense mutation, which alters an invariant domain, produces an exceptionally severe limb phenotype; and in family 6, in which limb abnormalities were atypical, no HOXA13 mutation could be detected. Mutations in HOXA13 can therefore cause more-severe limb abnormalities than previously suspected and may act by more than one mechanism.