WGS or Exome
This is one of the most discussed topics, especially in Clinical Genomics!
Affordability, accuracy, feasibility and, of course, turnaround time - judged mostly on these factors, which sequencing technology is more suitable for the clinic: Whole Exome Sequencing or Whole Genome Sequencing?
So here's my 2 cents on this discussion!
When it comes to DNA sequencing, there has always been a raging debate over whether to choose Whole Genome Sequencing (WGS) or Whole Exome Sequencing (WES, also written WXS) for routine use. Whole genome sequencing, as the name suggests, is the process of sequencing the entire genome. In practice, however, this is rarely achieved: only 95-97% of the genome is covered, because certain regions (high GC content, large repeat regions, centromeres, telomeres, etc.) are technically difficult to sequence with existing technology.
“It’s very fair to say the human genome was never fully sequenced,” - Craig Venter
“The human genome has not been completely sequenced and neither has any other mammalian genome as far as I’m aware” - George Church, Harvard Medical School
According to Dr. George Church of Harvard Medical School, about 4-9% of the human genome has yet to be sequenced! (A STAT News article covers the incomplete nature of the Human Genome Project in more detail.)
Exome sequencing, sometimes called ‘whole exome sequencing’ (WES or WXS), instead focuses on just the protein-coding sequences. Of the slightly more than 3 billion bases in the human genome, only a small fraction (roughly 1.5%) is coding; this is the exome. The remainder is non-coding sequence (introns, regulatory regions, repeats and so on), much of which has no known function and is treated as such for the sake of this discussion.
The exome is like the juicy, succulent part of an orange, and the rind along with the seeds is the "junk" DNA (well, not a proportionate analogy but a functionally apt one!)
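To put those proportions into numbers, here is a minimal back-of-the-envelope sketch. It uses only the round figures quoted above (3 billion bases, 1.5% coding); the variable names are mine and purely illustrative.

```python
# Rough size of the exome, using the round figures quoted in the text.
GENOME_BASES = 3_000_000_000   # "slightly more than 3 billion bases"
CODING_FRACTION = 0.015        # "roughly 1.5%" of the genome is coding

exome_bases = GENOME_BASES * CODING_FRACTION
noncoding_bases = GENOME_BASES - exome_bases

print(f"Exome:      about {exome_bases / 1e6:.0f} million bases")
print(f"Non-coding: about {noncoding_bases / 1e9:.1f} billion bases")
```

So the exome is on the order of 45 million bases, which is the target an exome kit tries to capture.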
Here is a comparison between these two technologies!
Who wouldn't love to have the whole shebang of all data - exons, introns, promoters, genomic and UTR regulatory regions, noncoding RNAs and structurally variant segments, right?! Think about the decisions that could be facilitated by all that data, especially in a clinical situation! But in the current climate, making clinical WGS feasible and efficient remains a huge practical challenge, especially with respect to data storage and compute needs. This is only complicated further by the privacy, sharing and storage security concerns that surround patient data or other clinically sensitive data in such large volumes.
Whole exome sequencing, on the other hand, makes the task less daunting. Although it spans as little as 1.5% of the genome, the exome holds as many as 23,500 genes with 180,000 exons (1). Overall, less sequencing is required than for WGS, and hence costs are lower. This narrow stretch of sequence contains approximately 80-90% of all known disease-causing variants, making it a data-rich ground for analysis and saving a lot of time. Comparing clinical-grade sequencing of a 100x exome with a 30x whole genome, the file size is between 5 GB and 8 GB for the exome, clearly less than the ~90 GB of the whole genome.
Exome sequencing produces less raw data, reducing the overall cost of a project and enabling a larger number of samples to be studied simultaneously. This makes identification of novel disease-causing variants easier. It also means that there are more annotated and interpreted data points for exonic regions than for non-coding regions. On the flip side, important variation outside the exonic regions will be missed.
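The 100x exome versus 30x genome comparison can be sanity-checked with simple arithmetic. The sketch below is idealized: it assumes a ~45 Mb exome target (1.5% of 3 Gb) and ignores off-target reads, duplicates and file compression, so it estimates bases sequenced, not actual file sizes. The function name `sequenced_bases` is just for illustration.

```python
def sequenced_bases(target_bases: float, mean_depth: float) -> float:
    """Bases that must be read to cover a target at a given mean depth (idealized)."""
    return target_bases * mean_depth

EXOME_TARGET = 45e6    # assumption: ~1.5% of a 3 Gb genome
GENOME_TARGET = 3e9    # round figure for the human genome

exome_100x = sequenced_bases(EXOME_TARGET, 100)
genome_30x = sequenced_bases(GENOME_TARGET, 30)
base_ratio = genome_30x / exome_100x

# File sizes quoted in the text: 5-8 GB (exome) vs ~90 GB (genome).
size_ratio_low = 90 / 8
size_ratio_high = 90 / 5

print(f"Bases sequenced: genome/exome = {base_ratio:.0f}x")
print(f"File size:       genome/exome = {size_ratio_low:.1f}x to {size_ratio_high:.1f}x")
```

Even at more than three times the depth, the exome needs roughly a twentieth of the sequencing, which is in the same ballpark as the 11-18x difference in the quoted file sizes.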
Another important thing to keep in mind is the targeting efficiency of currently available exome capture kits. While the number of putative human genes is somewhere between 22,000 and 23,500 (http://www.ncbi.nlm.nih.gov/projects/CCDS/), most commercial exome capture kits cover only the CCDS set. This means only around 18,000 genes of the total set are covered. Adding to the pain, not all of the genes in the CCDS are covered fully. Some exonic regions are excluded when the kits are designed because of the hybridization chemistry and the possibility of cross-hybridization, complicated further by segmental duplications and some conserved pseudogenes. Deep sequencing of the transcriptome adds to this growing complexity by uncovering novel transcripts and genes, including in less well-characterized parts of the CCDS.
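As a quick illustration of what that gap means, the snippet below turns the gene counts quoted above into a rough "fraction of genes targeted". It is only a sketch: it counts genes, not bases, and says nothing about how completely each gene is covered.

```python
KIT_GENES = 18_000                  # genes in the CCDS set, as quoted above
GENE_ESTIMATES = (22_000, 23_500)   # range of putative human gene counts

def covered_fraction(covered: int, total: int) -> float:
    """Fraction of the gene set that a kit targets at all."""
    return covered / total

for total in GENE_ESTIMATES:
    print(f"{total:,} genes: {covered_fraction(KIT_GENES, total):.1%} targeted")
```

By this rough count, somewhere around a fifth to a quarter of putative genes are not targeted at all.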
When recommending the appropriate sequencing technology, quality parameters play an important role. It has been noted that WGS offers a more uniform distribution of sequencing quality parameters than WXS. PCR amplification during exome enrichment produces regions of non-uniform coverage - regions with too much or too little coverage. The heterogeneous coverage of whole exome data and the resulting false positive variants are directly related to PCR, and this is a criminal waste of sequencing power, especially when coupled with a bad capture kit. The average error rate by cycle is about twice as high in PCR-based protocols (WXS) as in PCR-free protocols (WGS).
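One simple way to put a number on "non-uniform coverage" is to look at the spread of per-base depth. The sketch below is illustrative only: the depth lists are made-up toy values, and the 0.2 x mean cut-off for "under-covered" is an assumption of mine, not a standard taken from this post.

```python
from statistics import mean, pstdev

def coverage_summary(depths, low_factor=0.2):
    """Summarize how evenly a set of per-base depths is distributed.

    low_factor is a hypothetical cut-off: bases with depth below
    low_factor * mean are counted as under-covered.
    """
    avg = mean(depths)
    if avg == 0:
        raise ValueError("mean depth is zero")
    cv = pstdev(depths) / avg  # coefficient of variation: lower means more uniform
    low = sum(1 for d in depths if d < low_factor * avg) / len(depths)
    return {"mean": avg, "cv": cv, "low_fraction": low}

# Toy depth profiles (invented values, for illustration only).
uniform_like = [28, 30, 31, 29, 32, 30, 30, 30]   # even, WGS-style profile
patchy_like = [2, 150, 5, 90, 0, 210, 3, 20]      # peaks and dropouts

print(coverage_summary(uniform_like))
print(coverage_summary(patchy_like))
```

In the patchy profile the mean depth is higher, yet half of the positions fall below the cut-off: plenty of reads were spent, but many of them piled up where they were not needed.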
There are also occasions where several potentially damaging coding SNVs have been missed by WXS despite lying in regions targeted by the exome kit. With respect to copy number variations (CNVs), WXS is not a reliable technology, because the captured exons are non-contiguous and most CNVs extend beyond the regions covered by the exome kit (2). With WGS, by contrast, copy number variations, rearrangements and other structural variations can be identified much more reliably, taking advantage of contiguous coverage and, where available, longer reads. Since WGS targets the universal set, it has no capture-induced reference bias, whereas WXS capture probes tend to preferentially enrich reference alleles at heterozygous sites, producing false negative SNV calls.
Whole Exome Sequencing makes life easy by saving time and cost! But Whole Genome Sequencing gives more complete and reliable insight into genome-wide variation. WXS is a sensible diagnostic tool where time and money are of the essence, but approximately 75-80% of cases still remain undiagnosed. In those situations WGS would be the way forward, taking into account, of course, the cost of data analysis and storage.
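Reference bias of this kind shows up as skewed allele balance: at a true heterozygous site you expect roughly half the reads to carry the alternate allele. Below is a minimal sketch of such a check. The read counts are invented, and the 0.3 threshold is an arbitrary assumption for illustration, not a value used by any particular variant caller.

```python
def alt_allele_fraction(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads at a site that support the alternate allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads at this site")
    return alt_reads / total

def looks_reference_biased(ref_reads: int, alt_reads: int,
                           min_alt_fraction: float = 0.3) -> bool:
    """Flag a heterozygous site whose alt fraction falls below a (hypothetical) cut-off."""
    return alt_allele_fraction(ref_reads, alt_reads) < min_alt_fraction

# Invented read counts (ref, alt) at two heterozygous sites.
balanced_site = (50, 48)
skewed_site = (80, 20)

print(looks_reference_biased(*balanced_site))
print(looks_reference_biased(*skewed_site))
```

A site like the second one, where only a fifth of the reads carry the alternate allele, is the kind that risks being called homozygous reference, i.e. a false negative.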
In conclusion, there is no clear consensus as to which of the two is better, with strong proponents on both sides. However, with WGS prices falling globally, WXS is losing its edge in cost-effectiveness, and with modern advances in computing, the time needed to analyze WGS datasets is also coming down. These trends point towards a favourable future for WGS, unless users worldwide swing back in favour of WXS, presumably with some major modifications.
1. Ipe, J., Swart, M., Burgess, K. and Skaar, T. (2017), High-Throughput Assays to Assess the Functional Impact of Genetic Variants: A Road Towards Genomic-Driven Medicine. Clinical And Translational Science. doi:10.1111/cts.12440
2. Belkadi, A., Bolze, A., Itan, Y., Cobat, A., Vincent, Q. B., Antipenko, A., Abel, L. (2014). Whole-genome sequencing is more powerful than whole-exome sequencing for detecting exome variants. doi:10.1101/010363