In genetics, a fusion gene is a hybrid gene formed from two previously independent genes. It can occur as a result of translocation, interstitial deletion, or chromosomal inversion. Fusion genes have been found to be prevalent in all main types of human neoplasia.[1] The identification of these fusion genes plays a prominent role in diagnosis and prognosis, where they serve as markers.[2]
The first fusion gene[1] was described in cancer cells in the early 1980s. The finding was based on the discovery in 1960 by Peter Nowell and David Hungerford in Philadelphia of a small abnormal marker chromosome in patients with chronic myeloid leukemia, the first consistent chromosome abnormality detected in a human malignancy, later designated the Philadelphia chromosome.[3] In 1973, Janet Rowley in Chicago showed that the Philadelphia chromosome had originated through a translocation between chromosomes 9 and 22, and not through a simple deletion of chromosome 22 as was previously thought. Several investigators in the early 1980s showed that the Philadelphia chromosome translocation led to the formation of a new BCR::ABL1 fusion gene, composed of the 3' part of the ABL1 gene at the breakpoint on chromosome 9 and the 5' part of a gene called BCR at the breakpoint on chromosome 22. In 1985 it was clearly established that the fusion gene on chromosome 22 produced an abnormal chimeric BCR::ABL1 protein with the capacity to induce chronic myeloid leukemia.
It has been known for decades that the corresponding gene fusion plays an important role in tumorigenesis.[4] Fusion genes can contribute to tumor formation because they can produce a much more active abnormal protein than non-fusion genes. Often, fusion genes are oncogenes that cause cancer; these include BCR-ABL,[5] TEL-AML1 (ALL with t(12;21)), AML1-ETO (M2 AML with t(8;21)), and TMPRSS2-ERG with an interstitial deletion on chromosome 21, often occurring in prostate cancer.[6] In the case of TMPRSS2-ERG, the fusion product disrupts androgen receptor (AR) signaling and inhibits AR expression through the oncogenic ETS transcription factor, and in this way regulates prostate cancer.[7] Most fusion genes have been found in hematological cancers, sarcomas, and prostate cancer.[1][8] BCAM-AKT2 is a fusion gene that is specific and unique to high-grade serous ovarian cancer.[9]
Oncogenic fusion genes may lead to a gene product with a new or different function from the two fusion partners. Alternatively, a proto-oncogene is fused to a strong promoter, and its oncogenic function is activated by the upregulation that the strong promoter of the upstream fusion partner drives. The latter is common in lymphomas, where oncogenes are juxtaposed to the promoters of the immunoglobulin genes.[10] Oncogenic fusion transcripts may also be caused by trans-splicing or read-through events.[11]
Since chromosomal translocations play such a significant role in neoplasia, a specialized database of chromosomal aberrations and gene fusions in cancer has been created: the Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer.[12]
The presence of certain chromosomal aberrations and their resulting fusion genes is commonly used in cancer diagnostics to establish a precise diagnosis. Chromosome banding analysis, fluorescence in situ hybridization (FISH), and reverse transcription polymerase chain reaction (RT-PCR) are common methods employed at diagnostic laboratories. These methods all have distinct shortcomings owing to the very complex nature of cancer genomes. Recent developments such as high-throughput sequencing[13] and custom DNA microarrays promise more efficient methods.[14]
Gene fusion plays a key role in the evolution of gene architecture. Its effect can be observed when gene fusion occurs in coding sequences.[15] Duplication, sequence divergence, and recombination are the major contributors at work in gene evolution.[16] These events can probably produce new genes from already existing parts. When gene fusion happens in a non-coding sequence region, it can lead to misregulation of the expression of a gene that is now under the control of the cis-regulatory sequence of another gene. When it happens in coding sequences, gene fusion causes the assembly of a new gene, which allows new functions to appear through the addition of peptide modules to a multi-domain protein.[15] Methods that inventory gene fusion events on a large biological scale can provide insights into the multi-modular architecture of proteins.[17][18][19]
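One way to picture such an inventory is to describe each protein by its ordered list of domains and to look for proteins whose architecture equals two separate proteins placed end to end. The following is a minimal sketch of that idea only, not the method used in the cited studies; the gene names, the domain labels `D1` to `D3`, and the function `find_composites` are hypothetical.

```python
from itertools import permutations
from typing import Dict, List, Tuple

# A protein is modelled as an ordered tuple of domain labels (hypothetical labels).
Architecture = Tuple[str, ...]


def find_composites(
    candidates: Dict[str, Architecture],
    components: Dict[str, Architecture],
) -> List[Tuple[str, str, str]]:
    """Return (fused, upstream, downstream) triples where the fused protein's
    domain architecture is exactly the upstream architecture followed by the
    downstream architecture."""
    hits = []
    for fused_name, fused_arch in candidates.items():
        # Order matters: A followed by B is a different architecture from B followed by A.
        for (name_a, arch_a), (name_b, arch_b) in permutations(components.items(), 2):
            if arch_a + arch_b == fused_arch:
                hits.append((fused_name, name_a, name_b))
    return hits


# Toy data: two single-domain proteins in one genome, two candidates in another.
components = {"geneX": ("D1",), "geneY": ("D2",)}
candidates = {"geneZ": ("D1", "D2"), "geneW": ("D3",)}

print(find_composites(candidates, components))
# [('geneZ', 'geneX', 'geneY')]
```

Real analyses work from sequence similarity rather than pre-assigned labels, so exact tuple equality here stands in for a much more tolerant alignment step.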
The purines adenine and guanine are two of the four information-encoding bases of the universal genetic code. Biosynthesis of these purines occurs by similar, but not identical, pathways in different species of the three domains of life, the Archaea, Bacteria and Eukaryotes. A major distinctive feature of the purine biosynthetic pathways in Bacteria is the prevalence of gene fusions where two or more purine biosynthetic enzymes are encoded by a single gene.[20] Such gene fusions are almost exclusively between genes that encode enzymes that perform sequential steps in the biosynthetic pathway. Eukaryotic species generally exhibit the most common gene fusions seen in the Bacteria, but in addition have new fusions that potentially increase metabolic flux.
In recent years, next-generation sequencing technology has become available to screen for known and novel gene fusion events on a genome-wide scale. However, the precondition for large-scale detection is paired-end sequencing of the cell's transcriptome. Current work on fusion gene detection focuses mainly on data analysis and visualization. Researchers have developed a tool called Transcriptome Viewer (TViewer) to directly visualize detected gene fusions at the transcript level.[21]
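Paired-end data matter because the two mates of one fragment can align to different genes, which hints at a fusion transcript. The sketch below illustrates that counting step only, under the assumption that each read pair has already been aligned and annotated with a gene per mate; it is not how TViewer or any specific fusion caller works, and `ReadPair`, `candidate_fusions`, and `min_support` are hypothetical names.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple, Tuple


class ReadPair(NamedTuple):
    read_id: str
    gene_mate1: str  # gene that mate 1 aligned to (assumed pre-annotated)
    gene_mate2: str  # gene that mate 2 aligned to (assumed pre-annotated)


def candidate_fusions(
    pairs: Iterable[ReadPair], min_support: int = 2
) -> Dict[Tuple[str, str], int]:
    """Count read pairs whose mates map to different genes and keep the gene
    pairs supported by at least `min_support` such discordant pairs."""
    support: Counter = Counter()
    for pair in pairs:
        if pair.gene_mate1 != pair.gene_mate2:
            # Sort so that (BCR, ABL1) and (ABL1, BCR) count as the same candidate.
            key = tuple(sorted((pair.gene_mate1, pair.gene_mate2)))
            support[key] += 1
    return {genes: n for genes, n in support.items() if n >= min_support}


# Toy alignments, invented for illustration.
pairs = [
    ReadPair("r1", "BCR", "ABL1"),
    ReadPair("r2", "ABL1", "BCR"),
    ReadPair("r3", "BCR", "BCR"),      # concordant pair, ignored
    ReadPair("r4", "TMPRSS2", "ERG"),  # only one supporting pair, filtered out
]

print(candidate_fusions(pairs))
# {('ABL1', 'BCR'): 2}
```

Sorting the gene pair discards the 5' to 3' orientation of the fusion; a practical caller would keep it and would also use reads that span the junction itself.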
Biologists may also deliberately create fusion genes for research purposes. The fusion of reporter genes to the regulatory elements of genes of interest allows researchers to study gene expression. Reporter gene fusions can be used to measure activity levels of gene regulators, identify the regulatory sites of genes (including the signals required), identify various genes that are regulated in response to the same stimulus, and artificially control the expression of desired genes in particular cells.[22] For example, by creating a fusion gene of a protein of interest and green fluorescent protein, the protein of interest may be observed in cells or tissue using fluorescence microscopy.[23] The protein synthesized when a fusion gene is expressed is called a fusion protein.
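For a fusion of two coding sequences to yield a single fusion protein, the upstream stop codon has to be removed and the downstream sequence has to stay in the same reading frame. The sketch below checks those two conditions on toy sequences; the function `fuse_in_frame`, the linker, and the short sequences are hypothetical and are not real gene or GFP sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(upstream_cds: str, downstream_cds: str, linker: str = "") -> str:
    """Join two coding sequences into one open reading frame.

    Each part must be a whole number of codons; a stop codon at the end of the
    upstream sequence is removed so that translation continues into the
    downstream partner."""
    upstream_cds = upstream_cds.upper()
    downstream_cds = downstream_cds.upper()
    linker = linker.upper()
    for name, seq in (("upstream", upstream_cds), ("linker", linker), ("downstream", downstream_cds)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} sequence is not a whole number of codons")
    if upstream_cds[-3:] in STOP_CODONS:
        upstream_cds = upstream_cds[:-3]
    return upstream_cds + linker + downstream_cds


# Toy sequences, invented for illustration only.
protein_of_interest = "ATGGCTTAA"  # Met-Ala-stop
reporter = "ATGAAATAG"             # Met-Lys-stop, standing in for a reporter gene
fused = fuse_in_frame(protein_of_interest, reporter, linker="GGTGGC")  # Gly-Gly linker

print(fused)
# ATGGCTGGTGGCATGAAATAG
```

The check is deliberately strict: a sequence whose length is not a multiple of three raises an error instead of being silently padded, because a frame shift would change every downstream codon.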