An expression vector, otherwise known as an expression construct, is usually a plasmid or virus designed for gene expression in cells. The vector is used to introduce a specific gene into a target cell, and can commandeer the cell's mechanism for protein synthesis to produce the protein encoded by the gene. Expression vectors are the basic tools in biotechnology for the production of proteins.
The vector is engineered to contain regulatory sequences that act as enhancer and promoter regions and lead to efficient transcription of the gene carried on the expression vector.[1] The goal of a well-designed expression vector is the efficient production of protein, which may be achieved by producing a significant amount of stable messenger RNA that can then be translated into protein. The expression of a protein may be tightly controlled, so that the protein is produced in significant quantity only when necessary, through the use of an inducer; in some systems, however, the protein may be expressed constitutively. Escherichia coli is commonly used as the host for protein production, but other cell types may also be used. An example of the use of expression vectors is the production of insulin, which is used in the medical treatment of diabetes.
An expression vector has features that any vector may have, such as an origin of replication, a selectable marker, and a suitable site for the insertion of a gene, such as the multiple cloning site. The cloned gene may be transferred from a specialized cloning vector to an expression vector, although it is possible to clone directly into an expression vector. The cloning process is normally performed in Escherichia coli. Vectors used for protein production in organisms other than E. coli may have, in addition to a suitable origin of replication for their propagation in E. coli, elements that allow them to be maintained in another organism; such vectors are called shuttle vectors.
An expression vector must have elements necessary for gene expression. These may include a promoter, the correct translation initiation sequence such as a ribosomal binding site and start codon, a termination codon, and a transcription termination sequence.[2] There are differences in the machinery for protein synthesis between prokaryotes and eukaryotes, so an expression vector must carry the expression elements appropriate for the chosen host. For example, prokaryotic expression vectors have a Shine-Dalgarno sequence at the translation initiation site for the binding of ribosomes, while eukaryotic expression vectors contain the Kozak consensus sequence.
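The host-specific translation initiation signals described above can be illustrated with a minimal Python sketch. It assumes the commonly cited consensus motifs (AGGAGG for Shine-Dalgarno, and a purine at position -3 plus G at +4 around ATG for Kozak); real sites deviate from these consensus forms, and the spacing limits, function names, and example sequences are hypothetical choices made only for illustration.

```python
import re

# Commonly cited consensus motifs; real initiation sites vary considerably.
SHINE_DALGARNO = "AGGAGG"
# Core of the Kozak consensus (gccRccATGG): purine at -3, ATG, then G at +4.
KOZAK_CORE = re.compile(r"[AG]CCATGG")


def has_shine_dalgarno(seq: str, min_gap: int = 4, max_gap: int = 14) -> bool:
    """Return True if an exact AGGAGG lies shortly upstream of any ATG.

    min_gap and max_gap bound the number of nucleotides between the end of
    the motif and the A of ATG. These limits are illustrative assumptions.
    """
    seq = seq.upper()
    for match in re.finditer("ATG", seq):
        atg = match.start()
        lo = max(0, atg - max_gap - len(SHINE_DALGARNO))
        hi = max(0, atg - min_gap)
        if SHINE_DALGARNO in seq[lo:hi]:
            return True
    return False


def has_kozak(seq: str) -> bool:
    """Return True if the sequence contains the core Kozak consensus."""
    return KOZAK_CORE.search(seq.upper()) is not None


# Hypothetical example sequences, not taken from any real vector.
PROKARYOTIC_EXAMPLE = "TTAAGGAGGATATACATATGAAA"
EUKARYOTIC_EXAMPLE = "GCCGCCACCATGGCT"

if __name__ == "__main__":
    for name, s in [("prokaryotic", PROKARYOTIC_EXAMPLE),
                    ("eukaryotic", EUKARYOTIC_EXAMPLE)]:
        print(name, has_shine_dalgarno(s), has_kozak(s))
```

An exact-match search like this is deliberately strict; a practical tool would score partial matches, since functional ribosome binding sites often differ from the consensus.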
The promoter initiates transcription and is therefore the point of control for the expression of the cloned gene. The promoters used in expression vectors are normally inducible, meaning that protein synthesis is initiated only when required, by the introduction of an inducer such as IPTG. Gene expression may, however, also be constitutive (i.e. the protein is constantly expressed) in some expression vectors. A low level of constitutive protein synthesis may occur even in expression vectors with tightly controlled promoters.
After the gene product is expressed, it may be necessary to purify the expressed protein; however, separating the protein of interest from the great majority of host cell proteins can be a protracted process. To make purification easier, a purification tag may be added to the cloned gene. This tag could be a histidine (His) tag, another marker peptide, or a fusion partner such as glutathione S-transferase or maltose-binding protein.[3] Some of these fusion partners may also help to increase the solubility of some expressed proteins. Other fusion partners such as green fluorescent protein may act as a reporter for the identification of successfully cloned genes, or they may be used to study protein expression in cellular imaging.[4] [5]
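As a rough illustration of adding a purification tag to a cloned gene, the following Python sketch inserts six histidine codons in frame immediately before the stop codon of a coding sequence. It is a simplified model under stated assumptions: the function name is hypothetical, the input is assumed to be a complete coding sequence starting with ATG and ending with a stop codon, and no linker or protease cleavage site is added, although real vectors often include them.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
# Six histidine codons; CAT and CAC both encode histidine.
HIS6 = "CAT" * 6


def add_c_terminal_his_tag(cds: str, tag: str = HIS6) -> str:
    """Insert `tag` in frame just before the stop codon of `cds`.

    Assumes `cds` is a full coding sequence (ATG ... stop). Internal stop
    codons are not checked in this sketch.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must begin with ATG")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence must end with a stop codon")
    if len(tag) % 3 != 0:
        raise ValueError("tag length must be a multiple of 3 to stay in frame")
    return cds[:-3] + tag.upper() + cds[-3:]


if __name__ == "__main__":
    # Hypothetical toy gene: Met-Ala-stop.
    print(add_c_terminal_his_tag("ATGGCTTAA"))
```

Placing the tag at the C-terminus is only one option; an N-terminal tag would instead be inserted after the start codon, and which end is suitable depends on the protein.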
The expression vector is transformed or transfected into the host cell for protein synthesis. Some expression vectors may have elements for transformation or for the insertion of DNA into the host chromosome, for example the vir genes for plant transformation and integrase sites for chromosomal integration.
Some vectors may include a targeting sequence that directs the expressed protein to a specific location, such as the periplasmic space of bacteria.
Different organisms may be used to express a gene's target protein, and the expression vector used will therefore have elements specific for use in the particular organism. The most commonly used organism for protein production is the bacterium Escherichia coli. However, not all proteins can be successfully expressed in E. coli, or be expressed with the correct form of post-translational modifications such as glycosylations, and other systems may therefore be used.
The expression host of choice for the expression of many proteins is Escherichia coli as the production of heterologous protein in E. coli is relatively simple and convenient, as well as being rapid and cheap. A large number of E. coli expression plasmids are also available for a wide variety of needs. Other bacteria used for protein production include Bacillus subtilis.
Most heterologous proteins are expressed in the cytoplasm of E. coli. However, not all proteins formed may be soluble in the cytoplasm, and incorrectly folded proteins formed in the cytoplasm can form insoluble aggregates called inclusion bodies. Such insoluble proteins require refolding, which can be an involved process and may not necessarily produce a high yield.[6] Proteins that have disulphide bonds are often unable to fold correctly because the reducing environment of the cytoplasm prevents such bonds from forming; a possible solution is to target the protein to the periplasmic space by the use of an N-terminal signal sequence. Another possibility is to manipulate the redox environment of the cytoplasm.[7] Other more sophisticated systems are also being developed; such systems may allow for the expression of proteins previously thought impossible in E. coli, such as glycosylated proteins.[8] [9] [10]
The promoters used for these vectors are usually based on the promoter of the lac operon or the T7 promoter,[11] and they are normally regulated by the lac operator. These promoters may also be hybrids of different promoters; for example, the tac promoter is a hybrid of the trp and lac promoters.[12] Note that most commonly used lac or lac-derived promoters are based on the lacUV5 mutant, which is insensitive to catabolite repression. This mutant allows protein to be expressed under the control of the lac promoter even when the growth medium contains glucose, which would inhibit gene expression if the wild-type lac promoter were used.[13] The presence of glucose may nevertheless still be used to reduce background expression through residual inhibition in some systems.[14]
Examples of E. coli expression vectors are the pGEX series of vectors, in which glutathione S-transferase is used as a fusion partner and gene expression is under the control of the tac promoter,[15] [16] [17] and the pET series of vectors, which use a T7 promoter.[18]
It is possible to simultaneously express two or more different proteins in E. coli using different plasmids. However, when two or more plasmids are used, each plasmid needs to use a different antibiotic selection as well as a different origin of replication; otherwise one of the plasmids may not be stably maintained. Many commonly used plasmids are based on the ColE1 replicon and are therefore incompatible with each other; in order for a ColE1-based plasmid to coexist with another in the same cell, the other would need to be based on a different replicon, e.g. a p15A replicon-based plasmid such as the pACYC series of plasmids.[19] Another approach would be to use a single two-cistron vector or to design the coding sequences in tandem as a bi- or poly-cistronic construct.[20] [21]
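The compatibility rule for co-expression from multiple plasmids (a different replicon and a different antibiotic selection for each plasmid) can be sketched as a simple pairwise check in Python. The class, function, and plasmid names below are hypothetical, and the resistance markers are placeholder labels rather than properties of any specific commercial vector.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Plasmid:
    name: str
    replicon: str    # origin-of-replication family, e.g. "ColE1" or "p15A"
    resistance: str  # antibiotic used for selection


def conflicts(plasmids):
    """Return (name_a, name_b, reason) for every pair that breaks the rule."""
    problems = []
    for a, b in combinations(plasmids, 2):
        if a.replicon == b.replicon:
            problems.append((a.name, b.name, "same replicon"))
        if a.resistance == b.resistance:
            problems.append((a.name, b.name, "same antibiotic selection"))
    return problems


# Hypothetical plasmids for illustration only.
plasmid_a = Plasmid("plasmid_A", "ColE1", "ampicillin")
plasmid_b = Plasmid("plasmid_B", "p15A", "chloramphenicol")
plasmid_c = Plasmid("plasmid_C", "ColE1", "kanamycin")

if __name__ == "__main__":
    print(conflicts([plasmid_a, plasmid_b]))  # compatible pair
    print(conflicts([plasmid_a, plasmid_c]))  # two ColE1-based plasmids
```

Comparing replicon names as plain strings is a simplification: it treats every differently named replicon as compatible, which a real design would need to verify against known incompatibility groups.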
A yeast commonly used for protein production is Pichia pastoris.[22] Examples of yeast expression vectors in Pichia are the pPIC series of vectors, which use the AOX1 promoter, inducible with methanol.[23] The plasmids may contain elements for the insertion of foreign DNA into the yeast genome and a signal sequence for the secretion of the expressed protein. Proteins with disulphide bonds and glycosylation can be efficiently produced in yeast. Another yeast used for protein production is Kluyveromyces lactis, in which gene expression is driven by a variant of the strong lactase LAC4 promoter.[24]
Saccharomyces cerevisiae is particularly widely used for gene expression studies in yeast, for example in the yeast two-hybrid system for the study of protein-protein interactions.[25] The vectors used in the yeast two-hybrid system contain fusion partners for two cloned genes that allow the transcription of a reporter gene when the two proteins expressed from the cloned genes interact.
Baculovirus, a rod-shaped virus that infects insect cells, is used as the expression vector in the insect cell expression system.[26] Insect cell lines derived from Lepidopterans (moths and butterflies), such as Spodoptera frugiperda, are used as hosts. A cell line derived from the cabbage looper is of particular interest, as it has been developed to grow fast and without the expensive serum normally needed to boost cell growth.[27] [28] The shuttle vector is called a bacmid, and gene expression is under the control of the strong promoter pPolh.[29] Baculovirus has also been used with mammalian cell lines in the BacMam system.
Baculovirus is normally used for the production of glycoproteins, although the glycosylations may be different from those found in vertebrates. In general, it is safer to use than mammalian viruses, as it has a limited host range and does not infect vertebrates without modification.
Many plant expression vectors are based on the Ti plasmid of Agrobacterium tumefaciens.[30] In these expression vectors, the DNA to be inserted into the plant is cloned into the T-DNA, a stretch of DNA that is flanked by a 25-bp direct repeat sequence at either end and can integrate into the plant genome. The T-DNA also contains the selectable marker. The Agrobacterium provides a mechanism for transformation and integration into the plant genome, and the promoters for its vir genes may also be used for the cloned genes. Concerns over the transfer of bacterial or viral genetic material into the plant have, however, led to the development of intragenic vectors, in which functional equivalents from the plant genome are used so that no genetic material from an alien species is transferred into the plant.[31]
Plant viruses may be used as vectors, since the Agrobacterium method does not work for all plants. Examples of plant viruses used are tobacco mosaic virus (TMV), potato virus X, and cowpea mosaic virus.[32] The protein may be expressed as a fusion to the coat protein of the virus and displayed on the surface of assembled viral particles, or as an unfused protein that accumulates within the plant. Expression in plants using plant vectors is often constitutive,[33] and a commonly used constitutive promoter in plant expression vectors is the cauliflower mosaic virus (CaMV) 35S promoter.[34] [35]
Mammalian expression vectors offer considerable advantages over bacterial expression systems for the expression of mammalian proteins: proper folding, post-translational modifications, and relevant enzymatic activity. They may also be more desirable than other eukaryotic non-mammalian systems, in which the expressed proteins may not contain the correct glycosylations. They are of particular use in producing membrane-associated proteins that require chaperones for proper folding and stability and that contain numerous post-translational modifications. The downside, however, is the low yield of product in comparison to prokaryotic vectors, as well as the costly nature of the techniques involved. The complicated technology of mammalian cell expression and its potential contamination with animal viruses have also constrained its use in large-scale industrial production.[36]
Cultured mammalian cell lines such as Chinese hamster ovary (CHO) and COS cells, as well as human cell lines such as HEK and HeLa, may be used to produce protein. Vectors are transfected into the cells, and the DNA may be integrated into the genome by homologous recombination in the case of stable transfection, or the cells may be transiently transfected. Examples of mammalian expression vectors include the adenoviral vectors,[37] the pSV and the pCMV series of plasmid vectors, vaccinia and retroviral vectors,[38] as well as baculovirus.[39] The promoters for cytomegalovirus (CMV) and SV40 are commonly used in mammalian expression vectors to drive gene expression. Non-viral promoters, such as the elongation factor (EF)-1 promoter, are also known.[40]
In cell-free systems, E. coli cell lysate containing the cellular components required for transcription and translation is used for in vitro protein production. The advantage of such a system is that protein may be produced much faster than in vivo, since no time is needed to culture the cells, but it is also more expensive. Vectors used for E. coli expression can be used in this system, although vectors specifically designed for it are also available. Eukaryotic cell extracts may also be used in other cell-free systems, for example the wheat germ cell-free expression system.[41] Mammalian cell-free systems have also been produced.[42]
Expression vectors in an expression host are now the usual means by which laboratories produce proteins for research. Most proteins are produced in E. coli, but for glycosylated proteins and those with disulphide bonds, yeast, baculovirus and mammalian systems may be used.
Most protein pharmaceuticals are now produced through recombinant DNA technology using expression vectors. These peptide and protein pharmaceuticals may be hormones, vaccines, antibiotics, antibodies, and enzymes.[43] The first human recombinant protein used for disease management, insulin, was introduced in 1982.[43] Biotechnology allows these peptide and protein pharmaceuticals, some of which were previously rare or difficult to obtain, to be produced in large quantity. It also reduces the risks of contaminants such as host viruses, toxins and prions. Examples from the past include prion contamination in growth hormone extracted from pituitary glands harvested from human cadavers, which caused Creutzfeldt–Jakob disease in patients receiving treatment for dwarfism,[44] and viral contaminants in clotting factor VIII isolated from human blood that resulted in the transmission of viral diseases such as hepatitis and AIDS.[45] [46] Such risk is reduced or removed completely when the proteins are produced in non-human host cells.
In recent years, expression vectors have been used to introduce specific genes into plants and animals to produce transgenic organisms; in agriculture, for example, they are used to produce transgenic plants. Expression vectors have been used to introduce a vitamin A precursor, beta-carotene, into rice plants; the product is called golden rice. The process has also been used to introduce into plants a gene that produces an insecticide, Bacillus thuringiensis toxin or Bt toxin, which reduces the need for farmers to apply insecticides since it is produced by the modified organism. In addition, expression vectors are used to extend the ripeness of tomatoes by altering the plant so that it produces less of the chemical that causes the tomatoes to rot.[47] Using expression vectors to modify crops has been controversial because of possible unknown health risks, the possibility of companies patenting certain genetically modified food crops, and ethical concerns. Nevertheless, the technique is still being used and heavily researched.
Transgenic animals have also been produced to study animal biochemical processes and human diseases, or used to produce pharmaceuticals and other proteins. They may also be engineered to have advantageous or useful traits. Green fluorescent protein is sometimes used as a tag, which results in animals that can fluoresce, and this has been exploited commercially to produce the fluorescent GloFish.
Gene therapy is a promising treatment for a number of diseases, in which a "normal" gene carried by the vector is inserted into the genome to replace an "abnormal" gene or to supplement the expression of a particular gene. Viral vectors are generally used, but nonviral methods of delivery are being developed. The treatment is still a risky option because the viral vector used can cause ill effects, for example by giving rise to insertional mutations that can result in cancer.[48] [49] However, there have been promising results.[50] [51]