Homologous recombination, also known as general recombination, is a type of genetic recombination in which nucleotide sequences are exchanged between two similar or identical strands of DNA. The process involves several steps of physical breaking and the eventual rejoining of DNA. Although most widely used to accurately repair double-strand breaks in DNA, homologous recombination also produces new combinations of DNA sequences during chromosomal crossover in meiosis. These new combinations of DNA represent genetic variation (e.g. new, possibly beneficial, combinations of alleles), which allows populations to evolutionarily adapt to changing environmental conditions over time.
There are two different types of homologous recombination, one typically involved in DNA repair during mitosis and another involved in meiosis. They share the same initial steps: after a double-strand break occurs, sections of DNA around the break on the 5' end of the damaged chromosome are removed in a process called resection. In the strand invasion step that follows, an overhanging 3' end of the damaged chromosome then "invades" an undamaged homologous chromosome. A Holliday junction is formed between the two chromosomes after strand invasion. In the meiotic pathway, a second Holliday junction forms. Depending on how the two junctions are resolved (i.e., cut), the meiotic version results in either chromosomal crossover or non-crossover.
Homologous recombination is widely conserved across all three domains of life, suggesting that it is a fundamental biological mechanism. The discovery of genes for homologous recombination in protists has been interpreted as evidence for an early eukaryotic origin of meiosis. Because their dysfunction has been strongly associated with increased susceptibility to several types of cancer, the proteins that facilitate homologous recombination are topics of active research. Homologous recombination is also used as a technique in molecular biology for introducing genetic changes into target organisms. The development of gene targeting techniques that rely on homologous recombination was the subject of the 2007 Nobel Prize in Physiology or Medicine.
Homologous recombination is a major DNA repair process in bacteria. It is also important for producing genetic diversity in bacterial populations, although the process differs substantially from meiotic recombination, which brings about diversity in eukaryotic genomes. Homologous recombination has been most studied and is best understood in Escherichia coli. Two well-known versions of the pathway are the RecBCD pathway, which aids in the repair of double-strand breaks in DNA, and the RecF pathway, which promotes repair of single-strand breaks. The RecBCD pathway is used in DNA repair to restart replication forks that have been stalled or damaged, and to regulate gene expression.
The RecBCD pathway is the main recombination pathway used in bacteria to repair double-strand breaks in DNA. The RecBCD enzyme initiates recombination by binding to a blunt or nearly blunt end of a break in double-strand DNA. After RecBCD binds with DNA, the RecB and RecD subunits begin unzipping the DNA duplex through helicase activity driven by ATP hydrolysis. The same subunits then endonucleolytically cleave the single strands of DNA that emerge from the unzipping process, with RecB cutting the 3' strand more frequently than RecD cuts the 5' strand. This unzipping and cleaving continues until RecBCD encounters a specific nucleotide sequence (5'-GCTGGTGG-3') known as a chi site.
Upon encountering a chi site, the activity of the RecBCD enzyme changes drastically. The enzyme first pauses for a few seconds, then resumes movement at roughly half its initial speed. The RecD subunit increases its cleavage rate, leaving the 5' strand even more fragmented, while the RecB subunit stops cutting altogether, leaving an intact, overhanging 3' strand upstream of the chi site. Recognition of the chi site also changes the RecBCD enzyme so that it begins loading multiple RecA proteins onto the 3' overhang. The resulting RecA-coated nucleoprotein filament then searches for similar sequences of DNA on a homologous chromosome. When it detects such a region in another DNA molecule, the RecA-coated filament moves into the homologous DNA duplex in a process called strand invasion. The invading 3' overhang causes one of the strands of the recipient DNA duplex to be displaced. The result is a cross-shaped structure called a Holliday junction.
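As an illustration of the sequence-recognition step only, the following minimal sketch scans a DNA string for the chi sequence 5'-GCTGGTGG-3' given above. It reports hits on both strands, since the site is written in a specific 5'→3' orientation. The function names and the toy sequence are hypothetical and introduced here for illustration; the sketch does not model RecBCD translocation, pausing, or nuclease activity.

```python
CHI = "GCTGGTGG"  # chi site written 5'->3', as given in the text
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def find_all(seq, motif):
    """Return every start index of motif in seq, allowing overlapping hits."""
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i)
        i = seq.find(motif, i + 1)
    return hits


def find_chi_sites(seq):
    """Return sorted (index, strand) pairs for chi sites in seq.

    Indices are 0-based positions on the given (top) strand. "+" means
    the chi sequence reads 5'->3' on the top strand; "-" means it reads
    5'->3' on the complementary strand.
    """
    seq = seq.upper()
    plus = [(i, "+") for i in find_all(seq, CHI)]
    minus = [(i, "-") for i in find_all(seq, reverse_complement(CHI))]
    return sorted(plus + minus)


# Made-up toy sequence with one chi site on each strand.
toy = "AAGCTGGTGGTTCCACCAGCAA"
print(find_chi_sites(toy))  # [(2, '+'), (12, '-')]
```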
Bacteria use the RecF pathway of homologous recombination, also called the RecFOR pathway, to repair single-strand breaks in DNA. When the RecBCD pathway is inactivated by mutations, the RecF pathway can also substitute in repairing DNA double-strand breaks. The RecF pathway is much less well understood than the RecBCD pathway. Both pathways require RecA for strand invasion, and they are similar in their phases of branch migration and resolution.
The RecF pathway begins when RecJ, an exonuclease that cleaves single-stranded DNA in the 5' → 3' direction, binds to the 5' end of a single-strand break in DNA and starts moving along the strand in that direction while cleaving it. Although RecJ can function without them, single-strand binding protein (SSBP) and RecQ helicase greatly increase how much of the 5' strand is resected (i.e., cut back). When present, SSBP binds to the 3' overhang left after RecJ's resection of the 5' strand. This ensures that the single-stranded 3' overhang does not fold into secondary structures through self-complementarity.
RecA can be loaded onto the SSBP-coated 3' overhang by either of two distinct pathways: one that requires the RecFOR enzyme and one that requires the RecOR enzyme. In the RecFOR pathway, the RecFR complex binds where the single-strand DNA of the 3' overhang meets the double-strand DNA. RecO then displaces SSBP from the ssDNA, although SSBP remains attached to RecO. The RecR subunit in RecFR then interacts with RecO to form the RecFOR complex. In doing so, the RecR subunit helps both to detach the SSBP molecules from RecO and to load molecules of the RecA protein onto the 3' overhang, with loading beginning at the recessed 5' end of the ssDNA-dsDNA junction.
The RecOR pathway of RecA loading differs from the RecFOR pathway in several respects, most notably its molecular interaction requirements and its ideal DNA substrate. Unlike the RecFOR pathway, the RecOR pathway requires an interaction between RecO and the C-terminus of SSBP. The RecOR pathway also does not need a ssDNA-dsDNA junction to begin loading RecA onto the 3' overhang, whereas the RecFOR pathway typically needs one to work efficiently. Thus, under most conditions the RecOR pathway is more efficient than the RecFOR pathway at loading RecA.
Immediately after strand invasion, the Holliday junction moves along the linked DNA in a process called branch migration. It is in this movement of the Holliday junction that base pairs between the two homologous DNA duplexes are exchanged. To catalyze branch migration, the RuvA protein first recognizes and binds to the Holliday junction and recruits RuvB to form the RuvAB complex. Two rings of RuvB, a hexameric, ring-shaped ATPase, are loaded onto opposite sides of the Holliday junction, where they act as twin pumps that provide the force for branch migration. Two tetramers of RuvA remain situated in the square-shaped center of the Holliday junction such that the DNA at the junction is sandwiched between the two RuvA tetramers. The strands of both DNA duplexes are unwound on the surface of RuvA as they are guided by the protein from one duplex to the other.
In the resolution phase of recombination, any Holliday junctions formed by the strand invasion process are cut, thereby restoring two separate DNA molecules. This cleavage is carried out by the RuvAB complex interacting with RuvC, which together form the RuvABC complex. RuvC is an endonuclease that cuts the degenerate sequence 5'-(A/T)TT(G/C)-3', which is found frequently in DNA (about once every 64 nucleotides). Before cutting, RuvC likely gains access to the Holliday junction by displacing one of the two RuvA tetramers covering the DNA there. Recombination results in either "splice" or "patch" products, depending on how RuvC cleaves the Holliday junction. Splice products are crossover products, in which there is a reassortment of the genetic material that flanks the site of recombination. Patch products, on the other hand, are non-crossover products in which there is no such reassortment and there is only a "patch" of hybrid DNA in the recombination product.
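The figure of about one RuvC site every 64 nucleotides can be checked with a short calculation. The sketch below assumes, purely for illustration, that the four bases are equally frequent and that positions are independent; real genomes deviate from both assumptions. The function and variable names are hypothetical.

```python
from fractions import Fraction


def motif_probability(allowed_bases_per_position, base_freq=Fraction(1, 4)):
    """Probability that a random position starts a match to a degenerate motif.

    allowed_bases_per_position lists, for each motif position, the bases
    accepted there. Assumes equal base frequencies and independent positions.
    """
    p = Fraction(1)
    for allowed in allowed_bases_per_position:
        p *= len(set(allowed)) * base_freq
    return p


# 5'-(A/T)TT(G/C)-3': two choices, one, one, two choices.
RUVC_SITE = ["AT", "T", "T", "GC"]

p = motif_probability(RUVC_SITE)
print(p)      # 1/64
print(1 / p)  # 64, the expected spacing in nucleotides under these assumptions
```

The product is (1/2)(1/4)(1/4)(1/2) = 1/64, which matches the spacing quoted in the text.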
Homologous recombination is an important method of integrating donor DNA into a recipient organism's genome in horizontal gene transfer, the process by which an organism incorporates foreign DNA from another organism without being the offspring of that organism. Homologous recombination requires incoming DNA to be highly similar to the recipient genome, and so horizontal gene transfer is usually limited to similar bacteria. Studies in several bacteria have established that there is a log-linear decrease in recombination frequency with increasing sequence divergence between donor and recipient DNA.
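A log-linear relationship means that the logarithm of recombination frequency falls in a straight line as sequence divergence rises, so each additional increment of divergence reduces the frequency by the same factor. The sketch below illustrates that shape only. The baseline frequency `f0` and the `slope` are placeholder values chosen for illustration, not measurements from any study.

```python
def recombination_frequency(divergence, f0=1.0, slope=10.0):
    """Relative recombination frequency under a log-linear model.

    Model: log10(frequency) = log10(f0) - slope * divergence

    divergence is a fraction (0.05 means 5% sequence difference).
    f0 and slope are hypothetical placeholders, not fitted parameters.
    """
    return f0 * 10 ** (-slope * divergence)


for d in (0.0, 0.05, 0.10, 0.15, 0.20):
    print(f"{d:.2f} -> {recombination_frequency(d):.4g}")
```

With these placeholder parameters, every additional 10% of divergence lowers the predicted frequency tenfold; a different slope changes the factor but not the log-linear form.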
In bacterial conjugation, where DNA is transferred between bacteria through direct cell-to-cell contact, homologous recombination helps integrate foreign DNA into the host genome via the RecBCD pathway. The RecBCD enzyme promotes recombination after DNA is converted from single-strand DNA (the form in which it originally enters the bacterium) to double-strand DNA during replication. The RecBCD pathway is also essential for the final phase of transduction, a type of horizontal gene transfer in which DNA is transferred from one bacterium to another by a virus. Foreign bacterial DNA is sometimes misincorporated in the capsid head of bacteriophage virus particles as DNA is packaged into new bacteriophages during viral replication. When these new bacteriophages infect other bacteria, DNA from the previous host bacterium is injected into the new bacterial host as double-strand DNA. The RecBCD enzyme then incorporates this double-strand DNA into the genome of the new bacterial host.
Homologous recombination is essential to mitosis and meiosis in most eukaryotic cells. In mitosis, homologous recombination repairs double-strand breaks in DNA caused by ionizing radiation or DNA-damaging chemicals. Left unrepaired, these double-strand breaks can cause large-scale rearrangement of chromosomes in somatic cells, which can in turn lead to cancer.
In meiosis, homologous recombination facilitates chromosomal crossover during prophase I. Meiotic homologous recombination begins when the Spo11 protein makes a programmed double-strand break in DNA. These double-strand breaks often occur at recombination hotspots, regions of chromosomes 1,000–2,000 base pairs long that have high rates of recombination. The absence of a recombination hotspot between two genes on the same chromosome often implies that those genes will be inherited together by future generations; that is, there will be higher linkage between the two genes than would be expected from independently assorting genes. The shuffling of genetic material between parental chromosomes that results from crossover is an important source of genetic diversity in subsequent generations.
Double-strand breaks can be repaired through homologous recombination or non-homologous end joining (NHEJ). NHEJ is a DNA repair mechanism which, unlike homologous recombination, does not require a long homologous sequence to guide repair. Whether homologous recombination or NHEJ is used to repair double-strand breaks is largely determined by the phase of the cell cycle. Homologous recombination is upregulated in the S and G2 phases of the cell cycle, when sister chromatids are readily available. Compared to homologous chromosomes, which are similar to another chromosome but often have different alleles, sister chromatids are an ideal template for homologous recombination because they are an identical copy of a given chromosome. NHEJ is predominant in the G1 phase and downregulated thereafter, but maintains at least some activity throughout the cell cycle. This cell-cycle-dependent regulation of homologous recombination and NHEJ varies widely between species. Cyclin-dependent kinases (CDKs) are especially important regulators of homologous recombination in both budding yeast and mammals, though their precise mechanism of regulation differs between the two taxa. Upon entry into S phase in budding yeast, the Cdc28 CDK initiates homologous recombination by phosphorylating the Sae2 protein.
Two primary models for how homologous recombination aids double-strand break repair in DNA are the DSBR pathway (sometimes called the double Holliday junction model) and the synthesis-dependent strand annealing (SDSA) pathway. The two pathways are similar in their first several steps. After a double-strand break occurs, a heterotrimeric protein known as the MRX complex (MRN complex in humans) binds to DNA flanking either side of the break. Next a "resection" of the double-strand break is carried out, in which DNA immediately upstream (i.e., toward the 5' end) of the double-strand break is removed on each strand of the DNA duplex. In the first step of resection, the MRX complex recruits Sae2, and the two together trim back the 5' ends on either side of the break to create short 3' overhangs of single-strand DNA. In the second step, 5'→3' resection is continued by the Sgs1 helicase (a homolog of the bacterial RecQ helicase discussed above) and the Exo1 and Dna2 nucleases.
RPA, a protein with high affinity for ssDNA, initially binds the 3' overhangs. With the help of several mediator proteins, Rad51 (and Dmc1, in meiosis) then forms a nucleoprotein filament on the RPA-coated ssDNA and begins searching for DNA sequences similar to that of the 3' overhang. Upon finding such a sequence, the single-stranded nucleoprotein filament moves into (invades) a homologous chromosome in a process called strand invasion. A displacement loop (D-loop) is formed during strand invasion between the invading 3' overhang strand and the homologous chromosome. After strand invasion, a DNA polymerase extends the invading 3' strand, changing the D-loop into a more prominently cruciform structure known as a Holliday junction. Following this, DNA synthesis occurs on the invading strand (i.e., one of the original 3' overhangs), effectively restoring the strand on the homologous chromosome that was displaced during strand invasion.
After the stages of resection, strand invasion and DNA synthesis outlined above, the DSBR and SDSA pathways become distinct. The DSBR pathway is unique in that the second 3' overhang (which was not involved in strand invasion) also forms a Holliday junction with the homologous chromosome. The double Holliday junctions are then converted into recombination products by nicking endonucleases, a type of restriction endonuclease which cleaves only one DNA strand. Although it was once thought to result in either crossover or non-crossover in recombinant chromosomes, several genetics studies have suggested that the DSBR pathway results predominantly in crossover recombination. Because the DSBR pathway often results in chromosomal crossover, it is the primary model of how homologous recombination occurs during meiosis.
Whether recombination in the DSBR pathway results in chromosomal crossover is determined by how the double Holliday junction is resolved. If the two Holliday junctions are cleaved on the crossing strands (along the black arrowheads at both Holliday junctions in the accompanying figure), then chromosomes without crossover will be produced. Alternatively, chromosomal crossover will occur if one Holliday junction is cleaved on the crossing strand and the other Holliday junction is cleaved on the non-crossing strand (i.e., along the black arrowheads at one Holliday junction and along the orange arrowheads at the other in the figure).
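The resolution rule can be stated compactly: the outcome depends on whether the two junctions are cut in the same or in different orientations. The sketch below encodes the two cases described in the text. The case in which both junctions are cut on the non-crossing strands is not covered by the text; it is treated here as non-crossover on the assumption that any same-orientation resolution behaves alike. The names `Cut` and `dsbr_outcome` are hypothetical.

```python
from enum import Enum


class Cut(Enum):
    """Which pair of strands is cleaved at a Holliday junction."""
    CROSSING = "crossing"
    NON_CROSSING = "non-crossing"


def dsbr_outcome(junction_1, junction_2):
    """Return the product type after resolving a double Holliday junction.

    From the text: both junctions cut on crossing strands -> non-crossover;
    one crossing and one non-crossing -> crossover.
    Assumption (not stated in the text): both junctions cut on non-crossing
    strands is also a same-orientation resolution and gives non-crossover.
    """
    if junction_1 is junction_2:
        return "non-crossover"
    return "crossover"


print(dsbr_outcome(Cut.CROSSING, Cut.CROSSING))      # non-crossover
print(dsbr_outcome(Cut.CROSSING, Cut.NON_CROSSING))  # crossover
```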
Homologous recombination via the SDSA pathway occurs in mitosis and results in non-crossover (NCO) products. In this model, movement of the Holliday junction down the DNA strand (a process called branch migration) ends with the release of the extended invading strand. The newly synthesized 3' end of the invading strand is then able to anneal to the other original 3' overhang in the damaged chromosome through complementary base pairing. SDSA is completed with the removal of 3' flaps left over after annealing and the ligation of any remaining single-stranded gaps.
The single-strand annealing (SSA) pathway of homologous recombination repairs double-strand breaks between two repeat sequences. The pathway is relatively simple in concept: after 5'→3' resection on both strands of the DNA duplex, the two resulting 3' overhangs align and, with the help of the Rad52 protein, anneal to each other. After annealing is complete, leftover, non-homologous ends of the 3' overhangs are digested away by the Rad1/Rad10 nuclease working in conjunction with Slx4 and Saw1. New DNA synthesis fills in any gaps, and ligation restores the DNA duplex as two continuous strands. The DNA sequence between the repeats is always lost, as is one of the two repeats. The SSA pathway is considered mutagenic since it results in such deletions of genetic material.
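The net effect of SSA on the sequence can be shown with a string-level sketch: given a break between two copies of a repeat, the product keeps one copy of the repeat and loses everything between the two copies. This is a deliberately simplified abstraction that treats one strand as a text string and ignores resection, annealing kinetics, and flap removal. The function name and the toy sequence are hypothetical.

```python
def single_strand_annealing(seq, repeat, break_pos):
    """Return the sequence left after SSA repair of a break at break_pos.

    Uses the nearest copy of repeat lying entirely to the left of the break
    and the nearest copy starting at or after the break. The product keeps
    one repeat copy and drops the other copy plus the intervening sequence.
    """
    left = seq.rfind(repeat, 0, break_pos)  # copy fully before the break
    right = seq.find(repeat, break_pos)     # copy at or after the break
    if left == -1 or right == -1:
        raise ValueError("SSA needs a repeat copy on each side of the break")
    return seq[:left + len(repeat)] + seq[right + len(repeat):]


# Made-up example: flank + repeat + spacer + repeat + flank.
toy = "AAA" + "GATTACA" + "CCCC" + "GATTACA" + "TTT"
repaired = single_strand_annealing(toy, "GATTACA", break_pos=12)
print(repaired)                   # AAAGATTACATTT
print(len(toy) - len(repaired))   # 11 bases lost: spacer (4) + one repeat (7)
```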
During DNA replication, double-strand breaks can sometimes be encountered at replication forks as DNA helicase unzips the template strand. These defects are repaired in the break-induced replication (BIR) pathway of homologous recombination. The precise molecular mechanisms of the BIR pathway remain unclear. Three proposed mechanisms have strand invasion as an initial step, but differ in how they model the migration of the D-loop and later post-synaptic phases.
The BIR pathway can also help to maintain the length of telomeres, regions of DNA at the end of eukaryotic chromosomes, in the absence of (or in cooperation with) telomerase. Without working copies of the telomerase enzyme, telomeres typically shorten with each round of mitosis, which eventually blocks cell division and leads to senescence. In budding yeast cells where telomerase has been inactivated through mutations, two types of "survivor" cells have been observed to avoid senescence longer than expected by elongating their telomeres through BIR pathways. Both Type I and Type II survivor cells require the Rad52 protein. Type I survivor cells also require Rad51, and tend to arise more frequently but grow at a slower rate than Type II survivors, which do not require Rad51.
Maintaining telomere length is critical for cell immortalization, a key feature of cancer. Most cancers maintain telomeres by upregulating telomerase. However, in several types of human cancer, a BIR-like pathway helps to sustain some tumors by acting as an alternative mechanism of telomere maintenance. This fact has led scientists to investigate whether such recombination-based mechanisms of telomere maintenance could thwart anti-cancer drugs like telomerase inhibitors.
Without proper homologous recombination, chromosomes often align incorrectly for the first phase of cell division in meiosis. This causes chromosomes to fail to segregate properly in a process called nondisjunction, which leads to gametes with either an extra or a missing chromosome, a condition known as aneuploidy. Conditions like infertility and trisomy 21, the cause of Down syndrome, are among the many abnormalities that ultimately result from failures of homologous recombination in meiosis. The process is therefore monitored by the meiotic recombination checkpoint.
Deficiencies in homologous recombination have been strongly linked to cancer formation in humans. For example, each of the cancer-related diseases Bloom's syndrome, Werner's syndrome and Rothmund-Thomson syndrome is caused by malfunctioning copies of RecQ helicase genes involved in the regulation of homologous recombination: BLM, WRN and RECQ4, respectively. In the cells of Bloom's syndrome patients, who lack a working copy of the BLM protein, there is an elevated rate of homologous recombination. Experiments done in mice deficient in BLM have suggested that the mutation gives rise to cancer through a loss of heterozygosity caused by increased homologous recombination.
Decreased rates of homologous recombination can also lead to cancer. This is the case with BRCA1, a tumor suppressor gene whose malfunctioning has been prominently associated with increased susceptibility to breast and ovarian cancer. Cells missing BRCA1 were shown to have a fivefold decrease in homologous recombination events and increased sensitivity to ionizing radiation (indicating more unrepaired double-strand breaks in DNA). Reintroducing BRCA1 simultaneously increased homologous recombination events and decreased sensitivity to ionizing radiation. Facilitating homologous recombination is the only known function of a closely related gene, BRCA2. Its large protein product, the 3418-amino-acid BRCA2 protein, aids homologous recombination by binding to single-stranded DNA and providing a platform for the extension of the RAD51 filament. This filament formation is an important step in the initiation of homologous recombination, and cells made deficient in this process by mutant copies of the BRCA2 protein were shown to have a phenotype similar to that of BRCA1 mutants: decreased homologous recombination and increased sensitivity to radiation.
Based on the similarity of their amino acid sequences, sets of proteins involved in homologous recombination (HR) are thought to share common evolutionary origins. One such set of HR-related proteins is the RecA/Rad51 protein family, which includes the RecA protein from bacteria and homologous proteins in archaea (RadA and RadB) and eukaryotes (Rad51, Rad51B, Rad51C, Rad51D, Dmc1, XRCC2 and XRCC3). All of these proteins share a conserved region of approximately 230 amino acids in length, known as the RecA/RAD51 domain. Within this protein domain are two sequence motifs, Walker A and Walker B, which confer ATP hydrolysis activity to the protein products of all members of the RecA/Rad51 gene family.
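Sequence motifs such as Walker A are typically located by pattern matching. The sketch below uses the commonly cited Walker A consensus, G-x(4)-G-K-[S/T], which is not given in the text above and is introduced here as background. The peptide is a made-up toy sequence, not the sequence of any particular RecA/Rad51 family member, and a pattern match alone does not demonstrate ATPase activity.

```python
import re

# Commonly cited Walker A (P-loop) consensus: G, any four residues, G, K, S or T.
WALKER_A = re.compile(r"G.{4}GK[ST]")


def find_walker_a(protein):
    """Return (start_index, matched_text) for each Walker A-like match.

    Indices are 0-based. Matches are non-overlapping, as returned by
    re.finditer.
    """
    return [(m.start(), m.group()) for m in WALKER_A.finditer(protein.upper())]


# Hypothetical toy peptide used only to exercise the pattern.
toy_peptide = "MKLAGPESSGKTQLAV"
print(find_walker_a(toy_peptide))  # [(4, 'GPESSGKT')]
```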
Phylogenetic trees indicate that RadA, Rad51 and Dmc1 are members of the same monophyletic group, implying that they share a common ancestor. Within this protein family, Rad51 and Dmc1 are grouped together in a separate clade from RadA. An ancient gene duplication event of a eukaryotic RecA gene has been proposed as a likely origin of the modern RAD51 and DMC1 genes. One of the bases for grouping these three proteins together is that they all possess a modified helix-turn-helix motif, which confers DNA-binding activity, toward their N-terminal ends.
The discovery of Dmc1 in several species of Giardia, one of the earliest protists to diverge as a eukaryote, suggests that meiotic homologous recombination (and thus meiosis itself) emerged very early in eukaryotic evolution.
In addition to research on Dmc1, molecular and phylogenetic analyses of the Spo11 protein and its homologs have also provided information on the origins of meiotic recombination. Spo11 is a type II topoisomerase that catalyzes the double-strand breaks necessary to initiate homologous recombination in meiosis. Phylogenetic trees constructed from inferred protein sequences of Spo11 gene homologs in animals, fungi, some plants, protists and archaea suggest that eukaryotic Spo11 emerged in the last common ancestor of eukaryotes and archaea.
Many methods for introducing DNA sequences into organisms to create recombinant DNA and genetically modified organisms use the process of homologous recombination. Also called gene targeting, the method is especially common in yeast and mouse genetics. The gene targeting method in knockout mice uses mouse embryonic stem cells to deliver artificial genetic material (mostly of therapeutic interest), which represses the target gene of the mouse by the principle of homologous recombination. The mouse thereby acts as a working model to understand the effects of a specific mammalian gene. In recognition of their discovery of how homologous recombination can be used to introduce genetic modifications in mice through embryonic stem cells, Mario Capecchi, Martin Evans and Oliver Smithies were awarded the 2007 Nobel Prize in Physiology or Medicine.
Protein engineering with homologous recombination develops chimeric proteins by swapping fragments between two parental proteins. These techniques exploit the fact that recombination can introduce a high degree of sequence diversity while preserving a protein's ability to fold into its tertiary structure. This stands in contrast to other protein engineering techniques, like random point mutagenesis, in which the probability of maintaining protein function declines exponentially with increasing amino acid substitutions. The chimeras produced by recombination techniques are able to maintain their ability to fold because their swapped parental fragments are structurally and evolutionarily conserved. These recombinable structural "building blocks" preserve energetically important residue-residue interactions, such as structural contacts. Methods like SCHEMA and statistical coupling analysis can be used to identify structural subunits suitable for recombination.
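The fragment-swapping idea can be sketched as a small enumeration: split two aligned parent sequences at the same crossover positions and assemble every combination of blocks. The sketch assumes the parents are already aligned and of equal length, and the crossover positions are arbitrary placeholders. It does not implement SCHEMA or statistical coupling analysis, which are the methods the text names for choosing where to cut. All names are hypothetical.

```python
from itertools import product


def make_chimeras(parent_a, parent_b, crossovers):
    """Enumerate chimeras built from blocks of two aligned parent sequences.

    crossovers are 0-based cut positions shared by both parents.
    Returns a dict mapping a block pattern such as "ABA" to its sequence,
    where each letter says which parent supplied that block.
    Assumes the parents are pre-aligned and equal in length.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to the same length")
    bounds = [0, *sorted(crossovers), len(parent_a)]
    blocks = [(parent_a[s:e], parent_b[s:e]) for s, e in zip(bounds, bounds[1:])]
    chimeras = {}
    for pattern in product("AB", repeat=len(blocks)):
        seq = "".join(
            block[0] if source == "A" else block[1]
            for block, source in zip(blocks, pattern)
        )
        chimeras["".join(pattern)] = seq
    return chimeras


# Toy parents made of repeated letters so block origin is easy to see.
library = make_chimeras("AAAAAA", "BBBBBB", crossovers=[2, 4])
print(len(library))    # 8 sequences from 3 blocks, including both parents
print(library["ABA"])  # AABBAA
```

With n blocks the library holds 2^n sequences, two of which are the unmodified parents.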
Techniques that rely on homologous recombination have been used to engineer new proteins. In a study published in 2007, researchers were able to create chimeras of two enzymes involved in the biosynthesis of isoprenoid compounds. The chimeric proteins acquired an ability to catalyze an essential reaction in isoprenoid biosynthesis that was absent in the parent proteins. Cytochrome P450 proteins generated by recombinational protein engineering have also been shown to accept substrates not accepted by the parent proteins, demonstrating significant functional diversification in the engineered chimeras.