A large part of the genome is repressed in animal cells. This expression profile is set up during early development through a number of different mechanisms, including site-specific repression complexes and global DNA methylation which probably work by generating inaccessible chromatin structures. This overall pattern is then largely maintained throughout development. Early lineage commitment is associated with the turning off of pluripotency genes through programmed heterochromatinization, with DNA methylation providing long-term stability and inhibiting somatic cell reprogramming.
In many simple organisms with limited potential for differentiation, almost all genes are programmed to be expressed. Repression is usually limited to a small number of gene loci that must be specifically identified in order to recruit the silencing machinery. A good example of this is E. coli, where most genes are constitutively active in dividing cells, with a relatively small number of genes being turned off. The β-gal locus, for example, has a binding site for a constitutive repressor that lowers its expression, and only when these bacteria are put in media containing lactose as the carbon source is this repression released. In higher organisms, on the other hand, somatic cells actively express less than 50% of their genome, with many other genes being held in a silent conformation. Considering the large number of these genes, it is obvious that this silencing cannot be accomplished by individual site-specific repressor molecules and that more global mechanisms must be involved in this process. Indeed, it is very possible that epigenetic repression is an essential element of the ability of multicellular organisms to carry out lineage-specific differentiation.
2. Establishment of DNA methylation patterns
One of the main mechanisms for gene repression in mammals is DNA methylation at CpG residues. In somatic cells, there appears to be a bimodal pattern of modification, with many areas of the genome being highly methylated, while CpG islands are in a constitutively unmethylated state. Many experiments both in vitro and in vivo have demonstrated that DNA methylation represses transcription, and this is accomplished by affecting both local histone modification patterns and other aspects of chromatin structure that influence gene accessibility (Lande-Diner and Cedar, 2005). A close analysis of DNA methylation as a function of development clearly indicates that this modification plays a key role in defining the potential of cells to differentiate or undergo reprogramming.
As is the case in all somatic cell types, germ-line lineages are characterized by a bimodal pattern of DNA methylation. Examination of cells from a blastula, however, reveals that the DNA of each newly created embryo is highly unmethylated (Kafri et al., 1992; Monk et al., 1987). This process probably begins on the paternal genome through active demethylation (Mayer et al., 2000; Oswald et al., 2000) and then encompasses the maternal DNA as well, perhaps through passive demethylation ensuing from early cycles of replication and cell division (Reik, 2007). Although the actual function of this demethylation process is not really understood, the logic appears to imply that this is some sort of erasing mechanism that clears many of the epigenetic marks characteristic of differentiated cells and makes possible the rebuilding of a new developmental program from pluripotent cells.
This relatively unmethylated state continues until about the time of implantation, when there is a wave of de novo methylation catalyzed by two key enzymes, Dnmt3a and Dnmt3b (Okano et al., 1999). Although this methylation is carried out in a global manner, CpG islands are protected from this process and therefore remain unmethylated. The exact mechanism for this protection has not been completely worked out, but it appears to require cis-acting sequences that represent common elements in many islands (Brandeis et al., 1994; Macleod et al., 1994). One good example of this is the Sp1-like motif cluster located in the hamster Aprt promoter region. Interestingly, a short fragment containing these elements was found capable of protecting 200–300 nucleotides of non-CpG island DNA from de novo methylation in transgenic animals. Furthermore, removal of these elements from a natural Aprt construct actually caused this island region to become methylated in vivo (Siegfried et al., 1999).
It thus appears that these cis-acting sequences are both necessary and sufficient for protecting CpG islands from de novo methylation at the time of implantation. In light of the fact that many CpG island-containing genes are tissue specific and thus not productively transcribed at this early stage in development, it is possible that these Sp1-like sequences mediate methylation protection independently of their role in transcription. Alternatively, undermethylation may be induced by the presence of the transcription machinery that is stabilized by interaction with strong transcription factor binding sites such as Sp1, even though not all of these loci actually produce full transcripts (Guenther et al., 2007). In this case, the methylation pattern generated at the time of implantation may actually reflect, and thus perpetuate, the transcription profile prior to implantation when the entire genome is relatively unmethylated.
Although the precise mechanism for protection of CpG islands from DNA methylation is not known, an attractive model proposes that this is carried out through the protein Dnmt3L, which has been shown to be part of a multi-protein unit together with the de novo methylases, Dnmt3a and Dnmt3b (Jia et al., 2007; Ooi et al., 2007). It appears that this complex is targeted to DNA by virtue of the fact that Dnmt3L can bind to unmethylated lysine 4 residues in histone H3 (Ooi et al., 2007). According to this model, potential transcriptional start sites in the genome can bind RNA polymerase II, which then recruits SET-domain proteins that methylate H3K4 on nucleosomes overlying this region of DNA (Guenther et al., 2007). Since the presence of methyl groups on H3K4 inhibits the direct binding of Dnmt3L, the entire de novo methylase complex would not be able to operate in this region. Thus, while almost the entire genome is amenable to de novo methylation, CpG islands harboring active RNA polymerase molecules would be protected from this process.
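The logic of this model can be summarized as a simple decision rule. The following Python sketch is purely illustrative: the `Region` class, its field names, and the two example regions are hypothetical, and the all-or-none treatment of polymerase binding and H3K4 methylation is a simplifying assumption rather than a quantitative description of the biology.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical genomic region at the time of implantation."""
    name: str
    pol2_bound: bool              # RNA polymerase II present at a potential start site
    dna_methylated: bool = False  # the early embryo starts out largely unmethylated


def h3k4_methylated(region: Region) -> bool:
    # Model assumption: bound Pol II recruits SET-domain proteins,
    # which methylate H3K4 on the overlying nucleosomes.
    return region.pol2_bound


def de_novo_methylate(region: Region) -> Region:
    # Model assumption: Dnmt3L binds only where H3K4 is unmethylated, so the
    # Dnmt3a/Dnmt3b complex acts only on regions lacking this histone mark.
    if not h3k4_methylated(region):
        region.dna_methylated = True
    return region


regions = [
    Region("CpG island with bound Pol II", pol2_bound=True),
    Region("non-island sequence", pol2_bound=False),
]
result = {r.name: de_novo_methylate(r).dna_methylated for r in regions}
print(result)
```

In this toy representation, only the region without bound polymerase ends up methylated, which mirrors the protection of CpG islands proposed by the model.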
While the machinery for bringing about de novo methylation is present during a short window at the time of implantation (Okano et al., 1999), subsequent generations of cells lack this ability. Despite the transient nature of this phenomenon, the methylation pattern established at the time of implantation is faithfully maintained throughout future cell divisions by a maintenance mechanism that utilizes Dnmt1. This enzyme, which is located in the replication complex itself (Leonhardt et al., 1992), operates by recognizing the hemimethylated sites generated at the time of DNA synthesis and methylating the newly made strand only at these positions. This process appears to be aided by additional factors that help recognize the hemimethylated CpGs at the replication fork (Achour et al., 2008; Bostick et al., 2007; Sharif et al., 2007).
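The copying rule used by the maintenance mechanism can be illustrated with a short simulation. This is a minimal sketch under stated assumptions: the `replicate` function, the representation of each CpG as a pair of strand states, and the three-site example pattern are all hypothetical, and maintenance is treated as perfectly efficient.

```python
def replicate(duplex, maintenance=True):
    """Replicate a duplex of CpG sites into two daughter duplexes.

    Each site is a (top, bottom) pair of booleans; True means the CpG on
    that strand is methylated. Hypothetical representation for illustration.
    """
    daughters = []
    for parental in (0, 1):  # each daughter inherits one parental strand
        daughter = []
        for site in duplex:
            old = site[parental]
            # The new strand is synthesized unmethylated. Maintenance activity
            # methylates it only opposite a methylated parental CpG, that is,
            # only at hemimethylated sites.
            new = old and maintenance
            daughter.append((old, new) if parental == 0 else (new, old))
        daughters.append(daughter)
    return daughters


# Methylated, unmethylated, methylated
pattern = [(True, True), (False, False), (True, True)]

with_dnmt1 = replicate(pattern, maintenance=True)
without_dnmt1 = replicate(pattern, maintenance=False)

print(with_dnmt1[0] == pattern and with_dnmt1[1] == pattern)
print(without_dnmt1[0])
```

With maintenance, both daughters carry the parental pattern, and unmethylated sites stay unmethylated because nothing marks them for modification. Without maintenance, the daughters are left hemimethylated, which is the starting point for passive loss of methylation over further divisions.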
De novo methylation at the time of implantation probably plays an important role in early development, as indicated by the observation that knockouts of Dnmt3a and 3b are embryonic lethal, with development being halted at about the 8–9 dpc stage (Okano et al., 1999). This suggests that a fully methylated genome may be required for subsequent stages of cell differentiation. This idea is also supported by the observation that conditional knockouts of Dnmts in fully differentiated somatic cell types usually bring about apoptosis or senescence in a process that is dependent on p53 (Jackson-Grusby et al., 2001; Lande-Diner et al., 2007).
3. Role of DNA methylation
While the precise role of DNA methylation during development is not known, there is no question that this modification serves to establish a basal transcription profile that is universal for all cells in the organism. According to this scheme, DNA methylation brings about the repression of transcription thus helping to lower the activity of many tissue specific genes (Siegfried et al., 1999) and silence unwanted endogenous viral-type sequences that are scattered throughout the genome (Walsh et al., 1998). It should be noted that this process is carried out in a global manner without the need to recognize specific gene sequences. At the same time, CpG islands are completely protected from DNA methylation. Since many of these sequences are located within the promoters of housekeeping genes, this allows transcription of genes destined to be expressed throughout the organism. In this sense, DNA methylation actually helps define the basal nature of gene expression while bringing about the repression of a large part of the genome.
Although methyl moieties placed at critical sequences on the DNA can inhibit transcription by interfering with the binding of specific factors (Maier et al., 2004; Tate and Bird, 1993), in general, DNA methylation mainly operates by affecting local and regional chromatin structure. Thus, for example, when naked DNA is inserted into cells by stable transformation, it becomes integrated into the genome and automatically adopts a relatively open structure characterized by DNaseI sensitivity. In contrast, in vitro-methylated DNA becomes packaged in a DNaseI-resistant conformation (Keshet et al., 1986). This process does not seem to be directed by specific protein factor binding, since the effects of methylation are observed regardless of the underlying sequence. DNA methylation also modulates local structural features of chromatin, including nucleosome positioning (Davey et al., 1997), as well as histone modification (Eden et al., 1998; Hashimshony et al., 2003). Thus, unmethylated DNA gets packaged with nucleosomes characterized by acetylated histones and the presence of H3K4me, while methylated DNA serves as a marker for directing deacetylation of histones and undermethylation of H3K4 together with methylation of H3K9. This process appears to be mediated by methyl binding proteins that recruit deacetylase (Jones et al., 1998; Nan et al., 1998), or by SET-domain histone methylases located in the replication complex (Esteve et al., 2006).
It is very likely that regulation of gene expression, in general, is mediated through chromatin structure which ultimately determines the accessibility of any specific region of DNA to the transcription machinery. Despite this important role of chromatin, these structures are not permanently bound to the DNA and actually get disrupted by passage of the DNA replication fork during each cell division (Lucchini and Sogo, 1995). These chromatin features must then be reconstructed following replication. DNA methylation plays a critical role in this process by providing a stable template for directing repackaging (Suzuki and Bird, 2008; Weber and Schubeler, 2007), and in this sense provides a very stable long-term mechanism for gene repression.
4. Repression complexes
While DNA methylation plays an important part in the overall scheme of repression, it is clear that animal cells also utilize additional mechanisms to mediate gene silencing. Thus, even in cells lacking Dnmt1 and having a low level of genome DNA methylation, only a few genes become reactivated, and many others still retain their silenced state (Jackson-Grusby et al., 2001; Lande-Diner et al., 2007). This alternate route of transcriptional inhibition is probably mediated by repression complexes that bind to specific recognition sites located within the promoter regions of target genes. Like DNA methylation, this repression actually operates by affecting overlying histone modification or other aspects of chromatin structure that ultimately affect gene accessibility. One example of this is the NRSF/REST complex which is present in many tissues where it binds and represses neuron-specific gene sequences (Schoenherr and Anderson, 1995). Similarly, the polycomb complex appears to be targeted specifically to genes involved in development and differentiation (Boyer et al., 2006; Franke et al., 1992; Lee et al., 2006). Although these systems do not carry out global repression, they do provide a mechanism for the general silencing of large gene categories.
It should be noted that genes targeted by polycomb actually undergo repression through a process of heterochromatinization. In this scheme, the binding of PRC2 to specific genes brings about local tri-methylation of histone H3K27 by means of Ezh2 (Cao et al., 2002; Czermin et al., 2002; Kuzmichev et al., 2002; Margueron et al., 2008; Shen et al., 2008), one of the proteins located in these complexes. These methyl groups then serve as a landing site for the chromodomain protein, Pc, that is part of the PRC1 complex (Schwartz and Pirrotta, 2007; Wang et al., 2004), and this generates a heterochromatin-like structure. In general, even though these target genes are repressed in ES cells, they appear to have a bivalent structure, being packed with both H3K27me3 as well as the activating modification H3K4me3 (Barski et al., 2007; Bernstein et al., 2006; Mikkelsen et al., 2007; Pan et al., 2007; Zhao et al., 2007). As development proceeds, the polycomb complex is removed in a gene and cell-type specific manner, thus activating those genes required for correct differentiation (Lee et al., 2006; Mohn et al., 2008). It is likely that this bivalency grants genes a degree of relative flexibility by constantly maintaining the potential to switch from an active to inactive structure and vice versa.
Like DNA methylation, repressor-complex-mediated gene silencing also appears to be maintained in vivo through multiple cell divisions. It is very likely that this is managed through the simple rebinding of repressor molecules following replication. Alternatively, it has been suggested that there may be a post-replication mechanism for copying histone modification patterns present on old nucleosomes in order to reconstruct them onto newly incorporated nucleosomes. Although there is as yet no formal proof for this idea, recent studies seem to indicate that polycomb repression may be maintained for many cell generations even after the initiating complex has been removed, and this may be mediated through the ability of PRC2 (Hansen et al., 2008) to recognize H3K27me3.
5. Post implantation gene repression
Following implantation, at about the time of gastrulation, the embryo is subject to additional general changes in gene expression which appear to be related to the process of pluripotency restriction. The prototype for these events is the Oct-3/4 gene, which is active from the time of gametogenesis and appears to be necessary for maintaining pluripotency in the early undifferentiated embryo (Pesce and Scholer, 2000). At the onset of gastrulation, however, this gene undergoes repression, and this is carried out in a three-step manner. Initially, repressor factors are recruited to the gene promoter (Ben-Shushan et al., 1995; Fuhrmann et al., 2001; Sylvester and Scholer, 1994), thus bringing about the rapid inactivation of Oct-3/4 transcription. In the next step, these same factors apparently serve to recruit a G9a-containing complex which mediates local histone deacetylation followed by H3K9 tri-methylation (Feldman et al., 2006). Since H3K9me3 is a binding partner for heterochromatin protein 1 (HP1), these changes lead to the heterochromatinization of the Oct-3/4 promoter. Finally, G9a itself recruits Dnmt3a and 3b, ultimately bringing about de novo methylation (Epsztejn-Litman et al., 2008). It is this latter event that causes Oct-3/4 to remain stably repressed in somatic cells throughout development.
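The ordered nature of these three steps can be laid out schematically. The Python sketch below is only an illustration: the `STEPS` list, the `promoter_state` function, and the field names are hypothetical labels for the events described above, and each step is treated as all-or-none.

```python
# The three sequential events at the Oct-3/4 promoter, in order.
STEPS = [
    "repressor binding",         # step 1: transcription is rapidly switched off
    "G9a complex recruitment",   # step 2: deacetylation, H3K9 methylation, HP1 binding
    "Dnmt3a/3b recruitment",     # step 3: de novo DNA methylation
]


def promoter_state(steps_completed: int) -> dict:
    """Return the (simplified) promoter state after the given number of steps."""
    if not 0 <= steps_completed <= len(STEPS):
        raise ValueError("steps_completed must be between 0 and 3")
    return {
        "transcribed": steps_completed < 1,
        "heterochromatin": steps_completed >= 2,
        "dna_methylated": steps_completed >= 3,
    }


for n in range(len(STEPS) + 1):
    label = STEPS[n - 1] if n else "pluripotent embryo"
    print(n, label, promoter_state(n))
```

Each later state presupposes the earlier ones, which reflects the dependence of G9a recruitment on the initial repressors and of Dnmt3a/3b recruitment on G9a.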
Although Oct-3/4 may be a main player in controlling pluripotency, many other genes, including Nanog and Dnmt3L take part in this process, and some of these have also been found to be inactivated by G9a (Epsztejn-Litman et al., 2008). Thus, G9a may actually represent a master inhibitor of pluripotency genes at this stage in development. It is interesting that other genomic regions are also subject to targeted inactivation in the early embryo. This includes peri-centric satellite sequences which become repressed through Suv39h-mediated heterochromatinization and DNA methylation (Lehnertz et al., 2003), as well as the X-chromosome in female embryos, whose inactivation also involves a form of heterochromatinization and de novo methylation, perhaps by means of polycomb binding (Payer and Lee, 2008).
It is interesting that in all of these cases de novo methylation is associated with histone methylation. In the case of G9a, biochemical and genetic studies clearly indicate that this enzyme induces de novo methylation by actually recruiting Dnmt3 molecules (Dong et al., 2008; Epsztejn-Litman et al., 2008; Tachibana et al., 2008) through an ankyrin domain region that is independent of its catalytic SET domain (Epsztejn-Litman et al., 2008). It is likely that other methyltransferases, such as Suv39h (Fuks et al., 2003) and ESET (Li et al., 2006), are also involved in de novo methylation in the same manner. Genetic studies on KRYPTONITE in plants (Jackson et al., 2002) and the histone methyltransferase of Neurospora (Tamaru and Selker, 2001) likewise indicate that histone methylation enzymes are required for de novo DNA methylation. Furthermore, the polycomb protein Ezh2, which catalyzes the methylation of H3K27 and has been shown to be capable of recruiting de novo methylases (Vire et al., 2006), may play a role in tumor-associated CpG island methylation (Ohm et al., 2007; Schlesinger et al., 2007; Widschwendter et al., 2007) even though polycomb targets are generally unmethylated in normal tissues in vivo (Meissner et al., 2008; Mohn et al., 2008).
6. DNA replication timing and gene repression
Another molecular mechanism that may play a role in epigenetic silencing during lineage commitment is late replication timing. It is well known that the entire genome is replicated in a programmed manner with some zones undergoing replication early in S phase, while others replicate at later times (Goren and Cedar, 2003). Furthermore, there appears to be an excellent correlation between late replication timing and gene repression (Farkash-Amar et al., 2008; Schubeler et al., 2002; White et al., 2004). Housekeeping genes, for example, are constitutively replicated in early S, while many tissue specific genes are developmentally regulated so that they replicate late in most cell types, but early in the tissue of expression (Goren and Cedar, 2003). These replication timing patterns are probably set up by long-range cis-acting sequences (Cimbora et al., 2000; Simon et al., 2001) and can be maintained in a stable manner through multiple cell divisions (Mostoslavsky et al., 2001).
The causal relationship between DNA replication timing and gene expression has not yet been elucidated. Microinjection experiments, however, strongly suggest that repackaging of DNA following replication in early S phase is carried out with acetylated histones, while late replicating DNA is automatically reassembled with deacetylated histones (Zhang et al., 2002). This could serve as a mechanism for initially setting up broad chromatin states which can then be further modulated by additional factors at the local level. A number of different gene silencing events that take place during post implantation differentiation are accompanied by a shift to late replication, including those known to be involved in pluripotency, such as Rex1 (Hiratani et al., 2004; Hiratani et al., 2008; Perry et al., 2004). In addition, a change to late replication timing represents one of the first events in the X-inactivation process (Takagi, 1974). These observations suggest that this epigenetic mark may play an important role in setting up stable expression patterns at this point in development.
7. Long-term silencing
Both heterochromatinization, through histone methylation and the binding of chromodomain proteins, and DNA methylation are used for long-term gene silencing, but these epigenetic markers appear to have different functions in vivo. In general, DNA methylation provides a more stable form of repression. A good example for understanding the difference between simple heterochromatinization and DNA methylation is provided by the Oct-3/4 gene. By following the epigenetic changes that occur to this gene in differentiating ES cells, it has been shown that this process occurs in a step-wise manner with heterochromatinization occurring prior to DNA methylation. Interestingly, heterochromatinization alone is able to prevent reactivation of Oct-3/4 when the inducer of differentiation is removed, but this alone is not sufficient to prevent reprogramming of differentiated cells. On the other hand, the placement of DNA methylation on the Oct-3/4 promoter is capable of ensuring that differentiated cells cannot easily return to their original pluripotent state (Epsztejn-Litman et al., 2008).
The X-chromosome provides another example of this phenomenon. Normally, one X chromosome in each cell of the female organism undergoes inactivation in the early embryo. This process also proceeds in a step-wise manner, with heterochromatinization and gene inactivation taking place at an early stage, while DNA methylation occurs much later and evidently serves as a locking mechanism to prevent reactivation (Lock et al., 1987; Payer and Lee, 2008). Indeed, paternal-specific X-inactivation in extraembryonic tissues appears to take place without de novo methylation, and, in this case, repression is much less effective with many genes undergoing reactivation (Samollow et al., 1995). Similarly, in marsupials where X-inactivation is also accomplished without subsequent DNA methylation, repressed genes on this chromosome are much more likely to become reactivated in somatic cells (Migeon et al., 1989).
Targeted DNA methylation appears to play a major role in preventing the reprogramming of somatic cells to a more pluripotent phenotype. It has already been demonstrated, for example, that reprogramming by nuclear transplantation is an extremely inefficient process, mainly because key pluripotent genes such as Oct-3/4 do not easily undergo demethylation in this system (Boiani et al., 2002; Bortvin et al., 2003). Another method for reprogramming involves the production of induced pluripotent stem cells (iPS) by introducing key stem-cell transcription factors into somatic cells (Maherali et al., 2007; Takahashi and Yamanaka, 2006; Welstead et al., 2008; Wernig et al., 2007). When this is done, reprogramming appears to take place in a step-wise manner with changes in chromatin occurring with relatively rapid kinetics. The cells then remain stuck in an intermediate state until the endogenous pluripotency genes actually undergo demethylation and become active (Mikkelsen et al., 2008). As might be expected, the removal of G9a not only expedites this process, but also lowers the requirement for exogenous pluripotent transcription factors (Ma et al., 2008; Shi et al., 2008a; Shi et al., 2008b). This is consistent with the idea that G9a may be a master regulator in turning off pluripotency during early embryonic development.
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.