The processes described above are required whenever any gene is transcribed. What are the molecular switches that turn transcription on or off? Although there are entire books written on this one topic, the basic mechanism by which transcription is regulated depends on highly specific interactions between transcription regulating proteins and regulatory sequences on DNA.
We know that promoters indicate where transcription begins, but what determines that a given gene will be transcribed? In addition to the promoter sequences required for transcription initiation, genes have additional regulatory sequences (sequences of DNA on the same DNA molecule as the gene) that control when a gene is transcribed. Regulatory sequences are bound tightly and specifically by transcriptional regulators, proteins that can recognize DNA sequences and bind to them. The binding of such proteins to the DNA can regulate transcription by preventing or increasing transcription from a particular promoter.
Regulation in Prokaryotes
Let us first consider an example from prokaryotes. In bacteria, genes are often clustered in groups, such that genes that need to be expressed at the same time are next to each other and all of them are controlled as a single unit by the same promoter. The lac operon, shown in Figure 5.4.2, is one such group of genes that encode proteins needed for the uptake and breakdown of the sugar lactose. The three genes of the lac operon, lacZ, lacY, and lacA, are controlled by a single promoter.
Bacterial cells generally prefer to use glucose for their energy needs, but if glucose is unavailable, and lactose is present, the bacteria will take up lactose and break it down for energy. Since the proteins for taking up and breaking down lactose are only needed when glucose is absent and lactose is available, the bacterial cells need a way to express the genes of the lac operon only under those conditions. At times when lactose is absent, the cells do not need to express these genes.
How do bacteria achieve this? Transcription of the lac cluster of genes is primarily controlled by a repressor protein that binds to a region of the DNA just downstream of the -10 sequence of the lac promoter. Recall that the promoter is where the RNA polymerase must bind to begin transcription. The place where the repressor is bound is called the operator (labeled O in the figure). When the repressor is bound at this position, it physically blocks the RNA polymerase from transcribing the genes, just as a vehicle blocking your driveway would prevent you from pulling out.
Obviously, if you want to leave, the vehicle that is blocking your path must be removed. Likewise, in order for transcription to occur, the repressor must be removed from the operator to clear the path for RNA polymerase. How is the repressor removed?
When the sugar lactose is present, it binds to the repressor, changing its conformation so that it no longer binds to the operator. When the repressor is no longer bound at the operator, the "road-block" in front of the RNA polymerase is removed, permitting the transcription of the genes of the lac operon.
Because the binding of the lactose induces the expression of the genes in the lac operon, lactose is called an inducer. (Technically, the inducer is allolactose, a molecule made from lactose by the cell, but the principle is the same.)
What makes this an especially effective control system is that the genes of the lac operon encode proteins that break down lactose. Turning on these genes requires lactose to be present. Once the lactose is broken down, the repressor binds to the operator once more and the lac genes are no longer expressed. This allows the genes to be expressed only when they are needed.
But how do glucose levels affect the expression of the lac genes? We noted earlier that if glucose is present, lactose is not used. A second level of control is exerted by a protein called CAP, which binds to a site adjacent to the promoter and recruits RNA polymerase to the lac promoter. When glucose is depleted, levels of cAMP increase, and cAMP binds to CAP. The CAP-cAMP complex then binds the CAP site, as shown in Figure 5.4.3. The combination of CAP binding and the lac repressor dissociating from the operator when lactose levels are high ensures high levels of transcription of the lac operon just when it is most needed. The CAP protein binding may be thought of as a green light for the RNA polymerase, while the removal of repressor is like the lifting of a barricade in front of it. When both conditions are met, the RNA polymerase transcribes the downstream genes.
The lac operon we have just described is a set of genes that are expressed only under the specific conditions of glucose depletion and lactose availability. Other genes may be expressed unless a particular condition is met. An example of this is the trp operon in bacterial cells, which encodes enzymes necessary for the synthesis of the amino acid tryptophan. These genes are expressed at all times, except when tryptophan is available from the cell's surroundings. This means that these genes must be prevented from being expressed in the presence of tryptophan. This is achieved by having a repressor protein that will bind to the operator only in the presence of tryptophan.
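The two control schemes described above can be summarized as simple truth tables. The following is a minimal sketch, not a quantitative model: the function names and the qualitative levels "off", "low", and "high" are illustrative assumptions, and the "low" level for cells that have both glucose and lactose reflects the weak transcription that occurs without CAP assistance.

```python
def lac_operon_level(lactose_present: bool, glucose_present: bool) -> str:
    """Qualitative transcription level of the lac operon (illustrative only)."""
    # The repressor stays on the operator unless lactose (as allolactose) binds it.
    repressor_on_operator = not lactose_present
    # cAMP rises when glucose is depleted, so CAP-cAMP binds the CAP site.
    cap_bound = not glucose_present

    if repressor_on_operator:
        return "off"   # the road-block is in place
    if cap_bound:
        return "high"  # barricade lifted and green light given
    return "low"       # barricade lifted, but no help from CAP


def trp_operon_expressed(tryptophan_present: bool) -> bool:
    """The trp repressor binds the operator only when tryptophan is present."""
    repressor_on_operator = tryptophan_present
    return not repressor_on_operator


for lactose in (False, True):
    for glucose in (False, True):
        level = lac_operon_level(lactose, glucose)
        print(f"lactose={lactose}, glucose={glucose}: lac operon {level}")

for trp in (False, True):
    print(f"tryptophan={trp}: trp operon expressed = {trp_operon_expressed(trp)}")
```

Note the opposite logic of the two repressors: the lac repressor leaves the operator when its small molecule is present, whereas the trp repressor binds the operator only when its small molecule is present.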
Regulation in Eukaryotes
Transcription in eukaryotes is also regulated by the binding of proteins to specific DNA sequences, but with some differences from the simple schemes outlined above. For most eukaryotic genes, general transcription factors and RNA polymerase (i.e., the basal transcription complex) are necessary, but not sufficient, for high levels of transcription.
In eukaryotes, additional regulatory sequences called enhancers and the proteins that bind to the enhancers are needed to achieve high levels of transcription. Enhancers are DNA sequences that regulate the transcription of genes. Unlike prokaryotic regulatory sequences, enhancers don't need to be next to the gene they control. Often they are many kilobases away on the DNA. As the name suggests, enhancers can enhance (increase) transcription of a particular gene.
How can a DNA sequence far from the gene being transcribed affect the level of its transcription?
Enhancers work by binding proteins (transcriptional activators) that can, in turn, interact with the proteins bound at the promoter. The enhancer region of the DNA, with its associated transcriptional activator(s) can come in contact with the basal transcription complex that is bound at a distant TATA box by looping of the DNA (previous page). This allows the protein bound at the enhancer to make contact with the proteins in the basal transcription complex.
One way that the transcriptional activator bound to the enhancer increases the transcription from a distant promoter is that it increases the frequency and efficiency with which the basal transcription complex is formed at the promoter.
Another mechanism by which proteins bound at the enhancer can affect transcription is by recruiting to the promoter other proteins that can modify the structure of the chromatin in that region. As we noted earlier, in eukaryotes, DNA is packaged with proteins to form chromatin. When the DNA is tightly associated with these proteins, it is difficult to access for transcription. So proteins that can make the DNA more accessible to the transcription machinery can also play a role in the extent to which transcription occurs.
In addition to enhancers, there are also negative regulatory sequences called silencers. Such regulatory sequences are bound by transcriptional repressor proteins. Transcriptional activators and repressors are modular proteins: they have a part that binds DNA and a part that activates or represses transcription by interacting with the basal transcription complex.
Combinatorial regulation of transcription factors and microRNAs
Gene regulation is a key factor in gaining a full understanding of molecular biology. Cis-regulatory modules (CRMs), consisting of multiple transcription factor binding sites, have been confirmed as the main regulators in gene expression. In recent years, a novel regulator known as microRNA (miRNA) has been found to play an important role in gene regulation. Meanwhile, transcription factor and microRNA co-regulation has been widely identified. Thus, the relationships between CRMs and microRNAs have generated interest among biologists.
We constructed new combinatorial regulatory modules based on CRMs and miRNAs. By analyzing their effect on gene expression profiles, we found that genes targeted by both CRMs and miRNAs express in a significantly similar way. Furthermore, we constructed a regulatory network composed of CRMs, miRNAs, and their target genes. Investigating its structure, we found that the feed forward loop is a significant network motif, which plays an important role in gene regulation. In addition, we further analyzed the effect of miRNAs in embryonic cells, and we found that mir-154, as well as some other miRNAs, have significant co-regulation effect with CRMs in embryonic development.
Based on the co-regulation of CRMs and miRNAs, we constructed a novel combinatorial regulatory network which was found to play an important role in gene regulation, particularly during embryonic development.
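To make the feed forward loop motif concrete: in a directed regulatory network, it is a triple of nodes in which X regulates Y, Y regulates Z, and X also regulates Z directly. The sketch below enumerates such triples by brute force. The node names and the edge list are hypothetical examples, not data from the study, and the authors' actual motif-detection method is not described here.

```python
from itertools import permutations


def feed_forward_loops(edges):
    """Return all (x, y, z) with edges x->y, y->z and x->z."""
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    return sorted(
        (x, y, z)
        for x, y, z in permutations(nodes, 3)
        if (x, y) in edge_set and (y, z) in edge_set and (x, z) in edge_set
    )


# Hypothetical toy network: a CRM regulates a miRNA, and both regulate geneA.
toy_edges = [
    ("CRM1", "miR-a"),
    ("miR-a", "geneA"),
    ("CRM1", "geneA"),
    ("CRM1", "geneB"),
]
print(feed_forward_loops(toy_edges))
```

Brute-force enumeration is cubic in the number of nodes, which is fine for a toy example but would need a smarter approach for a genome-scale network.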
5.4 RNA is Transcribed from a DNA Template
RNA molecules originate from a DNA template, through the process of transcription. That is, a single strand of RNA is transcribed from one of the strands of a DNA helix. Specifically, the DNA molecule unwinds at the site of the gene to be transcribed. RNA polymerase catalyzes the formation of an RNA molecule (or transcript) based on complementarity with DNA: where there is a guanine (G) in the DNA template, a complementary cytosine (C) is added to the RNA strand. However, where there is an adenine (A) in the DNA template, a uracil (U), rather than a thymine, is added to the RNA transcript. For example, the DNA bases TGCACA are transcribed to the RNA bases ACGUGU.
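The base-pairing rule above can be written as a lookup table. This is a minimal sketch that pairs bases position by position, exactly as in the TGCACA example, and ignores strand direction; the function name is an illustrative choice.

```python
# Template DNA base -> complementary RNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template: str) -> str:
    """Pair each template DNA base with its complementary RNA base."""
    return "".join(DNA_TO_RNA[base] for base in template.upper())


print(transcribe_template("TGCACA"))  # ACGUGU
```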
The transcribed RNA then either functions as mRNA, tRNA, or rRNA, with the roles described above. Either way, in eukaryotes such as plants, animals, and fungi, the RNA molecules are constructed in the nucleus of the cell. After transcription, RNA leaves the nucleus to continue protein synthesis in the cytoplasm of the cell.
In prokaryotic organisms such as bacteria, the mRNA transcript is immediately ready for its role in polypeptide formation. But in eukaryotes, extensive processing is required. RNA processing involves editing some nucleotides, removing non-coding regions (introns) from the transcript, and preparing the transcript for recognition by the ribosome.
Figure 5.4 A single strand of RNA is transcribed from a DNA template.
The human genome encodes over 20,000 genes, with hundreds to thousands of genes on each of the 23 pairs of human chromosomes. As discussed in an earlier unit, the DNA in the nucleus is precisely wound, folded, and compacted into chromatin so that it will fit into the nucleus. It is also organized so that specific segments can be accessed as needed for specific cell types to function.
The first level of organization, or packing, is the winding of DNA strands around histone proteins. Histones package and order DNA into structural units called nucleosome complexes, which can control the access of proteins to the DNA regions (Figure 5-2a). Under the electron microscope, this winding of DNA around histone proteins to form nucleosomes looks like small beads on a string (Figure 5-2b).
Figure 5-2: Nucleosomes of DNA. DNA is folded around histone proteins to create (a) nucleosome complexes. These nucleosomes control the access of proteins to the underlying DNA. When viewed through an electron microscope (b), the nucleosomes look like beads on a string. (credit “micrograph”: modification of work by Chris Woodcock)
These histone proteins can move along the string (DNA) to expose different sections of the molecule. If DNA encoding a specific gene is to be transcribed into RNA, the nucleosomes surrounding that region of DNA can slide down the DNA to open that specific chromosomal region and allow for the transcriptional machinery, including transcription factors and RNA polymerase to initiate transcription. The movement of nucleosome complexes is achieved by ATP-dependent chromatin remodeling complexes that use the energy from ATP hydrolysis to push nucleosomes along the DNA.
Figure 5-3: Nucleosomes can slide along DNA. When nucleosomes are spaced closely together (top), transcription factors cannot bind, and gene expression is turned off. When the nucleosomes are spaced far apart (bottom), the DNA is exposed. Transcription factors can bind, allowing gene expression to occur. Modifications to the histones and DNA affect nucleosome spacing.
How closely the histone proteins associate with the DNA is regulated by chemical signals found on both the histone proteins and on the DNA. These signals are functional groups (also known as tags) added to histone proteins or to DNA, and they determine whether a chromosomal region should be open (euchromatic) or closed (heterochromatic) (Figure 5-3 depicts modifications to histone proteins and DNA). These tags are not permanent but may be added or removed as needed. The most common chemical groups added directly to the nucleic acids are methyl groups, while those added to histone proteins include phosphate, methyl, or acetyl groups. These functional groups are attached to specific amino acids in histone "tails" at the N-terminus of the protein and, importantly, do not alter the DNA base sequence, but they do alter how tightly wound the DNA is around the histone proteins.
DNA is a negatively charged molecule, and unmodified histones are positively charged; therefore, changes in the charge of the histone will change how tightly wound the DNA molecule will be. For example, when chemical modifications such as acetyl groups are added, the charge of the histone becomes less positive, and the binding of DNA to the histones is often relaxed. This increases the accessibility of the gene. Phosphate groups are negatively charged; therefore, phosphorylation of histone tails decreases the binding of positively charged histone proteins to DNA, which also increases the accessibility of the gene. Methylation of histone tails has been shown to both repress and activate gene transcription, depending on which amino acids of the histone tails are modified and the number of methyl groups added. For simplicity, we will discuss methylation as a modification that results in gene silencing or heterochromatin. One hypothesis for how methylation causes an increase in DNA coiling is that methyl groups are nonpolar and nonpolar molecules tend to aggregate (or bind) together. Therefore, the more methyl groups there are on histones, the more likely the histones are to pack closer together.
The commonality between all modifications to the histone tails is that either the modification is relaxing the coiling of DNA around the histones allowing the DNA to become more ‘open’ and accessible to the transcription machinery, or the modification tightens the coiling of DNA causing it to become more ‘closed’ and transcriptionally silent (refer to figure 5-3). Therefore, altering the tightness of the histone-DNA interaction opens some regions of chromatin to transcription and closes others.
A DNA transcription unit encoding a protein may contain both a coding sequence, which will be translated into the protein, and regulatory sequences, which direct and regulate the synthesis of that protein. The regulatory sequence before ("upstream" from) the coding sequence is called the five prime untranslated region (5'UTR); the sequence after ("downstream" from) the coding sequence is called the three prime untranslated region (3'UTR).
As opposed to DNA replication, transcription results in an RNA complement that includes the nucleotide uracil (U) in all instances where thymine (T) would have occurred in a DNA complement.
Only one of the two DNA strands serves as a template for transcription. The antisense strand of DNA is read by RNA polymerase from the 3' end to the 5' end during transcription (3' → 5'). The complementary RNA is created in the opposite direction, in the 5' → 3' direction, matching the sequence of the sense strand with the exception of switching uracil for thymine. This directionality arises because RNA polymerase can only add nucleotides to the 3' end of the growing mRNA chain. This use of only the 3' → 5' DNA strand eliminates the need for the Okazaki fragments that are seen in DNA replication. This also removes the need for an RNA primer to initiate RNA synthesis, as is the case in DNA replication.
The non-template (sense) strand of DNA is called the coding strand, because its sequence is the same as the newly created RNA transcript (except for the substitution of uracil for thymine). This is the strand that is used by convention when presenting a DNA sequence. 
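The relationship between the coding strand, the template strand, and the transcript can be checked with a short sketch. All sequences here are written 5' → 3', the example sequence is arbitrary, and the function names are illustrative assumptions.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def template_from_coding(coding_5to3: str) -> str:
    """Template (antisense) strand, written 5'->3': the reverse complement."""
    return "".join(DNA_COMPLEMENT[b] for b in reversed(coding_5to3.upper()))


def transcribe(template_5to3: str) -> str:
    """RNA written 5'->3', made by reading the template from its 3' end."""
    return "".join(DNA_TO_RNA[b] for b in reversed(template_5to3.upper()))


coding = "ATGGCATTC"                     # arbitrary example sequence
template = template_from_coding(coding)  # GAATGCCAT
rna = transcribe(template)               # AUGGCAUUC

# The transcript matches the coding strand, with U in place of T.
print(rna == coding.replace("T", "U"))   # True
```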
Transcription has some proofreading mechanisms, but they are fewer and less effective than the controls for copying DNA. As a result, transcription has a lower copying fidelity than DNA replication. 
Transcription is divided into initiation, promoter escape, elongation, and termination. 
Setting up for transcription
Enhancers, transcription factors, Mediator complex and DNA loops in mammalian transcription
Setting up for transcription in mammals is regulated by many cis-regulatory elements, including core promoter and promoter-proximal elements that are located near the transcription start sites of genes. Core promoters combined with general transcription factors are sufficient to direct transcription initiation, but generally have low basal activity.  Other important cis-regulatory modules are localized in DNA regions that are distant from the transcription start sites. These include enhancers, silencers, insulators and tethering elements.  Among this constellation of elements, enhancers and their associated transcription factors have a leading role in the initiation of gene transcription.  An enhancer localized in a DNA region distant from the promoter of a gene can have a very large effect on gene transcription, with some genes undergoing up to 100-fold increased transcription due to an activated enhancer. 
Enhancers are regions of the genome that are major gene-regulatory elements. Enhancers control cell-type-specific gene transcription programs, most often by looping through long distances to come into physical proximity with the promoters of their target genes. While there are hundreds of thousands of enhancer DNA regions, for a particular type of tissue only specific enhancers are brought into proximity with the promoters that they regulate. In a study of brain cortical neurons, 24,937 loops were found, bringing enhancers to their target promoters. Multiple enhancers, each often tens or hundreds of thousands of nucleotides distant from their target genes, loop to their target gene promoters and can coordinate with each other to control transcription of their common target gene.
The schematic illustration in this section shows an enhancer looping around to come into close physical proximity with the promoter of a target gene. The loop is stabilized by a dimer of a connector protein (e.g. dimer of CTCF or YY1), with one member of the dimer anchored to its binding motif on the enhancer and the other member anchored to its binding motif on the promoter (represented by the red zigzags in the illustration). Several cell function specific transcription factors (there are about 1,600 transcription factors in a human cell) generally bind to specific motifs on an enhancer, and a small combination of these enhancer-bound transcription factors, when brought close to a promoter by a DNA loop, governs the level of transcription of the target gene. Mediator (a complex usually consisting of about 26 proteins in an interacting structure) communicates regulatory signals from enhancer DNA-bound transcription factors directly to the RNA polymerase II (pol II) enzyme bound to the promoter.
Enhancers, when active, are generally transcribed from both strands of DNA with RNA polymerases acting in two different directions, producing two enhancer RNAs (eRNAs) as illustrated in the Figure.  An inactive enhancer may be bound by an inactive transcription factor. Phosphorylation of the transcription factor may activate it and that activated transcription factor may then activate the enhancer to which it is bound (see small red star representing phosphorylation of transcription factor bound to enhancer in the illustration).  An activated enhancer begins transcription of its RNA before activating transcription of messenger RNA from its target gene. 
CpG island methylation and demethylation
Transcription regulation at about 60% of promoters is also controlled by methylation of cytosines within CpG dinucleotides (where 5’ cytosine is followed by 3’ guanine or CpG sites). 5-methylcytosine (5-mC) is a methylated form of the DNA base cytosine (see Figure). 5-mC is an epigenetic marker found predominantly within CpG sites. About 28 million CpG dinucleotides occur in the human genome.  In most tissues of mammals, on average, 70% to 80% of CpG cytosines are methylated (forming 5-methylCpG or 5-mCpG).  Methylated cytosines within 5’cytosine-guanine 3’ sequences often occur in groups, called CpG islands. About 60% of promoter sequences have a CpG island while only about 6% of enhancer sequences have a CpG island.  CpG islands constitute regulatory sequences, since if CpG islands are methylated in the promoter of a gene this can reduce or silence gene transcription. 
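As an illustration of what a CpG site is at the sequence level, the sketch below counts CpG dinucleotides in a DNA string and computes an observed/expected CpG ratio. The ratio, (number of CpG × length) / (number of C × number of G), is a commonly used enrichment heuristic that is not taken from the text above, and the function names and example sequence are assumptions for illustration. Note that sequence alone says nothing about methylation status.

```python
def count_cpg(seq: str) -> int:
    """Count CpG sites: a C immediately followed (5'->3') by a G."""
    return seq.upper().count("CG")


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio; returns 0.0 if the sequence lacks C or G."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return count_cpg(seq) * len(seq) / (c_count * g_count)


example = "ACGTCGCG"  # arbitrary example sequence
print(count_cpg(example))              # 3
print(round(cpg_obs_exp(example), 2))  # 2.67
```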
DNA methylation regulates gene transcription through interaction with methyl binding domain (MBD) proteins, such as MeCP2, MBD1 and MBD2. These MBD proteins bind most strongly to highly methylated CpG islands.  These MBD proteins have both a methyl-CpG-binding domain as well as a transcription repression domain.  They bind to methylated DNA and guide or direct protein complexes with chromatin remodeling and/or histone modifying activity to methylated CpG islands. MBD proteins generally repress local chromatin such as by catalyzing the introduction of repressive histone marks, or creating an overall repressive chromatin environment through nucleosome remodeling and chromatin reorganization. 
As noted in the previous section, transcription factors are proteins that bind to specific DNA sequences in order to regulate the expression of a gene. The binding sequence for a transcription factor in DNA is usually about 10 or 11 nucleotides long. As summarized in 2009, Vaquerizas et al. indicated there are approximately 1,400 different transcription factors encoded in the human genome by genes that constitute about 6% of all human protein encoding genes.  About 94% of transcription factor binding sites (TFBSs) that are associated with signal-responsive genes occur in enhancers while only about 6% of such TFBSs occur in promoters. 
EGR1 protein is a particular transcription factor that is important for regulation of methylation of CpG islands. An EGR1 transcription factor binding site is frequently located in enhancer or promoter sequences.  There are about 12,000 binding sites for EGR1 in the mammalian genome and about half of EGR1 binding sites are located in promoters and half in enhancers.  The binding of EGR1 to its target DNA binding site is insensitive to cytosine methylation in the DNA. 
While only small amounts of EGR1 transcription factor protein are detectable in cells that are un-stimulated, production of EGR1 protein from the EGR1 gene is drastically elevated one hour after stimulation. Expression of EGR1 transcription factor proteins, in various types of cells, can be stimulated by growth factors, neurotransmitters, hormones, stress and injury. In the brain, when neurons are activated, EGR1 proteins are up-regulated and they bind to (recruit) the pre-existing TET1 enzymes, which are highly expressed in neurons. TET enzymes can catalyse demethylation of 5-methylcytosine. When EGR1 transcription factors bring TET1 enzymes to EGR1 binding sites in promoters, the TET enzymes can demethylate the methylated CpG islands at those promoters. Upon demethylation, these promoters can then initiate transcription of their target genes. Hundreds of genes in neurons are differentially expressed after neuron activation through EGR1 recruitment of TET1 to methylated regulatory sequences in their promoters.
The methylation of promoters is also altered in response to signals. The three mammalian DNA methyltransferases (DNMT1, DNMT3A, and DNMT3B) catalyze the addition of methyl groups to cytosines in DNA. While DNMT1 is a “maintenance” methyltransferase, DNMT3A and DNMT3B can carry out new methylations. There are also two splice protein isoforms produced from the DNMT3A gene: DNA methyltransferase proteins DNMT3A1 and DNMT3A2.
The splice isoform DNMT3A2 behaves like the product of a classical immediate-early gene and, for instance, it is robustly and transiently produced after neuronal activation.  Where the DNA methyltransferase isoform DNMT3A2 binds and adds methyl groups to cytosines appears to be determined by histone post translational modifications.   
On the other hand, neural activation causes degradation of DNMT3A1 accompanied by reduced methylation of at least one evaluated targeted promoter. 
Transcription begins with the binding of RNA polymerase, together with one or more general transcription factors, to a specific DNA sequence referred to as a "promoter" to form an RNA polymerase-promoter "closed complex". In the "closed complex" the promoter DNA is still fully double-stranded. 
RNA polymerase, assisted by one or more general transcription factors, then unwinds approximately 14 base pairs of DNA to form an RNA polymerase-promoter "open complex". In the "open complex" the promoter DNA is partly unwound and single-stranded. The exposed, single-stranded DNA is referred to as the "transcription bubble." 
RNA polymerase, assisted by one or more general transcription factors, then selects a transcription start site in the transcription bubble, binds to an initiating NTP and an extending NTP (or a short RNA primer and an extending NTP) complementary to the transcription start site sequence, and catalyzes bond formation to yield an initial RNA product. 
In bacteria, the RNA polymerase core enzyme consists of five subunits: two α subunits, one β subunit, one β' subunit, and one ω subunit. Bacteria have one general transcription factor, known as a sigma factor. The RNA polymerase core enzyme binds to the sigma factor to form the RNA polymerase holoenzyme, which then binds to a promoter. (RNA polymerase is called a holoenzyme when the sigma subunit is attached to the core enzyme.) Unlike in eukaryotes, the initiating nucleotide of nascent bacterial mRNA is not capped with a modified guanine nucleotide. The initiating nucleotide of bacterial transcripts bears a 5′ triphosphate (5′-PPP), which can be used for genome-wide mapping of transcription initiation sites.
In archaea and eukaryotes, RNA polymerase contains subunits homologous to each of the five RNA polymerase subunits in bacteria and also contains additional subunits. In archaea and eukaryotes, the functions of the bacterial general transcription factor sigma are performed by multiple general transcription factors that work together.  In archaea, there are three general transcription factors: TBP, TFB, and TFE. In eukaryotes, in RNA polymerase II-dependent transcription, there are six general transcription factors: TFIIA, TFIIB (an ortholog of archaeal TFB), TFIID (a multisubunit factor in which the key subunit, TBP, is an ortholog of archaeal TBP), TFIIE (an ortholog of archaeal TFE), TFIIF, and TFIIH. The TFIID is the first component to bind to DNA due to binding of TBP, while TFIIH is the last component to be recruited. In archaea and eukaryotes, the RNA polymerase-promoter closed complex is usually referred to as the "preinitiation complex." 
Transcription initiation is regulated by additional proteins, known as activators and repressors, and, in some cases, associated coactivators or corepressors, which modulate formation and function of the transcription initiation complex. 
Promoter escape
After the first bond is synthesized, the RNA polymerase must escape the promoter. During this time there is a tendency to release the RNA transcript and produce truncated transcripts. This is called abortive initiation, and is common for both eukaryotes and prokaryotes.  Abortive initiation continues to occur until an RNA product of a threshold length of approximately 10 nucleotides is synthesized, at which point promoter escape occurs and a transcription elongation complex is formed.
Mechanistically, promoter escape occurs through DNA scrunching, providing the energy needed to break interactions between RNA polymerase holoenzyme and the promoter. 
In bacteria, it was historically thought that the sigma factor is definitely released after promoter clearance occurs. This theory had been known as the obligate release model. However, later data showed that upon and following promoter clearance, the sigma factor is released according to a stochastic model known as the stochastic release model. 
In eukaryotes, at an RNA polymerase II-dependent promoter, upon promoter clearance, TFIIH phosphorylates serine 5 on the carboxy terminal domain of RNA polymerase II, leading to the recruitment of capping enzyme (CE).   The exact mechanism of how CE induces promoter clearance in eukaryotes is not yet known.
One strand of the DNA, the template strand (or noncoding strand), is used as a template for RNA synthesis. As transcription proceeds, RNA polymerase traverses the template strand and uses base pairing complementarity with the DNA template to create an RNA copy (which elongates during the traversal). Although RNA polymerase traverses the template strand from 3' → 5', the coding (non-template) strand and newly formed RNA can also be used as reference points, so transcription can be described as occurring 5' → 3'. This produces an RNA molecule from 5' → 3', an exact copy of the coding strand (except that thymines are replaced with uracils, and the nucleotides are composed of a ribose (5-carbon) sugar where DNA has deoxyribose (one fewer oxygen atom) in its sugar-phosphate backbone).
mRNA transcription can involve multiple RNA polymerases on a single DNA template and multiple rounds of transcription (amplification of particular mRNA), so many mRNA molecules can be rapidly produced from a single copy of a gene. The characteristic elongation rates in prokaryotes and eukaryotes are about 10-100 nts/sec. In eukaryotes, however, nucleosomes act as major barriers to transcribing polymerases during transcription elongation. In these organisms, the pausing induced by nucleosomes can be regulated by transcription elongation factors such as TFIIS.
Elongation also involves a proofreading mechanism that can replace incorrectly incorporated bases. In eukaryotes, this may correspond with short pauses during transcription that allow appropriate RNA editing factors to bind. These pauses may be intrinsic to the RNA polymerase or due to chromatin structure.
Bacteria use two different strategies for transcription termination – Rho-independent termination and Rho-dependent termination. In Rho-independent transcription termination, RNA transcription stops when the newly synthesized RNA molecule forms a G-C-rich hairpin loop followed by a run of Us. When the hairpin forms, the mechanical stress breaks the weak rU-dA bonds, now filling the DNA–RNA hybrid. This pulls the poly-U transcript out of the active site of the RNA polymerase, terminating transcription. In the "Rho-dependent" type of termination, a protein factor called "Rho" destabilizes the interaction between the template and the mRNA, thus releasing the newly synthesized mRNA from the elongation complex. 
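The sequence signature of a Rho-independent terminator described above (a G-C-rich hairpin followed by a run of Us) can be searched for with a simple pattern scan. The sketch below is a toy heuristic under stated assumptions: the stem length, loop length range, minimum U run, and G-C fraction are hypothetical parameters chosen for illustration, perfect base pairing is required, and no folding energy is computed, so it is not a substitute for a real terminator predictor.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    return "".join(RNA_PAIR[b] for b in reversed(rna))


def find_terminators(rna, stem=5, loop_range=(3, 8), min_u=4, min_gc=0.6):
    """Return (start, end) spans of G-C-rich perfect hairpins followed by a U run.

    start is the index of the first stem base; end is one past the last stem base.
    """
    hits = []
    for i in range(len(rna) - stem + 1):
        left = rna[i:i + stem]
        if sum(b in "GC" for b in left) / stem < min_gc:
            continue  # stem is not G-C rich enough
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem + loop                 # start of the right stem arm
            right = rna[j:j + stem]
            tail = rna[j + stem:j + stem + min_u]
            if (len(right) == stem
                    and right == reverse_complement(left)
                    and tail == "U" * min_u):
                hits.append((i, j + stem))
    return hits


# Hypothetical transcript: stem GCCGC, loop UUAA, stem GCGGC, then a U run.
transcript = "AAGCCGCUUAAGCGGCUUUUUU"
print(find_terminators(transcript))
```

Because neighbouring bases may also happen to pair, the scan can report more than one overlapping span for the same hairpin.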
Transcription termination in eukaryotes is less well understood than in bacteria, but involves cleavage of the new transcript followed by template-independent addition of adenines at its new 3' end, in a process called polyadenylation. 
Gene expression is controlled by a number of features – regulation of transcription and translation:
In eukaryotes, transcription of target genes can be stimulated or inhibited when specific transcription factors move from the cytoplasm into the nucleus. Because only the target genes are transcribed, only specific proteins are made. Each type of body cell has different target genes, which gives each cell type different characteristics; for example, a nerve cell is different from a red blood cell. Transcription factors can change the rate of transcription, and the process is as follows:
- The transcription factors move from the cytoplasm into the nucleus by diffusion.
- Once in the nucleus, they may bind to the promoter sequence (the sequence at the start of the target gene).
- Once bound to the promoter sequence, the transcription factors either increase or decrease the rate of transcription.
Some transcription factors are called activators because they increase the rate of transcription. They do this by helping RNA polymerase bind to the promoter sequence, which activates transcription. Others are called repressors because they decrease the rate of transcription. They do this by binding to the promoter sequence and preventing RNA polymerase from binding, which stops transcription.
Oestrogen can initiate the transcription of target genes. NB: Sometimes it can cause a transcription factor to act as a repressor; you don’t need to know this for the AQA exam. A transcription factor may be bound to an inhibitor that stops it from binding to the promoter sequence. Oestrogen binds to the transcription factor (its receptor), forming an oestrogen-oestrogen receptor complex, and this changes the shape of the site where the inhibitor is attached (called the DNA binding site). As a result, the inhibitor detaches, allowing the transcription factor to attach to the promoter sequence. NB: You don’t need to know the name of the inhibitor. The DNA binding site on the transcription factor stays changed for as long as oestrogen is bound to it.
In eukaryotes and some prokaryotes, translation of the mRNA produced from target genes can be inhibited by RNA interference (RNAi). Short RNA molecules such as microRNA (miRNA) and small interfering RNA (siRNA) combine with proteins to form an RNA-induced silencing complex (RISC). NB: Revision guides and textbooks describe the small RNA molecules as double stranded, which can be confusing, so it is easier to start the process with miRNA and siRNA as single stranded. siRNA forms a complex with a protein that is an enzyme called RNA hydrolase. miRNA does not form a complex with RNA hydrolase but with another protein. Each of these RNA molecules can form a RISC with more than one protein, and the proteins involved do not need to be known for AQA. Each complex attaches to its target mRNA sequence and prevents translation in a different way. This is how it is done for each type of small RNA molecule:
- siRNA/miRNA in plants:
- The bases on the siRNA attach to the bases on the mRNA by complementary base pairing.
- RNA hydrolase hydrolyses the mRNA strand into fragments, which prevents translation because the whole polypeptide chain cannot be made.
NB: It is not necessary to know that the fragments are degraded in the processing body. If you want to learn this there is no harm.
- miRNA in mammals:
- The bases on the miRNA attach to the bases on the mRNA by complementary base pairing.
- Ribosomes are prevented from attaching to the mRNA strand stopping translation from occurring.
NB: Again here, it is not necessary to know that mRNA is degraded or stored in the processing body.
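The complementary base pairing step, and the hydrolysis that follows for siRNA, can be illustrated with a short Python sketch. The sequences are invented, the function names are hypothetical, and cutting in the middle of the paired region is a simplifying assumption made only so that the example produces two fragments.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Sequence that pairs with `rna` when the two strands run antiparallel."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_target_site(mrna: str, small_rna: str) -> int:
    """Index in the mRNA (5'->3') where the small RNA can base pair, or -1."""
    return mrna.find(reverse_complement(small_rna))


def hydrolyse(mrna: str, site: int, paired_length: int):
    """Cut the mRNA in the middle of the paired region (assumed cut position)."""
    cut = site + paired_length // 2
    return mrna[:cut], mrna[cut:]


mrna = "AUGGCUACGUUAGCCGAUAA"   # hypothetical mRNA, 5'->3'
sirna = "GCUAACGUAG"            # hypothetical siRNA, 5'->3'

site = find_target_site(mrna, sirna)
if site != -1:
    fragments = hydrolyse(mrna, site, len(sirna))
    print("target site at", site, "fragments:", fragments)
```

For the miRNA pathway in mammals, the same `find_target_site` step would apply, but instead of cutting the mRNA the bound complex would simply block ribosomes from attaching.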
Epigenetics involves heritable changes in gene function without changes to the DNA base sequence. These changes are caused by changes in the environment (for example, greater exposure to pollution) and can inhibit transcription by:
- Increased methylation of DNA: A methyl group (known as an epigenetic mark) attaches to a cytosine whose nucleotide is joined to a guanine nucleotide by a phosphodiester bond, that is, a cytosine next to a guanine on the same strand. NB: This may seem confusing, so look at the diagram below of one strand of DNA and notice which of the cytosine nucleotides the methyl group joins on to. Notice that the nucleotide on the far right of the strand and the third one from the left do not have a methyl group, as they are not next to a nucleotide with guanine as the base. Do not confuse this with the methyl group joining on to a cytosine that is complementary to guanine on the other strand; that is wrong. Also, the methyl group (CH3) does not change the base sequence, only the structure. Because the structure has changed, it becomes harder for enzymes to attach to the DNA, which stops the expression of the gene. If a tumour suppressor gene is not transcribed, this can cause cancer.
- Decreased acetylation of associated histones: An acetyl group (COCH3) is another epigenetic mark. It attaches to histone proteins and makes the chromatin (DNA wound around histone proteins) less condensed, so that gene expression can occur easily. The problem arises when histone deacetylase breaks the bond between the histone protein and the acetyl group. The DNA becomes highly condensed, making it hard for enzymes to carry out gene expression. NB: Histone deacetylase can be abbreviated to HDAC, but it is best to stay with the full name.
Fortunately, epigenetic changes to the DNA are reversible, so they are good targets for drugs that stop the effects of epigenetic changes. These drugs can either stop DNA methylation or inhibit histone deacetylase, allowing the acetyl groups to remain attached to the histones.
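The rule for which cytosines can carry a methyl group (a cytosine followed by a guanine on the same strand, not a cytosine paired with guanine on the opposite strand) is easy to check in code. The strand below is a made-up example and the function name is hypothetical.

```python
def methylatable_cytosines(strand_5to3: str) -> list:
    """Positions (0-based) of cytosines directly followed by guanine on this strand."""
    seq = strand_5to3.upper()
    return [i for i in range(len(seq) - 1) if seq[i] == "C" and seq[i + 1] == "G"]


strand = "ACGTCCGGACTC"  # hypothetical single strand, 5'->3'
sites = methylatable_cytosines(strand)
print(sites)
```

In this strand the cytosines at positions 4, 9, and 11 are skipped because the next base on the same strand is not guanine, even though each of them would pair with a guanine on the complementary strand.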
7.4 Regulation of Gene Expression
Each of your cells has at least 20,000 genes. In fact, all of your cells have the same genes. Do all of your cells make the same proteins? Nope! If they did, then all your cells would be alike (and you’d be a blob that couldn’t do anything except, well, be a blob). Instead, you have cells with different structures and functions. This is because different cells make different proteins. They do this by using, or expressing, different genes. Using a gene to make a protein is called gene expression.
Go ahead and be expressive! Your genes are!
How Gene Expression is Regulated
Gene expression is regulated to ensure that the correct proteins are made when and where they are needed. Regulation may occur at any point in the expression of a gene, from the start of transcription to the processing of a protein after translation. The focus in this lesson is the regulation of transcription. As shown in Figure below, transcription is controlled by regulatory proteins. The proteins bind to regions of DNA, called regulatory elements, which are located near promoters. After regulatory proteins bind to regulatory elements, they can interact with RNA polymerase, the enzyme that transcribes DNA to mRNA. Regulatory proteins are typically either activators or repressors.
- Activators promote transcription by enhancing the interaction of RNA polymerase with the promoter.
- Repressors prevent transcription by impeding the progress of RNA polymerase along the DNA strand.
Other factors may also be involved in the regulation of transcription, but these are typically the key players.
Regulation of Transcription. Regulatory proteins bind to regulatory elements to control transcription. The regulatory elements are embedded within the DNA.
Prokaryotic Gene Regulation
Transcription is regulated differently in prokaryotes and eukaryotes. In general, prokaryotic regulation is simpler than eukaryotic regulation.
The Role of Operons
Regulation of transcription in prokaryotes typically involves operons. An operon is a region of DNA that consists of one or more genes that encode the proteins needed for a specific function. The operon also includes a promoter and an operator. The operator is a region of the operon where regulatory proteins bind. It is located near the promoter and helps regulate transcription of the operon genes.
The Lac Operon
A well-known example of operon regulation involves the lac operon in E. coli bacteria (see Figure below and the video at the link below). The lac operon consists of a promoter, an operator, and three genes that encode the enzymes needed to digest lactose, the sugar found in milk. The lac operon is regulated by lactose in the environment. http://www.youtube.com/watch?v=oBwtxdI1zvk
- When lactose is absent, a repressor protein binds to the operator. The protein blocks the binding of RNA polymerase to the promoter. As a result, the lac genes are not expressed.
- When lactose is present, the repressor protein does not bind to the operator. This allows RNA polymerase to bind to the promoter and begin transcription. As a result, the lac genes are expressed, and lactose is digested.
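The two cases above amount to a simple on/off switch, which can be written as a tiny Python model. This is a deliberately minimal sketch with hypothetical function names: it covers only the lactose-dependent repressor described here and ignores any other influences on the operon.

```python
def repressor_on_operator(lactose_present: bool) -> bool:
    """The repressor sits on the operator only when lactose is absent."""
    return not lactose_present


def lac_genes_expressed(lactose_present: bool) -> bool:
    """RNA polymerase can bind the promoter only if the operator is free."""
    return not repressor_on_operator(lactose_present)


for lactose in (False, True):
    state = "expressed" if lac_genes_expressed(lactose) else "not expressed"
    print(f"lactose present: {lactose} -> lac genes {state}")
```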
Why might it be beneficial to express genes only when they are needed? (Hint: synthesizing proteins requires energy and materials.)
Eukaryotic Gene Regulation
In eukaryotic cells, the start of transcription is one of the most complicated parts of gene regulation. There may be many regulatory proteins and regulatory elements involved. Regulation may also involve enhancers. Enhancers are distant regions of DNA that can loop back to interact with a gene’s promoter.
The TATA Box
Different types of cells have unique patterns of regulatory elements that result in only the necessary genes being transcribed. That’s why a skin cell and nerve cell, for example, are so different from each other. However, some regulatory elements are common to most genes, regardless of the cells in which they occur. An example is the TATA box. This is a regulatory element that is part of the promoter of most eukaryotic genes. A number of regulatory proteins bind to the TATA box, forming a multi-protein complex. It is only when all of the appropriate proteins are bound to the TATA box that RNA polymerase recognizes the complex and binds to the promoter. Once RNA polymerase binds, transcription begins.
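Because the TATA box is a short, recognizable sequence, it can be located in a stretch of DNA with a pattern search. The sketch below assumes the commonly cited consensus TATA(A/T)A(A/T), which is not stated in this lesson, and uses an invented promoter sequence; real promoters vary, so treat this as an illustration only.

```python
import re

# Assumed consensus: T-A-T-A, then A or T, then A, then A or T.
TATA_CONSENSUS = re.compile(r"TATA[AT]A[AT]")


def find_tata_boxes(dna: str) -> list:
    """Return the 0-based start positions of consensus matches in the DNA."""
    return [match.start() for match in TATA_CONSENSUS.finditer(dna.upper())]


promoter = "GGCGCTATAAAAGGCTGCAGTC"  # hypothetical promoter region, 5'->3'
positions = find_tata_boxes(promoter)
print(positions)
```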
Regulation During Development
The regulation of gene expression is extremely important during the development of an organism. Regulatory proteins must turn on certain genes in particular cells at just the right time so the organism develops normal organs and organ systems. Homeobox genes are an example of genes that regulate development. They code for regulatory proteins that switch on whole series of major developmental genes. In insects, homeobox genes called hox genes ensure that body parts such as limbs develop in the correct place. Figure below shows how a mutation in a hox gene can affect an insect’s development.
Gene Expression and Cancer
The mutations that cause cancer generally occur in two types of regulatory genes: tumor-suppressor genes and proto-oncogenes (see Figure below). These genes produce regulatory proteins that control the cell cycle. When the genes mutate, cells with mutations divide rapidly and without limits.
- Gene transcription is controlled by regulatory proteins that bind to regulatory elements on DNA. The proteins usually either activate or repress transcription.
- Regulation of transcription in prokaryotes typically involves an operon, such as the lac operon in E. coli. The lac operon is regulated by proteins that behave differently depending on whether lactose is present.
- Regulation of transcription in eukaryotes is generally more complex. It involves unique regulatory elements in different cells as well as common regulatory elements such as the TATA box. Regulation is especially important during development. It may involve regulatory genes such as homeobox genes that switch other regulatory genes on or off. Mutations in regulatory genes that normally control the cell cycle can cause cancer.
Lesson Review Questions
1. What is gene expression?
2. Describe how regulatory proteins regulate gene expression.
3. Identify the TATA box and its function in transcription.
4. What is a homeobox gene?
5. Draw a diagram to show how the lac operon is regulated.
6. Sketch how an insect with a mutated hox gene might look. Explain your sketch.
7. Why is gene regulation especially important during development?
Points to Consider
Scientists know more about human chromosomes and genes than they know about the genetic material of most other species. In fact, scientists have identified all of the approximately 20,000-25,000 genes in human DNA.