Definition of Transcription and Its Process: The Complete Guide 2021

The process of copying genetic information from one strand of the DNA into RNA is termed transcription. Here, too, the principle of complementarity governs the process, except that adenine now pairs with uracil instead of thymine. However, unlike replication, in which the total DNA of an organism is duplicated once the process sets in, transcription copies only a segment of DNA, and only one of its strands, into RNA. This necessitates defining the boundaries that demarcate the region and the strand of DNA to be transcribed.

Why both strands are not copied during transcription has a simple answer. First, if both strands acted as templates, they would code for RNA molecules with different sequences (remember, complementarity does not mean identity), and, if these in turn coded for proteins, the amino acid sequences of the proteins would differ. One segment of DNA would then code for two different proteins, which would complicate the machinery of genetic information transfer. Second, two RNA molecules produced simultaneously would be complementary to each other and hence would form double-stranded RNA. This would prevent the RNA from being translated into protein, and the exercise of transcription would become futile.
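
Both points can be checked on a toy sequence. The minimal Python sketch below uses a made-up six-base duplex and a hypothetical helper named transcribe (neither comes from a real gene or library); it only illustrates that the two strands would give different RNAs and that those RNAs would pair with each other.

```python
# Pairing rule used during transcription: base on the DNA template -> base in the RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Pairing rule between two RNA strands.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (5'->3') copied from a template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)


# Made-up duplex, for illustration only.
strand_one_3to5 = "TACGGA"               # one strand, written 3'->5'
strand_two_5to3 = "ATGCCT"               # its partner, written 5'->3'
strand_two_3to5 = strand_two_5to3[::-1]  # the same partner, written 3'->5'

rna_one = transcribe(strand_one_3to5)    # 5'->3'
rna_two = transcribe(strand_two_3to5)    # 5'->3'

# Align the two RNAs antiparallel and test every position for pairing.
pairs_up = all(RNA_PAIR[a] == b for a, b in zip(rna_one, rna_two[::-1]))

print(rna_one)             # AUGCCU
print(rna_two)             # AGGCAU
print(rna_one != rna_two)  # True: the two RNAs differ in sequence
print(pairs_up)            # True: they would form double-stranded RNA
```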

Transcription unit

A transcription unit in DNA is defined primarily by three regions in the DNA:

  1. A promoter
  2. The structural gene
  3. A terminator

There is a convention for naming the two strands of DNA in the structural gene of a transcription unit. Since the two strands have opposite polarity and the DNA-dependent RNA polymerase catalyses polymerisation in only one direction, that is, 5′ to 3′, the strand with the polarity 3′ to 5′ acts as the template and is referred to as the template strand. The other strand, which has the polarity 5′ to 3′ and the same sequence as the RNA (except for thymine in place of uracil), is displaced during transcription. Strangely, this strand (which does not code for anything) is referred to as the coding strand. All reference points in defining a transcription unit are made with respect to the coding strand. To explain the point, a hypothetical sequence from a transcription unit is represented below:

3′-ATGCATGCATGCATGCATGCATGC-5′ template strand
5′-TACGTACGTACGTACGTACGTACG-3′ coding strand
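
As a worked example, the coding strand and the RNA can be derived from the hypothetical template strand shown above. The following is a minimal Python sketch; the variable and dictionary names are illustrative choices and not part of any library.

```python
# Pairing rules read off the template strand, base by base.
TEMPLATE_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}  # gives the coding strand
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}  # gives the transcript

# Hypothetical template strand from the text, written 3'->5'.
template_3to5 = "ATGCATGCATGCATGCATGCATGC"

# Position by position, the partner of a 3'->5' strand reads 5'->3'.
coding_5to3 = "".join(TEMPLATE_TO_DNA[base] for base in template_3to5)
rna_5to3 = "".join(TEMPLATE_TO_RNA[base] for base in template_3to5)

print("coding strand 5'-" + coding_5to3 + "-3'")
print("RNA           5'-" + rna_5to3 + "-3'")

# The RNA matches the coding strand once every T is read as U.
print(rna_5to3 == coding_5to3.replace("T", "U"))  # True
```

The last line makes the naming convention concrete: the coding strand is not copied, yet it carries the same sequence as the RNA apart from thymine in place of uracil.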


The promoter and terminator flank the structural gene in a transcription unit. The promoter is said to be located towards the 5′-end (upstream) of the structural gene (the reference is made with respect to the polarity of the coding strand). It is a DNA sequence that provides a binding site for RNA polymerase, and it is the presence of a promoter in a transcription unit that also defines the template and coding strands. By switching its position with the terminator, the definition of coding and template strands could be reversed. The terminator is located towards the 3′-end (downstream) of the coding strand, and it usually defines the end of the process of transcription. There are additional regulatory sequences that may be present further upstream or downstream of the promoter. Some of the properties of these sequences shall be discussed while dealing with regulation of gene expression.
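
How the position of the promoter decides which strand is the template can be sketched in Python. The duplex, the promoter_side argument and the function name assign_strands are all hypothetical and chosen only for illustration; a real promoter is recognised by its sequence, which this sketch does not attempt to model.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def assign_strands(top_5to3: str, promoter_side: str) -> dict:
    """Label the strands of a duplex according to where the promoter lies.

    top_5to3      -- upper strand of the duplex, written 5'->3' left to right
    promoter_side -- "left" or "right" end of the duplex as drawn
    The coding strand is returned 5'->3' and the template strand 3'->5'.
    """
    bottom_3to5 = "".join(PAIR[base] for base in top_5to3)  # left to right
    if promoter_side == "left":
        # The promoter sits at the 5' end of the top strand.
        return {"coding": top_5to3, "template": bottom_3to5}
    if promoter_side == "right":
        # The promoter sits at the 5' end of the bottom strand,
        # so both strands are re-read from right to left.
        return {"coding": bottom_3to5[::-1], "template": top_5to3[::-1]}
    raise ValueError("promoter_side must be 'left' or 'right'")


duplex_top = "ATGGCC"  # made-up sequence
left = assign_strands(duplex_top, "left")
right = assign_strands(duplex_top, "right")
print(left)   # {'coding': 'ATGGCC', 'template': 'TACCGG'}
print(right)  # {'coding': 'GGCCAT', 'template': 'CCGGTA'}
```

Moving the promoter to the other end swaps the roles of the two physical strands, which is the reversal described in the paragraph above.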

Transcription Unit and the Gene

A gene is defined as the functional unit of inheritance. Though there is no ambiguity that genes are located on the DNA, it is difficult to define a gene literally in terms of DNA sequence. A DNA sequence coding for a tRNA or rRNA molecule also defines a gene. However, by defining a cistron as a segment of DNA coding for a polypeptide, the structural gene in a transcription unit can be described as monocistronic (mostly in eukaryotes) or polycistronic (mostly in bacteria or prokaryotes). In eukaryotes, the monocistronic structural genes have interrupted coding sequences; that is, the genes in eukaryotes are split. The coding or expressed sequences are defined as exons. Exons are those sequences that appear in mature or processed RNA. The exons are interrupted by introns. Introns, or intervening sequences, do not appear in mature or processed RNA. The split-gene arrangement further complicates the definition of a gene in terms of a DNA segment.

Inheritance of a character is also affected by the promoter and regulatory sequences of a structural gene. Hence, the regulatory sequences are sometimes loosely defined as regulatory genes, even though these sequences do not code for any RNA or protein.

Types of RNA and the process of Transcription

In bacteria, there are three major types of RNA: mRNA (messenger RNA), tRNA (transfer RNA), and rRNA (ribosomal RNA). All three are needed to synthesise a protein in a cell. The mRNA provides the template, tRNA brings amino acids and reads the genetic code, and rRNAs play structural and catalytic roles during translation. A single DNA-dependent RNA polymerase catalyses the transcription of all types of RNA in bacteria. RNA polymerase binds to the promoter and initiates transcription (initiation). It uses nucleoside triphosphates as substrates and polymerises in a template-dependent fashion, following the rule of complementarity. It also facilitates the opening of the helix and continues elongation. Only a short stretch of RNA remains bound to the enzyme. Once the polymerase reaches the terminator region, the nascent RNA falls off, and so does the RNA polymerase. This results in termination of transcription.

An intriguing question is how the RNA polymerase is able to catalyse all three steps: initiation, elongation and termination. The RNA polymerase on its own is capable of catalysing only the process of elongation. It associates transiently with an initiation factor (σ) and a termination factor (ρ) to initiate and terminate transcription, respectively. Association with these factors alters the specificity of the RNA polymerase so that it either initiates or terminates.

In bacteria, since the mRNA does not require any processing to become active, and since transcription and translation take place in the same compartment (there is no separation of cytosol and nucleus in bacteria), translation can often begin well before the mRNA is fully transcribed. Consequently, transcription and translation can be coupled in bacteria.

In eukaryotes, there are two additional complexities:

(i) There are at least three RNA polymerases in the nucleus (in addition to the RNA polymerase found in the organelles). There is a clear-cut division of labour. RNA polymerase I transcribes rRNAs (28S, 18S, and 5.8S), whereas RNA polymerase III is responsible for the transcription of tRNA, 5S rRNA, and snRNAs (small nuclear RNAs). RNA polymerase II transcribes the precursor of mRNA, the heterogeneous nuclear RNA (hnRNA).

(ii) The second complexity is that the primary transcripts contain both the exons and the introns and are non-functional. Hence, they are subjected to a process called splicing, in which the introns are removed and the exons are joined in a defined order. hnRNA undergoes additional processing called capping and tailing. In capping, an unusual nucleotide (methyl guanosine triphosphate) is added to the 5′-end of hnRNA. In tailing, adenylate residues (200-300) are added at the 3′-end in a template-independent manner. It is the fully processed hnRNA, now called mRNA, that is transported out of the nucleus for translation. The significance of such complexities is now beginning to be understood. The split-gene arrangement probably represents an ancient feature of the genome. The presence of introns is reminiscent of antiquity, and the process of splicing represents the dominance of the RNA world. In recent times, the understanding of RNA and RNA-dependent processes in living systems has assumed more importance.
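
The three processing steps described above (splicing, capping and tailing) can be pictured with a toy model in Python. Everything specific here is an assumption made for illustration: the function name process_hnrna, the made-up transcript, the exon coordinates, and the text label "m7G" that stands in for the methylated guanosine cap. Real splicing recognises signals in the RNA itself rather than a supplied list of coordinates.

```python
def process_hnrna(hnrna: str, exons: list[tuple[int, int]], tail_length: int = 200) -> str:
    """Toy model of splicing, capping and tailing of a primary transcript.

    hnrna       -- primary transcript, written 5'->3'
    exons       -- (start, end) slices of the exons, in order; the rest is intron
    tail_length -- number of adenylate residues added at the 3' end
    """
    # Splicing: keep the exons in their original order and drop the introns.
    spliced = "".join(hnrna[start:end] for start, end in exons)
    # Capping: "m7G" is only a label for the cap, not a sequence character.
    # Tailing: the A residues are appended without reference to any template.
    return "m7G-" + spliced + "A" * tail_length


# Made-up transcript: exons in upper case, introns in lower case for readability.
primary = "AUGGCC" + "guaaguuuag" + "UUUGGA" + "guccag" + "CCCUAA"
exon_slices = [(0, 6), (16, 22), (28, 34)]

# A short tail of 10 residues keeps the printed output readable.
mrna = process_hnrna(primary, exon_slices, tail_length=10)
print(mrna)
```

The exons are joined in the order in which they occur in the primary transcript, matching the "defined order" mentioned above, and the tail is added as a plain repeat because no template is involved.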

Genetic Code

During replication and transcription, a nucleic acid is copied to form another nucleic acid. Hence, these processes are easy to conceptualise on the basis of complementarity. The process of translation requires the transfer of genetic information from a polymer of nucleotides to a polymer of amino acids. No complementarity exists between nucleotides and amino acids, nor could any be drawn theoretically. There was ample evidence, though, to support the notion that changes in nucleic acids (the genetic material) were responsible for changes in the amino acids of proteins. This led to the proposition of a genetic code that could direct the sequence of amino acids during the synthesis of proteins.

If determining the biochemical nature of the genetic material and the structure of DNA was very exciting, the proposition and deciphering of the genetic code were most challenging. In a very true sense, it required the involvement of scientists from several disciplines: physicists, organic chemists, biochemists and geneticists. It was George Gamow, a physicist, who argued that since there are only 4 bases, and they have to code for 20 amino acids, the code should be made up of a combination of bases. He suggested that in order to code for all 20 amino acids, the code should be made up of three nucleotides. This was a very bold proposition, because a permutation combination of 4³ (4 × 4 × 4) would generate 64 codons, many more than required.
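
The arithmetic behind Gamow's argument is easy to reproduce. The short Python sketch below (the constant and variable names are illustrative) counts the combinations available for codons of one, two and three bases and finds the smallest length that covers 20 amino acids.

```python
BASES = 4         # A, U, G and C
AMINO_ACIDS = 20  # amino acids to be coded for

for length in (1, 2, 3):
    combinations = BASES ** length
    verdict = "enough" if combinations >= AMINO_ACIDS else "too few"
    print(f"{length} base(s) per codon: {combinations} combinations, {verdict}")

# Smallest codon length whose combinations cover all the amino acids.
min_length = next(n for n in range(1, 10) if BASES ** n >= AMINO_ACIDS)
print(min_length)  # 3
```

One base gives 4 combinations and two bases give 16, both short of 20; three bases give 64, which is why a triplet is the minimum and why it leaves codons to spare.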

Providing proof that the codon was a triplet was a more daunting task. The chemical method developed by Har Gobind Khorana was instrumental in synthesising RNA molecules with defined combinations of bases (homopolymers and copolymers). Marshall Nirenberg's cell-free system for protein synthesis finally helped the code to be deciphered. Severo Ochoa's enzyme (polynucleotide phosphorylase) was also helpful in polymerising RNA with defined sequences in a template-independent manner (enzymatic synthesis of RNA).
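
Why RNAs of defined sequence are so useful for testing a triplet code can be seen by splitting such polymers into words. The Python sketch below is only an illustration: the helper name codons and the two example polymers are assumptions, and no amino acid assignments are modelled.

```python
def codons(rna: str, size: int = 3) -> list[str]:
    """Split an RNA into consecutive, non-overlapping words of `size` bases."""
    usable = len(rna) - len(rna) % size  # ignore an incomplete word at the end
    return [rna[i:i + size] for i in range(0, usable, size)]


homopolymer = "U" * 12  # a single base repeated
copolymer = "UC" * 6    # two bases in strict alternation

print(codons(homopolymer))              # ['UUU', 'UUU', 'UUU', 'UUU']
print(codons(copolymer))                # ['UCU', 'CUC', 'UCU', 'CUC']
print(sorted(set(codons(copolymer))))   # ['CUC', 'UCU']
print(set(codons(copolymer, size=2)))   # {'UC'}
```

Read in triplets, the alternating copolymer contains two different codons in strict alternation, whereas a doublet reading would contain only one repeated word. A defined-sequence polymer therefore makes the two hypotheses predict different products, which is the kind of distinction a cell-free protein synthesis system can test.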
