Politics : Evolution

To: Brumar89 who wrote (39955)  8/3/2013 5:04:32 PM
From: 2MAR$
 
An extension of the coevolution theory of the origin of the genetic code
biologydirect.com

Background: The coevolution theory of the origin of the genetic code suggests that the genetic code is an imprint of the biosynthetic relationships between amino acids. However, this theory does not seem to attribute a role to the biosynthetic relationships between the earliest amino acids that evolved along the pathways of energetic metabolism. As a result, the coevolution theory is unable to clearly define the very earliest phases of genetic code origin. In order to remove this difficulty, I here suggest an extension of the coevolution theory that attributes a crucial role to the first amino acids that evolved along these biosynthetic pathways and to their biosynthetic relationships, even when defined by the non-amino acid molecules that are their precursors.

Results: It is re-observed that the first amino acids to evolve along these biosynthetic pathways are predominantly those codified by codons of the type GNN, and this observation is found to be statistically significant. Furthermore, the close biosynthetic relationships between the sibling amino acids Ala-Ser, Ser-Gly, Asp-Glu, and Ala-Val are not random in the genetic code table and reinforce the hypothesis that the biosynthetic relationships between these six amino acids played a crucial role in defining the very earliest phases of genetic code origin.

Conclusion: All this leads to the hypothesis that there existed a code, GNS, reflecting the biosynthetic relationships between these six amino acids which, as it defines the very earliest phases of genetic code origin, removes the main difficulty of the coevolution theory. Furthermore, it is here discussed how this code might have naturally led to the code codifying only for the domains of the codons of precursor amino acids, as predicted by the coevolution theory. Finally, the hypothesis here suggested also removes other problems of the coevolution theory, such as the existence of certain pairs of amino acids with an unclear biosynthetic relationship between the precursor and product amino acids, and the collocation of Ala between the amino acids Val and Leu belonging to the pyruvate biosynthetic family, which the coevolution theory considered as belonging to different biosyntheses.

Reviewers: This article was reviewed by Rob Knight, Paul Higgs (nominated by Laura Landweber), and Eugene Koonin.

Background
Why the genetic code originated
There are two completely different interpretations of why the genetic code might have originated. The first is obtained by means of an extreme interpretation of the stereochemical hypothesis of genetic code origin, which suggests that the genetic code originated because its organisation is somehow constrained by the stereochemical relationships between codons or anticodons and amino acids. This extreme interpretation seems totally absurd to me. The second interpretation that I am aware of has to do with the origin of peptidyl-tRNA, the key intermediate in the origin of protein synthesis.

Peptidyl-tRNA has no function per se, but in some models it has been assumed that the entire catalysis of the protocell was originally performed by this intermediate [ 1- 4]. Its origin might therefore have been determined by interactions between covalent complexes of peptide and RNA (peptide-RNAs) and these interactions might have constituted one of the most elementary forms of protein synthesis [ 3, 4]. This model shows that the interactions between peptide-RNAs must, at a certain evolutionary stage, have been directed by a template (pre-mRNA) which must have originally codified only the succession of interactions between peptide-RNAs [ 4]. This pre-mRNA is the most ancestral form of mRNA imaginable [ 4]. Finally, the evolution of these pre-mRNAs must have resulted in an mRNA codifying only for a limited number of amino acids [ 4]. This is the phase that defines the very origin of the genetic code. Clearly this is an historic interpretation of genetic code origin that is completely different from the deterministic one given by the stereochemical theory.

What is particularly important as far as this paper is concerned is that the evolution of these pre-mRNAs into mRNAs was characterised by a progressive refinement of the interactions of the peptide-RNAs on the pre-mRNA templates, and this refinement seems to have been made possible only when peptide-RNAs were transformed into amino acid-pre-tRNAs [ 4]. This is because only the residue-by-residue modification performed by the amino acid-pre-tRNAs on the evolving proteins might have led to the complete specification of their sequences, making possible the birth of an mRNA proper, but with codification limited to just a few amino acids [ 4]. As will become clear in the following, I maintain that these amino acid-pre-tRNAs came directly from the biosynthetic pathways of the first six amino acids evolving along the biosynthetic pathways of energetic metabolism and that they were the first amino acids to be codified on these still evolving mRNAs.

The biosynthetic relationships between amino acids are closely linked to the organisation of the genetic code
Ever since the genetic code was first deciphered, it has been observed that the biosynthetic relationships between amino acids are linked to the organisation of the genetic code. Indeed, Nirenberg et al. [ 5] acknowledged the existence of a relationship between amino acids of a similar biosynthetic origin and the codons specifying those amino acids. Although the examples of biosynthetic relationships reported by Nirenberg et al. [ 5] contain some inaccuracies, the authors were the first to suggest that the genetic code's evolutionary development might have been defined by the amino acids' biosyntheses. Jukes [ 6] also noted that some amino acids take part in the biosynthesis of other amino acids, such as serine, which plays a part in the biosynthesis of tryptophan. However, these seemed to be isolated and not totally clear observations, and Jukes [ 6] did not believe they could be generalised for the entire genetic code. Pelc [ 7] recognised that biosynthetic conversions between amino acids might have had an important role in defining the genetic code. However, it was Dillon [ 8] who, above all, suggested a metabolic model for the origin of the genetic code, although this author suggested amino acid biosyntheses that are only partly linked to those existing in living organisms. It was Wong [ 9] who fully recognised the importance, for the evolution of the genetic code, of the biosynthetic relationships between amino acids as they take place in actual organisms, suggesting what is now known as the coevolution theory of genetic code origin. This theory suggests that the genetic code is primarily an imprint of the biosynthetic pathways forming amino acids [ 9]. Consequently, the evolution of the genetic code could be clarified on the basis of the precursor-product relationships between amino acids in their biosyntheses [ 9].
In other words, this theory suggests that only a few amino acids (precursors) were initially codified in the genetic code; as other amino acids (products) developed from these, part of the codon domain of precursor amino acids was ceded to product amino acids [ 9]. Therefore, according to this theory, the genetic code might represent an evolutionary map of the biosynthetic relationships between amino acids [ 9].

While Wong [ 9] highlighted the precursor-product relationships between amino acids and their crucial role in defining the organisation of the genetic code, Miseta [ 10] clearly identified that the non-amino acid molecules that were precursors of amino acids might have played an important role in organising the genetic code. Miseta [ 10] suggested the idea of an intimate relationship between the intermediates of glucose degradation, as precursors of precursor amino acids, and the organisation of the genetic code. This observation was also analysed by Taylor and Coates [ 11], who showed the relationship between the glycolytic pathway, the citric acid cycle, the biosyntheses of amino acids and the genetic code (Fig. 1). In particular, they pointed out that (i) all the amino acids that are members of a biosynthetic family tend to have codons with the same first base (Fig. 1) and (ii) the five amino acids codified by GNN codons are found in four biosynthetic pathways close to or at the beginning of the pathway head (Fig. 1) [ 11]. More recently, Davis [ 12, 13] has provided evidence that tRNAs descending from a common ancestor were adaptors of amino acids synthesised from a common precursor, and he also discusses the biosynthetic families of amino acids, suggesting their importance in genetic code origin.

Figure 1. Biosynthetic relationships between amino acids, as defined by their biosyntheses and their relationships with the glycolytic pathway and the citric acid cycle. The figure was taken from Taylor and Coates [ 11] with a few modifications. The numbers indicate the biosynthetic steps. DAP = diaminopimelic pathway, aKG = alpha-ketoglutarate, OOA = oxalacetic acid, PEP = phosphoenolpyruvate, PGA = phosphoglycerate, R-P3 = 5-phosphoribosylpyrophosphate, Ru-5-P = ribulose-5-phosphate. The other abbreviations are standard.

However, there have also been authors who have suggested that some aspects of the biosynthetic relationships between amino acids were not important in genetic code origin [ 14, 15]. In particular, Ronneberg et al. [ 14] criticise the coevolution theory above all because some pairs of amino acids used by this theory do not seem to be in a clear precursor-product amino acid relationship, although, more generally, they recognise that amino acids in a biosynthetic relationship tend to have codons with the same first base [ 14]. Di Giulio [ 16] responded to the criticisms made by Ronneberg et al. [ 14] and, in particular, made numerous observations in favour of the coevolution theory. There has also been evidence indicating that the arrangement of the five families of amino acids, defined in accordance with a single amino acid precursor or a non-amino acid precursor, would be observed by chance in the genetic code with a probability of only 6 × 10⁻⁵ [ 17]. This indicates that the biosynthetic relationships between amino acids were fundamental in organising the genetic code.

Finally, if we consider that other works have been carried out on the importance of biosynthetic relationships between amino acids and the genetic code [ 18- 39], we come to the conclusion that there can no longer be any doubts on the hypothesis that the origin of the organisation of the genetic code was affected by the biosynthetic pathways of amino acids.

Results
The extended coevolution theory
In order to eliminate some criticisms of certain pairs of amino acids that are in an unclear precursor-product relationship [ 14, 16] and, above all, to provide a more complete description of the very earliest phases of genetic code origin, I have been forced to suggest the following theory. This theory, which can be called the 'extended coevolution theory' as it is simply an extension or a generalisation of Wong's coevolution theory [ 9], states that:

"The genetic code is simply an imprint of the biosynthetic relationships between amino acids, even when defined by the non-amino acid molecules that are the precursors of some amino acids, i.e. that the organisation of the genetic code must only reflect the biosynthetic proximity between amino acids in the various stages of evolution of their biosynthetic pathways. This happened because the ancestral biosynthetic pathways took place on tRNA-like molecules and thus enabled a coevolution between these pathways and the organisation of the genetic code through the concession of tRNA-like molecules between biosynthetically close amino acids, which made possible the transfer of codons from one amino acid to another, while mRNA evolved, with the consequence that amino acids with correlated biosyntheses have contiguous codons in the genetic code".

This theory, which in a contracted and informal form has already been suggested [ 16], can be tested and all the evidence in favour of the coevolution theory is also in favour of the extended coevolution theory. The key point on which the two theories disagree regards the predictions on the earliest phases of genetic code origin, which are not well defined for the coevolution theory [ 9, 40] while, for the extended coevolution theory their traces should be present in the biosynthetic relationships between amino acids that are precursors of other amino acids and the non-amino acid molecules that are precursors of precursor amino acids.

As shown in the following section, this main prediction of the extended coevolution theory seems to be corroborated by the observations.

The main prediction of the extended coevolution theory seems to be corroborated
According to the predictions of the coevolution theory, the codon concession mechanism between amino acids in a precursor-product relationship was based on tRNA-like molecules, on which the theory hypothesises that biosynthetic transformations between amino acids took place [ 9]. Surprisingly, this prediction is confirmed by the existence of molecular fossils [ 33] representing the vestiges of these pathways (Tab. 1) hypothesised by the coevolution theory [ 9, 19- 21]. Although, in accordance with the coevolution theory, these biosynthetic transformations took place only among the amino acids in a precursor-product relationship [ 9], there is no a priori reason why this should have taken place only between amino acids [ 28, 31]. The coevolution theory seems to imply that all metabolism took place at that time on tRNA-like molecules [ 28, 31] or, at least, that the entire metabolism of amino acids took place on these molecules. This view, i.e. that metabolism took place on tRNA-like molecules, has been hypothesised by other authors following arguments that might be totally different from those used here [ 41- 43].

Therefore, if the metabolism of amino acids took place on tRNA-like molecules when the genetic code originated, the structure of the genetic code must contain traces linking the very earliest phases of genetic code origin to the biosynthetic relationships between the first amino acids to enter the code and the non-amino acid molecules that were their precursors. This is because the very first amino acids that entered the genetic code and had non-amino acid molecules as their precursors, did so, as suggested by the extended coevolution theory, using the same mechanism employed by the pairs of amino acids in a precursor-product relationship, i.e. exploiting the hypothetical existence of the biosynthetic pathways on the tRNA-like molecules that triggered the origin of the genetic code. This is the main prediction of the extended coevolution theory and how it differentiates the latter from the coevolution theory.

Fig. 2 reports the biosynthetic relationships between amino acids that presumably first originated from the glycolytic pathway and Krebs' cycle. All these amino acids are, with the exception of Gly, directly linked to non-amino acid molecules that are their precursors. (Although the biosynthetic pathways leading to Phe and Tyr and to His are directly linked to a non-amino acid precursor (Fig. 1), they seem too complex for an early evolution because they have at least ten biosynthetic steps, and so these three amino acids would evidently not fall within this classification; see Appendix.) As suggested by the extended coevolution theory, this might indicate that the amino acids of Fig. 2 were the first to originate during the evolution of the biosynthetic pathways of amino acids. (Gly is the only one of these amino acids that is not directly linked to one of these non-amino acid molecules of the glucose degradation pathway (Figs. 1, 2). Although the synthesis of Gly from Ser is well documented [ 9, 44], the conversion of Gly to Ser also takes place normally [ 9, 45]. For example, Gly is converted to Ser by reacting with formate in the presence of pyridoxal phosphate [ 9, 45- 47]. This favours the hypothesis that these two amino acids, Ser and Gly, were inter-convertible when these pathways originated.)

Figure 2. Biosynthetic relationships between amino acids and their precursor non-amino acid molecules, as defined in a particular stage of the evolution of the biosynthetic pathways of amino acids. With the sole exception of proline, these are also the amino acids that first appear in a study on the temporal origin of the appearance of amino acids [ 54]. See Fig. 1 for further information.

If these were effectively the earliest amino acids to originate from non-amino acid precursors of the energetic metabolism pathways (Fig. 2) and if the main prediction of the extended coevolution theory is true, then all these amino acids (Fig. 2) should occupy a particular place within the genetic code table because they should be witnesses of the earliest phases of the evolution of the genetic code. Indeed, as other authors have observed [ 11], with the exception of Ser, all these amino acids (Fig. 2) are codified by codons of the GNN type. The distribution of these amino acids on these codons is not random: it would arise by pure chance with a probability of 3.9 × 10⁻⁴ (see Appendix).
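The GNN observation can be checked directly against the standard genetic code table. The following is only an illustrative sketch, not part of the original paper; the names `STANDARD_CODE`, `EARLY` and `amino_acids_with_first_base` are hypothetical helpers introduced here.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
# (base order U, C, A, G at every position; "*" marks a stop codon).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# The six amino acids of Fig. 2 in one-letter symbols:
# Ser, Gly, Ala, Val, Asp, Glu.
EARLY = set("SGAVDE")

def amino_acids_with_first_base(code, base):
    """Return the amino acids encoded by at least one codon starting with `base`."""
    return {aa for codon, aa in code.items() if codon[0] == base and aa != "*"}

gnn = amino_acids_with_first_base(STANDARD_CODE, "G")
print(sorted(gnn))          # ['A', 'D', 'E', 'G', 'V']
print(sorted(EARLY - gnn))  # ['S']
```

In the modern code the GNN codons specify exactly Ala, Asp, Glu, Gly and Val, so Ser is the only one of the six that falls outside them.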

Therefore, this observation that the first amino acids to evolve along the biosynthetic pathways are the same ones that are mostly codified by codons of the GNN type leads us to suppose, in compliance with the extended coevolution theory, that there existed a type of primitive genetic code (mRNA) that possessed only the codons of the type GNC (or GNG) and codified only for the amino acids Ala, Asp and Ser or Gly (or Ala, Glu and Ser or Gly) (Fig. 3), from which the GNS code codifying for Val, Ala, Asp, Glu, Ser and/or Gly (Fig. 3) might have evolved. This is suggested by exploiting the results of Ikehara et al. [ 48] who, for quite different reasons, suggested a genetic code origin that is, in some respects, similar.
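To make the codon patterns concrete, the sketch below expands the IUPAC-style patterns used in the text (N = any base, S = C or G) into explicit codons. It is an illustration only: `expand` and `modern_amino_acids` are hypothetical helper names, and the amino acids listed are the modern assignments of the standard code, not the hypothesised primitive ones (which, per the text, may have had Ser on codons now used by Gly).

```python
from itertools import product

# IUPAC nucleotide symbols needed for the patterns in the text (RNA alphabet).
IUPAC = {"U": "U", "C": "C", "A": "A", "G": "G",
         "S": "CG", "Y": "UC", "R": "AG", "N": "UCAG"}

# Standard genetic code, base order U, C, A, G; "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def expand(pattern):
    """Expand a three-symbol IUPAC pattern such as 'GNS' into explicit codons."""
    return ["".join(c) for c in product(*(IUPAC[s] for s in pattern))]

def modern_amino_acids(pattern):
    """Amino acids that the *modern* standard code assigns to the pattern's codons."""
    return sorted({STANDARD_CODE[c] for c in expand(pattern)} - {"*"})

for pattern in ("GNC", "GNG", "GNS", "SNS", "NNS"):
    print(pattern, len(expand(pattern)), "".join(modern_amino_acids(pattern)))
```

The patterns GNC (or GNG), GNS, SNS and NNS cover 4, 8, 16 and 32 codons respectively, which is one way to see each proposed stage as a doubling of the codon space.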

Figure 3. This shows three stages of genetic code evolution. All the abbreviations are standard. See text for discussion.

It should also be borne in mind that as these amino acids are the most abundant in the experiments of prebiotic synthesis and in meteorites [ 40] they had already attracted the attention of researchers. Indeed, Eigen et al. [ 49] had suggested a primitive code with codons of the GNY type, which is partly compatible with what is maintained here, partly because it might be derived from a GNC code (Fig. 3) [ 50].

Discussion
Some comments on the evolution of the genetic code, as suggested by the extended coevolution theory
The evolution of the genetic code as suggested here needs some discussion and clarification.

(i) Ser is not codified by any of the GNN codons whereas, on the basis of the considerations made here, it should be. However, the fact that Ser is biosynthetically inter-convertible with Gly [ 9, 44- 47] might indicate that Ser was codified by some or all the codons that today codify for Gly in the GNS and SNS codes (Fig. 3), and only with the NNS code (Fig. 4), i.e. when the codon domains of precursor amino acids were defined as predicted by the coevolution theory, did Ser cede some codons (GGS) to Gly (Fig. 4). This seems to be corroborated by the observation that, as Ser is also codified by AGY codons contiguous to the GGN codons of Gly, this might imply that the latter codons codified for Ser in a previous evolutionary stage.

Figure 4. This shows a stage of the evolution of the genetic code: the one in which the precursor amino acid codon domains are formed, as predicted by the coevolution theory [ 9]. See text for discussion.

From the evolutionary stage (shown in Fig. 4) of the genetic code on, the evolution of the code is fully described by the coevolution theory [ 9] (see Di Giulio and Medugno [ 35] for details on the entry times of amino acids into the genetic code).

(ii) The closer biosynthetic proximity between the pairs Ser-Ala, Ala-Val, Asp-Glu and Ser-Gly, as shown in Fig. 2, seems to find confirmation in the genetic code structure in that: (1) Ser-Ala and Ser-Gly have contiguous codons in the genetic code, i.e. they differ only in a single base, although Ser does not occupy the last row of the genetic code; (2) the pair Asp-Glu occupies the same box in the genetic code, i.e. their codons differ only in the third base, and these amino acids are the same ones that, at the evolutionary stage of the biosynthetic pathways indicated in Fig. 2, are more biosynthetically correlated; (3) the pair Ala-Val is part of the pyruvate biosynthetic family (Fig. 1) and their codons differ in only one base, a pyrimidine, even if these amino acids occupy the last row of the genetic code. All this seems to imply, in agreement with the extended coevolution theory, that amino acid pairs made siblings by a non-amino acid molecule, i.e. the pairs Ser-Ala, Ala-Val, Asp-Glu and Ser-Gly (Fig. 2), the last of which might be in a precursor-product relationship [ 9], were particularly important in the earliest phases of genetic code origin because their organisation within the genetic code would also seem to reflect the closer biosynthetic proximity of these pairs (Fig. 2).
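The contiguity claims in point (ii) can be illustrated by computing, for each pair, the smallest number of bases by which any codon of one amino acid differs from any codon of the other in the standard code. This is a minimal sketch under that definition of "contiguous"; `codons_of` and `closest_codons` are hypothetical helper names.

```python
from itertools import product

# Standard genetic code, base order U, C, A, G; "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def codons_of(aa):
    """All codons assigned to the one-letter amino acid `aa`."""
    return [codon for codon, a in STANDARD_CODE.items() if a == aa]

def closest_codons(aa1, aa2):
    """Return (minimum number of differing bases between any codon of aa1 and
    any codon of aa2, set of 1-based positions that differ in those closest pairs)."""
    best, positions = 4, set()
    for c1, c2 in product(codons_of(aa1), codons_of(aa2)):
        diff = [i + 1 for i in range(3) if c1[i] != c2[i]]
        if len(diff) < best:
            best, positions = len(diff), set(diff)
        elif len(diff) == best:
            positions.update(diff)
    return best, positions

for pair in (("S", "A"), ("S", "G"), ("D", "E"), ("A", "V")):
    print(pair, closest_codons(*pair))
```

All four pairs come out one base apart: Ser-Ala and Ser-Gly differ in the first position, Asp-Glu in the third, and Ala-Val in the second (C versus U, both pyrimidines).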

(iii) The here-maintained hypothesis that the amino acids that first evolved along the pathways of energetic metabolism (Fig. 2) formed the GNS code (Fig. 3) seems to rationalise why Asp and Glu are codified by GAN codons and not by ANN and CNN codons. Indeed, if the GAS codons had been attributed early on to Asp and Glu, they should have been both abundant on the first mRNAs and linked to them by a stronger historic constraint. Consequently, it would have been more difficult to concede them to product amino acids than the ANS and CNS codons making up the codon domain of Asp and Glu which instead must have been rare (see below) and also less historically constrained and, thus more easily transferable to the product amino acids, as seems to have happened. Therefore, this reasoning rationalises why Asp and Glu are codified by GAN codons and not ANN or CNN codons. Moreover, this strengthens the hypothesis of the existence of the GNS code for the very reason that Asp and Glu are codified by the GAN codons and not by some of those in ANN and CNN, as would have been more reasonable to expect considering the clearer biosynthetic relationship that Asp and Glu have with the product amino acids of their biosynthetic family compared to the less clear relationship they have with each other (Fig. 1). This should have resulted in a closer similarity between codons of Asp and Glu and codons of their product amino acids than with their own. The fact that this did not happen would seem to imply a very early involvement of GAN, or rather GNS, codons in genetic code origin because Asp and Glu are codified by these codons and not by those of the type ANN and CNN, as would instead be imposed by the clearer biosynthetic relationships with their product amino acids. In short, the codification of Asp and Glu by means of GAN codons might reflect the history of the very earliest phases of genetic code origin.

(iv) The evolution of mRNA as defined by the passage from the SNS (or GNS) code (Fig. 3) to the NNS code (Fig. 4) might have been highly facilitated if some codons were rarely used on mRNAs. In other words, let us suppose that, for instance, there evolved in the SNS code: one or very few ANS codons codifying for Asp; one or very few CNS codons codifying for Glu; one or very few UNS codons codifying for Ser. It can be seen that in this way all the precursor amino acid codon domains, i.e. the NNS code (Fig. 4), can be defined, paradoxically without all their codons actually being present. Indeed, it is sufficient for the first base of any one codon to be recognised, although read in triplets [ 51], in order to define the NNS code relatively fully. If the rarity of codons had been preserved in the evolutionary stages following the NNS code (Fig. 4), then an amino acid precursor might have easily ceded part of its codon domain to the product amino acid without generating considerable translation noise in this transfer of codons. Naturally, every passage between the codes GNC (or GNG), GNS, SNS and NNS (Figs. 3, 4) must have been characterised by the rarity of the types of codons because the system was evolving and, for instance, the majority of tRNA molecules had yet to evolve, i.e. there existed very few types of tRNA molecule. In other words, it would seem that it is the very evolution of the code that implies codon rarity, allowing a faster and more efficient evolution by means of the mechanism of the coevolution theory. This leads us to suppose that the SNS form of code might have only partly preceded the NNS form because it would take just one codon, for instance of the ANS type, to define an entire codon domain and, therefore, an entire evolutionary stage of the genetic code. In other words, the distinction between the evolutionary stages of the SNS and NNS codes might be less sharp than apparently shown in Figs. 3 and 4. Moreover, this indicates that the mRNA of the NNS code might have been much simpler than appears from Fig. 4.

(v) Exceptions to the "rule" of precursor amino acid codon domains seem to be the codons UUG (Leu) and AGG (Arg) (in white in Fig. 4), but also the codon AGC (Ser) although the latter might be derived from codons attributed to Gly, as suggested by Wong [ 9], but in any case outside the domain of Ser (Fig. 4). In other words, the codons UUR and AGR are the only exceptions observed in the precursor amino acid codon domains because they do not biosynthetically belong to the codon domain of the precursor in which they reside. However, while the codons UUR (Leu) might have been captured with a secondary mechanism by the codons in Ser's domain, for the AGR codons (Arg) there might exist a fascinating explanation. It is possible that the AGR codons of Arg derive from the codon domain of Asp and not from that of Glu, which is the natural precursor of Arg (Fig. 1) in that Asp intervenes in one of the terminal steps of the biosynthetic pathway of Arg [ 14, 16]. Therefore, for Arg, the CGN codons might derive from the codon domain of Glu via ornithine or citrulline [ 16], while the AGR codons might derive from the codon domain of Asp [ 14, 16]. This might therefore be an extremely interesting case of a double entry of an amino acid in the genetic code through two different amino acid precursors, something which has also been hypothesised for Ser [ 9]. This would provide a strong corroboration for the mechanism by which amino acids enter the genetic code, as suggested by the coevolution theory.

Finally, the CUS codons of Val (Leu) also apparently belong to the codon domain of Glu (Fig. 4). This might corroborate the hypothesis that these codons were ceded from Glu to Val. Indeed, the early phases of the evolution of NNS codes are characterised by codification limited to only six amino acids (Fig. 4) and therefore, the relative biosynthetic relationships might have made the amino acids Val and Glu biosynthetic siblings (Fig. 2). Although not entirely free of criticism, this viewpoint cannot be categorically excluded.

Nevertheless, there seems to be a much simpler interpretation provided by the SNS code (Fig. 3). Indeed, if in this evolutionary stage all the SUS codons codified for Val (Fig. 3) there would not have been any need for a real transfer of codons from Glu, but this might have only depended on the passage from the GNS to the SNS code provided that the SUS codons continued to codify for Val (Fig. 3).

Conclusion
The coevolution theory [ 9] does not give a complete description of genetic code origin as it seems not to consider that the biosynthetic pathways of the amino acids that first entered the genetic code were important in the earliest phases of the origin of the code itself [ 9, 40, 52]. With the extended coevolution theory, by contrast, it can be seen that there might have existed a GNC or a GNG code, but almost certainly a code of the GNS type, because the amino acids codified by these codons are in a clear biosynthetic relationship by means of their precursor non-amino acid molecules (Fig. 2) at the head of the amino acids' biosynthetic pathways and, therefore, must have characterised the earliest phases of genetic code origin.

The extended coevolution theory explains the existence, in the genetic code, of the pairs Phe-Tyr, Val-Leu and Thr-Met which are not in a clear biosynthetic relationship of precursor-product amino acids [ 14], by means of mere biosynthetic proximity. This is because, as the ancestral biosynthetic pathways take place on tRNA-like molecules, they enabled these biosynthetically close amino acids to have similar codons [ 16]. This cannot be achieved satisfactorily by the coevolution theory. For the sake of clarity and completeness, see also the comments already made on these amino acid pairs [ 16].

The coevolution theory [ 9] does not explain the presence of the codons of the amino acid pair Phe-Tyr inside Ser's codon domain (Fig. 4), whereas the extended coevolution theory explains its existence in this very domain through the mere biosynthetic proximity of the pathway leading to the synthesis of Phe and Tyr to that of Ser (Fig. 1).

Finally, the coevolution theory is unable to explain why Ala has codons contiguous to Val, even if it is clear that these two amino acids are biosynthetically correlated in that they are derived from pyruvate (Fig. 1). This theory even puts Ala and the Val-Leu pair in biosynthetically different domains [ 9, 40], which seems to be mistaken. The extended coevolution theory, on the other hand, explains the relationships between these amino acids derived from the same non-amino acid precursor with the hypothesis that their ancestral biosyntheses took place on correlated tRNA-like molecules that allowed these amino acids to have likewise correlated codons in the genetic code [ 16].

Appendix
It is necessary to calculate the probability with which the amino acids Ser, Gly, Ala, Val, Asp and Glu can be observed in the GNN codons of the genetic code while also taking into account the distribution of the amino acids in the non-GNN codons. Fisher's exact test seems to be able to calculate this probability. If we consider that, of these 6 amino acids, only Ser is not codified by GNN-type codons, we obtain for amino acids with non-amino acid precursors: (i) 5 of these are codified by GNN codons (= a), while (ii) only 1 (Ser) is codified by non-GNN codons (= b). For amino acids with amino acid precursors, we have: (i) 0 of these are codified by GNN codons (= c), and (ii) 14 of these are codified by non-GNN codons (= d). By applying Fisher's exact test we obtain a probability P = 3.9 × 10⁻⁴ (a = 5, b = 1, c = 0, d = 14), which is highly significant.
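As a worked check, and assuming the reported P is the standard hypergeometric probability of the observed 2×2 table with fixed margins (the text does not spell out the formula), the calculation for a = 5, b = 1, c = 0, d = 14 runs as follows. Because c = 0 is already the most extreme table compatible with the margins, this single-table probability also equals the one-sided P value.

```latex
\[
P \;=\; \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{n!\;a!\,b!\,c!\,d!},
\qquad n = a+b+c+d = 20
\]
\[
P \;=\; \frac{6!\,14!\,5!\,15!}{20!\;5!\,1!\,0!\,14!}
  \;=\; \frac{6!\,15!}{20!}
  \;=\; \frac{720}{1\,860\,480}
  \;\approx\; 3.9 \times 10^{-4}
\]
```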

However, it could be objected that Val is 4 biosynthetic steps away from pyruvate, while Gly is not directly linked to PGA (Fig. 2), and therefore that they might not fall within the class of amino acids that evolved early on. To answer these rather doubtful objections, certain checks can be carried out.

Eliminating Val and Gly because they might not have entered the genetic code early on from the biosynthetic pathways' point of view (Fig. 2), we have P = 0.0035 (a = 3, b = 1, c = 0, d = 16). Therefore, under this hypothesis too, which actually seems extremely restrictive, we obtain a highly significant probability. Eliminating only Val (because Gly might have evolved very early on through interconversion with Ser [ 9, 44- 47]) or eliminating only Gly because Val is derived directly from pyruvate in a number of biosynthetic steps that, in qualitative terms, evolved rapidly and are not even numerous, we obtain a P = 0.0010 (a = 4, b = 1, c = 0, d = 15) that is still highly significant. In conclusion, these amino acids (Fig. 2) seem to have correlated GNN codons because they evolved early on in the ancestral biosynthetic pathways.

Finally, if we consider that His and Phe-Tyr are also derived from non-amino acid precursors (Fig. 1), we obtain P = 0.0081 (a = 5, b = 4, c = 0, d = 11); if we remove Val or Gly we obtain P = 0.014 (a = 4, b = 4, c = 0, d = 12); whereas, if both Val and Gly are removed, we obtain P = 0.031 (a = 3, b = 4, c = 0, d = 13). These probabilities indicate that considering His and Phe-Tyr as amino acids deriving from non-amino acid precursors does not substantially alter the results of the statistical test.
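The six probabilities quoted in this Appendix can be recomputed with a few lines of Python. This is a sketch that assumes each reported P is the hypergeometric probability of the stated 2×2 table (which, with c = 0 in every case, coincides with the one-sided Fisher P value); the function name and the table labels are hypothetical.

```python
from math import comb

def fisher_point_probability(a, b, c, d):
    """Hypergeometric probability of the 2x2 table [[a, b], [c, d]] with fixed margins."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

# (a, b, c, d) exactly as given in the Appendix; the labels are informal descriptions.
TABLES = {
    "six early amino acids":            (5, 1, 0, 14),
    "without Val and Gly":              (3, 1, 0, 16),
    "without Val or without Gly":       (4, 1, 0, 15),
    "plus His, Phe, Tyr":               (5, 4, 0, 11),
    "plus His, Phe, Tyr; no Val or Gly": (4, 4, 0, 12),
    "plus His, Phe, Tyr; no Val, Gly":  (3, 4, 0, 13),
}

results = {label: fisher_point_probability(*t) for label, t in TABLES.items()}
for label, p in results.items():
    print(f"{label:36s} P = {p:.2g}")
```

Under this assumption the computed values agree with the figures reported in the text to two significant digits.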