Volume 1, Issue 2, 2007    
  Development of a Classification Scheme Using a Secondary and Tertiary Amino Acid Analysis of Azoreductase Gene    

K. J. Abraham, Langston University, Kjabraham@lunet.edu
G. H. John, Oklahoma State University, gilbert.john10@okstate.edu



Azo dyes are metabolized to colorless aromatic amines by azoreductases. Genes coding for azoreductase enzymes have been cloned and characterized in a number of bacteria. Primary amino acid sequence analysis of several azoreductase genes showed low identity, making classification difficult. We have made the first attempt to classify azoreductase genes based on secondary and tertiary structure. A web-based program, Deep View/Swiss-Pdb Viewer, was used in this study to predict secondary and tertiary structure from the amino acid sequence and to detect structural similarities and differences between species. Azoreductases from six bacterial species were analyzed for secondary and tertiary protein structures. The azoreductase from Enterococcus faecalis was found to be structurally distinct from the others. The others showed very similar 3D images, indicating that these azoreductase enzymes belong to the same family.


Azoreductase enzymes are present in different species of intestinal bacteria and are involved in the generation of toxic metabolites, which can contribute to the development of diseases such as cancer (Rafii et al., 1997). Azoreductase enzymes catalyze the reductive cleavage of azo (N=N) linkages to produce aromatic amines that are potentially carcinogenic (Platzek et al., 1999). Azo dyes are a predominant class of colorants used in cosmetics, foods and consumer products.

The ability to extract structural information from sequences has become increasingly important for bioinformatics research. Some proteins share structural similarities with other proteins, and these similarities may help identify a common evolutionary origin. Knowledge of these relationships among the azoreductases from different bacterial strains is central to the understanding and classification of azoreductases.

The simplest methods for clustering proteins into families rely on sequence-similarity measures, such as those obtained by BLAST (Altschul et al., 1990). Because experimental evidence about individual proteins is difficult to obtain, a common strategy is to classify proteins into families on the basis of shared features or by clustering using some similarity measure. For example, a loose classification exists that includes a Type I family of azoreductases, which are NADH dependent, and a Type II family, which are NADPH dependent. The underlying assumption is that members of the same family may possess similar or identical biochemical functions (Hegyi and Gerstein, 1999) and that one can assign the functions of well-characterized members of a family to other members whose functions are not known or not well understood (Heger and Holm, 2000). A number of tools are available to study the structure and alignment of proteins. In favorable cases, comparing 3D structures may reveal biologically interesting similarities that are not detectable by comparing primary sequences.
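The sequence-similarity measure underlying such clustering can be illustrated with a minimal Python sketch of percent identity between two pre-aligned sequences. The function name, the toy sequences, and the choice to ignore gapped columns are assumptions made for illustration; alignment programs differ in how they define the denominator, and this is not the procedure used by BLAST or by the software in this study.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical toy alignment, not real azoreductase sequences.
seq_a = "MKLV-AGHT"
seq_b = "MKIVQAG-T"
print(round(percent_identity(seq_a, seq_b), 1))
```

Here seven columns are gap-free and six of them match, so the toy pair scores about 86% identity.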

The application of secondary structure prediction may aid in classifying proteins, and assist in separating domains and identifying particular functional motifs. Therefore, classifying proteins based on secondary structure is promising. Recent years have seen a number of spectacular discoveries of surprisingly similar structures in proteins whose evolutionary kinship cannot be recognized from primary sequence analysis alone (Gibrat et al., 1996). Secondary structures, because they allow a simple and intuitive description of 3D structures, are widely employed in a number of structural studies.

Based on the current literature regarding azoreductase activity and amino acid sequence data, it appears that different azoreductases exist in different bacteria. More amino acid sequence data for azoreductases in bacteria will be required to determine the evolution of the enzyme. This work aimed to advance the study of azoreductases by comparing genetic information from existing azoreductase genes and developing a classification scheme using 3D imaging, which will help support the evolutionary relationships between the different bacteria.


The following microorganisms and their azoreductases were analyzed (GenBank accession numbers are also listed): Rhodobacter sphaeroides (AY150311), Bacillus subtilis (AB071366), B. anthracis (AE016879), B. stearothermophilus (AB071367), Bacillus sp. (AB032601) and Enterococcus faecalis (AY422207). A comprehensive primary analysis of the nucleotide and amino acid sequences was performed using programs from the National Center for Biotechnology Information (NCBI). To classify the existing azoreductase genes, a phylogenetic analysis was performed using the primary amino acid sequence (DNASIS, 2.5). A dendrogram showing the branching order was generated.                
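How a branching order can be derived from pairwise sequence distances is sketched below using average-linkage (UPGMA-style) clustering in plain Python. This is an illustrative assumption: the text does not state which clustering method DNASIS applies, and the labels and distances are hypothetical values, not data from this study.

```python
from itertools import combinations


def upgma(labels, dist):
    """Average-linkage clustering.

    labels: list of leaf names.
    dist: dict mapping frozenset({a, b}) -> distance between leaves a and b.
    Returns a nested tuple that records the order in which clusters merged.
    """
    # Map each cluster (a leaf name or a nested tuple) to its member leaves.
    clusters = {name: [name] for name in labels}

    def average_distance(c1, c2):
        pairs = [(x, y) for x in clusters[c1] for y in clusters[c2]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average leaf-to-leaf distance.
        a, b = min(combinations(clusters, 2), key=lambda p: average_distance(*p))
        merged = clusters.pop(a) + clusters.pop(b)
        clusters[(a, b)] = merged
    return next(iter(clusters))


# Hypothetical distances (for example, 100 minus percent identity).
labels = ["A", "B", "C", "D"]
raw = {
    ("A", "B"): 10.0, ("A", "C"): 50.0, ("B", "C"): 52.0,
    ("A", "D"): 95.0, ("B", "D"): 94.0, ("C", "D"): 96.0,
}
dist = {frozenset(pair): d for pair, d in raw.items()}
tree = upgma(labels, dist)
print(tree)
```

In this toy input, A and B join first, C joins them next, and D remains the most distant branch, which is the kind of grouping a dendrogram displays.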

Since the initial primary sequence homology search with existing azoreductases showed very low identity (44.6% and 5%), the secondary and tertiary protein structures were analyzed using computer software programs from the ExPASy Molecular Biology Server (http://us.expasy.org/) and the Deep View Swiss-Pdb Viewer software (Deep View-Spdv 3.7). The Deep View Swiss-Pdb Viewer (Guex and Peitsch, 1997) was used to analyze several proteins at the same time and to thread the protein primary sequence onto a 3D template. The secondary structures of the existing azoreductase genes were then superimposed and compared. Azoreductase activity is based on the reduction in azo dye absorbance and is expressed in units of specific activity (mmole/min/mg enzyme).
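One common way to quantify how closely two superimposed structures agree is the root-mean-square deviation (RMSD) between matched atoms. The sketch below is a minimal illustration, assuming the two coordinate sets have already been superimposed and that atoms are paired one to one (for example, alpha carbons). The function name and coordinates are hypothetical, and this is not presented as the calculation performed inside Deep View.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) coordinates.

    Assumes the structures are already superimposed and atoms are paired
    by index; no fitting or rotation is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates in angstroms, not taken from any real structure.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
print(rmsd(structure_a, structure_b))
```

A lower RMSD indicates a closer fit; identical coordinate sets give zero.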


The DNASIS dendrogram showed three different categories of azoreductases (Fig. 1).


Fig. 1.  Dendrogram; BACANT~1= Bacillus  anthracis; RHOSPH= Rhodobacter sphaeroides; BACSTE= B. stearothermophilus; BACSUB= B. subtilis; BACSP= Bacillus sp; ENTFAE= Enterococcus faecalis.     

The primary analysis of the amino acid sequences showed 45% to 52% identity (data not shown).

Fig. 2. Figure showing superimposed images of amino acid sequences generating a tertiary structure (3D) of Bacillus subtilis (yellow) and Rhodobacter sphaeroides (blue).

Fig. 3. Figure showing superimposed images of amino acid sequences generating a tertiary structure (3D) of Bacillus subtilis (yellow) and Bacillus stearothermophilus (blue).

Fig. 4. Figure showing superimposed images of amino acid sequences generating a tertiary structure (3D) of B. subtilis (red) and Enterococcus faecalis (blue).

To classify different azoreductases using 3D imaging, predicted structures of the different azoreductase genes (Rhodobacter sphaeroides, Bacillus subtilis, B. anthracis, B. stearothermophilus, Bacillus sp., and Enterococcus faecalis) were generated using Deep View Swiss-Pdb Viewer. The comparison of superimposed tertiary structures (3D) of azoreductase genes from Rhodobacter sphaeroides, Bacillus subtilis, B. anthracis, B. stearothermophilus and Bacillus sp. showed a close relationship and a high degree of similarity between the azoreductases (Figs. 2 and 3). E. faecalis had its own unique 3D structure (Fig. 4), as seen when its structure was superimposed with that of B. subtilis. Based on the 3D imaging results, the azoreductases that showed 44.6% identity were put into one classification, while the single azoreductase that showed 5% identity remained a separate classification.


We have made the first effort to classify azoreductase enzymes based on secondary and tertiary structures. With the available genetic data, the nucleotide sequence alignment for different azoreductases showed very low identity. In addition, the dendrogram results showed low identity between different groups of azoreductases, suggesting that the azoreductases may not have similar structures. It is reported that all naturally evolved proteins with more than 35% pairwise identical residues have similar structures (Rost, 1999). Based on the conserved protein folds such as pleated sheets, helices, turns, and random coil in the tertiary structures from the six different bacteria, the azoreductases from Rhodobacter sphaeroides, Bacillus subtilis, B. anthracis, B. stearothermophilus, and Bacillus sp. are found to be closely related and can be classified under one family, with the Enterococcus faecalis azoreductase as another family. Protein families have been classified based on predicted and observed secondary structure (Gerstein and Levitt, 1997; Przytycka et al., 1999). In addition, 2D and 3D structures enable enzyme comparisons and analysis of relationships between different proteins that would otherwise be difficult using primary amino acid comparisons (Levin et al., 1993). The results are interesting because the azoreductases have different degrees of identity (100% to 5%), providing three classification groups based on primary sequences, but show two classification groups based on 3D structures. The function of a protein is determined by its 3D structure (Branden and Tooze, 1999), as shown with fibroblast growth factors (FGF-1 and FGF-2). Although there is only a 55% sequence identity between the fibroblast growth factors, their three-dimensional structures are nearly superimposable, which corresponds with similar function (Zhu et al., 1991).
The 3D structures can also suggest different protein families, as shown with the azoreductases from Xenophilus azovorans KF46 and Pigmentiphaga kullae K24 (Blumel et al., 2002; Blumel and Stolz, 2003).
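The 35% identity rule cited from Rost (1999) can be written as a simple check and applied to the two identity values reported in this study (44.6% and 5%). The sketch below is only an illustration of that reasoning; the function and constant names are hypothetical, and a value below the threshold means only that structural similarity cannot be inferred from sequence alone.

```python
# Threshold from Rost (1999) as cited in the text: above 35% pairwise
# identity, naturally evolved proteins are reported to have similar structures.
SIMILAR_STRUCTURE_THRESHOLD = 35.0


def similar_structure_expected(percent_identity: float,
                               threshold: float = SIMILAR_STRUCTURE_THRESHOLD) -> bool:
    """Return True when sequence identity alone suggests similar structures."""
    return percent_identity > threshold


# Identity values reported in this study.
for identity in (44.6, 5.0):
    verdict = "expected" if similar_structure_expected(identity) else "not implied by sequence alone"
    print(f"{identity}% identity: similar structure {verdict}")
```

Under this rule the 44.6% group would be expected to share a fold, while the 5% value for E. faecalis gives no such expectation, which matches the two groups seen in the 3D comparison.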

Therefore, azoreductases sharing a high degree of tertiary structure may have a similar function. For example, B. subtilis and Bacillus sp. have 44.5% identity but share similar, low methyl red dye activity, 0.9 and 3.24 units (pure azoreductase reaction mixtures are analyzed with dyes which denature azoreductase enzymes). The 3D imaging results likewise put the two bacteria into one classification. E. faecalis does not show any similarity with the other five bacterial strains in either the primary sequence (5%) or the tertiary structure, but its activity towards methyl red is high (10 units). The 3D imaging enables us to classify two groups of azoreductases. The results lead us to conclude that the azoreductase from E. faecalis may belong to a different family. They also show that azoreductases differ in primary and tertiary structure and have both specific and broad substrate specificities, which makes the enzyme difficult to classify. The 3D imaging approach has the potential to allow a more organized approach to classification as well as structure and function analyses for azoreductases.

Inclusion of more azoreductases and 3D imaging will show whether this approach can assist in classifying the function of a diverse group of azoreductases from human intestinal anaerobic bacteria, which in turn can be applied to human health. From the evolutionary point of view, the dendrogram suggests that the azoreductase genes from the five different bacterial species are related and may have a common ancestral gene. With additional data, it will be possible to determine the evolution of the azoreductase gene.


This research was supported by a grant received from the Oklahoma Biomedical Research Infrastructure Network (OKBRIN). 


Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990). Basic local alignment search tool. J. Mol. Biol. 215: 403-410.

Branden, C., and Tooze, J. (1999). Introduction to Protein Structure, Second edition, Garland Publishing, New York. 

Blumel, S., Knackmuss, H.-J., and Stolz, A. (2002). Appl. Environ. Microbiol., 68, 3948-3955.

Blumel, S. and Stolz, A. (2003). Appl. Microbiol. Biotech., 62, 186-190.

Gerstein, M. and Levitt, M. (1997). A structural census of the current population of protein sequences. Proc. Natl. Acad. Sci. U.S.A., 94, 11911-11916.

Gibrat, J.F., Madej, T. and Bryant, S. H. (1996). Surprising similarities in structure comparison. Curr. Opin. Struct. Biol., 6(3): 377-385.  

Guex, N. and Peitsch, M.C. (1997). Swiss-Model and the Swiss-Pdb Viewer: An environment for comparative protein modeling. Electrophoresis. 18, 2714-2723.  

Hegyi, H. and Gerstein, M. (1999). The relationship between protein structure and function: A comprehensive survey with application to the yeast genome. J. Mol. Biol. 288: 147-164.

Heger, A. and Holm, L. (2000). Towards a covering set of protein family profiles. Prog. Biophys. Mol. Biol. 73: 321-337.

Levin, J.M., Pascarella, S., Argos, P. and Garnier, J. (1993). Quantification of secondary structure prediction improvement using multiple alignment. Prot. Eng., 6, 849-854.

Platzek, T., Lang, C., Grohmann, G., Gi, U-S., and Baltes, W. (1999). Formation of a carcinogenic aromatic amine from an azo dye by human skin bacteria in vitro. Hum. Exp. Toxicol., 18, 522-559.

Przytycka, T., Aurora, R. and Rose, G. D. (1999). A protein taxonomy based on secondary structure. Nature Struct. Biol., 6, 672-682.  

Rafii, F., Hall, J.D., and Cerniglia, C.E. (1997). Mutagenicity of azo dyes used in foods, drugs, and cosmetics before and after reduction by Clostridium species from the human intestinal tract. Food and Chemical Toxicology, 35: 897-901.

Rost, B. (1999). Twilight zone of protein sequence alignments. Prot. Eng., 12, 85-94.

Zhu, X., Komiya, H., Chirino, A., Faham, S., Fox, G.M., Arakawa, T., Hsu, B.T., Ress, D.D. (1991). Three dimensional structures of acidic and basic fibroblast growth factors. Science. 251, 90-93.  


Copyright 2006, Scientific Journals International.  All Rights Reserved.