
In silico analysis of chimeric espA, eae and tir fragments of Escherichia coli O157:H7 for oral immunogenic applications



In silico techniques are highly suited to both the discovery of new vaccines and the development of existing ones. Enterohemorrhagic Escherichia coli O157:H7 (EHEC) exhibits a pattern of localized adherence to host cells, with the formation of microcolonies, and induces a specific histopathological lesion (attaching/effacing). The genes encoding the products responsible for this phenotype are clustered on a 35-kb pathogenicity island. Among these proteins, Intimin, Tir, and EspA, which are expressed by attaching-effacing genes, are responsible for the attachment to epithelial cells that leads to lesions.


We designed synthetic genes encoding the carboxy-terminal fragment of Intimin, the middle region of Tir and the carboxy-terminal part of EspA. These gene fragments were synthesized with codon optimization for a plant host and were fused together using four repeats of five hydrophobic amino acids as linkers. The structure of the synthetic construct, its mRNA and deduced protein, and their stabilities were analyzed using bioinformatic software. Furthermore, the immunogenicity of this multimeric recombinant protein consisting of three different domains was predicted.


A structural model for a chimeric gene from LEE antigenic determinants of EHEC is presented. It may define accessibility, solubility and immunogenicity.


Enterohemorrhagic Escherichia coli O157:H7 (EHEC) is an important human pathogen [1], causing diarrhea and in some cases hemolytic-uremic syndrome (HUS), leading to kidney failure and even death [2]. EHEC produces several virulence factors, enabling it to colonize the large bowel and cause disease [3].

Cattle are most frequently identified as the primary source of bacteria, so reduction in E. coli O157:H7 prevalence in cattle by vaccination represents an attractive strategy for reducing the incidence of human disease [4]. An experimental vaccine was recently shown to significantly reduce shedding of the organism under natural exposure conditions [5].

These pathogenic bacteria contain a chromosomal island known as the Locus of Enterocyte Effacement (LEE, 35 kb), containing genes critical for forming the attachment and effacement (A/E) lesion. This locus can be divided into three functional regions: the first one encoding a type III secretion system; the second containing the genes eae and tir; and the third consisting of espD, espB, and espA [6, 7].

Intimin, a key colonization factor for EHEC O157:H7 acts as an outer membrane adhesion protein which is encoded by the gene eae. This protein mediates bacterial attachment through its C-terminal region to enterocytes by binding to Tir (Translocated Intimin Receptor) [8, 9].

Tir, a 78-kDa protein, is secreted from EHEC and is efficiently delivered into the host cell [10, 11].

The type III secretion system is involved in the secretion of different proteins including EspA, EspB, EspD, and Tir. EspA forms a filamentous structure on the bacterial surface as a bridge to the host cell surface. It delivers EspB, EspD, and Tir directly into the host cell. EspB is delivered primarily into the host cell membrane where it becomes an integral membrane protein and, along with EspD, forms a pore structure through which other bacterial effectors, such as Tir, enter the host cell [6, 12]. Additionally, studies on rabbit models indicate that pedestal formation is mediated by the same proteins (Intimin, EspA, EspB, EspD and Tir), and translocated Tir can bind to intimin via amino acids 258 to 361 [3, 13].

The Tir-Intimin interaction causes attachment of EHEC to the intestinal cell surface and triggers actin cytoskeletal rearrangements, resulting in pedestal formation. Recent evidence shows that active immunization of mice with recombinant Intimin from Citrobacter rodentium as a mouse model pathogen can prevent colonization of bacteria in the digestive tracts of animals [14].

These determinants are potent mucosal immunogens and induce humoral and mucosal responses (IgA instead of IgG) following oral administration [15, 16]. Among different systems for oral administration, transgenic plants are becoming more attractive because of their low cost, easy scale-up of production, natural storage organs (tubers and seeds), and established practices for efficient harvesting, storing, and processing [17, 18]. Moreover, a number of proteins such as recombinant antibodies and recombinant subunit vaccines have been expressed successfully in transgenic plants [19].

In this study we designed a new structural model containing three putative antigenic determinants of EspA, Intimin and Tir, fused together by hydrophobic linkers. Addition of the regulatory sequences Kozak and ER-retention signal at the 5' and 3' ends respectively, and codon optimization of this chimeric gene for expression in plants, were used to improve the efficiency of transcription and translation [20-22]. Finally, a novel in silico approach was used to analyze the structure of the designed chimeric protein.


Design and construction of chimeric gene

The 282 amino acids from the carboxy terminus of Intimin have been reported to be involved in binding to its receptor Tir [23, 24]. The region of Tir involved in the interaction with intimin has also been mapped (residues 258 to 361, designated Tir 103) [25]. For the third fragment, a truncated form of espA (lacking 36 amino acids from the N-terminal of the protein, designated EspA 120) was selected. This part of EspA120 is exposed on the bacterial surface [6].

Upon sequence comparison by ClustalW, the C-terminal regions of intimin (282 amino acids) and EspA (120 amino acids) and the middle part of Tir (103 amino acids) showed a high degree of conservation among different strains of E. coli O157:H7 (data not shown).

These three parts were selected for designing a synthetic construct. In order to separate the different domains, linkers consisting of EAAAK repeats and expected to form a monomeric hydrophobic α-helix were designed. It has been shown that the Glu⁻–Lys⁺ salt bridge between repeated Ala residues can stabilize helix formation [26]. Four repeated EAAAK sequences were introduced between different domains for more flexibility and efficient separation. The Kozak sequence [27] was added before the start codon in order to ensure high and accurate expression of mRNA in a eukaryotic host. For efficient accumulation of the recombinant protein in the endoplasmic reticulum (ER), the sequence KDEL was added at the end of the synthetic construct. Arrangements of fragment junctions and linker sites are shown in Figure 1.
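The domain and linker arrangement described above can be laid out as residue coordinates. The following is a minimal sketch, not the authors' procedure: the function name `layout` is hypothetical, and the 128-residue length of the N-terminal segment is an assumption, chosen only because it reproduces the linker positions reported later in the secondary-structure results (aa 129 to 148 and aa 431 to 450). The text itself names this fragment EspA 120 and does not explain the difference.

```python
# Sketch: compute 1-based residue coordinates for domains joined by (EAAAK)x4 linkers.
LINKER_UNIT = "EAAAK"
LINKER = LINKER_UNIT * 4  # four repeats, 20 residues in total


def layout(segments, linker_len=len(LINKER)):
    """Return (name, start, end) tuples, 1-based and inclusive.

    segments: list of (name, length) in N- to C-terminal order.
    A linker is placed between consecutive segments only.
    """
    coords = []
    pos = 1
    for i, (name, length) in enumerate(segments):
        coords.append((name, pos, pos + length - 1))
        pos += length
        if i < len(segments) - 1:
            coords.append((f"linker{i + 1}", pos, pos + linker_len - 1))
            pos += linker_len
    return coords


# Assumption: 128 residues for the first segment (inferred, not stated in the text).
segments = [("EspA", 128), ("Intimin", 282), ("Tir", 103)]
for name, start, end in layout(segments):
    print(f"{name}: {start}-{end}")
```

Under these assumed lengths the Intimin fragment spans exactly 282 residues between the two linkers, which agrees with the fragment size given in the text.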

Figure 1

Schematic model showing the construction of EspA 120, Intimin 282 and Tir 103, bound together by the linkers for expression in plants; these fragments were selected on the basis of the common sequence found in different strains of E. coli O157:H7.

Bioinformatic analysis of the wild type and optimized synthetic gene

A synthetic sequence encoding the chimeric gene was designed using plant codon bias. To optimize the synthetic gene, negative cis-acting motifs and repeated sequences were avoided. Both the wild type and the synthetic chimera were analyzed for their codon bias (Figure 2A) and GC content (Figure 2B).

Figure 2

A: Codon usage analysis of wild type and optimized gene for expression in plants. The value of 100 is set for the codon with the highest usage frequency for a given amino acid in the plant expression host. This procedure allows us to compare the adaptiveness of different codons relative to each other (relative adaptiveness). Plots represent the relative adaptiveness of a given codon at the indicated codon position. B: GC analysis of wild type and optimized chimeric gene. Plots represent the average GC content, before and after optimization.

The overall GC content was reduced from 41.59 to 40.96%, which should increase the overall stability of mRNA from the synthetic gene. Moreover, there was no sequence stretch within the gene showing an average GC content below 40%.
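Overall and windowed GC content of the kind reported here can be computed directly from a nucleotide sequence. The sketch below is illustrative only: the function names, the window size and the step are assumptions, since the text does not state which window was used to check for stretches below 40%.

```python
def gc_content(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def windowed_gc(seq, window=60, step=10):
    """GC percentage for each sliding window, as (0-based start, percent) pairs.

    The window and step defaults are hypothetical values for illustration.
    """
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def low_gc_windows(seq, threshold=40.0, window=60, step=10):
    """Windows whose GC percentage falls below the threshold."""
    return [(s, gc) for s, gc in windowed_gc(seq, window, step) if gc < threshold]
```

Calling `low_gc_windows` on an optimized sequence and getting an empty list would correspond to the statement that no stretch falls below 40%, for the chosen window size.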

The optimized gene showed a codon bias for plants and contained no rarely used codon. This is also reflected by the codon adaptation index (CAI), which is a measurement of the relative adaptiveness of the codon usage of a gene compared with the codon usage of highly expressed genes. The chimeric gene showed a CAI of 0.98, compared to that of the wild type gene, which was only 0.76 [28].
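The codon adaptation index is conventionally the geometric mean of the relative adaptiveness values of the codons in a gene, where each value is a codon's usage divided by the usage of the most frequent synonymous codon. The sketch below shows that calculation under stated assumptions: the codon counts are a made-up toy table, not plant codon usage data, and the function names are hypothetical. Codons absent from the table (for example single-codon amino acids) are simply skipped.

```python
import math


def relative_adaptiveness(codon_counts, synonymous_families):
    """w(codon) = count / highest count among its synonymous codons."""
    w = {}
    for family in synonymous_families:
        top = max(codon_counts[c] for c in family)
        for codon in family:
            w[codon] = codon_counts[codon] / top
    return w


def cai(cds, w):
    """Geometric mean of w over the codons of a coding sequence."""
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3].upper() for i in range(0, usable, 3)]
    logs = [math.log(w[c]) for c in codons if c in w]
    if not logs:
        raise ValueError("no scorable codons")
    return math.exp(sum(logs) / len(logs))


# Toy, hypothetical counts for two amino acids (Ala and Lys) only.
counts = {"GCT": 40, "GCC": 20, "GCA": 30, "GCG": 10, "AAA": 30, "AAG": 60}
families = [["GCT", "GCC", "GCA", "GCG"], ["AAA", "AAG"]]
w = relative_adaptiveness(counts, families)
print(round(cai("GCTAAG", w), 3), round(cai("GCGAAA", w), 3))
```

A gene built only from the most frequent codon of each family scores 1.0, which is the theoretical maximum referred to in the discussion.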

Within the synthetic construct, the splice sites, polyadenylation signal, instability elements, and all the cis-acting sites that may have a negative influence on the expression rate were removed (Table 1). Furthermore, the necessary restriction enzyme sites (Xba I and Sac I) were introduced at the ends of the sequence for cloning purposes.

Table 1 Analysis of cis-acting elements

mRNA structure prediction

A genetic algorithm-based RNA secondary structure prediction was combined with comparative sequence analysis to determine the potential folding of the chimeric gene. The 5' terminus of the gene was folded in the way typical of all bacterial gene structures. The minimum free energy for secondary structures formed by RNA molecules was also predicted. All 34 structural elements obtained in this analysis revealed folding of the RNA construct. The data showed the mRNA was stable enough for efficient translation in the new host (Data not shown) [29].

Protein secondary structure prediction

The secondary structure of the chimeric protein was predicted by online software. Three prediction methods were compared for evaluating the structure of this protein. The results showed that helix structures lie in the regions of aa 129 to 148 and aa 431 to 450, which are related to the hydrophobic amino acids inserted between different domains (Figure 3) [30, 31].

Figure 3

Analysis of chimeric EspA-Intimin-Tir protein secondary structure.

Tertiary structural prediction for the chimeric protein

Comparative and ab initio modeling of the synthetic sequence was exploited to produce 3D models of the chimeric protein. Two hundred thirty three-dimensional models were generated for this chimeric protein. The models were uploaded to the server to draw the tertiary structural illustrations with Swiss-PdbViewer and RasMol software in order to determine the final structure of the protein. Furthermore, the SCRATCH servers developed at the University of California, Irvine were used for protein structure prediction by PSI-BLAST and neural networks. There were two α-helices and several β-turns, which were consistent with the results of secondary structure analyses. The results of tertiary structure prediction showed the formation of three separate domains of the chimeric protein (Figure 4) [32, 33].

Figure 4

Ab initio and comparative modeling were used to predict the tertiary structure of the chimeric protein, EspA-Intimin-Tir. The result was viewed with RasMol software.

Evaluation of model stability

The energy minimization profile calculated with Swiss-PdbViewer (spdbv) was -1391.230 kcal/mol, indicating that the recombinant protein had acceptable stability compared to that of the original structure of each domain. Additionally, the data generated by a Ramachandran plot confirmed the structural stability of the protein (Figure 5).

Figure 5

(A) Evaluation of model stability based on a Ramachandran plot and (B) energy minimization.

Solvent accessibility prediction

The solvent accessibility distributions were characterized using the major hydrophobic and polarity properties of residual patterns. These patterns showed that the mean residue accessible surface area (ASA) gave a high solvent accessibility value, approximately fifty percent (Data not shown) [34].

Prediction of B-cell epitopes

Different factors such as hydrophilicity, plasticity, exterior accessibility, antigenicity and secondary structure were used to predict the chimeric protein epitopes. The epitopes located on the surface of the protein could interact easily with antibodies, and they were generally flexible. Bcepred software was used to determine the continuous B-cell epitopes based on single characters including hydrophilicity, antigenicity, flexibility, accessibility, polarity and exposed surface (Table 2). As shown in Table 2, linkers between different domains (aa 129 to 148 and aa 431 to 450) contained no epitope sites [35-37]. Furthermore, the conformational epitopes for B cells were predicted by the Discotope server (Table 3) [38].

Table 2 Epitopes predicted in chimeric protein by different parameters based on Bcepred software
Table 3 One hundred and eighteen discontinuous B-Cell epitopes of chimeric protein predicted by the Discotope server


Many bacterial pathogens infect or invade their hosts via mucosal surfaces. This process is initiated by the attachment of the bacteria to the cell membrane via specific receptors. Enterohemorrhagic E. coli is a good model and has been well studied in this context. In this bacterium, the antigens Intimin, EspA, and Tir are required for attachment to the intestinal mucosa [39]. If the function of these receptors was impaired, the bacterium could not attach to the host cell surface and the disease would be suppressed. This impairment is related to the production of immunoglobulin class A (IgA), which is the dominant antibody on the mucosal surface [2].

Therefore, mucosal immunization, especially via the oral route, is an attractive strategy for inducing protective immunity against mucosal pathogens [40]. Several vehicles (polymers, alginate, polyphosphazenes and other biodegradable polymers, immunostimulating complexes (ISCOMs), and liposomes) [41] have been used for delivering antigen to the target tissue. The capacity of plants for producing vaccines which could induce mucosal immunity is a great advantage. Plant cells act as a natural microencapsulation system to protect the vaccine antigens from being degraded in the upper digestive tract before they can reach the gut-associated lymphoid tissue (GALT) [18]. Studies on the B subunit of heat-labile toxin (LTB) suggest that plant-based oral vaccines can significantly boost mucosal immune responses that have been primed by parenteral injection [42].

One of the most important problems in transgenic plants is the low level of production of recombinant immunogenic protein. To solve this problem, different strategies such as strong promoters, organelle targeting and organelle transformation have been used [17]. Furthermore, synthetic genes with plant codon optimization have been used to mimic highly expressed plant genes. The effective application of synthetic genes in plants has been demonstrated by other researchers [16].

Two types of vaccines are available against E. coli O157:H7. One is a genetically engineered vaccine tested on a small group of adult volunteers; it appears safe and stimulates the production of antibodies against the potentially fatal pathogen [43]. The other is Econiche (made from an extract of lysed bacteria containing type III secretion proteins), used for vaccination of healthy cattle as an aid in reducing shedding of Escherichia coli O157:H7 [44]. Both of these vaccines carry risks and are considered insufficiently safe; for this reason we attempted to design multi-component antigens that can confer protection and prevent colonization. This construct should contain essential antigenic factors of E. coli O157:H7 that are fully exposed.

On the basis of knowledge of molecular modeling and immuno-informatics, a novel approach was employed to identify a set of peptides that could be used as a vaccine either in natural or in synthetic form. This approach has been extended to the entire proteomes of other microorganisms, such as T-cell epitopes of secretory proteins of Mycobacterium tuberculosis [45, 46], the tertiary structure of Mycobacterium leprae Hsp65 protein [47], T-cell antigens of Chlamydia [48], tandem repeat antigens from Leishmania donovani [49], and the envelope glycoprotein of Japanese encephalitis virus (JEV) [50], to identify new sets of potentially antigenic proteins.

Here we designed a new construct of EHEC antigens including EspA, Intimin and Tir that contains essential determinants for bacterial attachment and effacement. Theoretically, a DNA fragment encoding these three putative antigens could be synthesized as a single construct optimally suited for expression in a plant system. Several factors which can affect the expression of foreign genes in plant systems, such as messenger RNA instability [51], premature polyadenylation [52], abnormal splicing [53], and improper codon usage, have been reported [54]. In order to increase the mRNA stability, DNA motifs that might contribute to mRNA instability in plants, such as the ATTTA sequence and the potential polyadenylation signal sequence AATAAA, were eliminated from the synthetic gene (for details see Table 1). The synthetic DNA fragment encoding the chimeric protein was constructed based on the codon usage of highly expressed nuclear-encoded genes of tobacco (Nicotiana tabacum L.) as a model, and canola (Brassica napus L.) as the final target plant [55].
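Checking a candidate sequence for the two motifs named above is a simple string search. The sketch below is an illustration under assumptions, not the tool used in the study: the function name and labels are hypothetical, only the forward strand is scanned, and only ATTTA and AATAAA are included, whereas Table 1 covers further cis-acting elements.

```python
import re

# Motifs named in the text; labels are descriptive only.
MOTIFS = {
    "mRNA instability (ATTTA)": "ATTTA",
    "polyadenylation signal (AATAAA)": "AATAAA",
}


def find_motifs(seq, motifs=MOTIFS):
    """Return 1-based start positions of each motif on the forward strand.

    A lookahead pattern is used so that overlapping occurrences are reported.
    """
    seq = seq.upper().replace("U", "T")
    return {
        label: [m.start() + 1 for m in re.finditer(f"(?={motif})", seq)]
        for label, motif in motifs.items()
    }


print(find_motifs("GGATTTAGGAATAAACC"))
```

An optimized gene from which these elements were removed would return an empty list for every label.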

The efficiency of heterologous protein production can be diminished by biased codon usage. Approaches normally used to overcome this problem include targeted mutagenesis to remove rare codons or the addition of rare codon tRNAs in specific cell lines. Recently, improvements in the technology have enabled synthetic genes to be produced cost-effectively, making this a feasible alternative [56]. In addition, as each step in the process of gene expression, from the transcription of DNA into mRNA to the folding and posttranslational modification of proteins, is regulated by complex cellular mechanisms, a relationship is expected to exist between mRNA expression levels and protein solubility in the cell. By formulating a relation between the mRNA expression level and the recombinant protein, production can be reasonably predicted [57].

In eukaryotic mRNA, the consensus sequence surrounding the start codon (Kozak sequence, 5'-GCCACCATGGC) can increase the accuracy and efficiency of translation up to 10-fold. In the synthetic construct, the 5'-GCCACC sequence was added before the ATG codon. The second codon following the initial methionine was Ala, encoded by the codon GCT, and the necessary GC was provided; therefore there was no need to replace the other nucleotides or amino acids [27]. Codons that are rarely used in plants, such as XCG and XUA (X denotes U, C, A, or G), were avoided in the construction of the synthetic gene (Figure 2B). It has been reported that rare codons in mRNA tend to form higher-order secondary structures, which might require additional time for ribosomal movement through the critical region [58].
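The two sequence checks described in this paragraph, the Kozak context around the start codon and the absence of XCG and XUA codons, can be sketched as follows. Function names are hypothetical, the Kozak check tests only the GCCACC upstream hexamer and the G at position +4, and the example sequences are invented for illustration.

```python
def kozak_ok(seq, atg_index):
    """True if GCCACC precedes the ATG at atg_index and the next base is G."""
    seq = seq.upper().replace("U", "T")
    upstream = seq[max(0, atg_index - 6):atg_index]
    start = seq[atg_index:atg_index + 3]
    plus4 = seq[atg_index + 3:atg_index + 4]
    return upstream == "GCCACC" and start == "ATG" and plus4 == "G"


def rare_codons(cds):
    """(1-based codon number, codon) for every in-frame XCG or XTA codon."""
    cds = cds.upper().replace("U", "T")
    return [
        (i // 3 + 1, cds[i:i + 3])
        for i in range(0, len(cds) - 2, 3)
        if cds[i + 1:i + 3] in ("CG", "TA")
    ]


# Invented examples: a start region with the context described in the text,
# and a short reading frame containing two codons of the avoided classes.
print(kozak_ok("GCCACCATGGCTAAG", 6))
print(rare_codons("ATGGCTCCGCTATAA"))
```

The reading frame passed to `rare_codons` is assumed to begin at the first base of the start codon.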

An ideally biased gene would show a codon adaptation index (CAI) of 1.0. Even though no natural plant gene reaches this theoretical value, this index was increased from 0.76 in the wild type chimeric sequence to 0.98 in this synthetic gene. Furthermore, the G/C ratio and distribution were balanced from 41.59 to 40.96 percent with no significant changes, and this has been reported to be associated with low mRNA stability and expression in higher plants [55]. The nucleotide sequence encoding the ER retention signal (KDEL), which helps to accumulate the recombinant protein inside the endoplasmic reticulum, was fused in-frame at the 3' end of the chimeric gene [15, 16]. Finally, the required restriction enzyme sites (Xba I and Sac I) were introduced at the ends of the synthetic gene for future cloning into plant expression vectors.

Graphical depiction of the predicted minimum free energy for the synthetic gene showed that the average energy minimization was near -400 kcal/mol.

Comparison of the synthetic gene with the original one revealed no major difference between these two molecules and their structures were compatible with each other.

In the protein structure prediction, the chimeric protein formed three domains that were separated by two main α-helix moieties which could help the protein to form a final structure. These α-helix structures are related to the designation of special amino acid sequences, residues 129-148 and 431-450, which are inserted between domains. With these results we could speculate that these parts could support the stable structure of a protein which contained three domains.

B-cell epitopes for the chimeric protein could be predicted on the basis of the structural prediction and solvent accessibility. Hopp and Woods in the 1980s developed a method for predicting B-cell epitopes with hydrophilicity parameters. Since then, several distinct methods such as Hydrophilicity method, Accessibility method, Antigenicity method, Flexibility method and secondary structure analysis have been developed [36, 39]. Applying just one of these methods is not enough for obtaining results good enough to predict the B-cell epitope. In this study, we combined all the data obtained by these analyses and predicted the B-cell epitopes.
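A hydrophilicity-based prediction of the kind attributed to Hopp and Woods averages a per-residue hydrophilicity value over a sliding window and looks for the highest-scoring stretches. The sketch below is a generic illustration and not the Bcepred implementation used in this study. The scale values are the ones commonly tabulated for the Hopp-Woods scale and should be checked against the original publication before use; the window of six residues and the function names are assumptions.

```python
# Hopp-Woods hydrophilicity values as commonly tabulated (verify before use).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "H": -0.5, "A": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(protein, window=6):
    """Mean hydrophilicity per window, as (1-based window start, score) pairs."""
    protein = protein.upper()
    return [
        (i + 1, sum(HOPP_WOODS[aa] for aa in protein[i:i + window]) / window)
        for i in range(len(protein) - window + 1)
    ]


def top_windows(protein, window=6, n=3):
    """The n highest-scoring windows, a crude proxy for candidate epitopes."""
    profile = hydrophilicity_profile(protein, window)
    return sorted(profile, key=lambda item: item[1], reverse=True)[:n]
```

As the paragraph notes, a single scale of this kind is not sufficient on its own, so a profile like this would be one input among several rather than a final prediction.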

The integrated results showed that the most likely B-cell epitopes of this chimeric protein, as shown in Table 2, were located in three distinct parts, selected as the EspA, Intimin, and Tir domains.

For eliciting an immune response against E. coli O157:H7, studies have shown that production of the carboxy-terminal part of Intimin in a transgenic plant cell line and its application via the oral route is more effective than injection [16]. In this study, we designed a multi-domain antigen based on three immunogenic parts of the attaching/effacing locus of E. coli O157:H7, which was then optimized for plant codon preference in order to analyze mucosal and systemic immunity.


Bioinformatics tools for predicting epitopes are now a standard methodology. In silico epitope mapping, combined with in vitro and in vivo verification, accelerates the discovery process by approximately 10-20-fold. Development of sophisticated bioinformatics tools will provide a platform for more in-depth analysis of immunological data and facilitate the construction of new hypotheses to explain the complex immune system function [59].

In this study, we have combined several techniques and profiles to improve the state-of-the-art prediction of 3D structure and relative solvent accessibility. Building a homology model for this chimeric protein has been used to understand the antigenic sites and structural conformation domains which were used to predict continuous and discontinuous epitopes. Also, for the antibody-antigen interaction, it is important to know how much area of surface is exposed; accordingly we defined the exposed areas and surface accessibility.

Considering the multiple colonization factors of this bacterium, multiple antigenic parts should be used for repressing this pathogen. For this reason, more research should focus on designing multi-antigenic proteins from E. coli O157:H7. This study and a few others [60, 61] indicate that epitope construction and prediction will be useful not only in vaccine development but also in the prospective engineering and re-engineering of protein therapeutics, reducing the risk of undesired immunogenicity and improving the likelihood of success in clinical use.

Finally, the conclusions drawn for E. coli O157:H7 proteins could be combined with expression profiling to identify genes whose expression changes under shifting environmental conditions [62].

In conclusion, we believe that all of these findings will intensify efforts to develop a vaccine candidate against E. coli O157:H7.


Sequence analysis

Related sequences for espA (40 sequences), eae (32 sequences) and tir (50 sequences) were obtained from GenBank (accession numbers not shown). Multiple sequence alignments were performed using ClustalW software (EBI, UK) in order to identify a fragment common to all the sequences.
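Once sequences have been aligned, a fragment common to all of them can be located by scoring each alignment column for conservation and taking the longest conserved run. The sketch below is a generic illustration of that idea and not part of ClustalW; the function names, the identity threshold and the toy alignment are all hypothetical.

```python
from collections import Counter


def conserved_columns(aligned, threshold=1.0):
    """Flag each column whose most common residue reaches the threshold.

    aligned: equal-length, gap-padded sequences ('-' marks a gap).
    Gaps count against conservation because the denominator is all rows.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    flags = []
    for column in zip(*aligned):
        counts = Counter(c for c in column if c != "-")
        top = max(counts.values()) if counts else 0
        flags.append(top / len(column) >= threshold)
    return flags


def longest_conserved_run(flags):
    """1-based inclusive (start, end) of the longest run of True, or None."""
    best = (0, 0)
    start = None
    for i, ok in enumerate(list(flags) + [False]):  # sentinel closes a final run
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return (best[0] + 1, best[1]) if best[1] > best[0] else None


toy_alignment = ["MKTAYIA", "MKTAYLA", "MKT-YIA"]  # invented example
print(longest_conserved_run(conserved_columns(toy_alignment)))
```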

Construct design

An antigenic sequence was constructed by fusing the C-terminal of espA, C-terminal of eae and middle fragment of tir using hydrophobic amino acid linkers (accession no. GQ205376).

The in silico gene analysis and multi-parameter optimization of the synthetic chimeric gene were performed using stand-alone software such as Leto (Entelechon, Germany), DNA 2.0, and DNAsis MAX (Hitachi Software), and online databases and tools such as the codon database, the GenBank codon database and the Swissprot reverse translation online tool. The desired properties were verified by GenScript (NJ, USA). The multimeric gene was synthesized by ShineGene Molecular Biotech, Inc (Shanghai, China).

Bioinformatic analysis of chimeric recombinant protein

The messenger RNA secondary structure of the chimeric gene was analyzed with the program mfold. Secondary-structure predictions for the recombinant protein were performed with the neural-network-based program PHD, and online ab initio software was used for the 3D structure [63]. The 3D structural stability of the synthetic protein was further analyzed by energy minimization in Swiss-PdbViewer [64]. Solvent accessibility of the residues was evaluated by DSSP and other online programs (VADAR). The predictive value of the hyperglycosylation code that may act in plants is well established, based on online software [65].

Prediction of B-cell epitopes

The amino acid sequence was analyzed using web-based B-cell epitope prediction algorithms: Bcepred, which predicts continuous B-cell epitopes from physico-chemical properties on a non-redundant dataset, and the Discotope server, which predicts discontinuous B-cell epitopes from three-dimensional protein structures. Briefly, the chimeric protein was analyzed first for continuous B-cell epitopes using Bcepred and then for discontinuous B-cell epitopes using the Discotope server. Finally, we used the VaxiJen server to predict the immunogenicity of the whole antigen and its subunit vaccine [48, 66, 67].


  1. Van Diemen PM, Dziva F, Abu-Median A, Wallis TS, Bosch Van den H, Dougan G: Subunit vaccines based on intimin and Efa-1 polypeptides induce humoral immunity in cattle but do not protect against intestinal colonisation by enterohaemorrhagic Escherichia coli O157:H7 or O26:H. Vet Immunol Immunopathol. 2007, 116 (1-2): 47-58. 10.1016/j.vetimm.2006.12.009.

  2. Babiuk S, Asper DJ, Rogan D, Mutwiri GK, Potter AA: Subcutaneous and intranasal immunization with type III secreted proteins can prevent colonization and shedding of Escherichia coli O157:H7 in mice. Microb Pathog. 2008, 45 (1): 7-11. 10.1016/j.micpath.2008.01.005.

  3. Li Y, Frey E, Mackenzie AM, Finlay BB: Human response to Escherichia coli O157:H7 infection: antibodies to secreted virulence factors. Infect Immun. 2000, 68 (9): 5090-5095. 10.1128/IAI.68.9.5090-5095.2000.

  4. McNeilly TN, Naylor SW, Mahajan A, Mitchell MC, McAteer S, Deane D: Escherichia coli O157:H7 colonization in cattle following systemic and mucosal immunization with purified H7 flagellin. Infect Immun. 2008, 76 (6): 2594-2602. 10.1128/IAI.01452-07.

  5. Van Donkersgoed J, Hancock D, Rogan D, Potter AA: Escherichia coli O157:H7 vaccine field trial in 9 feedlots in Alberta and Saskatchewan. Can Vet J. 2005, 46 (8): 724-28.

  6. Kühne SA, Hawes WS, La Ragione RM, Woodward MJ, Whitelam GC, Gough KC: Isolation of recombinant antibodies against EspA and intimin of Escherichia coli O157:H7. J Clin Microbiol. 2004, 42 (7): 2966-76. 10.1128/JCM.42.7.2966-2976.2004.

  7. Garrido P, Blanco M, Moreno-Paz M, Briones C, Dahbi G, Blanco J, Blanco J, Parro V: STEC-EPEC Oligonucleotide Microarray: A New Tool for Typing Genetic Variants of the LEE Pathogenicity Island of Human and Animal Shiga Toxin-Producing Escherichia coli (STEC) and Enteropathogenic E. coli (EPEC) Strains. Clin Chem. 2006, 52 (2): 192-201. 10.1373/clinchem.2005.059766.

  8. China B, Jacquemin E, Devrin AC, Pirson V, Mainil J: Heterogeneity of the eae genes in attaching/effacing Escherichia coli from cattle: comparison with human strains. Res Microbiol. 1999, 150 (5): 323-32. 10.1016/S0923-2508(99)80058-8.

  9. La Ragione RM, Patel S, Maddison B, Woodward MJ, Best A, Whitelam GC, Gough KC: Recombinant anti-EspA antibodies block Escherichia coli O157:H7-induced attaching and effacing lesions in vitro. Microbes Infect. 2006, 8 (2): 426-33. 10.1016/j.micinf.2005.07.009.

  10. Paton AW, Manning PA, Woodrow MC, Paton JC: Translocated intimin receptors (Tir) of Shiga-toxigenic Escherichia coli isolates belonging to serogroups O26, O111, and O157 react with sera from patients with hemolytic-uremic syndrome and exhibit marked sequence heterogeneity. Infect Immun. 1998, 66 (11): 5580-6.

  11. Goffaux F, China B, Dams L, Clinquart A, Daube G: Development of a genetic traceability test in pig based on single nucleotide polymorphism detection. Forensic Sci Int. 2005, 151 (2-3): 239-47. 10.1016/j.forsciint.2005.02.013.

  12. Yuste M, Orden JA, De La Fuente R, Ruiz-Santa-Quiteria JA, Cid D, Martínez-Pulgarín S, Domínguez-Bernal G: Polymerase chain reaction typing of genes of the locus of enterocyte effacement of ruminant attaching and effacing Escherichia coli. Can J Vet Res. 2008, 72 (5): 444-48.

  13. Cleary J, Lai LC, Shaw RK, Straatman-Iwanowska A, Donnenberg MS, Frankel G, Knutton S: Enteropathogenic Escherichia coli (EPEC) adhesion to intestinal epithelial cells: role of bundle-forming pili (BFP), EspA filaments and intimin. Microbiology. 2004, 150 (3): 527-38. 10.1099/mic.0.26740-0.

  14. Dean-Nystrom EA, Gansheroff LJ, Mills M, Moon HW, O'Brien AD: Vaccination of pregnant dams with intimin(O157) protects suckling piglets from Escherichia coli O157:H7 infection. Infect Immun. 2002, 70 (5): 2414-8. 10.1128/IAI.70.5.2414-2418.2002.

  15. Kang TJ, Han SC, Jang MO, Kang KH, Jang YS, Yang MS: Enhanced expression of B-subunit of Escherichia coli heat-labile enterotoxin in tobacco by optimization of coding sequence. Appl Biochem Biotechnol. 2004, 117 (3): 175-87. 10.1385/ABAB:117:3:175.

  16. Judge NA, Mason HS, O'Brien AD: Plant cell-based intimin vaccine given orally to mice primed with intimin reduces time of Escherichia coli O157:H7 shedding in feces. Infect Immun. 2004, 72 (1): 168-75. 10.1128/IAI.72.1.168-175.2004.

  17. Schillberg S, Twyman RM, Fischer R: Opportunities for recombinant antigen and antibody expression in transgenic plants--technology assessment. Vaccine. 2005, 23 (15): 1764-9. 10.1016/j.vaccine.2004.11.002.

  18. Lal P, Ramachandran VG, Goyal R, Sharma R: Edible vaccines: current status and future. Indian J Med Microbiol. 2007, 25 (2): 93-102. 10.4103/0255-0857.32713.

  19. Suo G, Chen B, Zhang J, Duan Z, He Z, Yao W, Yue C, Dai J: Effects of codon modification on human BMP2 gene expression in tobacco plants. Plant Cell Rep. 2006, 25 (7): 689-97. 10.1007/s00299-006-0133-6.

  20. Mechold U, Gilbert C, Ogryzko V: Codon optimization of the BirA enzyme gene leads to higher expression and an improved efficiency of biotinylation of target proteins in mammalian cells. J Biotechnol. 2005, 116 (3): 245-49. 10.1016/j.jbiotec.2004.12.003.

  21. Lim LH, Li HY, Cheong N, Lee BW, Chua KY: High-level expression of a codon optimized recombinant dust mite allergen, Blo t 5, in Chinese hamster ovary cells. Biochem Biophys Res Commun. 2004, 316 (4): 991-96. 10.1016/j.bbrc.2004.02.148.

  22. Gustafsson C, Govindarajan S, Minshull J: Codon bias and heterologous protein expression. Trends Biotechnol. 2004, 22 (7): 346-353. 10.1016/j.tibtech.2004.04.006.

  23. Batchelor M, Prasannan S, Daniell S, Reece S, Connerton I, Bloomberg G: Structural basis for recognition of the translocated intimin receptor (Tir) by intimin from enteropathogenic Escherichia coli. EMBO J. 2000, 19 (11): 2452-64. 10.1093/emboj/19.11.2452.

  24. Frankel G, Candy DC, Everest P, Dougan G: Characterization of the C-terminal domains of intimin-like proteins of enteropathogenic and enterohemorrhagic Escherichia coli, Citrobacter freundii, and Hafnia alvei. Infect Immun. 1994, 62 (5): 1835-42.

  25. Hartland EL, Batchelor M, Delahay RM, Hale C, Matthews S, Dougan G: Binding of intimin from enteropathogenic Escherichia coli to Tir and to host cells. Mol Microbiol. 1999, 32 (1): 151-8. 10.1046/j.1365-2958.1999.01338.x.

  26. Arai R, Ueda H, Kitayama A, Kamiya N, Nagamune T: Design of the linkers which effectively separate domains of a bifunctional fusion protein. Protein Eng. 2001, 14 (8): 529-32. 10.1093/protein/14.8.529.

  27. Kozak M: The scanning model for translation: an update. J Cell Biol. 1989, 108 (2): 229-41. 10.1083/jcb.108.2.229.

  28. Graf M, Deml L, Wagner R: Codon-optimized genes that enable increased heterologous expression in mammalian cells and elicit efficient immune responses in mice after vaccination of naked DNA. Methods Mol Med. 2004, 94: 197-210.

  29. Zuker M: Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 2003, 31 (13): 3406-15. 10.1093/nar/gkg595.

  30. Garnier J, Gibrat JF, Robson B: GOR method for predicting protein secondary structure from amino acid sequence. Methods Enzymol. 1996, 266: 540-53.

  31. Rost B, Sander C, Schneider R: PHDsec an automatic mail server for protein secondary structure prediction. Comput Appl Biosci. 1994, 10 (1): 53-60.

  32. Yang S, Onuchic JN, García AE, Levine H: Folding time predictions from all-atom replica exchange simulations. J Mol Biol. 2007, 372 (3): 756-63. 10.1016/j.jmb.2007.07.010.

  33. Ginalski K: Comparative modeling for protein structure prediction. Curr Opin Struct Biol. 2006, 16 (2): 172-7. 10.1016/

  34. Rost B, Sander C: Conservation and prediction of solvent accessibility in protein families. Proteins. 1994, 20 (3): 216-26. 10.1002/prot.340200303.

  35. Parker JM, Guo D, Hodges RS: New hydrophilicity scale derived from high-performance liquid chromatography peptide retention data: correlation of predicted surface residues with antigenicity and X-ray-derived accessible sites. Biochemistry. 1986, 25 (19): 5425-32. 10.1021/bi00367a013.

  36. Kolaskar AS, Tongaonkar PC: A semi-empirical method for prediction of antigenic determinants on protein antigens. FEBS Lett. 1990, 276 (1-2): 172-4. 10.1016/0014-5793(90)80535-Q.

  37. Ponnuswamy PK, Prabhakaran M, Manavalan P: Hydrophobic packing and spatial arrangement of amino acid residues in globular proteins. Biochim Biophys Acta. 1980, 623 (2): 301-16.

  38. Saha S, Bhasin M, Raghava GP: Bcipep: a database of B-cell epitopes. BMC Genomics. 2005, 6 (1): 79. 10.1186/1471-2164-6-79.

  39. Karpman D, Békássy ZD, Sjögren AC, Dubois MS, Karmali MA, Mascarenhas M: Antibodies to intimin and Escherichia coli secreted proteins A and B in patients with enterohemorrhagic Escherichia coli infections. Pediatr Nephrol. 2002, 17 (3): 201-11. 10.1007/s00467-001-0792-z.

  40. Julia Scerbo M, Bibolini MJ, Barra JL, Roth GA, Monferran CG: Expression of a bioactive fusion protein of Escherichia coli heat-labile toxin B subunit to a synapsin peptide. Protein Expr Purif. 2008, 59 (2): 320-6. 10.1016/j.pep.2008.02.017.

  41. Gerdts V: Mucosal delivery of vaccines in domestic animals. Vet Res. 2006, 37 (3): 487-510. 10.1051/vetres:2006012.

  42. Lauterslager TG, Florack DE, van der Wal TJ, Molthoff JW, Langeveld JP, Bosch D: Oral immunisation of naive and primed animals with transgenic potato tubers expressing LT-B. Vaccine. 2001, 19 (17-19): 2749-55. 10.1016/S0264-410X(00)00513-2.

  43. Stephenson J: E coli O157 Vaccine. JAMA. 1998, 279 (11): 818-b. 10.1001/jama.279.11.818.

  44. Potter AA, Klashinsky S, Li Y, Frey E, Townsend H, Rogan D, Erickson G, Hinkley S, Klopfenstein T, Moxley RA, Smith DR, Finlay BB: Decreased shedding of Escherichia coli O157:H7 by cattle following vaccination with type III secreted proteins. Vaccine. 2004, 22: 362-369. 10.1016/j.vaccine.2003.08.007.

  45. Mustafa AS: Recombinant and synthetic peptides to identify Mycobacterium tuberculosis antigens and epitopes of diagnostic and vaccine relevance. Tuberculosis. 2005, 85 (5-6): 367-76. 10.1016/

  46. Vani J, Shaila MS, Chandra NR, Nayak R: A combined immuno-informatics and structure-based modeling approach for prediction of T cell epitopes of secretory proteins of Mycobacterium tuberculosis. Microbes Infect. 2006, 8 (3): 738-46. 10.1016/j.micinf.2005.09.012.

  47. Rossetti RAM, Lorenzi JCC, Giuliatti S, Silva CL, Coelho CAAM: In Silico Prediction of the Tertiary Structure of M. leprae Hsp65 Protein Shows an Unusual Structure in Carboxy-terminal Region. J Comp Sci Syst Biol. 2008, 1: 126-131. 10.4172/jcsb.1000012.

  48. Barker CJ, Beagley KW, Hafner LM, Timms P: In silico identification and in vivo analysis of a novel T-cell antigen from Chlamydia, NrdB. Vaccine. 2008, 26 (10): 1285-96. 10.1016/j.vaccine.2007.12.048.

  49. Goto Y, Coler RN, Reed SG: Bioinformatic identification of tandem repeat antigens of the Leishmania donovani complex. Infect Immun. 2007, 75 (2): 846-51. 10.1128/IAI.01205-06.

  50. Kolaskar AS, Kulkarni-Kale U: Prediction of three-dimensional structure and mapping of conformational epitopes of envelope glycoprotein of Japanese encephalitis virus. Virology. 1999, 261 (1): 31-42. 10.1006/viro.1999.9859.

  51. Murray EE, Rocheleau T, Eberle M, Stock C, Sekar V, Adang M: Analysis of unstable RNA transcripts of insecticidal crystal protein genes of Bacillus thuringiensis in transgenic plants and electroporated protoplasts. Plant Mol Biol. 1991, 16 (6): 1035-50. 10.1007/BF00016075.

  52. Jarvis P, Belzile F, Dean C: Inefficient and incorrect processing of the Ac transposase transcript in iae1 and wild-type Arabidopsis thaliana. Plant J. 1997, 11 (5): 921-31. 10.1046/j.1365-313X.1997.11050921.x.

  53. Haseloff J, Siemering KR, Prasher DC, Hodge S: Removal of a cryptic intron and subcellular localization of green fluorescent protein are required to mark transgenic Arabidopsis plants brightly. Proc Natl Acad Sci USA. 1997, 94 (6): 2122-7. 10.1073/pnas.94.6.2122.

  54. Perlak FJ, Fuchs RL, Dean DA, McPherson SL, Fischhoff DA: Modification of the coding sequence enhances plant expression of insect control protein genes. Proc Natl Acad Sci USA. 1991, 88 (8): 3324-8. 10.1073/pnas.88.8.3324.

  55. Campbell WH, Gowri G: Codon Usage in Higher Plants, Green Algae, and Cyanobacteria. Plant Physiol. 1990, 92 (1): 1-11. 10.1104/pp.92.1.1.

  56. Burgess-Brown NA, Sharma S, Sobott F, Loenarz C, Oppermann U, Gileadi O: Codon optimization can improve expression of human genes in Escherichia coli: A multi-gene study. Protein Expr Purif. 2008, 59 (1): 94-102. 10.1016/j.pep.2008.01.008.

  57. Tartaglia GG, Pechmann S, Dobson CM, Vendruscolo M: A Relationship between mRNA Expression Levels and Protein Solubility in E. coli. J Mol Biol. 2009, 388 (2): 381-89. 10.1016/j.jmb.2009.03.002.

  58. Thanaraj TA, Argos P: Ribosome-mediated translational pause and protein domain organization. Protein Sci. 1996, 5 (8): 1594-612. 10.1002/pro.5560050814.

  59. Tong JC, Tan TW, Ranganathan S: Methods and protocols for prediction of immunogenic epitopes. Brief Bioinform. 2006, 8 (2): 96-108. 10.1093/bib/bbl038.

  60. Barbosa MD, Vielmetter J, Chu S, Smith DD, Jacinto J: Clinical link between MHC class II haplotype and interferon-beta (IFN-beta) immunogenicity. Clin Immunol. 2006, 118 (1): 42-50. 10.1016/j.clim.2005.08.017.

  61. Koren E, De Groot AS, Jawa V, Beck KD, Boone T, Rivera D, Li L, Mytych D: Clinical validation of the "in silico" prediction of immunogenicity of a human recombinant therapeutic protein. Clin Immunol. 2007, 124 (1): 26-32. 10.1016/j.clim.2007.03.544.

  62. Allen TE, Herrgard MJ, Liu M, Qiu Y, Glasner JD, Blattner FR, Palsson B: Genome-Scale Analysis of the Uses of the Escherichia coli Genome: Model-Driven Analysis of Heterogeneous Data Sets. J Bacteriol. 2003, 185 (21): 6392-6399. 10.1128/JB.185.21.6392-6399.2003.

  63. Langedijk JP, Daus FJ, van Oirschot JT: Sequence and structure alignment of Paramyxoviridae attachment proteins and discovery of enzymatic activity for a morbillivirus hemagglutinin. J Virol. 1997, 71 (8): 6155-67.

  64. Edwards YJ, Cottage A: Bioinformatics methods to predict protein structure and function. A practical approach. Mol Biotechnol. 2003, 23 (2): 139-66. 10.1385/MB:23:2:139.

  65. Xu J, Tan L, Lamport DT, Showalter AM, Kieliszewski MJ: The O-Hyp glycosylation code in tobacco and Arabidopsis and a proposed role of Hyp-glycans in secretion. Phytochemistry. 2008, 69 (8): 1631-40. 10.1016/j.phytochem.2008.02.006.

  66. Davies MN, Flower DR: Harnessing bioinformatics to discover new vaccines. Drug Discov Today. 2007, 12 (9-10): 389-95. 10.1016/j.drudis.2007.03.010.

  67. Korber B, LaBute M, Yusim K: Immunoinformatics comes of age. PLoS Comput Biol. 2006, 2 (6): e71. 10.1371/journal.pcbi.0020071.


The authors thank Iraj Rasouli from Shahed University and Mohsen R. Heidari from Baqiyatallah University of Medical Science for their helpful discussions. This work was supported by NIGEB grant NIGEB-368 (AHS) and Shahed University grant 57243 (SLM).

Author information

Corresponding author

Correspondence to Ali H Salmanian.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

All four authors (JA, SLM, SR, AHS) contributed equally to this manuscript. All authors read and approved the final manuscript.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Amani, J., Mousavi, S.L., Rafati, S. et al. In silico analysis of chimeric espA, eae and tir fragments of Escherichia coli O157:H7 for oral immunogenic applications. Theor Biol Med Model 6, 28 (2009).
