Structural, phylogenetic and docking studies of D-amino acid oxidase activator (DAOA), a candidate schizophrenia gene

Background: Schizophrenia is a neurodegenerative disorder that occurs worldwide and can be difficult to diagnose. It is the foremost neurological disorder leading to suicide among patients in both developed and underdeveloped countries. D-amino acid oxidase activator (DAOA), also known as G72, is directly implicated in the glutamatergic hypothesis of schizophrenia. It activates D-amino acid oxidase, which oxidizes D-serine, leading to modulation of the N-methyl-D-aspartate receptor.
Methods: MODELLER (9v10) was used to generate three-dimensional structures of the protein encoded by the candidate gene DAOA. The HOPE server was used for mutational analysis. The Molecular Evolutionary Genetics Analysis (MEGA5) tool was used to reconstruct the evolutionary history of DAOA. AutoDock was used for protein-ligand docking, and GRAMM-X and PatchDock for protein-protein docking.
Results: A suitable template (1ZCA) was selected by BLASTp on the basis of 33% query coverage, 27% identity and an E-value of 4.9. The Rampage evaluation tool placed 91.1% of DAOA residues in the favored region, 4.9% in the allowed region and 4.1% in the outlier region. ERRAT gave the predicted model a quality factor of 50.909%. Mutational analysis of DAOA revealed significant effects on hydrogen bonding and correct folding of the DAOA protein, which in turn affect protein conformation. Ciona was inferred as the outgroup. Tetrapods were in their appropriate clusters with bifurcations. Human amino acid sequences are conserved, with chimpanzee and gorilla showing more than 80% homology and bootstrap values based on 1000 replications. Molecular docking analysis was employed to elucidate the binding mode of the reported ligand complex for DAOA. The docking experiment demonstrated that DAOA is involved in major amino acid interactions: the residues that interact most strongly with the ligand C28H28N3O5PS2 are polar but uncharged (Gln36, Asn38, Thr122) and non-polar hydrophobic (Ile119, Ser171, Ser21, Ala31). Protein-protein docking simulation demonstrated two ionic bonds and one hydrogen bond involving DAOA. Lys-7 of the receptor protein interacted with Lys-163 and Asp-2037. Tyr-03 interacted with Arg-286 of the ligand protein and formed a hydrogen bond.
Conclusion: The predicted interactions might serve to inhibit the disease-related allele. It is assumed that current bioinformatics methods will contribute significantly to identifying, analyzing and curing schizophrenia. There is an urgent need to develop effective drugs for schizophrenia, and tools for examining candidate genes more accurately and efficiently are required.



Background
The nature of a human medical disorder is often elucidated through biological markers and behavioral studies. Diagnosis of mental disorders is very difficult because it primarily relies on behavioral markers. An example of a complex mental disorder is schizophrenia (SZ), diagnosis of which depends on abnormal behavior such as paranoia, dampening of emotions and auditory hallucinations. Genome-wide studies have attained a major role in SZ research because high-throughput technologies are valuable for discovering relevant genes. SZ is a psychiatric disorder with severe manifestations (abnormal behavior, disorganized speech and figments of the imagination) and an estimated heritability of about 80% [1]. Negative symptoms can also include affective flattening, avolition, and alogia. Approximately 1% of the population is affected during the course of life. The effects of SZ usually start during the patient's late teens to early twenties; females have an age of onset five years later than males [2]. A recent meta-analysis estimated the risk of SZ in males to be about 40% higher than in females [3]. Epidemiological studies of SZ have shown that it occurs in all populations with a prevalence of approximately 1.5-4.5 per thousand and an incidence of 0.17-0.43 per thousand [4].
According to analysis of gene linkage data and meta-analysis of genome scans [5], highly vulnerable genes on chromosomes 1q, 3p, 5q, 6p, 8p, 11q, 14p, 20q and 22q [6,7] contribute to SZ. Both functional and positional candidate SZ genes have been studied and various promising candidates that might be involved in risk for the disease have been identified.
The symptoms of SZ have different dimensions that usually occur together and can reflect substantial variation among patient phenotypes [8][9][10]. Different researchers have formulated various models of these dimensions, but the most widely appreciated three-dimensional symptom models were first proposed by Bilder et al. and Liddle [9,11]. These authors concluded that the main symptoms are poverty of speech, formal thought disorder, decreased voluntary movement, psychomotor impairment, bizarre behavior, hallucinations, abnormal acts, inappropriate affect, flat affect, avolition, and alogia.
A genome-wide association study (GWAS) for SZ was conducted in 2008 but no significant loci were reported, though 7000 samples were used [12,13].
The gene DAOA, located on chromosome 13q3, encodes the D-amino acid oxidase activator protein, as shown by functional and expression studies. It is significantly associated with SZ and is also known as G72. The D-amino acid oxidase activator (DAOA) is directly implicated in the glutamatergic hypothesis of SZ [14]. When D-amino acid oxidase is activated, D-serine is oxidized and the product modulates the N-methyl-D-aspartate receptor. Modulation of this receptor is thought to contribute to SZ; glutamate signaling is involved in important pathways directly linked to SZ [15].
DAOA is also involved in other psychotic disorders and can modify the cognitive and negative symptoms of mood. It could be the primary genetic cause of the observed overlap of phenotypes between bipolar disorder and SZ [16]. Bioinformatics has been used for in silico analysis of biological queries using mathematical and statistical techniques. X-ray and NMR techniques are expensive and time-consuming for structural modeling of proteins. Screening of small chemical compounds against target receptors by high-throughput screening (HTS) is also very expensive.
In this work, we predicted the 3D structure of DAOA and simulated its protein-ligand and protein-protein docking using different bioinformatics strategies. The objective of the present study was to elucidate the interactions of the DAOA protein with ligands and other proteins and to identify the connection of DAOA to SZ. Protein-protein docking and interaction simulations reveal hydrogen and ionic bonds. The present work was conducted to provide molecular insights into the structure of the protein and to find its most plausible function.

Results
This paper describes the implementation of an in silico technique to recruit and analyze DAOA, the most likely candidate gene for SZ. The direct involvement of DAOA in disease pathogenicity has already been reported in several research studies on SZ.
Initially, a literature search was conducted to explore the most likely candidate gene involved in SZ. A comparative modeling technique (MODELLER 9v10) was adopted to predict the three-dimensional structure of the protein encoded by the selected gene. The Protein Data Bank (PDB) was checked for the 3D structure of the selected protein, and it was confirmed that no 3D structure had been reported to date. To check the quality and reliability of the predicted model, the evaluation tools ERRAT and Rampage were used.
Protein-ligand and protein-protein docking of DAOA were simulated. The ZINC and PubChem databases were used to retrieve the ligand and STRING was used to identify protein interactions [17].
DAOA has been mapped on chromosome 13, with starting and ending base pairs 06118216 and 10143383 respectively. Homology modeling was implemented to generate the 3D structure of the encoded protein. MODELLER 9v10 was used to construct the protein model. A basic local alignment technique (BLAST) was utilized to identify the homology between the target protein and its template. The lowest energy minimization value for the predicted structure was selected for further analysis.
The 3D structure or modeling of DAOA is not known and no structural information can be found for the templates. The amino acid sequence of DAOA in FASTA format was retrieved from UniProt with accession number A2T115. Table 1 lists the three templates 1ZCA, 1V30 and 2E5K with optimal alignment of the first template and good alignment for the others, sorted by overall quality, query coverage, similarity and E-values. The structure predicted by MODELLER 9v10, with the alpha helices and beta-pleated sheets visualized by Chimera 1.6, is illustrated in Figure 1(A). Figure 1(B) shows a superimposition of structure and template. The predicted structure is evaluated in Figures 2 and 3.
Mutations in DAOA have been reported, so text mining was used to retrieve them. The Arg30Lys mutation is directly involved in SZ. Arginine in the wild-type protein is replaced with lysine at position 30, in a highly conserved part of the amino acid sequence. This missense mutation occurs in rare cases but damages the protein when it occurs. The polypeptide backbone is not affected but the side chains of the two amino acids are distinct. Every amino acid has its own specific charge and hydrophobicity value, and the mutated and wild-type proteins differ in these properties. The mutated amino acid is also smaller than the wild-type residue, affecting its interaction with other molecules.
A dataset of proteins for a wide range of invertebrate and vertebrate genomes is currently available for analyzing the phylogenetic history of the SZ candidate gene. The phylogenetic neighbor-joining (NJ) tree presented in this study reveals many interesting characteristics of the candidate gene in vertebrates (Figure 4). The phylogenetic history of the gene was analyzed by including the protein sequences from teleosts and tetrapods in the tree; amphioxus sequences were also used as the closest invertebrate relative to vertebrates.
AutoDock Vina docking software was used to investigate how the ligand binds to the protein, the binding conformation, the functionally interacting residues and the best structural information. The ligand retrieved for DAOA is described in Table 2 and its structure is illustrated in Figure 5 (a and b). Docking of different ligand conformations to the protein was simulated. About 50 complexes were generated by the software, and the one with the lowest binding energy was selected for further analysis. The amino acids present at the active site of the protein were identified by observing the residues within 4 Å of the ligand using the VMD visualization software. The hydrogen, hydrophobic and ionic interactions of the protein-docked ligand complex are described in Table 3.
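The 4 Å selection described above amounts to a simple distance filter between protein and ligand atoms. The following Python sketch illustrates that idea under stated assumptions: it is not the VMD procedure itself, the function name `residues_within` is introduced here for illustration, and the residue labels and coordinates are toy values rather than data from the docked complex.

```python
import math

def residues_within(protein_atoms, ligand_atoms, cutoff=4.0):
    """Return {residue_label: closest distance} for residues near the ligand.

    protein_atoms: iterable of (residue_label, (x, y, z)) tuples.
    ligand_atoms:  iterable of (x, y, z) tuples.
    A residue is kept if any of its atoms lies within `cutoff` angstroms
    of any ligand atom. The result is ordered from closest to farthest.
    """
    ligand_atoms = list(ligand_atoms)
    closest = {}
    for label, xyz in protein_atoms:
        d = min(math.dist(xyz, lig) for lig in ligand_atoms)
        if d <= cutoff and d < closest.get(label, float("inf")):
            closest[label] = d
    return dict(sorted(closest.items(), key=lambda item: item[1]))

# Toy input (hypothetical labels and coordinates, for illustration only).
toy_protein = [
    ("RES_A", (3.0, 0.0, 0.0)),
    ("RES_A", (5.0, 0.0, 0.0)),
    ("RES_B", (0.0, 3.5, 0.0)),
    ("RES_C", (6.0, 0.0, 0.0)),
]
toy_ligand = [(0.0, 0.0, 0.0)]
contacts = residues_within(toy_protein, toy_ligand)
```

In practice the atom lists would be read from the docked complex, and the retained residues would then be inspected individually to assign interaction types.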
GRAMM-X and PatchDock were utilized to characterize the protein-protein docking of DAOA. The functional interacting partners of DAOA were retrieved from the STRING database. Figure 8 reveals the proteins that interact most closely with DAOA; DAO showed the highest interaction score (0.953). The 3D structure of DAO was retrieved from the PDB with accession number 2E48. The interaction between DAOA and DAO is illustrated in Figure 9. Table 4 lists the interactions and distances between the receptor and ligand proteins.

Discussion
SZ is a heterogeneous neural disorder with a spectrum of symptoms such as decrease in volition, displays of antisocial behavior, disordered sense of self, psychomotor slowing and alteration in perception. It has been accepted that SZ is caused by multiple or single dysfunctions within discrete brain regions. Numerous mechanisms and models have been proposed for the development of SZ in terms of the timing, situation and nature of brain changes but the exact mechanisms are still not understood.
Figure 4 The evolutionary history of the DAOA gene was constructed. No paralogs of DAOA had been reported in biological databases. ENSEMBL Blastp/Blat was also performed for paralogs of DAOA but with the same result. Only orthologs of DAOA were used to construct the phylogenetic tree. Complete-deletion, p-distance and Uncorrected options were used. Bootstrap values are given as numbers of branches (based on 1000 replications) supporting that branch; values of ≥80% are presented. Amino acid substitution rate per site is shown by the scale bar. Eight orthologs were used and Ciona was the outgroup. Two main clusters were generated. Human, Gorilla and Chimpanzee lie in the same cluster and show sequence similarity. Tetrapods are also closely related to each other in the analysis. Organisms in the same clusters show the highest similarity with point mutations.
No three-dimensional structure of DAOA was reported in the Protein Data Bank or resolved by X-ray crystallography and NMR. MODELLER 9v10 was employed for comparative modeling. Suitable templates for DAOA were identified by running BLASTp, which retrieved three options: 1ZCA, 1V30 and 2E5K. Among these, 1ZCA showed the highest query coverage. The model was predicted by all the templates and 1ZCA showed good evaluation results: 91.1% of the predicted amino acids fell in the favored region and 4.9% in the allowed region. Only four amino acids were outliers. ERRAT revealed a quality factor of 50.909%.
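The favored, allowed and outlier percentages reported above come from placing each residue on a Ramachandran plot according to its backbone dihedral angles: phi, defined by the atoms C(i-1), N, CA and C, and psi, defined by N, CA, C and N(i+1). As a minimal illustration (not Rampage's implementation), the Python sketch below computes a signed dihedral angle from four atomic positions. The function name and the toy coordinates are assumptions introduced here.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four 3D points."""
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p0, p1)          # points from the central bond back to p0
    b1 = sub(p2, p1)          # central bond p1 -> p2
    b2 = sub(p3, p2)          # points from the central bond on to p3
    length = math.sqrt(dot(b1, b1))
    b1 = tuple(x / length for x in b1)

    # Remove the components along the central bond, leaving the two
    # projections onto the plane perpendicular to it.
    v = sub(b0, tuple(dot(b0, b1) * x for x in b1))
    w = sub(b2, tuple(dot(b2, b1) * x for x in b1))

    x = dot(v, w)
    y = dot(cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Toy geometry (hypothetical coordinates): a right-angle twist.
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

Applying such a function to consecutive backbone atoms of a model yields one (phi, psi) pair per residue; an evaluation tool then assigns each pair to a region of the plot.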
The gene DAOA was subjected to mutational analysis and mutations were extracted from biological databases and the literature. Only one mutation of DAOA has been reported and authenticated in biological databases: the Arg30Lys missense mutation, which affects protein function because the mutant residue is smaller than the wild-type residue and the region around residue 30 is conserved. The mutated protein has abnormal function and its binding and interactions with other proteins are affected, including components of signaling pathways.
A phylogenetic tool (MEGA5) was used to construct a neighbor-joining tree of the selected candidate gene to determine its evolutionary history among human, mammals, tetrapods, primates, birds, teleosts and rodents. DAOA has no reported paralogs in biological databases (ENSEMBL and UCSC) or in the literature. Orthologs of DAOA were used for tree construction. Ciona was inferred as outgroup in the trees. DAOA showed a conserved sequence with primates having few mutations. It has evolved successively from Ciona to human.
Figure 8 Interaction network of DAOA showed closely interacting proteins. DAO protein showed the closest interaction with DAOA with score 0.953; 3D structure is available in PDB.
Protein-ligand docking for DAOA was simulated using AutoDock software. The ligand for receptor-ligand docking was retrieved from different databases, commonly PubChem and ZINC. The ligand O,O-diethylthiophosphoryl (Z)-2-(2-aminothiazol-4-yl)-2-trityloxyiminoacetate was used for the protein-ligand docking study of DAOA. The complex with the lowest docked binding energy was selected for post-docking analysis using VMD software (version 1.9.1) and interactions between receptor and ligand were analyzed. In particular, amino acid residues located within 4 Å of the ligand were identified and their prospective interaction types were inferred. Eight interactions were observed between receptor protein and ligand. No ionic interaction was found but there were two hydrogen bonds: one between an oxygen atom of Asn-38 and a hydrogen in the ligand, and one between an oxygen of Gln-36 in the receptor protein and a hydrogen in the ligand. The bond distances between Asn-38 and Gln-36 and the ligand were 3.45 Å and 2.85 Å respectively. Six hydrophobic interactions were observed between carbon atoms of the receptor protein and ligand. A carbon in Ile-119 interacted hydrophobically with a carbon in the ligand at 3.80 Å bond distance. Another carbon atom of Ile-119 also interacted with a ligand carbon at 3.84 Å bond distance. Ser-121 and Ser-71 of the receptor protein also showed hydrophobic interactions with carbon atoms of the ligand, with bond distances of 3.56 Å and 3.89 Å respectively. Carbon atoms of Ala-31 and Thr-122 also interacted with the ligand, with bond distances of 3.96 Å and 3.87 Å.
GRAMM-X and PatchDock were utilized for DAOA protein-protein docking. The PyMOL visualization tool was used for post-docking analysis of protein-protein interactions. DAOA and its functionally interacting partner DAO showed one hydrogen bond and two ionic interactions. An oxygen atom in Arg-286 of the ligand protein formed a hydrogen bond with a hydrogen in Tyr-03 at a bond distance of 2.5 Å. A nitrogen atom in Lys-163 of the ligand protein interacted with an oxygen in Lys-7 at a bond distance of 3.5 Å, and an oxygen atom in Asp-2037 of DAO formed an ionic bond with a nitrogen atom in Lys-7 of DAOA at a bond distance of 3.3 Å.

Conclusion
For receptor-ligand interactions, both functional and expression studies show that the product of DAOA interacts with the enzyme D-amino acid oxidase and modulates its activity. Glutamate signaling is involved in important pathways directly implicated in SZ [15]. The ligand examined here (C28H28N3O5PS2) could serve as a candidate biomolecule, and in vivo experiments could be performed in mice to check its effects and interactions, with a view to developing an approved drug for SZ. More than 80% homology between human and other primates supports an ancestral relationship that will help in predicting protein function and family. Our research suggests a baseline for the design, development and validation of novel drugs against SZ.

Materials & methods
The amino acid sequence of DAOA (125 residues) was used for homology modeling since it is involved in SZ. The sequence was retrieved in FASTA format from Uniprot Knowledge base with accession number A2T115.
The retrieved amino acid sequence of DAOA was subjected to a protein-protein BLAST (BLASTp) search against the Protein Data Bank (PDB) to identify a suitable template structure for comparative modeling. The structure with PDB ID 1ZCA was selected as a suitable template, with 27% identity to the query sequence, 33% query coverage and an E-value of 4.9. The automated protein modeling program MODELLER 9v10 was used to generate models. It predicted the 3D structure of the protein by satisfying spatial restraints [18]. The evaluation tools Rampage and ERRAT were applied to assess the predicted 3D model of DAOA. Rampage generated a Ramachandran plot and ERRAT evaluated the quality of the predicted structures.
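The identity and query coverage figures used for template selection can be derived directly from a pairwise local alignment. The Python sketch below shows one common way to compute them, assuming identity is counted over all alignment columns and coverage over the full query length. The function name and the toy alignment are illustrative assumptions, not the DAOA/1ZCA alignment.

```python
def alignment_stats(aligned_query, aligned_subject, query_length):
    """Return (percent identity, percent query coverage) for one alignment.

    aligned_query / aligned_subject: gapped strings of equal length,
    with '-' marking gaps. query_length: length of the full query.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query or query_length <= 0:
        raise ValueError("alignment and query must be non-empty")

    # Identical, non-gap columns.
    identical = sum(
        1 for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != "-"
    )
    # Query residues actually placed in the alignment.
    query_residues = sum(1 for q in aligned_query if q != "-")

    identity = 100.0 * identical / len(aligned_query)
    coverage = 100.0 * query_residues / query_length
    return identity, coverage

# Toy alignment (hypothetical sequences, for illustration only).
identity, coverage = alignment_stats("MKT-LLVA", "MRTALLIA", query_length=20)
```

Low identity combined with partial coverage, as reported for the selected template, means that only part of the query is modeled on the template and that the resulting model should be evaluated with particular care.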
The mutation was retrieved from the HGMD biological database. The HOPE server was used for mutational analysis. To investigate ancestral relationships, the widely used phylogenetic analysis software Molecular Evolutionary Genetics Analysis (MEGA 5) was applied to DAOA. A distance-based approach was applied using the neighbor-joining method, with bootstrap support based on 1000 replications, to construct the phylogenetic tree.
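The distance-based approach starts from a matrix of pairwise distances. As a rough sketch of the p-distance with complete deletion (the options named in the Figure 4 legend), the Python code below removes every alignment column containing a gap and then computes the proportion of differing sites for each pair. This is an illustration of the principle, not MEGA's code; the function names and the three toy sequences are assumptions introduced here.

```python
def complete_deletion(alignment):
    """Drop every column in which at least one sequence has a gap ('-')."""
    names = list(alignment)
    columns = zip(*(alignment[name] for name in names))
    kept = [col for col in columns if "-" not in col]
    return {
        name: "".join(col[i] for col in kept)
        for i, name in enumerate(names)
    }

def p_distance(seq_a, seq_b):
    """Proportion of sites at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)

def distance_matrix(alignment):
    """Pairwise p-distances after complete deletion, keyed by name pairs."""
    cleaned = complete_deletion(alignment)
    return {
        (m, n): p_distance(cleaned[m], cleaned[n])
        for m in cleaned for n in cleaned
    }

# Toy alignment (hypothetical sequences, for illustration only).
toy = {
    "human": "MLEKLMGADS",
    "chimp": "MLEKLMGADS",
    "ciona": "MIE-LQGSDT",
}
distances = distance_matrix(toy)
```

A neighbor-joining tree is then built from such a matrix, and bootstrap support is obtained by resampling alignment columns with replacement and repeating the procedure for each replicate.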
Blind docking was simulated to identify the specific binding site for receptor-ligand and protein-protein interactions. The coordinates of the ligand molecule [Accession number PubChem: 19894] were obtained from PubChem. The SDF (structure-data file) format of the ligand was converted into .pdb format using ChemDraw Ultra Version 8.0 [19]. The .pdb files of the protein (receptor) and ligand were used as input to run AutoDock Vina. To determine the binding affinities between DAOA and the selected ligand, a flexible automated docking program was applied to the active site of the protein using AutoDock Vina. AutoDock uses a grid map to aid the actual docking process. The dimensions of the grid were 40 × 40 × 40 points for the ligand, with 0.375 Å spacing between the grid points.
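A grid of 40 points at 0.375 Å spacing corresponds to a box edge of roughly 40 × 0.375 = 15 Å. The Python sketch below converts such a grid specification into the box size in ångströms and formats it as an AutoDock Vina configuration. It is a minimal sketch under stated assumptions: the file names and the box center are hypothetical placeholders (the center is not given in the text), the point count is treated as the number of grid intervals, and the placeholder names use the PDBQT extension that Vina normally expects for its input files.

```python
def box_edge_angstrom(n_points, spacing):
    """Approximate box edge length in angstroms for a grid axis."""
    return n_points * spacing

def vina_config(receptor, ligand, center, n_points=(40, 40, 40), spacing=0.375):
    """Build the text of a Vina configuration file for one search box.

    receptor, ligand: input file names (hypothetical placeholders here).
    center: (x, y, z) box center in angstroms (assumed, not from the paper).
    """
    sizes = [box_edge_angstrom(n, spacing) for n in n_points]
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    lines += [f"center_{axis} = {c:.3f}" for axis, c in zip("xyz", center)]
    lines += [f"size_{axis} = {s:.3f}" for axis, s in zip("xyz", sizes)]
    return "\n".join(lines) + "\n"

# Hypothetical file names and center, for illustration only.
config_text = vina_config("daoa_model.pdbqt", "ligand_19894.pdbqt", (0.0, 0.0, 0.0))
```

The resulting text can be saved to a file and passed to Vina through its `--config` option.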
The STRING server was used to assess the protein interactions of DAOA. It is an online database of known and predicted protein interactions including direct (physical) and indirect (functional) relationships. Protein docking of DAOA with its interacting protein DAO was simulated using PatchDock and GRAMM-X. Visual Molecular Dynamics (VMD) software was used for visualization and post-docking analysis of protein-ligand docking; PyMOL software was used for protein-protein docking. The tools employed in this study are listed in Table 5.