Part 2: Protein structure prediction

In this phase, you will predict and analyze the structure of dihydrofolate reductase (DHFR) from S. aureus that you identified in Part 1. DHFR is an essential enzyme involved in folate metabolism and an important antibiotic target.

By completing this project, you will:

Gain practical experience with state-of-the-art protein structure prediction tools.
Learn to critically evaluate and compare protein structure predictions.
Understand how to analyze protein structures and their functional features.
Develop skills in structural visualization and analysis.

Instructions

Use PyMOL for your structural analysis. You will often need to use the "licorice" representation of the protein to see the atomistic differences.

To support your computational biology project, your instructor has provided a PSE file for each question. A PSE file is a PyMOL session file that contains a pre-configured molecular structure and relevant visualizations to help you address the specific questions. These files are designed to save you time and effort by providing the necessary setup, including loaded molecules, annotations, and possibly preset views that align with the problem requirements.

In each screenshot, ensure you provide a visual indication and/or label to specify the origin (e.g., I-TASSER or 6PRD) of the structure. Below is an example of an appropriate image to use in your report.

Example image

Conformations of His23, Glu143, and Lys144 of DHFR are shown below for 3FRD and 6PR6 with carbon atoms shown in pink and white, respectively. Structures were aligned by their alpha carbons.
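A view like the example can also be set up from scratch rather than from a provided PSE file. The sketch below only builds the text of a PyMOL command script (.pml) that you could run inside PyMOL. It assumes that both structures were saved locally as 3FRD.pdb and 6PR6.pdb (hypothetical file names) and that the "licorice" view corresponds to PyMOL's sticks representation. The function name `example_view_script` is illustrative only.

```python
def example_view_script(reference="3FRD", mobile="6PR6", residues=(23, 143, 144)):
    """Return PyMOL commands (.pml text) that approximate the example figure."""
    resi = "+".join(str(r) for r in residues)
    lines = [
        # Assumption: <ID>.pdb files exist in PyMOL's working directory.
        f"load {reference}.pdb",
        f"load {mobile}.pdb",
        # Superpose the mobile structure onto the reference using alpha carbons.
        f"align {mobile} and name CA, {reference} and name CA",
        "hide everything",
        # Show only the residues of interest as sticks.
        f"show sticks, resi {resi}",
        # Color carbons by structure so the origin of each residue is visible.
        f"color pink, {reference} and elem C",
        f"color white, {mobile} and elem C",
        f"zoom resi {resi}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(example_view_script())
```

Saving the printed text to a .pml file and running it in PyMOL should give a starting view. You would still need to add a label or legend that identifies each structure before taking the screenshot.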

Structure predictions

Using the UniProt DHFR sequence from your genome assembly project, we will perform several predictions. Instead of flooding these free servers with the same jobs, we will all use the same outputs.

I-TASSER (Job ID S799334): A threading-based method for predicting protein structures and functions by assembling models from template fragments and refining them iteratively; it is robust for cases with moderate to strong homologous templates and provides functional annotations, often generating medium-resolution models.
D-I-TASSER (Job ID DIT6377): An advanced iteration of I-TASSER using deep-learning-based spatial restraints to model proteins with greater accuracy, especially in challenging cases with weak homology, providing more reliable multi-domain structure predictions compared to I-TASSER.
I-TASSER-MTD (Job ID ITM552669806): Designed specifically for multi-domain proteins, it combines domain parsing, single-domain folding, and inter-domain assembly with improved functionality and accuracy for large, multi-domain proteins compared to D-I-TASSER.

SWISS-MODEL (Job ID a7BMLv): A homology modeling platform emphasizing user accessibility and high-quality models through evolutionary template matching; while not as suitable for low-homology cases as I-TASSER or AlphaFold3, it excels in providing intuitive results for well-conserved targets.
AlphaFold3 (Download the PDB file here): A state-of-the-art model leveraging a diffusion-based architecture to predict highly accurate protein structures and biomolecular interactions across diverse targets, outperforming specialized tools in protein-ligand and protein-nucleic acid interactions.

Submission

In your submission, answer the following questions:

Download the PDB files for SWISS-MODEL models 01 (6E4E), 05 (3FYW), 03 (6PRP), and 06 (6PR8). Identify whether there are any apparent protein conformational differences near the NADP(H) and folate binding pockets. Provide screenshots to support your claims. (PyMOL session file)
Download the PDB file for SWISS-MODEL 02 (2W3M) and compare it against model 01 (6E4E). What is the alpha-carbon RMSD after alignment? Out of these two structures, which would you use for docking? Justify your choice. Provide screenshots to support your claims. (PyMOL session file)
Download the I-TASSER, D-I-TASSER, I-TASSER-MTD, and your AlphaFold3 PDB structures. Compare these structures to each other and to SWISS-MODEL 01. Which prediction has the highest similarity (i.e., the lowest RMSD) to the SWISS-MODEL? Which method would you generally find more reliable? Provide screenshots to support your claims. (PyMOL session file)
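The RMSD values asked for above are normally read from PyMOL's alignment output. As a cross-check, the sketch below computes an alpha-carbon RMSD directly from PDB-format text. It assumes that the two structures have already been superposed (for example, aligned in PyMOL and then saved) and that they share residue numbering. It performs no superposition itself, and the function names are illustrative.

```python
import math


def ca_coords(pdb_text, chain=None):
    """Map residue number -> (x, y, z) for alpha carbons in PDB-format text."""
    coords = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA":
            continue
        # Keep only the primary conformer when alternate locations exist.
        if line[16] not in (" ", "A"):
            continue
        if chain is not None and line[21] != chain:
            continue
        resi = int(line[22:26])
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        coords.setdefault(resi, xyz)  # first occurrence wins
    return coords


def ca_rmsd(pdb_a, pdb_b, chain=None):
    """Return (rmsd, n_pairs) over residues present in both structures.

    Assumes the coordinates are already in the same frame of reference.
    """
    a = ca_coords(pdb_a, chain)
    b = ca_coords(pdb_b, chain)
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("no residue numbers in common")
    total = sum(math.dist(a[r], b[r]) ** 2 for r in shared)
    return math.sqrt(total / len(shared)), len(shared)
```

Note that PyMOL's `align` command rejects outlier atom pairs over several refinement cycles by default, so the RMSD it reports can be lower than an all-residue value such as this one. State which number you are reporting and how many atom pairs it covers.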

Experimental structures

The following experimental structures were selected for our analysis. All are wild-type S. aureus DHFR with co-crystallized NADP(H).

PDB ID	Additional ligand
3FRD	Folate
6PRA	None
6PRB	OWM
6PRD	OWG
6PR6	OWS
6PR7	OWP
6PR8	OWJ
6PR9	OWV
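If you prefer to download the entries in the table as a batch, one option is a short script against the RCSB file download service. This is a minimal sketch, assuming network access and that the PDB-format file for each entry is available. The helper names `rcsb_pdb_url` and `download_all` are illustrative.

```python
import os
from urllib.request import urlretrieve

# PDB IDs taken from the table above.
PDB_IDS = ["3FRD", "6PRA", "6PRB", "6PRD", "6PR6", "6PR7", "6PR8", "6PR9"]


def rcsb_pdb_url(pdb_id):
    """Build the RCSB download URL for the PDB-format file of an entry."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def download_all(pdb_ids=PDB_IDS, out_dir="."):
    """Download each entry into out_dir and return the saved file paths."""
    saved = []
    for pdb_id in pdb_ids:
        path = os.path.join(out_dir, f"{pdb_id.upper()}.pdb")
        urlretrieve(rcsb_pdb_url(pdb_id), path)
        saved.append(path)
    return saved


if __name__ == "__main__":
    for path in download_all():
        print("saved", path)
```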

Remember that you need to add hydrogens to your structures!
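In PyMOL, hydrogens can be added with the `h_add` command. To confirm that a saved file actually contains them, a quick check like the one below can help. It is a sketch that assumes standard fixed-column PDB formatting. When the element column is missing, it falls back to a heuristic based on the atom name. The function name `count_hydrogens` is illustrative.

```python
def count_hydrogens(pdb_text):
    """Return (n_hydrogens, n_atoms) over ATOM and HETATM records."""
    n_h = 0
    n_atoms = 0
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        n_atoms += 1
        # Element symbol lives in columns 77-78 of a PDB record.
        element = line[76:78].strip().upper()
        if not element:
            # Heuristic fallback: first letter of the atom name, skipping
            # leading digits (e.g. "1HB" -> "H").
            name = line[12:16].strip().lstrip("0123456789")
            element = name[:1].upper()
        if element == "H":
            n_h += 1
    return n_h, n_atoms


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as handle:
        hydrogens, atoms = count_hydrogens(handle.read())
    print(f"{hydrogens} hydrogens out of {atoms} atoms")
```

A count of zero hydrogens means the structure still needs them. Crystal structures downloaded from the PDB typically contain few or none.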