HT2: Processing
T5: Tutorial: Automatic Model Building, Refinement and Ligand Fitting
Maria Fando
Science and Technology Facilities Council, CCP4 core group
Oxford, UK
From phased model to (re-)built and refined structure with fitted ligand, using automatic pipelines and project workflows in CCP4 Cloud
Automatic Model Building (AMB) is recommended when large parts of the structure need to be built (e.g., after Experimental Phasing) or re-built (e.g., after Molecular Replacement, especially if the MR search model has low sequence similarity to the target structure). AMB is expected to fit 70% to 90% of the residues correctly into the density, depending on the difficulty of the case, and the results should always be inspected visually, then corrected and completed manually with Coot. AMB usually saves considerable time and effort and is particularly useful for large and complex protein structures, where manual model building would be time-consuming and error-prone.
CCP4 provides several AMB tools, which include automatic pipelines and project workflows for model building, refinement, and ligand fitting. AMB software generates an initial trace of the macromolecule (protein or RNA/DNA) based on the electron density map obtained after Experimental Phasing (EP) or Molecular Replacement (MR). While the need for AMB is rather obvious in the case of EP (note that AMB is included in the Crank-2 pipeline for automatic EP), it is often also required in MR, even if the MR search model was calculated by AlphaFold-2 or obtained from AFDB with 100% sequence identity. This is because a highly homologous search model does not guarantee correct conformations, low-confidence residues (equivalent to high B-factors) are removed before MR, and the structure may be partly disordered.
CCP4 Cloud gives access to the following AMB tools: Buccaneer (fast protein building), ARP/wARP (thorough protein building at resolutions up to 2.5 Å), CCP4Build (thorough protein building), Modelcraft (thorough protein and RNA/DNA building) and awNUCE (RNA building without phase modification). Because the range of tools and their specifications can be confusing, we recommend Modelcraft as a first attempt in most cases, although the performance of each tool depends on the case.
In this tutorial, we will cover the steps involved in automatic model building and gain practical experience with the corresponding CCP4 Cloud tasks.
Step 1: Density modification
Density modification is used to improve phase quality and make the electron density more suitable for AMB. This step is especially needed after EP, which often results in phases of insufficient quality. Note, however, that Modelcraft and CCP4Build include their own density modification steps, so running density modification before these tasks is not recommended. Density modification uses several techniques that are equivalent to incorporating additional information into the electron density map, which helps auto-builders start from better assumptions and produce higher-quality structures in the end.
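The rule of thumb above can be written down as a small lookup. This is only an illustrative sketch: the function and constant names are hypothetical, the strings are plain tool names rather than CCP4 Cloud task identifiers, and the only fact encoded is the one stated in the text (Modelcraft and CCP4Build carry out density modification themselves).

```python
# Illustrative sketch only; names here are hypothetical, not a CCP4 Cloud API.

# Builders that, according to the text above, include density modification.
BUILDERS_WITH_INTERNAL_DM = {"Modelcraft", "CCP4Build"}

# All builders mentioned in this tutorial.
KNOWN_BUILDERS = BUILDERS_WITH_INTERNAL_DM | {"Buccaneer", "ARP/wARP", "awNUCE"}


def consider_separate_density_modification(builder: str) -> bool:
    """Return True if a separate density modification step may be worth
    considering before running the given builder, and False if the builder
    already performs density modification internally."""
    if builder not in KNOWN_BUILDERS:
        raise ValueError(f"Unknown builder: {builder!r}")
    return builder not in BUILDERS_WITH_INTERNAL_DM


if __name__ == "__main__":
    for name in sorted(KNOWN_BUILDERS):
        print(name, consider_separate_density_modification(name))
```

Whether a separate density modification step actually helps for the remaining builders still depends on the case, for example on whether the phases come from EP.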
Step 2: Automatic Model Building
The next step is to generate an initial trace of the macromolecule using Modelcraft, ARP/wARP, Buccaneer, CCP4Build or Automatic RNA/DNA Building with ARP/wARP (awNUCE).
Step 3: Automated Model Refinement
After generating the initial trace, the next step is to improve structure quality and phases by refining with suitably chosen parameters. Suitable parameters for Refmac may be found over multiple trials by examining the Verdict section of the Refmac report pages. Alternatively, good results can usually be achieved with the automatic refinement and ligand fitting workflow wREL. For low-resolution data, the LoRESTR pipeline may be required.
Step 4: Automatic Ligand Fitting
Once the protein model has been refined, the next step is to fit ligands into the model. Make sure to remove any water molecules that model builders may have placed into ligand density blobs. Also keep in mind that auto-builders may build protein backbone into ligand density by mistake, in which case the wrongly placed residues must be removed with Coot before ligand fitting. We will consider ligand fitting techniques using automatic tools in CCP4 Cloud.
Step 5: Model Validation
The final step in automatic model building is model validation. This involves checking the quality and accuracy of the final model against acceptable ranges of quality scores and against structures of similar resolution found in the PDB. Here we will demonstrate a range of validation tools available in CCP4 Cloud.