In this tutorial, working from within CCP4i2, we will generate coordinates and a restraint description for a ligand from a SMILES string using AceDRG, use Coot to fit the ligand into a macromolecular model, and refine the model using REFMAC5.
A basic working knowledge of the CCP4i2 GUI is assumed.
Before We Start
Required files:
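ligand_removed.pdb - the model with PDB code 3DZ4, after removing the ligand;
ligand_removed.mtz - the reflection data file, used later for the maps in Coot and for refinement in REFMAC5;
c8m.smi - the SMILES file for the ligand.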
Generating Ligand Coordinates And Description From Within CCP4i2
We will begin by using AceDRG to generate the required coordinates and description for the ligand:
Open the CCP4i2 GUI, and create a new project as appropriate.
Open the "Make Ligand - Acedrg" task interface, which can be found in the Ligands module.
Enter a meaningful job title, e.g. "Tutorial - Ligand Generation".
Choose to start from the SMILES file c8m.smi.
Alternatively, copy and paste the SMILES string into the interface directly:
CN(CCC(N)=O)C[C@H]1O[C@H]([C@H](O)[C@@H]1O)n1c(C)nc2c(N)ncnc12
Specify the three letter code for the output monomer: C8M
Click the "Run" button on the top-right of the interface to run the job.
AceDRG will infer chemistry from the SMILES string - information about atomic connectivity and bond orders. It will then consult the AceDRG tables in order to generate a full ligand description, including the restraints for the bonds, angles, etc. It then performs conformer generation using RDKit, followed by final optimisation using REFMAC5.
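To get a feel for how much chemistry is encoded in the SMILES string, the following minimal Python sketch tallies the heavy atoms, marked stereocentres, rings and explicit double bonds. It is only an illustration: the tokeniser handles the common organic subset, the function name summarise_smiles is made up for this example, and it is not how AceDRG itself parses the input.

```python
import re
from collections import Counter

SMILES = "CN(CCC(N)=O)C[C@H]1O[C@H]([C@H](O)[C@@H]1O)n1c(C)nc2c(N)ncnc12"

# Bracket atoms first, then two-letter halogens, single-letter organic-subset
# atoms (aliphatic and aromatic), ring-closure labels, and bond/branch symbols.
TOKEN = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]|%\d\d|\d|[=#()/\\.:+-]")


def summarise_smiles(smiles):
    """Return a rough tally of what a SMILES string encodes (illustrative only)."""
    elements = Counter()
    stereocentres = 0
    ring_labels = 0
    double_bonds = 0
    for tok in TOKEN.findall(smiles):
        if tok.startswith("["):
            # Bracket atom, e.g. [C@H]: take the element symbol after any isotope.
            symbol = re.match(r"\[\d*([A-Z][a-z]?|[a-z])", tok).group(1)
            elements[symbol.capitalize()] += 1
            if "@" in tok:
                stereocentres += 1
        elif tok[0].isalpha():
            # Aromatic atoms are lower case in SMILES; count them by element.
            elements[tok.capitalize()] += 1
        elif tok[0].isdigit() or tok.startswith("%"):
            ring_labels += 1
        elif tok == "=":
            double_bonds += 1
    return {
        "elements": dict(elements),
        "stereocentres": stereocentres,
        "rings": ring_labels // 2,  # each ring closure uses its label twice
        "explicit_double_bonds": double_bonds,
    }


summary = summarise_smiles(SMILES)
print(summary)
# {'elements': {'C': 15, 'N': 7, 'O': 4}, 'stereocentres': 4, 'rings': 3, 'explicit_double_bonds': 1}
```

Hydrogens and aromatic bond orders are implicit in the string, which is part of what a tool such as AceDRG has to work out before it can assign restraints.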
When the job has finished, a 2D representation of the ligand will be displayed using Coot's Lidia.
Finding the Ligand Position
Now open our newly created ligand in Coot:
Click the Coot button (at the bottom of the screen) - this will start a new job that will open Coot to display the ligand created by AceDRG.
Note that the ligand coordinates are already selected under "Atomic model", and the dictionary for the ligand C8M is already selected in the "Geometry dictionary" field.
As well as displaying the ligand, we also want to display our protein model.
Under "Coordinates", select "Show List". Click the "+" button to add a coordinate set, and browse to find the coordinates file: ligand_removed.pdb
Any changes we make to these coordinates can be saved back to CCP4i2 when we exit Coot.
Now run the job.
Coot will open.
Both the protein coordinates and the ligand coordinates have been loaded into Coot... but the ligand is not in the correct position. We must now fit the ligand.
The first thing to do is to open the electron density maps using Coot's "Auto Open MTZ" option in the File menu - select the file: ligand_removed.mtz
Now let's find the position of the ligand by using Coot's "Unmodelled blobs" feature in the Validate menu.
Make sure that the protein model is selected - not the ligand coordinates - and then click Find Blobs.
It should find one blob - click on it to centre on the blob.
If it finds many blobs then something has gone wrong... perhaps the map was masked using the ligand coordinates instead of the protein coordinates...?
Zoom in (key: m) and you should see green difference density corresponding to the ligand.
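The idea behind a blob search can be sketched in a few lines of Python: grid points whose density exceeds a threshold are grouped into connected clusters, and the clusters are ranked by size. This is a toy illustration under stated assumptions, not Coot's implementation. The grid values, the threshold and the function name find_blobs are all made up, the map is assumed to have already been masked around the protein model, and crystallographic periodicity is ignored.

```python
from collections import deque

# Offsets to the six face-sharing neighbours of a grid point.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def find_blobs(density, threshold):
    """Group grid points with density >= threshold into connected blobs."""
    nx, ny, nz = len(density), len(density[0]), len(density[0][0])
    seen = set()
    blobs = []
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                if (x, y, z) in seen or density[x][y][z] < threshold:
                    continue
                # Breadth-first flood fill starting from this unvisited point.
                seen.add((x, y, z))
                queue = deque([(x, y, z)])
                blob = []
                while queue:
                    i, j, k = queue.popleft()
                    blob.append((i, j, k))
                    for di, dj, dk in NEIGHBOURS:
                        a, b, c = i + di, j + dj, k + dk
                        inside = 0 <= a < nx and 0 <= b < ny and 0 <= c < nz
                        if inside and (a, b, c) not in seen and density[a][b][c] >= threshold:
                            seen.add((a, b, c))
                            queue.append((a, b, c))
                blobs.append(blob)
    # Largest blob first, since it is the most likely ligand site.
    return sorted(blobs, key=len, reverse=True)


# Toy 4x4x4 difference map: one three-point blob and one isolated peak.
grid = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
for px, py, pz in [(1, 1, 1), (1, 1, 2), (1, 2, 2)]:
    grid[px][py][pz] = 4.0
grid[3][3][0] = 3.5

blobs = find_blobs(grid, threshold=3.0)
print([len(blob) for blob in blobs])  # [3, 1]
```

If the map had been masked with the wrong coordinates, density belonging to the protein would also exceed the threshold, which is why a long list of blobs usually signals a problem.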
Fitting the Ligand
Now we must fit our ligand coordinates into the blob, whilst keeping the geometry chemically sensible using the AceDRG restraint dictionary. Since this ligand has rotatable bonds, we try generating many conformers in the hope that one of them matches our blob sufficiently well.
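To see why the number of conformers matters, the sketch below enumerates torsion-angle combinations on a regular grid. This is only an illustration of how quickly the search space grows: the bond counts and step size are hypothetical, the function name torsion_grid is made up, and Coot's own sampling strategy may well differ (for example, it may sample angles randomly rather than exhaustively).

```python
from itertools import product


def torsion_grid(n_rotatable_bonds, step_degrees):
    """List every combination of torsion angles on a regular grid."""
    angles = range(0, 360, step_degrees)
    return list(product(angles, repeat=n_rotatable_bonds))


# One rotatable bond sampled every 120 degrees gives three candidates.
print(torsion_grid(1, 120))  # [(0,), (120,), (240,)]

# The number of candidates grows exponentially with the number of bonds.
for n_bonds in (2, 4, 6):
    print(n_bonds, len(torsion_grid(n_bonds, 60)))
# 2 36
# 4 1296
# 6 46656
```

Because an exhaustive grid becomes expensive so quickly, a fitting program can only ever trial a subset of conformers, so a search that fails once may succeed when more conformers are sampled.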
To fit the ligand, go to Coot's Calculate menu, select "Other Modelling Tools" and then "Find Ligands".
Select the coordinates for the ligand C8M, and tick the box to do flexible fitting. This will result in Coot trialling multiple conformers by sampling different rotatable bond angles.
Select the protein model in the "Select Protein" section (by default the coordinates for the ligand may be selected instead of the protein model).
Under "Where to Search?" select "Right here".
Leave everything else at the default values.
Now click "Find ligands".
Hopefully it will have found a ligand - select this ligand.
If the ligand is not found, then repeat the process, perhaps increasing the number of conformers to search.
The ligand should be placed in roughly the correct place, but it is unlikely to be in exactly the correct conformation.
Manual intervention is required in order to correctly position the ligand. Try real space refinement (key: r), and manually drag atoms into a reasonable position.
We now need to merge this ligand model into the protein model. In the Edit menu, select "Merge Molecules". We want to insert the "Fitted Ligand" into Molecule "ligand_removed.pdb".
You should see the ligand change colour to match the surrounding protein model.
Ligand Validation
Alongside real space refinement in Coot, it is necessary to perform validation in order to ensure that the protein-ligand interactions are sensible. Make any final adjustments to the ligand coordinates, aided by Coot's ligand validation tools:
Select "Environment Distances" in the Measures menu, and enable "Show Residue Environment" in order to analyse interactions between the protein and ligand. This helps to analyse the hydrogen bonding network.
Select "Isolated Molprobity dots for this ligand" from the Ligand menu. This helps to identify any bad clashes.
Select "Display Ligand Distortions" from the Ligand menu. This assesses the geometry (bonds, angles, etc.) of the ligand using Mogul. This analysis is a type of cross-validation, since Mogul gets its data from the CSD, whereas AceDRG's data comes from the COD.
To further analyse the ligand's environment, you can use Coot's Flatland Environment View. Select "FLEV this residue" from the Ligand menu. The Lidia ligand builder will open - click the "Env. Residues" button to analyse the ligand environment in 2D.
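The environment distances display is, at heart, a list of short ligand-protein contacts. A minimal Python sketch of that calculation follows. The atom labels and coordinates are invented for illustration and do not come from 3DZ4, the 3.5 Angstrom cutoff is an assumed rule of thumb for hydrogen-bond-like distances, and the function name close_contacts is hypothetical.

```python
import math


def close_contacts(ligand_atoms, protein_atoms, max_dist=3.5):
    """Return (ligand atom, protein atom, distance) for pairs within max_dist."""
    contacts = []
    for lig_name, lig_xyz in ligand_atoms:
        for prot_name, prot_xyz in protein_atoms:
            distance = math.dist(lig_xyz, prot_xyz)
            if distance <= max_dist:
                contacts.append((lig_name, prot_name, round(distance, 2)))
    # Shortest contacts first: these deserve the closest inspection.
    return sorted(contacts, key=lambda contact: contact[2])


# Made-up coordinates in Angstroms, purely for demonstration.
ligand = [("ligand N", (0.0, 0.0, 0.0)), ("ligand O", (8.0, 0.0, 0.0))]
protein = [
    ("protein O (acceptor)", (2.9, 0.0, 0.0)),
    ("protein O (donor)", (8.0, 2.7, 0.0)),
    ("protein C (distant)", (10.0, 10.0, 10.0)),
]

contacts = close_contacts(ligand, protein)
for contact in contacts:
    print(contact)
# ('ligand O', 'protein O (donor)', 2.7)
# ('ligand N', 'protein O (acceptor)', 2.9)
```

A distance test alone cannot tell a hydrogen bond from a clash, which is why the tutorial pairs environment distances with the Molprobity dots and a visual check of donor and acceptor chemistry.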
Full Model Refinement
Once you are happy with the ligand, the next step is to save the model back to CCP4i2 so that we can continue with further rounds of model building and refinement:
In Coot's File menu, select "Save mol to CCP4i2". Make sure that you save the correct model corresponding to the protein-ligand complex.
Now close Coot to return to CCP4i2.
Click the REFMAC5 button (at the bottom of the screen) - this will start a new job that will run REFMAC5 to refine the protein-ligand complex.
You will see that the coordinate model has been pre-populated, as has the ligand dictionary. However, we need to provide the ligand_removed.mtz file in the Reflections and Free R set fields.
Now run the job.
If the job fails, it could be due to issues with the Free R labels - in this case, generate a new set of Free R flags (select "Generate a Free R set", which is located in the "X-ray data reduction and analysis" section of the Task menu), then clone the REFMAC5 job and re-run it using this new Free R set.
Follow the REFMAC5 job with a Coot job. Note that the atomic model, the map coefficients, and the ligand geometry dictionary will be automatically transferred to Coot.
In Coot, inspect the ligand. Does the difference density map support the ligand model? Re-enable environment distances, isolated dots and ligand distortions. Does the ligand seem reasonable?
In practice, further rounds of manual model building and full-model refinement could be performed by iterating the above procedure.
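For readers curious about the Free R set mentioned in the refinement steps above, the following Python sketch shows the basic idea: a small random subset of reflections is flagged and kept out of refinement so that it can be used for cross-validation. It is an illustration only, not the CCP4i2 task. The 5% fraction and the use of flag 0 for the free set are assumptions based on common CCP4 practice, and the function name assign_free_flags is made up.

```python
import random


def assign_free_flags(n_reflections, free_fraction=0.05, seed=0):
    """Flag a random subset of reflections as free (0); the rest are working (1)."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    n_free = round(n_reflections * free_fraction)
    free_indices = set(rng.sample(range(n_reflections), n_free))
    return [0 if index in free_indices else 1 for index in range(n_reflections)]


flags = assign_free_flags(1000)
print(flags.count(0), flags.count(1))  # 50 950
```

Whatever tool generates the flags, the same free set should be kept for every subsequent round of refinement on the same data, otherwise the free R value loses its meaning as an independent check.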