ACEDRG (CCP4: Supported Program)

NAME

acedrg
 - A stereo-chemical description generator for ligands

SYNOPSIS

acedrg -h

acedrg -v (or --version)

acedrg -c (or --mmcif=) input_mmcif_file -o (or --out=) name_root_for_output_files -r (or --res= ) output_short_monomer_name(optional)

acedrg -i (or --smi= ) input_file_containing_a_SMILES_string -o (or --out=) name_root_for_your_output_files -r (or --res= ) output_short_monomer_name(optional)

acedrg -m (or --mol= ) input_mol_file -o (or --out=) name_root_for_output_files -r (or --res= ) output_short_monomer_name(optional)

acedrg -g (or --mol2=) input_mol2_file -o (or --out=) name_root_for_output_files -r (or --res= ) output_short_monomer_name(optional)

acedrg -L (or --linkInstruction=) instruction_file_for_build_covalent-links (txt format) -o (or --out=) name_root_for_output_files -r (or --res= ) output_short_monomer_name(optional)


Description
Input and output files
Usage
Keyworded input
References
Authors and credits
How to cite ACEDRG

DESCRIPTION

The program ACEDRG derives stereo-chemical information about monomers/ligands (or small molecules). It uses atom typing based on the local chemical and topological environment to organise bond lengths and angles from a small-molecule database, i.e. the Crystallography Open Database (COD). The atom types encode the hybridisation states of atoms, membership of small rings (up to seven-membered rings), ring aromaticity and nearest-neighbour information. All atoms from COD have been classified according to the generated atom types. All bonds and angles have also been classified according to the atom types and, in a certain sense, bond types.

Using the tables containing those bonds and angles, ACEDRG can derive ideal bond lengths and angles for an unknown monomer/ligand. It also generates information about planar groups and stereo-chemical properties of the monomer/ligand. The minimum information AceDRG requires is the element types of the atoms in the monomer/ligand and its basic bonding pattern, i.e. atom connections and bond orders. Users can also provide extra information, such as atom coordinates and known chiral centres.

Acedrg can also generate link information describing a covalent bond between two monomers. The user needs to define the atoms to be covalently bonded, the bond order, and any atoms that need to be removed from each monomer. Acedrg generates a file containing (1) information on the link, i.e. the bonds, angles and torsion angles that involve atoms from both monomers, and (2) information about modifications of both monomers. Modifications can include the change, deletion or addition of atoms, atom types, bonds, angles, torsion angles, planarities, chiral centres and nominal charges.

Note: When a link is generated, AceDRG requires all bond orders in both monomers to be one of "single", "double" or "triple". "Deloc" or "aromatic" bond orders can confuse AceDRG. It is therefore recommended that both monomers be generated by AceDRG before running AceDRG in link generation mode.

Note 2: In cases such as bonds between phosphate or carboxyl groups and another ligand, make sure that the correct atom is selected. Usually the atom with a formal charge should be selected as the atom involved in the covalent link. Study the chemistry of the covalent bond between the monomers before generating link information.

INPUT AND OUTPUT FILES

When used to generate a full description of a monomer/ligand, Acedrg accepts several input file formats used in computational chemistry: SMILES, mmCIF, SDF/MOL and SYBYL MOL2 files. It writes the ACEDRG-derived ideal bond lengths, angles, plane groups, aromatic rings and chirality information to a file in mmCIF format that can be used by refinement and model-building programs. AceDRG also outputs a set of coordinates in the PDB file format.

An input instruction file is required for running Acedrg in covalent link generation mode.
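
In ligand description mode, a wrapper script may want to confirm that both output files were written before passing them on to refinement. The following is only a minimal sketch: the helper name check_acedrg_outputs is hypothetical, and the file naming (name_root.cif and name_root.pdb) is assumed from the usage examples in this document.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if both files expected from
# "acedrg ... -o ROOT" exist and are non-empty.
# Assumption: the outputs are named ROOT.cif and ROOT.pdb.
check_acedrg_outputs() {
    root=$1
    for ext in cif pdb; do
        if [ ! -s "$root.$ext" ]; then
            echo "missing or empty: $root.$ext" >&2
            return 1
        fi
    done
    return 0
}
```

It could be chained after a run, for example: acedrg -i my_ligand.smi -o my_ligand && check_acedrg_outputs my_ligand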

Input

SMILES
A typographical line notation for specifying chemical structure. It can be supplied in two ways:
  1. "C1=CC=CC(CCC2)=C12", a SMILES string which can be fed directly into the command line
  2. a_smiles.smi, a file which contains the above SMILES string and can be fed into the command line
MDL MOL File
An MDL Molfile is a file format created by MDL and now owned by Symyx. It contains information about the atoms, bonds, connectivity and coordinates of a molecule. It includes some header information, then the Connection Table (CT) containing atom information, then bond connections and types, followed by sections for more complex information.
mmCIF File
The macromolecular Crystallographic Information File (mmCIF) is an extension of the Crystallographic Information File (CIF).
MOL2 File
A MOL2 (.mol2) file is a flexible representation of a molecule, containing atom coordinates, bonds and substructure information.

Output

Execution of ACEDRG in different modes results in different output files. The most frequently used mode is the generation of ligand descriptions.

(a) Ligand description generation mode:
When the job finishes successfully, AceDRG writes two output files: a description file in mmCIF format (name_root.cif) and a coordinate file in the PDB format (name_root.pdb).

USAGE

  1. Generate descriptions of a ligand using different input file formats
  2. SMILES
    Generate descriptions of a ligand from a SMILES string
    1.       acedrg -i "C1=CC=CC(CCC2)=C12"  -o my_ligand
             When the job finishes, you will see two output files, my_ligand.cif and my_ligand.pdb.
                    
    2.       acedrg -i my_ligand.smi  -o my_ligand
             The file my_ligand.smi contains a SMILES string, such as C1=CC=CC(CCC2)=C12.
             Again, the output files are my_ligand.cif and my_ligand.pdb.
                    
    mmCIF
    Generate descriptions of a ligand from an mmCIF ligand file
           acedrg -c my_ligand.cif  -o my_ligand_fromAcedrg
           Output files:
           my_ligand_fromAcedrg.cif  - description file in the CCP4 mmCIF ligand format
           my_ligand_fromAcedrg.pdb  - coordinate file in the PDB format
           
    MDL MOL
    Generate descriptions of a ligand from an MDL MOL file
           acedrg -m my_ligand.mol  -o my_ligand
           Output files:
           my_ligand.cif  - description file in the CCP4 mmCIF ligand format
           my_ligand.pdb  - coordinate file in the PDB format
           
    SYBYL MOL2
    Generate descriptions of a ligand from a SYBYL MOL2 (.mol2) file
           acedrg -g my_ligand.mol2  -o my_ligand
           Output files:
           my_ligand.cif  - description file in the CCP4 mmCIF ligand format
           my_ligand.pdb  - coordinate file in the PDB format
         
    Other options:
    1.        acedrg -h
             -h signals that AceDRG will print a help manual with information about all input options.
                    
    2.        acedrg -v
             -v signals that AceDRG will print its version number.
                    
    3.        acedrg -c my_ligand.cif  -o my_ligand_fromAcedrg -r a_3_letter_ligand_name
             -r signals that AceDRG will use the given a_3_letter_ligand_name as the
             short name for the ligand in the output mmCIF file, i.e. my_ligand_fromAcedrg.cif.
                    
    4.        acedrg -c my_ligand.cif  -o my_ligand_fromAcedrg -p
             -p signals that AceDRG must use the coordinates from the input file.
                    
    5.        acedrg -m my_ligand.mol  -o my_ligand  -K (upper case)
             -K signals that AceDRG will keep the protonation state defined in the input file.
                   
    6.        acedrg -i my_ligand.smi  -o my_ligand  -j n (an integer)
             -j signals that AceDRG will try to generate n initial conformers and
                optimize them, then output the one conformer that has
                the lowest energy.
                   
    7.        acedrg -i my_ligand.smi  -o my_ligand  -k n (an integer)
             -k signals that AceDRG will try to generate n initial conformers and
                optimize them, then write out all n of them as separate files.
                   
    8.        acedrg -i my_ligand.smi -o my_ligand -l (lower case) n (an integer)
             -l signals that AceDRG will make RDKit run n steps in
                its geometry optimization of the molecule conformers. It
                is usually used together with option -k or -j to get better
                conformers, but it may take much longer.
                   
    9.        acedrg  -o my_atom_types_in_ligand  -n
             -n signals that AceDRG will generate two kinds of atom types (CCP4 type and AceDRG type)
                for all atoms of the ligand, then write all atom types to an output file,
                i.e. my_atom_types_in_ligand.txt
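
When many ligands arrive in mixed formats, the choice of input flag can be scripted. The sketch below is illustrative only: the helper names acedrg_flag and acedrg_command are hypothetical, the flag letters are those listed in the SYNOPSIS, and the mapping from file extension to flag, as well as the "_acedrg" suffix on the output name root, are assumptions made for this example. It prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the acedrg input flag for a file, chosen by
# extension. Flag letters come from the SYNOPSIS; the extension mapping
# is an assumption of this sketch.
acedrg_flag() {
    case "$1" in
        *.smi)  printf '%s\n' "-i" ;;   # file containing a SMILES string
        *.cif)  printf '%s\n' "-c" ;;   # mmCIF
        *.mol2) printf '%s\n' "-g" ;;   # SYBYL MOL2
        *.mol)  printf '%s\n' "-m" ;;   # MDL MOL
        *)      echo "unsupported input file: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print (do not run) the command line for one input
# file. The output root is the file name without directory or extension,
# plus "_acedrg", so that a .cif input is never overwritten.
acedrg_command() {
    input=$1
    flag=$(acedrg_flag "$input") || return 1
    root=$(basename "${input%.*}")
    printf 'acedrg %s %s -o %s_acedrg\n' "$flag" "$input" "$root"
}
```

For example, acedrg_command my_ligand.mol2 prints a command using -g, which can be inspected before it is executed.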
                   

  3. Generate a description of a link between two monomers
    1. To generate a link, create a file containing instructions about the covalent bond to be created.
      1. Create an instruction file, e.g. my_instructions.txt
      2. Run AceDRG from the command line with the instruction file, e.g.
               acedrg -L (upper case) my_instructions.txt -o my_link
                          
      3. The output file, e.g. my_link.cif, contains the detailed information about the link.
    2. How to create an instruction file.
      1. Keywords:
        1. The required keywords :
          • RES-NAME-1
          • ATOM-NAME-1
          • FILE-1
          • RES-NAME-2
          • ATOM-NAME-2
          • FILE-2
        2. The optional keywords :
          • DELETE
          • CHANGE
          • BOND-TYPE
        3. Keywords are case insensitive; the values of the keywords are case sensitive.
      2. Details and examples:
        • The instruction file should have a single line, e.g.
                     LINK: RES-NAME-1 2OP FILE-1 2OP_acedrg.cif ATOM-NAME-1  C  RES-NAME-2 VAL ATOM-NAME-2  N
          
          	   It means that AceDRG will generate a link between atom C of 2OP and atom N of VAL. The description of 2OP is in the file 2OP_acedrg.cif, and the description of VAL is taken from the monomer library.
                                   
        • The keywords RES-NAME-1 and RES-NAME-2 should be followed by the residue names, e.g. 2OP and VAL.
        • The keywords ATOM-NAME-1 and ATOM-NAME-2 should be followed by atom names, e.g. C and N. These are the atom in the first residue (defined by RES-NAME-1) and the atom in the second residue (defined by RES-NAME-2). AceDRG will generate a bond between these atoms. The default bond order is "SINGLE".
        • The keywords FILE-1 and FILE-2 should each be followed by the name of an mmCIF file which provides detailed information on the ligand. If no file name is given, AceDRG assumes that the monomer should be taken from the standard monomer library.
          1. You can generate the mmCIF file of the ligand/monomer, e.g. 2OP, by running AceDRG as described below.
            1. If the monomer is present in the PDB, it is a good idea to get the mmCIF file for this monomer, for example 2OP.cif, from the PDB. If the monomer is not in the PDB, its description can be generated from a SMILES string, mmCIF file, SDF/MOL file or MOL2 file as described above.
            2. Run acedrg using 2OP.cif as the input file.
                       acedrg -c 2OP.cif -o 2OP_acedrg  -p 
                                                                 
            3. Put the generated mmCIF file, 2OP_acedrg.cif, into the instruction file, as in the example instruction file.
        • Additional keywords can be used to control AceDRG's actions in link generation mode:
          • The keyword DELETE can be used to delete one or more atoms from one of the ligands/monomers, e.g.
                LINK: RES-NAME-1 2OP FILE-1 2OP_acedrg.cif ATOM-NAME-1 C RES-NAME-2 VAL ATOM-NAME-2 N DELETE ATOM OXT 1
                which means that atom OXT in ligand 1, i.e. 2OP, will be deleted when the link is generated. DELETE instructions should be at the end of the instruction line.
                                            
          • The keyword CHANGE can be used to change one of the bonds in one of the ligands/monomers, e.g.
                LINK: RES-NAME-1 CYS ATOM-NAME-1 SG RES-NAME-2 TMP FILE-2 TMP.cif ATOM-NAME-2 C1 CHANGE BOND C1 C2 SINGLE 2
                which means that the bond between C1 and C2 in ligand 2, i.e. TMP, will be changed to a single bond.
                                      
          • The keyword BOND-TYPE can be used to define the bond order between the linked atoms. The default bond order is SINGLE. For example,
                LINK: RES-NAME-1 LYS ATOM-NAME-1 NZ RES-NAME-2 PLP FILE-2 PLP_acedrg.cif ATOM-NAME-2 C4A BOND-TYPE DOUBLE DELETE ATOM O4A 2
                which means that the bond between NZ in LYS and C4A in PLP is a double bond, and that atom O4A in ligand 2, i.e. PLP, is deleted.
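
The link workflow above can be sketched as a short shell script. This is an illustration under stated assumptions, not a prescribed procedure: the instruction line is the LYS/PLP example from this document, the file names my_instructions.txt and my_link are placeholders, and the script assumes PLP_acedrg.cif has already been generated by an earlier AceDRG run. If acedrg or that file is not available, the script only prints the command it would run.

```shell
#!/bin/sh
# Write a one-line link instruction file (LYS/PLP example from this document).
# File names here are placeholders chosen for the sketch.
instruction_file=my_instructions.txt

cat > "$instruction_file" <<'EOF'
LINK: RES-NAME-1 LYS ATOM-NAME-1 NZ RES-NAME-2 PLP FILE-2 PLP_acedrg.cif ATOM-NAME-2 C4A BOND-TYPE DOUBLE DELETE ATOM O4A 2
EOF

# The instruction file should have a single line; stop otherwise.
lines=$(wc -l < "$instruction_file" | tr -d ' ')
if [ "$lines" -ne 1 ]; then
    echo "expected one instruction line, found $lines" >&2
    exit 1
fi

# Run acedrg only if it is installed and the monomer description exists;
# otherwise just show the command.
if command -v acedrg >/dev/null 2>&1 && [ -f PLP_acedrg.cif ]; then
    acedrg -L "$instruction_file" -o my_link
else
    echo "acedrg -L $instruction_file -o my_link"
fi
```

Keeping the instruction line in a here-document makes it easy to review the keywords and their order, in particular that the DELETE instruction comes at the end of the line.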
                                            

REFERENCES

    1. Fei Long, Robert A Nicholls, Paul Emsley, Saulius Gražulis, Andrius Merkys, Antanas Vaitkus and Garib N Murshudov "ACEDRG: A stereo-chemical description generator for ligands" Acta Cryst. (2017), D73, 112-122.
    2. Fei Long, Robert A Nicholls, Paul Emsley, Saulius Gražulis, Andrius Merkys, Antanas Vaitkus and Garib N Murshudov "Validation and extraction of stereochemical information from small molecular databases" Acta Cryst. (2017), D73, 103-111.


HOW TO CITE ACEDRG

The main reference for ACEDRG is:

Fei Long, Robert A Nicholls, Paul Emsley, Saulius Gražulis, Andrius Merkys, Antanas Vaitkus and Garib N Murshudov
"ACEDRG: A stereo-chemical description generator for ligands"
Acta Cryst. (2017), D73, 112-122.

AceDRG uses rdkit for chemistry perception and REFMAC for :

For RDKit cite
RDKit Documentation


For REFMAC cite
Murshudov, G. N., Skubak, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F. & Vagin, A. A.
REFMAC5 for the refinement of macromolecular crystal structures
Acta Cryst. (2011), D67, 355-367

SEE ALSO

RDKit, REFMAC