Hints and Troubleshooting
Generally, the DockIt suite runs well with the default settings. Occasionally, however, it is necessary to modify a setting manually or to check that a step succeeded.

When pdb2site is used to split a ligand and site from a PDB file automatically, it cannot always guess the ligand correctly. pdb2site has a number of built-in rules about how a ligand is most likely represented in a PDB file, but there are no hard and fast rules it can depend on, so it sometimes chooses the wrong ligand. Check the messages output by pdb2site to make sure that the appropriate choice has been made. If not, you will need to split out the ligand and protein manually in order to continue (see below).

Similarly, the atom typing routines cannot always guess the bond orders of the ligand correctly from geometry, particularly when the ligand geometry in the protein/ligand complex structure is somewhat dubious. It is therefore often a good idea to examine the ligand.cex file in Rasmol (rasmol -cex ligand.cex) to check whether the bond orders are correct. If they are not, it is probably best to convert the ligand into a connection-table-based representation (SMILES, TDT, or MDL mol file, for example) and start from there.

Note that the preceding points only matter when starting from a PDB file that contains both ligand and protein. In many cases, DockIt will be used to dock ligands that come from a database and thus already have the appropriate connectivity specified.

To split the ligand and protein manually, use a text editor to create a protein.pdb file with the ATOM (and any HETATM) records for the protein part of the PDB file. You can include any cofactors, waters, etc. that you think might be involved in ligand binding.
Then create a ligand.pdb file that contains the ligand ATOM or HETATM records. You will need to put the following lines at the top of protein.pdb:

    COMPND XXXX

where XXXX is replaced by an identifier code for your protein. Also, put the following lines at the top of the ligand.pdb file:

    COMPND XXXX

To proceed with the setup, carry out the steps in the script bin/setup that follow pdb2site.

If your target protein does not have a bound ligand in the structure, create the protein.pdb file as above and skip the steps in bin/setup that involve a ligand (you will need to set up whatever ligands you wish to dock in a manner similar to how the script bin/setup_ligands does it). You will also need to remove the -select flag from the sublime step, since you do not have a bound ligand to assist in the binding site selection.

In almost all cases, DockIt is capable of generating a suitable receptor site representation automatically. In some cases, however, the set of spheres generated to represent the binding site is not optimal, and some manual tuning of parameters will improve performance. You can examine the spheres graphically either by looking at the spheres_filtered.pdb file with any graphics program or by using Rasmol to look at the spheres.cex file (see the tips below for using Rasmol with CEX files).

The most common problem with the spheres is that they include too much volume. This can often be corrected by changing the sublime parameter -minscore to a number larger than the default of 3.5. Conversely, if the spheres do not include an important binding pocket, it is often because the pocket is too exposed. In these cases, decreasing -minscore will often correct the situation. sublime generates a Rasmol script (sublime.rasmol) that color codes the spheres by ligsite score, which can help in deciding how to change -minscore.
Sometimes the largest cluster of spheres is not the correct one, and the -clusters and -whichclusters parameters can be used to select the appropriate cluster(s). The -select and -reject options can also be useful in fine-tuning the sphere set.

The parameters for the docking step of DockIt are contained in the file cmd.cex by default. These defaults can be modified by editing this file (or a new copy of it). In most cases, the default parameters are fine. However, they are tuned to give good docked geometries at some cost in computational speed. For screening large databases, it may be preferable to get the maximum speed, in which case some of the parameters in cmd.cex can be changed. In particular, TRIALS can be set lower (around 2), and the number of iterations (EVAL_LIMIT_1ST, EVAL_LIMIT_2ND, EVAL_LIMIT_3RD) can be lowered to 300 or 400. Also, by default, hydrogens are added in an optional third stage of refinement. This gives better ligand geometry, particularly for long chains, but costs approximately a factor of 2 in compute time. For the highest speed, it may be appropriate to disable this third stage (set ADDH to FALSE). A good strategy might be to run a large database with the fastest settings and then use the normal default settings to redock the best-scoring ligands from the first run.

For large docking runs, the output file from DockIt can get large. The simplest solution is to pipe the output from the dock program into gzip or split. Since the CEX output compresses well, gzip is usually a good solution for this problem. For parallel runs the same solution can be used, since the output is still a single CEX stream even when multiple processors are being used.

The version of Rasmol included with DockIt is capable of reading the CEX files that are used for internal communication within the DockIt program suite. To view a CEX file, use the command rasmol -cex filename.cex.
CEX files can be concatenated (using the Unix cat command: "cat file1.cex file2.cex >combined.cex") and viewed in Rasmol. When more than one CEX molecule object is present in the input file, the objects can be selected using the Rasmol model commands. Models can be selected either with the ::n syntax (for example, "select ::0" will select the first molecule object in the CEX file) or with the model specifier (for example, "select model > 2" will select all molecule objects after the third one). So it is possible to look at the protein, the original ligand, and the docked ligands by using the following command:

    cat protein.cex ligand.cex docked.cex >all.cex

in which case "select ::0" will select the protein, "select ::1" will select the original ligand, and "select model > 1" will select all the docked ligand conformations.

For further information on using Rasmol, use the help command within Rasmol.
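
Returning to the manual ligand/protein split described earlier: instead of a text editor, the same split can be scripted. The following is a minimal sketch in Python, not part of DockIt. The function names split_pdb and write_split are hypothetical. It assumes the ligand can be identified by its residue name (columns 18-20 of each ATOM/HETATM record) and that a single COMPND line, spaced as printed above, is the only header needed. Check both assumptions against your own PDB file and DockIt installation.

```python
from pathlib import Path


def split_pdb(pdb_text, ligand_resnames, protein_id, ligand_id):
    """Return (protein_text, ligand_text) built from the text of a PDB file.

    ATOM/HETATM records whose residue name is in ligand_resnames go to the
    ligand; every other ATOM/HETATM record (cofactors and waters included)
    stays with the protein. All other record types are dropped.
    """
    wanted = {name.strip().upper() for name in ligand_resnames}
    protein, ligand = [], []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue  # skip HEADER, REMARK, CONECT, END, etc.
        resname = line[17:20].strip().upper()  # PDB columns 18-20
        (ligand if resname in wanted else protein).append(line)

    # Assumption: one COMPND line per file, spaced as shown in the text above.
    protein_text = "\n".join([f"COMPND {protein_id}"] + protein) + "\n"
    ligand_text = "\n".join([f"COMPND {ligand_id}"] + ligand) + "\n"
    return protein_text, ligand_text


def write_split(pdb_path, ligand_resnames, protein_id, ligand_id, out_dir="."):
    """Read pdb_path and write protein.pdb and ligand.pdb into out_dir."""
    protein_text, ligand_text = split_pdb(
        Path(pdb_path).read_text(), ligand_resnames, protein_id, ligand_id
    )
    out = Path(out_dir)
    (out / "protein.pdb").write_text(protein_text)
    (out / "ligand.pdb").write_text(ligand_text)
```

For example, write_split("complex.pdb", ["LIG"], "1ABC", "LIG1") would write the two files into the current directory. The file name, residue name, and identifiers here are placeholders. If any cofactors or waters should be left out of the protein, remove them from protein.pdb by hand afterwards.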