Writing New Scoring Functions
It is possible to write user-defined scoring functions that DockIt uses in the same way as the supplied scoring functions. A new function can serve as the primary scoring function (provided it can supply gradient information) or as a rescoring function.

User-defined scoring functions are provided to DockIt as shared object (.so) files, which are loaded dynamically at run time. The shared object file must implement a small set of functions, not all of which are necessarily required for a particular application. Most of these functions deal with initialization and parameter setup.

In addition, appropriate atom type information must be provided in the input CEX stream for the protein and ligands. This information is usually generated by separate programs (analogous to plptypes) that examine the structures and write atom type tags into the CEX stream. The tags are specific to the scoring function, and DockIt does not process them in any way; it passes them to the scoring function through some of the initialization calls required by the plugin API.

The scoring function then interprets the atom type information, along with any other information it needs about the atoms and bonds in the receptor or ligand, and creates a data structure for its own use. This structure is passed back to DockIt as an opaque object (in C terms, a void *). Each time DockIt calls the scoring function, it passes this pointer back. DockIt can therefore hand the necessary information to the scoring function through the plugin API without examining or understanding anything specific to that scoring function. This allows arbitrary scoring functions to be plugged into DockIt without modifying the DockIt executable.
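The opaque-object arrangement described above can be sketched in plain C. This is only an illustration of the pattern, not the DockIt plugin API: the type PluginState and the functions plugin_setup, plugin_set_weight, plugin_total_weight and plugin_teardown are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical plugin-private state. The host program never sees this
 * layout; it only ever holds a void * to it. */
typedef struct {
    int n_atoms;
    double *weights;
} PluginState;

/* Plugin side: build the private state and return it as an opaque pointer.
 * Returns NULL on bad input or allocation failure. */
void *plugin_setup(int n_atoms)
{
    PluginState *st;

    if (n_atoms <= 0)
        return NULL;
    st = malloc(sizeof *st);
    if (st == NULL)
        return NULL;
    st->n_atoms = n_atoms;
    st->weights = calloc((size_t)n_atoms, sizeof *st->weights);
    if (st->weights == NULL) {
        free(st);
        return NULL;
    }
    return st;
}

/* Plugin side: recover the private type from the opaque pointer and
 * record a per-atom value. Returns 0 on success, -1 on a bad index. */
int plugin_set_weight(void *handle, int atom, double weight)
{
    PluginState *st = handle;

    assert(st != NULL);
    if (atom < 0 || atom >= st->n_atoms)
        return -1;
    st->weights[atom] = weight;
    return 0;
}

/* Plugin side: use the stored data on a later call. */
double plugin_total_weight(void *handle)
{
    const PluginState *st = handle;
    double total = 0.0;
    int i;

    assert(st != NULL);
    for (i = 0; i < st->n_atoms; i++)
        total += st->weights[i];
    return total;
}

/* Plugin side: release everything the opaque pointer owns. */
void plugin_teardown(void *handle)
{
    PluginState *st = handle;

    if (st != NULL) {
        free(st->weights);
        free(st);
    }
}
```

A host would call plugin_setup once, keep the returned void *, and pass it unchanged to every later call. Because only the plugin casts the pointer back to its real type, the host needs no knowledge of the structure's fields.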
To illustrate the process of adding a new scoring function, an example is provided in the example subdirectory. The example is somewhat contrived so that it can illustrate the steps necessary for a new scoring function without being overly complicated. The new function computes the sum of the average atomic weights of all ligand atoms that fall within 2.5 to 4 angstroms of any receptor atom (for each docked conformation of each ligand). This function cannot be used as a primary scoring function, since it does not provide gradient information. However, it is functional as a rescoring function.

The first step is to write a program, awtype, that adds the average atomic weight as a property of each ligand atom. This information is later used by the new scoring function to compute the total atomic weight. awtype is similar to the existing programs (e.g. plptypes) that add atom type information to the ligand CEX stream: it simply looks up the average atomic weight of each atom and adds it as the “awtype” property. Atom typing programs generally need to be more complicated, since they often use substructure matching (using the Daylight SMARTS toolkit, for example) to assign the appropriate atom types. However, awtype illustrates the steps involved in adding a new property to the CEX stream.

The awtype property is used by the new scoring function plugin (scoreawt) when each new ligand is read in. As DockIt reads in each ligand atom, the routine UserLigandAtomType is called. This routine is part of the scoring function API, through which DockIt communicates with each scoring function. In this case, scoreawt simply reads the awtype property from the CEX atom and stores it in a structure, Score, which is private to scoreawt. Score is created in UserSetupSite and passed back (as a void *) to DockIt.
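The lookup at the heart of an awtype-style typing program can be sketched as follows. This is a minimal illustration and not the shipped awtype source: the names ElementWeight and average_atomic_weight are hypothetical, only a handful of elements are listed, and the step that writes the value into the CEX stream as the awtype property is deliberately omitted because it depends on the CEX toolkit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One table entry: element symbol and its average atomic weight. */
typedef struct {
    const char *symbol;
    double weight;
} ElementWeight;

/* Abbreviated table covering elements common in drug-like ligands.
 * A real typing program would cover the full periodic table. */
static const ElementWeight WEIGHTS[] = {
    { "H",   1.008 },
    { "C",  12.011 },
    { "N",  14.007 },
    { "O",  15.999 },
    { "F",  18.998 },
    { "P",  30.974 },
    { "S",  32.06  },
    { "Cl", 35.45  },
    { "Br", 79.904 }
};

/* Return the average atomic weight for an element symbol,
 * or -1.0 if the symbol is not in the table. */
double average_atomic_weight(const char *symbol)
{
    size_t i;

    assert(symbol != NULL);
    for (i = 0; i < sizeof WEIGHTS / sizeof WEIGHTS[0]; i++) {
        if (strcmp(WEIGHTS[i].symbol, symbol) == 0)
            return WEIGHTS[i].weight;
    }
    return -1.0;
}
```

The negative return value gives the caller a way to detect an element that has no entry, so that an untyped atom is reported rather than silently scored as zero.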
DockIt passes the Score pointer as an argument to most of the API functions, but does not examine its contents. The purpose of Score is to hold any ligand and receptor information needed by the new scoring function. The structure is entirely private to the scoring function and can be laid out in whatever way is useful to it. In this example, the average atomic weight of each ligand atom is all that is needed, so that is what is stored as each ligand is read in. In most cases, scoring functions also store information about the receptor atom types.

As can be seen from the scoreawt source code, not every routine defined by the API is necessarily used by each scoring function. In the current example, a number of the functions simply return a success code. Which routines do real work is determined entirely by the information the scoring function requires.

The actual scoring function (energy) is called following the setup. When used for rescoring, it is called only once per docked ligand conformation. When used as the primary scoring function, it is called many times during the optimization phases of a DockIt run. In this example, the coordinates of the ligand and receptor atoms are retrieved and the distances calculated; if a ligand atom falls within the distance range (2.5 to 4 angstroms), its average atomic weight is added to the sum.

The code for getting the coordinates and, if desired, for using the nearest-neighbor list of receptor atoms within ENERGY_CUTOFF (in cmd.cex) of each ligand atom can be used unchanged in new functions. For this purpose, a samplescore.c file is provided in the example directory. It can be used as a template for new scoring functions, since it contains the portions of the code that typically do not change between scoring functions.
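The distance test described above can be sketched as a standalone C function. This is not the scoreawt source and does not use the DockIt API: Coord and awt_score are hypothetical names, the coordinates are passed in as plain arrays, and a brute-force loop over all receptor atoms stands in for the nearest-neighbor list. The sketch also assumes one particular reading of the rule, namely that a ligand atom is counted once if at least one receptor atom lies between 2.5 and 4 angstroms from it; the shipped example may treat closer contacts differently.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_DIST 2.5  /* lower bound of the distance range, in angstroms */
#define MAX_DIST 4.0  /* upper bound of the distance range, in angstroms */

/* Hypothetical coordinate type used only in this sketch. */
typedef struct {
    double x, y, z;
} Coord;

/* Squared distance between two points; avoids a sqrt per atom pair. */
static double dist_sq(const Coord *a, const Coord *b)
{
    double dx = a->x - b->x;
    double dy = a->y - b->y;
    double dz = a->z - b->z;

    return dx * dx + dy * dy + dz * dz;
}

/* Sum the stored weight of every ligand atom that has at least one
 * receptor atom between MIN_DIST and MAX_DIST (inclusive) from it.
 * Each ligand atom contributes at most once. */
double awt_score(const Coord *lig, const double *lig_weight, size_t n_lig,
                 const Coord *rec, size_t n_rec)
{
    const double min_sq = MIN_DIST * MIN_DIST;
    const double max_sq = MAX_DIST * MAX_DIST;
    double total = 0.0;
    size_t i, j;

    assert(n_lig == 0 || (lig != NULL && lig_weight != NULL));
    assert(n_rec == 0 || rec != NULL);

    for (i = 0; i < n_lig; i++) {
        for (j = 0; j < n_rec; j++) {
            double d2 = dist_sq(&lig[i], &rec[j]);

            if (d2 >= min_sq && d2 <= max_sq) {
                total += lig_weight[i];
                break;  /* count this ligand atom only once */
            }
        }
    }
    return total;
}
```

Comparing squared distances keeps the inner loop free of square roots. A function like this returns only a value, with no derivatives with respect to the ligand coordinates, which is consistent with the example being usable for rescoring but not as a primary scoring function.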
To try out the example scoring function, add the following to RESCORE_FUNCTIONS in cmd.cex:

    AWTSCORE, awtscore, scoreawt.so, AWT SCORE

Then run awtype on the ligands to assign the appropriate information to each atom.