Best practices

Improving energy

CaverDock has been parametrized to balance computational efficiency against precision. However, there is still some room for hand-tuning: the CaverDock parameters can be tuned to increase the precision of the computation, and the structures of the molecules can be adjusted to ease the computation of the ligand movement.

Improving lower-bound energy

The lower-bound energy depends on the tunnel geometry and on the ability of the docking to find good local minima. To improve (decrease) the energy of the lower-bound trajectory, several arguments can be tuned:

  • --exhaustiveness can be set to a higher value, which increases the number of random walks of the Markov-chain Monte Carlo global search algorithm and thus the probability of finding good local minima;

  • --parallel_workers_lb can be set to a higher value and should have a similar effect to --exhaustiveness.

If the energy of the lower-bound trajectory is too high even with high exhaustiveness, the ligand probably cannot pass through the tunnel in the real world, or there is some issue with the tunnel or the receptor geometry. Three typical issues are described below.

  • The tunnel obtained from CAVER forces the ligand to move too deeply into the tunnel, where it hits an energy barrier at the tunnel bottom. This situation can be detected by inspecting the energy graph: the energy is very high near the beginning of the tunnel (i.e. around position zero on the x-axis). In such a case, the --dock_like parameter can be used to set the correct starting disc, so that the ligand is not pushed against the tunnel bottom.

  • Side-chain residues form a bottleneck. This issue can be detected by inspecting the bottleneck dump (produced with the --dump_bottlenecks option). In such a case, the side-chain residues forming the bottleneck should be set as flexible using MGLTools.

  • The receptor backbone forms a bottleneck. This issue can also be detected by inspecting the bottleneck dump. In such a case, a different geometry of the receptor is needed (e.g. one obtained from an MD simulation). This is a typical issue when the receptor structure is taken from a crystal in which the tunnels are closed due to intramolecular interactions and crystal packing.
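
The first check above (a very high energy around position zero) could be automated roughly as follows. This is only a minimal sketch: it assumes the energy profile has already been loaded as (position, energy) pairs, and the function name, the window and the threshold are hypothetical choices made for illustration, not CaverDock outputs or defaults.

```python
from typing import List, Tuple

# (position along the tunnel, energy in kcal/mol); this layout is an assumption
Profile = List[Tuple[float, float]]


def barrier_near_start(profile: Profile, window: float, threshold: float) -> bool:
    """Return True if any energy within `window` of position zero exceeds `threshold`."""
    return any(
        energy > threshold
        for position, energy in profile
        if abs(position) <= window
    )


# Illustrative numbers only, not real CaverDock results.
example = [(0.0, 12.5), (0.5, 9.1), (1.0, -3.2), (5.0, -6.0), (10.0, -4.1)]
print(barrier_near_start(example, window=1.0, threshold=0.0))  # True
```

If the check fires, the starting disc is a likely culprit and --dock_like is worth trying before increasing exhaustiveness any further.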

Improving upper-bound energy

When we are satisfied with the lower-bound energy, we can focus on the upper-bound energy. If the difference between the lower-bound and the upper-bound energies is too large, we can tune several CaverDock parameters:

  • --backtrack_threshold can be set to a lower value if a difference of 1 kcal/mol between the lower-bound and the upper-bound energies is already considered undesirable. Beware that lowering this value may result in longer computation times.

  • --backtrack_limit can be set to a lower value so that CaverDock executes backtracking more aggressively. Beware that lowering this value may result in longer computation times.

  • --multiple_search can be set to a larger value so that CaverDock executes more instances of forward movement or backtracking. Beware that raising this value may result in longer computation times if CaverDock does not have enough cores available.

Together with the CaverDock parameters, we should also check the geometry of the tunnels and the receptor, as some bottlenecks, as well as issues with a tunnel bottom, may arise only when a contiguous trajectory is computed.
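
To decide whether the difference between the two bounds is worth tuning, it can help to list where the upper-bound energy exceeds the lower-bound energy by more than a chosen tolerance. The sketch below assumes both profiles are available as dictionaries mapping a disc index to an energy in kcal/mol; that layout, the function name and the numbers are hypothetical and serve only to illustrate the comparison.

```python
from typing import Dict, List, Tuple


def large_gaps(
    lower: Dict[int, float], upper: Dict[int, float], tolerance: float
) -> List[Tuple[int, float]]:
    """Return (disc, gap) pairs where upper - lower exceeds `tolerance`."""
    gaps = []
    # Only discs present in both profiles can be compared.
    for disc in sorted(lower.keys() & upper.keys()):
        gap = upper[disc] - lower[disc]
        if gap > tolerance:
            gaps.append((disc, gap))
    return gaps


# Illustrative numbers only, not real CaverDock results.
lower_bound = {0: -5.0, 1: -4.0, 2: -3.5}
upper_bound = {0: -5.0, 1: -1.5, 2: -3.0}
print(large_gaps(lower_bound, upper_bound, tolerance=1.0))  # [(1, 2.5)]
```

Discs reported by such a check are natural places to look for bottlenecks or tunnel-bottom problems before changing the backtracking parameters.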

Improving computation time

CaverDock is usually quite fast on a standard desktop computer (a run commonly takes from minutes to tens of minutes). However, the execution time can be reduced by the following actions.

  • Use as few flexible side-chain residues as possible. The computational time of docking grows rapidly with the number of degrees of freedom of the system. Flexible side-chain residues may greatly improve the energy profile; however, we recommend setting flexibility only for residues that are shown to form a bottleneck in a particular tunnel (i.e. those reported with the --dump_bottlenecks option).

  • Use the right number of parallel processes. CaverDock uses MPI; thus, it can use multiple cores of a desktop machine or even multiple nodes of a cluster. The number of processes (passed via the -np parameter of mpirun) should be set to the number of virtual cores plus one (e.g. use -np 9 on a machine with 8 cores). Please note that the computation of the lower-bound trajectory scales very well (CaverDock can utilize a hundred cores in a typical scenario), whereas the computation of the upper-bound trajectory scales only up to the number of concurrently executed dockings multiplied by the number of concurrent searches (set by --parallel_workers_smooth and --multiple_search, respectively).

  • Try increasing the number of parallel workers instead of exhaustiveness. The default number of processes solving the same docking scenario in parallel is 4. To make the docking more exhaustive, the values of the --parallel_workers_lb and --parallel_workers_smooth parameters may be increased instead of the value of --exhaustiveness.
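
The process-count rules above reduce to simple arithmetic, sketched here in Python. The helper names are hypothetical; the only facts taken from the text are "virtual cores plus one" for -np and the product of --parallel_workers_smooth and --multiple_search as the scaling limit of the upper-bound computation. The value 2 used for --multiple_search is purely illustrative.

```python
import os
from typing import Optional


def mpi_process_count(virtual_cores: Optional[int] = None) -> int:
    """Value to pass to mpirun's -np: the number of virtual cores plus one."""
    if virtual_cores is None:
        # os.cpu_count() may return None; fall back to a single core.
        virtual_cores = os.cpu_count() or 1
    return virtual_cores + 1


def upper_bound_scaling_limit(parallel_workers_smooth: int, multiple_search: int) -> int:
    """Concurrent dockings times concurrent searches."""
    return parallel_workers_smooth * multiple_search


print(mpi_process_count(8))             # 9, i.e. -np 9 on a machine with 8 cores
print(upper_bound_scaling_limit(4, 2))  # 8
```

Under these example settings, cores beyond that limit would speed up only the lower-bound computation, not the upper-bound one.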

The tips described above improve the CaverDock speed in general. However, some issues result from the properties of analyzed biochemical systems. We summarize typical issues and suggest possible solutions in the listing below.

  • Backtracking is executed very frequently (this can be observed in the CaverDock log files). When an upper-bound trajectory is computed, CaverDock tries to keep its energy as close to the lower bound as possible. In some cases this is not possible, and CaverDock executes a lot of backtracking without finding a better trajectory. The number of backtracking runs can be decreased by raising --backtrack_threshold (so that a larger difference between the lower-bound and upper-bound energies is required to start backtracking) or --backtrack_limit (so that backtracking is executed less often). Beware that high computational times and a high number of backtracking runs are usually related to a suboptimal geometry of the tunnel or the receptor (see the section Improving lower-bound energy for geometry optimization tips).

  • The number of degrees of freedom is very high. If the number of flexible side-chain residues is high, we can make some of them rigid, fixed in a position that does not form a bottleneck. We may also try to compute a contiguous ligand movement with a non-contiguous movement of side-chain residues by using the parameter --allow_flex_discontinuity (which removes some restraints in the search space). If the high number of degrees of freedom stems mainly from the complexity and size of the ligand, the system may be too complex to be analyzed efficiently by CaverDock.