We start by downloading the structure named 1FKG from the Protein Data Bank's website (http://www.rcsb.org/pdb/home/home.do). This is the crystal structure of the FKBP12 protein, complexed with the SB3 ligand. In this tutorial we are only going to apply the HSMD method to the ligand in this system, so we start by removing from the file the header, the protein part, the crystallographic waters and also the hydrogens in the SB3 residue (we'll put them back later). The resulting file is SB3noH.pdb.
To generate the files we need for the MD in Amber, we use the programs Antechamber and Leap, that are included in the package AmberTools, freely available at http://ambermd.org/. Below are the commands we need to perform in Antechamber:
> reduce SB3noH.pdb > SB3.pdb 
# add hydrogens to ligand 
> reduce SB3noH.pdb > SB3.pdb 
# add hydrogens to ligand 
> parmchk i SB3.mol2 f mol2 o SB3.frcmod 
# check parameters and save them in a frcmod file 
In this way we generated the 3 files that Tleap will need next, SB3.pdb, SB3.mol2 and SB3.frcmod. So now let's run tleap:
> tleap s f leaprc.ff99SB 
# open Tleap with ff99SB force field 
> source leaprc.gaff 
# include gaff force field 
> SB3=loadmol2 SB3.mol2 
# load parameters just generated with Antechamber (frcmod) 
> loadamberparams SB3.frcmod 
# load parameters just generated with Antechamber (mol2) 
> list 
# list residues in Tleap's memory; SB3 should appear 
> check SB3 
# check unit 
> saveoff SB3 SB3.lib 
# save library with info for new residue, in case we need it later 
> ligand=loadpdb SB3.pdb 
# load ligand's pdb 
> check ligand 
# check unit 
> saveamberparm ligand SB3.prmtop SB3.inpcrd 
# save files for Sander 
> quit 
# quit 
We have just created the following 3 files: SB3.lib (library for Tleap, with information about the SB3 residue), SB3.prmtop (parameter file for SB3) and SB3.inpcrd (coordinate file for SB3). The last two files are the ones needed for the molecular dynamics simulation.
Now we are going to add water around the ligand. This could have been done in the previous step, but if we want to do it now, maybe because we want to try with different amounts of water, we can do it in this way, again using Tleap:
tleap s f leaprc.ff99SB 
source leaprc.gaff 
loadamberparams SB3.frcmod 
loadoff SB3.lib 
ligand=loadpdb SB3.pdb 
solvatecap ligand TIP3PBOX ligand.SB3.O2 30.0 
saveamberparm ligand SB3wat30.prmtop SB3wat30.inpcrd 
savepdb ligand SB3wat30.pdb 
quit 
We've just prepared a 30.0 A radius TIP3P water sphere, centered in the ligand's O2 atom. These are the files we will use for the simulations:
SB3wat30.prmtop (parameter file) and SB3wat30.inpcrd (coordinate file). (We also created SB3wat30.pdb, not needed in the rest of the tutorial, but it's always a good idea to have a pdb of the system, in case we need to look for the location, name or type of some atom).
Now we are going to perform a minimization and equilibration, in order to relax the system. Then, we'll do a longer simulation (checking that the ligand remains in the same microstate) from which we are going to extract some frames. Each of these frames are later going to be “reconstructed” via the HSMD method. This will allow us to calculate the entropy of the structures in each frame, and its average. This number will represent the entropy of the microstate.
Below, we detail the 3 steps of the initial simulation (minimization, equilibration and production), with the input and output files, and the commands needed in Amber to perform the simulations:
One should check carefully that simulations above went fine, that the system is well equilibrated, and that the ligand remained, during the production run, in the same microstate (otherwise, we will be measuring the entropy of more than one microstate).
Now, we would usually extract about 80 frames from the previous simulation (production run), and proceed to the reconstruction of each of them. In this way, we would obtain the entropy corresponding to each frame, and from there the average. By using a large number of frames, we are able to reduce the error (standard deviation) of the average entropy. Since this implies working with a big number of files, and the trajectories from each reconstruction are very big files, here we are going to proceed with only one frame.
In order to extract frames from the previous simulation, we can use the Ptraj utility, that comes in AmberTools.
> ptraj SB3wat30.prmtop < extract.ptraj 
This will extract one structure every 12, between frame 1 and 1000 of the initial trajectory. One of these is 133_rec_step0.rst.
Now that we have the parameter and coordinate files, we are ready to start the reconstruction of this frame. We can use a script like prepare_files_REC133.bsh, which will prepare the input files for Amber and execute the runs for each step in the reconstruction. Essentially, what the script does is:
generate an input file for a run where the waters and all the atoms to be reconstructed in the SB3 are free to move.
Run with the previous input file.
generate an input file for a run where the waters and all the atoms to be reconstructed, but one, are free to move.
Run with the previous input file.
generate an input file for a run where the waters and all the atoms to be reconstructed, but two, are free to move.
...
generate an input file for a run where the waters and only the last atom to be reconstructed is free to move.
Run with the previous input file.
Below we print the inputs generated by the script prepare_files_REC_133.bsh for steps 1, 2 and 28 of the reconstruction:
# Production  60 ps  Reconstructing SB3 with water. &cntrl imin = 0, ntx = 5, irest = 1, ntpr = 500, ntwr = 1000, ntwx = 3, ntc = 2, ntf = 2, ntb = 0, cut = 999.0, ntt = 1, temp0 = 300.0, tautp = 1.5, ivcap = 0, fcap = 5.0, nstlim= 30000, dt = 0.002, ibelly=1, bellymask=':(@O2 < :11.0 & !:SB3)  :WAT  :SB3@H=  @6,33  (@26,27,22,21,20,19,28,29,30,31,32,9,1,8,2,3,4,5,7,10,11,12,13,14,17,18,15,16)' &end 
# Production  60 ps  Reconstructing SB3 with water. &cntrl imin = 0, ntx = 5, irest = 1, ntpr = 500, ntwr = 1000, ntwx = 3, ntc = 2, ntf = 2, ntb = 0, cut = 999.0, ntt = 1, temp0 = 300.0, tautp = 1.5, ivcap = 0, fcap = 5.0, nstlim= 30000, dt = 0.002, ibelly=1, bellymask=':(@O2 < :11.0 & !:SB3)  :WAT  :SB3@H=  @6,33  (@27,22,21,20,19,28,29,30,31,32,9,1,8,2,3,4,5,7,10,11,12,13,14,17,18,15,16)' &end 
# Production  60 ps  Reconstructing SB3 with water. &cntrl imin = 0, ntx = 5, irest = 1, ntpr = 500, ntwr = 1000, ntwx = 3, ntc = 2, ntf = 2, ntb = 0, cut = 999.0, ntt = 1, temp0 = 300.0, tautp = 1.5, ivcap = 0, fcap = 5.0, nstlim= 30000, dt = 0.002, ibelly=1, bellymask=':(@O2 < :11.0 & !:SB3)  :WAT  :SB3@H=  @6,33  (@16)' &end 
We see that the above files are identical, except for the bellymask line, where we tell Amber which atoms are allowed to move during the MD. (NOTE: Ignore the first part of bellymask “:(@O2 < :11.0 & !:SB3)”, because that represents residues that are closer than 11A to the O2 atom of SB3, other than SB3. But that, in the present system, means waters, but waters are represented in the next part of bellymask).
In the first step, all the atoms to be reconstructed can move: @26,27,22,21,20,19,28,29,30,31,32,9,1,8,2,3,4,5,7,10,11,12,13,14,17,18,15,16.
In the second step, all the previous atoms can move, but one: @27,22,21,20,19,28,29,30,31,32,9,1,8,2,3,4,5,7,10,11,12,13,14,17,18,15,16.
In the last step, only one of them can move.
So we see that, according to the theory, we are doing runs where initially the whole chain (SB3) moves, but then, one by one, their atoms are getting fixed.
Note that this script needs you to specify some things, like the path to amber, the name of the prmtop file (without the extension) and the succession of atoms (by atom number) that we need to reconstruct (in the list of atoms in the script, we don't include neither hydrogens nor one atom in the rings; their contribution is expected to be very small, and it will save us some computing time). Also verify that the file names are correct.
NOTE: atoms 23, 24 and 25 are not included in the reconstruction procedure, because we need to have 3 atoms in the system fixed, in order to be able to correctly measure the dihedral angle of another atom. So we will need to add a correction to this at the end. The next version of the reconstruction programs here available will also include the motion of these 3 atoms.
Also note that this script will execute the reconstruction in a serial way, i.e., it will reconstruct a given atom only when the previous one has finished. It is easy to adapt the script to a parallel version in which all the atoms are reconstructed simultaneously.
When this script is executed correctly, we obtain a lot of files (inputs, output, restarts and trajectories): REC133.tar.bz2 (86 MB). With this, we are ready to proceed to the analysis of the results.
We need now to analyze the trajectories of each reconstructed atom. This will allow us to extract the entropy for each atom and thus, for the frame. To that end, we'll use the script RECONSTR_analysis_manytraj_amber.sh, which in turn, calls the programs RECONST_angles_list_amber.c and RECONSTR_count_amber.c.
This script has two parts. The first one calls the RECONST_angles_list_amber.c program, to read the coordinates from the trajectories, and calculate from them the angles (bond and dihedral) related to the atoms we want to reconstruct. The program knows which atoms this are, because we also need to supply an angle file, which contains the list of angles to be treated at each step of the reconstruction analysis. This list has to specify, in each line, the number of four atoms, where the four are needed to measure the dihedral angle of the last one (the one we are reconstructing), and where the last three are needed to measure the bond angle of the last one. The angle file for our case is this. The output of the RECONST_angles_list_amber.c program is a file containing a matrix with all the angles related to the reconstructed atoms. In that matrix, the columns represent a different angle, and the rows represent a different step in the production run.
In the second part, the script calls the program RECONSTR_count_jacobian_amber.c, which will read the output of the previous program (the matrix of angles) and from there calculate the entropy of the system. The output is the entropy of the frame we are studying (frame 133 in the present case. If we start we several frames, then the output will be the average entropy of all the frames).
Looking inside the script, we see that some parameters need to be fixed, e.g., the name of the angles file and the parameters file, how many frames there are in the trajectory of each atom (steps in the production run), how many atoms we are reconstructing, number of steps from the beginning of the trajectories we wan to use as equilibration, etc. Notice that the generation of the angles matrix takes a relatively long time, while the calculation of S is fast. So we would generally generate the matrix one time, without deleting it at the end (BUILD_ANGLES=1, REMOVE_ANGLES=0), and then on calculate entropies without generating the matrix again (BUILD_ANGLES=0).
Below, we present the results for the average entropy of 80 frames (frame 133, the one we have been using so far, and the other 79 frames we didn't use for the sake of simplicity), as a function of the bin size and the number of trajectory points used for the analysis.
The first column in the table represents the integer, n, by which the maximum dispersion of the angles in the trajectories is divided, in order to define the bin size. In the script, we use DELTA_B and DELTA_D for the integers for the bond angles and dihedral angles, respectively.
Table with 1000 equilibration points, minimum count per bin=50:
n in max_delta/n 
data points 
T.ΔS 
0.01 
10000 
129.9 
0.1 
10000 
91.4 
0.5 
10000 
64.6 
1.0 
10000 
54.2 
3.0 
10000 
43.0 
5.0 
10000 
41.3 
10.0 
10000 
39.9 
15.0 
10000 
39.5 
20.0 
10000 
39.2 

8000 
41.4 

6000 
44.1 

4000 
48.5 

2000 
58.3 
25.0 
10000 
39.2 
30.0 
10000 
39.2 

8000 
41.2 

6000 
44.1 

4000 
48.4 

2000 
58.2 
40.0 
10000 
39.2 

8000 
41.3 

6000 
44.0 

4000 
48.3 

2000 
58.2 
50.0 
10000 
39.1 

8000 
41.2 

6000 
44.0 

4000 
48.3 

2000 
58.3 
60.0 
10000 
39.1 

8000 
41.2 

6000 
44.0 

4000 
48.3 

2000 
58.2 
70.0 
10000 
39.1 
150.0 
10000 
39.1 
300.0 
10000 
39.1 
500.0 
10000 
39.1 
Below, in Figure 1, we have a plot of the entropy as a function of the inverse of the bin size, where we see a nice convergence as the size is decreased (we should mention that this convergence is obtained only because in the processing we are increasing the size of the bin if the count on that bin is less than 50). And in Figure 2, we also see a plot of the entropy as a function of the number of data points. Here, we can see that we are approaching convergence, but are not quite there yet. In general, it is very difficult to achieve convergence in the number of data points. But we are generally interested in differences of entropies (e.g., difference complexeduncomplexed when we study binding), and in this case the convergence is usually much faster.

Figure 2 
Correction: in order to take into account the motion of the 3 atoms of SB3 that were not included in the reconstruction procedure, we need to do the following:
S_{correction} = k_{B} ln (p(23,24,25)),
where k_{B} is Boltzmann's constant and p(23,24,25) is the probability that atoms 23, 24 and 25 will occupy the location in space that they occupy in the frame that we want to reconstruct. But it's easy to calculate this probability, if we think that we are constructing this three points in space, and we use the correct assumption that these atoms don't have any preference with respect to their location, other than respecting the correct relations between them (e.g., it's equally likely to have atom 23 above or below atom 24):
probability of the first atom (23) to occupy some specific location in space: 1 / V, where V is the available space; in our case, the volume of the simulation sphere.
probability of the line connecting atoms 23 and 24 to have some specific direction in space: 1 / (4 pi).
probability of the triangle connecting atoms 23, 24 and 25 to have some specific orientation in space: 1 / (2 pi).
Thus, the correction for the case of a spherical volume with radius = 15A, is given by:
T S_{correction} = k_{B} T ln (1 / (V 8 pi^2)) ~ 8 kcal/mol.
So we should shift every number in the table/plots above by this amount.
last
updated