Molecular Dynamics

Input preparation

The original PDB files for both mutated and wild types were obtained from the PDB website. The chosen PDB wild-type entry contained an additional ligand, which was manually removed from the file. These two PDB structures were then given as input to the Input Generator tool from Charmm-gui. This tool produces the input files for the different simulations, namely by:

The workflow changes slightly at the beginning of the process for each system, as described below.
Step mt2 input files mt1 input files wt2 input files wt1 input files
1 www.charmm-gui.org $\rightarrow$ Input Generator $\rightarrow$ Solution Builder $\rightarrow$ Download PDB File: 8DFN www.charmm-gui.org $\rightarrow$ Input Generator $\rightarrow$ Solution Builder $\rightarrow$ Upload PDB File: 7si9_modified.pdb. Select Check/Correct PDB Format
2 Select both PROA and PROB Select only PROA Select both PROA and PROB Select only PROA
3 Select Terminal group patching and Model missing residues (select as well the respective residues to model).
4: Solvator Waterbox Size Options $\rightarrow$ Specify Waterbox Size $\rightarrow$ X,Y,Z = 105
Add Ions $\rightarrow$ Ion Placing Method: Monte-Carlo
Basic Ion Types: NaCl $\rightarrow$ Add Simple Ion Type (Concentration: 0.15, Neutralizing)
Remove the default KCl
solvent composition should be: $101 Na^+$, $93 Cl^-$ $103 Na^+$, $99 Cl^-$ $101 Na^+$, $93 Cl^-$ $102 Na^+$, $99 Cl^-$
5: PBC Setup Periodic Boundary Condition Options $\rightarrow$ Generate grid information for PME FFT automatically
6: Input Generator Force Field Options $\rightarrow$ CHARMM36m
Input Generation Options $\rightarrow$ GROMACS
Dynamics Input Generation Options $\rightarrow$ Temperature 310 K

Molecular Dynamics Simulation

The outcome of the previous step is provided in the repository, separated in 4 folders inside data_md/. The parameters for the following steps are also provided in data_md/_params/. The source files for the following steps can also be found in src_md/. It is required to set this as the current working directory before attempting to execute any of the scripts.

Step 0: Energy Minimization

At this point, the system is complete but the protein is still in a high energy configuration. Hence it is necessary to find the configuration associated to the minimum value of the energy and then let it evolve. This is known as energy minimization. The protein is allowed to explore many configurations to find the one associated to a minimum value of the molecular energy.

Parameters employed:

For running this step, it's necessary to submit 0_EM.pbs as a job (or run locally an equivalent .sh script) only once.

NVT and NPT equilibration

For further stabilizing the initial state of the system, the next steps consist of finding the volume and the pressure values which correspond to the minimum of the molecular energy of the protein. For running this step, it's necessary to submit 1_NVT_NPT.pbs as a job (or run locally an equivalent .sh script) only once.

NVT

Here, the volume is fixed, and the pressure slightly changes until the minimum energy configuration is found. The value of the pressure which corresponds to the minimum energy configuration will be employed in the following step.

Parameters employed:

The total time of the molecular dynamics evolution is of $250000 \times 0.002 ps = 0.5 ns$.

NPT

Here, the volume slightly changes, until the configuration which corresponds to the minimum energy is found.

Parameters employed:

The total time of the molecular dynamics evolution is of $500000 \times 0.002 ps = 1 ns$.

Unrestrained Molecular Dynamics

At this point we can proceed with the unrestrained MD simulation.

Parameters employed:

The total time of the molecular dynamics evolution is of $150000000 \times 0.002 ps = 300 ns$. For running this step, a script called 2_MD.sh is provided, which simplifies the process of submitting multiple jobs to the cluster. This script needs to be executed multiple times until the simulations reach the desired amount of steps.

Benchmarking

Given the large amount of integration steps required for this phase, it is pertinent to perform a benchmark to better choose the appropriate amount of computational resources needed (i.e. cores). For this project, the MD simulation was performed on high performance CPUs available in cluster of the university.

The benchmark process for choosing the right amount of cores is as following. The unrestrained MD was performed with a walltime of $10$ and $30$ minutes. We also repeated the benchmarks two times for each walltime. We noticed that the efficiency obtained with the same number of cores and walltime was not constant. This is probably due to the simulation running on different cores every time, which might have different performances.

In the two plots below it is shown, as an example, the benchmark performed for the NVT equilibration. It can be easily observed that the nanoseconds in the simulation processed in a day, in $ns/day$, and the efficiency, in $h/ns$ (hours to perform a nanosecond of simulation), depend on the wall time. It can be also observed that the outcomes obtained for different repetitions are different.

figures/md/benchmark_h_ns.png
Post processing steps of the trajectory.

Post processing

If the resulting md_plain.xtc trajectories are observed, it is evident that they need further processing before analysing them. The script 3_POST.sh is provided to deal with this.

animations/0_plain.gif
Post processing steps of the trajectory.
The aligned trajectories are now ready to be used in the General Analysis.