Running MD Simulations¶
After building your protein-ligand system with PRISM, you're ready to run molecular dynamics simulations using GROMACS. This guide covers the complete simulation workflow from energy minimization to production MD.
Quick Start
The auto-generated script runs the complete EM → NVT → NPT → Production workflow with checkpoint restart support.

The MD Workflow¶
Overview¶
PRISM follows the standard GROMACS workflow for biomolecular simulations:
1. Energy Minimization (EM)
↓ Remove bad contacts, relax structure
2. NVT Equilibration
↓ Heat system to target temperature (constant Volume)
3. NPT Equilibration
↓ Equilibrate pressure and density (constant Pressure)
4. Production MD
↓ Collect data for analysis
Each stage uses optimized MDP parameters generated by PRISM.
Simulation Stages¶
1. Energy Minimization (EM)¶
Purpose: Remove bad contacts and steric clashes
Generated Files:
GMX_PROLIG_MD/
├── em/
│ ├── em.tpr # Run input
│ ├── em.gro # Minimized structure
│ ├── em.edr # Energy data
│ └── em.log # Log file
Key Parameters (from em.mdp):
integrator = steep # Steepest descent
emtol = 200.0 # Force tolerance (kJ/mol/nm)
nsteps = 10000 # Maximum steps
Manual Execution:
cd GMX_PROLIG_MD
mkdir -p em
# Create run input
gmx grompp -f ../mdps/em.mdp -c solv_ions.gro -r solv_ions.gro \
-p topol.top -o em/em.tpr -maxwarn 1
# Run minimization
gmx mdrun -deffnm em/em -v
Success Criteria:
- Final potential energy < 0
- Force < 200 kJ/mol/nm
- No warnings about instabilities
Check Results:
# Extract and plot energy
gmx energy -f em/em.edr -o em_energy.xvg
# Select: Potential
# Check final energy (should be negative)
grep "Potential Energy" em/em.log
2. NVT Equilibration¶
Purpose: Heat system to target temperature (310 K default)
Generated Files:
GMX_PROLIG_MD/
├── nvt/
│ ├── nvt.tpr # Run input
│ ├── nvt.gro # Final structure
│ ├── nvt.xtc # Trajectory
│ ├── nvt.edr # Energy data
│ ├── nvt.log # Log file
│ └── nvt.cpt # Checkpoint (for restart/continuation)
Key Parameters (from nvt.mdp):
integrator = md # Leap-frog MD
dt = 0.002 # 2 fs time step
nsteps = 250000 # 500 ps total
tcoupl = V-rescale # Temperature coupling
ref_t = 310 # Target temperature (K)
tau_t = 0.1 # Coupling time constant
Manual Execution:
mkdir -p nvt
# Create run input
gmx grompp -f ../mdps/nvt.mdp -c em/em.gro -r em/em.gro \
-p topol.top -o nvt/nvt.tpr -maxwarn 1
# Run NVT (CPU)
gmx mdrun -deffnm nvt/nvt -v
# Or with GPU acceleration
gmx mdrun -deffnm nvt/nvt -v \
-ntmpi 1 -ntomp 10 \
-nb gpu -bonded gpu -pme gpu -gpu_id 0
Success Criteria:
- Temperature stabilizes at ~310 K
- No drift in total energy
- System remains stable
Check Results:
# Temperature vs time
echo "Temperature" | gmx energy -f nvt/nvt.edr -o nvt_temp.xvg
# Visualize (with xmgrace or Python)
xmgrace nvt_temp.xvg
Expected: Temperature oscillates around 310 K
3. NPT Equilibration¶
Purpose: Equilibrate pressure and density
Generated Files:
GMX_PROLIG_MD/
├── npt/
│ ├── npt.tpr # Run input
│ ├── npt.gro # Final structure
│ ├── npt.xtc # Trajectory
│ ├── npt.edr # Energy data
│ ├── npt.log # Log file
│ └── npt.cpt # Checkpoint
Key Parameters (from npt.mdp):
integrator = md
dt = 0.002
nsteps = 250000 # 500 ps
tcoupl = V-rescale
pcoupl = C-rescale # Pressure coupling (new)
ref_p = 1.0 # Target pressure (bar)
tau_p = 1.0 # Barostat time constant
compressibility = 4.5e-5 # Water compressibility
Manual Execution:
mkdir -p npt
# Create run input
gmx grompp -f ../mdps/npt.mdp \
-c nvt/nvt.gro -r nvt/nvt.gro -t nvt/nvt.cpt \
-p topol.top -o npt/npt.tpr -maxwarn 1
# Run NPT
gmx mdrun -deffnm npt/npt -v \
-ntmpi 1 -ntomp 10 \
-nb gpu -bonded gpu -pme gpu -gpu_id 0
Success Criteria:
- Pressure stabilizes at ~1 bar
- Density stabilizes (~1000 kg/m³ for water)
- Box volume equilibrated
Check Results:
# Pressure and density
echo "Pressure" | gmx energy -f npt/npt.edr -o npt_pressure.xvg
echo "Density" | gmx energy -f npt/npt.edr -o npt_density.xvg
Expected:
- Pressure: 1.0 ± 200 bar (large fluctuations are normal)
- Density: ~1000 kg/m³
4. Production MD¶
Purpose: Collect simulation data for analysis
Generated Files:
GMX_PROLIG_MD/
├── prod/
│ ├── md.tpr # Run input
│ ├── md.xtc # Compressed trajectory (main output!)
│ ├── md.edr # Energy data
│ ├── md.log # Log file
│ ├── md.cpt # Checkpoint
│ └── md.gro # Final structure
Key Parameters (from md.mdp):
integrator = md
dt = 0.002
nsteps = 250000000 # 500 ns (default)
; Output control
nstxout-compressed = 250000 # Save frame every 500 ps
compressed-x-grps = System # Save all atoms
nstlog = 5000 # Log every 10 ps
nstenergy = 5000 # Energy every 10 ps
Manual Execution:
mkdir -p prod
# Create run input
gmx grompp -f ../mdps/md.mdp \
-c npt/npt.gro -t npt/npt.cpt \
-p topol.top -o prod/md.tpr -maxwarn 1
# Run Production MD
gmx mdrun -deffnm prod/md -v \
-ntmpi 1 -ntomp 10 \
-nb gpu -bonded gpu -pme gpu -gpu_id 0
For Long Simulations:
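On clusters or shared workstations with wall-time limits, it is common to cap each mdrun invocation and resume from the checkpoint. A minimal sketch using standard GROMACS flags (the 24-hour cap is illustrative):

# Stop cleanly after ~24 h of wall time (a checkpoint is written automatically)
gmx mdrun -deffnm prod/md -v \
    -ntmpi 1 -ntomp 10 \
    -nb gpu -bonded gpu -pme gpu -gpu_id 0 \
    -maxh 24

# Resume later from the checkpoint, appending to the existing output files
gmx mdrun -deffnm prod/md -v \
    -ntmpi 1 -ntomp 10 \
    -nb gpu -bonded gpu -pme gpu -gpu_id 0 \
    -cpi prod/md.cpt -append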
The Automated Script¶
PRISM generates localrun.sh with these features:
Features¶
- Checkpoint Restart: Automatically resumes if interrupted
- Skip Completed Stages: Won't re-run finished steps
- GPU Acceleration: Pre-configured for NVIDIA GPUs
- Progress Monitoring: Clear status messages
Script Structure¶
#!/bin/bash
# Auto-generated by PRISM
# Energy Minimization
mkdir -p em
if [ -f ./em/em.gro ]; then
    echo "✓ EM already completed, skipping..."
elif [ -f ./em/em.tpr ]; then
    echo "Resuming EM from checkpoint..."
    gmx mdrun -s ./em/em.tpr -deffnm ./em/em -cpi ./em/em.cpt -v
else
    echo "Starting EM..."
    gmx grompp -f ../mdps/em.mdp -c solv_ions.gro -r solv_ions.gro \
        -p topol.top -o ./em/em.tpr -maxwarn 999
    gmx mdrun -s ./em/em.tpr -deffnm ./em/em -v
fi
# NVT, NPT, Production...
# (Similar structure for each stage)
Customizing the Script¶
Edit localrun.sh to adjust:
1. GPU Settings:
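For example, to run on a different GPU, change the -gpu_id value in the mdrun calls (a sketch assuming the script passes these flags directly, as in the structure above):

# Use the second GPU (ID 1) instead of the first
gmx mdrun -deffnm nvt/nvt -v \
    -ntmpi 1 -ntomp 10 \
    -nb gpu -bonded gpu -pme gpu -gpu_id 1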
2. CPU Threading:
# More threads
NTOMP=20 # Use 20 threads instead of 10
# Disable threading (single-threaded)
NTOMP=1
3. No GPU (CPU-only):
# Remove GPU flags
gmx mdrun -deffnm nvt/nvt -v -ntmpi 1 -ntomp 10
# (Remove: -nb gpu -bonded gpu -pme gpu -gpu_id 0)
GPU Acceleration¶
Checking GPU Availability¶
# Check NVIDIA GPU
nvidia-smi
# Check GROMACS GPU support
gmx mdrun -version
# Look for: "GPU support: CUDA"
GPU Performance Flags¶
# -ntmpi 1      Number of thread-MPI ranks
# -ntomp 10     OpenMP threads per rank
# -nb gpu       Nonbonded interactions on GPU
# -bonded gpu   Bonded interactions on GPU
# -pme gpu      PME on GPU
# -gpu_id 0     Which GPU to use
# -pin on       Pin threads to cores
gmx mdrun -deffnm prod/md -v \
    -ntmpi 1 -ntomp 10 \
    -nb gpu -bonded gpu -pme gpu \
    -gpu_id 0 -pin on
Multi-GPU Setup¶
# Use 2 GPUs
gmx mdrun -deffnm prod/md -v \
-ntmpi 2 -ntomp 10 \
-nb gpu -bonded gpu -pme gpu \
-gpu_id 01 -pin on
Performance Tips¶
# Benchmark a short run (point -s at an existing run input, e.g. prod/md.tpr)
gmx mdrun -s prod/md.tpr -deffnm test -nsteps 10000 \
    -nb gpu -pme gpu -bonded gpu \
    -ntomp 10 -pin on

# Compare CPU vs GPU
# CPU only (force all interactions onto the CPU):
gmx mdrun -s prod/md.tpr -deffnm test_cpu -nsteps 10000 -ntomp 20 -nb cpu -pme cpu -bonded cpu
# GPU:
gmx mdrun -s prod/md.tpr -deffnm test_gpu -nsteps 10000 -nb gpu -pme gpu -ntomp 10
Monitoring Simulations¶
Real-time Progress¶
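The simplest way to watch a running simulation is to follow the log file; step counts, warnings, and performance estimates appear there as the run progresses:

# Follow the production log as it is written
tail -f prod/md.log

# Watch GPU utilization while the run is active (NVIDIA GPUs)
nvidia-smi -l 5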
Performance Metrics¶
# Extract performance from log
grep "Performance" prod/md.log
# Typical output:
# Performance: 45.234 ns/day
# Time per step: ~3.8 ms (at dt = 2 fs)
Interpretation:
- ns/day: How much simulation time per real day
- 45 ns/day → 500 ns simulation takes ~11 days
- Target: >20 ns/day (GPU), >5 ns/day (CPU)
Checking Stability¶
# Quick check of temperature
echo "Temperature" | gmx energy -f prod/md.edr -o temp.xvg
# Check for crashes
tail -50 prod/md.log
# Look for: "Finished mdrun"
Customizing MDP Files¶
PRISM generates optimized MDP files in mdps/ directory. You can modify them before running:
Common Modifications¶
1. Change Simulation Length:
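For example, to run 100 ns instead of the default 500 ns (values are illustrative):

; In mdps/md.mdp
nsteps = 50000000    ; 100 ns at dt = 0.002 ps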
2. Change Output Frequency:
# Save frames more frequently
nstxout-compressed = 100000 # Every 200 ps (instead of 500 ps)
# Save less frequently (smaller files)
nstxout-compressed = 500000 # Every 1 ns
3. Change Temperature:
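For example, to simulate at 298 K instead of 310 K; update every MDP file that sets ref_t, and give one value per temperature-coupling group if your files list more than one:

; In mdps/nvt.mdp, mdps/npt.mdp and mdps/md.mdp
ref_t = 298          ; target temperature (K); e.g. "298 298" for two tc-grps
; If nvt.mdp generates initial velocities, also set gen_temp = 298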
4. Add Position Restraints:
# In mdps/nvt.mdp and mdps/npt.mdp
; Position restraints
define = -DPOSRES -DPOSRES_LIG
; Restraint force constant
refcoord_scaling = com # Scale with COM motion
5. Change Trajectory Output:
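For example, to write only a subset of atoms to the compressed trajectory and reduce file size (the group must exist in your index; Protein is a default group, while a combined protein+ligand group would need to be created with gmx make_ndx):

; In mdps/md.mdp
compressed-x-grps = Protein   ; write only these atoms to md.xtc (default: System)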
MDP Parameters Reference¶
| Parameter | Purpose | Default | Recommendation |
|---|---|---|---|
| nsteps | Simulation length | 250M (500 ns) | Adjust as needed |
| dt | Time step | 0.002 ps | Don't change (with constraints) |
| nstxout-compressed | Frame interval | 250000 (500 ps) | 100000-500000 |
| ref_t | Temperature | 310 K | 298-310 K typical |
| ref_p | Pressure | 1.0 bar | Usually 1.0 |
| tau_t | T-coupling constant | 0.1 ps | 0.1-1.0 ps |
| tau_p | P-coupling constant | 1.0 ps | 1.0-5.0 ps |
Cluster Submission¶
SLURM Example¶
Create submit_md.sh:
#!/bin/bash
#SBATCH --job-name=prism_md
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=10
#SBATCH --time=7-00:00:00
#SBATCH --mem=32GB
module load gromacs/2024.3-cuda
cd $SLURM_SUBMIT_DIR/GMX_PROLIG_MD
# Run localrun.sh
bash localrun.sh
Submit:
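sbatch submit_md.sh
# Check the job status
squeue -u $USER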
PBS Example¶
#!/bin/bash
#PBS -N prism_md
#PBS -l select=1:ncpus=10:ngpus=1
#PBS -l walltime=168:00:00
cd $PBS_O_WORKDIR/GMX_PROLIG_MD
module load gromacs-cuda
bash localrun.sh
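Save the script (the filename submit_md.pbs below is just an example) and submit it with qsub:

qsub submit_md.pbs
# Check the job status
qstat -u $USER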
Trajectory Files¶
Understanding Output¶
Compressed Trajectory (.xtc):
- Main output for analysis
- Reduced precision (sufficient for most analyses)
- Much smaller than .trr files
- Default: frames every 500 ps
Energy File (.edr):
- Contains all energy terms
- Temperature, pressure, potential, kinetic energy
- Use with gmx energy
Checkpoint (.cpt):
- For restarting simulations
- Contains complete state (positions, velocities, etc.)
- Binary format
File Sizes¶
For a typical 50k atom system, 500 ns:
| File | Size | Notes |
|---|---|---|
| md.xtc | ~500 MB | Compressed trajectory |
| md.edr | ~50 MB | Energy data |
| md.log | ~10 MB | Text log |
| md.cpt | ~100 MB | Checkpoint |
| md.tpr | ~50 MB | Run input |
Managing Disk Space¶
# Compress old trajectories
gzip prod/md.xtc # Creates md.xtc.gz
# Remove checkpoint files after completion
rm prod/md.cpt
# Keep only last N frames
gmx trjconv -f prod/md.xtc -s prod/md.tpr \
-o prod/md_last100ns.xtc -b 400000
Troubleshooting¶
Common Issues¶
"LINCS WARNING"¶
Causes:
- Time step too large
- Poor starting structure
- Exploding system

Solutions:
# 1. Re-run energy minimization longer
# Edit mdps/em.mdp:
nsteps = 50000 # Increase from 10000
# 2. Use smaller time step
# Edit mdps/*.mdp:
dt = 0.001 # 1 fs instead of 2 fs
# 3. Check for bad contacts
gmx check -f em/em.gro
"System Blew Up"¶
Solutions:
# 1. Longer equilibration
# Edit mdps/nvt.mdp and mdps/npt.mdp:
nsteps = 500000 # 1 ns instead of 500 ps
# 2. Gentler temperature coupling
# Edit mdps/nvt.mdp:
tau_t = 1.0 # Slower coupling
# 3. Check starting structure
gmx editconf -f solv_ions.gro -o check.pdb
# Open check.pdb in VMD/PyMOL
"GPU Performance Slow"¶
Solutions:
# 1. Check GPU utilization
nvidia-smi -l 1
# 2. Optimize thread count
# Try different values:
gmx mdrun -ntmpi 1 -ntomp 8 ... # Fewer threads
gmx mdrun -ntmpi 1 -ntomp 12 ... # More threads
# 3. Update GROMACS
# Newer versions have better GPU support
"Out of Memory"¶
Solutions:
# 1. Use fewer OpenMP threads
-ntomp 4 # Instead of 10
# 2. Submit to high-mem queue
#SBATCH --mem=64GB
# 3. Reduce system size
# Use smaller box in initial build
Best Practices¶
Before Production MD¶
- Check Equilibration: confirm that temperature, pressure, and density have stabilized during NVT/NPT before starting production (see the example commands below).
- Visualize System: export the equilibrated structure and make sure the ligand is still in its binding site.
- Check Protein-Ligand Distance: verify the ligand has not drifted away from the protein during equilibration.
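A sketch of how these checks can be run with standard GROMACS tools; the LIG group name is an assumption, so substitute your ligand's residue name or a custom index group:

# Equilibration: density should plateau around ~1000 kg/m^3
echo "Density" | gmx energy -f npt/npt.edr -o npt_density.xvg

# Visualization: export the equilibrated structure and open it in VMD/PyMOL
echo "System" | gmx trjconv -f npt/npt.gro -s npt/npt.tpr -o npt_check.pdb

# Protein-ligand minimum distance over the NPT trajectory
echo "Protein LIG" | gmx mindist -f npt/npt.xtc -s npt/npt.tpr -o mindist.xvg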
During Production¶
- Monitor Progress: follow the log file and check the ns/day performance periodically (see the example commands below).
- Backup Regularly: copy the checkpoint and trajectory to separate storage.
- Check Disk Space: long trajectories can fill a disk quickly; make sure enough space remains.
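Illustrative commands for these tasks (the backup destination is a placeholder):

# Monitor progress and performance
tail -f prod/md.log
grep "Performance" prod/md.log

# Back up checkpoint and trajectory (destination path is a placeholder)
rsync -av prod/md.cpt prod/md.xtc /path/to/backup/

# Check remaining disk space
df -h .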
REST2 Replica Exchange¶
PRISM can set up REST2 (Replica Exchange with Solute Tempering 2) simulations for enhanced sampling:
| Flag | Description | Default |
|---|---|---|
| --rest2 | Enable REST2 mode | off |
| --t-ref | Reference temperature (K) | 310.0 |
| --t-max | Maximum effective temperature (K) | 450.0 |
| --replica-number | Number of replicas | 16 |
| --rest2-cutoff | Hot region cutoff around ligand (nm) | 0.5 |
Theory¶
In standard replica exchange MD (REMD), the entire system is simulated at multiple temperatures. This requires many replicas and enormous computational cost because the acceptance probability decreases exponentially with system size. REST2 (Replica Exchange with Solute Tempering 2) overcomes this by scaling only the solute (ligand + nearby residues) potential energy, while keeping the solvent at a constant thermostat temperature \(T_0\).
Temperature Ladder¶
PRISM constructs a geometric temperature ladder of \(n\) replicas between \(T_{\text{ref}}\) and \(T_{\text{max}}\):

\[ T_i = T_{\text{ref}} \left( \frac{T_{\text{max}}}{T_{\text{ref}}} \right)^{i/(n-1)}, \qquad i = 0, 1, \ldots, n-1 \]

The scaling factor for replica \(i\) is:

\[ \lambda_i = \frac{T_{\text{ref}}}{T_i} = \left( \frac{T_{\text{ref}}}{T_{\text{max}}} \right)^{i/(n-1)} \]

where \(\lambda_0 = 1\) (unscaled, physical replica) and \(\lambda_{n-1} = T_{\text{ref}} / T_{\text{max}}\) (most scaled). With the defaults (\(T_{\text{ref}} = 310\) K, \(T_{\text{max}} = 450\) K, \(n = 16\)), the most scaled replica has \(\lambda_{15} = 310/450 \approx 0.69\).
Potential Energy Scaling¶
The total potential energy of the system is decomposed into solute-solute (\(V_{ss}\)), solute-solvent (\(V_{sw}\)), and solvent-solvent (\(V_{ww}\)) terms. For replica \(i\), the scaled Hamiltonian is:

\[ E_i = \lambda_i V_{ss} + \sqrt{\lambda_i}\, V_{sw} + V_{ww} \]
In practice, PRISM implements this by modifying the GROMACS topology for each replica:
| Interaction type | Condition | Scaling |
|---|---|---|
| LJ \(\varepsilon\) (solute atom types) | Solute-solute | \(\varepsilon' = \lambda \, \varepsilon\) |
| Partial charges (solute atoms) | Solute-solvent (mixed) | \(q' = \sqrt{\lambda} \, q\) |
| Bonded force constants | All solute atoms | \(k' = \lambda \, k\) |
| Bonded force constants | Mixed solute-solvent | \(k' = \sqrt{\lambda} \, k\) |
| Solvent interactions | All solvent atoms | Unchanged |
This ensures that solute-solute interactions scale as \(\lambda\), mixed interactions scale as \(\sqrt{\lambda}\) (because two \(\sqrt{\lambda}\) charges produce a \(\lambda\)-scaled Coulomb interaction for solute-solute, or a \(\sqrt{\lambda}\)-scaled interaction when one partner is solvent), and solvent-solvent interactions remain unscaled.
Hot Region Selection¶
PRISM identifies the "hot" region (solute) as the ligand plus all protein residues with any heavy atom within the --rest2-cutoff distance (default 0.5 nm) of any ligand atom. These atoms are marked in the topology, and their force field parameters are scaled per-replica.
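PRISM performs this selection internally when building the scaled topologies. If you want to inspect which residues end up in the hot region, a gmx select query along these lines gives a reasonable approximation (the LIG residue name and the Protein index group are assumptions):

# Write an index group containing the ligand plus nearby protein residues
gmx select -s prod/md.tpr \
    -select 'resname LIG or same residue as (group "Protein" and within 0.5 of resname LIG)' \
    -on hot_region.ndx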
Replica Exchange Protocol¶
During the production run, adjacent replicas periodically attempt Hamiltonian exchanges. The acceptance probability for swapping the configurations of replicas \(i\) and \(j\) follows the Metropolis criterion:

\[ P_{\text{acc}} = \min\left(1, \exp\left\{-\beta \left[ E_i(x_j) + E_j(x_i) - E_i(x_i) - E_j(x_j) \right]\right\}\right) \]

where \(\beta = 1/(k_B T_0)\) and \(x_i\), \(x_j\) are the current configurations of the two replicas.
Because only the solute region is scaled, the acceptance rate remains high even for large solvated systems, making REST2 far more efficient than standard temperature REMD.
Workflow¶
graph LR
A[Standard MD System] --> B[Identify Hot Region]
B --> C[Mark Solute Atoms in Topology]
C --> D[Generate Scaled Topologies per Replica]
D --> E[Run Hamiltonian Replica Exchange]
E --> F[Analyze Replica 0 for Physical Ensemble]
The production run uses gmx mdrun -multidir -hrex -replex 1000, performing exchange attempts every 1000 steps across all replicas simultaneously.
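A sketch of how such a run is typically launched, assuming one directory per replica named rep0…rep15 (the directory names and MPI launcher are assumptions; the -hrex flag requires a GROMACS build with Hamiltonian replica exchange support, e.g. patched with PLUMED):

# 16 replicas, one MPI rank per replica directory
mpirun -np 16 gmx_mpi mdrun -v \
    -multidir rep0 rep1 rep2 rep3 rep4 rep5 rep6 rep7 \
              rep8 rep9 rep10 rep11 rep12 rep13 rep14 rep15 \
    -deffnm md -hrex -replex 1000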
What's Next¶
Example: Complete Workflow¶
# 1. Build system with PRISM
prism protein.pdb ligand.mol2 -o my_system
# 2. Navigate to MD directory
cd my_system/GMX_PROLIG_MD
# 3. (Optional) Modify MDP parameters
nano ../mdps/md.mdp
# Change nsteps for longer/shorter simulation
# 4. Run complete MD workflow
bash localrun.sh
# 5. Monitor progress
tail -f prod/md.log
# 6. After completion, check
tail prod/md.log
# Look for "Finished mdrun"
Performance Benchmarks¶
Typical performance for different system sizes:
| System Size | Atoms | GPU | CPU | ns/day (GPU) | ns/day (CPU) |
|---|---|---|---|---|---|
| Small | ~30k | RTX 3090 | 20 cores | 80-100 | 10-15 |
| Medium | ~50k | RTX 3090 | 20 cores | 50-70 | 5-8 |
| Large | ~100k | RTX 3090 | 20 cores | 25-35 | 2-4 |
| Very Large | ~200k | A100 | 32 cores | 30-40 | 1-2 |
Note: Performance varies with:
- GPU/CPU model
- Box shape (dodecahedron is faster)
- Number of water molecules
- Force field complexity
- GROMACS version
Resuming Interrupted Simulations¶
If a simulation is interrupted:
cd GMX_PROLIG_MD
# Option 1: Re-run localrun.sh (detects checkpoint automatically)
bash localrun.sh
# Option 2: Manual continuation
gmx mdrun -deffnm prod/md -cpi prod/md.cpt -append
The checkpoint file (.cpt) contains everything needed to continue.
Advanced: Extending Simulations¶
To extend an already-completed simulation:
# Extend the run input by 100000 ps (100 ns)
gmx convert-tpr -s prod/md.tpr -extend 100000 -o prod/md_ext.tpr
# Continue from where it stopped
gmx mdrun -deffnm prod/md -s prod/md_ext.tpr -cpi prod/md.cpt -append
This adds another 100 ns to the simulation; the -extend value is given in picoseconds (100000 ps = 100 ns), independent of the time step.