一个使用gromacs进行蛋白质模拟的入门教程

合集下载

一个使用gromacs进行蛋白质模拟的入门教程

一个使用gromacs进行蛋白质模拟的入门教程

Lysozyme in WaterJustin LemkulDepartment of Biochemistry, Virginia TechThis example will guide a new user through the process of setting up a simulation system containing a protein (lysozyme) in a box of water, with ions. Each step will contain an explanation of input and output, using typical settings for general use.This tutorial assumes you are using a GROMACS version in the 4.5.x series.Step One: Prepare the TopologyWe must download the protein structure file we will be working with. For this tutorial, we will utilize hen egg white lysozyme (PDB code 1AKI). Go to the RCSB website and download the PDB text for the crystal structure.Once you have downloaded the structure, you can visualize the structure using a viewing program such as VMD, Chimera, PyMOL, etc. Once you've had a look at the molecule, you are going to want to strip out the crystal waters. Use a plain text editor like vi, em acs (Linux/Mac), or Notepad (Windows). Do not use word processing software! Delete the lines corresponding to these molecules (residue "HOH" in the PDB file). Note that such a procedure is not universally appropriate (i.e., the case of a bound active site water molecule). For our intentions here, we do not need crystal water.Always check your .pdb file for entries listed under the comment MISSING, as these entries indicate either atoms or whole residues that are not present in the crystal structure. Terminal regions may be absent, and may not present a problem for dynamics. Incomplete internal sequences or any amino acid residues that have missing atoms will cause pdb2gmx to fail. These missing atoms/residues must be modeled in using other software packages. Also note that pdb2gmx is not magic. It cannot generate topologies for arbitrary molecules, just the residues defined by the force field (in the *.rtp files - generally proteins, nucleic acids, and a very finite amount of cofactors, like NAD(H) and ATP).Now that the crystal waters are gone and we have verified that all the necessary atoms are present, the PDB file should contain only protein atoms, and is ready to be input into the first GROMACS tool, pdb2gmx. The purpose of pdb2gmx is to generate three files:1.The topology for the molecule.2. A position restraint file.3. A post-processed structure file.The topology (topol.top by default) contains all the information necessary to define the molecule within a simulation. This information includes nonbonded parameters (atom types and charges) as well as bonded parameters (bonds, angles, and dihedrals). We will take a more detailed look at the topology once it has been generated.Execute pdb2gmx by issuing the following command:pdb2gmx -f 1AKI.pdb-o 1AKI_processed.gro-water spceThe structure will be processed by pdb2gmx, and you will be prompted to choose a forcefield:Select the Force Field:From '/usr/local/gromacs/share/gromacs/top':1: AMBER03 force field (Duan et al., J. Comp. Chem. 24, 1999-2012, 2003)2: AMBER94 force field (Cornell et al., JACS 117, 5179-5197, 1995)3: AMBER96 force field (Kollman et al., Acc. Chem. Res. 29, 461-469, 1996)4: AMBER99 force field (Wang et al., J. Comp. Chem. 21, 1049-1074, 2000)5: AMBER99SB force field (Hornak et al., Proteins 65, 712-725, 2006)6: AMBER99SB-ILDN force field (Lindorff-Larsen et al., Proteins 78, 1950-58, 2010)7: AMBERGS force field (Garcia & Sanbonmatsu, PNAS 99, 2782-2787, 2002)8: CHARMM27 all-atom force field (with CMAP) - version 2.09: GROMOS96 43a1 force field10: GROMOS96 43a2 force field (improved alkane dihedrals)11: GROMOS96 45a3 force field (Schuler JCC 2001 22 1205)12: GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)13: GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)14: OPLS-AA/L all-atom force field (2001 aminoacid dihedrals)15: [DEPRECATED] Encad all-atom force field, using full solvent charges16: [DEPRECATED] Encad all-atom force field, using scaled-down vacuum charges17: [DEPRECATED] Gromacs force field (see manual)18: [DEPRECATED] Gromacs force field with hydrogens for NMRThe force field will contain the information that will be written to the topology. This is a very important choice! You should always read thoroughly about each force field and decide which is most applicable to your situation. For this tutorial, we will use the all-atom OPLS force field, so type 14 at the command prompt, followed by 'Enter'.There are many other options that can be passed to pdb2gmx. Some are listed here:-ignh: Ignore H atoms in the PDB file; especially useful for NMR structures. Otherwise, if H atoms are present, they must be in the correct order and named exactly how GR OMACS expects them to be.-ter: Interactively assign charge states for N- and C-termini.-inter: Interactively assign charge states for Glu, Asp, Lys, Arg, and His; assign disulfides to Cys.You have now generated three new files: 1AKI_processed.gro, topol.top, and posre.itp. 1AKI_processed.gro is a GROMACS-formatted structure file that contains all the atoms defined within the force field (i.e., H atoms have been added to the amino acids in the protein). The topol.top file is the system topology (more on this in a minute). The posre.itp file contains information used to restrain the positions of heavy atoms (more on this later).Step Two: Examine the TopologyLet's look at what is in the output topology (topol.top). Again, using a plain text editor, inspect its contents. After several comment lines (preceded by ;), you will find the following:#include "oplsaa.ff/forcefield.itp"This line calls the parameters within the OPLS-AA force field. It is at the beginning of the file, indicating that all subsequent parameters are derived from this force field. The next important line is [ moleculetype ], below which you will find; Name nrexclProtein_A 3The name "Protein_A" defines the molecule name, based on the fact that the protein was labeled as chain A in the PDB file. There are 3 exclusions for bonded neighbors. More information on exclusions can be found in the GROMACS manual; a discussion of this information is beyond the scope of this tutorial.The next section defines the [ atoms ] in the protein. The information is presented as columns:[ atoms ]; nr type resnr residue atom cgnr charge mass typeB chargeB massB; residue 1 LYS rtp LYSH q +2.01 opls_287 1 LYS N 1 -0.3 14.0067 ; qtot -0.32 opls_290 1 LYS H1 1 0.33 1.008 ; qtot 0.033 opls_290 1 LYS H2 1 0.33 1.008 ; qtot 0.364 opls_290 1 LYS H3 1 0.33 1.008 ; qtot 0.695 opls_293B 1 LYS CA 1 0.25 12.011 ; qtot 0.946 opls_140 1 LYS HA 1 0.06 1.008 ; qtot 1The interpretation of this information is as follows:∙nr: Atom number∙type: Atom type∙resnr: Amino acid residue number∙residue: The amino acid residue nameNote that this residue was "LYS" in the PDB file; the use of .rtp entry "LYSH" indicates that the residue is protonated (the predominant state at neutral pH).∙atom: Atom name∙cgnr: Charge group numberCharge groups define units of integer charge; they aid in speeding up calculations∙charge: Self-explanatoryThe "qtot" descriptor keeps a running count of the total charge on the molecule∙mass: Also self-explanatory∙typeB, chargeB, massB: Used for free energy perturbation (not discussed here)Subsequent sections include [ bonds ], [ pairs ], [ angles ], and [ dihedrals ]. Some of these sections are self-explanatory (bonds, angles, and dihedrals). The parameters and function types associated with these sections are elaborated on in Chapter 5 of the GROMACS manual. Special 1-4 interactions are included under "pairs" (section 5.3.4 of the GROMACS manual).The remainder of the file involves defining a few other useful/necessary topologies, starting with position restraints. The "posre.itp" file was generated by pdb2gmx; it defines a force constant used to keep atoms in place during equilibration (more on this later).; Include Position restraint file#ifdef POSRES#include "posre.itp"#endifThis ends the "Protein_A" moleculetype definition. The remainder of the topology file is dedicated to defining other molecules and providing system-level descriptions. The next moleculetype (by default) is the solvent, in this case SPC/E water. Other typical choices for water include SPC, TIP3P, and TIP4P. We chose this by passing "-water spce" to pdb2gmx. For an excellent summary of the many different water models, click here, but be aware that not all of these models are present within GROMACS.; Include water topology#include "oplsaa.ff/spce.itp"#ifdef POSRES_WATER; Position restraint for each water oxygen[ position_restraints ]; i funct fcx fcy fcz1 1 1000 1000 1000#endifAs you can see, water can also be position-restrained, using a force constant (kpr) of 1000 kJ mol-1 nm-2.Ion parameters are included next:; Include generic topology for ions#include "oplsaa.ff/ions.itp"Finally come system-level definitions. The [ system ] directive gives the name of the system that will be written to output files during the simulation. The [ molecules ] directive lists all of the molecules in the system.[ system ]; NameLYSOZYME[ molecules ]; Compound #molsProtein_A 1A few key notes about the [ molecules ] directive:1.The order of the listed molecules must exactly match the order of the molecules inthe coordinate (in this case, .gro) file.2.The names listed must match the [ moleculetype ] name for each species, not residuenames or anything else.If you fail to satisfy these concrete requirements at any time, you will get fatal errors from grompp (discussed later) about mismatched names, molecules not being found, or a number of others.Now that we have examined the contents of a topology file, we can continue building our system.Step Three: Defining the Unit Cell & Adding SolventNow that you are familiar with the contents of the GROMACS topology, it is time to continue building our system. In this example, we are going to be simulating a simple aqueous system. It is possible to simulate proteins and other molecules in different solvents, provided that good parameters are available for all species involved.There are two steps to defining the box and filling it with solvent:1.Define the box dimensions using editconf.2.Fill the box with water using genbox.You are now presented with a choice as to how to treat the unit cell. For the purpose of this tutorial, we will use a simple cubic box as the unit cell. As you become more comfortable with periodic boundary conditions and box types, I highly recommend the rhombicdodecahedron, as its volume is ~71% of the cubic box of the same periodic distance, thus saving on the number of water molecules that need to be added to solvate the protein.Let's define the box using editconf:editconf-f 1AKI_processed.gro-o 1AKI_newbox.gro-c -d 1.0 -bt cubicThe above command centers the protein in the box (-c), and places it at least 1.0 nm from the box edge (-d 1.0). The box type is defined as a cube (-bt cubic). The distance to the edge of the box is an important parameter. Since we will be using periodic boundary conditions, we must satisfy the minimum image convention. That is, a protein should never see its periodic image, otherwise the forces calculated will be spurious. Specifying a solute-box distance of 1.0 nm will mean that there are at least 2.0 nm between any two periodic images of a protein. This distance will be sufficient for just about any cutoff scheme commonly used in simulations.Now that we have defined a box, we can fill it with solvent (water). Solvation is accomplished using genbox:genbox -cp 1AKI_newbox.gro-cs spc216.gro-o 1AKI_solv.gro-p topol.topThe configuration of the protein (-cp) is contained in the output of the previous editconf step, and the configuration of the solvent (-cs) is part of the standard GROMACS installation. We are using spc216.gro, which is a generic equilibrated 3-point solvent model. You can use spc216.gro as the solvent configuration for SPC, SPC/E, or TIP3P water, since they are all three-point water models. The output is called 1AKI_solv.gro, and we tell genbox the name of the topology file (topol.top) so it can be modified. Note the changes to the [ molecules ] directive of topol.top:[ molecules ]; Compound #molsProtein_A 1SOL 10832What genbox has done is keep track of how many water molecules it has added, which it then writes to your topology to reflect the changes that have been made. Note that if you use any other (non-water) solvent, genbox will not make these changes to your topology! Its compatibility with updating water molecules is hard-coded.Step Four: Adding IonsWe now have a solvated system that contains a charged protein. The output of pdb2gmx told us that the protein has a net charge of +8e (based on its amino acid composition). If you missed this information in the pdb2gmx output, look at the last line of your [ atoms ] directive in topol.top; it should read (in part) "qtot 8." Since life does not exist at a net charge, we must add ions to our system.The tool for adding ions within GROMACS is called genion. What genion does is read through the topology and replace water molecules with the ions that the user specifies. The input is called a run input file, which has an extension of .tpr; this file is produced by the GROMACS tool grompp(GROMACS pre-processor), which will also be used later when we run our first simulation. What grompp does is process the coordinate file and topology (which describes the molecules) to generate an atomic-level input (.tpr). The .tpr file contains all the parameters for all of the atoms in the system.To produce a .tpr file with grompp, we will need an additional input file, with the extension .mdp (molecular dynamics parameter file); grompp will assemble the parameters specified in the .mdp file with the coordinates and topology information to generate a .tpr file.An .mdp file is normally used to run energy minimization or an MD simulation, but in this case is simply used to generate an atomic description of the system. An example .mdp file (the one we will use) can be downloaded here.In reality, the.mdp file used at this step can contain any legitimate combination of parameters. I typically use an energy-minimization script, because they are very basic and do not involve any complicated parameter combinations.Assemble your .tpr file with the following:grompp -f ions.mdp-c 1AKI_solv.gro-p topol.top -o ions.tprNow we have an atomic-level description of our system in the binary file ions.tpr. We will pass this file to genion:genion -s ions.tpr-o 1AKI_solv_ions.gro-p topol.top-pname NA-nname CL-nn 8When prompted, choose group 13 "SOL" for embedding ions. You do not want to replace parts of your protein with ions.In the genion command, we provide the structure/state file (-s) as input, generate a .gro file as output (-o), process the topology (-p) to reflect the removal of water molecules and addition of ions, define positive and negative ion names (-pname and -nname, respectively), and tell genion to add only the ions necessary to neutralize the net charge on the protein by adding the correct number of negative ions (-nn 8). You could also use genion to add a specified concentration of ions in addition to simply neutralizing the system by specifyingthe -neutral and -conc options in conjunction. Refer to the genion man page for information on how to use these options.The names of the ions specified with -pname and -nname were force field-specific in previous versions of GROMACS, but have been standardized as of version 4.5. The specified atom names are always the elemental symbol in all capital letters, along with the [ moleculetype ]. Residue names may or may not append the sign of the charge (+/-). Refer to ions.itp for proper nomenclature if you encounter difficulties.Your [ molecules ] directive should now look like:[ molecules ]; Compound #molsProtein_A 1SOL 10824CL 8Step Five: Energy MinimizationThe solvated, electroneutral system is now assembled. Before we can begin dynamics, we must ensure that the system has no steric clashes or inappropriate geometry. The structure is relaxed through a process called energy minimization (EM).The process for EM is much like the addition of ions. We are once again going to use grompp to assemble the structure, topology, and simulation parameters into a binary input file (.tpr), but this time, instead of passing the .tpr to genion, we will run the energy minimization through the GROMACS MD engine, mdrun.Assemble the binary input using grompp using this input parameter file:grompp -f minim.mdp-c 1AKI_solv_ions.gro -p topol.top-o em.tprMake sure you have been updating your topol.top file when running genbox and genion, or else you will get lots of nasty error messages ("number of coordinates in coordinate file does not match topology," etc).We are now ready to invoke mdrun to carry out the EM:mdrun -v-deffnm emThe -v flag is for the impatient: it makes mdrun verbose, such that it prints its progress to the screen at every step. The -deffnm flag will define the file names of the input and output. So, if you did not name your grompp output "em.tpr," you will have to explicitly specify its name with the mdrun -s flag. In our case, we will get the following files:∙em.log: ASCII-text log file of the EM process∙em.edr: Binary energy file∙em.trr: Binary full-precision trajectory∙em.gro: Energy-minimized structureThere are two very important factors to evaluate to determine if EM was successful. The first is the potential energy (printed at the end of the EM process, even without -v). Epot should be negative, and (for a simple protein in water) on the order of 105-106, depending on the system size and number of water molecules. The second important feature is the maximum force, Fmax, the target for which was set in minim.mdp- "emtol = 1000.0" - indicating a target Fmax of no greater than 1000 kJ mol-1 nm-1. It is possible to arrive at a reasonable Epot with Fmax > emtol. If this happens, your system may not be stable enough for simulation. Evaluate why it may be happening, and perhaps change your minimization parameters (integrator, emstep, etc).Let's do a bit of analysis. The em.edr file contains all of the energy terms that GROMACS collects during EM. You can analyze any .edr file using the GROMACS tools g_energy:g_energy -f em.edr-o potential.xvgAt the prompt, type "10 0" to select Potential (10); zero (0) terminates input. You will be shown the average of Epot, and a file called "potential.xvg" will be written. To plot this data, you will need the Xmgrace plotting tool. The resulting plot should look something like this, demonstrating the nice, steady convergence of Epot:Now that our system is at an energy minimum, we can begin real dynamics.Step Six: EquilibrationEM ensured that we have a reasonable starting structure, in terms of geometry and solvent orientation. To begin real dynamics, we must equilibrate the solvent and ions around the protein. If we were to attempt unrestrained dynamics at this point, the system may collapse. The reason is that the solvent is mostly optimized within itself, and not necessarily with the solute. It needs to be brought to the temperature we wish to simulate and establish the proper orientation about the solute (the protein). After we arrive at the correct temperature (based on kinetic energies), we will apply pressure to the system until it reaches the proper density.Remember that posre.itp file that pdb2gmx generated a long time ago? We're going to use it now. The purpose of posre.itp is to apply a position restraining force on the heavy atoms of the protein (anything that is not a hydrogen). Movement is permitted, but only after overcoming a substantial energy penalty. The utility of position restraints is that they allow us to equilibrate our solvent around our protein, without the added variable of structural changes in the protein.Equilibration is often conducted in two phases. The first phase is conducted under an NVT ensemble (constant Number of particles, Volume, and Temperature). This ensemble is also referred to as "isothermal-isochoric" or "canonical." The timeframe for such a procedure is dependent upon the contents of the system, but in NVT, the temperature of the system should reach a plateau at the desired value. If the temperature has not yet stabilized,additional time will be required. Typically, 50-100 ps should suffice, and we will conduct a 100-ps NVT equilibration for this exercise. Depending on your machine, this may take a while (just over an hour on a dual-core MacBook). Get the .mdp file here.We will call grompp and mdrun just as we did at the EM step:grompp -f nvt.mdp-c em.gro -p topol.top-o nvt.tprmdrun -deffnm nvtA full explanation of the parameters used can be found in the GROMACS manual, in addition to the comments provided. Take note of a few parameters in the .mdp file:∙gen_vel = yes: Initiates velocity generation. Using different random seeds (gen_seed) gives different initial velocities, and thus multiple (different) simulations can be conducted from the same starting structure.∙tcoupl = V-rescale: The velocity rescaling thermostat is an improvement upon the Berendsen weak coupling method, which did not reproduce a correct kinetic ensemble.∙pcoupl = no: Pressure coupling is not applied.Let's analyze the temperature progression, again using g_energy:g_energy -f nvt.edrType "15 0" at the prompt to select the temperature of the system and exit. The resulting plot should look something like the following:From the plot, it is clear that the temperature of the system quickly reaches the target value (300 K), and remains stable over the remainder of the equilibration. For this system, a shorter equilibration period (on the order of 50 ps) may have been adequate.Step Seven: Equilibration, Part 2The previous step, NVT equilibration, stabilized the temperature of the system. Prior to data collection, we must also stabilize the pressure (and thus also the density) of the system. Equilibration of pressure is conducted under an NPT ensemble, wherein the Number of particles, Pressure, and Temperature are all constant. The ensemble is also called the "isothermal-isobaric" ensemble, and most closely resembles experimental conditions.The .mdp file used for a 100-ps NPT equilibration can be found here. It is not drastically different from the parameter file used for NVT equilibration. Note the addition of the pressure coupling section, using the Parrinello-Rahman barostat.A few other changes:∙continuation = yes: We are continuing the simulation from the NVT equilibration phase ∙gen_vel = no: Velocities are read from the trajectory (see below)We will call grompp and mdrun just as we did for NVT equilibration. Note that we are now including the -t flag to include the checkpoint file from the NVT equilibration; this file contains all the necessary state variables to continue our simulation. To conserve the velocities produced during NVT, we must include this file. The coordinate file (-c) is the final output of the NVT simulation.grompp -f npt.mdp -c nvt.gro-t nvt.cpt -p topol.top -o npt.tprmdrun -deffnm nptLet's analyze the pressure progression, again using g_energy:g_energy -f npt.edr -o pressure.xvgType "16 0" at the prompt to select the pressure of the system and exit. The result ing plot should look something like the following:The pressure value fluctuates widely over the course of the 100-ps equilibration phase, but this behavior is not unexpected. The running average of these data are plotted as the red line in the plot. Over the course of the equilibration, the average value of the pressure is 1.05 bar.Let's take a look at density as well, this time using g_energy and entering "22 0" at the prompt.g_energy -f npt.edr -o density.xvgAs with the pressure, the running average of the density is also plotted in red. The average value over the course of 100 ps is 998.3 kg m-3, close to the experimental value of 1000 kg m-3. The parameters for the SPC/E water model closely replicate experimental values for water. The density values are very stable over time, indicating that the system is well-equilibrated now with respect to pressure and density.Step Eight: Production MDUpon completion of the two equilibration phases, the system is now well-equilibrated at the desired temperature and pressure. We are now ready to release the position restraints and run production MD for data collection. The process is just like we have seen before, as we will make use of the checkpoint file (which in this case now contains preserve pressure coupling information) to grompp. We will run a 1-ns MD simulation, the script for which can be found here.grompp -f md.mdp-c npt.gro-t npt.cpt-p topol.top -o md_0_1.tprmdrun -deffnm md_0_1The mdrun step is definitely best run in parallel on a cluster, since it will take several hours to complete. The proper command for executing such a job would be:mpirun -np X mdrun_mpi -deffnm md_0_1where X indicates the number of processors to be used for the simulation. Near the end of the grompp output, you should have seen a line like:Estimate for the relative computational load of the PME mesh part: 0.25The estimate for PME load will dictate how many processors should be dedicated to the PME calculation, and how many for the PP calculations. Refer to the GROMACS 4 publication and the manual for details. For a cubic box, the optimal setup will have a PME load of 0.25 (3:1 PP:PME - we're in luck!); for a dodecahedral box, the optimal PME load is 0.33 (2:1 PP:PME). When executing mdrun, the program should automatically determine the best number of processors to assign for the PP and PME calculations. Thus, make sure you indicate an appropriate number of nodes for your calculation (the value of -np X), so that you can get the best performance. For this system, I achieved roughly 14 ns/day performance on 24 CPU's (18 PP, 6 PME).Step Nine: AnalysisNow that we have simulated our protein, we should run some analysis on the system. What types of data are important? This is an important question to ask before running the simulation, so you should have some ideas about the types of data you will want to collect in your own systems. For this tutorial, a few basic tools will be introduced.The first is trjconv, which is used as a post-processing tool to strip out coordinates, correct for periodicity, or manually alter the trajectory (time units, frame frequency, etc). For this exercise, we will use trjconv to account for any periodicity in the system. The protein will diffuse through the unit cell, and may appear to "jump" across to the other side of the box. To account for such actions, issue the following:trjconv -s md_0_1.tpr-f md_0_1.xtc-o md_0_1_noPBC.xtc -pbc mol-ur compactSelect 0 ("System") for output. We will conduct all our analyses on this "corrected" trajectory. Let's look at structural stability first. GROMACS has a built-in utility for RMSD calculations called g_rms. To use g_rms, issue this command:g_rms -s md_0_1.tpr-f md_0_1_noPBC.xtc-o rmsd.xvg-tu nsChoose 4 ("Backbone") for both the least squares fit and the group for RMSD calculation. The -tu flag will output the results in terms of ns, even though the trajectory was written in ps. This is done for clarity of the output (especially if you have a long simulation - 1e+06 ps does not look as nice as 100 ns). The output plot will show the RMSD relative to the structure present in the minimized, equilibrated system:If we wish to calculate RMSD relative to the crystal structure, we could issue the following:g_rms -s em.tpr -f md_0_1_noPBC.xtc-o rmsd_xtal.xvg -tu nsThe result will be something like:。

一个使用gromacs进行蛋白质模拟的入门教程

一个使用gromacs进行蛋白质模拟的入门教程

一个使用g r o m a c s进行蛋白质模拟的入门教程-CAL-FENGHAI.-(YICAI)-Company One1GROMACS TutorialLysozyme in WaterJustin LemkulDepartment of Biochemistry, Virginia TechThis example will guide a new user through the process of setting up a simulation system containing a protein (lysozyme) in a box of water, with ions. Each step will contain an explanation of input and output, using typical settings for general use. This tutorial assumes you are using a GROMACS version in the series.GROMACS TutorialStep One: Prepare the TopologyWe must download the protein structure file we will be working with. For this tutorial, we will utilize hen egg white lysozyme (PDB code 1AKI). Go to the RCSB website and download the PDB text for the crystal structure.Once you have downloaded the structure, you can visualize the structure using a viewing program such as VMD, Chimera, PyMOL, etc. Once you've had a look at the molecule, you are going to want to strip out the crystal waters. Use a plain text editor like vi, emacs (Linux/Mac), or Notepad (Windows). Do not use word processing software! Delete the lines corresponding to these molecules (residue "HOH" in the PDB file). Note that such a procedure is not universally appropriate ., the case of a bound active site water molecule). For our intentions here, we do not need crystal water.Always check your .pdb file for entries listed under the comment MISSING, as these entries indicate either atoms or whole residues that are not present in the crystal structure. Terminal regions may be absent, and may not present a problem for dynamics. Incomplete internal sequences or any amino acid residues that have missing atoms will cause pdb2gmx to fail. These missing atoms/residues must be modeled in using other software packages. Also note that pdb2gmx is not magic. It cannot generate topologies for arbitrary molecules, just the residues defined by the force field (in the *.rtp files - generally proteins, nucleic acids, and a very finite amount of cofactors, like NAD(H) and ATP).Now that the crystal waters are gone and we have verified that all the necessary atoms are present, the PDB file should contain only protein atoms, and is ready to be input into the first GROMACS tool, pdb2gmx. The purpose of pdb2gmx is to generate three files:1.The topology for the molecule.2. A position restraint file.3. A post-processed structure file.The topology by default) contains all the information necessary to define the molecule within a simulation. This information includes nonbonded parameters (atom types and charges) as well as bonded parameters (bonds, angles, anddihedrals). We will take a more detailed look at the topology once it has been generated.Execute pdb2gmx by issuing the following command:pdb2gmx -f-o-water spceThe structure will be processed by pdb2gmx, and you will be prompted to choose a force field:Select the Force Field:From '/usr/local/gromacs/share/gromacs/top':1: AMBER03 force field (Duan et al., J. Comp. Chem. 24, 1999-2012, 2003)2: AMBER94 force field (Cornell et al., JACS 117, 5179-5197, 1995)3: AMBER96 force field (Kollman et al., Acc. Chem. Res. 29, 461-469, 1996)4: AMBER99 force field (Wang et al., J. Comp. Chem. 21, 1049-1074, 2000)5: AMBER99SB force field (Hornak et al., Proteins 65, 712-725, 2006)6: AMBER99SB-ILDN force field (Lindorff-Larsen et al., Proteins 78, 1950-58, 2010) 7: AMBERGS force field (Garcia & Sanbonmatsu, PNAS 99, 2782-2787, 2002)8: CHARMM27 all-atom force field (with CMAP) - version9: GROMOS96 43a1 force field10: GROMOS96 43a2 force field (improved alkane dihedrals)11: GROMOS96 45a3 force field (Schuler JCC 2001 22 1205)12: GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)13: GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)14: OPLS-AA/L all-atom force field (2001 aminoacid dihedrals)15: [DEPRECATED] Encad all-atom force field, using full solvent charges16: [DEPRECATED] Encad all-atom force field, using scaled-down vacuum charges 17: [DEPRECATED] Gromacs force field (see manual)18: [DEPRECATED] Gromacs force field with hydrogens for NMRThe force field will contain the information that will be written to the topology. This is a very important choice! You should always read thoroughly about each force field and decide which is most applicable to your situation. For this tutorial, we will use the all-atom OPLS force field, so type 14 at the command prompt, followed by'Enter'.There are many other options that can be passed to pdb2gmx. Some are listed here: -ignh: Ignore H atoms in the PDB file; especially useful for NMR structures. Otherwise, if H atoms are present, they must be in the correct order and named exactly how GROMACS expects them to be.-ter: Interactively assign charge states for N- and C-termini.-inter: Interactively assign charge states for Glu, Asp, Lys, Arg, and His; assign disulfides to Cys.You have now generated three new files: , , and . is a GROMACS-formatted structure file that contains all the atoms defined within the force field ., H atoms have been added to the amino acids in the protein). The file is the system topology (more on this in a minute). The file contains information used to restrain the positions of heavy atoms (more on this later).Step Two: Examine the TopologyLet's look at what is in the output topology . Again, using a plain text editor, inspect its contents. After several comment lines (preceded by ;), you will find the following: #include ""This line calls the parameters within the OPLS-AA force field. It is at the beginning of the file, indicating that all subsequent parameters are derived from this force field. The next important line is[moleculetype ], below which you will find; Name nrexclProtein_A 3The name "Protein_A" defines the molecule name, based on the fact that the protein was labeled as chain A in the PDB file. There are 3 exclusions for bonded neighbors. More information on exclusions can be found in the GROMACS manual; a discussion of this information is beyond the scope of this tutorial.The next section defines the[ atoms ]in the protein. The information is presented as columns:[ atoms ]; nr type resnr residue atom cgnr charge mass typeB chargeB massB ; residue 1 LYS rtp LYSH q +1 opls_287 1 LYS N 1 ; qtot2 opls_290 1 LYS H1 1 ; qtot3 opls_290 1 LYS H2 1 ; qtot4 opls_290 1 LYS H3 1 ; qtot5 opls_293B 1 LYS CA 1 ; qtot6 opls_140 1 LYS HA 1 ; qtot 1The interpretation of this information is as follows:nr: Atom numbertype: Atom typeresnr: Amino acid residue numberresidue: The amino acid residue nameNote that this residue was "LYS" in the PDB file; the use of .rtp entry "LYSH"indicates that the residue is protonated (the predominant state at neutralpH).atom: Atom namecgnr: Charge group numberCharge groups define units of integer charge; they aid in speeding upcalculationscharge: Self-explanatoryThe "qtot" descriptor keeps a running count of the total charge on themoleculemass: Also self-explanatorytypeB, chargeB, massB: Used for free energy perturbation (not discussedhere)Subsequent sections include[ bonds ], [ pairs ], [ angles ],and[ dihedrals ]. Some of these sections are self-explanatory (bonds, angles, and dihedrals). The parameters and function types associated with these sections are elaborated on in Chapter 5 of the GROMACS manual. Special 1-4 interactions are included under "pairs" (section of the GROMACS manual).The remainder of the file involves defining a few other useful/necessary topologies, starting with position restraints. The "" file was generated by pdb2gmx; it defines a force constant used to keep atoms in place during equilibration (more on this later). ; Include Position restraint file#ifdef POSRES#include ""#endifThis ends the "Protein_A" moleculetype definition. The remainder of the topology file is dedicated to defining other molecules and providing system-level descriptions. The next moleculetype (by default) is the solvent, in this case SPC/E water. Other typical choices for water include SPC, TIP3P, and TIP4P. We chose this by passing "-water spce" to pdb2gmx. For an excellent summary of the many different water models, click?here, but be aware that not all of these models are present within GROMACS.; Include water topology#include ""#ifdef POSRES_WATER; Position restraint for each water oxygen[ position_restraints ]; i funct fcx fcy fcz1 1 1000 1000 1000#endifAs you can see, water can also be position-restrained, using a force constant (kpr) of 1000 kJ mol-1nm-2.Ion parameters are included next:; Include generic topology for ions#include ""Finally come system-level definitions. The[ system ]directive gives the name of the system that will be written to output files during the simulation.The[ molecules ]directive lists all of the molecules in the system.[ system ]; NameLYSOZYME[ molecules ]; Compound #molsProtein_A 1A few key notes about the[ molecules ]directive:1.The order of the listed molecules mustexactlymatch the order of themolecules in the coordinate (in this case, .gro) file.2.The names listed must match the[ moleculetype ]name for each species, notresidue names or anything else.3.If you fail to satisfy these concrete requirements at any time, you will get fatal errors from grompp (discussed later) about mismatched names, molecules not being found, or a number of others.Now that we have examined the contents of a topology file, we can continue building our system.Step Three: Defining the Unit Cell & Adding SolventNow that you are familiar with the contents of the GROMACS topology, it is time to continue building our system. In this example, we are going to be simulating a simple aqueous system. It is possible to simulate proteins and other molecules in different solvents, provided that good parameters are available for all species involved.There are two steps to defining the box and filling it with solvent:1.Define the box dimensions using editconf.2.Fill the box with water using genbox.You are now presented with a choice as to how to treat the unit cell. For the purpose of this tutorial, we will use a simple cubic box as the unit cell. As you become more comfortable with periodic boundary conditions and box types, I highly recommend the rhombic dodecahedron, as its volume is ~71% of the cubic box of the same periodic distance, thus saving on the number of water molecules that need to be added to solvate the protein.Let's define the box using editconf:editconf-f-o -c -d -bt cubicThe above command centers the protein in the box (-c), and places it at least nm from the box edge (-d . The box type is defined as a cube (-bt cubic). The distance to the edge of the box is an important parameter. Since we will be using periodic boundary conditions, we must satisfy the minimum image convention. That is, aprotein should never see its periodic image, otherwise the forces calculated will be spurious. Specifying a solute-box distance of nm will mean that there are at least nm between any two periodic images of a protein. This distance will be sufficient for just about any cutoff scheme commonly used in simulations.Now that we have defined a box, we can fill it with solvent (water). Solvation is accomplished using genbox:genbox -cp-cs-o -pThe configuration of the protein (-cp) is contained in the output of the previous editconf step, and the configuration of the solvent (-cs) is part of the standard GROMACS installation. We are using , which is a generic equilibrated 3-point solvent model. You can use as the solvent configuration for SPC, SPC/E, or TIP3P water, since they are all three-point water models. The output is called , and we tell genbox the name of the topology file so it can be modified. Note the changes to the[ molecules ]directive of :[ molecules ]; Compound #molsProtein_A 1SOL 10832What genbox has done is keep track of how many water molecules it has added, which it then writes to your topology to reflect the changes that have been made. Note that if you use any other (non-water) solvent, genbox will not make these changes to your topology! Its compatibility with updating water molecules is hard-coded.Step Four: Adding IonsWe now have a solvated system that contains a charged protein. The output ofpdb2gmx told us that the protein has a net charge of +8e(based on its amino acid composition). If you missed this information in the pdb2gmx output, look at the last line of your[ atoms ]directive in ; it should read (in part) "qtot 8." Since life does not exist at a net charge, we must add ions to our system.The tool for adding ions within GROMACS is called genion. What genion does is read through the topology and replace water molecules with the ions that the user specifies. The input is called a run input file, which has an extension of .tpr; this file is produced by the GROMACS tool grompp (GROMACSpre-processor), which will also be used later when we run our first simulation. What grompp does is process thecoordinate file and topology (which describes the molecules) to generate an atomic-level input (.tpr). The .tpr file contains all the parameters for all of the atoms in the system.To produce a .tpr file with grompp, we will need an additional input file, with the extension .mdp (moleculardynamics parameter file); grompp will assemble the parameters specified in the .mdp file with the coordinates and topology information to generate a .tpr file.An .mdp file is normally used to run energy minimization or an MD simulation, but in this case is simply used to generate an atomic description of the system. An example .mdp file (the one we will use) can be downloadedhere.In reality, the .mdp file used at this step can contain any legitimate combination of parameters. I typically use an energy-minimization script, because they are very basic and do not involve any complicated parameter combinations.Assemble your .tpr file with the following:grompp -f -c-p -oNow we have an atomic-level description of our system in the binary file . We will pass this file to genion:genion -s-o-p -pname NA-nname CL-nn 8When prompted, choose group 13 "SOL" for embedding ions. You do not want to replace parts of your protein with ions.In the genion command, we provide the structure/state file (-s) as input, generate a .gro file as output (-o), process the topology (-p) to reflect the removal of water molecules and addition of ions, define positive and negative ion names (-pname and -nname, respectively), and tell genion to add only the ions necessary to neutralize the net charge on the protein by adding the correct number of negative ions (-nn 8). You could also use genion to add a specified concentration of ions in addition to simply neutralizing the system by specifying the -neutral and -conc options in conjunction. Refer to the genion man page for information on how to use these options.The names of the ions specified with -pname and -nname were force field-specific in previous versions of GROMACS, but have been standardized as of version . The specified atom names are always the elemental symbol in all capital letters, along with the[ moleculetype ]. Residue names may or may not append the sign of the charge (+/-). Refer to for proper nomenclature if you encounter difficulties.Your[ molecules ]directive should now look like:[ molecules ]; Compound #molsProtein_A 1SOL 10824CL 8Step Five: Energy MinimizationThe solvated, electroneutral system is now assembled. Before we can begin dynamics, we must ensure that the system has no steric clashes or inappropriate geometry. The structure is relaxed through a process called energy minimization (EM).The process for EM is much like the addition of ions. We are once again going to use grompp to assemble the structure, topology, and simulation parameters into a binary input file (.tpr), but this time, instead of passing the .tpr to genion, we will run the energy minimization through the GROMACS MD engine, mdrun.Assemble the binary input using grompp usingthisinput parameter file:grompp -f -c -p-oMake sure you have been updating your file when running genbox and genion, or else you will get lots of nasty error messages ("number of coordinates in coordinate file does not match topology," etc).We are now ready to invoke mdrun to carry out the EM:mdrun -v-deffnm emThe -v flag is for the impatient: it makes mdrun verbose, such that it prints its progress to the screen at every step. The -deffnm flag will define the file names of the input and output. So, if you did not name your grompp output "," you will have to explicitly specify its name with the mdrun -s flag. In our case, we will get the following files:: ASCII-text log file of the EM process: Binary energy file: Binary full-precision trajectory: Energy-minimized structureThere are two very important factors to evaluate to determine if EM was successful. The first is the potential energy (printed at the end of the EM process, even without -v). Epot?should be negative, and (for a simple protein in water) on the order of 105-106, depending on the system size and number of water molecules. The second important feature is the maximum force, Fmax, the target for which was set in - "emtol = " - indicating a target Fmax?of no greater than 1000 kJ mol-1nm-1. It is possible to arrive at a reasonable Epotwith Fmax> emtol. If this happens, your system may not be stable enough for simulation. Evaluate why it may be happening, and perhaps change your minimization parameters (integrator, emstep, etc).Let's do a bit of analysis. The file contains all of the energy terms that GROMACS collects during EM. You can analyze any .edr file using the GROMACS tools g_energy:g_energy -f-oAt the prompt, type "10 0" to select Potential (10); zero (0) terminates input. You will be shown the average of Epot, and a file called "" will be written. To plot this data, you will need the?Xmgrace plotting tool. The resulting plot should look something like this, demonstrating the nice, steady convergence of Epot:Now that our system is at an energy minimum, we can begin real dynamics.Step Six: EquilibrationEM ensured that we have a reasonable starting structure, in terms of geometry and solvent orientation. To begin real dynamics, we must equilibrate the solvent and ions around the protein. If we were to attempt unrestrained dynamics at this point, the system may collapse. The reason is that the solvent is mostly optimized within itself, and not necessarily with the solute. It needs to be brought to the temperature we wish to simulate and establish the proper orientation about the solute (the protein). After we arrive at the correct temperature (based on kinetic energies), we will apply pressure to the system until it reaches the proper density.Remember that file that pdb2gmx generated a long time ago We're going to use it now. The purpose of is to apply apositionrestraining force on the heavy atoms of the protein (anything that is not a hydrogen). Movement is permitted, but only after overcoming a substantial energy penalty. The utility of position restraints is that they allow us to equilibrate our solvent around our protein, without the added variable of structural changes in the protein.Equilibration is often conducted in two phases. The first phase is conducted under anNVTensemble (constant Number of particles, Volume, and Temperature). This ensemble is also referred to as "isothermal-isochoric" or "canonical." The timeframe for such a procedure is dependent upon the contents of the system, but inNVT, the temperature of the system shouldreach a plateau at the desired value. If the temperature has not yet stabilized, additional time will be required. Typically, 50-100 ps should suffice, and we will conduct a 100-psNVTequilibration for this exercise. Depending on your machine, this may take a while (just over an hour on a dual-core MacBook). Get the .mdp file?here.We will call grompp and mdrun just as we did at the EM step:grompp -f-c -p -omdrun -deffnm nvtA full explanation of the parameters used can be found in the GROMACS manual, in addition to the comments provided. Take note of a few parameters in the .mdp file:gen_vel = yes: Initiates velocity generation. Using different random seeds(gen_seed) gives different initial velocities, and thus multiple (different)simulations can be conducted from the same starting structure.tcoupl = V-rescale: The velocity rescaling thermostat is an improvement upon the Berendsen weak coupling method, which did not reproduce a correctkinetic ensemble.pcoupl = no: Pressure coupling is not applied.Let's analyze the temperature progression, again using g_energy:g_energy -fType "15 0" at the prompt to select the temperature of the system and exit. The resulting plot should look something like the following:From the plot, it is clear that the temperature of the system quickly reaches the target value (300 K), and remains stable over the remainder of the equilibration. For this system, a shorter equilibration period (on the order of 50 ps) may have been adequate.Step Seven: Equilibration, Part 2The previous step,NVTequilibration, stabilized the temperature of the system. Prior to data collection, we must also stabilize the pressure (and thus also the density) of the system. Equilibration of pressure is conducted under anNPT ensemble, wherein the Number of particles, Pressure, and Temperature are all constant. The ensemble is also called the "isothermal-isobaric" ensemble, and most closely resembles experimental conditions.The .mdp file used for a 100-psNPTequilibration can be found here. It is not drastically different from the parameter file used forNVTequilibration. Note the addition of the pressure coupling section, using the Parrinello-Rahman barostat.A few other changes:continuation = yes: We are continuing the simulation fromthe NVT equilibration phasegen_vel = no: Velocities are read from the trajectory (see below)We will call grompp and mdrun just as we did forNVTequilibration. Note that we are now including the -t flag to include the checkpoint file from theNVTequilibration; this file contains all the necessary state variables to continue our simulation. To conserve the velocities produced duringNVT, we must include this file. The coordinate file (-c) is the final output of the?NVTsimulation.grompp -f -c-t-p -omdrun -deffnm nptLet's analyze the pressure progression, again using g_energy:g_energy -f -oType "16 0" at the prompt to select the pressure of the system and exit. The resulting plot should look something like the following:The pressure value fluctuates widely over the course of the 100-ps equilibration phase, but this behavior is not unexpected. The running average of these data are plotted as the red line in the plot. Over the course of the equilibration, the average value of the pressure is bar.Let's take a look at density as well, this time using g_energy and entering "22 0" at the prompt.g_energy -f -oAs with the pressure, the running average of the density is also plotted in red. The average value over the course of 100 ps is kg m-3, close to the experimental value of 1000 kg m-3. The parameters for the SPC/E water model closely replicate experimental values for water. The density values are very stable over time, indicating that the system is well-equilibrated now with respect to pressure and density.Step Eight: Production MDUpon completion of the two equilibration phases, the system is now well-equilibrated at the desired temperature and pressure. We are now ready to release the position restraints and run production MD for data collection. The process is just like we have seen before, as we will make use of the checkpoint file (which in this case now contains preserve pressure coupling information) to grompp. We will run a 1-ns MD simulation, the script for which can be found?here.grompp -f -c -t -p -omdrun -deffnm md_0_1The mdrun step is definitely best run in parallel on a cluster, since it will take several hours to complete. The proper command for executing such a job would be:mpirun -np X mdrun_mpi -deffnm md_0_1where X indicates the number of processors to be used for the simulation. Near the end of the grompp output, you should have seen a line like:Estimate for the relative computational load of the PME mesh part:The estimate for PME load will dictate how many processors should be dedicated to the PME calculation, and how many for the PP calculations. Refer to the GROMACS 4publicationand the manual for details. For a cubic box, the optimal setup will have a PME load of (3:1 PP:PME - we're in luck!); for a dodecahedral box, the optimal PME load is (2:1 PP:PME). When executing mdrun, the program should automatically determine the best number of processors to assign for the PP and PME calculations. Thus, make sure you indicate an appropriate number of nodes for your calculation (the value of -np X), so that you can get the best performance. For this system, I achieved roughly 14 ns/day performance on 24 CPU's (18 PP, 6 PME).Step Nine: AnalysisNow that we have simulated our protein, we should run some analysis on the system. What types of data are important This is an important question to ask before running the simulation, so you should have some ideas about the types of data you will want to collect in your own systems. For this tutorial, a few basic tools will be introduced.The first is trjconv, which is used as a post-processing tool to strip out coordinates, correct for periodicity, or manually alter the trajectory (time units, frame frequency, etc). For this exercise, we will use trjconv to account for any periodicity in the system. The protein will diffuse through the unit cell, and may appear to "jump" across to the other side of the box. To account for such actions, issue the following:trjconv -s-f -o -pbc mol-ur compactSelect 0 ("System") for output. We will conduct all our analyses on this "corrected" trajectory. Let's look at structural stability first. GROMACS has a built-in utility for RMSD calculations called g_rms. To use g_rms, issue this command:g_rms -s-f-o -tu nsChoose 4 ("Backbone") for both the least squares fit and the group for RMSD calculation. The -tu flag will output the results in terms of ns, even though the trajectory was written in ps. This is done for clarity of the output (especially if youhave a long simulation - 1e+06 ps does not look as nice as 100 ns). The output plot will show the RMSD relative to the structure present in the minimized, equilibrated system:If we wish to calculate RMSD relative to the crystal structure, we could issue the following:g_rms -s -f -o -tu nsThe result will be something like:Both plots show the RMSD levels off to ~ nm (1 ), indicating that the structure is very stable. Subtle differences between the plots indicate that the structure at t = 0 ns is slightly different from this crystal structure. This is to be expected, since it has been energy-minimized, and because the position restraints are not 100% perfect, as discussed previously.The radius of gyration of a protein is a measure of its compactness. If a protein is stably folded, it will likely maintain a relatively steady value of?Rg. If a protein unfolds, its?Rg will change over time. Let's analyze the radius of gyration for lysozyme in our simulation:g_gyrate -s -f-oWe can see from the reasonably invariant?Rg values that the protein remains very stable, in its compact (folded) form over the course of 1 ns at 300 K. This result is not unexpected, but illustrates an advanced capacity of GROMACS analysis that comes built-in.SummaryYou have now conducted a molecular dynamics simulation with GROMACS, and analyzed some of the results. This tutorial should not be viewed as comprehensive. There are many more types of simulations that one can conduct with GROMACS (free energy calculations, non-equilibrium MD, and normal modes analysis, just to。

GROMACS教程

GROMACS教程

GROMACS程序DEMO例程####################### 概述 #######################----------------------------------------------------------------------------------------------------------------------------------该例程来自Gromacs程序/share/tutor/目录下。

整个例程大概只需要十分钟就可以完成,非常适合初学者学习。

该例程是一个完整的分子动力学模拟过程,涵盖了Gromacs程序基本的使用方法。

模拟内容是一个水环境下的小肽链。

模拟唯一要求是该小肽链的PDB文件。

文档翻译如果有错,请你给我发信:sen@。

相关内容请参阅Gromacs文档,或者给Gromacs开发组询问。

Gromacs官方网站:Email: gromacs@----------------------------------------------------------------------------------------------------------------------------------############################ 环境变量设置############################----------------------------------------------------------------------------------------------------------------------------------在以下的例程中,所有命令都直接运行,没有添加绝对路径。

所以,必须将Gromacs安装路径的bin文件夹加入到系统PATH变量中。

如果不加入PATH变量,那么运行时要加入命令的绝对路径。

Gromacs模拟基本流程

Gromacs模拟基本流程

Gromacs 运行参数
1 Preprocessing include = … ; 指定拓扑结构目录 define = ; 预处理控制拓扑文件 -DPOSRES ; 位置限制 restraints -DFLEXIBLE ; 柔性水代替刚性水 2 Run control integrator = ; 指定积分算法(仅给出常用算法) md ; 蛙跳牛顿积分算法, 用于平衡动力学积分 steep ; 最陡下降法,用于能量最小化 cg ; 共轭梯度法; 用于能量最小化 (需双精度, 且先需做 steep) tinit = dt = ; 模拟开始时刻(仅用于 md、sd、bd) ; 积分步长(仅用于 md、sd、bd)

输入为:em.mdp topol.top 输出为:sol_ion.gro 命令为:grompp -f em.mdp -p topol.top -c sol.gro -o em.tpr genion -s em.tpr -p topol.top -o sol_ion.gro -pname NA+ -np 9 -nname CL-nn 9 -random 这里强调一点,em.mdp 文件是进行模拟的参数文件,在模拟前就必须存在。参 数文件也是分子模拟的一个重要文件, 到哪找些参考了。 呵呵, 你可能有经验了, SENSENBOBO 的博客。这个家伙真是太烦人了! !-np 表示正离子的个数(可能是 number of positive)-nn 就不说了。结果也会提示 back up.因为我们加入离子 改变了拓扑文件。不想写了,同上 好,现在准备工作已经就绪。我们开始最重要的三步模拟。第一步,能量 最优化模拟。首先,得用 grompp 命令将 .gro 文件 .mdp 文件 .top 文件 集合起来建立一个二进制文件.tpr。可以了,就用 mdrun 命令开始模拟! 输入为:em.mdp topol.top sol_ion.gro 输出为:em.gro .........很多文件,都看看,以后分析有用 命令为:grompp -f em.mdp -p topol.top -c sol_ion.gro -o em.tpr mdrun -deffnm em -v 你可能发现这一步生成的 em.tpr 和上一步的不一样,是的,输入的.gro 文件就 不一样嘛!mdrun 命令不懂的话就硬着头皮做,放心以后会懂的!结果还会有很 多其他的文件,像 .trr .xtc .edr .log .mdp,唉,自己看吧! 我们开始进行第二步模拟,水平衡模拟。建立文件的方式同第 6 步,不过 采用了不同的参数文件,可以理解吧! 输入为:pr.mdp topol.top em.gro 输出为:pr.gro...........blablablabla 命令为:grompp -f pr.mdp -p topol.top -c em.gro -o pr.tpr mdrun -deffnm pr -v 最后,终于到 Production simulation 了,一般这一步就会花十个小时以 上,视具体分子大小和设定空间而定。跑吧!!哥们,终点不远了。 输入为:md.mdp topol.top pr .gro 输出为:md.gro...........blablablabla 命令为:grompp -f pr.mdp -p topol.top -c em.gro -o pr.tpr mdrun -deffnm md -v 最后从文件中找出 md.gro, 用 editconf 转换为.pdb 文件, 就可以用 VMD 或 PyMOL 看了。OVER

gromacs使用手册

gromacs使用手册

gromacs使用手册摘要:1.引言2.gromacs 简介3.gromacs 的基本使用3.1 安装gromacs3.2 创建模拟系统3.3 运行模拟3.4 分析结果4.gromacs 的高级使用4.1 模拟参数设置4.2 力场参数设置4.3 脚本编写与批处理5.gromacs 的常见问题及解决方法6.总结正文:gromacs 使用手册gromacs 是一款开源的分子动力学模拟软件,广泛应用于生物大分子、药物分子、液晶等体系的模拟研究。

本手册将详细介绍gromacs 的使用方法及高级技巧。

1.引言分子动力学模拟是一种通过计算原子之间的相互作用力,来研究分子运动规律的方法。

gromacs 作为一款功能强大的分子动力学模拟软件,得到了广大科研工作者的青睐。

2.gromacs 简介gromacs 是一款基于GROMOS 力场的分子动力学模拟软件,支持多种体系、多种力场的模拟计算。

gromacs 采用高效的算法和技术,可以实现高效的大规模模拟计算。

3.gromacs 的基本使用3.1 安装gromacs用户可以根据gromacs 的官方文档,选择合适的安装方式,如使用编译器进行编译安装,或使用包管理器进行安装。

3.2 创建模拟系统首先需要构建分子模型,包括原子类型、坐标文件、相互作用参数等。

接着,通过gromacs 的脚本或命令行,设定模拟参数,如温度、压力、模拟时间等。

3.3 运行模拟根据设定的参数,运行gromacs 命令,开始模拟计算。

gromacs 会生成一系列模拟结果文件,包括轨迹文件、能量文件、坐标文件等。

3.4 分析结果使用gromacs 提供的分析工具或其他第三方软件,对模拟结果进行后处理,如计算均方根偏差(RMSD)、计算相互作用能等。

4.gromacs 的高级使用4.1 模拟参数设置根据实际需求,调整模拟参数,如采用更高级的力场、改变模拟方法等,以优化模拟效果。

4.2 力场参数设置通过修改力场参数文件,可以自定义力场参数,以适应不同体系的研究需求。

gromacs使用手册

gromacs使用手册

gromacs使用手册【原创实用版】目录1.Gromacs 简介2.Gromacs 的功能和应用领域3.Gromacs 的使用方法和技巧4.Gromacs 的安装与配置5.Gromacs 的常见问题与解决方案6.Gromacs 的未来发展趋势正文1.Gromacs 简介Gromacs 是一个开源的生物大分子模拟软件,主要用于分子动力学模拟和静态模拟。

Gromacs 可以模拟各种生物大分子,如蛋白质、核酸和多糖等,被广泛应用于生物学、化学、材料科学等领域的研究中。

2.Gromacs 的功能和应用领域Gromacs 具有以下主要功能:(1) 分子动力学模拟:Gromacs 可以使用不同的力场和算法进行分子动力学模拟,以研究分子结构和功能的关系。

(2) 静态模拟:Gromacs 可以进行静态模拟,以研究分子的静态结构。

(3) 计算化学:Gromacs 可以进行量子化学计算和分子轨道计算。

(4) 溶剂模型:Gromacs 提供了多种溶剂模型,以模拟不同环境下分子的行为。

Gromacs 的应用领域包括:(1) 蛋白质结构和功能研究(2) 药物设计(3) 材料科学(4) 生物物理学3.Gromacs 的使用方法和技巧(1) 安装 Gromacs:Gromacs 的安装过程相对简单,只需按照官方提供的安装指南进行即可。

(2) 配置 Gromacs:Gromacs 需要配置一些环境变量和参数文件,以保证其正常运行。

(3) 使用 Gromacs:Gromacs 提供了命令行界面和图形界面两种使用方式。

(4) 技巧:在进行模拟时,需要注意选择合适的力场和算法,以及合理设置模拟参数。

4.Gromacs 的安装与配置(1) 安装 Gromacs:可以从 Gromacs 官网下载最新版本的安装包,然后按照官方提供的安装指南进行安装。

(2) 配置 Gromacs:需要配置环境变量和参数文件,以保证 Gromacs 正常运行。

Gromacs分子动力学模拟流程概述

Gromacs分子动力学模拟流程概述

Gromacs分⼦动⼒学模拟流程概述Gromacs分⼦动⼒学模拟主要可以分为以下⼏个步骤,不同的体系步骤可能略有不同。

在开始之前,先简单了解⼀下预平衡:分⼦动⼒学模拟的最终⽬的是对体系进⾏抽样,然后计算体系的能量,各种化学键,成分分析等等。

打个⽐⽅说,我们有⼀个蛋⽩质,我们想将它放⼊⼀种溶液中(可能是⽔,也可能不是),然后看看这个体系的能量如何变化,蛋⽩质的化学键,与⽔分⼦形成的氢键等等信息,那么我们需要将蛋⽩质放⼊溶液中,映射到现实中就是讲溶剂放⼊溶剂中,然后等体系稳定后,观察其性质。

在MD中,这⼀过程不向现实中⼀样是⾃然发⽣的,我们需要通过模拟是体系演化到平衡状态,这就是预平衡。

⼀般来说预平衡会有以下办法:蛋⽩质结构能量最⼩化:PDB⽂件都是从晶体中获得的,所以蛋⽩质放⼊溶液中后必然会发⽣变化,这就需要对其进⾏能量最⼩化,确保蛋⽩质的结构是稳定结构。

蛋⽩质位置限定性模拟:有时加⼊溶剂后,分⼦间相互作⽤⼒会过⼤,导致蛋⽩质体系崩溃。

这时我们需要限制蛋⽩质中重原⼦的位置,维持其结构,等溶剂分⼦弛豫之后再放开限制进⾏模拟。

NVT预平衡,NPT预平衡:⼀般先做NVT模拟,减⼩盒⼦内压⼒,然后再做NPT模拟。

以上步骤当然不⽤全做,视情况⽽定,不过⼀般蛋⽩质能量最⼩化和位置限定性NPT还是要做的。

以下是分⼦动⼒学模拟的步骤,有些步骤可以省略。

1. 获取并处理PDB⽂件⼀般PDB⽂件是从⽹站上下载,如/pdb/home/home.do。

获取PDB⽂件后有可能还要做⼀些处理,如末端氢原⼦,结晶⽔,等等。

视情况⽽定。

2. 使⽤pdb2gmx获得拓扑⽂件命令pdb2gmx的详细信息可以参加/programs/gmx-pdb2gmx.html。

具体的命令参数我会在另⼀篇⽂章中详述。

⼀般⽽⾔,我们使⽤时会是向下⾯这样:gmx pdb2gmx -ff amber99sb-ildn -f *.pdb -o *.gro -p *.top -water tip3p-ff 选项,制定要使⽤的⼒场;-f选项,制定输⼊的PDB⽂件;-o选项,制定⽣成的gro⽂件名-p选项,制定要⽣成的拓扑⽂件名-water选项,制定要使⽤的⽔分⼦模型注意,除了⽣成*.gro⽂件和*.top⽂件之外,还会⽣成⼀个posre.itp,位置限定性⽂件(我把它理解成position-restraints的缩写)。

gromacs使用手册

gromacs使用手册

gromacs使用手册摘要:一、Gromacs简介二、Gromacs的安装与配置三、Gromacs的基本操作1.创建模拟配置文件2.运行模拟3.分析结果四、Gromacs的高级功能1.分子动力学模拟2.热力学计算3.对接与筛选五、Gromacs的优缺点六、Gromacs的未来发展正文:一、Gromacs简介Gromacs是一款用于分子动力学模拟的开源软件,广泛应用于生物化学、材料科学等领域。

它具有高效的计算性能、丰富的功能和友好的用户界面,为科学家提供了强大的分子模拟工具。

二、Gromacs的安装与配置要在计算机上安装Gromacs,首先需要确保满足系统的硬件和软件要求。

接下来,按照官方文档的指引进行安装和配置。

在配置过程中,用户可以根据自己的需求选择相应的模块和参数。

三、Gromacs的基本操作1.创建模拟配置文件要运行Gromacs,首先需要创建一个模拟配置文件(gro文件),其中包含了模拟系统的信息,如原子、盒子、温度、压力等。

通过编辑gro文件,用户可以设置模拟的具体参数。

2.运行模拟在完成gro文件设置后,使用Gromacs提供的脚本(如mdrun)运行模拟。

根据需要,用户可以选择不同的模拟模式,如NVT、NPT等。

3.分析结果Gromacs可以自动生成模拟过程中的数据文件(如gro、xtc、trr等),用户可以通过Gromacs提供的分析工具(如g_analysis)对这些文件进行处理和可视化。

四、Gromacs的高级功能1.分子动力学模拟Gromacs支持多种分子动力学算法,如Verlet积分器、Langevin动力学等。

用户可以根据研究需求选择合适的算法进行模拟。

2.热力学计算Gromacs可以用于计算系统的热力学性质,如比热、熵等。

这些计算有助于深入了解系统的热力学行为。

3.对接与筛选Gromacs提供了对接和筛选工具,用于寻找分子间的最佳结合位点。

这对于药物设计和蛋白质筛选等领域具有重要的应用价值。

一个使用gromacs进行蛋白质模拟的入门教程

一个使用gromacs进行蛋白质模拟的入门教程

Lysozyme in WaterJustin LemkulDepartment of Biochemistry, Virginia TechThis example will guide a new user through the process of setting up a simulation system containing a protein (lysozyme) in a box of water, with ions. Each step will contain an explanation of input and output, using typical settings for general use.This tutorial assumes you are using a GROMACS version in the series.GROMACS TutorialStep One: Prepare the TopologyWe must download the protein structure file we will be working with. For this tutorial, we will utilize hen egg white lysozyme (PDB code 1AKI). Go to the RCSB website and download the PDB text for the crystal structure.Once you have downloaded the structure, you can visualize the structure using a viewing program such as VMD, Chimera, PyMOL, etc. Once you've had a look at the molecule, you are going to want to strip out the crystal waters. Use a plain text editor like vi, emacs (Linux/Mac), or Notepad (Windows). Do not use word processing software! Delete the lines corresponding to these molecules (residue "HOH" in the PDB file). Note that such a procedure is not universally appropriate ., the case of a bound active site water molecule). For our intentions here, we do not need crystal water.Always check your .pdb file for entries listed under the comment MISSING, as these entries indicate either atoms or whole residues that are not present in the crystal structure. Terminal regions may be absent, and may not present a problem for dynamics. Incomplete internal sequences or any amino acid residues that have missing atoms will cause pdb2gmx to fail. These missing atoms/residues must be modeled in using other software packages. Also note that pdb2gmx is not magic. It cannot generate topologies for arbitrary molecules, just the residues defined by the force field (in the *.rtp files - generally proteins, nucleic acids, and a very finite amount of cofactors, like NAD(H) and ATP).Now that the crystal waters are gone and we have verified that all the necessary atoms are present, the PDB file should contain only protein atoms, and is ready to be input into the first GROMACS tool,pdb2gmx. The purpose of pdb2gmx is to generate three files:1.The topology for the molecule.2. A position restraint file.3. A post-processed structure file.The topology by default) contains all the information necessary to define the molecule within a simulation. This information includes nonbonded parameters (atom types and charges) as well as bonded parameters (bonds, angles, and dihedrals). We will take a more detailed look at the topology once it has been generated.Execute pdb2gmx by issuing the following command:pdb2gmx -f-o-water spceThe structure will be processed by pdb2gmx, and you will be prompted to choose a force field:Select the Force Field:From '/usr/local/gromacs/share/gromacs/top':1: AMBER03 force field (Duan et al., J. Comp. Chem. 24, 1999-2012, 2003)2: AMBER94 force field (Cornell et al., JACS 117, 5179-5197, 1995)3: AMBER96 force field (Kollman et al., Acc. Chem. Res. 29, 461-469, 1996)4: AMBER99 force field (Wang et al., J. Comp. Chem. 21, 1049-1074, 2000)5: AMBER99SB force field (Hornak et al., Proteins 65, 712-725, 2006)6: AMBER99SB-ILDN force field (Lindorff-Larsen et al., Proteins 78, 1950-58, 2010)7: AMBERGS force field (Garcia & Sanbonmatsu, PNAS 99, 2782-2787, 2002)8: CHARMM27 all-atom force field (with CMAP) - version9: GROMOS96 43a1 force field10: GROMOS96 43a2 force field (improved alkane dihedrals)11: GROMOS96 45a3 force field (Schuler JCC 2001 22 1205)12: GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)13: GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)14: OPLS-AA/L all-atom force field (2001 aminoacid dihedrals)15: [DEPRECATED] Encad all-atom force field, using full solvent charges16: [DEPRECATED] Encad all-atom force field, using scaled-down vacuum charges 17: [DEPRECATED] Gromacs force field (see manual)18: [DEPRECATED] Gromacs force field with hydrogens for NMRThe force field will contain the information that will be written to the topology. This is a very important choice! You should always read thoroughly about each force field and decide which is most applicable to your situation. For this tutorial, we will use the all-atom OPLS force field, so type 14 at the command prompt, followed by 'Enter'.There are many other options that can be passed to pdb2gmx. Some are listed here:-ignh: Ignore H atoms in the PDB file; especially useful for NMR structures. Otherwise, if H atoms are present, they must be in the correct order and named exactly how GROMACS expects them to be.-ter: Interactively assign charge states for N- and C-termini.-inter: Interactively assign charge states for Glu, Asp, Lys, Arg, and His; assign disulfides to Cys.You have now generated three new files: , , and . is a GROMACS-formatted structurefile that contains all the atoms defined within the force field ., H atoms have been added to the amino acids in the protein). The file is the system topology (more on this in a minute). The file contains information used to restrain the positions of heavy atoms (more on this later).Step Two: Examine the TopologyLet's look at what is in the output topology . Again, using a plain text editor, inspect its contents. After several comment lines (preceded by ;), you will find the following:#include ""This line calls the parameters within the OPLS-AA force field. It is at the beginning of the file, indicating that all subsequent parameters are derived from this force field. The next important line is [ moleculetype ], below which you will find ; Name nrexclProtein_A 3The name "Protein_A" defines the molecule name, based on the fact that the protein was labeled as chain A in the PDB file. There are 3 exclusions for bonded neighbors. More information on exclusions can be found in the GROMACS manual; a discussion of this information is beyond the scope of this tutorial.The next section defines the [ atoms ] in the protein. The information is presented as columns:[ atoms ]; nr type resnr residue atom cgnr charge mass typeB chargeB massB; residue 1 LYS rtp LYSH q +1 opls_287 1 LYS N 1 ; qtot2 opls_290 1 LYS H1 1 ; qtot3 opls_290 1 LYS H2 1 ; qtot4 opls_290 1 LYS H3 1 ; qtot5 opls_293B 1 LYS CA 1 ; qtot6 opls_140 1 LYS HA 1 ; qtot 1The interpretation of this information is as follows:nr: Atom numbertype: Atom typeresnr: Amino acid residue numberresidue: The amino acid residue nameNote that this residue was "LYS" in the PDB file; the use of .rtp entry "LYSH"indicates that the residue is protonated (the predominant state at neutralpH).atom: Atom namecgnr: Charge group numberCharge groups define units of integer charge; they aid in speeding upcalculationscharge: Self-explanatoryThe "qtot" descriptor keeps a running count of the total charge on the moleculemass: Also self-explanatorytypeB, chargeB, massB: Used for free energy perturbation (not discussed here) Subsequent sections include [ bonds ], [ pairs ], [ angles ], and [ dihedrals ]. Some of these sections are self-explanatory (bonds, angles, and dihedrals). The parameters and function types associated with these sections are elaborated on in Chapter 5 of the GROMACS manual. Special 1-4 interactions are included under "pairs" (section of the GROMACS manual).The remainder of the file involves defining a few other useful/necessary topologies, starting with position restraints. The "" file was generated by pdb2gmx; it defines a force constant used to keep atoms in place during equilibration (more on this later).; Include Position restraint file#ifdef POSRES#include ""#endifThis ends the "Protein_A" moleculetype definition. The remainder of the topology file is dedicated to defining other molecules and providing system-level descriptions. The next moleculetype (by default) is the solvent, in this case SPC/E water. Other typical choices for water include SPC, TIP3P, and TIP4P. We chose this by passing "-water spce" to pdb2gmx. For an excellent summary of the many different water models, click here, but be aware that not all of these models are present within GROMACS.; Include water topology#include ""#ifdef POSRES_WATER; Position restraint for each water oxygen[ position_restraints ]; i funct fcx fcy fcz1 1 1000 1000 1000#endifAs you can see, water can also be position-restrained, using a force constant (kpr) of 1000 kJ mol-1nm-2.Ion parameters are included next:; Include generic topology for ions#include ""Finally come system-level definitions. The [ system ] directive gives the name of the system that will be written to output files during the simulation. The [ molecules ] directive lists all of the molecules in the system.[ system ]; NameLYSOZYME[ molecules ]; Compound #molsProtein_A 1A few key notes about the [ molecules ] directive:1.The order of the listed molecules must exactly match the order of themolecules in the coordinate (in this case, .gro) file.2.The names listed must match the [ moleculetype ] name for each species,not residue names or anything else.If you fail to satisfy these concrete requirements at any time, you will get fatal errors from grompp (discussed later) about mismatched names, molecules not being found, or a number of others.Now that we have examined the contents of a topology file, we can continue building our system.Step Three: Defining the Unit Cell & Adding SolventNow that you are familiar with the contents of the GROMACS topology, it is time to continue building our system. In this example, we are going to be simulating a simple aqueous system. It is possible to simulate proteins and other molecules in different solvents, provided that good parameters are available for all species involved. There are two steps to defining the box and filling it with solvent:1.Define the box dimensions using editconf.2.Fill the box with water using genbox.You are now presented with a choice as to how to treat the unit cell. For the purpose of this tutorial, we will use a simple cubic box as the unit cell. As you becomemore comfortable with periodic boundary conditions and box types, I highly recommend the rhombic dodecahedron, as its volume is ~71% of the cubic box of the same periodic distance, thus saving on the number of water molecules that need to be added to solvate the protein.Let's define the box using editconf:editconf-f-o -c -d -bt cubicThe above command centers the protein in the box (-c), and places it at least nm from the box edge (-d. The box type is defined as a cube (-bt cubic). The distance to the edge of the box is an important parameter. Since we will be using periodic boundary conditions, we must satisfy the minimum image convention. That is, a protein should never see its periodic image, otherwise the forces calculated will be spurious. Specifying a solute-box distance of nm will mean that there are at least nm between any two periodic images of a protein. This distance will be sufficient for just about any cutoff scheme commonly used in simulations.Now that we have defined a box, we can fill it with solvent (water). Solvation is accomplished using genbox:genbox -cp-cs-o -pThe configuration of the protein (-cp) is contained in the output of the previous editconf step, and the configuration of the solvent (-cs) is part of the standard GROMACS installation. We are using , which is a generic equilibrated 3-point solvent model. You can use as the solvent configuration for SPC, SPC/E, or TIP3P water, since they are all three-point water models. The output is called , and we tell genboxthe name of the topology file so it can be modified. Note the changes to the [ molecules ] directive of :[ molecules ]; Compound #molsProtein_A 1SOL 10832What genbox has done is keep track of how many water molecules it has added, which it then writes to your topology to reflect the changes that have been made. Note that if you use any other (non-water) solvent, genbox will not make these changes to your topology! Its compatibility with updating water molecules is hard-coded.Step Four: Adding IonsWe now have a solvated system that contains a charged protein. The output of pdb2gmx told us that the protein has a net charge of +8e (based on its amino acid composition). If you missed this information in the pdb2gmx output, look at the last line of your [ atoms ] directive in ; it should read (in part) "qtot 8." Since life does not exist at a net charge, we must add ions to our system.The tool for adding ions within GROMACS is called genion. What genion does is read through the topology and replace water molecules with the ions that the user specifies. The input is called a run input file, which has an extension of .tpr; this file is produced by the GROMACS tool grompp (GROMACS pre-processor), which will also be used later when we run our first simulation. What grompp does is process the coordinate file and topology (which describes the molecules) to generate anatomic-level input (.tpr). The .tpr file contains all the parameters for all of the atoms in the system.To produce a .tpr file with grompp, we will need an additional input file, with the extension .mdp (molecular dynamics parameter file); grompp will assemble the parameters specified in the .mdp file with the coordinates and topology information to generate a .tpr file.An .mdp file is normally used to run energy minimization or an MD simulation, but in this case is simply used to generate an atomic description of the system. An example .mdp file (the one we will use) can be downloaded here.In reality, the .mdp file used at this step can contain any legitimate combination of parameters. I typically use an energy-minimization script, because they are very basic and do not involve any complicated parameter combinations.Assemble your .tpr file with the following:grompp -f -c-p -oNow we have an atomic-level description of our system in the binary file . We will pass this file to genion:genion -s-o-p -pname NA-nname CL-nn 8When prompted, choose group 13 "SOL" for embedding ions. You do not want to replace parts of your protein with ions.In the genion command, we provide the structure/state file (-s) as input, generate a .gro file as output (-o), process the topology (-p) to reflect the removal of water molecules and addition of ions, define positive and negative ion names (-pname and-nname, respectively), and tell genion to add only the ions necessary to neutralize the net charge on the protein by adding the correct number of negative ions (-nn 8). You could also use genion to add a specified concentration of ions in addition to simply neutralizing the system by specifying the -neutral and -conc options in conjunction. Refer to the genion man page for information on how to use these options. The names of the ions specified with -pname and -nname were force field-specific in previous versions of GROMACS, but have been standardized as of version . The specified atom names are always the elemental symbol in all capital letters, along with the [ moleculetype ]. Residue names may or may not append the sign of the charge (+/-). Refer to for proper nomenclature if you encounter difficulties.Your [ molecules ] directive should now look like:[ molecules ]; Compound #molsProtein_A 1SOL 10824CL 8Step Five: Energy MinimizationThe solvated, electroneutral system is now assembled. Before we can begin dynamics, we must ensure that the system has no steric clashes or inappropriate geometry. The structure is relaxed through a process called energy minimization (EM).The process for EM is much like the addition of ions. We are once again going to use grompp to assemble the structure, topology, and simulation parameters into a binary input file (.tpr), but this time, instead of passing the .tpr to genion, we will run the energy minimization through the GROMACS MD engine, mdrun.Assemble the binary input using grompp using this input parameter file: grompp -f -c -p-oMake sure you have been updating your file when running genbox and genion, or else you will get lots of nasty error messages ("number of coordinates in coordinate file does not match topology," etc).We are now ready to invoke mdrun to carry out the EM:mdrun -v-deffnm emThe -v flag is for the impatient: it makes mdrun verbose, such that it prints its progress to the screen at every step. The -deffnm flag will define the file names of the input and output. So, if you did not name your grompp output "," you will have to explicitly specify its name with the mdrun -s flag. In our case, we will get the following files:: ASCII-text log file of the EM process: Binary energy file: Binary full-precision trajectory: Energy-minimized structureThere are two very important factors to evaluate to determine if EM was successful. The first is the potential energy (printed at the end of the EM process, even without-v). Epot should be negative, and (for a simple protein in water) on the order of 105-106, depending on the system size and number of water molecules. The second important feature is the maximum force, Fmax, the target for which was set in - "emtol = " - indicating a target Fmax of no greater than 1000 kJ mol-1nm-1. It is possible to arrive at a reasonable Epot with Fmax > emtol. If this happens, your system may not be stable enough for simulation. Evaluate why it may be happening, and perhaps change your minimization parameters (integrator, emstep, etc).Let's do a bit of analysis. The file contains all of the energy terms that GROMACS collects during EM. You can analyze any .edr file using the GROMACS tools g_energy: g_energy -f-oAt the prompt, type "10 0" to select Potential (10); zero (0) terminates input. You will be shown the average of Epot, and a file called "" will be written. To plot this data, you will need the Xmgrace plotting tool. The resulting plot should look something like this, demonstrating the nice, steady convergence of Epot:Now that our system is at an energy minimum, we can begin real dynamics.Step Six: EquilibrationEM ensured that we have a reasonable starting structure, in terms of geometry and solvent orientation. To begin real dynamics, we must equilibrate the solvent and ions around the protein. If we were to attempt unrestrained dynamics at this point, the system may collapse. The reason is that the solvent is mostly optimized within itself, and not necessarily with the solute. It needs to be brought to the temperaturewe wish to simulate and establish the proper orientation about the solute (the protein). After we arrive at the correct temperature (based on kinetic energies), we will apply pressure to the system until it reaches the proper density. Remember that file that pdb2gmx generated a long time ago? We're going to use it now. The purpose of is to apply a position restraining force on the heavy atoms of the protein (anything that is not a hydrogen). Movement is permitted, but only after overcoming a substantial energy penalty. The utility of position restraints is that they allow us to equilibrate our solvent around our protein, without the added variable of structural changes in the protein.Equilibration is often conducted in two phases. The first phase is conducted under an NVT ensemble (constant Number of particles, Volume, and Temperature). This ensemble is also referred to as "isothermal-isochoric" or "canonical." The timeframe for such a procedure is dependent upon the contents of the system, but in NVT, the temperature of the system should reach a plateau at the desired value. If the temperature has not yet stabilized, additional time will be required. Typically, 50-100 ps should suffice, and we will conduct a 100-ps NVT equilibration for this exercise. Depending on your machine, this may take a while (just over an hour on a dual-core MacBook). Get the .mdp file here.We will call grompp and mdrun just as we did at the EM step:grompp -f-c -p -omdrun -deffnm nvtA full explanation of the parameters used can be found in the GROMACS manual, in addition to the comments provided. Take note of a few parameters in the .mdp file:gen_vel = yes: Initiates velocity generation. Using different random seeds (gen_seed) gives different initial velocities, and thus multiple (different) simulations can be conducted from the same starting structure.tcoupl = V-rescale: The velocity rescaling thermostat is an improvement upon the Berendsen weak coupling method, which did not reproduce a correct kinetic ensemble.pcoupl = no: Pressure coupling is not applied.Let's analyze the temperature progression, again using g_energy:g_energy -fType "15 0" at the prompt to select the temperature of the system and exit. The resulting plot should look something like the following:From the plot, it is clear that the temperature of the system quickly reaches the target value (300 K), and remains stable over the remainder of the equilibration. For this system, a shorter equilibration period (on the order of 50 ps) may have been adequate.Step Seven: Equilibration, Part 2The previous step, NVT equilibration, stabilized the temperature of the system. Prior to data collection, we must also stabilize the pressure (and thus also the density) of the system. Equilibration of pressure is conducted under an NPT ensemble, wherein the Number of particles, Pressure, and Temperature are allconstant. The ensemble is also called the "isothermal-isobaric" ensemble, and most closely resembles experimental conditions.The .mdp file used for a 100-ps NPT equilibration can be found here. It is not drastically different from the parameter file used for NVT equilibration. Note the addition of the pressure coupling section, using the Parrinello-Rahman barostat.A few other changes:continuation = yes: We are continuing the simulation from the NVT equilibration phasegen_vel = no: Velocities are read from the trajectory (see below)We will call grompp and mdrun just as we did for NVT equilibration. Note that we are now including the -t flag to include the checkpoint file from the NVT equilibration; this file contains all the necessary state variables to continue our simulation. To conserve the velocities produced during NVT, we must include this file. The coordinate file (-c) is the final output of the NVT simulation.grompp -f -c-t-p -omdrun -deffnm nptLet's analyze the pressure progression, again using g_energy:g_energy -f -oType "16 0" at the prompt to select the pressure of the system and exit. The resulting plot should look something like the following:The pressure value fluctuates widely over the course of the 100-ps equilibration phase, but this behavior is not unexpected. The running average of these data are plotted as the red line in the plot. Over the course of the equilibration, the average value of the pressure is bar.Let's take a look at density as well, this time using g_energy and entering "22 0" at the prompt.g_energy -f -oAs with the pressure, the running average of the density is also plotted in red. The average value over the course of 100 ps is kg m-3, close to the experimental value of 1000 kg m-3. The parameters for the SPC/E water model closely replicate experimental values for water. The density values are very stable over time, indicating that the system is well-equilibrated now with respect to pressure and density.Step Eight: Production MDUpon completion of the two equilibration phases, the system is now well-equilibrated at the desired temperature and pressure. We are now ready to release the position restraints and run production MD for data collection. The process is just like we have seen before, as we will make use of the checkpoint file (which in this case now contains preserve pressure coupling information) to grompp. We will run a 1-ns MD simulation, the script for which can be found here.grompp -f -c -t -p -omdrun -deffnm md_0_1The mdrun step is definitely best run in parallel on a cluster, since it will take several hours to complete. The proper command for executing such a job would be: mpirun -np X mdrun_mpi -deffnm md_0_1where X indicates the number of processors to be used for the simulation. Near the end of the grompp output, you should have seen a line like:Estimate for the relative computational load of the PME mesh part:The estimate for PME load will dictate how many processors should be dedicated to the PME calculation, and how many for the PP calculations. Refer to the GROMACS 4 publication and the manual for details. For a cubic box, the optimal setup will have a PME load of (3:1 PP:PME - we're in luck!); for a dodecahedral box, the optimal PME load is (2:1 PP:PME). When executing mdrun, the program should automatically determine the best number of processors to assign for the PP and PME calculations. Thus, make sure you indicate an appropriate number of nodes for your calculation (the value of -np X), so that you can get the best performance. For this system, I achieved roughly 14 ns/day performance on 24 CPU's (18 PP, 6 PME).Step Nine: AnalysisNow that we have simulated our protein, we should run some analysis on the system. What types of data are important? This is an important question to ask before running the simulation, so you should have some ideas about the types of data you will wantto collect in your own systems. For this tutorial, a few basic tools will be introduced.The first is trjconv, which is used as a post-processing tool to strip out coordinates, correct for periodicity, or manually alter the trajectory (time units, frame frequency, etc). For this exercise, we will use trjconv to account for any periodicity in the system. The protein will diffuse through the unit cell, and may appear to "jump" across to the other side of the box. To account for such actions, issue the following:trjconv -s-f -o -pbc mol-ur compactSelect 0 ("System") for output. We will conduct all our analyses on this "corrected" trajectory. Let's look at structural stability first. GROMACS has a built-in utility for RMSD calculations called g_rms. To use g_rms, issue this command:g_rms -s-f-o -tu nsChoose 4 ("Backbone") for both the least squares fit and the group for RMSD calculation. The -tu flag will output the results in terms of ns, even though the trajectory was written in ps. This is done for clarity of the output (especially if you have a long simulation - 1e+06 ps does not look as nice as 100 ns). The output plot will show the RMSD relative to the structure present in the minimized, equilibrated system:If we wish to calculate RMSD relative to the crystal structure, we could issue the following:g_rms -s -f -o -tu nsThe result will be something like:Both plots show the RMSD levels off to ~ nm (1 Å), indicating that the structure is very stable. Subtle differences between the plots indicate that the structure at t = 0 ns is slightly different from this crystal structure. This is to be expected, since it has been energy-minimized, and because the position restraints are not 100% perfect, as discussed previously.The radius of gyration of a protein is a measure of its compactness. If a protein is stably folded, it will likely maintain a relatively steady value of Rg. If a protein unfolds, its Rg will change over time. Let's analyze the radius of gyration for lysozyme in our simulation:g_gyrate -s -f-oWe can see from the reasonably invariant Rg values that the protein remains very stable, in its compact (folded) form over the course of 1 ns at 300 K. This result is not unexpected, but illustrates an advanced capacity of GROMACS analysis that comes built-in.SummaryYou have now conducted a molecular dynamics simulation with GROMACS, and analyzed some of the results. This tutorial should not be viewed as comprehensive. There are many more types of simulations that one can conduct with GROMACS (free energy calculations, non-equilibrium MD, and normal modes analysis, just to name a few).。

gromacs分子动力学模拟方法

gromacs分子动力学模拟方法

Gromacs分子动力学模拟方法1. 引言Gromacs(Groningen Machine for Chemical Simulations)是一种常用的分子动力学模拟软件,广泛应用于生物物理、化学和材料科学领域。

分子动力学模拟是一种计算实验方法,通过模拟分子的运动来研究物质的性质和行为。

本文将介绍Gromacs分子动力学模拟方法的基本原理、应用场景以及实现步骤。

2. 基本原理Gromacs分子动力学模拟方法基于牛顿第二定律和经典力场原理,通过数值积分求解分子的运动方程。

它将分子系统看作一组粒子(原子或分子),根据粒子之间的相互作用力,计算粒子的加速度和速度,从而推导出粒子在下一个时间步长的位置。

这个过程通过以下几个步骤实现:2.1 力场参数化力场是描述分子相互作用的数学模型,包括键长、键角、二面角等参数。

在Gromacs中,常用的力场有GROMOS、AMBER和CHARMM等。

在进行分子动力学模拟之前,需要根据所研究的分子的化学结构和性质,选择合适的力场,并通过参数化过程确定力场的参数。

2.2 初始构型生成在进行分子动力学模拟之前,需要生成分子的初始构型。

常见的方法包括从实验数据或计算结果中获取分子的结构信息,或者通过分子建模软件生成分子的三维结构。

Gromacs支持多种文件格式,如PDB和GRO,用于存储分子的结构信息。

2.3 系统能量最小化在模拟开始之前,需要对系统进行能量最小化,以消除构型中的不合理接触或过度重叠。

Gromacs提供了多种能量最小化算法,如共轭梯度法和牛顿法。

在能量最小化过程中,系统中的粒子会根据力场的作用力逐渐移动,直到达到一个局部能量最小值。

2.4 模拟参数设置在进行分子动力学模拟之前,需要设置模拟的时间步长、模拟时间和模拟温度等参数。

时间步长决定了模拟的时间分辨率,一般选择在飞秒量级;模拟时间决定了模拟的总时长,需要根据研究目的和计算资源来确定;模拟温度可以通过控制系统与外界的热交换来模拟不同温度下的系统行为。

gromacs分子计算模拟流程

gromacs分子计算模拟流程

gromacs分子计算模拟流程English Answer:GROMACS Molecular Simulation Workflow.GROMACS is an open-source molecular simulation software package that is used to simulate the behavior of biomolecules, such as proteins, DNA, and lipids. The GROMACS workflow typically consists of the following steps:1. System preparation: This involves creating a molecular system that includes the molecule(s) of interest, as well as the surrounding solvent and any other necessary components. The system is typically prepared using a molecular editor, such as VMD or PyMOL.2. Force field selection: A force field is a set of equations that describe the interactions between atoms and molecules. The force field is selected based on the system being simulated.3. Simulation setup: This involves setting up the simulation parameters, such as the temperature, pressure, and time step. The simulation is also set up to output the desired data, such as the coordinates of the molecules and their energies.4. Simulation: The simulation is performed using a molecular dynamics engine. The engine integrates the equations of motion to update the positions and velocities of the molecules.5. Analysis: The simulation data is analyzed to extract information about the behavior of the system. This can be done using a variety of tools, such as VMD, PyMOL, and GROMACS utilities.Chinese Answer:GROMACS分子计算模拟流程。

gromacs使用手册

gromacs使用手册

gromacs使用手册(原创实用版)目录1.Gromacs 简介2.Gromacs 的功能3.Gromacs 的使用方法4.Gromacs 的应用实例5.Gromacs 的未来发展正文1.Gromacs 简介Gromacs 是一个开源的生物大分子模拟软件,主要用于分子动力学和势能计算。

它由 Warren P.Moran 等人开发,最早的版本于 1996 年发布。

Gromacs 名字来源于“Groningen 分子模拟器”,这反映出它的起源地——荷兰格罗宁根大学。

2.Gromacs 的功能Gromacs 具有多种功能,包括:- 分子动力学模拟:Gromacs 可以使用不同的力场进行分子动力学模拟,包括经典的力场如 CHARMM、OPLS、AMBER 等,也可以使用更高级的力场如 GB/SA、ITS/SA 等。

- 势能计算:Gromacs 可以计算生物大分子的势能,包括分子间的相互作用能、分子内能等。

- 模拟过程中的能量计算:Gromacs 可以在模拟过程中实时计算各种能量,如动能、势能、内能等。

- 轨迹文件的输出:Gromacs 可以输出轨迹文件,方便用户进行后续的分析。

- 模拟过程中的原子运动轨迹可视化:Gromacs 具有可视化工具,可以实时显示原子在模拟过程中的运动轨迹。

3.Gromacs 的使用方法Gromacs 的使用方法相对简单,用户只需要按照以下步骤进行操作:- 准备输入文件:用户需要根据需要编写输入文件,包括分子结构文件、模拟参数文件、力场文件等。

- 运行 Gromacs:用户需要在终端中输入 Gromacs 命令,指定输入文件和输出文件。

- 查看输出文件:模拟完成后,用户可以在指定的输出文件中查看模拟结果。

4.Gromacs 的应用实例Gromacs 广泛应用于生物大分子的模拟研究,包括蛋白质结构预测、药物设计、生物材料研究等。

gromacs5.x入门教程中文版,超详细

gromacs5.x入门教程中文版,超详细

GROMACS 教程第一步:准备拓扑一些GROMACS 基础知识随着版本5.0的GROMACS 的发布,所有的工具本质上是二进制名为“gmx”的模块。

这与以前的版本不同,其中每个工具都被调用为自己的命令。

在5.0中,这仍然通过符号链接工作,但是在将来版本中将会消失,所以最好习惯于新的做事方式。

要获取有关任何GROMACS 模块的帮助信息,您可以调用以下任一命令:gmx help (module)或者gmx (module) -h其中(module )被您想要发出的命令的实际名称所替代。

信息将打印到终端,包括可用的算法,选项,所需的文件格式,已知的错误和限制等。

对于GROMACS 的新用户,调用常用命令的帮助信息是了解每个命令可以执行的一个很好的方法。

现在,到有趣的东西!溶菌酶教程我们必须下载我们将要工作的蛋白质结构文件。

在本教程中,我们将使用鸡蛋白溶菌酶(PDB 代码1AKI )。

转到RCSB 网站并下载PDB 文本的晶体结构。

下载结构后,您可以使用VMD ,Chimera ,PyMOL 等查看程序来可视化结构。

一旦看到分子,您将要去除晶体水域。

使用纯文本编辑器,如vi ,emacs (Linux / Mac )或记事本(Windows )。

不要使用文字处理软件!删除与这些分子对应的行(PDB 文件中的残基“HOH”)。

注意,这种方法不是普遍适用的(即,结合的活性位点水分子的情况)。

对于我们这里的意图,我们不需要水晶。

请务必检查您的.pdb 文件中的评论MISSING 中列出的条目,因为这些条目表示不存在于晶体结构中的原子或全部残基。

终端区域可能不存在,并且可能不存在动态问题。

不完整的内部序列或任何具有缺失原子的氨基酸残基将导致pdb2gmx 失败。

这些缺失的原子/残基必须使用其他软件包进行建模。

还要注意,pdb2gmx 不是魔术。

它不能为任意分子产生拓扑,只是由力场定义的残基(在* .rtp 文件中 ­ 通常是蛋白质,核酸和非常有限量的辅因子,如NAD (H )和ATP )。

动力学系列之一-半天上手gromacs上手

动力学系列之一-半天上手gromacs上手

动力学系列之一-半天上手gromacs上手作者:dddc_redsnow 整理时间:2006-04-11在老何的鼓励下,发一下我的gromacs上手手册(我带人时用的,基本半天可以学会gromcas)####################################################### Process protein files step by step #######################################################pdb2gmx -f 2th_cap.pdb -o 2th_cap.gro -p 2th_cap.top -ignh -ternedit 2th_cap.topeditconf -f 2th_cap.gro -o 2th_cap_box.gro -d 1.5genbox -cp 2th_cap_box.gro -cs -p 2th_cap.top -o 2th_cap_water.gromake_ndx -f 2th_cap_water.gro -o 2th_cap.ndxgenpr -f 2th_cap_water.gro -n 2th_cap.ndx -o 2th_cap_All.itpgenpr -f 2th_cap_water.gro -n 2th_cap.ndx -o 2th_cap_M.itpgenpr -f 2th_cap_water.gro -n 2th_cap.ndx -o 2th_cap_C.itpnedit Flavo.itpgrompp -f em.mdp -c 2th_cap_water.gro -p 2th_cap.top -o prepare.tprgenion -s prepare.tpr -o 2th_cap_water_ion.gro -np 1 -pq 1###################################################### Minimize step by step ## 1. minimization fixing whole protein ## 2. minimization fixing maincharin of protein ## 3. minimization fixing Ca of protein ## 4. minimization without fix ######################################################grompp -np 4 -f em.mdp -c 2th_cap_water_ion.gro -p 2th_cap.top -o minimize_water.tprmpirun -np 4 mdrun -nice 0 -s minimize_water.tpr -o minimize_water.trr -c minimize_water.gro -e minimize_water.edr -g minimize_water.log &grompp -np 4 -f em.mdp -c minimize_water.gro -p 2th_cap.top -o minimize_sidechain.tprmpirun -np 4 mdrun -nice 0 -s minimize_sidechain.tpr -o minimize_sidechain.trr -c minimize_sidechain.gro -e minimize_sidechain.edr -g minimize_sidechain.log &grompp -np 4 -f em.mdp -c minimize_sidechain.gro -p 2th_cap.top -o minimize_sidechain_ex.tpr mpirun -np 4 mdrun -nice 0 -s minimize_sidechain_ex.tpr -o minimize_sidechain_ex.trr -c minimize_sidechain_ex.gro -e minimize_sidechain_ex.edr minimize_sidechain_ex.log &grompp -np 4 -f em.mdp -c minimize_sidechain_ex.gro -p 2th_cap.top -o minimize_all.tpr mpirun -np 4 mdrun -nice 0 -s minimize_all.tpr -o minimize_all.trr -c minimize_all.gro -e minimize_allx.edr -g minimize_all.log&editconf -f minimize_all.gro -o minimize_all.pdb###################################################### Raise temperature step by step ## 1. raise from 0K to 100K fixing whole protein ## 2. raise from 100K to 200K fixing whole protein ## 3. raise from 200K to 300K fixing whole protein ## 4. balance fixing maincharin of protein ## 5. balance fixing Ca of protein ######################################################grompp -np 4 -f heat.mdp -c minimize_all.gro -p 2th_cap.top -o temperature100K.tprmpirun -np 4 mdrun -nice 0 -s temperature100K.tpr -o temperature100K.trr -c temperature100K.gro -e temperature100K.edr -g temperature100K.log &grompp -np 4 -f heat.mdp -c temperature100K.gro -p 2th_cap.top -o temperature200K.tprmpirun -np 4 mdrun -nice 0 -s temperature200K.tpr -o temperature200K.trr -c temperature200K.gro -e temperature200K.edr -g temperature200K.log &grompp -np 4 -f heat.mdp -c temperature200K.gro -p 2th_cap.top -o temperature300K.tprmpirun -np 4 mdrun -nice 0 -s temperature300K.tpr -o temperature300K.trr -c temperature300K.gro -e temperature300K.edr -g temperature300K.log &g_energy -f temperature300K.edr -s temperature300K.tpr -o temperature300K.xvggrompp -np 4 -f heat.mdp -c temperature300K.gro -p 2th_cap.top -o T300K_M.tprmpirun -np 4 mdrun -nice 0 -s T300K_M.tpr -o T300K_M.trr -c T300K_M.gro -e T300K_M.edr -g T300K_M.log &grompp -np 4 -f heat.mdp -c T300K_M.gro -p 2th_cap.top -o T300K_Ca.tprmpirun -np 4 mdrun -nice 0 -s T300K_Ca.tpr -o T300K_Ca.trr -c T300K_Ca.gro -e T300K_Ca.edr -g T300K_Ca.log &grompp -np 4 -f heat.mdp -c T300K_Ca.gro -p 2th_cap.top -o T300K_MD.tprmpirun -np 4 mdrun -nice 0 -s T300K_MD.tpr -o T300K_MD.trr -c T300K_MD.gro -e T300K_MD.edr -g T300K_MD.log &至此,全部搞定,可以跑了。

K20 在65%DMSO 溶液中的Gromacs 模拟流程

K20 在65%DMSO 溶液中的Gromacs 模拟流程

K20在65%DMSO溶液中的Gromacs模拟流程1.将PDB文件转换成相应的GRO文件,选择pH10.0条件pdb2gmx_mpi-f K20.pdb-o K20.gro-p K20.top-i K20.itp-water spc-ignh-ss-ter-lys-asp-glu-his-missing 构建模拟盒子,选择距离蛋白质表面为3.0nm,采用十二面体模拟盒子;先向盒子中添加DMSO分子;之前要按分子比例计算出要添加DMSO的分子数:(阿伏伽德罗常数)6.02Х110(DMSO密度)Х657.579(在所见盒子里面所占的体积65%-1011.66)Х10-27/78(DMSO分子量)Х10-3editconf_mpi-f K20.gro-o K20_box.gro-d3.0-bt dodecahedron(首先删除体系中原有水分子)。

添加DMSO分子数genbox_mpi-cp K20_box.gro–cs DMSO.gro-o K20_sol.gro-p K20.top改变d的大小以调整DMSO的分子数合适计算的DMSO分子数editconf_mpi–f DMSO.gro–o DMSO.gro–d0.18genbox_mpi-cp K20_box.gro–cs DMSO.gro-o K20_sol.gro-p K20.top添加H2O分子genbox_mpi-cp K20_sol.gro–cs-o K20_sol2.gro-p K20.top更改K20_sol2.gro和K20.top和DMSO.gro和DMSO.itp,以符合加了DMSO分子(如图1);添加离子,中和电荷(电荷数根据实际情况添加)图1文件内容的更改grompp_mpi-f em.mdp-c K20_sol2.gro-p K20.top-o K20_sol2.tprgenion_mpi-s K20_sol2.tpr-o K20_sol3.gro-g K20_genion.log-p K20.top-np10-pq12.采用最速下降法降低体系能量,优化过程中采用了PME。

GROMACS使用教程要点

GROMACS使用教程要点

GROMACS教程一Gromacs基本模拟流程 (3)1 下载pdb文件 (3)2 用pdb2gmx 处理pdb 文件 (3)3 建立盒子 (3)5 设置能量最小化 (4)6 用grompp程序进行文件处理 (6)7 使用genion 和tpr文件添加离子 (6)8 用fws_ion.pdb来产生能量最小化的输入文件 (6)9 在后台运行能量最小化(在命令后加&) (7)二设置位置限制性动力学模拟 (7)三设置非限制性动力学模拟 (9)1 如何重启一个计算 (11)2 如何延长一个计算 (11)3 如何设置并行计算 (11)五模拟结果分析 (12)1 如何将特定帧的轨迹保存成*.pdb文件 (12)2 用ngmx观察轨迹文件(也可以用VMD观察轨迹文件) (12)3 比较常用的分析工具 (14)3.3 g_covar 计算斜方差 (16)3.4 g_energy 能量数据作图,如压力、体积、密度等 (16)3.5 g_gyrate 测量回旋半径 (17)3.6 g_rms 与g_rmsdist 计算结构的RMSD 值 (17)3.7 g_rmsf 计算原子位置的根均方波动(rmsf ) (18)3.8 do_dssp 计算模型的二级结构 (20)3.9 g_hbond 计算模拟过程中分子间的氢键的数目、距离或角度 (21)3.10 g_saltbr 分析模拟中残基间的盐桥 (21)GROMACS 是一个使用经典分子动力学理论研究蛋白质动力学的高端的高效的工具。

GROMACS是遵守GNU许可的免费软件,可以从以下站点下载:,并且可以在linux和Windows上使用。

在本教程中,将研究一个从漏斗形蜘蛛的毒液中分离的毒素。

我们将使用显性溶剂动力学的方法来进行研究。

首先比较真空中和溶解的模型。

我们将把毒素肽溶在水盒子里,紧接着用牛顿运动定律加以平衡。

我们还将比较偿离子在显性溶剂动力学中的影响。

gromacs计算方法

gromacs计算方法

gromacs计算方法一、简介GROMACS是一款广泛应用于生物分子模拟和分子动力学研究的软件包。

它基于Molecular Dynamics(分子动力学)原理,通过模拟分子的运动来研究其结构和性质。

GROMACS适用于各种生物分子,如蛋白质、核酸、脂质等,并支持多种常用的生物分子模拟软件库。

二、软件安装与配置1. 下载并安装GROMACS软件包,确保正确配置环境变量。

2. 准备模拟系统文件,包括原子坐标、力场参数、初始构型等。

3. 确保系统文件符合GROMACS支持的文件格式要求。

三、分子动力学模拟1. 导入系统文件到GROMACS中,设置初始参数和时间步长。

2. 运行分子动力学模拟,观察系统的运动变化。

3. 分析模拟结果,包括能量、构型、键长等。

四、常用命令与参数1. `mdrun`:主运行命令,用于启动模拟。

* `-s` 指定系统文件路径。

* `-dt` 设置时间步长。

* `-potential` 指定力场参数。

* `-append` 用于将模拟结果追加到已有文件。

2. `analyze`:分析模拟结果,包括能量、构型等。

* `energy`:计算系统能量。

* `rmsdist`:计算指定原子间的平均距离。

3. `modify`:修改系统参数,如温度和压力等。

* `unpair`:删除两个原子间的非键相互作用。

* `unit kg`:将系统单位设为质量单位。

4. `plot`:生成模拟结果的图表。

* `friction=0.05 dt`:生成时间步长与摩擦力的关系图。

五、输出文件与数据处理1. GROMACS生成的主要输出文件包括.log、.out和.tpr等。

* .log文件记录模拟过程中的日志信息。

* .out文件包含每个时间步长的详细信息,如能量、构型等。

* .tpr文件记录了整个模拟过程的运动轨迹。

2. 常用的数据处理工具包括Excel、Matplotlib和VMD等。

* Excel:用于数据整理和统计分析。

GROMACS使用教程要点

GROMACS使用教程要点

GROMACS教程一Gromacs基本模拟流程 (3)1 下载pdb文件 (3)2 用pdb2gmx 处理pdb 文件 (3)3 建立盒子 (3)5 设置能量最小化 (4)6 用grompp程序进行文件处理 (6)7 使用genion 和tpr文件添加离子 (6)8 用fws_ion.pdb来产生能量最小化的输入文件 (6)9 在后台运行能量最小化(在命令后加&) (7)二设置位置限制性动力学模拟 (7)三设置非限制性动力学模拟 (9)1 如何重启一个计算 (11)2 如何延长一个计算 (11)3 如何设置并行计算 (11)五模拟结果分析........................................................................................................121 如何将特定帧的轨迹保存成*.pdb文件 (12)2 用ngmx观察轨迹文件(也可以用VMD观察轨迹文件) (12)3 比较常用的分析工具 (14)3.3 g_covar 计算斜方差 (16)3.4 g_energy 能量数据作图,如压力、体积、密度等 (16)3.5 g_gyrate 测量回旋半径 (17)3.6 g_rms 与g_rmsdist 计算结构的RMSD 值 (17)3.7 g_rmsf 计算原子位置的根均方波动(rmsf ) (18)3.8 do_dssp 计算模型的二级结构 (20)3.9 g_hbond 计算模拟过程中分子间的氢键的数目、距离或角度 (21)3.10 g_saltbr 分析模拟中残基间的盐桥 (21)1GROMACS 是一个使用经典分子动力学理论研究蛋白质动力学的高端的高效的工具。

GROMACS是遵守GNU许可的免费软件,可以从以下站点下载:,并且可以在linux和Windows上使用。

在本教程中,将研究一个从漏斗形蜘蛛的毒液中分离的毒素。

生物信息学中的基因组序列分析工具使用指南

生物信息学中的基因组序列分析工具使用指南

生物信息学中的基因组序列分析工具使用指南随着高通量测序技术的发展,大量的基因组序列数据被不断产生。

为了从这些序列数据中获取有用的信息,生物学家们需要利用生物信息学工具对基因组序列进行分析。

本文将为您提供生物信息学中常用的基因组序列分析工具的使用指南。

一、BLAST(Basic Local Alignment Search Tool)BLAST是一种用于序列比对的常用工具。

它能够通过比对查询序列与已知序列数据库中的序列,来找到相似的序列并进行注释。

以下是使用BLAST的基本步骤:1. 准备查询序列:将待比对的查询序列保存为文本文件的形式,可以是单个序列或多个序列。

2. 选择BLAST程序:根据不同的比对目的,选择合适的BLAST程序,如blastn用于核酸与核酸的比对,blastp用于蛋白质与蛋白质的比对。

3. 选择数据库:根据需求选择适合的数据库,如NCBI核酸数据库(nt)或非冗余蛋白质数据库(nr)等。

4. 运行BLAST:使用命令行界面或图形界面,输入相应的参数,运行BLAST程序。

5. 分析结果:根据比对结果,分析相似序列的特征、功能等信息。

二、MAFFT(Multiple Alignment using Fast Fourier Transform)MAFFT是一种用于多序列比对的工具,能够同时比对多个序列,识别共有的区域,并预测不同序列间的变异位置。

以下是使用MAFFT 的基本步骤:1. 准备序列:将待比对的序列保存为文本文件的形式,可以是核酸序列或蛋白质序列。

2. 运行MAFFT:使用命令行界面,输入相应的参数,运行MAFFT 程序。

3. 分析比对结果:根据比对结果,分析序列间的共有区域和变异位置,推断序列的进化关系或寻找保守结构。

三、MEME(Multiple EM for Motif Elicitation)MEME是一种用于寻找DNA、RNA或蛋白质序列中共有模体(motif)的工具。

  1. 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
  2. 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
  3. 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。

GROMACS TutorialLysozyme in WaterJustin LemkulDepartment of Biochemistry, Virginia TechThis example will guide a new user through the process of setting up a simulation system containing a protein (lysozyme) in a box of water, with ions. Each step will contain an explanation of input and output, using typical settings for general use.This tutorial assumes you are using a GROMACS version in the 4.5.x series.GROMACS TutorialStep One: Prepare the TopologyWe must d ownload the protein structure file we will be working with. For this tutorial, we will utilize hen egg white lysozyme (PDB code 1AKI). Go to the RCSB website and d ownload the PDB text for the crystal structure.Once you have d ownl oaded the structure, you can visualize the structure using a viewing program such as VMD, Chimera, PyMOL, etc. Once you've had a look at the molecule, you are going to want to strip out the crystal waters. Use a plain text editor like vi, emacs (Linux/Mac), or Notepad (Wind ows). Do not use word processing software! Delete the lines corresponding to these molecules (residue "HOH" in the PDB file). Note that such a procedure is not universally appropriate (i.e., the case of a bound active site water molecul e). For our intentions here, we do not need crystal water.Always check your .pdb fil e for entries listed under the comment MISSING, as these entries indicate either atoms or whole residues that are not present in the crystal structure. Terminal regions may be absent, and may not present a probl em for dynamics. Incompl ete internal sequences or any amino acid residues that have missing atoms will cause pdb2gmx to fail. These missing atoms/residues must be modeled in using other software packages. Also note that pdb2gmx is not magic. It cannot generate topologies for arbitrary molecul es, just the residues defined by the force fiel d (in the *.rtp fil es - generally proteins, nucl eic acids, and a very finite amount of cofactors, like NAD(H) and ATP).Now that the crystal waters are gone and we have verified that all the necessary atoms are present, the PDB file shoul d contain only protein atoms, and is ready to be input into the first GROMACS tool, pdb2gmx. The purpose of pdb2gmx is to generate three files:1.The topol ogy for the molecule.2. A position restraint file.3. A post-processed structure file.The topol ogy (topol.top by default) contains all the information necessary to define the mol ecule within a simulation. This information includ es nonbonded parameters (atom types and charges) as well as bond ed parameters (bonds, angles, and dihedrals). We will take a more detail ed l ook at the topol ogy once it has been generated.Execute pdb2gmx by issuing the following command:pdb2gmx -f 1AKI.pdb-o 1AKI_processed.gro-water spceThe structure will be processed by pdb2gmx, and you will be prompted to choose a force field:Select the Force Field:From '/usr/l ocal/gromacs/share/gromacs/top':1: AMBER03 force fiel d (Duan et al., J. Comp. Chem. 24, 1999-2012, 2003)2: AMBER94 force fiel d (Cornell et al., JACS 117, 5179-5197, 1995)3: AMBER96 force fiel d (Kollman et al., Acc. Chem. Res. 29, 461-469, 1996)4: AMBER99 force fiel d (Wang et al., J. Comp. Chem. 21, 1049-1074, 2000)5: AMBER99SB force field (Hornak et al., Proteins 65, 712-725, 2006)6: AMBER99SB-ILDN force field (Lind orff-Larsen et al., Proteins 78, 1950-58, 2010)7: AMBERGS force fiel d (Garcia & Sanbonmatsu, PNAS 99, 2782-2787, 2002)8: CHARMM27 all-atom force fiel d (with CMAP) - version 2.09: GROMOS96 43a1 force field10: GROMOS96 43a2 force field (improved alkane dihedrals)11: GROMOS96 45a3 force field (Schul er JCC 2001 22 1205)12: GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)13: GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)14: OPLS-AA/L all-atom force fiel d (2001 aminoacid dihedrals)15: [DEPRECATED] Encad all-atom force field, using full solvent charges16: [DEPRECATED] Encad all-atom force field, using scaled-d own vacuum charges17: [DEPRECATED] Gromacs force field (see manual)18: [DEPRECATED] Gromacs force field with hydrogens for NMRThe force field will contain the information that will be written to the topology. This is a very important choice! You should always read thoroughly about each force fiel d and d ecide which is most applicabl e to your situation. For this tutorial, we will use the all-atom OPLS force field, so type 14 at the command prompt, foll owed by 'Enter'.There are many other options that can be passed to pdb2gmx. Some are listed here:-ignh: Ignore H atoms in the PDB fil e; especially useful for NMR structures. Otherwise, if H atoms are present, they must be in the correct order and named exactly how GROMACS expects them to be.-ter: Interactively assign charge states for N- and C-termini.-inter: Interactively assign charge states for Glu, Asp, Lys, Arg, and His; assign disulfides to Cys.You have now generated three new fil es: 1AKI_processed.gro, topol.top, and posre.itp. 1AKI_processed.gro is a GROMACS-formatted structure file that contains all the atoms d efined within the force field (i.e., H atoms have been ad ded to the amino acids in the protein). The topol.top fil e is the system topol ogy (more on this in a minute). The posre.itp file contains information used to restrain the positions of heavy atoms (more on this later).Step Two: Examine the TopologyLet's l ook at what is in the output topol ogy (topol.top). Again, using a plain text editor, inspect its contents. After several comment lines (preceded by ;), you will find the foll owing:#include "oplsaa.ff/forcefield.itp"This line calls the parameters within the OPLS-AA force field. It is at the beginning of the file, indicating that all subsequent parameters are derived from this force fiel d. The next important line is [ moleculetype ], bel ow which you will find; Name nrexclProtein_A 3The name "Protein_A" d efines the mol ecul e name, based on the fact that the protein was labeled as chain A in the PDB file. There are 3 exclusions for bonded neighbors. More information on exclusions can be found in the GROMACS manual; a discussion of this information is beyond the scope of this tutorial.The next section d efines the [ atoms ] in the protein. The information is presented as columns:[ atoms ]; nr type resnr residue atom cgnr charge mass typeB chargeB massB; residue 1 LYS rtp LYSH q +2.01 opls_287 1 LYS N 1 -0.3 14.0067 ; qtot -0.32 opls_290 1 LYS H1 1 0.33 1.008 ; qtot 0.033 opls_290 1 LYS H2 1 0.33 1.008 ; qtot 0.364 opls_290 1 LYS H3 1 0.33 1.008 ; qtot 0.695 opls_293B 1 LYS CA 1 0.25 12.011 ; qtot 0.946 opls_140 1 LYS HA 1 0.06 1.008 ; qtot 1The interpretation of this information is as foll ows:•nr: Atom number•type: Atom type•resnr: Amino acid residue number•residue: The amino acid residue nameNote that this residue was "LYS" in the PDB file; the use of .rtp entry "LYSH" indicates that the residue is protonated (the predominant state at neutral pH).•atom: Atom name•cgnr: Charge group numberCharge groups d efine units of integer charge; they aid in speeding up calculations•charge: Self-explanatoryThe "qtot" descriptor keeps a running count of the total charge on the molecule•mass: Also self-explanatory•typeB, chargeB, massB: Used for free energy perturbation (not discussed here)Subsequent sections includ e [ bonds ], [ pairs ], [ angles ], and [ dihedrals ]. Some of these sections are self-explanatory (bonds, angles, and dihedrals). The parameters and function types associated with these sections are elaborated on in Chapter 5 of the GROMACS manual. Special 1-4 interactions are included under "pairs" (section 5.3.4 of the GROMACS manual).The remainder of the fil e involves defining a few other useful/necessary topologies, starting with position restraints. The "posre.itp" file was generated by pdb2gmx; it defines a force constant used to keep atoms in place during equilibration (more on this later).; Include Position restraint file#ifdef POSRES#include "posre.itp"#endifThis ends the "Protein_A" moleculetype d efinition. The remaind er of the topol ogy file is dedicated to defining other molecules and providing system-level descriptions. The next mol eculetype (by default) is the solvent, in this case SPC/E water. Other typical choices for water include SPC, TIP3P, and TIP4P. We chose this by passing "-water spce" to pdb2gmx. For an excellent summary of the many different water models, click here, but be aware that not all of these models are present within GROMACS.; Include water topology#include "oplsaa.ff/spce.itp"#ifdef POSRES_WATER; Position restraint for each water oxygen[ position_restraints ]; i funct fcx fcy fcz1 1 1000 1000 1000#endifAs you can see, water can also be position-restrained, using a force constant (kpr) of 1000 kJ mol-1 nm-2.Ion parameters are includ ed next:; Include generic topology for ions#include "oplsaa.ff/ions.itp"Finally come system-level definitions. The [ system ] directive gives the name of the system that will be written to output files during the simulation. The [ molecul es ] directive lists all of the mol ecul es in the system.[ system ]; NameLYSOZYME[ molecules ]; Compound #molsProtein_A 1A few key notes about the [ molecules ] directive:1.The order of the listed mol ecules must exactly match the ord er of the mol ecul es in thecoordinate (in this case, .gro) file.2.The names listed must match the [ moleculetype ] name for each species, not residuenames or anything else.If you fail to satisfy these concrete requirements at any time, you will get fatal errors from grompp (discussed later) about mismatched names, mol ecules not being found, or a number of others.Now that we have examined the contents of a topol ogy file, we can continue building our system.Step Three: Defining the Unit Cell & Adding SolventNow that you are familiar with the contents of the GROMACS topology, it is time to continue building our system. In this example, we are going to be simulating a simple aqueous system. It is possibl e to simulate proteins and other mol ecul es in different solvents, provid ed that good parameters are availabl e for all species involved.There are two steps to defining the box and filling it with solvent:1.Define the box dimensions using editconf.2.Fill the box with water using genbox.You are now presented with a choice as to how to treat the unit cell. For the purpose of this tutorial, we will use a simpl e cubic box as the unit cell. As you become more comfortable with periodic boundary conditions and box types, I highly recommend the rhombic dodecahedron, as its volume is ~71% of the cubic box of the same periodic distance, thus saving on the number of water molecules that need to be add ed to solvate the protein.Let's d efine the box using editconf:editconf-f 1AKI_processed.gro-o 1AKI_newbox.gro-c -d 1.0 -bt cubicThe above command centers the protein in the box (-c), and places it at least 1.0 nm from the box edge (-d 1.0). The box type is defined as a cube (-bt cubic). The distance to the edge of the box is an important parameter. Since we will be using periodic boundary conditions, we must satisfy the minimum image convention. That is, a protein should never see its periodic image, otherwise the forces calculated will be spurious. Specifying a solute-box distance of 1.0 nm will mean that there are at least 2.0 nm between any two periodic images of a protein. This distance will be sufficient for just about any cutoff scheme commonly used in simulations.Now that we have defined a box, we can fill it with solvent (water). Solvation is accomplished using genbox:genbox -cp 1AKI_newbox.gro-cs spc216.gro-o 1AKI_solv.gro-p topol.topThe configuration of the protein (-cp) is contained in the output of the previous editconf step, and the configuration of the solvent (-cs) is part of the standard GROMACS installation. We are using spc216.gro, which is a generic equilibrated 3-point solvent model. You can use spc216.gro as the solvent configuration for SPC, SPC/E, or TIP3P water, since they are all three-point water models. The output is called 1AKI_solv.gro, and we tell genbox the name of the topology file (topol.top) so it can be modified. Note the changes to the [ mol ecul es ] directive of topol.top:[ mol ecul es ]; Compound #molsProtein_A 1SOL 10832What genbox has d one is keep track of how many water mol ecul es it has ad ded, which it then writes to your topol ogy to refl ect the changes that have been mad e. Note that if you use any other (non-water) solvent, genbox will not make these changes to your topol ogy! Its compatibility with updating water molecules is hard-coded.Step Four: Adding IonsWe now have a solvated system that contains a charged protein. The output of pdb2gmx told us that the protein has a net charge of +8e (based on its amino acid composition). If you missed this information in the pdb2gmx output, look at the last line of your [ atoms ] directive in topol.top; it shoul d read (in part) "qtot 8." Since life d oes not exist at a net charge, we must ad d ions to our system.The tool for adding ions within GROMACS is called genion. What genion does is read through the topology and replace water molecules with the ions that the user specifies. The input is call ed a run input fil e, which has an extension of .tpr; this file is produced by the GROMACS tool grompp (GROMACS pre-processor), which will also be used later when we run our first simulation. What grompp does is process the coordinate fil e and topology (which describes the mol ecules) to generate an atomic-level input (.tpr). The .tpr file contains all the parameters for all of the atoms in the system.To produce a .tpr file with grompp, we will need an additional input fil e, with the extension .mdp (molecular dynamics parameter file); grompp will assemble the parameters specified in the .mdp fil e with the coordinates and topol ogy information to generate a .tpr file.An .mdp fil e is normally used to run energy minimization or an MD simulation, but in this case is simply used to generate an atomic description of the system. An example .mdp file (the one we will use) can be d ownload ed here.In reality, the .mdp file used at this step can contain any legitimate combination of parameters.I typically use an energy-minimization script, because they are very basic and d o not involve any complicated parameter combinations.Assembl e your .tpr file with the foll owing:grompp -f ions.mdp-c 1AKI_solv.gro-p topol.top -o ions.tprNow we have an atomic-level description of our system in the binary fil e ions.tpr. We will pass this file to genion:genion -s ions.tpr-o 1AKI_solv_ions.gro-p topol.top-pname NA-nname CL-nn 8When prompted, choose group 13 "SOL" for embedding ions. You d o not want to replace parts of your protein with ions.In the genion command, we provid e the structure/state file (-s) as input, generate a .gro file as output (-o), process the topology (-p) to reflect the removal of water molecules and addition of ions, d efine positive and negative ion names (-pname and -nname, respectively), and tell genion to ad d only the ions necessary to neutralize the net charge on the protein by adding the correct number of negative ions (-nn 8). You could also use genion to add a specified concentration of ions in addition to simply neutralizing the system by specifying the-neutral and -conc options in conjunction. Refer to the genion man page for information on how to use these options.The names of the ions specified with -pname and -nname were force fiel d-specific in previous versions of GROMACS, but have been standardized as of version 4.5. The specified atom names are always the el emental symbol in all capital letters, along with the [ mol eculetype ]. Residue names may or may not append the sign of the charge (+/-). Refer to ions.itp for proper nomenclature if you encounter difficulties.Your [ molecules ] directive should now look like:[ molecules ]; Compound #molsProtein_A 1SOL 10824CL 8Step Five: Energy MinimizationThe solvated, electroneutral system is now assembled. Before we can begin dynamics, we must ensure that the system has no steric clashes or inappropriate geometry. The structure is relaxed through a process called energy minimization (EM).The process for EM is much like the addition of ions. We are once again going to use grompp to assemble the structure, topology, and simulation parameters into a binary input file (.tpr), but this time, instead of passing the .tpr to genion, we will run the energy minimization through the GROMACS MD engine, mdrun.Assembl e the binary input using grompp using this input parameter file:grompp -f minim.mdp-c 1AKI_solv_ions.gro -p topol.top-o em.tprMake sure you have been updating your topol.top file when running genbox and genion, or else you will get lots of nasty error messages ("number of coordinates in coordinate file does not match topology," etc).We are now ready to invoke mdrun to carry out the EM:mdrun -v-deffnm emThe -v flag is for the impatient: it makes mdrun verbose, such that it prints its progress to the screen at every step. The -deffnm flag will define the file names of the input and output. So, if you did not name your grompp output "em.tpr," you will have to explicitly specify its name with the mdrun -s flag. In our case, we will get the foll owing files:•em.log: ASCII-text log file of the EM process•em.edr: Binary energy file•em.trr: Binary full-precision trajectory•em.gro: Energy-minimized structureThere are two very important factors to evaluate to d etermine if EM was successful. The first is the potential energy (printed at the end of the EM process, even without -v). Epot shoul d be negative, and (for a simple protein in water) on the ord er of 105-106, depending on the system size and number of water molecul es. The second important feature is the maximum force, Fmax, the target for which was set in minim.mdp - "emtol = 1000.0" - indicating a target Fmax of no greater than 1000 kJ mol-1 nm-1. It is possible to arrive at a reasonable Epot with Fmax > emtol. If this happens, your system may not be stable enough for simulation. Evaluate why it may be happening, and perhaps change your minimization parameters (integrator, emstep, etc).Let's do a bit of analysis. The em.edr fil e contains all of the energy terms that GROMACS collects during EM. You can analyze any .edr fil e using the GROMACS tools g_energy:g_energy -f em.edr-o potential.xvgAt the prompt, type "10 0" to select Potential (10); zero (0) terminates input. You will be shown the average of Epot, and a file called "potential.xvg" will be written. To pl ot this data, you will need the Xmgrace plotting tool. The resulting plot should look something like this, demonstrating the nice, steady convergence of Epot:Now that our system is at an energy minimum, we can begin real dynamics.Step Six: EquilibrationEM ensured that we have a reasonabl e starting structure, in terms of geometry and solvent orientation. To begin real dynamics, we must equilibrate the solvent and ions around the protein. If we were to attempt unrestrained dynamics at this point, the system may collapse. The reason is that the solvent is mostly optimized within itself, and not necessarily with the solute. It needs to be brought to the temperature we wish to simulate and establish the proper orientation about the solute (the protein). After we arrive at the correct temperature (based on kinetic energies), we will apply pressure to the system until it reaches the proper density.Remember that posre.itp fil e that pdb2gmx generated a long time ago? We're going to use it now. The purpose of posre.itp is to apply a position restraining force on the heavy atoms of the protein (anything that is not a hydrogen). Movement is permitted, but only after overcoming a substantial energy penalty. The utility of position restraints is that they all ow us to equilibrate our solvent around our protein, without the ad ded variabl e of structural changes in the protein.Equilibration is often conducted in two phases. The first phase is conducted und er an NVT ensemble (constant Number of particles, Volume, and Temperature). This ensemble is also referred to as "isothermal-isochoric" or "canonical." The timeframe for such a procedure is dependent upon the contents of the system, but in NVT, the temperature of the system should reach a plateau at the desired value. If the temperature has not yet stabilized, additional time will be required. Typically, 50-100 ps shoul d suffice, and we will conduct a100-ps NVT equilibration for this exercise. Depending on your machine, this may take a while (just over an hour on a dual-core MacBook). Get the .mdp file here.We will call grompp and mdrun just as we did at the EM step:grompp -f nvt.mdp-c em.gro -p topol.top-o nvt.tprmdrun -deffnm nvtA full explanation of the parameters used can be found in the GROMACS manual, in addition to the comments provid ed. Take note of a few parameters in the .mdp file:•gen_vel = yes: Initiates velocity generation. Using different random seeds (gen_seed) gives different initial velocities, and thus multiple (different) simulations can be conducted from the same starting structure.•tcoupl = V-rescale: The velocity rescaling thermostat is an improvement upon the Berendsen weak coupling method, which did not reproduce a correct kinetic ensemble.•pcoupl = no: Pressure coupling is not applied.Let's analyze the temperature progression, again using g_energy:g_energy -f nvt.edrType "15 0" at the prompt to sel ect the temperature of the system and exit. The resulting plot should look something like the foll owing:From the plot, it is cl ear that the temperature of the system quickly reaches the target value (300 K), and remains stabl e over the remainder of the equilibration. For this system, a shorter equilibration period (on the ord er of 50 ps) may have been adequate.Step Seven: Equilibration, Part 2The previous step, NVT equilibration, stabilized the temperature of the system. Prior to data collection, we must also stabilize the pressure (and thus also the d ensity) of the system. Equilibration of pressure is conducted und er an NPT ensemble, wherein the Number of particl es, Pressure, and Temperature are all constant. The ensembl e is also call ed the "isothermal-isobaric" ensemble, and most closely resembles experimental conditions.The .mdp file used for a 100-ps NPT equilibration can be found here. It is not drastically different from the parameter file used for NVT equilibration. Note the addition of the pressure coupling section, using the Parrinello-Rahman barostat.A few other changes:•continuation = yes: We are continuing the simulation from the NVT equilibration phase •gen_vel = no: Velocities are read from the trajectory (see below)We will call grompp and mdrun just as we did for NVT equilibration. Note that we are now including the -t flag to include the checkpoint file from the NVT equilibration; this file contains all the necessary state variables to continue our simulation. To conserve the velocities produced during NVT, we must includ e this file. The coordinate file (-c) is the final output of the NVT simulation.grompp -f npt.mdp -c nvt.gro-t nvt.cpt -p topol.top -o npt.tprmdrun -deffnm nptLet's analyze the pressure progression, again using g_energy:g_energy -f npt.edr -o pressure.xvgType "16 0" at the prompt to select the pressure of the system and exit. The resulting plot should look something like the foll owing:The pressure value fluctuates widely over the course of the 100-ps equilibration phase, but this behavior is not unexpected. The running average of these data are plotted as the red line in the plot. Over the course of the equilibration, the average value of the pressure is 1.05 bar.Let's take a l ook at density as well, this time using g_energy and entering "22 0" at the prompt.g_energy -f npt.edr -o density.xvgAs with the pressure, the running average of the density is also plotted in red. The average value over the course of 100 ps is 998.3 kg m-3, close to the experimental value of 1000 kg m-3. The parameters for the SPC/E water model cl osely replicate experimental values for water. The density values are very stable over time, indicating that the system is well-equilibrated now with respect to pressure and d ensity.Step Eight: Production MDUpon completion of the two equilibration phases, the system is now well-equilibrated at the desired temperature and pressure. We are now ready to rel ease the position restraints and run production MD for data coll ection. The process is just like we have seen before, as we will make use of the checkpoint file (which in this case now contains preserve pressure coupling information) to grompp. We will run a 1-ns MD simulation, the script for which can be found here.grompp -f md.mdp-c npt.gro-t npt.cpt-p topol.top -o md_0_1.tprmdrun -deffnm md_0_1The mdrun step is d efinitely best run in parallel on a cluster, since it will take several hours to complete. The proper command for executing such a job woul d be:mpirun -np X mdrun_mpi -deffnm md_0_1where X indicates the number of processors to be used for the simulation. Near the end of the grompp output, you should have seen a line like:Estimate for the relative computational l oad of the PME mesh part: 0.25The estimate for PME l oad will dictate how many processors shoul d be dedicated to the PME cal culation, and how many for the PP cal culations. Refer to the GROMACS 4 publication and the manual for details. For a cubic box, the optimal setup will have a PME load of 0.25 (3:1 PP:PME - we're in luck!); for a dodecahedral box, the optimal PME l oad is 0.33 (2:1 PP:PME). When executing mdrun, the program shoul d automatically determine the best number of processors to assign for the PP and PME cal culations. Thus, make sure you indicate an appropriate number of nod es for your calculation (the value of -np X), so that you can get the best performance. For this system, I achieved roughly 14 ns/day performance on 24 CPU's (18 PP, 6 PME).Step Nine: AnalysisNow that we have simulated our protein, we shoul d run some analysis on the system. What types of data are important? This is an important question to ask before running the simulation, so you shoul d have some ideas about the types of data you will want to collect in your own systems. For this tutorial, a few basic tools will be introduced.The first is trjconv, which is used as a post-processing tool to strip out coordinates, correct for periodicity, or manually alter the trajectory (time units, frame frequency, etc). For this exercise, we will use trjconv to account for any periodicity in the system. The protein will diffuse through the unit cell, and may appear to "jump" across to the other side of the box. To account for such actions, issue the foll owing:trjconv -s md_0_1.tpr-f md_0_1.xtc-o md_0_1_noPBC.xtc -pbc mol-ur compactSelect 0 ("System") for output. We will conduct all our analyses on this "corrected" trajectory. Let's l ook at structural stability first. GROMACS has a built-in utility for RMSD cal culations call ed g_rms. To use g_rms, issue this command:g_rms -s md_0_1.tpr-f md_0_1_noPBC.xtc-o rmsd.xvg-tu nsChoose 4 ("Backbone") for both the least squares fit and the group for RMSD cal culation. The -tu flag will output the results in terms of ns, even though the trajectory was written in ps. This is d one for clarity of the output (especially if you have a long simulation - 1e+06 ps d oes not look as nice as 100 ns). The output plot will show the RMSD relative to the structure present in the minimized, equilibrated system:If we wish to calculate RMSD relative to the crystal structure, we coul d issue the following:g_rms -s em.tpr -f md_0_1_noPBC.xtc-o rmsd_xtal.xvg -tu nsThe result will be something like:。

相关文档
最新文档