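We begin with a visual analysis of the simulation. To do this, we use the trjconv command to convert the trajectory file into a PDB file and inspect it. The -skip 20 option writes only every 20th frame, and -fit rot+trans superimposes each frame on the reference structure by rotation and translation.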
! echo '1\n1' | gmx trjconv -f pep_md.xtc -s pep_md.tpr -o pep_fit_1.pdb -skip 20 -fit rot+trans
:-) GROMACS - gmx trjconv, 2020.1-Ubuntu-2020.1-1 (-: GROMACS is written by: Emile Apol Rossen Apostolov Paul Bauer Herman J.C. Berendsen Par Bjelkmar Christian Blau Viacheslav Bolnykh Kevin Boyd Aldert van Buuren Rudi van Drunen Anton Feenstra Alan Gray Gerrit Groenhof Anca Hamuraru Vincent Hindriksen M. Eric Irrgang Aleksei Iupinov Christoph Junghans Joe Jordan Dimitrios Karkoulis Peter Kasson Jiri Kraus Carsten Kutzner Per Larsson Justin A. Lemkul Viveca Lindahl Magnus Lundborg Erik Marklund Pascal Merz Pieter Meulenhoff Teemu Murtola Szilard Pall Sander Pronk Roland Schulz Michael Shirts Alexey Shvetsov Alfons Sijbers Peter Tieleman Jon Vincent Teemu Virolainen Christian Wennberg Maarten Wolf Artem Zhmurov and the project leaders: Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel Copyright (c) 1991-2000, University of Groningen, The Netherlands. Copyright (c) 2001-2019, The GROMACS development team at Uppsala University, Stockholm University and the Royal Institute of Technology, Sweden. check out http://www.gromacs.org for more information. GROMACS is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version. GROMACS: gmx trjconv, version 2020.1-Ubuntu-2020.1-1 Executable: /usr/bin/gmx Data prefix: /usr Working dir: /home/marinakan/pr12 Command line: gmx trjconv -f pep_md.xtc -s pep_md.tpr -o pep_fit_1.pdb -skip 20 -fit rot+trans Note that major changes are planned in future for trjconv, to improve usability and utility. 
Will write pdb: Protein data bank file
Reading file pep_md.tpr, VERSION 2020.1-Ubuntu-2020.1-1 (single precision)
Reading file pep_md.tpr, VERSION 2020.1-Ubuntu-2020.1-1 (single precision)
Select group for least squares fit
Group 0 ( System) has 6298 elements
Group 1 ( Protein) has 243 elements
Group 2 ( Protein-H) has 127 elements
Group 3 ( C-alpha) has 15 elements
Group 4 ( Backbone) has 45 elements
Group 5 ( MainChain) has 61 elements
Group 6 ( MainChain+Cb) has 76 elements
Group 7 ( MainChain+H) has 78 elements
Group 8 ( SideChain) has 165 elements
Group 9 ( SideChain-H) has 66 elements
Group 10 ( Prot-Masses) has 243 elements
Group 11 ( non-Protein) has 6055 elements
Group 12 ( Other) has 6054 elements
Group 13 ( FAM) has 6054 elements
Group 14 ( NA) has 1 elements
Group 15 ( Ion) has 1 elements
Group 16 ( FAM) has 6054 elements
Group 17 ( NA) has 1 elements
Select a group: Selected 1: 'Protein'
Select group for output
Group 0 ( System) has 6298 elements
Group 1 ( Protein) has 243 elements
Group 2 ( Protein-H) has 127 elements
Group 3 ( C-alpha) has 15 elements
Group 4 ( Backbone) has 45 elements
Group 5 ( MainChain) has 61 elements
Group 6 ( MainChain+Cb) has 76 elements
Group 7 ( MainChain+H) has 78 elements
Group 8 ( SideChain) has 165 elements
Group 9 ( SideChain-H) has 66 elements
Group 10 ( Prot-Masses) has 243 elements
Group 11 ( non-Protein) has 6055 elements
Group 12 ( Other) has 6054 elements
Group 13 ( FAM) has 6054 elements
Group 14 ( NA) has 1 elements
Group 15 ( Ion) has 1 elements
Group 16 ( FAM) has 6054 elements
Group 17 ( NA) has 1 elements
Select a group: Selected 1: 'Protein'
Reading frame 0 time 0.000
Precision of pep_md.xtc is 0.001 (nm)
Back Off! I just backed up pep_fit_1.pdb to ./#pep_fit_1.pdb.2#
Last frame 2000 time 20000.000 -> frame 100 time 20000.000
GROMACS reminds you: "There's no kill like overkill, right?" (Erik Lindahl)
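As a quick sanity check on the conversion, the number of models in the resulting PDB file can be compared with the number of frames implied by -skip. The sketch below is only illustrative: the helper names `count_models` and `expected_frames` are hypothetical, and it assumes that each frame is written as a separate `MODEL` record in the multi-frame PDB file.

```python
def count_models(pdb_text):
    """Count MODEL records in the text of a multi-frame PDB file."""
    return sum(1 for line in pdb_text.splitlines() if line.startswith("MODEL"))


def expected_frames(last_frame_index, skip):
    """Frames kept when writing frames 0, skip, 2*skip, ... up to last_frame_index."""
    return last_frame_index // skip + 1


# Tiny synthetic example with two models (not real simulation data).
sample_pdb = """MODEL        1
ATOM      1  N   GLY A   1       0.000   0.000   0.000  1.00  0.00
ENDMDL
MODEL        2
ATOM      1  N   GLY A   1       0.100   0.000   0.000  1.00  0.00
ENDMDL
"""

print(count_models(sample_pdb))
print(expected_frames(2000, 20))

# Usage with the file written above (not executed here):
# with open("pep_fit_1.pdb") as handle:
#     print(count_models(handle.read()))
```

For the run above, the last input frame is 2000 and the last output frame is 100, which matches 101 frames kept with -skip 20.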
Next, we compute the root-mean-square deviation (RMSD) over the course of the simulation. Since a conformational transition takes place, we first compute the deviation over the entire simulation relative to the starting structure.
! echo '1\n1' | gmx rms -f pep_md.xtc -s pep_md.tpr -o rms_1
GROMACS: gmx rms, version 2020.1-Ubuntu-2020.1-1
Executable: /usr/bin/gmx
Data prefix: /usr
Working dir: /home/marinakan/pr12
Command line: gmx rms -f pep_md.xtc -s pep_md.tpr -o rms_1
Reading file pep_md.tpr, VERSION 2020.1-Ubuntu-2020.1-1 (single precision)
Reading file pep_md.tpr, VERSION 2020.1-Ubuntu-2020.1-1 (single precision)
Select group for least squares fit
Group 0 ( System) has 6298 elements
Group 1 ( Protein) has 243 elements
Group 2 ( Protein-H) has 127 elements
Group 3 ( C-alpha) has 15 elements
Group 4 ( Backbone) has 45 elements
Group 5 ( MainChain) has 61 elements
Group 6 ( MainChain+Cb) has 76 elements
Group 7 ( MainChain+H) has 78 elements
Group 8 ( SideChain) has 165 elements
Group 9 ( SideChain-H) has 66 elements
Group 10 ( Prot-Masses) has 243 elements
Group 11 ( non-Protein) has 6055 elements
Group 12 ( Other) has 6054 elements
Group 13 ( FAM) has 6054 elements
Group 14 ( NA) has 1 elements
Group 15 ( Ion) has 1 elements
Group 16 ( FAM) has 6054 elements
Group 17 ( NA) has 1 elements
Select a group: Selected 1: 'Protein'
Select group for RMSD calculation
Group 0 ( System) has 6298 elements
Group 1 ( Protein) has 243 elements
Group 2 ( Protein-H) has 127 elements
Group 3 ( C-alpha) has 15 elements
Group 4 ( Backbone) has 45 elements
Group 5 ( MainChain) has 61 elements
Group 6 ( MainChain+Cb) has 76 elements
Group 7 ( MainChain+H) has 78 elements
Group 8 ( SideChain) has 165 elements
Group 9 ( SideChain-H) has 66 elements
Group 10 ( Prot-Masses) has 243 elements
Group 11 ( non-Protein) has 6055 elements
Group 12 ( Other) has 6054 elements
Group 13 ( FAM) has 6054 elements
Group 14 ( NA) has 1 elements
Group 15 ( Ion) has 1 elements
Group 16 ( FAM) has 6054 elements
Group 17 ( NA) has 1 elements
Select a group: Selected 1: 'Protein'
Last frame 2000 time 20000.000
GROMACS reminds you: "Like other defaulters, I like to lay half the blame on ill-fortune and adverse circumstances" (Mr. Rochester in Jane Eyre by Charlotte Bronte)
We also compute the RMSD of each frame relative to the structure 400 frames earlier. With 2001 frames covering 20000 ps, frames are 10 ps apart, so this lag corresponds to 4000 ps.
! echo '1\n1' | gmx rms -f pep_md.xtc -s pep_md.tpr -o rms_2 -prev 400
GROMACS: gmx rms, version 2020.1-Ubuntu-2020.1-1
Executable: /usr/bin/gmx
Data prefix: /usr
Working dir: /home/marinakan/pr12
Command line: gmx rms -f pep_md.xtc -s pep_md.tpr -o rms_2 -prev 400
WARNING: using option -prev with large trajectories will require a lot of memory and could lead to crashes
Reading file pep_md.tpr, VERSION 2020.1-Ubuntu-2020.1-1 (single precision)
Reading file pep_md.tpr, VERSION 2020.1-Ubuntu-2020.1-1 (single precision)
Select group for least squares fit
Group 0 ( System) has 6298 elements
Group 1 ( Protein) has 243 elements
Group 2 ( Protein-H) has 127 elements
Group 3 ( C-alpha) has 15 elements
Group 4 ( Backbone) has 45 elements
Group 5 ( MainChain) has 61 elements
Group 6 ( MainChain+Cb) has 76 elements
Group 7 ( MainChain+H) has 78 elements
Group 8 ( SideChain) has 165 elements
Group 9 ( SideChain-H) has 66 elements
Group 10 ( Prot-Masses) has 243 elements
Group 11 ( non-Protein) has 6055 elements
Group 12 ( Other) has 6054 elements
Group 13 ( FAM) has 6054 elements
Group 14 ( NA) has 1 elements
Group 15 ( Ion) has 1 elements
Group 16 ( FAM) has 6054 elements
Group 17 ( NA) has 1 elements
Select a group: Selected 1: 'Protein'
Select group for RMSD calculation
Group 0 ( System) has 6298 elements
Group 1 ( Protein) has 243 elements
Group 2 ( Protein-H) has 127 elements
Group 3 ( C-alpha) has 15 elements
Group 4 ( Backbone) has 45 elements
Group 5 ( MainChain) has 61 elements
Group 6 ( MainChain+Cb) has 76 elements
Group 7 ( MainChain+H) has 78 elements
Group 8 ( SideChain) has 165 elements
Group 9 ( SideChain-H) has 66 elements
Group 10 ( Prot-Masses) has 243 elements
Group 11 ( non-Protein) has 6055 elements
Group 12 ( Other) has 6054 elements
Group 13 ( FAM) has 6054 elements
Group 14 ( NA) has 1 elements
Group 15 ( Ion) has 1 elements
Group 16 ( FAM) has 6054 elements
Group 17 ( NA) has 1 elements
Select a group: Selected 1: 'Protein'
Last frame 2000 time 20000.000
GROMACS reminds you: "The Poodle Bites" (F. Zappa)
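To make the idea of an RMSD against an earlier frame more concrete, here is a minimal NumPy sketch. It is not how gmx rms is implemented: the functions `kabsch_rmsd`, `lagged_rmsd` and `rot_z` are hypothetical helpers, the superposition is an unweighted Kabsch fit (no mass weighting), and the trajectory is synthetic rather than read from pep_md.xtc.

```python
import numpy as np


def kabsch_rmsd(mobile, reference):
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition."""
    # Remove translation by centring both structures.
    p = mobile - mobile.mean(axis=0)
    q = reference - reference.mean(axis=0)
    # Optimal rotation from the SVD of the covariance matrix (Kabsch algorithm).
    u, _, vt = np.linalg.svd(p.T @ q)
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0  # avoid improper rotations
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = p @ rot.T - q
    return float(np.sqrt((diff ** 2).sum() / len(p)))


def lagged_rmsd(traj, lag):
    """RMSD of frame i against frame i - lag, for every frame that has such a partner."""
    return np.array([kabsch_rmsd(traj[i], traj[i - lag]) for i in range(lag, len(traj))])


def rot_z(angle):
    """Rotation matrix about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Synthetic trajectory: 6 frames of 15 points that only rotate and translate.
rng = np.random.default_rng(0)
base = rng.normal(size=(15, 3))
rigid = np.array([base @ rot_z(0.3 * k).T + 0.5 * k for k in range(6)])
# The same trajectory with random displacements added to every atom.
noisy = rigid + rng.normal(scale=0.1, size=rigid.shape)

print(lagged_rmsd(rigid, 2))  # rigid-body motion only: values close to zero
print(lagged_rmsd(noisy, 2))  # internal changes: clearly non-zero values
```

A rigid rotation or translation of the whole peptide gives a lagged RMSD near zero, so non-zero values reflect internal rearrangement over the chosen lag. A plateau at low values in such a curve suggests the structure has stopped changing on that timescale.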
Next, we compute the solvent-accessible surface area (SASA) of the peptide.
! echo 1 | gmx sasa -f pep_md.xtc -s pep_md.tpr -o sas_pep.xvg
:-) GROMACS - gmx sasa, 2020.1-Ubuntu-2020.1-1 (-:
GROMACS: gmx sasa, version 2020.1-Ubuntu-2020.1-1
Executable: /usr/bin/gmx
Data prefix: /usr
Working dir: /home/marinakan/pr12
Command line: gmx sasa -f pep_md.xtc -s pep_md.tpr -o sas_pep.xvg
Reading file pep_md.tpr, VERSION 2020.1-Ubuntu-2020.1-1 (single precision)
Reading file pep_md.tpr, VERSION 2020.1-Ubuntu-2020.1-1 (single precision)
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
Frank Eisenhaber and Philip Lijnzaad and Patrick Argos and Chris Sander and Michael Scharf
The Double Cube Lattice Method: Efficient Approaches to Numerical Integration of Surface Area and Volume and to Dot Surface Contouring of Molecular Assemblies
J. Comp. Chem. 16 (1995) pp. 273-284
-------- -------- --- Thank You --- -------- --------
WARNING: Masses and atomic (Van der Waals) radii will be guessed based on residue and atom names, since they could not be definitively assigned from the information in your input files. These guessed numbers might deviate from the mass and radius of the atom type. Please check the output files if necessary.
NOTE: From version 5.0 gmx sasa uses the Van der Waals radii from the source below. This means the results may be different compared to previous GROMACS versions.
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
A. Bondi
van der Waals Volumes and Radii
J. Phys. Chem. 68 (1964) pp. 441-451
-------- -------- --- Thank You --- -------- --------
Last frame 2000 time 20000.000
Analyzed 2001 frames, last time 20000.000
GROMACS reminds you: "We'll celebrate a woman for anything, as long as it's not her talent." (Colleen McCullough)
We now plot the RMSD and the solvent-accessible surface area against simulation time.
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
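# RMSD relative to the starting structure; skiprows skips the header lines of this particular .xvg file
df_rms_1 = pd.read_table("rms_1.xvg", skiprows=18, sep=r'\s+', names=['Time, ps','RMSD, nm'])
df_rms_1.plot(kind='scatter',x='Time, ps',y='RMSD, nm')
[Figure: scatter plot of RMSD (nm) against time (ps), relative to the starting structure]
# RMSD relative to the structure 400 frames earlier
df_rms_2 = pd.read_table("rms_2.xvg", skiprows=18, sep=r'\s+', names=['Time, ps','RMSD, nm'])
df_rms_2.plot(kind='scatter',x='Time, ps',y='RMSD, nm')
[Figure: scatter plot of RMSD (nm) against time (ps), relative to the structure 400 frames earlier]
# Solvent-accessible surface area; this file has a longer header, hence skiprows=24
df_sas = pd.read_table("sas_pep.xvg", skiprows=24, sep=r'\s+', names=['Time, ps','Area'])
df_sas.plot(kind='scatter',x='Time, ps',y='Area')
[Figure: scatter plot of solvent-accessible surface area against time (ps)]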
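The hard-coded skiprows values above depend on the exact header length of each .xvg file. One possible alternative, shown here only as a sketch, is to drop every line that starts with `#` (comments) or `@` (plot directives) before parsing. The helper name `read_xvg` and the sample text are hypothetical and are not part of GROMACS or pandas.

```python
import io

import pandas as pd


def read_xvg(source, names):
    """Read an .xvg file (path or open text buffer), ignoring '#' and '@' header lines."""
    if hasattr(source, "read"):
        lines = source.read().splitlines()
    else:
        with open(source) as handle:
            lines = handle.read().splitlines()
    # Keep only non-empty data lines.
    data = [ln for ln in lines if ln.strip() and not ln.lstrip().startswith(("#", "@"))]
    return pd.read_csv(io.StringIO("\n".join(data)), sep=r"\s+", header=None, names=names)


# Small synthetic example in the .xvg layout (not real simulation output).
sample_xvg = io.StringIO(
    "# This file was created by a hypothetical run\n"
    "@    title \"RMSD\"\n"
    "@    xaxis  label \"Time (ps)\"\n"
    "   0.0000000    0.0000005\n"
    "  10.0000000    0.1012345\n"
    "  20.0000000    0.1523456\n"
)
df_demo = read_xvg(sample_xvg, names=["Time, ps", "RMSD, nm"])
print(df_demo)

# Usage with the files written above (not executed here):
# df_rms_1 = read_xvg("rms_1.xvg", names=["Time, ps", "RMSD, nm"])
```

Filtering on the leading character keeps the reader working if a different GROMACS version or tool writes a header of a different length.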
We count the hydrogen bonds within the peptide.
! echo '1\n1' | gmx hbond -f pep_md.xtc -s pep_md.tpr -num hbond_pep
GROMACS: gmx hbond, version 2020.1-Ubuntu-2020.1-1
Executable: /usr/bin/gmx
Data prefix: /usr
Working dir: /home/marinakan/pr12
Command line: gmx hbond -f pep_md.xtc -s pep_md.tpr -num hbond_pep
Reading file pep_md.tpr, VERSION 2020.1-Ubuntu-2020.1-1 (single precision)
Specify 2 groups to analyze:
Group 0 ( System) has 6298 elements
Group 1 ( Protein) has 243 elements
Group 2 ( Protein-H) has 127 elements
Group 3 ( C-alpha) has 15 elements
Group 4 ( Backbone) has 45 elements
Group 5 ( MainChain) has 61 elements
Group 6 ( MainChain+Cb) has 76 elements
Group 7 ( MainChain+H) has 78 elements
Group 8 ( SideChain) has 165 elements
Group 9 ( SideChain-H) has 66 elements
Group 10 ( Prot-Masses) has 243 elements
Group 11 ( non-Protein) has 6055 elements
Group 12 ( Other) has 6054 elements
Group 13 ( FAM) has 6054 elements
Group 14 ( NA) has 1 elements
Group 15 ( Ion) has 1 elements
Group 16 ( FAM) has 6054 elements
Group 17 ( NA) has 1 elements
Select a group: Selected 1: 'Protein'
Select a group: Selected 1: 'Protein'
Calculating hydrogen bonds in Protein (243 atoms)
Found 25 donors and 48 acceptors
Reading frame 0 time 0.000
Will do grid-search on 12x11x10 grid, rcut=0.34999999
Frame loop parallelized with OpenMP using 2 threads.
Last frame 2000 time 20000.000
Average number of hbonds per timeframe 8.635 out of 600 possible
GROMACS reminds you: "set: No match." (tcsh)
We count the hydrogen bonds between the peptide and formamide.
! echo '1\n13' | gmx hbond -f pep_md.xtc -s pep_md.tpr -num hbond_pep_sl
GROMACS: gmx hbond, version 2020.1-Ubuntu-2020.1-1
Executable: /usr/bin/gmx
Data prefix: /usr
Working dir: /home/marinakan/pr12
Command line: gmx hbond -f pep_md.xtc -s pep_md.tpr -num hbond_pep_sl
Reading file pep_md.tpr, VERSION 2020.1-Ubuntu-2020.1-1 (single precision)
Specify 2 groups to analyze:
Group 0 ( System) has 6298 elements
Group 1 ( Protein) has 243 elements
Group 2 ( Protein-H) has 127 elements
Group 3 ( C-alpha) has 15 elements
Group 4 ( Backbone) has 45 elements
Group 5 ( MainChain) has 61 elements
Group 6 ( MainChain+Cb) has 76 elements
Group 7 ( MainChain+H) has 78 elements
Group 8 ( SideChain) has 165 elements
Group 9 ( SideChain-H) has 66 elements
Group 10 ( Prot-Masses) has 243 elements
Group 11 ( non-Protein) has 6055 elements
Group 12 ( Other) has 6054 elements
Group 13 ( FAM) has 6054 elements
Group 14 ( NA) has 1 elements
Group 15 ( Ion) has 1 elements
Group 16 ( FAM) has 6054 elements
Group 17 ( NA) has 1 elements
Select a group: Selected 1: 'Protein'
Select a group: Selected 13: 'FAM'
Checking for overlap in atoms between Protein and FAM
Calculating hydrogen bonds between Protein (243 atoms) and FAM (6054 atoms)
Found 1034 donors and 2066 acceptors
Reading frame 0 time 0.000
Will do grid-search on 12x11x10 grid, rcut=0.34999999
Frame loop parallelized with OpenMP using 2 threads.
Last frame 2000 time 20000.000
Average number of hbonds per timeframe 44.249 out of 1.06812e+06 possible
GROMACS reminds you: "A ship in port is safe, but that is not what ships are for. Sail out to sea and do new things." (Grace Hopper, developer of COBOL)
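The two logs report averages of 8.635 hydrogen bonds within the peptide and 44.249 between the peptide and formamide. The sketch below shows one way such an average could be recomputed from the files written by -num. The helper `mean_hbonds` is hypothetical, and it assumes that the first column is time and the second column is the number of hydrogen bonds; the sample data are synthetic.

```python
import io

import numpy as np


def mean_hbonds(xvg_source):
    """Average of the second column of an .xvg file, skipping '#' and '@' header lines."""
    # Assumption: column 0 is time (ps) and column 1 is the hydrogen-bond count.
    data = np.loadtxt(xvg_source, comments=("#", "@"), ndmin=2)
    return float(data[:, 1].mean())


# Small synthetic example (not real simulation output).
sample_hbond = io.StringIO(
    "# hypothetical header line\n"
    "@    title \"Hydrogen Bonds\"\n"
    "   0    8   20\n"
    "  10    9   21\n"
    "  20   10   19\n"
)
demo_mean = mean_hbonds(sample_hbond)
print(demo_mean)

# Usage with the files written above (not executed here):
# print(mean_hbonds("hbond_pep.xvg"), mean_hbonds("hbond_pep_sl.xvg"))
```

Comparing the two averages over the same time windows, rather than over the whole run, would show whether intrapeptide hydrogen bonds are exchanged for peptide-formamide ones during the conformational transition.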