Transmembrane proteins
Last updated on the 16th of May, 2018
This task is dedicated to studying transmembrane proteins and the related databases and services.
File | Link |
---|---|
Count the mean number of residues per transmembrane subunit | aver_count.py |
List of real and predicted alpha helices | helices.csv |
TMHMM output | tmhmm.txt |
Phobius output | phobius.txt |
Compute JI and OC | helices.py |
OPM database
aver_count.py
Two proteins were chosen for the following analysis: Gamma-secretase complex, structure 1 (alpha-helical polytopic protein, PDB ID: 5A63) and Cytolysin and hemolysin HlyA Pore-forming toxin (beta-barrel transmembrane protein, PDB ID: 3O44). The protein models with the surrounding membrane and their properties are shown in table 1.
Protein | Gamma-secretase complex, structure 1 | Cytolysin and hemolysin HlyA Pore-forming toxin |
---|---|---|
PDB ID | 5A63 | 3O44 |
Type | Alpha-helical polytopic | Beta-barrel transmembrane |
Hydrophobic thickness | 29.8Å | 24.6Å |
Transmembrane subunits' coordinates | A chain: 669-690; B chain: 85-102, 171-186, 195-214, 221-240, 243-260, 383-399, 403-421, 436-458; C chain: 4-26, 32-56, 70-91, 118-139, 157-179, 187-207, 211-231; D chain: 58-79 | A - G chains: 291-297, 304-310 |
Mean number of residues per one transmembrane subunit | 20.7 | 7 |
Location | Human ER membrane [3] | Incorporates into the cell membrane of mammals [4] |
Image |
To compute the mean number of residues per transmembrane subunit, a Python script, aver_count.py, was developed.
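The calculation can be sketched as follows. This is a minimal illustration and not the contents of aver_count.py: the function names and the dictionary layout are assumptions, and the coordinates are copied from table 1.

```python
def segment_length(segment: str) -> int:
    """Length of an inclusive residue range written as 'start-end'."""
    start, end = (int(x) for x in segment.split("-"))
    return end - start + 1


def mean_segment_length(segments_by_chain: dict) -> float:
    """Mean length over all transmembrane segments of all chains."""
    lengths = [
        segment_length(segment)
        for segments in segments_by_chain.values()
        for segment in segments
    ]
    return sum(lengths) / len(lengths)


# Coordinates from table 1 (OPM); the dictionary layout is hypothetical.
gamma_secretase = {
    "A": ["669-690"],
    "B": ["85-102", "171-186", "195-214", "221-240",
          "243-260", "383-399", "403-421", "436-458"],
    "C": ["4-26", "32-56", "70-91", "118-139",
          "157-179", "187-207", "211-231"],
    "D": ["58-79"],
}
# Chains A to G of 3O44 share the same two beta-strand ranges.
hlya = {chain: ["291-297", "304-310"] for chain in "ABCDEFG"}

print(round(mean_segment_length(gamma_secretase), 1))  # 20.7
print(round(mean_segment_length(hlya), 1))             # 7.0
```

Ranges are treated as inclusive on both ends, which is what reproduces the values 20.7 and 7 given in table 1.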
Prediction of transmembrane helices
helices.csv, helices.py, phobius.txt, tmhmm.txt
Two services are used here to predict transmembrane helices in a given protein sequence: TMHMM and Phobius. Both compute, for each residue, the posterior probability of belonging to each of several regions. TMHMM distinguishes transmembrane, outside and inside regions; Phobius distinguishes transmembrane, cytoplasmic, non-cytoplasmic and signal peptide regions. Both services take a protein sequence as input and provide text and graphical output. The raw text output of both services for the 5A63 protein is available in phobius.txt and tmhmm.txt, the comparison of predicted and real data is provided in tables 2 and 3, and the graphical output of both services is shown in fig. 1 and 2.
The posterior probability graphs of the two services are very similar.
chain | OPM | TMHMM | Phobius |
---|---|---|---|
A | 669-690 | 670-692 | 670-690 |
B | 85-102 | 82-100 | 82-100 |
B | - | 132-154 | 133-154 |
B | 171-186 | 161-183 | 161-183 |
B | 195-214 | 193-215 | 195-213 |
B | 221-240 | 224-241 | 225-241 |
B | 243-260 | 246-268 | 247-263 |
B | - | 281-298 | - |
B | 383-399 | - | - |
B | 403-421 | 404-426 | 408-428 |
B | 436-458 | 431-453 | 434-453 |
C | 4-26 | 4-26 | 6-25 |
C | 32-56 | 33-55 | 32-55 |
C | 70-91 | 65-82 | 67-86 |
C | 118-139 | 119-141 | 117-135 |
C | 157-179 | 156-178 | 155-180 |
C | 187-207 | 187-209 | 187-209 |
C | 211-231 | 214-236 | 215-236 |
D | - | 19-41 | 18-38 |
D | 58-79 | 56-78 | 58-81 |
Both services predicted almost all real transmembrane helices; each missed one helix in chain B (383-399). Phobius predicted two additional helices that are absent from OPM, and TMHMM predicted three.
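One way to count missed and additional helices is to treat a helix as matched when it overlaps any helix in the other set by at least one residue. The sketch below applies this to the chain B rows of table 2; the matching rule and the function names are assumptions made for illustration and are not taken from the author's scripts.

```python
def overlaps(a, b):
    """True if two inclusive (start, end) ranges share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


def missed_and_extra(reference, predicted):
    """Reference helices with no overlapping prediction, and the reverse."""
    missed = [r for r in reference
              if not any(overlaps(r, p) for p in predicted)]
    extra = [p for p in predicted
             if not any(overlaps(p, r) for r in reference)]
    return missed, extra


# Chain B rows of table 2.
opm_b = [(85, 102), (171, 186), (195, 214), (221, 240),
         (243, 260), (383, 399), (403, 421), (436, 458)]
tmhmm_b = [(82, 100), (132, 154), (161, 183), (193, 215), (224, 241),
           (246, 268), (281, 298), (404, 426), (431, 453)]
phobius_b = [(82, 100), (133, 154), (161, 183), (195, 213),
             (225, 241), (247, 263), (408, 428), (434, 453)]

print(missed_and_extra(opm_b, tmhmm_b))
# ([(383, 399)], [(132, 154), (281, 298)])
print(missed_and_extra(opm_b, phobius_b))
# ([(383, 399)], [(133, 154)])
```

Under this rule, each service also has one additional helix in chain D (19-41 for TMHMM, 18-38 for Phobius), since neither overlaps the OPM helix 58-79.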
The ranges of the predicted regions seem to agree better between the two predictions than between either prediction and the OPM data. To measure this effect, a Python script, helices.py, was developed that computes the pairwise Jaccard index (JI) and overlap coefficient (OC) between all sets of data. The result is given in table 3.
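Both measures can be sketched on sets of residues. The code below is an illustration, not the contents of helices.py: it assumes the comparison is made per residue, with each residue identified by its chain and position, and it uses only the chain A and chain D rows of table 2, so its output is not expected to match table 3.

```python
def residues(segments):
    """Expand (chain, start, end) segments into a set of (chain, position)."""
    return {
        (chain, position)
        for chain, start, end in segments
        for position in range(start, end + 1)
    }


def jaccard_index(a, b):
    """|A & B| / |A | B|"""
    return len(a & b) / len(a | b)


def overlap_coefficient(a, b):
    """|A & B| / min(|A|, |B|)"""
    return len(a & b) / min(len(a), len(b))


# Chain A and chain D rows of table 2 only (a subset, for illustration).
opm = residues([("A", 669, 690), ("D", 58, 79)])
tmhmm = residues([("A", 670, 692), ("D", 19, 41), ("D", 56, 78)])

print(round(jaccard_index(opm, tmhmm), 2))        # 0.59
print(round(overlap_coefficient(opm, tmhmm), 2))  # 0.95
```

The overlap coefficient divides by the size of the smaller set, so it is never lower than the Jaccard index and is less sensitive to additional predicted helices.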
JI / OC | OPM | TMHMM | Phobius |
---|---|---|---|
OPM | - | - | - |
TMHMM | 0.64 / 0.85 | - | - |
Phobius | 0.67 / 0.83 | 0.83 / 0.96 | - |
The predictive tools are indeed more concordant with each other than with the real data. Both algorithms implement an HMM[1, 2] to decide which of the considered region types each residue belongs to, but they differ in the layout of the model (fig. 3). The Phobius developers claim that Phobius has a lower FDR for signal peptides than TMHMM[2]. However, the transmembrane predictions observed here do not prove that Phobius handles TM regions better than TMHMM.
Biological functions of observed proteins
The information below is taken from the following databases.
TCDB
The Transporter Classification Database (TCDB) is a curated database that classifies membrane transport proteins using the Transporter Classification (TC) system. It is quite similar to the well-known EC system: proteins are labelled with TC codes of the form V.W.X.Y.Z. The letters stand for:
- V (a number): transporter class;
- W (a letter): transporter subclass (energy source);
- X (a number): transporter family (superfamily);
- Y (a number): transporter subfamily;
- Z: specific transporter with a particular range of substrates transported.
The protein 5A63 is encoded as 1.A.54.1.1, which stands for:
- 1: Channels/pores;
- 1.A: α-type channels;
- 1.A.54: The presenilin ER Ca2+ leak channel (presenilin) family;
- 1.A.54.1.1: Presenilin-1 Ca2+ leak channel (part of the γ-secretase complex).
The protein 3O44 is encoded as 1.C.14.1.1, which stands for:
- 1: Channels/pores;
- 1.C: Pore-forming toxins (proteins and peptides);
- 1.C.14: The cytohemolysin (CHL) family;
- 1.C.14.1.1: Cytohemolysin precursor, HlyA (Vibrio cholerae cytolysin, VCC), a beta-barrel pore-forming toxin (beta-PFT).
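The hierarchy of a TC code can be made explicit by splitting it on dots. The sketch below is an illustration only: the `TCCode` type and both function names are hypothetical, and the last component is kept as a string because its exact type is not stated above.

```python
from typing import NamedTuple


class TCCode(NamedTuple):
    """Components of a V.W.X.Y.Z transporter classification code."""
    tc_class: int      # V: transporter class
    subclass: str      # W: transporter subclass (energy source)
    family: int        # X: transporter family (superfamily)
    subfamily: int     # Y: transporter subfamily
    transporter: str   # Z: specific transporter (type assumed, kept as text)


def parse_tc(code: str) -> TCCode:
    """Split a five-component TC code into named fields."""
    v, w, x, y, z = code.split(".")
    return TCCode(int(v), w, int(x), int(y), z)


def tc_levels(code: str) -> list:
    """All prefixes of a TC code, from class down to the full code."""
    parts = code.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


print(parse_tc("1.A.54.1.1"))
print(tc_levels("1.C.14.1.1"))
# ['1', '1.C', '1.C.14', '1.C.14.1', '1.C.14.1.1']
```

The two proteins share the class (1, channels/pores) and differ from the subclass level onward (A versus C).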
KEGG
The 5A63 protein is listed in KEGG under accession 5663. It participates in several signalling pathways, as well as in the Alzheimer's disease pathway (hsa05010) and the human papillomavirus infection pathway (hsa05165).
The 3O44 protein is listed in KEGG under accession VCA0219 and is included in the Vibrio cholerae infection pathway (vch05110).
CDD
No COGs were found for either protein.
GO
The 5A63 protein is associated with many GO terms, such as GO:0042987 (amyloid precursor protein catabolic process) and GO:0008624 (induction of apoptosis by extracellular signalling).
The 3O44 protein is associated with a limited number of GO terms, notably GO:0020002 (host cell plasma membrane) and GO:0019836 (hemolysis by symbiont of host erythrocytes).
Wikipedia
The 5A63 protein is a gamma-secretase protein that cleaves single-pass transmembrane proteins at residues buried in the membrane[3]. Its best-known substrate is amyloid precursor protein, whose cleavage produces amyloid beta; the abnormally folded fibrillar form of amyloid beta is a primary component of the amyloid plaques found in the brains of Alzheimer's disease patients. Gamma-secretase also processes several other integral membrane proteins, such as Notch and E-cadherin.
The 3O44 protein is a beta pore-forming toxin (β-PFT)[4] of Vibrio cholerae. Once the pore is formed, the cell can no longer regulate what enters and leaves it, which leads to cell lysis. β-PFTs also induce a host response that benefits bacterial proliferation.
References
- Anders Krogh, Björn Larsson, Gunnar von Heijne, Erik L.L. Sonnhammer. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. Journal of Molecular Biology, Volume 305, Issue 3, 2001, Pages 567-580, ISSN 0022-2836, doi.org/10.1006/jmbi.2000.4315.
- Lukas Käll, Anders Krogh, Erik L.L. Sonnhammer. Advantages of combined transmembrane topology and signal peptide prediction—the Phobius web server. Nucleic Acids Research, Volume 35, Issue suppl_2, 1 July 2007, Pages W429–W432, doi.org/10.1093/nar/gkm256.
- Gamma secretase Wikipedia article;
- β-PFT Wikipedia article.