Volume 14, Issue 3 (May 2020)
Qom Univ Med Sci J 2020, 14(3): 54-63
Identification of Heat-Resistant Bacteria Based on Selection of Proper Representation of Protein Sequences Using Deep Learning Approach
Reza Ahsan1, Mansour Ebrahimi*2
1- School of Engineering, University of Qom
2- Department of Biology, School of Basic Sciences, University of Qom , mansour@future.edu
Abstract:
Background and Objectives: Identifying the mechanisms that underlie heat resistance in bacteria is of great importance in several industries, such as the food industry, textile manufacturing, and especially detergent production. For this purpose, deep learning tools were used to identify the characteristics of heat-resistant bacteria based on protein properties.
 
Methods: Several characteristics of heat-resistant and non-heat-resistant proteins were calculated, including the structural properties of amino acids, the number and frequency of each amino acid, and their physicochemical properties. Bacterial classification was performed in three steps: first, attribute weighting methods were used to score the variables; then, the important variables were selected; and finally, deep learning networks were employed to extract the hierarchy of the features.
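
As an illustration of the kind of sequence-derived variable described in the Methods, the following is a minimal sketch that computes the number and frequency of each amino acid in a protein sequence. It is not the authors' code: the function name `composition_features`, the feature naming scheme (`count_X`, `freq_X`), and the example sequence are assumptions made only for this example.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(sequence):
    """Return the count and relative frequency of each standard amino acid.

    Hypothetical helper: non-standard characters are ignored, and
    frequencies are taken over the standard residues only.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    features = {}
    for aa in AMINO_ACIDS:
        features[f"count_{aa}"] = counts[aa]
        features[f"freq_{aa}"] = counts[aa] / total if total else 0.0
    return features


# Made-up five-residue sequence, for illustration only.
example = composition_features("MKEEQ")
print(example["count_E"], example["freq_E"])
```

Each protein then becomes one row of numeric variables, which is the form that attribute weighting and a neural network classifier would take as input.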
 
Results: The results of 10 weighting methods showed that, out of 73 characteristics describing the number and frequency of amino acids, only 40 had weights higher than zero. Of these variables, 13 gained weights higher than 0.5 and only 10 had weights above 0.09. These 10 features were selected as important variables. The frequencies of glutamine and glutamic acid obtained the highest possible weights and were considered two important features in the classification of heat-resistant and non-heat-resistant bacteria. The highest prediction accuracy of the deep learning networks was 92.42% for the classification of heat-resistant bacteria.
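
The weight-based selection reported in the Results can be sketched as follows, assuming that each weighting method yields one raw score per variable and that scores are rescaled to the range 0 to 1 before a cutoff is applied. The min-max rescaling, the function names, and the raw scores below are assumptions for illustration; the abstract does not specify how the 10 weighting methods were computed or combined.

```python
def normalize_weights(raw_scores):
    """Rescale raw attribute scores to [0, 1] by min-max normalization."""
    lo = min(raw_scores.values())
    hi = max(raw_scores.values())
    span = hi - lo
    return {
        name: (score - lo) / span if span else 0.0
        for name, score in raw_scores.items()
    }


def select_features(weights, threshold):
    """Return the names of features whose weight exceeds the threshold,
    ordered from the highest weight to the lowest."""
    kept = [name for name, w in weights.items() if w > threshold]
    return sorted(kept, key=lambda name: -weights[name])


# Hypothetical raw scores, not values from the study.
raw = {"freq_Q": 8.0, "freq_E": 8.0, "freq_A": 4.0, "count_W": 0.0}
weights = normalize_weights(raw)
print(select_features(weights, 0.0))  # variables with a weight above zero
print(select_features(weights, 0.5))  # variables above a stricter cutoff
```

Raising the cutoff shrinks the retained set, which mirrors how the abstract narrows 73 variables down to a small group of important ones.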
 
Conclusion: Deep neural networks can be used effectively to identify heat-resistant bacteria based on their protein properties.
Keywords: Thermostable, Protein sequence, Classification, Deep learning networks.
Type of Study: Original Article | Subject: Statistics
Received: 2020/01/2 | Accepted: 2020/06/23 | Published: 2020/06/30
Ahsan R, Ebrahimi M. Identification of Heat-Resistant Bacteria Based on Selection of Proper Representation of Protein Sequences Using Deep Learning Approach. Qom Univ Med Sci J 2020;14(3):54-63
URL: http://journal.muq.ac.ir/article-1-2704-en.html


Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Qom University of Medical Sciences Journal