Abstract:
Cell-penetrating peptides (CPPs) facilitate the transport of pharmacologically active molecules, such as plasmid DNA, short interfering RNA (siRNA), nanoparticles, and small peptides. The accurate identification of new and unique CPPs is the initial step toward gaining insight into CPP activity. Experiments can provide detailed insight into the cell-penetration property of CPPs. However, synthesizing and identifying CPPs through wet-lab experiments is costly in both resources and time. Therefore, an efficient prediction tool is essential for identifying unique CPPs prior to experiments. To this end, we developed a kernel extreme learning machine (KELM) based CPP prediction model called KELM-CPPpred. The main dataset used in this study consists of 408 CPPs and an equal number of non-CPPs. The input features used to train the proposed prediction model include amino acid composition (AAC), dipeptide amino acid composition (DAC), pseudo-amino acid composition (PseAAC), and motif-based hybrid features. We further used an independent dataset to validate the proposed model. In addition, we compared the prediction accuracy of the KELM-CPPpred models with that of existing artificial neural network (ANN), random forest (RF), and support vector machine (SVM) approaches on the respective benchmark datasets used in previous studies. Empirical tests showed that KELM-CPPpred outperformed the existing prediction approaches based on SVM, RF, and ANN. We developed a web interface named KELM-CPPpred, which is freely available at http://sairam.people.iitgn.ac.in/KELM-CPPpred.html
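
For readers unfamiliar with the composition features named above, the following is a minimal illustrative sketch of how AAC (20 values) and DAC (400 values) could be computed from a peptide sequence. It is not the KELM-CPPpred implementation. The function names `aac` and `dac`, the normalization to fractions that sum to 1, and the rejection of non-standard residues are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def _validate(sequence, min_length):
    """Upper-case the sequence and reject inputs this sketch cannot handle."""
    seq = sequence.strip().upper()
    if len(seq) < min_length:
        raise ValueError(f"sequence must have at least {min_length} residue(s)")
    unknown = set(seq) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return seq


def aac(sequence):
    """Amino acid composition: fraction of each of the 20 residues."""
    seq = _validate(sequence, 1)
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def dac(sequence):
    """Dipeptide composition: fraction of each of the 400 adjacent pairs."""
    seq = _validate(sequence, 2)
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = Counter(pairs)
    return [counts[a + b] / len(pairs)
            for a, b in product(AMINO_ACIDS, repeat=2)]


if __name__ == "__main__":
    peptide = "YGRKKRRQRRR"  # example input sequence, chosen for illustration
    print(len(aac(peptide)), len(dac(peptide)))
```

Concatenating the two vectors would give a 420-dimensional feature vector per peptide. How such features are scaled or combined with PseAAC and motif information in the actual model is not specified here.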