Association multi-classification algorithm based on protein secondary structure sequence

Journal of Systems Engineering and Electronics, 2010, Vol. 32, Issue 6: 1318-1324. doi: 10.3969/j.issn.1001-506X.2010.06.043

YANG Bing-ru, ZHOU Zhun, HOU Wei

Information Engineering School, Univ. of Science and Technology Beijing, Beijing 100083, China

Online: 2010-06-28 Published: 2010-01-03

Abstract

Protein secondary structure prediction is widely recognized as a hard problem in bioinformatics. Building on extended research into the knowledge discovery theory based on inner cognitive mechanism (KDTICM) and on the knowledge discovery in database* (KDD*) model, this paper proposes a multi-classification algorithm based on structure sequences, SAC (structural association classification), which can effectively address protein secondary structure prediction. By refining the knowledge base with a threshold (stated as a support threshold in the Chinese abstract and as a confidence threshold in the English abstract), the algorithm achieves a prediction accuracy above 85%. With SAC as its core, a protein secondary structure prediction model, the compound pyramid model, is built. Experiments show that its prediction accuracy exceeds 80% on the RS126, CB513 and ILP datasets, which matches or exceeds the best known levels reported nationally and internationally.

YANG Bing-ru, ZHOU Zhun, HOU Wei. Association multi-classification algorithm based on protein secondary structure sequence[J]. Journal of Systems Engineering and Electronics, 2010, 32(6): 1318-1324.
