Journal of Systems Engineering and Electronics, 2010, Vol. 32, Issue (8): 1775-1779. doi: 10.3969/j.issn.1001-506X.2010.08.47

• Electronic Technology •

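An outlier detection method based on a hybrid strategy

TIAN Jiang, GU Hong

  1. (School of Electronic and Information Engineering, Dalian University of Technology, Dalian 116023, Liaoning, China)
  • Publication date: 2010-08-13 Release date: 2010-01-03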

Outlier detection method based on hybrid strategies

TIAN Jiang, GU Hong

  1. (School of Electronic and Information Engineering, Dalian Univ. of Technology, Dalian 116023, China)
  • Online: 2010-08-13 Published: 2010-01-03

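Abstract (translated from the Chinese):

Outlier detection faces two problems: imbalanced data and cost sensitivity. A modified one-class support vector machine (SVM) is used to reconstruct the dataset and, combined with a cost-sensitive SVM, a hybrid-strategy detection method is proposed. First, different weights are set in the optimization of the conventional one-class SVM, and the resulting hyperplane is used to eliminate some of the normal samples and thereby balance the dataset; the reconstruction preserves the outlier information and can also overcome data overlap. A cost-sensitive SVM is then trained on the samples, with receiver operating characteristic (ROC) analysis as the criterion for searching for the optimal parameters, and the threshold is then adjusted to obtain the outlier detection model. Simulation results show that the proposed method improves detection accuracy while effectively reducing the total misclassification cost.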

Abstract:

Outlier detection is an important problem that must address two issues simultaneously: imbalanced datasets and cost sensitivity. First, a new re-sampling algorithm based on a modified one-class support vector machine (SVM) is presented, and then a two-stage outlier detection approach that combines it with a cost-sensitive SVM is designed. In the first stage, different low weights are set for the outliers, and some normal points are removed proportionally by the hyperplane in feature space, which can overcome the effect of overlapping data points. In the second stage, receiver operating characteristic (ROC) analysis is applied to select the optimal parameters of the cost-sensitive SVM within a limited grid search range, and the detection decision function is finally obtained after adjusting the threshold. Experimental results show that the proposed method improves classification accuracy and effectively decreases the misclassification cost.
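
The final step described in the abstract (ROC analysis followed by threshold adjustment) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that a trained classifier already produces a decision score per sample (higher means more outlier-like), and that the threshold is chosen to minimize the total misclassification cost, which is one plausible reading of "adjusting the threshold". The function names (`roc_points`, `auc`, `total_cost`, `best_threshold`), the example scores, and the cost values are all hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float],
               labels: Sequence[int]) -> List[Tuple[float, float, float]]:
    """Return (threshold, FPR, TPR) for every distinct score.

    A sample is flagged as an outlier (label 1) when score >= threshold.
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, fp / neg, tp / pos))
    return points


def auc(points: List[Tuple[float, float, float]]) -> float:
    """Trapezoidal area under the ROC curve, starting from (0, 0)."""
    area, prev_fpr, prev_tpr = 0.0, 0.0, 0.0
    for _, fpr, tpr in points:
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_fpr, prev_tpr = fpr, tpr
    return area


def total_cost(scores: Sequence[float], labels: Sequence[int],
               threshold: float, cost_fn: float, cost_fp: float) -> float:
    """Total misclassification cost at a given threshold.

    cost_fn: assumed cost of missing an outlier (false negative).
    cost_fp: assumed cost of flagging a normal sample (false positive).
    """
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return cost_fn * fn + cost_fp * fp


def best_threshold(scores: Sequence[float], labels: Sequence[int],
                   cost_fn: float, cost_fp: float) -> float:
    """Pick the candidate threshold with the lowest total cost."""
    candidates = sorted(set(scores), reverse=True)
    return min(candidates,
               key=lambda t: total_cost(scores, labels, t, cost_fn, cost_fp))


if __name__ == "__main__":
    # Hypothetical validation scores; label 1 marks an outlier.
    scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1, 0.05]
    labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

    print("AUC:", auc(roc_points(scores, labels)))
    t = best_threshold(scores, labels, cost_fn=10, cost_fp=1)
    print("threshold:", t, "cost:", total_cost(scores, labels, t, 10, 1))
```

In a full pipeline the AUC would be computed for each candidate parameter setting of the cost-sensitive SVM, and the threshold search would run only on the scores of the best setting. The cost ratio of 10 to 1 is an illustrative assumption and does not come from the paper.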