Systems Engineering and Electronics ›› 2021, Vol. 43 ›› Issue (9): 2526-2534. doi: 10.12305/j.issn.1001-506X.2021.09.20
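Dedi LAI, Zhihui LUO, Yinglong MA*
Received: 2020-07-29
Online: 2021-08-20
Published: 2021-08-26
Corresponding author: Yinglong MA
About the authors: Dedi LAI (born 1998), male, master's student, research interests: multi-label classification and natural language processing | Zhihui LUO (born 1999), male, master's student, research interests: multi-label classification and natural language processing | Yinglong MA (born 1976), male, professor, PhD, research interests: artificial intelligence and knowledge engineering, big data analysis and processing, software engineering, etc.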
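Abstract:
Classifier chain models usually determine the label order at random, and this order can strongly affect the performance of the chain. To address this problem, co-occurrence analysis is used to mine the latent relationships among labels, and two label order optimization strategies are proposed to improve classifier chain performance: one based on a greedy algorithm and one based on an n-gram model. The greedy strategy generates an optimized label order by computing and ranking the co-occurrence rates between labels, whereas the n-gram strategy generates one by maximizing the conditional probabilities between labels. Experiments on several multi-label benchmark datasets show that, compared with currently popular classifier chain models, the two proposed strategies are competitive and noticeably improve multi-label classification performance.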
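The two ordering ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact definitions, so the sketch assumes that the co-occurrence rate of two labels is the fraction of training instances carrying both, that the greedy strategy walks label pairs from highest to lowest rate, and that the n-gram strategy is a bigram chain that starts from the most frequent label and repeatedly picks the label with the highest conditional probability given the previous one. The function names `cooccurrence_counts`, `greedy_order`, and `bigram_order`, and the toy matrix `Y`, are hypothetical.

```python
from itertools import combinations


def cooccurrence_counts(Y):
    """Return (per-label counts, pairwise co-occurrence counts) for a 0/1 label matrix."""
    n_labels = len(Y[0])
    count = [0] * n_labels
    co = [[0] * n_labels for _ in range(n_labels)]
    for row in Y:
        active = [j for j, v in enumerate(row) if v]
        for j in active:
            count[j] += 1
        for a, b in combinations(active, 2):
            co[a][b] += 1
            co[b][a] += 1
    return count, co


def greedy_order(Y):
    """Assumed greedy strategy: rank label pairs by co-occurrence rate
    (fraction of instances carrying both labels) and place labels in the
    order in which they first appear in that ranking."""
    count, co = cooccurrence_counts(Y)
    n_samples, n_labels = len(Y), len(count)
    pairs = sorted(
        combinations(range(n_labels), 2),
        key=lambda p: (-co[p[0]][p[1]] / n_samples, p),  # ties: lower indices first
    )
    order = []
    for i, j in pairs:
        for label in (i, j):
            if label not in order:
                order.append(label)
    # A single-label problem has no pairs, so add anything still missing.
    order += [label for label in range(n_labels) if label not in order]
    return order


def bigram_order(Y):
    """Assumed bigram (n = 2) strategy: start from the most frequent label,
    then repeatedly append the unplaced label j that maximizes
    P(j | prev) = co[prev][j] / count[prev]."""
    count, co = cooccurrence_counts(Y)
    remaining = set(range(len(count)))
    order = [max(remaining, key=lambda j: (count[j], -j))]
    remaining.discard(order[0])
    while remaining:
        prev = order[-1]

        def cond_prob(j):
            return co[prev][j] / count[prev] if count[prev] else 0.0

        nxt = max(remaining, key=lambda j: (cond_prob(j), -j))  # ties: lower index
        order.append(nxt)
        remaining.discard(nxt)
    return order


# Toy label matrix: 8 instances, 4 labels (illustrative data, not from the paper).
Y = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 1, 0, 1],
]
print(greedy_order(Y))  # [0, 1, 2, 3]
print(bigram_order(Y))  # [3, 1, 0, 2]
```

On this toy data the two orders differ because the pair ranking starts from the strongest pair (labels 0 and 1), while the bigram chain starts from the most frequent label (label 3). Either order could then be passed to a classifier chain implementation that accepts an explicit label order.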
Dedi LAI, Zhihui LUO, Yinglong MA. Label order optimization method of classifier chains based on co-occurrence analysis[J]. Systems Engineering and Electronics, 2021, 43(9): 2526-2534.
Table 3 Accuracy comparison of different algorithms
Dataset | CC | BR | LOCC | PwRakel | GCC | NCC |
yeast | 0.4585 | 0.4636 | 0.4649 | 0.4727 | 0.4832 | 0.4802 |
emotions | 0.3851 | 0.3728 | 0.3665 | 0.3709 | 0.3789 | 0.3858 |
enron | 0.4034 | 0.4034 | 0.3997 | 0.4087 | 0.3992 | 0.4068 |
Slashdot-F | 0.3945 | 0.4065 | 0.4147 | 0.4060 | 0.4111 | 0.4239 |
CAL500 | 0.2210 | 0.2191 | 0.2233 | 0.2135 | 0.2357 | 0.2302 |
Table 4 F1 comparison of different algorithms
Dataset | CC | BR | LOCC | PwRakel | GCC | NCC |
yeast | 0.5585 | 0.5378 | 0.5505 | 0.5577 | 0.5594 | 0.5717 |
emotions | 0.6563 | 0.6516 | 0.6601 | 0.6674 | 0.6787 | 0.6743 |
enron | 0.5834 | 0.5843 | 0.5860 | 0.5850 | 0.5900 | 0.5866 |
Slashdot-F | 0.6503 | 0.6422 | 0.6538 | 0.6467 | 0.6549 | 0.6563 |
CAL500 | 0.5098 | 0.5088 | 0.5104 | 0.5084 | 0.5159 | 0.5106 |