Systems Engineering and Electronics ›› 2024, Vol. 46 ›› Issue (9): 3060-3069. doi: 10.12305/j.issn.1001-506X.2024.09.18
• Systems Engineering •
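About the author:
Tingyu ZHANG (born 1993), male, PhD candidate. Research interests: reliability analysis of electronic devices, reliability optimization design of power supply systems, and system reliability.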
Tingyu ZHANG1,2, Ying ZENG1,2,*, Nan LI3, Hongzhong HUANG1,2
Received: 2023-10-09
Online: 2024-08-30
Published: 2024-09-12
Contact: Ying ZENG
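Abstract:
To enable flexible and efficient grid connection of spacecraft power systems and to maximize the use of limited energy, this paper proposes a topology optimization model for a composite power-transmission and signal-transmission network based on deep reinforcement learning (DRL), and uses several interpretable component models built on the knowledge distillation principle to analyze the optimization process. First, the transformation law of the spacecraft bus-voltage regulation control domain during on-orbit operation is analyzed and combined with node propagation parameters to establish a composite network topology model of power transmission and signal communication. Then, the asynchronous advantage actor-critic (A3C) algorithm is used to adaptively optimize potential operational reliability risks in the routing distribution, topology, and other aspects of the signal transmission network. Finally, knowledge distillation is applied to the trained DRL model with several interpretable components, yielding an interpretable quantitative analysis method. The proposed method can guide a space power supply in selecting the best grid-connection scheme under random shadowing, and provides theoretical support for the design of space power controllers under more demanding mission requirements and more complex environments.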
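The advantage signal that an actor-critic method such as A3C uses to weight policy updates can be illustrated with a short sketch. This is a generic n-step advantage estimate, not the paper's implementation; the function name `n_step_advantages`, the discount value, and the toy rewards and critic values are all assumptions made for illustration.

```python
from typing import List


def n_step_advantages(
    rewards: List[float],
    values: List[float],
    bootstrap_value: float,
    discount: float = 0.99,  # assumed value; the abstract does not state one
) -> List[float]:
    """Return A_t = G_t - V(s_t) for every step of a short rollout.

    G_t is the discounted sum of the rewards from step t to the end of the
    rollout, plus the discounted critic estimate `bootstrap_value` for the
    state reached after the last step.
    """
    if len(rewards) != len(values):
        raise ValueError("rewards and values must have the same length")

    advantages = [0.0] * len(rewards)
    running_return = bootstrap_value
    # Walk backwards so each return reuses the one computed after it.
    for t in reversed(range(len(rewards))):
        running_return = rewards[t] + discount * running_return
        advantages[t] = running_return - values[t]
    return advantages


# Hypothetical two-step rollout collected by one asynchronous worker.
example = n_step_advantages(
    rewards=[1.0, 1.0], values=[0.5, 0.5], bootstrap_value=0.0, discount=0.5
)
```

In an A3C-style setup, each worker would compute such advantages on its own rollout and use them to scale the policy gradient, while the critic is regressed toward the corresponding returns.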
Tingyu ZHANG, Ying ZENG, Nan LI, Hongzhong HUANG. Spacecraft power-signal composite network optimization algorithm based on DRL[J]. Systems Engineering and Electronics, 2024, 46(9): 3060-3069.
Table 1. Statistics of significant transition-reward states under the GRU-RS-TA-AR model
Nodes involved in the reward (V_E, V_C) | Current transition state s_t → s_{t+1} | Immediate reward R(s_{t+1}, a), magnitude ≥ γ (0.6) | Long-term cumulative discounted reward G_t |
{v_{E,21}, v_{E,24}, v_{E,29}, v_{E,35}} | 78 | 0.69 | 52.31 |
{v_{E,33}, v_{E,47}, v_{E,48}, v_{E,79}, v_{E,80}} | 164 | 0.62 | 61.24 |
{v_{E,33}, v_{E,47}, v_{E,48}, v_{E,49}, v_{E,52}} | 420 | -0.8 | 77.17 |
{v_{E,29}, v_{E,35}, v_{E,70}, v_{E,79}} | 633 | 0.77 | 124.95 |
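The two reward quantities in the table can be sketched as follows: a long-term cumulative discounted reward G_t and a filter that keeps only transitions whose immediate reward has magnitude at least the threshold γ = 0.6. This is a minimal illustration under assumptions, not the paper's procedure: the helper names, the discount factor, and the toy reward sequence are hypothetical, and the table does not state which discount factor was used.

```python
from typing import List, Tuple


def discounted_returns(rewards: List[float], discount: float) -> List[float]:
    """Compute G_t = r_t + discount * G_{t+1} for every step t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Accumulate from the last step backwards.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + discount * running
        returns[t] = running
    return returns


def significant_transitions(
    rewards: List[float], threshold: float = 0.6
) -> List[Tuple[int, float]]:
    """Keep (step, reward) pairs whose reward magnitude reaches the threshold."""
    return [(t, r) for t, r in enumerate(rewards) if abs(r) >= threshold]


# Toy reward sequence (hypothetical values, not data from the paper).
toy_rewards = [0.69, 0.10, -0.80, 0.59]
kept = significant_transitions(toy_rewards)          # steps 0 and 2 pass
returns = discounted_returns([1.0, 0.0, 2.0], 0.5)   # assumed discount 0.5
```

Filtering on the absolute value is what lets a negative reward such as -0.8 count as significant alongside the positive entries.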