[1] GAO L J, MAO Q R. Environment-assisted multitasking mixed sound event detection method[J]. Computer Science, 2020, 47(1): 159-164. (in Chinese)
[2] ORTEGA J D S, CARDINAL P, KOERICH A L. Emotion recognition using fusion of audio and video features[C]//Proc. of the IEEE International Conference on Systems, Man and Cybernetics, 2019.
[3] LI W, LI S. Understanding digital audio: a review of general audio/ambient sound based computer audition[J]. Journal of Fudan University (Natural Science), 2019, 58(3): 269-313. (in Chinese)
[4] MESAROS A, HEITTOLA T, VIRTANEN T. Acoustic scene classification: an overview of DCASE 2017 challenge entries[C]//Proc. of the 16th International Workshop on Acoustic Signal Enhancement, 2018.
[5] SUN F J, WANG M J, XU Q H, et al. Acoustic scene recognition based on convolutional neural networks[C]//Proc. of the IEEE 4th International Conference on Signal and Image Processing, 2019.
[6] BASBUG A M, SERT M. Acoustic scene classification using spatial pyramid pooling with convolutional neural networks[C]//Proc. of the IEEE 13th International Conference on Semantic Computing, 2019.
[7] NARANJO-ALCAZAR J, PEREZ-CASTANOS S, ZUCCARELLO P, et al. Acoustic scene classification with squeeze-excitation residual networks[J]. IEEE Access, 2020, 8: 112287-112296. doi: 10.1109/ACCESS.2020.3002761
[8] PHAYE S S R, BENETOS E, WANG Y. SubSpectralNet: using sub-spectrogram based convolutional neural networks for acoustic scene classification[C]//Proc. of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019.
[9] KOUTINI K, EGHBAL-ZADEH H, DORFER M, et al. The receptive field as a regularizer in deep convolutional neural networks for acoustic scene classification[C]//Proc. of the 27th European Signal Processing Conference, 2019.
[10] JIN X, WU L, LI X D, et al. ILGNet: inception modules with connected local and global features for efficient image aesthetic quality classification using domain adaptation[J]. IET Computer Vision, 2019, 13(2): 206-212. doi: 10.1049/iet-cvi.2018.5249
[11] ESMAEILPOUR M, CARDINAL P, KOERICH A L. A robust approach for securing audio classification against adversarial attacks[J]. IEEE Trans. on Information Forensics and Security, 2020, 15: 2147-2159. doi: 10.1109/TIFS.2019.2956591
[12] LEE J, PARK J, KIM K L, et al. SampleCNN: end-to-end deep convolutional neural networks using very small filters for music classification[J]. Applied Sciences, 2018, 8(2): 150.
[13] ZHANG W Y, SUN M, WANG L, et al. End-to-end overlapped speech detection and speaker counting with raw waveform[C]//Proc. of the IEEE Automatic Speech Recognition and Understanding Workshop, 2020.
[14] RAJAN V, BRUTTI A, CAVALLARO A. ConflictNET: end-to-end learning for speech-based conflict intensity estimation[J]. IEEE Signal Processing Letters, 2019, 26(11): 1668-1672. doi: 10.1109/LSP.2019.2944004
[15] UBALE R, RAMANARAYANAN V, QIAN Y, et al. Native language identification from raw waveforms using deep convolutional neural networks with attentive pooling[C]//Proc. of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019.
[16] KRISHNA D N, AMRUTH A, REDDY S S, et al. Language independent gender identification from raw waveform using multi-scale convolutional neural networks[C]//Proc. of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2020.
[17] DENG Y C, WANG Y R, CHEN S H, et al. Recent progress of Mandarin spontaneous speech recognition on Mandarin conversation dialogue corpus[C]//Proc. of the 22nd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2020.
[18] WU B, YU M, CHEN L W, et al. Improving speech enhancement with phonetic embedding features[C]//Proc. of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019.
[19] FENG Y J, ZHANG Y, XU X. End-to-end speech recognition system based on improved CLDNN structure[C]//Proc. of the IEEE 8th Joint International Information Technology and Artificial Intelligence Conference, 2019.
[20] XU T J, LI H, ZHANG H, et al. Improve data utilization with two-stage learning in CNN-LSTM-based voice activity detection[C]//Proc. of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019.
[21] ZAZO R, SAINATH T N, SIMKO G, et al. Feature learning with raw-waveform CLDNNs for voice activity detection[C]//Proc. of the Interspeech, 2016.
[22] HUANG T Y, LI J L, CHANG C M, et al. A dual-complementary acoustic embedding network learned from raw waveform for speech emotion recognition[C]//Proc. of the 8th International Conference on Affective Computing and Intelligent Interaction, 2019.
[23] TOKOZUME Y, HARADA T. Learning environmental sounds with end-to-end convolutional neural network[C]//Proc. of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2017.
[24] EBRAHIMPOUR M, SHEA T, DANIELESCU A, et al. End-to-end auditory object recognition via inception nucleus[C]//Proc. of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2020.
[25] XIAO H C, GUO J F, ZHANG L. The application of improved Mel-frequency cepstral coefficients technology in the feature extraction of low-altitude aircraft[J]. Applied Acoustics, 2018, 37(6): 77-83. (in Chinese)
[26] LU H T, ZHANG Q C. Applications of deep convolutional neural network in computer vision[J]. Journal of Data Acquisition & Processing, 2016, 31(1): 1-17. (in Chinese)