Article type: Research Article
Authors: Liang, Xingzhu [a, b] | Liu, Wen [a, *] | Bi, Feilong [a] | Yan, Xinyun [c] | Zhang, Chunjiong [d]
Affiliations: [a] School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan, China | [b] Institute of Environment-Friendly Materials and Occupational Health, Anhui University of Science and Technology, Wuhu, China | [c] Jinling Institute of Technology, Jiangsu AI Transportation Innovations & Applications Engineering Research Center, Nanjing, China | [d] College of Electronics and Information Engineering, Tongji University, Shanghai, China
Correspondence: [*] Corresponding author. Wen Liu, School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan, 232001, China. E-mail: [email protected].
Abstract: Online knowledge distillation dispenses with the predetermined pairing of a strong teacher model and a weak student model, offering a new way of thinking about knowledge distillation. However, current online methods mostly rely on the logits-based prediction distribution and rarely use features, which contain rich semantic information. Even the feature-based methods operate only on the last layer of the network and do not further explore the representation knowledge in intermediate-layer feature maps. To address these issues, we propose a feature early fusion and reconstruction (FEFR) method for online knowledge distillation with four essential components: multi-scale feature extraction with early fusion of intermediate-layer features, feature reconstruction, dual attention, and an overall fusion module. Early fusion applies a “sum” operation to feature matrices from different layers, moving fusion earlier in the network to improve the feature-map representation. The features are then reconstructed to enhance communication between groups when extracting features. A dual-attention module adaptively enhances the critical channels and spatial regions in order to collect more accurate information. Finally, the previously processed feature maps are combined by the overall fusion module, which also aids the training of the student models. Experiments with several network architectures on CIFAR-10, CIFAR-100, CINIC-10 and ImageNet 2012 show that FEFR provides more useful representation knowledge for distillation and improves accuracy by about 0.5% compared to other methods.
Keywords: Online knowledge distillation, teacher-student models, multi-scale, feature early fusion
DOI: 10.3233/JIFS-232626
Journal: Journal of Intelligent & Fuzzy Systems, vol. 45, no. 6, pp. 9471-9482, 2023
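The early fusion by “sum” and the dual attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`resize_nearest`, `early_fusion_sum`, `dual_attention`), the nearest-neighbour resizing, the assumption that all feature maps share the same channel count, and the parameter-free sigmoid gates are all assumptions made for illustration only.

```python
import numpy as np


def resize_nearest(x, size):
    """Nearest-neighbour resize of a (C, H, W) feature map to (C, size, size)."""
    _, h, w = x.shape
    rows = np.arange(size) * h // size  # source row for each target row
    cols = np.arange(size) * w // size  # source column for each target column
    return x[:, rows][:, :, cols]


def early_fusion_sum(features, size):
    """Bring feature maps from different layers to one spatial size and
    fuse them by element-wise sum (assumes equal channel counts)."""
    return sum(resize_nearest(f, size) for f in features)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def dual_attention(x):
    """Hypothetical parameter-free stand-in for channel + spatial attention:
    gate each channel by its global average, then gate each spatial
    position by its average over channels."""
    channel_gate = sigmoid(x.mean(axis=(1, 2), keepdims=True))  # (C, 1, 1)
    x = x * channel_gate
    spatial_gate = sigmoid(x.mean(axis=0, keepdims=True))       # (1, H, W)
    return x * spatial_gate


# Three hypothetical intermediate feature maps at different scales.
features = [np.ones((4, 8, 8)), np.ones((4, 4, 4)), np.ones((4, 2, 2))]
fused = early_fusion_sum(features, size=4)
attended = dual_attention(fused)
```

In a real network the gates would be learned (for example with small convolutional or fully connected layers), and channel counts would typically be aligned with 1×1 convolutions before summing; the sketch only shows the data flow.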