Fast CU Partition Algorithm for Intra Frame Coding Based on Joint Texture Classification and CNN
Abstract
1. Introduction
- A classification decision method based on the global and local texture features of the CU is proposed, which efficiently divides CUs into smooth and complex texture regions. CUs in smooth texture regions are not divided further, which avoids redundant CU partitioning.
- A novel CNN based on a modified depthwise separable convolution is designed to predict the CU partition in complex texture regions. It replaces the RDO process of the traditional CU partition decision and effectively reduces the complexity of CU partitioning while maintaining RD performance.
- Combining the texture classification decision with the proposed CNN provides early termination for CUs in smooth texture regions and direct CNN prediction for CUs in complex texture regions, achieving a good balance between coding complexity and coding performance.
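The three contributions combine into one recursive decision per CU. The following is a minimal sketch of that control flow under stated assumptions, not the authors' implementation: `is_smooth` and `predict_split` are hypothetical callables standing in for the texture classifier and the CNN, and the 64 × 64 to 8 × 8 size range is the usual HEVC quadtree range.

```python
from typing import Callable, List, Tuple

import numpy as np

Block = Tuple[int, int, int]  # (row, col, size) of a leaf CU


def partition_cu(
    frame: np.ndarray,
    row: int,
    col: int,
    size: int,
    is_smooth: Callable[[np.ndarray], bool],      # hypothetical texture classifier
    predict_split: Callable[[np.ndarray], bool],  # hypothetical CNN split decision
    min_size: int = 8,
) -> List[Block]:
    """Return the leaf CUs of a quadtree partition without any RDO search."""
    cu = frame[row:row + size, col:col + size]
    # Early termination: smallest CU reached, smooth texture, or CNN says "no split".
    if size <= min_size or is_smooth(cu) or not predict_split(cu):
        return [(row, col, size)]
    half = size // 2
    leaves: List[Block] = []
    for dr in (0, half):
        for dc in (0, half):
            leaves += partition_cu(
                frame, row + dr, col + dc, half, is_smooth, predict_split, min_size
            )
    return leaves
```

In this sketch the CNN callable is consulted only when the texture test reports a complex region, which mirrors the division of labour described in the third contribution.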
2. Proposed Fast CU Partitioning Algorithm
2.1. Observation and Motivation
2.2. Texture Judgment Decision
Algorithm 1: Texture judgment decision
Input: Size of current CU, THA, THB
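Only the inputs of Algorithm 1 are available here, so the following is an illustrative sketch rather than the published procedure. The feature definitions are assumptions: the global measure is taken as the pixel variance of the whole CU and the local measure as the largest variance among its four sub-blocks. Only the threshold names THA and THB come from the algorithm header.

```python
import numpy as np


def texture_judgment(cu: np.ndarray, tha: float, thb: float) -> str:
    """Classify a square CU as 'smooth' or 'complex' (assumed feature definitions)."""
    cu = cu.astype(np.float64)
    # Assumed global feature: variance over the whole CU, compared with THA.
    global_var = cu.var()
    # Assumed local feature: largest variance among the four sub-blocks, compared with THB.
    half = cu.shape[0] // 2
    sub_blocks = [cu[r:r + half, c:c + half] for r in (0, half) for c in (0, half)]
    local_var = max(block.var() for block in sub_blocks)
    if global_var < tha and local_var < thb:
        return "smooth"   # partitioning would stop here
    return "complex"      # the CU would be passed on to the CNN
```

A constant block is classified as smooth for any positive thresholds, while a checkerboard block has a large variance at both scales and is classified as complex.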
2.3. Proposed Network
2.4. Fast CU Partitioning Algorithm
3. Experimental Results and Performance Analysis
3.1. Performance Evaluation Index
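The result tables report ΔT as a negative percentage, which is consistent with the usual definition of encoding-time change relative to the reference encoder. Assuming that convention (the symbols $T_{\mathrm{prop}}$ and $T_{\mathrm{ref}}$ are introduced here for illustration and denote the encoding times of the proposed and reference encoders):

```latex
\Delta T = \frac{T_{\mathrm{prop}} - T_{\mathrm{ref}}}{T_{\mathrm{ref}}} \times 100\%
```

For example, a reference encoding time of 100 s and a proposed encoding time of 40 s would give ΔT = −60%.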
3.2. Experimental Parameter Configuration
3.3. Optimal Threshold Decision
3.4. Ablation Experiment
3.5. Reduced Complexity and RD Performance Evaluation
4. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Falkowski-Gilski, P.; Uhl, T. Current trends in consumption of multimedia content using online streaming platforms: A user-centric survey. Comput. Sci. Rev. 2020, 37, 100268.
2. Falkowski-Gilski, P. On the consumption of multimedia content using mobile devices: A year to year user case study. Arch. Acoust. 2020, 45, 321–328.
3. Sullivan, G.J.; Ohm, J.R.; Han, W.J.; Wiegand, T. Overview of the high efficiency video coding (HEVC) standard. IEEE Trans. Circuits Syst. Video Technol. 2012, 22, 1649–1668.
4. Wiegand, T.; Sullivan, G.J.; Bjontegaard, G.; Luthra, A. Overview of the H.264/AVC video coding standard. IEEE Trans. Circuits Syst. Video Technol. 2003, 13, 560–576.
5. Bossen, F.; Bross, B.; Suhring, K.; Flynn, D. HEVC Complexity and Implementation Analysis. IEEE Trans. Circuits Syst. Video Technol. 2012, 22, 1685–1696.
6. Guo, H.; Zhu, C.; Xu, M.; Li, S. Inter-Block Dependency-Based CTU Level Rate Control for HEVC. IEEE Trans. Broadcast. 2020, 66, 113–126.
7. Jamali, M.; Coulombe, S. Fast HEVC Intra Mode Decision Based on RDO Cost Prediction. IEEE Trans. Broadcast. 2019, 65, 109–122.
8. Bross, B.; Wang, Y.K.; Ye, Y.; Liu, S.; Chen, J.; Sullivan, G.J.; Ohm, J.R. Overview of the versatile video coding (VVC) standard and its applications. IEEE Trans. Circuits Syst. Video Technol. 2021, 31, 3736–3764.
9. Wu, S.; Shi, J.; Chen, Z. HG-FCN: Hierarchical Grid Fully Convolutional Network for Fast VVC Intra Coding. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 5638–5649.
10. The Bitmovin. Video Developer Report. Available online: https://go.bitmovin.com/video-developer-report (accessed on 27 November 2018).
11. Qi, M.B.; Chen, X.L.; Yang, Y.F.; Jiang, J.G.; Jin, Y.L.; Zhang, J.J. Fast coding unit splitting algorithm for high efficiency video coding intra prediction. J. Electron. Inf. Technol. 2014, 36, 1699–1705.
12. Zhang, W.B.; Chen, D.; Yao, X.Y.; Feng, Y.B. Fast intra coding unit splitting algorithm based on spatial-temporal correlation in HEVC. J. Image Graph. 2018, 23, 155–162.
13. Chen, F.; Jin, D.; Peng, Z.; Jiang, G.; Yu, M.; Chen, H. Fast intra coding algorithm for HEVC based on depth range prediction and mode reduction. Multimed. Tools Appl. 2018, 77, 28375–28394.
14. Wang, X.J.; Xue, Y.L. Fast HEVC inter prediction algorithm based on spatio-temporal block information. In Proceedings of the 2017 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), Cagliari, Italy, 7–9 June 2017.
15. Liu, X.G.; Liu, Y.B.; Wang, P.C.; Lai, C.F.; Chao, H.C. An Adaptive Mode Decision Algorithm Based on Video Texture Characteristics for HEVC Intra Prediction. IEEE Trans. Circuits Syst. Video Technol. 2017, 27, 1737–1748.
16. Zhang, Y.; Li, N.; Kwong, S.; Jiang, G.; Zeng, H. Statistical early termination and early skip models for fast mode decision in HEVC intra coding. ACM Trans. Multimed. Comput. Commun. Appl. (TOMM) 2019, 15, 1–23.
17. Fu, B.; Zhang, Q.Q.; Hu, J. Fast prediction mode selection and CU partition for HEVC intra coding. IET Image Process. 2020, 14, 1892–1900.
18. Pakdaman, F.; Yu, L.; Hashemi, M.R.; Ghanbari, M.; Gabbouj, M. SVM based approach for complexity control of HEVC intra coding. Signal Process. Image Commun. 2021, 93, 116177.
19. Amna, M.; Imen, W.; Soulef, B.; Sayadi, F.E. Machine Learning Based approaches to reduce HEVC intra coding unit partition decision complexity. Multimed. Tools Appl. 2022, 81, 2777–2802.
20. Westland, N.; Dias, A.S.; Mrak, M. Decision Trees for Complexity Reduction in Video Compression. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 15–17 November 2019.
21. Liu, Z.; Yu, X.; Gao, Y.; Chen, S.; Ji, X.; Wang, D. CU partition mode decision for HEVC hardwired intra encoder using convolution neural network. IEEE Trans. Image Process. 2016, 25, 5088–5103.
22. Xu, M.; Li, T.Y.; Wang, Z.; Deng, X.; Yang, R.; Guan, Z. Reducing complexity of HEVC: A deep learning approach. IEEE Trans. Image Process. 2018, 27, 5044–5059.
23. Huang, Y.; Song, L.; Xie, R.; Izquierdo, E.; Zhang, W. Modeling acceleration properties for flexible intra HEVC complexity control. IEEE Trans. Circuits Syst. Video Technol. 2021, 31, 4454–4469.
24. Galpin, F.; Racapé, F.; Jaiswal, S.; Bordes, P.; Léannec, F.L.; Francois, E. CNN-based driving of block partitioning for intra slices encoding. In Proceedings of the 2019 Data Compression Conference (DCC), Snowbird, UT, USA, 11–13 March 2019.
25. Zhang, Y.; Wang, G.; Tian, R.; Xu, M.; Kuo, C.J. Texture-Classification Accelerated CNN Scheme for Fast Intra CU Partition in HEVC. In Proceedings of the 2019 Data Compression Conference (DCC), Snowbird, UT, USA, 26–29 March 2019.
26. Zaki, F.; Mohamed, A.E.; Sayed, S.G. CtuNet: A Deep Learning-based Framework for Fast CTU Partitioning of H265/HEVC Intra-coding. Ain Shams Eng. J. 2021, 12, 1859–1866.
27. Feng, A.; Gao, C.; Li, L.; Liu, D.; Wu, F. CNN-Based Depth Map Prediction for Fast Block Partitioning in HEVC Intra Coding. In Proceedings of the 2021 IEEE International Conference on Multimedia and Expo (ICME), Shenzhen, China, 5–9 July 2021.
28. Tahir, M.; Taj, I.A.; Assuncao, P.A.; Muhammad, A. Fast video encoding based on random forests. J. Real-Time Image Process. 2020, 17, 1029–1049.
29. Li, Y.; Li, L.; Fang, Y.; Peng, H.; Ling, N. Bagged Tree and ResNet-Based Joint End-to-End Fast CTU Partition Decision Algorithm for Video Intra Coding. Electronics 2022, 11, 1264.
30. Yao, C.; Xu, C.; Liu, M. RDNet: Rate–Distortion-Based Coding Unit Partition Network for Intra-Prediction. Electronics 2022, 11, 916.
31. Li, N.; Zhang, Y.; Zhu, L.; Luo, W.; Kwong, S. Reinforcement learning based coding unit early termination algorithm for high efficiency video coding. J. Vis. Commun. Image Represent. 2019, 60, 276–286.
32. Gao, W.; Yang, L.; Zhang, X.; Zhou, B.; Ma, C. Based on soft-threshold wavelet denoising combining with Prewitt operator edge detection algorithm. In Proceedings of the 2010 2nd International Conference on Education Technology and Computer (ICETC), Shanghai, China, 22–24 June 2010.
33. Cho, S.I.; Kang, S.J. Gradient Prior-Aided CNN Denoiser With Separable Convolution-Based Optimization of Feature Dimension. IEEE Trans. Multimed. 2019, 21, 484–493.
34. Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020.
35. Zhang, M.; Lai, D.; Liu, Z.; An, C. A novel adaptive fast partition algorithm based on CU complexity analysis in HEVC. Multimed. Tools Appl. 2019, 78, 1035–1051.
36. Bossen, F. Common test conditions and software reference configurations. In Proceedings of the Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 5th Meeting, Geneva, Switzerland, 16–23 March 2011.
37. Xu, M.; Deng, X.; Li, S.; Wang, Z. Region-of-Interest Based Conversational HEVC Coding with Hierarchical Perception Model of Face. IEEE J. Sel. Top. Signal Process. 2014, 8, 475–489.
38. Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-scale machine learning on heterogeneous distributed systems. arXiv 2016, arXiv:1603.04467.
39. Li, T.; Xu, M.; Deng, X. A Deep Convolutional Neural Network Approach for Complexity Reduction on Intra-mode HEVC. In Proceedings of the 2017 IEEE International Conference on Multimedia and Expo (ICME), Hong Kong, China, 10–14 July 2017.
40. Correa, G.; Assuncao, P.A.; Agostini, L.V.; da Silva Cruz, L.A. Fast HEVC encoding decisions using data mining. IEEE Trans. Circuits Syst. Video Technol. 2015, 25, 660–673.
41. Najafabadi, N.; Ramezanpour, M. Mass center direction-based decision method for intraprediction in HEVC standard. J. Real-Time Image Process. 2020, 17, 1153–1168.
Input CU Size | 64 × 64 | ||||
---|---|---|---|---|---|
Layer | Layer-1 | Layer-2 | Layer-3 | Layer-4 | Layer-5 |
Output Size | 16 × 16 × 16 | 16 × 16 × 32 | 8 × 8 × 32 | 8 × 8 × 48 | 8 × 8 × 32 |
Filters | 4 × 4, 16 | Concatenate | |||
Layer | Layer-6 | Layer-7 | Layer-8 | Layer-9 | Layer-10 |
Output Size | 4 × 4 × 32 | 4 × 4 × 48 | 4 × 4 × 80 | 4 × 4 × 80 | 1 × 1 × 1280 |
Filters | Concatenate | model | Flatten | ||
Layer | Layer-11 | Layer-12 | Layer-13 | / | / |
Output Size | 64 128 256 | 48 96 192 | 1 4 16 | / | / |
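The layer table gives output sizes but not the cost of each convolution. As a rough illustration of why a depthwise separable convolution is cheaper than a standard one, the sketch below counts weights for both. It covers the plain depthwise separable form only, not the authors' modified variant, and the kernel size and channel counts are illustrative values that are not taken from the table. Biases are ignored.

```python
def conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution."""
    return k * k * c_in * c_out


def separable_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Illustrative values only (assumed, not from the paper).
k, c_in, c_out = 3, 32, 48
standard = conv_params(k, c_in, c_out)             # 13824
separable = separable_conv_params(k, c_in, c_out)  # 1824
print(standard, separable, round(standard / separable, 2))
```

For these values the separable form needs roughly 7.6 times fewer weights, which is the kind of saving that motivates using it in a lightweight partition network.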

Class | Sequence | Resolution | Frame Rate (Hz) | Number of Frames | Length (s) |
---|---|---|---|---|---|
A | People On Street | 2560 × 1600 | 30 | 150 | 5 |
A | Traffic | 2560 × 1600 | 30 | 150 | 5 |
B | Basketball Drive | 1920 × 1080 | 50 | 500 | 10 |
B | BQ Terrace | 1920 × 1080 | 60 | 600 | 10 |
B | Cactus | 1920 × 1080 | 50 | 500 | 10 |
B | Kimono | 1920 × 1080 | 24 | 240 | 10 |
B | Park Scene | 1920 × 1080 | 24 | 240 | 10 |
C | Basketball Drill | 832 × 480 | 50 | 500 | 10 |
C | BQ Mall | 832 × 480 | 60 | 600 | 10 |
C | Party Scene | 832 × 480 | 50 | 500 | 10 |
C | Race Horses | 832 × 480 | 30 | 300 | 10 |
D | Basketball Pass | 416 × 240 | 50 | 500 | 10 |
D | Blowing Bubbles | 416 × 240 | 50 | 500 | 10 |
D | BQ Square | 416 × 240 | 60 | 600 | 10 |
D | Race Horses | 416 × 240 | 30 | 300 | 10 |
E | Four People | 1280 × 720 | 60 | 600 | 10 |
E | Johnny | 1280 × 720 | 60 | 600 | 10 |
E | KristenAndSara | 1280 × 720 | 60 | 600 | 10 |

Class | Sequence | BD-BR (%) | BD-PSNR (dB) | ΔT (%), QP = 22 | ΔT (%), QP = 27 | ΔT (%), QP = 32 | ΔT (%), QP = 37 |
---|---|---|---|---|---|---|---|
A (2560 × 1600) | People On Street | 1.95 | −0.111 | −60.81 | −61.70 | −61.78 | −63.91 |
A (2560 × 1600) | Traffic | 2.10 | −0.114 | −61.26 | −65.29 | −68.10 | −71.53 |
B (1920 × 1080) | Basketball Drive | 4.04 | −0.098 | −69.18 | −74.89 | −78.35 | −78.95 |
B (1920 × 1080) | BQ Terrace | 1.45 | −0.088 | −52.04 | −54.27 | −57.98 | −61.15 |
B (1920 × 1080) | Cactus | 1.91 | −0.073 | −53.55 | −61.40 | −66.74 | −76.27 |
B (1920 × 1080) | Kimono | 1.57 | −0.056 | −83.44 | −83.10 | −83.25 | −83.96 |
B (1920 × 1080) | Park Scene | 1.68 | −0.072 | −61.03 | −72.11 | −74.44 | −75.97 |
C (832 × 480) | Basketball Drill | 2.72 | −0.131 | −42.96 | −62.12 | −62.09 | −75.16 |
C (832 × 480) | BQ Mall | 1.13 | −0.071 | −62.74 | −59.01 | −61.12 | −64.64 |
C (832 × 480) | Party Scene | 0.31 | −0.023 | −46.27 | −47.51 | −49.36 | −54.04 |
C (832 × 480) | Race Horses | 1.50 | −0.095 | −48.38 | −51.41 | −55.90 | −61.00 |
D (416 × 240) | Basketball Pass | 2.11 | −0.120 | −48.21 | −52.07 | −58.59 | −61.96 |
D (416 × 240) | Blowing Bubbles | 0.67 | −0.040 | −36.35 | −40.03 | −45.63 | −52.45 |
D (416 × 240) | BQ Square | 0.21 | −0.018 | −37.87 | −42.62 | −44.99 | −46.34 |
D (416 × 240) | Race Horses | 0.92 | −0.064 | −42.15 | −46.78 | −50.01 | −53.82 |
E (1280 × 720) | Four People | 2.58 | −0.151 | −59.95 | −62.97 | −65.49 | −68.41 |
E (1280 × 720) | Johnny | 3.90 | −0.161 | −70.25 | −72.18 | −73.39 | −74.79 |
E (1280 × 720) | KristenAndSara | 2.83 | −0.144 | −67.18 | −69.17 | −72.25 | −73.27 |
Average | Class A | 2.03 | −0.113 | −61.04 | −63.50 | −64.94 | −67.72 |
Average | Class B | 2.13 | −0.077 | −63.85 | −69.15 | −72.15 | −74.49 |
Average | Class C | 1.42 | −0.080 | −50.47 | −55.33 | −57.18 | −63.84 |
Average | Class D | 0.98 | −0.061 | −41.15 | −45.38 | −49.81 | −53.64 |
Average | Class E | 3.10 | −0.152 | −65.79 | −68.11 | −70.38 | −72.16 |
Average | Class A–E | 1.86 | −0.090 | −55.76 | −59.92 | −62.75 | −66.53 |
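The per-class averages in the table above can be reproduced from the per-sequence rows. The following is a small check using BD-BR values copied from the table for Classes A, D, and E. The helper name `class_average` is illustrative, and `Decimal` with half-up rounding is used so that a mean such as 2.025 rounds to 2.03 rather than being affected by binary floating-point representation.

```python
from decimal import Decimal, ROUND_HALF_UP


def class_average(values, places="0.01"):
    """Mean of decimal strings, rounded half-up to the given precision."""
    total = sum(Decimal(v) for v in values)
    mean = total / Decimal(len(values))
    return mean.quantize(Decimal(places), rounding=ROUND_HALF_UP)


# BD-BR (%) per sequence, copied from the table above.
bd_br = {
    "A": ["1.95", "2.10"],
    "D": ["2.11", "0.67", "0.21", "0.92"],
    "E": ["2.58", "3.90", "2.83"],
}

for name, values in bd_br.items():
    print(name, class_average(values))
# A 2.03
# D 0.98
# E 3.10
```

The same helper applies to the BD-PSNR and ΔT columns by passing a different `places` value where three decimals are reported.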

Class | Sequence | [21] BD-BR (%) | [21] BD-PSNR (dB) | [21] ΔT (%) | [22] BD-BR (%) | [22] BD-PSNR (dB) | [22] ΔT (%) | [23] BD-BR (%) | [23] ΔT (%) | Proposed BD-BR (%) | Proposed BD-PSNR (dB) | Proposed ΔT (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
A | People On Street | 3.97 | −0.21 | −55.59 | 2.37 | −0.13 | −61.00 | 1.89 | −61.30 | 1.95 | −0.11 | −62.05 |
A | Traffic | 4.95 | −0.24 | −60.84 | 2.55 | −0.13 | −70.79 | 1.74 | −63.10 | 2.10 | −0.11 | −66.54 |
B | Basketball Drive | 6.02 | −0.14 | −69.51 | 4.27 | −0.12 | −76.32 | 1.76 | −62.90 | 4.04 | −0.10 | −75.34 |
B | BQ Terrace | 4.82 | −0.27 | −57.89 | 1.84 | −0.09 | −64.72 | 1.37 | −62.30 | 1.45 | −0.09 | −56.36 |
B | Cactus | 6.02 | −0.21 | −62.98 | 2.27 | −0.08 | −60.96 | 1.85 | −63.90 | 1.91 | −0.07 | −64.49 |
B | Kimono | 2.38 | −0.08 | −72.72 | 2.59 | −0.09 | −83.53 | 0.85 | −69.00 | 1.57 | −0.06 | −83.19 |
B | Park Scene | 3.42 | −0.14 | −66.03 | 1.96 | −0.08 | −67.53 | 1.70 | −63.60 | 1.68 | −0.07 | −70.89 |
C | Basketball Drill | 12.21 | −0.54 | −63.58 | 2.86 | −0.13 | −52.98 | 3.48 | −63.80 | 2.72 | −0.13 | −60.58 |
C | BQ Mall | 8.08 | −0.47 | −52.14 | 2.09 | −0.11 | −58.42 | 2.24 | −62.30 | 1.13 | −0.07 | −61.88 |
C | Party Scene | 9.45 | −0.67 | −58.75 | 0.66 | −0.04 | −44.49 | 1.70 | −56.00 | 0.30 | −0.02 | −49.30 |
C | Race Horses | 4.42 | −0.26 | −58.19 | 1.97 | −0.11 | −57.12 | 1.45 | −62.40 | 1.50 | −0.10 | −54.17 |
D | Basketball Pass | 8.40 | −0.46 | −64.02 | 1.84 | −0.11 | −56.42 | 2.09 | −62.09 | 2.11 | −0.12 | −55.21 |
D | Blowing Bubbles | 8.33 | −0.46 | −60.78 | 0.62 | −0.04 | −40.54 | 2.05 | −56.00 | 0.67 | −0.04 | −43.62 |
D | BQ Square | 2.56 | −0.21 | −46.72 | 0.91 | −0.07 | −45.82 | 1.50 | −47.90 | 0.21 | −0.02 | −42.95 |
D | Race Horses | 4.95 | −0.32 | −57.29 | 1.32 | −0.08 | −55.75 | 1.65 | −57.70 | 0.91 | −0.06 | −48.19 |
E | Four People | 8.00 | −0.44 | −61.54 | 3.11 | −0.17 | −71.31 | 2.30 | −62.80 | 2.58 | −0.15 | −64.21 |
E | Johnny | 7.96 | −0.31 | −66.55 | 3.82 | −0.15 | −70.68 | 2.61 | −69.00 | 3.90 | −0.16 | −72.66 |
E | KristenAndSara | 5.48 | −0.27 | −64.72 | 3.46 | −0.17 | −74.86 | 1.88 | −64.40 | 2.83 | −0.14 | −70.47 |
Average | Class A | 4.46 | −0.23 | −58.22 | 2.46 | −0.13 | −65.90 | 1.82 | −62.20 | 2.05 | −0.11 | −64.30 |
Average | Class B | 4.53 | −0.17 | −65.83 | 2.59 | −0.09 | −70.61 | 1.51 | −64.30 | 2.13 | −0.08 | −69.86 |
Average | Class C | 8.54 | −0.49 | −58.17 | 1.90 | −0.10 | −53.25 | 2.22 | −61.10 | 1.41 | −0.08 | −56.48 |
Average | Class D | 6.06 | −0.36 | −57.20 | 1.17 | −0.08 | −49.63 | 1.82 | −56.10 | 0.98 | −0.06 | −51.50 |
Average | Class E | 7.15 | −0.34 | −64.27 | 3.46 | −0.16 | −72.28 | 2.26 | −65.40 | 3.10 | −0.15 | −64.27 |
Average | Class A–E | 6.19 | −0.32 | −61.09 | 2.25 | −0.11 | −61.84 | 1.90 | −61.70 | 1.86 | −0.09 | −61.23 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, T.; Wei, G.; Li, H.; Bui, T.; Zeng, Q.; Wang, R. Fast CU Partition Algorithm for Intra Frame Coding Based on Joint Texture Classification and CNN. Sensors 2023, 23, 7923. https://doi.org/10.3390/s23187923