Kumaraswamy Wavelet for Heterophilic Scene Graph Generation
DOI:
https://doi.org/10.1609/aaai.v38i2.27875
Keywords:
CV: Scene Analysis & Understanding, CV: Language and Vision
Abstract
Graph neural networks (GNNs) have demonstrated their capabilities in scene graph generation (SGG) by updating node representations from neighboring nodes. This update can be viewed as a low-pass filter in the spatial domain, which smooths node feature representations and retains commonalities among nodes. However, spatial GNNs do not work well in heterophilic SGG, in which fine-grained predicates are always connected to a large number of coarse-grained predicates. Blind smoothing undermines the discriminative information of the fine-grained predicates, resulting in a failure to predict them accurately. To address heterophily, our key idea is to design tailored filters by wavelet transform in the spectral domain. First, we prove rigorously that when the heterophily on the scene graph increases, the spectral energy gradually shifts towards the high-frequency part. Inspired by this observation, we propose the Kumaraswamy Wavelet Graph Neural Network (KWGNN). KWGNN leverages complementary multi-group Kumaraswamy wavelets to cover all frequency bands. Finally, KWGNN adaptively generates band-pass filters and then integrates the filtering results to better accommodate varying levels of smoothness on the graph. Comprehensive experiments on the Visual Genome and Open Images datasets show that our method achieves state-of-the-art performance.
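The two ideas in the abstract, that spectral energy moves to high frequencies as heterophily grows and that Kumaraswamy-shaped kernels can sit on different frequency bands, can be illustrated with a minimal sketch. It is not the KWGNN implementation: the cycle graph, the ±1 labels, the helper names (`mean_frequency`, `edge_heterophily`, `kumaraswamy_pdf`, `peak_location`) and the rescaling of eigenvalues from [0, 2] to [0, 1] are all assumptions made for illustration. The sketch uses the Rayleigh quotient of the normalized Laplacian as the energy-weighted mean frequency of a signal, and the standard Kumaraswamy density f(x; a, b) = a b x^(a-1) (1 - x^a)^(b-1) as a band-shaped kernel.

```python
# Illustrative sketch only; graph, labels and helper names are hypothetical.

def mean_frequency(labels):
    """Rayleigh quotient x^T L x / x^T x on a cycle graph (n >= 3).

    L is the symmetric normalized Laplacian. Every node has degree 2,
    so x^T L x = (1/2) * sum over edges of (x_i - x_j)^2. The quotient
    equals the energy-weighted mean eigenvalue and lies in [0, 2].
    """
    n = len(labels)
    quad = sum((labels[i] - labels[(i + 1) % n]) ** 2 for i in range(n)) / 2
    energy = sum(v * v for v in labels)
    return quad / energy


def edge_heterophily(labels):
    """Fraction of cycle edges whose endpoints carry different labels."""
    n = len(labels)
    return sum(labels[i] != labels[(i + 1) % n] for i in range(n)) / n


def kumaraswamy_pdf(x, a, b):
    """Kumaraswamy density on [0, 1] with shape parameters a, b > 0."""
    return a * b * x ** (a - 1) * (1 - x ** a) ** (b - 1)


def peak_location(a, b, steps=1000):
    """Grid search for the frequency (rescaled to [0, 1]) where the kernel peaks."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda x: kumaraswamy_pdf(x, a, b))


signals = {
    "homophilic": [1] * 6 + [-1] * 6,      # 2 of 12 edges join different labels
    "mixed": [1, 1, -1, -1] * 3,           # 6 of 12 edges
    "heterophilic": [1, -1] * 6,           # all 12 edges
}

for name, x in signals.items():
    print(name, edge_heterophily(x), mean_frequency(x))

# Different (a, b) pairs place the kernel's peak on different bands.
for a, b in [(2, 5), (5, 2)]:
    print((a, b), peak_location(a, b))
```

On this regular graph with ±1 labels the mean frequency is exactly twice the edge heterophily ratio, so the three signals land at 1/3, 1 and 2 on the [0, 2] spectrum. A kernel with (a, b) = (2, 5) peaks in the lower part of the rescaled band and (5, 2) in the upper part, which is the sense in which several such kernels can jointly cover the spectrum.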
Published
2024-03-24
How to Cite
Chen, L., Song, Y., Lin, S., Wang, C., & He, G. (2024). Kumaraswamy Wavelet for Heterophilic Scene Graph Generation. Proceedings of the AAAI Conference on Artificial Intelligence, 38(2), 1138-1146. https://doi.org/10.1609/aaai.v38i2.27875
Issue
Vol. 38 No. 2
Section
AAAI Technical Track on Computer Vision I