计算机科学 ›› 2020, Vol. 47 ›› Issue (7): 71-77.doi: 10.11896/jsjkx.200200106
李向利1,2,3, 贾梦雪1,4
LI Xiang-li1,2,3, JIA Meng-xue1,4
摘要: 随着多媒体技术的发展,信息越来越多的以图片的形式出现。如何对海量的无标签图片进行聚类,是机器学习领域的热点问题。而图像聚类在人脸识别、手写数字识别等领域也有着重要的作用。由于图片数据通常以非负矩阵的形式存储,因此非负矩阵分解算法(NMF)在图像聚类领域得到了广泛的应用。但是NMF算法直接在数据的原始空间进行处理,这就导致NMF算法所得的图片标签易受到数据采集过程中含有的噪声等不利因素的影响。为了解决这些问题,提出了一种基于预处理的超图非负矩阵分解算法(Nonnegative Matrix Factorization with Hypergraph Based on Per-treatments,PHGNMF)。PHGNMF算法将预处理操作和超图的思想引入到NMF算法。在预处理的过程中,使用灰度处理来去除图片中不同光线条件所带来的影响,采用小波分析来提取图片的低时频子图,同时降低了算法所处理的矩阵维度。采取构建超图的方法来进一步保留对聚类结果有重要影响的数据局部结构。最后在5个主流数据集上的实验验证了PHGNMF算法相对于传统算法的有效性,结果显示聚类精度提升了2%~7%,标准互信息在部分数据集上提升了2%~5%。
中图分类号:
[1]BELHUMEUR P N,HESPANHA J P,KRIEGMAN D J.Eigenfaces vs,Recongnition using class specific linear projection [J].IEEE Transactions on Pattern Analysis & Machine Intelligence,1997,7:711-720. [2]KALMAN D.A singularly valuable decomposition:the SVD of a matrix [J].The College Mathematics Journal,1996,27(1):2-23. [3]FISHER R A.The use of multiple measurements in taxonomic problems [J].Annals of Eugenics,1936,7(2):179-188. [4]LEE D D,SEUNG H S.Learning the parts of objects by non-negative matrix factorization [J].Nature,1999,401(6755):788. [5]GUILLAMET D,JORDI V.Non-negative matrix factorizationfor face recognition[C]//Castellón,Spain,lecture notes in artificial intelligence.Berlin:Springer-Verlag,2002:336-344. [6]ZHANG X,GAO H,LI G,et al.Multi-view clustering based on graph-regularized nonnegative matrix factorization for object recognition [J].Information Sciences,2018,432:463-478. [7]ZURADA J M,ENSARI T,ASL E H,et al.Nonnegative matrix factorization and its application to pattern analysis and text mi-ning[C]//Krakow,Poland,Federated Conference on Computer Science and Information Systems.New York,IEEE,2013:11-16. [8]GAO B,WOO W L,DLAY S S.Variational regularized 2-D nonnegative matrix factorization [J].IEEE Transactions on Neural Networks and Learning Systems,2012,23(5):703-716. [9]CAI D,HE X,HAN J,et al.Graph regularized nonnegative matrix factorization for data representation [J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2010,33(8):1548-1560. [10]CUI G,LI X,DONG Y.Subspace clustering guided convex nonnegative matrix factorization [J].Neurocomputing,2018,292:38-48. [11]XU W,GONG Y.Document clustering by concept factorization [C]//Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval,Association for Computing Machinery.New York,2004:202-209. [12]HU W,CHOI K S,WANG P,et al.Convex nonnegative matrix factorization with manifold regularization [J].Neural Networks,2015,63:94-103. [13]BABAEE M,TSOUKALAS S.Discriminative nonnegative matrix factorization for dimensionality reduction [J].Neurocomputing,2015,173:212-223. [14]PENG X,CHEN D,XU D.Semi-supervised least squares nonnegative matrix factorization and graph-based extension [J].Neurocomputing,2018,320:98-111. [15]WANG F,LI T,ZHANG C S.Semi-supervised clustering via matrix factorization [C]//Atlanta,Georgia,Proceedings of the 2008 SIAM International Conference on Data Mining.SIAM,America,2008:1-12. [16]LI G,ZHANG X,ZHANG S,et al.Semi-supervised convex nonnegative matrix factorizations with graph regularized for image representation [J].Neurocomputing,2017,237:1-11. [17]ZHOU J.Research of SWNMF with new iteration rules for facial feature extraction and recognition [J].Symmetry,2019,11(3):354. [18]WANG W H,QIAN Y T,TANG Y Y.Hypergraph-regularized sparse NMF for hyperspectral unmixing [J].IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing,2016,9(2):681-694. [19]ZHOU D Y,HUANG J Y,BERNHARD S.Learning with hypergraphs:clustering,classification,and embedding[C]// British,Advances in Neural Information Processing Systems 19:Proceedings of the Twentieth Annual Conference on Neural Information Processing Systems,MIT Press,2007:1601-1608. [20]SHI J B,MALIK J.Normalized cuts and image segmentation [J].IEEE Transactions on pattern analysis and machine intelligence,2000,22(8):888-905. [21]HE X,WANG Q,LI X L.Robust adaptive graph regularized non-negative matrix factorization [J].IEEE Access,2019,7:83101-83110. |
[1] | 周海榆, 张道强. 面向多中心数据的超图卷积神经网络及应用 Multi-site Hyper-graph Convolutional Neural Networks and Application 计算机科学, 2022, 49(3): 129-133. https://doi.org/10.11896/jsjkx.201100152 |
[2] | 韩洁, 陈俊芬, 李艳, 湛泽聪. 基于自注意力的自监督深度聚类算法 Self-supervised Deep Clustering Algorithm Based on Self-attention 计算机科学, 2022, 49(3): 134-143. https://doi.org/10.11896/jsjkx.210100001 |
[3] | 官铮, 邓扬琳, 聂仁灿. 光谱重建约束非负矩阵分解的高光谱与全色图像融合 Non-negative Matrix Factorization Based on Spectral Reconstruction Constraint for Hyperspectral and Panchromatic Image Fusion 计算机科学, 2021, 48(9): 153-159. https://doi.org/10.11896/jsjkx.200900054 |
[4] | 陈扬, 王金亮, 夏炜, 杨颢, 朱润, 奚雪峰. 基于特征自动提取的足迹图像聚类方法 Footprint Image Clustering Method Based on Automatic Feature Extraction 计算机科学, 2021, 48(6A): 255-259. https://doi.org/10.11896/jsjkx.200900033 |
[5] | 罗靖杰, 王永利. ADCSM:一种细粒度汽车行驶工况模型构建方法 ADCSM:A Fine-grained Driving Cycle Model Construction Method 计算机科学, 2021, 48(6A): 289-294. https://doi.org/10.11896/jsjkx.200600019 |
[6] | 段菲, 王慧敏, 张超. 面向数据表示的Cauchy非负矩阵分解 Cauchy Non-negative Matrix Factorization for Data Representation 计算机科学, 2021, 48(6): 96-102. https://doi.org/10.11896/jsjkx.200700195 |
[7] | 向昌盛, 陈志刚. 面向海量数据的网络流量混沌预测模型 Chaotic Prediction Model of Network Traffic for Massive Data 计算机科学, 2021, 48(5): 289-293. https://doi.org/10.11896/jsjkx.200400056 |
[8] | 余诗媛, 郭淑明, 黄瑞阳, 张建朋, 苏珂. 嵌套命名实体识别研究进展 Overview of Nested Named Entity Recognition 计算机科学, 2021, 48(11A): 1-10. https://doi.org/10.11896/jsjkx.201100165 |
[9] | 李雨蓉, 刘杰, 刘亚林, 龚春叶, 王勇. 面向语音分离的深层转导式非负矩阵分解并行算法 Parallel Algorithm of Deep Transductive Non-negative Matrix Factorization for Speech Separation 计算机科学, 2020, 47(8): 49-55. https://doi.org/10.11896/jsjkx.190900202 |
[10] | 王成章, 白晓明, 杜金栗. 图像的扩散界面无监督聚类算法 Diffuse Interface Based Unsupervised Images Clustering Algorithm 计算机科学, 2020, 47(5): 149-153. https://doi.org/10.11896/jsjkx.190300125 |
[11] | 王丽星, 曹付元. 基于Huber损失的非负矩阵分解算法 Huber Loss Based Nonnegative Matrix Factorization Algorithm 计算机科学, 2020, 47(11): 80-87. https://doi.org/10.11896/jsjkx.190900144 |
[12] | 周昌, 李向利, 李俏霖, 朱丹丹, 陈世莲, 蒋丽榕. 基于余弦相似度的稀疏非负矩阵分解算法 Sparse Non-negative Matrix Factorization Algorithm Based on Cosine Similarity 计算机科学, 2020, 47(10): 108-113. https://doi.org/10.11896/jsjkx.190700112 |
[13] | 任雪婷, 赵涓涓, 强彦, Saad Abdul RAUF, 刘继华. 联合成对学习和图像聚类的无监督肺癌亚型识别 Lung Cancer Subtype Recognition with Unsupervised Learning Combining Paired Learning and Image Clustering 计算机科学, 2020, 47(10): 200-206. https://doi.org/10.11896/jsjkx.190900073 |
[14] | 杜臻, 马立鹏, 孙国梓. 一种基于小波分析的网络流量异常检测方法 Network Traffic Anomaly Detection Based on Wavelet Analysis 计算机科学, 2019, 46(8): 178-182. https://doi.org/10.11896/j.issn.1002-137X.2019.08.029 |
[15] | 康林瑶, 唐兵, 夏艳敏, 张黎. 基于GPU加速和非负矩阵分解的并行协同过滤推荐算法 GPU-accelerated Non-negative Matrix Factorization-based Parallel Collaborative Filtering Recommendation Algorithm 计算机科学, 2019, 46(8): 106-110. https://doi.org/10.11896/j.issn.1002-137X.2019.08.017 |
|