Computer Science (计算机科学) ›› 2015, Vol. 42 ›› Issue (10): 76-80.
游小容,曹晟
YOU Xiao-rong and CAO Sheng
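Abstract: As a mature distributed cloud platform, Hadoop provides reliable and efficient storage services and is commonly used to store large files, but its efficiency drops significantly when it handles massive numbers of small files. This paper proposes a Hadoop-based storage optimization scheme for the small files found in massive educational resources. The scheme exploits the correlations among small educational-resource files to merge them into large files, which reduces the number of files. It then improves read efficiency by accessing small files through an index mechanism, combined with metadata caching and prefetching of correlated small files. Experiments show that these methods improve the efficiency with which the Hadoop file system stores and accesses small files.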
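The three ideas named in the abstract (merging, index-based access, and caching with prefetching of correlated files) can be illustrated with a minimal in-memory sketch. This is not the authors' implementation and does not use any Hadoop API: the names `merge_small_files`, `MergedFileReader`, and the `related` mapping are hypothetical, the merged file is modeled as a plain byte string, and an LRU cache of file contents stands in for the metadata cache described in the paper.

```python
import io
from collections import OrderedDict


def merge_small_files(files):
    """Concatenate small files into one blob and build an offset index.

    `files` maps a file name to its bytes. The returned index maps each
    name to (offset, length) inside the merged blob.
    """
    blob = io.BytesIO()
    index = {}
    for name, data in files.items():
        index[name] = (blob.tell(), len(data))
        blob.write(data)
    return blob.getvalue(), index


class MergedFileReader:
    """Reads small files out of a merged blob (hypothetical helper).

    `related` maps a file name to the names of correlated files that
    should be prefetched whenever that file is read.
    """

    def __init__(self, blob, index, related, cache_size=4):
        self.blob = blob
        self.index = index
        self.related = related
        self.cache_size = cache_size
        self.cache = OrderedDict()  # simple LRU cache: name -> bytes
        self.hits = 0

    def _slice(self, name):
        # Index lookup replaces a per-file metadata request.
        offset, length = self.index[name]
        return self.blob[offset:offset + length]

    def _put(self, name, data):
        self.cache[name] = data
        self.cache.move_to_end(name)
        while len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict least recently used

    def read(self, name):
        if name in self.cache:
            self.hits += 1
            self.cache.move_to_end(name)
            data = self.cache[name]
        else:
            data = self._slice(name)
            self._put(name, data)
        # Prefetch correlated files so later reads are served from cache.
        for other in self.related.get(name, ()):
            if other not in self.cache:
                self._put(other, self._slice(other))
        return data
```

In this sketch, reading one file of a course pulls its correlated files into the cache, so a subsequent read of a correlated file is a cache hit and needs no further lookup in the merged blob. In a real HDFS deployment the blob would be a large file on the cluster and the index would have to be persisted alongside it; those details are outside what the abstract states.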