Small files are frequently created and accessed in pervasive computing, where information is processed with limited resources by linking to objects as they are encountered. The Hadoop framework, although a de facto big data processing platform that is very popular in practice, cannot process small files effectively. In this paper, we propose a scalable HDFS-based storage framework, named SHAstor, to improve the throughput of small-write processing in the pervasive computing paradigm. In contrast to classic HDFS, the essence of this approach is to merge incoming small writes into a large chunk of data, at either the client side or the server side, and then store that chunk as a single big file in the framework. This can substantially reduce the number of small files that must be handled when processing pervasively gathered information. To reach this goal, the framework takes HDFS as its basis and adds three extra modules that merge and index the small files while read/write operations in pervasive applications are performed. To further facilitate this process, a new ancillary namenode can optionally be installed to store the index table. With this optimization, SHAstor not only optimizes small writes but also scales out with the number of datanodes to improve the performance of pervasive applications.
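The merge-and-index idea described above can be illustrated with a minimal in-memory sketch. It is not the SHAstor implementation: the class name `SmallWriteMerger`, the `chunk_size` parameter, and the `(chunk_id, offset, length)` layout of the index table are all assumptions made for illustration, and no HDFS API is involved.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SmallWriteMerger:
    """Hypothetical merger: packs small writes into large chunks and
    keeps an index table so each small file can still be read back."""

    chunk_size: int = 64  # assumed chunk capacity in bytes (tiny, for illustration)
    chunks: List[bytes] = field(default_factory=list)  # sealed "big files"
    index: Dict[str, Tuple[int, int, int]] = field(default_factory=dict)
    _buffer: bytearray = field(default_factory=bytearray, init=False)

    def write(self, name: str, data: bytes) -> None:
        # Seal the current chunk first if this write would overflow it.
        if self._buffer and len(self._buffer) + len(data) > self.chunk_size:
            self.flush()
        # Record where the small file lives: (chunk id, offset, length).
        self.index[name] = (len(self.chunks), len(self._buffer), len(data))
        self._buffer.extend(data)

    def flush(self) -> None:
        # Store the merged buffer as one large chunk.
        if self._buffer:
            self.chunks.append(bytes(self._buffer))
            self._buffer = bytearray()

    def read(self, name: str) -> bytes:
        chunk_id, offset, length = self.index[name]
        # A chunk id equal to len(chunks) means the data is still buffered.
        source = self._buffer if chunk_id == len(self.chunks) else self.chunks[chunk_id]
        return bytes(source[offset:offset + length])


merger = SmallWriteMerger(chunk_size=10)
merger.write("a", b"hello")
merger.write("b", b"world")
merger.write("c", b"!")  # overflows the first chunk, so it starts a new one
merger.flush()
```

In this sketch three small writes end up in two stored chunks, and the index table alone is enough to locate each original file. Keeping the index in a separate structure loosely mirrors the abstract's optional ancillary namenode that holds the index table.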

Additional Metadata
Keywords Hadoop framework, HDFS-based storage, Pervasive computing, Small write
Persistent URL dx.doi.org/10.1109/SmartWorld.2018.00198
Conference 4th IEEE SmartWorld, 15th IEEE International Conference on Ubiquitous Intelligence and Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People and Smart City Innovations, SmartWorld/UIC/ATC/ScalCom/CBDCom/IoP/SCI 2018
Citation
Zeng, L. (Lingfang), Shi, W., Ni, F. (Fan), Jiang, S. (Song), Fan, X. (Xiaopeng), Xu, C. (Chengzhong), & Wang, Y. (Yang). (2018). SHAstor: A scalable HDFS-Based storage framework for small-write efficiency in pervasive computing. In Proceedings - 2018 IEEE SmartWorld, Ubiquitous Intelligence and Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People and Smart City Innovations, SmartWorld/UIC/ATC/ScalCom/CBDCom/IoP/SCI 2018 (pp. 1140–1145). doi:10.1109/SmartWorld.2018.00198