%0 Journal Article
%A LI Tian-tian
%A SONG Jie
%A YAN Zhen-xing
%A ZHU Zhi-liang
%T Load-Balanced Data Layout Approach in Data-Intensive Computing
%D 2013
%R 10.13190/jbupt.201304.76.songj
%J Journal of Beijing University of Posts and Telecommunications
%P 76-80
%V 36
%N 4
%X
Widely used in data-intensive computing, the MapReduce model moves computation to the nodes that hold the data so that tasks execute in parallel. In this setting, data layout affects not only storage but also computing efficiency. The computing efficiency of a node is determined by the features of the data stored on that node. The study of load balancing therefore shifts from traditional server management or task scheduling to data layout, with the aim of improving parallelism. The characteristics of data layout in data-intensive computing and MapReduce environments are analyzed, a load-balancing goal for data layout is proposed, and a load-balanced data layout approach for a specific environment is presented. Experiments show that the proposed goal and approach are effective: the approach improves the parallelism of MapReduce applications and thereby optimizes their computing efficiency.
%U https://journal.bupt.edu.cn/EN/10.13190/jbupt.201304.76.songj