Proceedings of the International Conference on Advances in Engineering and Technology (ICAET 2014)
"EFFICIENT UTILIZATION OF PROFILES TO REDUCE TIME IN VERY LARGE DATA SET"
Abstract: “Hadoop is a software framework for analyzing large data sets. The Hadoop Distributed File System (HDFS) and the MapReduce paradigm provide an efficient way to deal with the terabytes of data being produced every second. MapReduce is a popular way to process data in cloud environments because of its excellent scalability and good fault tolerance. However, creating a profile for the same job again and again makes it less efficient. This paper proposes an INTERFACE that reduces the time taken to match sampled MapReduce jobs (Js) against already created profiles. It acts as a mediator between the profile store and the worker nodes.”
Keywords: profile, sampling, tuning, optimization, MapReduce, Hadoop Distributed File System.