MapReduce-based Data Processing Systems

Beng Chin OOI
Distinguished Professor

School of Computing
Department of Computer Science
[email protected]
COM1-03-22
651 66465

Objective
To let users of MapReduce-based systems keep the MapReduce programming model while gaining data management functionality at acceptable performance, achieved by minimising initialisation overheads.

Results
We met the objective of this project in two ways: 1) careful tuning of key design factors in MapReduce (Hadoop), which improved performance; and 2) the development of a query processing engine under the MapReduce framework. The engine comprises operator-level join algorithms, including a MapReduce-based similarity (kNN) join that minimises the number of objects sent to the reduce nodes and so reduces computation and communication overheads; schemes for processing multi-join queries efficiently, which exploit replication to expand the plan space; an automatic query analyser that accepts an SQL query, optimises it and translates it into a set of MapReduce jobs; and Concurrent Join, which supports multi-way joins over partitioned data.
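
For readers unfamiliar with how a join is expressed in the MapReduce model, the following is a minimal sketch of a generic reduce-side equi-join. It is a textbook pattern and not the project's kNN join or Concurrent Join algorithm. All function names (map_phase, shuffle, reduce_phase, reduce_side_join) and the sample tables are hypothetical, and the framework's shuffle step is simulated in a single process rather than run on Hadoop.

```python
from collections import defaultdict
from itertools import product


def map_phase(records, table_tag, key_index):
    """Emit (join_key, (table_tag, record)) pairs for one input table."""
    for record in records:
        yield record[key_index], (table_tag, record)


def shuffle(pairs):
    """Group mapper output by key (stands in for the framework's shuffle)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Pair up left and right records that share the same join key."""
    left = [rec for tag, rec in values if tag == "L"]
    right = [rec for tag, rec in values if tag == "R"]
    for l_rec, r_rec in product(left, right):
        yield l_rec + r_rec


def reduce_side_join(left_table, right_table, left_key=0, right_key=0):
    """Run map, shuffle and reduce in-process and collect the joined rows."""
    pairs = list(map_phase(left_table, "L", left_key))
    pairs += list(map_phase(right_table, "R", right_key))
    result = []
    for key, values in shuffle(pairs).items():
        result.extend(reduce_phase(key, values))
    return result


# Hypothetical sample data: (user_id, name) and (user_id, item).
users = [(1, "alice"), (2, "bob"), (3, "carol")]
orders = [(1, "book"), (1, "pen"), (3, "lamp")]
joined = reduce_side_join(users, orders)
```

In this pattern every record of both tables crosses the network to a reducer, which is the communication cost that techniques such as sending fewer objects to the reduce nodes aim to cut.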