Yet the Map Reduce paradigm has its limitations. The biggest problem is that it involves writing code for each analysis. This limits the number of companies and people that can use this paradigm. The second problem is that joining different data sets is hard. The third problem is that Map Reduce works on files and produces files; after a while the number of files multiplies and it becomes difficult to keep track of things. What's lacking is a metadata layer, such as the catalog in database systems. Don't get me wrong; I love Map Reduce, and there are applications that don't need these things, but increasingly there are applications that do.
-Why the World Needs a New Database System