{"id":1079,"date":"2015-10-19T20:46:18","date_gmt":"2015-10-19T20:46:18","guid":{"rendered":"http:\/\/62.131.51.129\/?p=1079"},"modified":"2015-10-19T20:46:18","modified_gmt":"2015-10-19T20:46:18","slug":"hive-mapreduce-extension","status":"publish","type":"post","link":"http:\/\/archief.van-maanen.com\/?p=1079","title":{"rendered":"Hive &#8211; mapreduce extension"},"content":{"rendered":"<p>It is good to realise that Hive is built on top of the MapReduce framework. Hive was developed by Facebook to facilitate analysis of files stored in Hadoop. It lets you write your analysis in a SQL dialect instead of a Python or Java program. When a Hive command is run, the map and reduce steps are clearly visible in the output. See below.<\/p>\n<pre>\n[pivhdsne:~]$ hive -e \"select count(*) from drink;\"\n15\/10\/19 22:14:48 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive\n15\/10\/19 22:14:48 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize\n....\n2015-10-19 22:15:34,169 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 4.14 sec\n2015-10-19 22:15:35,241 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 4.14 sec\n2015-10-19 22:15:36,320 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 4.14 sec\n2015-10-19 22:15:37,461 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 5.47 sec\n2015-10-19 22:15:38,502 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 5.47 sec\n2015-10-19 22:15:39,559 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 5.47 sec\nMapReduce Total cumulative CPU time: 5 seconds 470 msec\nEnded Job = job_1445283750993_0001\nMapReduce Jobs Launched: \nJob 0: Map: 1  Reduce: 1   Cumulative CPU: 5.47 sec   HDFS Read: 266 HDFS Write: 2 SUCCESS\nTotal MapReduce CPU Time Spent: 5 seconds 470 msec\nOK\n4\n<\/pre>\n<p>This dependence on MapReduce has also drawn criticism of Hive. 
The argument is that Hive remains limited by the bottlenecks within MapReduce. For that reason other parties, such as Cloudera, developed Impala to circumvent those bottlenecks.<br \/>\nBut Hive doesn&#8217;t stand still. At Hortonworks an improved version of Hive is being developed that seems to provide far better performance. A name that is often mentioned in this context is Tez, the execution engine used by this improved version.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It is good to realise that Hive is built on top of the MapReduce framework. Hive was developed by Facebook to facilitate analysis of files stored in Hadoop. It lets you write your analysis in a SQL dialect instead of a Python or Java program. When a Hive [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":1080,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[],"class_list":["post-1079","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-nice-to-know"],"_links":{"self":[{"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=\/wp\/v2\/posts\/1079","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1079"}],"version-history":[{"count":0,"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=\/wp\/v2\/posts\/1079\/revisions"}],"wp:attachment":[{"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1079"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/archief.van
-maanen.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1079"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1079"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}