{"id":1432,"date":"2016-12-13T12:13:23","date_gmt":"2016-12-13T12:13:23","guid":{"rendered":"http:\/\/62.131.51.129\/?p=1432"},"modified":"2016-12-13T12:13:23","modified_gmt":"2016-12-13T12:13:23","slug":"parquet-format","status":"publish","type":"post","link":"http:\/\/archief.van-maanen.com\/?p=1432","title":{"rendered":"Parquet format"},"content":{"rendered":"<p>As we know, we may store table definitions in the metastore. These table definitions then refer to a location where the data are stored. The format of the data might be an ordinary text file or it might be an avro file. Another possibility is a parquet file. This parquet format is an example of a compressed, columnar format.<br \/>\nCreating such a table is rather straightforward. First, we use Sqoop to transfer a table from Oracle to a parquet file on HDFS:<\/p>\n<pre>\nsqoop import \\\n--connect \"jdbc:oracle:thin:@(description=(address=(protocol=tcp)(host=192.168.2.2)(port=1521))(connect_data=(service_name=orcl)))\" \\\n--username scott --password binvegni \\\n--table fam \\\n--columns \"NUMMER, NAAM\" \\\n--m 1 \\\n--target-dir \/loudacre\/fam_parquet \\\n--as-parquetfile;\n<\/pre>\n<p>This results in a file in the directory \/loudacre\/fam_parquet. Sqoop generates the file name itself; in this case, the file is called 5fe8fcaa-6095-40ec-b499-d73d6d971b6f.parquet. From Impala, we may then define the table, deriving the column definitions from that file:<\/p>\n<pre>\nCREATE EXTERNAL TABLE fam_parquet\nLIKE PARQUET '\/loudacre\/fam_parquet\/5fe8fcaa-6095-40ec-b499-d73d6d971b6f.parquet'\nSTORED AS PARQUET\nLOCATION '\/loudacre\/fam_parquet\/';\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>As we know, we may store table definitions in the metastore. These table definitions then refer to a location where the data are stored. The format of the data might be an ordinary text file or it might be an avro file. Another possibility is a parquet file. 
This parquet format is an example of [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":1435,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-1432","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=\/wp\/v2\/posts\/1432","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1432"}],"version-history":[{"count":0,"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=\/wp\/v2\/posts\/1432\/revisions"}],"wp:attachment":[{"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1432"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1432"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1432"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}