{"id":976,"date":"2015-08-19T11:18:36","date_gmt":"2015-08-19T11:18:36","guid":{"rendered":"http:\/\/www.van-maanen.com\/?p=976"},"modified":"2015-08-19T11:18:36","modified_gmt":"2015-08-19T11:18:36","slug":"avro-in-java","status":"publish","type":"post","link":"http:\/\/archief.van-maanen.com\/?p=976","title":{"rendered":"Avro in Java"},"content":{"rendered":"<p>Another example shows a similar idea. In this example a stream is created. This stream consists of 3 objects that each contain a name and a number. Once the stream is created, it is serialised. In other words: the stream is prepared to be stored. It is stored in a file called &#8220;test.avro&#8221;.<br \/>\nBefore continuing, one remark on serialisation.<br \/>\nThe idea behind serialisation is to create a format that can be understood outside the original language. An object that is created in Java can only be handled inside Java. To communicate its content, one needs a format that is understandable by other languages, built from simple values such as strings and integers. The translation of an object into strings\/integers is called serialisation. In this case, everything is translated into strings and integers, which are written to a file. That file can be read by, say, PHP or Oracle, as they will only encounter strings\/integers.<br \/>\nIn that file, we may detect both the schema along which the data are stored and the actual data. It takes a bit of courage, as it is a binary file. Subsequently, the data are read back from that file and the contents are shown.<br \/>\nThe programme is written in Java. 
It reads like:<\/p>\n<pre>\npackage avro;\n\nimport java.io.File;\nimport java.io.FileOutputStream;\nimport java.io.IOException;\n\nimport org.apache.avro.Schema;\nimport org.apache.avro.file.DataFileReader;\nimport org.apache.avro.file.DataFileWriter;\nimport org.apache.avro.generic.GenericData;\nimport org.apache.avro.generic.GenericDatumReader;\nimport org.apache.avro.generic.GenericDatumWriter;\nimport org.apache.avro.io.Encoder;\nimport org.apache.avro.io.EncoderFactory;\nimport org.apache.avro.util.Utf8;\n\n@SuppressWarnings(\"deprecation\")\nclass EmployeeTom\n{\n\tpublic static Schema SCHEMA;\n\t\n\tstatic {\n\t\ttry {\n\t\t\tSCHEMA = Schema.parse(EmployeeTom.class.getResourceAsStream(\"EmployeeTom.avsc\"));\n\t\t}\n\t\tcatch (IOException e)\n\t\t{\n\t\t\tSystem.out.println(\"Couldn't load a schema: \"+e.getMessage());\n\t\t}\n\t}\n\t\n\tprivate String name;\n\tprivate int age;\n\n\tpublic EmployeeTom(String name, int age){\n\t\tthis.name = name;\n\t\tthis.age = age;\n\t}\n\tpublic GenericData.Record serialize() {\n\t\t  GenericData.Record record = new GenericData.Record(SCHEMA);\n\t\t  record.put(\"name\", this.name);\n\t\t  record.put(\"age\", this.age);\n\t\t  return record;\n\t\t}\n\tpublic static void testWrite(File file, EmployeeTom[] people) throws IOException {\n\t\t   GenericDatumWriter datum = new GenericDatumWriter(EmployeeTom.SCHEMA);\n\t\t   DataFileWriter writer = new DataFileWriter(datum);\n\t\t   writer.create(EmployeeTom.SCHEMA, file);\n\t\t   for (EmployeeTom p : people)\n\t\t      writer.append(p.serialize());\n\t\t   writer.close();\n\t\t}\t\n\n\tpublic static void testRead(File file) throws IOException {\n\t\tGenericDatumReader datum = new GenericDatumReader();\n\t\tDataFileReader reader = new DataFileReader(file, datum);\n\t\tGenericData.Record record = new GenericData.Record(reader.getSchema());\n\t\twhile (reader.hasNext()) {\n\t\t\treader.next(record);\n\t\t\tSystem.out.println(\"Name \" + record.get(\"name\") + \n\t\t\t             
       \" Age \" + record.get(\"age\") );\n\t\t}\n\t\treader.close();\n\t}\n\tpublic static void main(String[] args) {\n\t\tEmployeeTom e1 = new EmployeeTom(\"Joe\",31);\n\t\tEmployeeTom e2 = new EmployeeTom(\"Jane\",30);\n\t\tEmployeeTom e3 = new EmployeeTom(\"Zoe\",21);\n\t\tEmployeeTom[] all = new EmployeeTom[] {e1,e2,e3};\n\n\t\tFile bf = new File(\"test.avro\");\n\t\t\n\t\ttry {\n\t\t\ttestWrite(bf,all);\n\t\t\ttestRead(bf);\n\t\t}\n\t\tcatch (IOException e) {\n\t\t\tSystem.out.println(\"Main: \"+e.getMessage());\t\t\t\n\t\t}\n\t}\n\t\n}\n<\/pre>\n<p>A final remark. I stored the schema in the same directory as the class files. This allowed the class EmployeeTom to find the schema file. The schema looked like:<\/p>\n<pre>\n{\n  \"type\": \"record\", \n  \"name\": \"Employee\", \n  \"fields\": [\n      {\"name\": \"name\", \"type\": \"string\"},\n      {\"name\": \"age\", \"type\": \"int\"}\n  ]\n}\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Another example shows a similar idea. In this example a stream is created. This stream consists of 3 objects that contain a name and a number. Once the stream is created, it is serialised. In other words: the stream is prepared to be stored. It is stored in a file that is called &#8220;test.avro&#8221;. 
Before [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":977,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[],"class_list":["post-976","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-nice-to-know"],"_links":{"self":[{"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=\/wp\/v2\/posts\/976","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=976"}],"version-history":[{"count":0,"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=\/wp\/v2\/posts\/976\/revisions"}],"wp:attachment":[{"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=976"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=976"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/archief.van-maanen.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=976"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}