Avro Serialization Ruby Example

In the previous posts in the Avro category we discussed the Java API for Avro serialization. Because Avro does not require code generation, we can also use other languages to serialize and deserialize Avro data.

In this post, we provide a basic introduction to Avro serialization and deserialization via the Ruby API. Creating an Avro data file and reading its contents back is considerably easier than with the Java API: in Ruby there is no need to compile the code or build jar files before executing it.


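To test the examples in this post, we need Ruby installed on our machine, and we also need to install the avro gem from the RubyGems repository. On Ubuntu this typically means installing Ruby with sudo apt-get install ruby and then the gem with gem install avro (prefixed with sudo if gems are installed system-wide).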

[Screenshot: Ruby Avro Installation, showing the terminal performing the above activities]

Avro Serialization – Creating Avro Data File:

  • Let's define a simple schema to test the examples in this section: a record type with three fields. The schema goes into a file named item.avsc; “.avsc” is the conventional extension for Avro schema definition files.

  • Below is a set of Ruby lines that generate Avro data records using the above schema, item.avsc. Copy the lines into a file named generate_data.rb; “.rb” is the conventional extension for Ruby files.

The above Ruby code is largely self-explanatory, with comments on each step, so we will not go through the syntax. Once the code is saved in generate_data.rb, we can execute it with the command ruby generate_data.rb.

We can now see that the items.avro data file has been created in the same directory.

  • View the contents of the Avro data file in JSON format with the tojson tool from avro-tools-1.7.7.jar, by running java -jar avro-tools-1.7.7.jar tojson items.avro.

[Screenshot: avro data in JSON format, the JSON output from the items.avro file]

Avro Deserialization – Reading Avro Data File:

Reading the contents of an avro data file through Ruby API is very simple. Copy the below lines of ruby code into read_data.rb file.

The above code is self-explanatory. Run it with the command ruby read_data.rb.

All the output fields are tab delimited.

[Screenshot: Ruby Output, showing the output of the above command in the terminal]

So, we have successfully installed Ruby and the avro gem, and tested simple Avro serialization and deserialization.


About Siva

Senior Hadoop developer with 4 years of experience designing and architecting solutions for the Big Data domain, who has been involved in several complex engagements. Technical strengths include Hadoop, YARN, MapReduce, Hive, Sqoop, Flume, Pig, HBase, Phoenix, Oozie, Falcon, Kafka, Storm, Spark, MySQL and Java.


One thought on “Avro Serialization Ruby Example”

  • bill

    Any idea if the Ruby avro gem supports snappy? At least in my attempts it did not work.

    irb(main):006:0> dr = Avro::DataFile::Reader.new(file,reader)
    Avro::DataFile::DataFileError: Unknown codec: "snappy"
    from /home/foo/.gem/ruby/1.9.1/gems/avro-1.7.7/lib/avro/data_file.rb:71:in `get_codec'
    from /home/foo/.gem/ruby/1.9.1/gems/avro-1.7.7/lib/avro/data_file.rb:223:in
    from (irb):6:in `new'
    from (irb):6
    from /opt/ruby-1.9.3-p484/bin/irb:12:in

    Also tried it with the snappy gem loaded ahead of avro; no luck there either.