Easy way to load bulk data



This topic contains 8 replies, has 2 voices, and was last updated by Subramaniyabharathi 4 years, 10 months ago.

Viewing 9 posts - 1 through 9 (of 9 total)
  • #1653 Reply
    Subramaniyabharathi
    Participant

    Dear All,

    Please help me. I am new to Hadoop, HBase and Hive, but I currently need to load 34 GB of data into HBase and query it. Is there an easy way to accomplish this?

    #1655 Reply
    Siva
    Keymaster

    Hi, if you are using Hive 0.12 or a later version, you can use HCatalog to create the HBase table via HBaseHCatStorageHandler. If your Hive version is older than 0.12, use the HBase integration with Hive instead: create an HBase table via the Hive shell and insert the records into that Hive table.

    For the procedure for loading data into an HBase table from a Hive table, please refer to http://staging.hadooptutorial.info/hbase-integration-with-hive/
    Keep in mind that setting the property below in Hive makes inserting records into the table faster, but at the cost of possible data loss if an HBase failure occurs:
    set hive.hbase.wal.enabled=false;

    #1656 Reply
    Subramaniyabharathi
    Participant

    Hi,
    Can you please elaborate on the process of using HCatalog? I am using apache-hive-0.13.1. When I tried the HBase integration with Hive explained on this website, it took a lot of time to copy the file. I started at 11 am and it is still copying. I am totally new to this environment. Please help me.

    #1658 Reply
    Siva
    Keymaster

    You can set the property below in Hive before inserting records into the Hive table.

    set hive.hbase.wal.enabled=false;

    This will speed up insertion of records.

    For the detailed procedure on HCatalog, you can refer to the post at this link: http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.3.7/bk_user-guide/content/user-guide-hbase-import-2.html

    #1662 Reply
    Subramaniyabharathi
    Participant

    Hi,
    Thank you so much for your advice and help, Mr. Siva. Do you know how to add a row key to an HBase table? We need a unique key, right? My CSV file contains only 3 data columns, so I need to add another column to the HBase table as a unique row key. How can I accomplish this?

    #1665 Reply
    Subramaniyabharathi
    Participant

    Is there any way to add an auto-increment column to a Hive table?

    #1666 Reply
    Siva
    Keymaster

    Hi, there is no direct AUTO INCREMENT operator implemented in Hive yet, but we can achieve this by building a custom UDF. Use the code below to build the UDF.

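    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.hive.ql.udf.UDFType;

    // stateful = true marks the function as keeping state between rows;
    // it is also declared non-deterministic so the result is never cached.
    @UDFType(deterministic = false, stateful = true)
    public class AutoIncrementUDF extends UDF {

        // Last value handed out; starts at 0, so the first row gets 1.
        // Each task keeps its own counter, so values can repeat across tasks.
        private int lastValue;

        public int evaluate() {
            lastValue++;
            return lastValue;
        }
    }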

    Compile this with the hive-exec-*.jar file on the classpath, build a jar, and add the jar in Hive via

    ADD JAR AutoIncrUDF.jar;

    Then create a temporary function to use the class created above:
    CREATE TEMPORARY FUNCTION incr AS 'AutoIncrementUDF';

    Use that function in your SELECT statements as incr() to auto-increment values in a column.
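    One point worth checking before relying on these values as HBase row keys: the counter lives inside a single instance of the class. The sketch below is plain Java with no Hive dependency; the names CounterScopeDemo, Counter and runTask are made up for illustration, and it only assumes that each task creates its own instance of the function. Under that assumption, two tasks both start counting from 1.

```java
import java.util.ArrayList;
import java.util.List;

public class CounterScopeDemo {

    /** Plain-Java stand-in for the UDF above: one counter per instance. */
    static class Counter {
        private int lastValue;

        int evaluate() {
            lastValue++;
            return lastValue;
        }
    }

    /** Simulates one task calling the function once per row. */
    static List<Integer> runTask(int rows) {
        Counter counter = new Counter(); // assumption: one instance per task
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            ids.add(counter.evaluate());
        }
        return ids;
    }

    public static void main(String[] args) {
        // Two simulated tasks produce overlapping values.
        System.out.println("task A: " + runTask(3)); // task A: [1, 2, 3]
        System.out.println("task B: " + runTask(2)); // task B: [1, 2]
    }
}
```

    If the query runs as a single task the values are unique; with several tasks they may collide, so it is worth verifying the result on a small sample first.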

    I will publish a post on this topic soon with more detailed information.
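    As an alternative to the UDF (not the method described above), the key column could be added to the CSV before loading. This is a minimal sketch in plain Java; RowKeyPrefixer, addRowKeys and the sample rows are hypothetical names and data. It prepends a zero-padded sequence number to each line, since HBase orders row keys as bytes and padding keeps them in numeric order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class RowKeyPrefixer {

    /**
     * Prepends a zero-padded, 10-digit sequence number to each CSV line,
     * starting from firstKey. The input lines are not modified.
     */
    static List<String> addRowKeys(List<String> csvLines, long firstKey) {
        List<String> keyed = new ArrayList<>(csvLines.size());
        long next = firstKey;
        for (String line : csvLines) {
            // Locale.ROOT keeps the digits ASCII regardless of system locale.
            keyed.add(String.format(Locale.ROOT, "%010d,%s", next, line));
            next++;
        }
        return keyed;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows with three data columns each.
        List<String> rows = Arrays.asList("a,b,c", "d,e,f");
        for (String row : addRowKeys(rows, 1L)) {
            System.out.println(row);
        }
        // prints:
        // 0000000001,a,b,c
        // 0000000002,d,e,f
    }
}
```

    For a 34 GB file the lines would need to be streamed from a reader to a writer rather than held in a list; the in-memory version here only illustrates the key format.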

    #1667 Reply
    Subramaniyabharathi
    Participant

    Hi,
    Thank you so much, Mr. Siva. I will try this. Thanks a lot.
