Yearly Archives: 2014


Most Popular Hadoop Distributions: Many Hadoop distributions are currently available in the big data market, but the major free, open source distribution comes from the Apache Software Foundation. The other Hadoop distribution vendors also provide free versions of Hadoop, along with customized Hadoop distributions suited to client organizations […]

Most Popular Hadoop Distributions


In the previous post we gave a brief introduction to Big Data; now we will discuss Big Data challenges along with its characteristics. Before going into the challenges, we will briefly go through the characteristics of Big Data. Big Data Characteristics: Often Big data characteristics are […]

Big Data Challenges


So far, this site has covered the technical details of Hadoop and its ecosystem tools across all categories. To succeed, a Hadoop developer must focus on the data itself in addition to the technical details of the Hadoop architecture and its sub-components. In […]

Big Data Introduction



In this post we will discuss one of the important commands in Apache Sqoop: the Sqoop import command and its arguments, with examples. This documentation applies to Sqoop versions 1.4.5 or later, because earlier versions do not support some of the import command arguments mentioned below. As of Sqoop 1.4.5 version, […]

Sqoop Import Command Arguments


In our previous post we discussed partitioning in Hive; now we will focus on bucketing in Hive, which is another way of giving Hive tables a more fine-grained structure. Bucketing in Hive: Usually, partitioning in Hive offers a way of segregating Hive table data into multiple files/directories. […]

Bucketing In Hive


In this post, we will discuss one of the most critical and important concepts in Hive: partitioning in Hive tables. Partitioning in Hive: Table partitioning means dividing table data into parts based on the values of particular columns, such as date or country, and segregating the input records into different files/directories […]

Partitioning in Hive




In this post, we will discuss all Hive data types, with examples for each. Hive supports most of the primitive data types supported by many relational databases, and any that are missing are being added to Hive with each release. Hive Data Types With Examples: Hive […]

Hive Data Types With Examples


In this post, we will discuss Hive table commands with examples. This post can be treated as a sequel to the previous post, Hive Database Commands. Hive Table Creation Commands: Introduction to Hive Tables: In Hive, tables are nothing but collections of homogeneous data records which have the same schema for […]

Hive Table Creation Commands



In this post, we will discuss Hive database commands (Create/Alter/Use/Drop Database) with some examples for each statement. All these commands and their options are taken from the hive-0.14.0 release documentation, so to use them with all the options described below we need at least the hive-0.14.0 release. Hive Database Commands […]

Hive Database Commands





In this post we will give a basic introduction to the QlikView BI tool and to QlikView integration with Hadoop Hive. We will use Cloudera Hive and its JDBC drivers/connectors to connect with QlikView, and we will retrieve a sample table from the Cloudera Hadoop Hive database. QlikView Overview: What […]

QlikView Integration with Hadoop


In this post, we will discuss the details of communication between two nodes in a network via SSH, and of running remote commands over SSH on a remote machine. For easy understanding, these two nodes in the cluster can be treated as server and client machines. To allow secure communications between Server […]

Run Remote Commands over SSH


This post provides very brief notes on Unix shell scripting. As this topic is described very well in many textbooks, we will not go deep into the details of each point. This post is for quick review/revision/reference of common Unix commands and Unix shell scripting. Unix Shell Scripting […]

Brief Notes on Unix Shell Scripting Concepts