Monthly Archives: June 2014

Hive Metastore: In Hive, the metastore is the central repository that stores metadata for Hive tables and partitions. Any datastore that has a JDBC driver can be used as a metastore. By default, the metastore service runs in the same JVM as the Hive service and contains an embedded Derby database instance backed by the local […]

Hive Metastore Configuration

MySQL installation on Ubuntu: In this post we will discuss MySQL installation and configuration on an Ubuntu machine using the apt package repository. Issue the commands below, in the order given, to install MySQL on Ubuntu. Below is a snapshot of the install command above. Start MySQL: For the first run, […]

MySQL Installation On Ubuntu

In this post, we will discuss Hive installation on Ubuntu in pseudo-distributed mode. We are installing the latest release of Hive at the time of writing, which is hive-0.13.1. HCatalog is included with Hive starting with Hive release 0.11.0, so we can optionally set up the configuration required for HCatalog […]

Hive Installation in Pseudo Distribution Mode

This post is a continuation of the previous post on Java Interview Questions and Answers. We will try to focus only on the concepts that are required for Hadoop developers and Hadoop administrators. Java Interview Questions and Answers: 1. What is the difference between a JDK and a JRE? JDK, Java Development Kit […]

Java Interview Questions and Answers – Part 2

In this post, we will discuss the basics of Google's Snappy compression technique. Snappy Compression Introduction: Snappy is a fast compression/decompression tool. It was formerly known as Zippy. Snappy is written in C++ and is widely used inside Google, in everything from BigTable and MapReduce to its internal RPC […]

Snappy Compression Tool

In this post, we will try to cover some of the important Java interview questions and answers for candidates facing Hadoop developer interviews. We will touch on only the concepts that are needed for a Hadoop developer or administrator. Java Interview Questions and Answers for Experienced: 1. […]

Java Interview Questions and Answers

Hive Architecture: Below is the high-level architecture of Hive: In a Hive distribution, we can find the following major components. CLI: Command Line Interface. It is the most common way of interacting with Hive (the Hive shell). This is the default service. HWI: Hive […]

Hive Architecture

In this post we will discuss the basics of Hive. What is Hive?: Hive is an important tool in the Hadoop ecosystem and a framework for data warehousing on top of Hadoop. Hive was initially developed at Facebook, but it is now an open source […]

Hive Overview

In an HBase cluster, we can start the HBase daemons with the command […] or […]. But in pseudo-distributed mode (hbase.cluster.distributed=false), only the HMaster daemon is started, not the HRegionServer daemon or the HQuorumPeer daemon. Starting the daemons with […] or with the individual commands for the region server will not trigger the daemon because […]

Hbase Daemons in Pseudo Distribution Mode

This post is a continuation of the previous post on HBase installation. In the previous post we discussed HBase installation in pseudo-distributed mode, and in this post we will learn how to install and configure HBase in fully distributed mode. Prerequisites: JDK 1.6 or a later version of Java installed on […]

Hbase Installation in Fully Distribution Mode

Error Scenario: Datanode denied communication with namenode: We usually get this error message while adding a data node to an existing HDFS cluster or at the time of cluster setup. Below is a snapshot of this error message from the log file on the data node. Root Cause: This error message is […]

Datanode denied communication with namenode