Monthly Archives: March 2015

In this post we will discuss NoSQL databases, their characteristics, and the differences between NoSQL and SQL databases. What is NoSQL? NoSQL stands for Not Only SQL and provides a mechanism for the storage and retrieval of data that is modeled in means other than the tabular relations used in relational […]

NoSQL Databases

Hive Built In Functions: Functions in Hive are categorized as below. Mathematical Functions: These functions are mainly used to perform mathematical calculations. Date Functions: These functions are used to perform operations on date data types, such as adding a number of days to a date. String Functions: These functions are used […]

Hive Built In Functions

In this post we will discuss a common problem when installing Cloudera Manager 5.3.2 on an Ubuntu 14.04 machine and the solution for its root cause. Problem: When installing Cloudera Manager 5.3.2 on an Ubuntu 14.04.2 machine, the error messages below appear after giving root privileges on the SSH configuration page. Exhausted available authentication methods […]

Exhausted available authentication methods

In this post, we will discuss Hive Authorization Models and Hive security. Before discussing Hive Authorization Models, let's note the difference between authentication and authorization. Authentication – Verifying the identity of the user, that is, whether the logged-in user is the real user or not. Authorization – Verifying whether a user […]

Hive Authorization Models and Hive Security

QuerySurge Configuring Connections: SQL Server When you create a QuerySurge Connection, the Add Connection Wizard will guide you through the process. Different types of QuerySurge connections require different types of information. For an SQL Server Connection, you will need the following information (check with a DBA or other knowledgeable resource […]

QuerySurge Configuring Connections

QuerySurge single machine installation – Windows 1. Download the QuerySurge Installer to the machine you want to install QuerySurge on. 2. Double-click the QuerySurge Installer to start the installation process. Click “Next” to accept the License Agreement, and “Next” again to set the installation directory. 3. On […]

QuerySurge single machine installation – Windows

QuerySurge Tool for Hadoop Testing: The QuerySurge CASE tool developed by RTTS assists DW testers in preparing and scheduling query pairs to compare data transformed from the source to the destination, for example, preparing a query pair, one that runs on a DS and the […]

Querysurge Tool for Hadoop Testing

In this post, we will discuss one of the common Hive clients, the JDBC client, for both HiveServer1 (Thrift Server) and HiveServer2. Use of HiveServer2 is recommended, as HiveServer1 has several concurrency issues and lacks some features available in HiveServer2. JDBC Data Types: The following table lists the data types implemented for HiveServer/HiveServer2 […]

Hive JDBC Client Example

In this post we will introduce HiveServer2 and Beeline. As of hive-0.11.0, Apache Hive started decoupling HiveServer2 from Hive in order to overcome the limitations of the existing Hive Thrift Server. Below are the limitations of Hive Thrift Server 1: No Sessions/Concurrency; Essentially need 1 server per client; Security; Client Interface […]

HiveServer2 Beeline Introduction

In this post we will provide a solution to the well-known N-Grams calculator problem in MapReduce programming. MapReduce Use Case for N-Gram Statistics. N-Gram: In the fields of computational linguistics and probability, an n-gram is a contiguous sequence of n items from a given sequence of text or speech. The items can be phonemes, syllables, […]

Mapreduce Use Case for N-Gram Statistics
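
As a rough illustration of the n-gram definition quoted in the excerpt above, here is a minimal Python sketch. It is not the MapReduce job from the post; the function name `ngrams`, the choice of words as the items, and the sample sentence are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous run of n items from tokens, in order."""
    # Slide a window of width n across the sequence; a sequence shorter
    # than n yields an empty list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sample input: words are the "items" here.
words = "to be or not to be".split()
bigrams = ngrams(words, 2)

# Counting occurrences is the kind of statistic a reducer would aggregate.
bigram_counts = Counter(bigrams)
```

In a MapReduce setting, the window step would typically run in the mapper (emitting each n-gram with a count of 1) and the counting step in the reducer, but that split is an assumption about the design rather than something stated in the excerpt.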

PageRank is a way of measuring the importance of website pages. PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other […]

Mapreduce Use Case to Calculate PageRank
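
To make the idea of "counting the number and quality of links" concrete, here is a small single-machine Python sketch of iterative PageRank. It is not the MapReduce implementation from the post. The function name `pagerank`, the damping factor of 0.85, the fixed iteration count, the handling of pages without outgoing links, and the three-page sample graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively estimate PageRank.

    links maps each page to the list of pages it links to. Every linked
    page is assumed to appear as a key in links as well.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform guess

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its current rank evenly among its links,
                # so a link from a highly ranked page is worth more.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # Assumption: a page with no links spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical three-page graph.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
```

In this sample graph, page `c` is linked from both `a` and `b`, so it ends up with a higher rank than `b`, which is linked only from `a`.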