What’s coming in Apache Kafka 0.8.2

Confluent

I am very excited to tell you about the forthcoming 0.8.2 release of Apache Kafka. Kafka is a fault-tolerant, low-latency, high-throughput distributed messaging system used in data pipelines at several companies. Kafka became a top-level Apache project in 2012 and was originally created at LinkedIn, where it forms a critical part of LinkedIn’s infrastructure and transmits data to all systems and applications. The project is currently under active development from a diverse group of contributors.

Since there are many new features in 0.8.2, we released 0.8.2-beta. The final release will be done when 0.8.2 is stable.

Here is a quick overview of the notable work in this release.

New features

New producer

The JVM clients that Kafka ships haven’t changed much since Kafka was originally built. Over time, we have realized some of the limitations and problems that came both from the design of these clients and…


More Effective Java With Joshua Bloch

Originally shared at: [http://saurzcode.in/2014/11/02/more-effective-java-with-joshua-bloch/]

Many of us already agree that Effective Java by Joshua Bloch is a great book and a must-read for every Java developer, whether you have just started or have been working for a while. While reading the book and researching some of the items it lists, I came across this interview with Joshua Bloch at Oracle, in which he talks about some of the great things in the book and shares his knowledge of some important topics in the language. It should be a good read for anyone who wants to explore more, while reading the book or afterwards.

Here is the link –

http://www.oracle.com/technetwork/articles/java/bloch-effective-08-qa-140880.html



What can I learn right now in just 10 minutes that could be useful for the rest of my life?

Answer by Vishnu Haridas:

This one I discovered recently: if you end up with a broken pair of headphones, don’t throw it away. You can cut off the wire and use the TRS jack as the FM antenna for your smartphone.

All you need to do is plug this TRS jack into your phone’s headphone socket, open the FM radio app, and start listening through the loudspeaker.

How it works: the headphone wire works as the FM antenna for mobile phones. FM transmissions usually have a very strong signal, so only a small piece of wire is needed to receive them.


Reading List: Hadoop and Big Data Books

Hadoop Reading List


Hadoop: The Definitive Guide


“Hadoop: The Definitive Guide” is the ideal guide for anyone who wants to learn about Apache Hadoop and all that can be done with it. It is a good book on the basics of Hadoop (HDFS, MapReduce and other related technologies) and provides all the details needed to start working with Hadoop and programming with it.

“Now you have the opportunity to learn about Hadoop from a master – not only of the technology, but also of common sense and plain talk.” – Doug Cutting, Hadoop Founder, Yahoo!

Hadoop Operations: A Guide for Developers and Administrators

This book is a great resource for getting Hadoop up and running in a serious production environment.

Hadoop In Action


If you find Hadoop: The Definitive Guide a little intimidating, get your hands on this book and then work through some practical examples.

Hadoop Essentials: A Quantitative Approach

This book takes a distinctive approach to helping developers and CS students learn Hadoop MapReduce programming quickly. Rather than presenting disjointed, piecemeal code snippets that show Hadoop MapReduce features one at a time, it places the whole learning process in a single application context: mining customer spending patterns from large volumes of credit card transaction records.

Hadoop For Dummies

“Hadoop For Dummies” helps readers understand the value of big data, make a business case for using Hadoop, navigate the Hadoop ecosystem, and build and manage Hadoop applications and clusters.

Hadoop in Practice

“Hadoop in Practice” collects nearly 100 Hadoop examples and presents them in a problem/solution format.

Big Data Analytics with R and Hadoop

A brief introduction to R and Hadoop, and to using them together to solve big data problems.

MapReduce Design Patterns


This book brings together a collection of MapReduce design patterns.

“A clear exposition of MapReduce programs for common data processing patterns – this book is indispensable for anyone using Hadoop.” – Tom White

Hadoop Beginner’s Guide

This book is a good starting point for beginners, covering basic Hadoop concepts and tools.

Optimizing Hadoop for MapReduce

Read this book to learn how to configure your Hadoop cluster to run optimal MapReduce jobs.

Hadoop Real-World Solutions Cookbook

“Hadoop Real-World Solutions Cookbook” offers recipes for working with Hadoop. The book has 10 chapters, starting with basics such as setting up Hadoop, getting data into and out of Hadoop, and working with HDFS.

Pro Hadoop

This book covers the ins and outs of MapReduce: how to structure a cluster, how to design and implement the Hadoop file system, and how to build your first cloud computing tasks using Hadoop.

Mastering Hadoop

Another book that covers the basics of Hadoop MapReduce and explains how to optimize your MapReduce jobs.

Books on Hadoop Ecosystem


Listed below are a few books that focus on Hadoop ecosystem projects –

HBase: The Definitive Guide

Programming Hive

Programming Pig

Apache Sqoop Cookbook

ZooKeeper

Apache Hadoop YARN: Moving beyond MapReduce and Batch Processing with Apache Hadoop 2

 

Please share your reviews of, or experiences with, any of the books listed above in the comments.

 


The history of Hadoop: From 4 nodes to the future of data

Gigaom

Depending on how one defines its birth, Hadoop is now 10 years old. In that decade, Hadoop has gone from being the hopeful answer to Yahoo’s search-engine woes to a general-purpose computing platform that’s poised to be the foundation for the next generation of data-based applications.

Alone, Hadoop is a software market that IDC predicts will be worth $813 million in 2016 (although that number is likely very low), but it’s also driving a big data market the research firm predicts will hit more than $23 billion by 2016. Since Cloudera launched in 2008, Hadoop has spawned dozens of startups and spurred hundreds of millions in venture capital investment.

In this four-part series, we’ll explain everything anyone concerned with information technology needs to know about Hadoop. Part I is the history of Hadoop from the people who willed it into existence and took it mainstream. Part…


String Interning – What, Why and When?

What is String Interning 

String Interning is a method of storing only one copy of each distinct String Value, which must be immutable.

In Java, the String class has a public method, intern(), that returns a canonical representation of the string object. The String class privately maintains a pool of strings, and string literals are interned automatically.

When the intern() method is invoked on a String object, it looks up the string contained in that object in the pool. If an equal string is found there, the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to it is returned.
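The lookup-or-add behaviour described above can be modelled with an ordinary map. This is only a conceptual sketch: the class SimplePool and its canonical() method are hypothetical names invented for illustration, and the JVM's real string pool is implemented natively, not with a HashMap.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of an intern pool, for illustration only.
// The real pool behind String.intern() lives inside the JVM.
public class SimplePool {

    private final Map<String, String> pool = new HashMap<>();

    // Returns the pooled copy if an equal string is already present;
    // otherwise stores s in the pool and returns s itself.
    public String canonical(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    public static void main(String[] args) {
        SimplePool pool = new SimplePool();
        String a = new String("Test");
        String b = new String("Test");

        System.out.println(a == b);                  // false: two distinct objects
        System.out.println(pool.canonical(a) == a);  // true: a was added to the pool
        System.out.println(pool.canonical(b) == a);  // true: the pooled copy is returned
    }
}
```

The point of the model is that every caller asking for an equal string gets back the same object, which is what makes comparison by reference meaningful afterwards.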

The intern() method makes it possible to compare two String objects with the == operator, because both sides then refer to the same object in the string pool. Such a reference comparison is generally cheaper than equals(), which may have to compare the contents character by character. Java maintains the string pool to save space and to allow these faster comparisons. Normally, Java programmers are advised to use equals(), not ==, to compare two strings, because the == operator compares references (memory locations), while the equals() method compares the content stored in the two objects.

Why and When to Intern?

Though Java automatically interns all string literals, remember that we only need to intern strings when they are not constants and we want to be able to quickly compare them to other interned strings. The intern() method should be used on strings constructed with new String() in order to compare them with the == operator.

Let’s take a look at the following Java program to understand the intern() behavior.

public class TestString {

	public static void main(String[] args) {
		String s1 = "Test";             // literal, interned automatically
		String s2 = "Test";             // same literal, so the same pooled object as s1
		String s3 = new String("Test"); // new object on the heap, outside the pool
		final String s4 = s3.intern();  // reference to the pooled object, same as s1
		// Reference comparisons with ==
		System.out.println(s1 == s2);
		System.out.println(s2 == s3);
		System.out.println(s3 == s4);
		System.out.println(s1 == s3);
		System.out.println(s1 == s4);
		// Content comparisons with equals()
		System.out.println(s1.equals(s2));
		System.out.println(s2.equals(s3));
		System.out.println(s3.equals(s4));
		System.out.println(s1.equals(s4));
		System.out.println(s1.equals(s3));
	}

}


//Output
true
false
false
false
true
true
true
true
true
true
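The same rule applies to any string that is not a constant, not only to strings created with new String(). The following minimal sketch assumes a string assembled at runtime with StringBuilder; the class name InternDemo and the helper build() are hypothetical and exist only for this illustration.

```java
public class InternDemo {

    // Assembles a string at runtime, so the result is a fresh object
    // and not the pooled literal with the same content.
    static String build(String head, String tail) {
        return new StringBuilder(head).append(tail).toString();
    }

    public static void main(String[] args) {
        String literal = "hadoop";           // interned automatically
        String built = build("had", "oop");  // equal content, different object

        System.out.println(literal == built);           // false
        System.out.println(literal.equals(built));      // true
        System.out.println(literal == built.intern());  // true
    }
}
```

Only after the call to intern() does the runtime-built string resolve to the pooled object, which is why the last comparison with == succeeds while the first does not.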


Recommended Readings for Hadoop

Originally Posted here – [http://saurzcode.in/2014/02/04/recommended-readings-for-hadoop/]

I am writing this series to list some recommended reading for understanding Hadoop, its architecture, the finer details of cluster setup, and so on.

Understanding Hadoop Cluster Setup and Network – Brad Hedlund, with his expertise in networks, provides minute details of the cluster setup and data exchange mechanisms of a typical Hadoop cluster.

MongoDB and Hadoop – Webinar by Mike O’Brien, Software Engineer at MongoDB, on how MongoDB and Hadoop can be used together, using core MapReduce as well as Pig and Hive.

Please post a comment if you have come across a great article or webinar that explains things in detail and with ease.