Sunday, 8 June 2014

Correct Way to Check for an Empty String in Java

In our day-to-day programming activities, we usually come across multiple scenarios where we need to check whether a string is empty or not.

There are multiple ways to do this check in Java:
  1. equals method: str.equals("")
  2. length method: str.length() == 0  //recommended
  3. isEmpty method [From Java 1.6 onward]: str.isEmpty()  //recommended
The recommended way to check for an empty string is the length() or isEmpty() method. We should avoid equals("") for such a simple requirement.

Let’s look at the source code of the String class (as shipped with JDK 6, where String keeps count and offset fields) to understand why we should use isEmpty/length:

length() method:
public int length() {
   return count;
}

isEmpty() method:
public boolean isEmpty() {
   return count == 0;
}

equals() method:
public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;
        int n = count;
        if (n == anotherString.count) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = offset;
            int j = anotherString.offset;
            while (n-- != 0) {
                if (v1[i++] != v2[j++])
                    return false;
            }
            return true;
        }
    }
    return false;
}

If we look at the source code of the three methods above, it is clear that length() is simply a getter that returns the length of the String. Any String of length 0 is an empty string, so the following expression detects whether the string is empty:
str.length() == 0;

The same reasoning applies to the isEmpty() method. Moreover, its name states the intent of the check directly.

On the other hand, equals() is relatively heavyweight. Even when the argument is an empty string, it performs a reference comparison, an instanceof check, a cast, and a length comparison before it can return. For such a basic check, equals() spends more CPU cycles than necessary.
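To make the comparison concrete, here is a minimal sketch of the three checks side by side. The class name EmptyCheckDemo and the helper isNullOrEmpty are made up for illustration. Note that all three checks throw a NullPointerException when str is null, so the sketch assumes a null guard is wanted wherever null input is possible.

```java
// Illustrative sketch; EmptyCheckDemo and isNullOrEmpty are hypothetical names.
class EmptyCheckDemo {

    // Null-safe helper: treats null the same way as "".
    static boolean isNullOrEmpty(String str) {
        return str == null || str.isEmpty();
    }

    public static void main(String[] args) {
        String empty = "";
        String text = "java";

        System.out.println(empty.equals(""));      // true
        System.out.println(empty.length() == 0);   // true
        System.out.println(empty.isEmpty());       // true
        System.out.println(text.isEmpty());        // false
        System.out.println(isNullOrEmpty(null));   // true
    }
}
```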

Do share your thoughts.
Hope this article helped. Happy learning... :)

Wednesday, 7 May 2014

String Constant Pool In Java

String is a special class in Java. It is also very important in concurrent programming because String is immutable. Since String objects are immutable, they can safely be shared and reused. In order to reuse String objects, the JVM maintains a special pool called the “String literal pool” or “String constant pool”, which stores references to String objects.

There are slight differences in various methods of creating String objects.

1. Creating String Directly [Using String Literals]:
String str = "hello";
Conceptually, all the string literals of a class are created and their references are placed in the pool when the JVM loads the class (a JVM is allowed to defer this until a literal is first used). So the literal (String object) "hello" is created in the heap and referenced from the pool no later than the first execution of the following statement:
     String str = "hello";
Hence, whenever our code uses a String literal, the JVM checks the String Literal Pool, and if an equal string already exists there, a reference to the pooled instance is returned to the caller.

So, String literals used in Java code always refer to the pooled objects of the String pool.
The JVM keeps at most one object for any given String value in the literal pool.

2. Creating String Using Constructor:
String str = new String("hello");
In this case, since we are creating the String object with the “new” keyword, a new String object is created in heap memory, separate from the String Literal Pool. So, even if the String Literal Pool already holds an equal String object, “new” always creates a different String object with the same content.

java.lang.String.intern():
It is String literals that get automatically interned/added to the String pool. String objects created with the “new” operator do not refer to objects in the String Literal Pool but can be made to by using String’s intern() method. The String.intern() returns an interned String, that is, one that has an entry in the global String Literal Pool. Using intern(), if the String is not already in the global String Literal Pool, then it will be added.
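The difference between literals, constructed strings, and intern() can be observed with reference comparisons. The following is a small sketch; the class name StringPoolDemo and the helper sameObject are made up for illustration.

```java
// Illustrative sketch; StringPoolDemo and sameObject are hypothetical names.
class StringPoolDemo {

    // Reference comparison: true only if both variables point to the same object.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal1 = "hello";
        String literal2 = "hello";
        String constructed = new String("hello");

        // Both literals refer to the single pooled instance.
        System.out.println(sameObject(literal1, literal2));              // true
        // "new" always creates a distinct object with equal content.
        System.out.println(sameObject(literal1, constructed));           // false
        System.out.println(literal1.equals(constructed));                // true
        // intern() returns the pooled instance for that content.
        System.out.println(sameObject(literal1, constructed.intern()));  // true
    }
}
```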

You can inspect constant pool of a class by running javap -verbose for that class.

For example, the following code prints “Hello!” (a String literal) to the console [compiled with Java 1.6]:
package blog.techcypher.stringpool;

/**
 * 
 * @author abhishek
 *
 */
public class StringLiteral {

    /**
     * main method
     * 
     * @param args
     */
    public static void main(String[] args) {
        System.out.println("Hello!");
    }

}


By inspecting the byte code you can easily see that String Literal “Hello!” resides in the Constant Pool.

Compiled from "StringLiteral.java"
public class blog.techcypher.stringpool.StringLiteral extends java.lang.Object
  SourceFile: "StringLiteral.java"
  minor version: 0
  major version: 50
  Constant pool:
const #1 = class        #2;     //  blog/techcypher/stringpool/StringLiteral
const #2 = Asciz        blog/techcypher/stringpool/StringLiteral;
const #3 = class        #4;     //  java/lang/Object
const #4 = Asciz        java/lang/Object;
const #5 = Asciz        <init>;
const #6 = Asciz        ()V;
const #7 = Asciz        Code;
const #8 = Method       #3.#9;  //  java/lang/Object."<init>":()V
const #9 = NameAndType  #5:#6;//  "<init>":()V
const #10 = Asciz       LineNumberTable;
const #11 = Asciz       LocalVariableTable;
const #12 = Asciz       this;
const #13 = Asciz       Lblog/techcypher/stringpool/StringLiteral;;
const #14 = Asciz       main;
const #15 = Asciz       ([Ljava/lang/String;)V;
const #16 = Field       #17.#19;        //  java/lang/System.out:Ljava/io/PrintStream;
const #17 = class       #18;    //  java/lang/System
const #18 = Asciz       java/lang/System;
const #19 = NameAndType #20:#21;//  out:Ljava/io/PrintStream;
const #20 = Asciz       out;
const #21 = Asciz       Ljava/io/PrintStream;;
const #22 = String      #23;    //  Hello!
const #23 = Asciz       Hello!;
const #24 = Method      #25.#27;        //  java/io/PrintStream.println:(Ljava/lang/String;)V
const #25 = class       #26;    //  java/io/PrintStream
const #26 = Asciz       java/io/PrintStream;
const #27 = NameAndType #28:#29;//  println:(Ljava/lang/String;)V
const #28 = Asciz       println;
const #29 = Asciz       (Ljava/lang/String;)V;
const #30 = Asciz       args;
const #31 = Asciz       [Ljava/lang/String;;
const #32 = Asciz       SourceFile;
const #33 = Asciz       StringLiteral.java;

{
public blog.techcypher.stringpool.StringLiteral();
  Code:
   Stack=1, Locals=1, Args_size=1
   0:   aload_0
   1:   invokespecial   #8; //Method java/lang/Object."<init>":()V
   4:   return
  LineNumberTable:
   line 8: 0

  LocalVariableTable:
   Start  Length  Slot  Name   Signature
   0      5      0    this       Lblog/techcypher/stringpool/StringLiteral;


public static void main(java.lang.String[]);
  Code:
   Stack=2, Locals=1, Args_size=1
   0:   getstatic       #16; //Field java/lang/System.out:Ljava/io/PrintStream;
   3:   ldc     #22; //String Hello!
   5:   invokevirtual   #24; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   8:   return
  LineNumberTable:
   line 16: 0
   line 17: 8

  LocalVariableTable:
   Start  Length  Slot  Name   Signature
   0      9      0    args       [Ljava/lang/String;


}

Some Key Points about String Literal Pool:

  1. An object is eligible for garbage collection if it has no live references from active parts of the JVM/application. String literals always have a reference to them from the String Literal Pool and therefore are not eligible for garbage collection until the class and its class loader are unloaded.
  2. All the string literals are created and their references are placed in the pool when the JVM loads the class (or, in JVMs that resolve literals lazily, when a literal is first used).
Hope this article helped. Happy learning... :)

Monday, 28 April 2014

Recursion and StackOverflowError in Java

Recursion simply means that a function calls itself inside its own body, repeatedly, to complete some task. It is one of the standard approaches to solving a problem. When programming with recursion, the one basic thing we must get right is the base condition that stops the recursion; otherwise we will run into infinite recursion.

Let’s look at a very simple recursive function “sum”, which computes the sum of the first n natural numbers.

package blog.techcypher.recursion;

/**
 * Returns the sum of 'n' natural numbers.
 * 
 * @author abhishek
 *
 */
public class Sum {
    /**
     * @param n
     * @return
     */
    public static int sum(int n) {
        int sum = 0;
        if (n==1) {
            sum = 1;
        } else {
            sum = n + sum(n-1);
        }
        return sum;
    }

    /**
     * main method to invoke recursion
     * @param args
     */
    public static void main(String[] args) {
        int n = 1000;
        System.out.print("Sum of [" + n + "] natural numbers: ");
        System.out.println(sum(n));
    }

}


Result:
Sum of [1000] natural numbers: 500500

In this example, the base condition is n == 1, i.e. the recursion stops when n reaches 1. So, we sum up the numbers from n down to 1 using recursion.
It works perfectly. :)

But what will the depth of recursion be for various values of n in the above “sum” function?
It depends on the input: every call adds one stack frame, so the depth equals n. For large values of n, like 10000 or more, the program may throw a StackOverflowError, which terminates the calling thread if it is not caught.

For n = 10000,
Sum of [10000] natural numbers: Exception in thread "main" java.lang.StackOverflowError
    at blog.techcypher.recursion.Sum.sum(Sum.java:15)

An obvious question that comes to mind is: “Why are we getting a StackOverflowError for deep recursion?”

The reason is that the recursive function can use up the thread's stack, whose size is fixed when the thread is created. When that happens, the JVM throws a StackOverflowError.

In Java, the default thread stack size depends on the JVM, the operating system, and the architecture; typical defaults are around 320k to 512k in a 32-bit VM and 1024k in a 64-bit VM. Check your JVM's documentation for the default on your platform.
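How deep a thread can recurse before the stack is exhausted can be observed directly. The sketch below recurses until a StackOverflowError is thrown and reports the depth reached; the class name StackDepthDemo is made up for illustration, and the number printed will vary with the JVM, the platform, and the -Xss setting.

```java
// Illustrative sketch; StackDepthDemo is a hypothetical class name.
class StackDepthDemo {
    private static int depth;

    // Recurses without a base condition so that the stack eventually overflows.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns the number of nested calls made before the stack was exhausted.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // The thread stack is full; the frames have been unwound by now.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Depth reached before StackOverflowError: " + measureDepth());
    }
}
```

Running it with different -Xss values shows how the reachable depth grows with the stack size.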

So, what should I do if I am getting a StackOverflowError?

Solutions:
1. Increase the default thread stack size:
    Java provides a run-time argument to tune the default thread stack size, the -Xss option.
                java -Xss2048k
     This sets the thread stack size to 2048 KB.
     Note that this may cure the symptom, but a better solution is to work out how to avoid
     recursing so much.

2. Change the implementation to an iterative solution:
    Most recursive solutions can easily be converted to iterative solutions, which will
     make the code scale to larger inputs much more cleanly. Otherwise we'll really be guessing
     at how much stack to provide, which may not even be obvious from the input.
     This should be the preferred solution.

3. Using Tail Recursion:
    With tail-call optimization we can avoid allocating a new stack frame for a
     function call, because the calling function simply returns the value that it gets from
     the called function. The most common use is tail recursion, where a recursive function
     written to take advantage of tail-call optimization can use constant stack space.
     It is supported by many functional languages, but it is not supported by Java.

Coming to best practices, we should prefer an iterative solution over recursion wherever
possible. It will generally be faster and will scale to larger inputs.
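As a sketch of the iterative approach, the recursive sum can be rewritten as a simple loop. The class name IterativeSum is made up for illustration, and the return type is widened to long (an assumption, not part of the original code) so that large inputs do not overflow int.

```java
// Illustrative sketch; IterativeSum is a hypothetical class name.
class IterativeSum {

    // Sums 1..n in a loop, so stack usage stays constant regardless of n.
    static long sum(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(1000));   // 500500
        System.out.println(sum(10000));  // 50005000
    }
}
```

Unlike the recursive version, this runs with n = 10000 (and far beyond) on the default stack size.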

Hope this article helped. Happy learning... :)

Friday, 11 April 2014

Using Unix Commands On Windows

In our day-to-day work, we need to debug various issues with the help of log files, e.g. application logs, server access logs, and error logs, and in the process we try to fetch the relevant information from them.

While working on Windows with such files, we use editors like Notepad++, JujuEdit etc. But what if the log files are really big, running into gigabytes? Most of these editors hang while opening such files, or simply refuse to open them, because they support files only up to 2 GB.

For such requirements there are a few editors, like EmEditor and Glogg, which can open big files easily. These editors are really good and very fast on big files.

But here comes a good question: can we run Unix commands like 'grep' on such big files from our Windows command prompt? It would be great to have that luxury on Windows as well.

Fortunately, the answer is a “Big YES”. There are several open source tools which make most Unix commands available on a Windows machine in our familiar Windows command prompt, e.g. UnxUtils and Cygwin.

I have been using UnxUtils for a long time now on my Windows machine. It is a lightweight alternative among similar tools, and getting started with it is simple.

Let’s jump straight to the steps to get UnxUtils working on a Windows machine:
  1. Download UnxUtils.zip.
  2. Extract the downloaded zip to some location on your Windows machine. On my machine I kept it at “D:\Softwares\UnxUtils”.
  3. Add the utility's “wbin” directory to the PATH environment variable. On my machine I added “D:\Softwares\UnxUtils\usr\local\wbin” to PATH.
  4. Before moving ahead, verify the PATH setting for the “wbin” directory, for example by running echo %PATH% in a new command prompt.

It's all set... :)
Now we can use Unix commands in our Windows command prompt. Let’s quickly check the “grep” command help in the Windows Command Prompt by running grep --help.



Hope this article helped. Happy learning!

Sunday, 30 March 2014

Creating Object Pool in Java

In this post, we will take a look at how we can create an object pool in Java.

In recent times, JVM performance has improved manifold, so object creation is no longer considered as expensive as it once was. But there are a few kinds of objects whose creation is still costly because they are not lightweight, e.g. database connections, parsers, and threads. Any application needs to create many such objects, and since creating them is costly, it hurts the application's performance. It would be great if we could reuse the same objects again and again.

Object pools are used for this purpose. Basically, an object pool can be visualized as a store where we keep such objects so that they can be used and reused dynamically. Object pools also control the life-cycle of pooled objects.

Now that we understand the requirement, let’s get to the real stuff. Fortunately, there are various open source object pooling frameworks available, so we do not need to reinvent the wheel.

In this post we will use Apache Commons Pool to create our own object pool. At the time of writing, version 2.2 is the latest, so let us use it.

The basic things we need to create are:
 1. A pool to store heavyweight objects (pooled objects).
 2. A simple interface, so that a client can:
a.)    Borrow a pooled object for its use.
b.)    Return the borrowed object after its use.
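Before turning to Commons Pool, the two requirements above can be sketched with plain JDK classes. This is only an illustration of the borrow/return idea under simplifying assumptions (a fixed, pre-filled set of objects, no validation or eviction); SimplePool and its methods are hypothetical names and are not part of Commons Pool.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative sketch only; SimplePool is a hypothetical class, not the Commons Pool API.
class SimplePool<T> {
    private final BlockingQueue<T> idle;

    // Pre-fills the pool with the given instances (at least one is required).
    SimplePool(Collection<T> instances) {
        this.idle = new ArrayBlockingQueue<T>(instances.size(), false, instances);
    }

    // Blocks until an idle instance is available, then hands it to the caller.
    T borrowObject() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a pooled object", e);
        }
    }

    // Puts a borrowed instance back so that other callers can reuse it.
    void returnObject(T obj) {
        idle.offer(obj);
    }

    // Number of instances currently waiting in the pool.
    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<StringBuilder>(
                Arrays.asList(new StringBuilder(), new StringBuilder()));

        StringBuilder sb = pool.borrowObject();
        try {
            sb.append("work");
        } finally {
            sb.setLength(0);          // reset state before handing the object back
            pool.returnObject(sb);
        }
        System.out.println(pool.idleCount()); // 2
    }
}
```

A real pooling framework adds what this sketch leaves out: on-demand creation through a factory, validation, eviction of idle objects, and statistics.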

Let’s start with parser objects.
Parsers are normally designed to parse some kind of document, such as XML or HTML files.
Creating a new XML parser for each XML file (with the same structure) is really costly. One would rather reuse the same parser object (or a few of them, in a concurrent environment) for XML parsing.
In such a scenario, we can put some parser objects into a pool so that they can be reused as and when needed.

Below is a simple parser declaration:
package blog.techcypher.parser;

/**
 * Abstract definition of Parser.
 * 
 * @author abhishek
 *
 */
public interface Parser<E, T> {

    /**
     * Parse the element E and set the result back into target object T.
     * 
     * @param elementToBeParsed
     * @param result
     * @throws Exception
     */
    public void parse(E elementToBeParsed, T result) throws Exception;
    
    
    /**
     * Tells whether this parser is valid or not. This ensures that we
     * never use an invalid/corrupt parser.
     * 
     * @return
     */
    public boolean isValid();
    
    
    /**
     * Reset parser state back to the original, so that it will be as
     * good as new parser.
     * 
     */
    public void reset();
}


Let’s implement a simple XML Parser over this as below:
package blog.techcypher.parser.impl;

import blog.techcypher.parser.Parser;

/**
 * Parser for parsing xml documents.
 * 
 * @author abhishek
 *
 * @param <E>
 * @param <T>
 */
public class XmlParser<E, T> implements Parser<E, T> {
    private Exception exception;

    @Override
    public void parse(E elementToBeParsed, T result) throws Exception {
        try {
            System.out.println("[" + Thread.currentThread().getName()+ "]: Parser Instance:" + this);
            // Do some real parsing stuff.
            
        } catch(Exception e) {
            this.exception = e;
            e.printStackTrace(System.err);
            throw e;
        }
    }

    @Override
    public boolean isValid() {
        return this.exception == null;
    }

    @Override
    public void reset() {
        this.exception = null;
    }

}


Now that we have a parser object, we should create a pool to store such objects.

Here, we will use GenericObjectPool to store the parser objects. Apache Commons Pool already has built-in classes for pool implementations. GenericObjectPool can be used to store any kind of object. Each pool holds a single kind of object and has a factory associated with it.
GenericObjectPool provides a wide variety of configuration options, including the ability to cap the number of idle or active instances, to evict instances as they sit idle in the pool, etc.

If you want to manage multiple pools for different kinds of objects (e.g. parsers, converters, device connections etc.), then you should use GenericKeyedObjectPool.

Below is our pool implementation for parser objects:
package blog.techcypher.parser.pool;

import org.apache.commons.pool2.PooledObjectFactory;
import org.apache.commons.pool2.impl.GenericObjectPool;
import org.apache.commons.pool2.impl.GenericObjectPoolConfig;

import blog.techcypher.parser.Parser;

/**
 * Pool Implementation for Parser Objects.
 * It is an implementation of ObjectPool.
 * 
 * It can be visualized as-
 * +-------------------------------------------------------------+
 * | ParserPool                                                  |
 * +-------------------------------------------------------------+
 * | [Parser@1, Parser@2,...., Parser@N]                         |
 * +-------------------------------------------------------------+
 * 
 * @author abhishek
 *
 * @param <E>
 * @param <T>
 */
public class ParserPool<E, T> extends GenericObjectPool<Parser<E, T>>{

    /**
     * Constructor.
     * 
     * It uses the default configuration for pool provided by
     * apache-commons-pool2.
     * 
     * @param factory
     */
    public ParserPool(PooledObjectFactory<Parser<E, T>> factory) {
        super(factory);
    }

    /**
     * Constructor.
     * 
     * This can be used to have full control over the pool using configuration
     * object.
     * 
     * @param factory
     * @param config
     */
    public ParserPool(PooledObjectFactory<Parser<E, T>> factory,
            GenericObjectPoolConfig config) {
        super(factory, config);
    }

}


As we can see, the constructor of the pool requires a factory to manage the life-cycle of pooled objects. So we need to create a parser factory which can create parser objects.

Commons Pool provides a generic interface for defining a factory (PooledObjectFactory). A PooledObjectFactory creates and manages PooledObjects. These object wrappers maintain object pooling state, enabling PooledObjectFactory methods to access data such as instance creation time or time of last use.

A DefaultPooledObject is provided, with natural implementations of the pooling state methods. The simplest way to implement a PooledObjectFactory is to extend BasePooledObjectFactory. This factory provides a makeObject() that returns wrap(create()), where create and wrap are abstract. We provide an implementation of create to create the underlying objects that we want to manage in the pool, and of wrap to wrap created instances in PooledObjects.

So, here is our factory implementation for parser objects:
package blog.techcypher.parser.pool;

import org.apache.commons.pool2.BasePooledObjectFactory;
import org.apache.commons.pool2.PooledObject;
import org.apache.commons.pool2.impl.DefaultPooledObject;

import blog.techcypher.parser.Parser;
import blog.techcypher.parser.impl.XmlParser;

/**
 * Factory to create parser object(s).
 * 
 * @author abhishek
 *
 * @param <E>
 * @param <T>
 */
public class ParserFactory<E, T> extends BasePooledObjectFactory<Parser<E, T>> {

    @Override
    public Parser<E, T> create() throws Exception {
        return new XmlParser<E, T>();
    }

    @Override
    public PooledObject<Parser<E, T>> wrap(Parser<E, T> parser) {
        return new DefaultPooledObject<Parser<E,T>>(parser);
    }
    
    @Override
    public void passivateObject(PooledObject<Parser<E, T>> parser) throws Exception {
        parser.getObject().reset();
    }
    
    @Override
    public boolean validateObject(PooledObject<Parser<E, T>> parser) {
        return parser.getObject().isValid();
    }

}


At this point we have created our pool to store parser objects, and we have a factory to manage the life-cycle of parser objects.

Notice that we have implemented a couple of extra methods:
1. boolean validateObject(PooledObject<T> obj): This is used to validate an object borrowed from
     the pool or returned to the pool, based on configuration. By default, validation is off.
     Implementing this ensures that the client always gets a valid object from the pool.

2. void passivateObject(PooledObject<T> obj): This is called when an object is returned to the pool.
    In the implementation we can reset the object state, so that the object behaves like a new
    object on the next borrow.

Since we have everything in place, let’s create a test for this pool. Pool clients can:
 1. Get an object by calling pool.borrowObject()
 2. Return the object to the pool by calling pool.returnObject(object)

Below is our code to test the parser pool:
package blog.techcypher.parser;
import static org.junit.Assert.fail;

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

import org.junit.Assert;

import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
import org.junit.Before;
import org.junit.Test;

import blog.techcypher.parser.pool.ParserFactory;
import blog.techcypher.parser.pool.ParserPool;


/**
 * Test case to test-
 *  1. object creation by factory
 *  2. object borrow from pool.
 *  3. returning object back to pool.
 * 
 * @author abhishek
 *
 */
public class ParserFactoryTest {

    private ParserPool<String, String> pool;
    private AtomicInteger count = new AtomicInteger(0);
    
    @Before
    public void setUp() throws Exception {
        GenericObjectPoolConfig config = new GenericObjectPoolConfig();
        config.setMaxIdle(1);
        config.setMaxTotal(1);
        
        /*---------------------------------------------------------------------+
        |TestOnBorrow=true --> To ensure that we get a valid object from pool  |
        |TestOnReturn=true --> To ensure that valid object is returned to pool |
        +---------------------------------------------------------------------*/
        config.setTestOnBorrow(true);
        config.setTestOnReturn(true);
        pool = new ParserPool<String, String>(new ParserFactory<String, String>(), config);
    }

    @Test
    public void test() {
        try {
            int limit = 10;
            
            ExecutorService es = new ThreadPoolExecutor(10, 10, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<Runnable>(limit));
            
            for (int i=0; i<limit; i++) {
                Runnable r = new Runnable() {
                    @Override
                    public void run() {
                        Parser<String, String> parser = null;
                        try {
                            parser = pool.borrowObject();
                            count.getAndIncrement();
                            parser.parse(null, null);
                        
                        } catch (Exception e) {
                            e.printStackTrace(System.err);
                        
                        } finally {
                            if (parser != null) {
                                pool.returnObject(parser);
                            }
                        }
                    }
                };
                
                es.submit(r);
            }
            
            es.shutdown();
            
            try {
                es.awaitTermination(1, TimeUnit.MINUTES);
                
            } catch (InterruptedException ignored) {}
            
            System.out.println("Pool Stats:\n Created:[" + pool.getCreatedCount() + "], Borrowed:[" + pool.getBorrowedCount() + "]");
            Assert.assertEquals(limit, count.get());
            Assert.assertEquals(count.get(), pool.getBorrowedCount());
            Assert.assertEquals(1, pool.getCreatedCount());
            
        } catch (Exception ex) {
            fail("Exception:" + ex);
        }
    }

}


Result:
[pool-1-thread-1]: Parser Instance:blog.techcypher.parser.impl.XmlParser@fcfa52
[pool-1-thread-2]: Parser Instance:blog.techcypher.parser.impl.XmlParser@fcfa52
[pool-1-thread-3]: Parser Instance:blog.techcypher.parser.impl.XmlParser@fcfa52
[pool-1-thread-4]: Parser Instance:blog.techcypher.parser.impl.XmlParser@fcfa52
[pool-1-thread-5]: Parser Instance:blog.techcypher.parser.impl.XmlParser@fcfa52
[pool-1-thread-8]: Parser Instance:blog.techcypher.parser.impl.XmlParser@fcfa52
[pool-1-thread-7]: Parser Instance:blog.techcypher.parser.impl.XmlParser@fcfa52
[pool-1-thread-9]: Parser Instance:blog.techcypher.parser.impl.XmlParser@fcfa52
[pool-1-thread-6]: Parser Instance:blog.techcypher.parser.impl.XmlParser@fcfa52
[pool-1-thread-10]: Parser Instance:blog.techcypher.parser.impl.XmlParser@fcfa52
Pool Stats:
 Created:[1], Borrowed:[10]


You can easily see that a single parser object was created and reused by all ten threads.

Commons Pool 2 is far better than version 1 in terms of performance and scalability.
Also, version 2 includes robust instance tracking and pool monitoring.
Commons Pool 2 requires JDK 1.6 or above. There are lots of configuration options to control and manage the life-cycle of pooled objects.

And so ends our long post... :)

Hope this article helped. Happy learning!