Java Rumblings: LinkedHashMap

Showing posts with label LinkedHashMap. Show all posts

Saturday, 28 May 2011

LinkedHashMap in java

New to J2SE 1.4, the LinkedHashMap class is used in exactly the same way as the HashMap class. It has no additional methods but it does have one additional constructor that we will discuss in a moment. The basic difference between HashMap and LinkedHashMap is that the LinkedHashMap maintains the order of the items added to the Map. It does this by maintaining a doubly linked list containing the hash and the original order of the items. According to Sun, the LinkedHashMap should run nearly as fast as the HashMap.

The LinkedHashMap has one additional constructor that takes an additional boolean parameter. This allows you to construct a LinkedHashMap that maintains items, not in the order that they are added to the Map but rather in the order in which they are accessed.

Example of LinkedHashMap

LinkedHashMap example in java

This example shows LinkedHashMap in java:

public class JavaLinkedHashMapExample {
    public static void main(String[] args) {
        //create object of LinkedHashMap
        LinkedHashMap lHashMap = new LinkedHashMap();
        //put the values 
        lHashMap.put("One", new Integer(1));
        lHashMap.put("Two", new Integer(2));
        //get the value
        Object obj = lHashMap.get("One");
        System.out.println(obj); 
        //iterate over linked hash map values
        Collection c = lHashMap.values();
        
        //obtain an Iterator for Collection
        Iterator itr = c.iterator();
        
         
        
        //iterate through LinkedHashMap values iterator        
        while(itr.hasNext())        
            System.out.println(itr.next());
    }
}

Saturday, 14 May 2011

Performance of Map interface implementations

Hashtable

An instance of Hashtable has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. Note that the hash table is open: in the case of a "hash collision", a single bucket stores multiple entries, which must be searched sequentially. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. The initial capacity and load factor parameters are merely hints to the implementation. The exact details as to when and whether the rehash method is invoked are implementation-dependent.

HashMap

This implementation provides constant-time [ Big O Notation is O(1) ] performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets.
Iteration over collection views requires time proportional to the "capacity" of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.

TreeMap

The TreeMap implementation provides guaranteed log(n) [ Big O Notation is O(log N) ] time cost for the containsKey, get, put and remove operations.

LinkedHashMap

A linked hash map has two parameters that affect its performance: initial capacity and load factor. They are defined precisely as for HashMap. Note, however, that the penalty for choosing an excessively high value for initial capacity is less severe for this class than for HashMap, as iteration times for this class are unaffected by capacity.

Thursday, 31 March 2011

HashMap vs HashTable

At the outset, both are similar in functionality (purpose) but for very small difference. Both store data in key/value pairs. There is nothing with HashMap that cannot be done with Hashtable.

Following table gives the differences.

Hashtable	HashMap
Introduced with JDK 1.0 version, the starting version of Java	Introduced with JDK 1.2
Part of legacy classes (the data structures of JDK 1.0 are known as legacy classes)	Part of Collections framework
Methods of Hashtable are synchronized by default	Methods are not synchronized by default
Does not permit null values (trying to add, throws NullPointerException)	Permits null values (either key or value)
Enumeration interface is used for iterating keys	Iterator is used for iterating the keys
Enumerator is not fail-fast	Iterator is fail-fast
Safe for multithreaded applications	Better for non-threaded applications
Low performance due to synchronization	High performance
Hashtable is serialized	HashMap is not serialized

Hashtable can make use of HashMap methods to get the features of collections framework. As of JDK 1.2, the Hashtable implements Map. Following is the class signature of Hashtable.
public class java.util.Hashtable extends java.util.Dictionary implements java.util.Map, java.lang.Cloneable, java.io.Serializable
What is synchronized?

The concept of synchronization comes in multithreading. In multithreaded applications, there is every chance that multiple threads accessing the same source of data at the same time. This results in data corruption and data inconsistency. Synchronization avoids this by allowing only one thread to access the resource of data at a time.
That is, when one thread is modifying the data of Hashtable, the other thread cannot modify. The other thread should wait until the first one completes its job. This is done by acquiring the lock on Hashtable by the first thread. The lock will not be released until the first completes its job.
What is fail-fast?
The concept of fail-fast comes with iterators. When an Iterator object is created and iteration is going on, the HashMap elements cannot be modified (like addition or deletion of elements cannot be done). This is explained programmatically in ConcurrentModificationException.
Even though the HashMap methods are not synchronized, we can obtain a synchronized HashMap optionally as follows.

Map m = Collections.synchronizeMap(HashMap);

Programs on Hashtable and HashMap are available at Hashtable Examples and HashMap Examples.
About Hashing and Hashcode
Comparing two strings letter by letter in a for loop is a time taking process. To make faster, the JVM converts each string into an integer number called hashcode. Different strings with different sequence of characters have different hashcodes. Comparison with integer numbers gives maximum performance. Usage of hashcode numbers for comparison, searching of duplicate elements and identification is faster.
Hashing is process of converting a string or object into a 32-bit integer number. Two objects are said to be equal if their hashcodes are same. hashCode() is used in combination of equals() method. When compared, hashing is done automatically by the JVM. Hashing, in data structures, is done implicitly in the basic operations with add(), contains(), remove() and size() etc. Hashing is more useful to compare the sets of large content.

Pages