Java Rumblings: Collections

Showing posts with label Collections. Show all posts

Wednesday, 25 January 2012

Hashset Example in java

http://way2java.com/collections/hashset-operations/

Stack example with generics

http://way2java.com/collections/java-generics-stack/

Sunday, 22 January 2012

Queue example in java

Before going into the details of this program, it is advised to go through Queue Fundamentals where queue basics were discussed.
Following are some methods that do the similar operations.

boolean offer(Object obj): Adds the element obj to the queue. If the addition is successful, the method returns true else false.
Object poll(): Returns the head (first) element and also deletes it. That is, we cannot get it again. If no element exists (when queue is empty), the method returns null.
Object remove(): It also returns and deletes the head element like poll(), but with a small difference. This method throws NoSuchElementException if the queue is empty.
Object peek(): Returns the head element but it does not delete it. That is, we can get it again. Returns null when the queue is empty.
Object element(): It works similar to peek() but with a small difference (returns but does not delete the element). It throws NoSuchElementException when the queue is empty.

offer() method is equivalent to add() method of Collection interface.
http://way2java.com/collections/queue-programming/

interface Queue was introduced with JDK 1.5 and is part of collections framework as Queue extends Collection interface. Queue stores elements in such way it returns the first one added. Queue follows the style of FIFO (first in, first out, remember stack follows LIFO, last in, first out). The first element in the queue is known as head and the last one is known as tail. New elements are added to the tail.
Following is the signature
public interface Queue extends Collection As queue is a subclass of Collection, it can make use of all the methods of Collection and also the extra defined in its own such as inserting, retrieving and checking the elements.
Queue vs List
The major variation is that list allows retrieving the elements from any position. But, queue follows FIFO. LinkedList implements both interfaces List and Queue.
Programming Tip: Queue allows null elements. But do not add null as the return type of poll() method is also null when the queue is empty. The return null value may confuse (or conflict) with the actual null value you have added.
Realtime examples of Queue

Job Seekers Queue: Imagine a number of participants, aspiring job, appearing for interview. They stand in a queue. The first one stood (head) in the queue comes out first and attends the interview. The participant who comes late joins at the last of the queue (tail). While queue moves on a number check points may be there to check their bonafides and marks lists etc. Care must be taken one participant cannot be checked by many at a time or one checking person cannot check a number of participants at a time. For this a synchronized queue (java.util.concurrent.BlockingQueue) exactly suits.
Cinema Tickets Queue: A person who stands first in the booking counter leaves first.

Saturday, 21 January 2012

Stack example in java

http://way2java.com/collections/java-stack/2/

Java Stack

Stack, in any language, stores elements in LIFO (Last-in First-out) order. As the name indicates, in LIFO, the last added element is retrieved first. It can be imagined like a stack of plates put on the top of the other. The plate added at the end is drawn first. In stack, elements cannot be added in the middle or removed. There can be only one point of entry and one point of exit.
Following is the class signature
public class Stack extends Vector Stack is derived from Vector and again Vector is derived from List. List belongs to collections framework (JDK 1.2) and Stack and Vector belong to legacy classes (JDK 1.0). The Java designers extended List to Stack and Vector to give the features and advantages of collections to legacy classes. It is a good designing principle. Now stack and vector can use all the methods of Collection and List interfaces.
The java.util.Stack comes with five operations represented by five methods.

Object push(Object obj): An element obj is added the to the stack with push() method. Elements are added implicitly on the top of another. The method returns the element added.
Object pop(): With this method, the elements can be retrieved from the Stack. The last one added (which lies on the top) is returned first. With pop(), the element comes out of the stack permanently. That is, it cannot be retrieved again.
int search(Object obj): With this method, an element, obj, index number can be obtained. The counting starts from the top. The topmost element index number is 1. If the stack is empty, the method returns EmptyStackException.
peek(): This method returns the element existing on the top. But element is not popped (removed) out. With peek method, the element existence also can be known.
boolean empty(): Returns true if no elements exist in the stack.

pop vs peek

For a beginner, these two methods are very confusing. Let us discuss again. The pop() method returns the element existing on the top of the stack. The element returned is deleted from the stack permanently. That is, the programmer cannot get it again.
The peek() method also returns the element but the element is not removed from the stack. That is, calling peek() method five times, returns the same element 5 times. peek() method can be used to know which element is lying on the top of the stack or which element is returned when pop() is called. It can be used in regular programming like this.

if(st.peek().equals("SNRao"))
       {
          System.out.println(st.pop());
       }

Limitations of Stack Usage
There are two limitations for stack usage.

The programmer stores data in a DS with an intention to get the elements whenever he would like in the code. But with stack, the popped element is completely removed and cannot be obtained again. For this reason stack is not used much in general production software like health and banking etc, but used by system and tool developers.
Another drawback is, the elements cannot be added or retrieved from the middle.

The Stack class comes with only one constructor.

Stack(): This default constructor creates an empty Stack object.

Thursday, 19 January 2012

TreeMap Program - Special functions

public static void specialFunctions(String args[])
{
 TreeMap<String, Integer> tm1 = new TreeMap<String, Integer>();  
 tm1.put("banana", 20);
 tm1.put("apple", 50);
 tm1.put("melon", 40);
 tm1.put("guava", 20);
 tm1.put("cherry", 30);

 System.out.println("\nElements of tm1: " + tm1);
 Set mySet = tm1.keySet();
 System.out.println("Keys of tm1: " + mySet);

 Collection myCol = tm1.values();
 System.out.println("Values of tm1: " + myCol);

 // USING ITERATOR
 System.out.print("\nPrinting keys with Iterator: ");
 Iterator it1 = mySet.iterator();
 while(it1.hasNext())
 {
  System.out.print(it1.next() + ", ");
 } 
 // TREEMAP SORT BY VALUE
 TreeSet yourSet = new TreeSet();
 yourSet.addAll(myCol);
 System.out.println("\nTreeMap sort by value: " + yourSet);

 // COMPARING TWO TREE MAPS
 TreeMap<String, Integer> tm2 = new TreeMap<String, Integer>();  
 tm2.putAll(tm1);
 System.out.println("\nElements of tm2: " + tm2);
 if(tm1.equals(tm2))
 {
  System.out.println("tm1 and tm2 contain same elements\n");
 } 

 System.out.println("\t\tEXTRACTING TREEMAP ELEMENTS\n");

 SortedMap m1 = tm1.subMap("banana", "guava");
 System.out.println("subMap(\"banana\", \"guava\"): " + m1);

 SortedMap m2 = tm1.headMap("guava");
 System.out.println("headMap(\"guava\"): " + m2);

 SortedMap m3 = tm1.tailMap("guava");
 System.out.println("tailMap(\"guava\"): " + m3);
}

TreeMap Progam - General Functions

public static void generalFunctionsDemo(String args[])
{
 TreeMap tm1 = new TreeMap();
 System.out.println("tm1 isEmpty() before adding elements: " + tm1.isEmpty());
 tm1.put("one", 1);
 tm1.put("apple", "red");
 tm1.put("two", 2);
 tm1.put("three", 3);
 tm1.put("four", 4);

 System.out.println("tm1 isEmpty() after adding elements: " + tm1.isEmpty());
 System.out.println("Key/value pairs of tm1:" +  tm1);

 System.out.println("\napple key exists: " + tm1.containsKey("apple"));
 System.out.println("two key exists: " + tm1.containsKey("two"));
 System.out.println("red value exists: " + tm1.containsValue("red"));
 System.out.println("2 value exists: " + tm1.containsValue(2));

 System.out.println("\nValue of three: " + tm1.get("three"));
 System.out.println("Value of four: " + tm1.get("four"));

 System.out.println("\nNo. of elements before removal: " + tm1.size());
 tm1.remove("one");
 System.out.println("No. of elements after removal: " + tm1.size());
 tm1.clear();
 System.out.println("No. of elements after clear(): " + tm1.size());
}

Iterate over hashmap

public static void iterateOverHashmapGeneric(){


 HashMap<String, Integer> hm1 = new HashMap<String, Integer>();
 hm1.put("E", 69);
 hm1.put("A", 65);
 hm1.put("G", 71);
 hm1.put("C", 67);

 Set<String> mySet = hm1.keySet();
 System.out.print("foreach printing: ");
 for(String str : mySet)
 {
  System.out.print(str + ":" + hm1.get(str) + ", ");
 }

 HashMap<String, Integer> hm2 = new HashMap<String, Integer>();
 hm2.putAll(hm1);
 if(hm1.equals(hm2))
 {
  System.out.println("\n\nhm1 and hm2 contain the same elements");
 }

 HashMap<String, Integer> hm3 = (HashMap) hm1.clone();
 System.out.println("\nElements of hm3: " + hm3);   

}

A generics HashMap object hm1 is created that stores keys as strings and values as integers. With put() method elements are added.

Set<String> mySet = hm1.keySet();
     for(String str : mySet)
     {
       System.out.print(str + ":" + hm1.get(str) + ", ");
     }

With HashMap, Iterator cannot be used as Iterator stores elements of single entities but HashMap stores pairs. The new JDK 1.5 enhanced for loop (foreach) cannot be used straight away for the same reason. It is used indirectly. The keySet() returns all the keys of HashMap hm1. As keys are single entities, to iterate them, foreach loop is used. The str refers a key and using the key str, the value associated is retrieved with get() method.

HashMap<String, Integer> hm2 = new HashMap<String, Integer>();
     hm2.putAll(hm1);
     if(hm1.equals(hm2))
     {
       System.out.println("\n\nhm1 and hm2 contain the same elements");
     }

Another generics object hm2 is created and all the elements of hm1 are added at a time with putAll() method. equals() method is used to check both hm1 and hm2 contains the same elements are not. As they have same, the method returns true.

HashMap<String, Integer> hm3 = (HashMap) hm1.clone();
     System.out.println("\nElements of hm3: " + hm3);

Any Collection object including HashMap can be cloned. As the cloned object, hm3 and the hm1 occupy different locations and have their own set of values, they maintain perfect encapsulation.

Iterator vs ListIterator

http://way2java.com/collections/iterator-vs-listiterator/

ConcurrentModificationException

http://way2java.com/exceptions/concurrentmodificationexception/

Enumeration Interface

Enumeration interface is from java.util package introduced with JDK 1.0 version, the starting version of Java. It is used by many data structures (especially with legacy classes) to print their values. Following program illustrates.

import java.util.*;
public class EnumerationDemo
{
    public static void main(String args[])
    {
        Vector vect1 = new Vector();
        vect1.addElement("Raju");  
        vect1.addElement("Reddy");
        vect1.addElement("Rao");
        vect1.addElement("Ratnakar");

        Enumeration e = vect1.elements();

        while(e.hasMoreElements())
        {
            Object obj1 = e.nextElement();
            System.out.println(obj1);
        }
    }
}

In the above code, the elements() method of Vector returns an object of Enumeration. Following is the elements() method signature as defined in Vector class.

public Enumeration elements();

Observe, the elements() method returns an object of Enumeration. The above program gives a warning as program is designed for generics. But the program executes fine.

How Enumeration works?

The Enumeration interface comes with two methods.

public abstract boolean hasMoreElements();
public abstract Object nextElement();

The Enumeration object e contains all the elements of the Vector vect1. The Enumeration object e maintains a cursor over the elements. By default, the cursor is placed on the first element. The hasMoreElements() returns true if the cursor points to an element else false. If there exists an element (known by the true value), the control enters the while loop. In the while loop, the nextElement() returns the value (pointed by the cursor) and then pushes to the next element. Again the loop goes back and checks whether an element exists or not (known by the return value of hasMoreElements()) and then enters the loop. The nextElement() does its job. Like this, complete elements of the e object are returned. When cursor points to no element, the hasMoreElements() method returns false and the loop terminates.

The nextElement() method returns an object of Object class. If required, it can be casted to the respective element and used in the program. The following modified while loop, adds "great man" if the name is Rao.

We can make an analogy for nextElement() method. It is like read() method of FileInputStream. The read() method reads and returns the byte where the file pointer exists and then pushes the file pointer to the next byte.

Sorting keys and values in Hashmap

Example shows how sorting operations on keys and values are done.

public static void sortingHashMapDemo(){
 HashMap hm1 = new HashMap();
 hm1.put("one", 1);
 hm1.put("thousand", 1000);
 hm1.put("ten", 10);
 hm1.put("hundred", 100);
 // SORTING KEYS
 Set mySet = hm1.keySet();
 System.out.println("\nhm1 keys: " + mySet);
 TreeSet ts1 = new TreeSet(mySet);
 System.out.println("hm1 sorted keys: " + ts1);

 // SORTING VALUES
 Collection myCol = hm1.values();
 System.out.println("\nhm1 values: " + myCol);   
 TreeSet ts2 = new TreeSet(myCol);
 System.out.println("hm1 sorted values: " + ts2);

 // GET KEY FROM VALUE
 for(Object obj1 : mySet)
 {
  if(hm1.get(obj1).equals(10))
  {
   System.out.println("10 value key is " + obj1);   
  }
 }
}

Wednesday, 18 January 2012

Collection methods

Map

`obj =`	`m.put(key, val)`	Creates mapping from `key` to `val`. It returns the previous value (or null) associated with the key.
	`m.putAll(map2)`	Adds all key-value entries from another map, map2.
Removing key-value pairs from a map
	`m.clear()`	Removes all elements from a Map
`obj =`	`m.remove(key)`	Deletes mapping from `key` to anything. Returns previous value (or null) associated with the key.
Retrieving information from the map
`b =`	`m.containsKey(key)`	Returns true if `m` contains a key `key`
`b =`	`m.containsValue(val)`	Returns true if `m` contains `val` as one of the values
`obj =`	`m.get(key)`	Returns value corresponding to `key`, or `null` if there is no mapping. If `null` has been stored as a value, use `containsKey` to see if there is a mapping.
`b =`	`m.isEmpty()`	Returns `true` if `m` contains no mappings.
`i =`	`m.size()`	Returns number of mappings in `m`.
Retrieving all keys, values, or key-value pairs (necessary for iteration)
`set =`	`m.entrySet()`	Returns set of Map.Entry values for all mappings.
`set =`	`m.keySet()`	Returns Set of keys.
`col =`	`m.values()`	Returns a Collection view of the values in `m`.

TreeMap vs HashMap

After knowing Hashtable vs HashMap, now let us see the comparison of HashMap with TreeMap. Basically both are derived from Map interface and meant to store key/value pairs. But TreeMap inherits one more interface SortedMap and for this reason it attains the property of returning the elements in sorting order by default (irrespective of the addition of elements in any order).

By nature, all the classes derived from Map interface allow null values either as keys are values. Null keys are accepted only once because Map classes do not allow duplicate keys.
Similarities:

The methods of both are not synchronized. Both are not thread-safe. But we can obtain a synchronized version as follows.

HashMap hm1 = new HashMap();
Map m1 = Collections.synchronizedMap(hm1);

Now the methods of m1 are synchronized but still the methods of hm1 are not synchronized. So the programmer still has got a choice of using synchronized or unsynchronized map. This is the height of intelligence of Java designers in developing collection framework classes. It is a much matured framework.
Both do not support duplicate keys. If added it overrides the earlier (but not error or exception).
Iterators of both are fail-fast.
Allows null values for both keys and values (null key is allowed only once because no duplicate keys).
HashMap is the direct implementation of Map whereas TreeMap comes with an intermittent SortedMap (see the above hierarchy).

Differences:

HashMap returns unordered values. TreeMap returns the elements in ascending order (known as natural order) of keys by default (the affect of deriving from SortedMap). TreeMap order can be customized using Comparator and Comparable interfaces.
The performance of HapMap is higher than TreeMap because TreeMap requires minimum execution of two method calls to create a tree structure and to print the elements in natural order. The performance of HashMap can still be enhanced by the programmer by choosing optimum initial capacity and load factor, at the time of HashMap object creation itself, as per his requirements.

What is synchronized?
The concept of synchronization comes in multithreading. In multithreaded applications, there is every chance that multiple threads accessing the same source of data at the same time. This results in data corruption and data inconsistency. Synchronization avoids this by allowing only one thread to access the resource of data at a time.
That is, when one thread is modifying the data of Hashtable, the other thread cannot modify. The other thread should wait until the first one completes its job. This is done by acquiring the lock on Hashtable by the first thread. The lock will not be released until the first completes its job.
What is fail-fast?
The concept of fail-fast comes with iterators. When an Iterator object is created and iteration is going on, the HashMap elements cannot be modified (like addition or deletion of elements cannot be done). This is explained programmatically in ConcurrentModificationException.
Programs on HashMap and TreeMap are available at HashMap Tutorial and TreeMap Tutorial.
About Hashing and Hashcode
Comparing two strings letter by letter in a for loop is a time taking process. To make faster, the JVM converts each string into an integer number called hashcode. Different strings with different sequence of characters have different hashcodes. Comparison with integer numbers gives maximum performance. Usage of hashcode numbers for comparison, searching of duplicate elements and identification is faster.
Hashing is process of converting a string or object into a 32-bit integer number. Two objects are said to be equal if their hashcodes are same. hashCode() is used in combination of equals() method. When compared, hashing is done automatically by the JVM. Hashing, in data structures, is done implicitly in the basic operations with add(), contains(), remove() and size() etc. Hashing is more useful to compare the sets of large content.

Monday, 3 October 2011

Making collections final

Once we write final

ArrayList list = new ArrayList();

We can add, delete objects from this list, but I can not

list = new ArrayList() or list = list1.

So declaring the list final means that you cannot reassign the list variable to another object.

There are two situations in which this can be useful.

You want to make sure that no-one reassigns your list variable once it has received its value. This can reduce complexity and helps in understanding the semantics of your class/method. In this case you are usually better off by using good naming conventions and reducing method length (the class/method is already too complex to be easily understood).
When using inner classes you need to declare variables as final in an enclosing scope so that you can access them in the inner class. This way, Java can copy your final variable into the inner class object (it will never change its value) and the inner class object does not need to worry what happens to the outer class object while the inner class object is alive and needs to access the value of that variable.

Thursday, 7 July 2011

Deque in java

The java.util.Deque interface is a subtype of the java.util.Queue interface. It represents a queue where you can insert and remove elements from both ends of the queue. Thus, "Deque" is short for "Double Ended Queue" and is pronounced "deck", like a deck of cards.

Deque Implementations

Being a Queue subtype all methods in the Queue and Collection interfaces are also available in the Deque interface.

Since Deque is an interface you need to instantiate a concrete implementation of the interface in order to use it. You can choose between the following Deque implementations in the Java Collections API:

java.util.ArrayDeque (Click on the link to read about it.)
java.util.LinkedList

LinkedList is a pretty standard Deque / Queue implementation.

Internal Implementation
ArrayDeque stores its elements internally in an array. If the number of elements exceeds the space in the array, a new array is allocated, and all elements moved over. In other words, the ArrayDeque grows as needed, even if it stores its elements in an array.

Possible Usage
Because the double-ended queue supports addition or removal of elements from either end of the data structure, it can be used as a queue (first-in-first-out/FIFO) or as a stack (last-in-first-out/LIFO). J2SE 5 introduced a Queue interface and the Deque interface extends this interface. Note that a Vector-based Stack class has been present since JDK 1.0, but the Javadoc documentation for this class states that Deque should be used instead because it provides better LIFO operation support. Implementations of Deque are typically expected to perform better than Stack as well.

The Deque interface is extended by the concurrency-supporting interface BlockingDeque and is implemented by classes LinkedBlockingDeque (introduced in Java SE 6), LinkedList (available since JDK 1.2, but now implements Deque), and ArrayDeque (introduced in Java SE 6). For this blog entry, I will demonstrate using one Deque instance as a queue and one Deque instance as a stack using ArrayDeque for both. ArrayDeque is not thread-safe and does not allow for elements of the data structure to be null (a recommended but not required condition of Deque implementations and uses).

Operations on Deque
Adding to deque
To add elements to the tail of a Deque you call its add() method. You can also use the addFirst() and addLast() methods, which add elements to the head and tail of the deque.

Deque dequeA = new LinkedList();

dequeA.add     ("element 1"); //add element at tail
dequeA.addFirst("element 2"); //add element at head
dequeA.addLast ("element 3"); //add element at tail

Getting or Peeking the element
You can peek at the element at the head of the queue without taking the element out of the queue. This is done via the element() method. You can also use the getFirst and getLast() methods, which return the first and last element in the Deque. Here is how that looks:

Object firstElement = dequeA.element();
Object firstElement = dequeA.getFirst();
Object lastElement  = dequeA.getLast();

Removing the element
To remove elements from a deque, you call the remove(), removeFirst() and removeLast methods. Here are a few examples:

Object firstElement = dequeA.remove();
Object firstElement = dequeA.removeFirst();
Object lastElement  = dequeA.removeLast();

Iterating over the elements
One method is normal foreach style:

//access via new for-loop
for(Object object : dequeA) {
    String element = (String) object;
}

Another method is via iterator

//access via Iterator
Iterator iterator = dequeA.iterator();
while(iterator.hasNext(){
  String element = (String) iterator.next();
}

Friday, 24 June 2011

Converting a Collection to an array

If you have been using Collections in Java 5 you probably have stumbled upon a problem of converting a given Collection<T> into an array of type T. In the Collection interface, there is a method called toArray() which returns an array of type Object[]. But then, if since Java 5 Collections are using generics, shouldn’t it be possible to get an array T[] of the generic type we used in the declaration of our Collection<T>? Well, there is a method T[] toArray(T[] a). But what is that strange “T[] a” doing there? Why is there no simple method T[] toArray() without any strange parameters? It seams like it shouldn’t be a big problem to implement such a method… but actually it is not possible to do so in Java 5, and we will show here why.
Let’s start from trying to implement that method by ourselves. As an example, we will extend the ArrayList class and implement there a method T[] toArrayT(). The method simply uses the standard Object[] toArray() method and casts the result to T[].

import java.util.*;

/**
 * Implementation of the List interface with "T[] toArrayT()" method.
 * In a normal situation you should never extend ArrayList like this !
 * We are doing it here only for the sake of simplicity.
 * @param <T> Type of stored data
 */
class HasToArrayT<T> extends ArrayList<T> {
    public T[] toArrayT() {
        return (T[]) toArray();
    }
}

public class Main {
    public static void main(String[] args) {
        // We create an instance of our class and add an Integer
        HasToArrayT<Integer> list = new HasToArrayT<Integer>();
        list.add(new Integer(4));

        // We have no problems using the Object[] toArray method
        Object[] array = list.toArray();
        System.out.println(array[0]);

        // Here we try to use our new method... but fail on runtime
        Integer[] arrayT = list.toArrayT(); // Exception is thrown :
        // "Ljava.lang.Object; cannot be cast to [Ljava.lang.Integer"
        System.out.println(arrayT[0]);
    }
}

If you compile it and run, it will throw an exception when trying to use our new method. But why? Didn’t we do everything right? Objects stored in the array ARE Integers and toArray method returns an array of those objects, so what is the problem?

Java compiler gives us a hint to understand what is wrong. During compilation, you should see a warning at line 12, saying something like “Type safety: Unchecked cast from Object[] to T[]“. Let’s follow this trace and check if the array that comes out of our toArrayT method is really of type Integer[]. Put this code somewhere in main() method before the Integer[] arrayT = list.toArrayT() line :

System.out.println("toArrayT() returns Integer[] : "
    + (list.toArrayT() instanceof Integer[]));

When you run the program, you should get “false”, which means that toArrayT() DID NOT return Integer[]. Why? Well, many of you probably know the answer – it is Type Erasure. Basically, during compilation the compiler removes every information about the generic types. Effectively, when the program runs, all the appearances of T in our HasToArrayT class are nothing more than simple Object’s. It means, that the cast in the function toArrayT does not cast to Integer[] any more, even if we are using Integer in the declaration of the generic type. Actually, there is also a second problem here, coming from the way Java handles casting of arrays. This time however we would like to concentrate just on the first one – usage of generics.
So, if the information that T is an Integer is no longer there at runtime, is there any way to really return an array of type Integer[] ? Yes, there is, and it is exactly what the method T[] toArray(T[] a) in the Collections framework is doing. But the price you have to pay for it is the additional argument a, whose instance you have to create first by yourself. How do they use it there? Let’s look at how it is really implemented in, for example, ArrayList :

public <T> T[] toArray(T[] a) {
    if (a.length < size)
        // Make a new array of a's runtime type, but my contents:
        return (T[]) Arrays.copyOf(elementData, size, a.getClass());
    System.arraycopy(elementData, 0, a, 0, size);
    if (a.length > size)
        a[size] = null;
    return a;
}

As it is explained in the method’s Javadoc, if the specified array a is big enough it just copies elements into this array. If not, it creates a new array. What is important for us, is that when creating a new array it uses Arrays.copyOf method, specifying there a.getClass() as the type of array to create. So that’s where it needs that additional argument – to know what type of array to create. But then you might say, would it not be easier to fit there T.class instead of a.getClass()? No. Because of type erasure, T is no longer the type we specified. Moreover, we would get a compile time error if we wrote T.class! Some of you may be thinking about creating a new array the simple way – new T[], but it is also not possible because of the same reason. We have already discussed how to get Generic Arrays in Java here.
To put it straight, in order to return a new array of T, the type that we really specified, we MUST give something of that type to the function toArray. There is no other way the function could now what type of array it should create, because the generic type T supplied when creating that Collection is erased every time during compilation.
One other way to implement such a method is by using the Class<T[]> type as an input parameter. This way, instead of having to create a new instance a of array T[] and then using it in the function toArray(a), we could use the .class field – toArrayT(TYPE[].class) :

public T[] toArrayT(Class<T[]> c) {
        return (T[]) Arrays.copyOf(toArray(), size(), c);
 }

It still takes an argument, but it has two advantages over the T[] toArray(T[] a) function. First, it does not require you to create an instance of an array just to pass it to the function. Second, it is simpler. Of course, it has still the disadvantage of having to pass some argument.

After all, there is no perfect way to perform such a conversion. The best solution would be not to use arrays at all. Code with arrays is difficult to maintain and invites people to make mistakes that often come out not earlier than on runtime. Using collections is generally much preferred and if you use them all the time, you will not be forced to deal with the toArray method.

Thursday, 23 June 2011

Type Safety: why would you need a Collections.checkedSet in JDK1.4?

In one of the first posts on this blog I have discussed the problem of type checking (or lack of it) in some of Java collections even when using generics. Today I want to show you a similar issue that can break the type safety in your code. It’s a problem that can cause a really nasty bugs in your application – the ones that manifest themselves many thousand lines of code after the faulty method introduce it.
Imagine that you work with generics, but you use a third party code. What if some part of that code was written in Java 1.4 or earlier, therefore not using generics. Let’s go one step further: what if (for some reason) this code does not respect the typing of your collections? You would expect to get an error, right? Well, you’ll get one… but do you know when? Check the following code:

public static void main(String[] args) {
    // create a type-safe set of Integers and add some values to it
    Set<Integer> setOfInts = new HashSet<Integer>();
    setOfInts.add(new Integer(7));
    setOfInts.add(new Integer(13));

    // pass your set to a non-generic code
    Set uncheckedSet = setOfInts;
    // non-generic code does not respect your typing
    // and adds a String to your set of integers
    uncheckedSet.add(new String("Illegal"));

    // back in your code - do some set manipulations:
    System.out.println(setOfInts);
    setOfInts.remove(new Integer(7));

    // at the end iterate trough your set
    for (Integer integer : setOfInts) {
        System.out.println(integer);
    }
}

The code above is a simplified version of the described scenario. In lines 8 – 11 the ‘third party’ code is executed and adds a String into a set of Integers. So back to the question: in a given example when will error occur? One might assume in line 11 as the bad element is inserted… wrong! Maybe when we access the collection and print all of its elements?… wrong! We can even do some manipulations on it (line 15) and still be fine! The error will show his ugly head finally in the loop in line 18 when we will enter it for the second time. This will happen so late because in line 18 for the first time we access the inserted String and cast it to Integer – at this point ClassCastException occurs.
In the example above the distance between the faulty insertion and the place where exception is thrown is only seven lines, but you can imagine its being 7000. Your code can be working fine for 7000 hours and after that throw an exception that will basically give you no hint what’s wrong. If you do not use third party code you should not feel safe because of that – I am sure that if your project has more than 10000 lines there is a part of it so old and dusty that it was written without the usage of generics…
What can be done? Well, we are in luck as there is a quite simple way of protecting yourself from that: in java.util.Collections you will find a way to wrap your Set, Map or List into an forwarding class that will check in runtime the typing of the inserted objects. In the case of our code snippet we only need to add one line:

public static void main(String[] args) {
    // create a type-safe set of Integers and add some values to it
    Set<Integer> setOfInts = new HashSe<Integer>();
    setOfInts = Collections.checkedSet(setOfInts, Integer.class);
    setOfInts.add(new Integer(7));
    setOfInts.add(new Integer(13));

    // pass your set to a non-generic code
    Set uncheckedSet = setOfInts;
    // non-generic code does not respect your typing
    // and adds a String to your set of integers
    uncheckedSet.add(new String("Illegal"));

    // back in your code - do some set manipulations:
    System.out.println(setOfInts);
    setOfInts.remove(new Integer(7));

    // at the end iterate trough your set
    for (Integer integer : setOfInts) {
        System.out.println(integer);
    }
}

As you see the only difference between this and previous snippet is additional line (#4) in which we wrap our set of integers into a CheckedSet. See that due to Generic type erasure in Java the Collections.checked* methods require a class object besides a collection to wrap. Now when we run the new code the error will manifest itself just at the moment where an error is commited: in line 12 when we try to insert a string. Exactly what we needed!

To summarize: if in your project you use either third party code or legacy code that you suspect of not using generics consider wrapping your collections with Collection.checked* methods!

Note :
Users with jdk 6 or higher will get ClassCastException in any case, if they add some wrong datatype.

Type safety in Java Set and Map in JDK 1.4

Probably many of you still remember the lack of type checking in Java 1.4 Collections and how much hassle it was to deal with casting the collection elements, not to mention how many errors this introduced to the code. Since introduction of generics in Java 1.5 this have really improved and one might think that nowadays the language itself protects the programmer from the silliest of typing mistakes. Generics themselves brought with them a set of new complications, but it seems reasonable to think that in the basic code situations when using Java’s Sets or Maps and when there is no kung-fu complications like wildcards or casting we are safe and secure. Are we really?
Recently I have encountered a bug in a production code when dealing with a simple (really simple) usage of a Map. The embarassing part of this issue was that it took lots of effort to debug it and the cause of it was… well… trivial. The code below shows a sketch of how the code looked like before the bug was introduced. In real life the class was a part of a really old and rarely edited legacy code, obviously it was much larger than the one below:

import java.util.HashMap;
import java.util.Map;

public class EmployeeDataLookup {
    // A map storing relation between employee ID and name.
    private Map<Integer, String> employeeIdToName;

    public EmployeeDataLookup() {
        // Create a new EmployeeDataLookup and initialize
        // it with employee names and IDs.
        employeeIdToName = new HashMap<Integer, String>();
        addEmployee(301, "John Doe");
        addEmployee(302, "Mary Poppins");
        addEmployee(303, "Andy Stevens");
    }

    public void addEmployee(int employeeId, String employeeName) {
        employeeIdToName.put(employeeId, employeeName);
    }

    // This class is very complicated and has many other methods...

    // Lookup method for finding employee name for given employee ID.
    public String findEmployeeName(int employeeId) {
        return employeeIdToName.get(employeeId);
    }

    public static void main(String[] args) {
        // Create a EmployeeDataLookup instance
        EmployeeDataLookup employeeLookup = new EmployeeDataLookup();
        // Find the name of an employee with ID = 301
        String employeeName = employeeLookup.findEmployeeName(301);
        System.out.print("Employee 301 : " + employeeName);
    }
}

So what went wrong? Well, at some point the project functionality requirements have changed (surprising, right?) and the key of the map storing data inside of the class had to be changed. In terms of the example above you might say that a company has merged with one of the vendors, so the system storing the employee data had to be made compatible with the ID system of the vendor. Because the vendor company used 13 digit IDs to identify its employees the change to the code in Java terms meant that instead of using Integers we had to change to Longs. Simple right? Its even easier if you use an IDE like Eclipse – just change/refactor the Map key type in line 7 from Integer to Long, let the editor show you all the errors, fix them and that is it! In five minutes all is done:

import java.util.HashMap;
import java.util.Map;

public class EmployeeDataLookup2 {
    // A map storing relation between employee ID and name.
    private Map<Long, String> employeeIdToName;

    public EmployeeDataLookup2() {
        // Create a new EmployeeDataLookup and initialize
        // it with employee names and IDs.
        employeeIdToName = new HashMap<Long, String>();
        addEmployee(301, "John Doe");
        addEmployee(302, "Mary Poppins");
        addEmployee(303, "Andy Stevens");
    }

    public void addEmployee(long employeeId, String employeeName) {
        employeeIdToName.put(employeeId, employeeName);
    }

    // This class is very complicated and has many other methods...

    // Lookup method for finding employee name for given employee ID.
    public String findEmployeeName(int employeeId) {
        return employeeIdToName.get(employeeId);
    }

    public static void main(String[] args) {
        // Create a EmployeeDataLookup instance
        EmployeeDataLookup employeeLookup = new EmployeeDataLookup();
        // Find the name of an employee with ID = 301
        String employeeName = employeeLookup.findEmployeeName(301);
        System.out.print("Employee 301 : " + employeeName);
    }
}

The change above introduced the bug. Can you see it? When you run the main method now you will find out that the name of the employee with ID=301 is ‘null’! Ha!

If you do not see the trouble (or are to lazy to try) I’ll give you a hint: check out the function findEmployeeName . If you have seen it, try to imagine that this class in real life had 1000+ lines of code unrelated to the changed map. Since there are no compile errors it was like looking for a needle in a haystack… so what’s the bug exactly?

The problem is that after introducing generics, probably due to backward compatibility, some of the methods in Collection and Map interface have not been changed and still do not perform type checking. One of them is Map.get(Object) which have caused the bug in the code above. After the ’simple change’ we have still been using Integer keys to access the map that contained Longs, so even though we had a record of employee 301 in our system the get method returned null. And because of the lack of type checking there were no warnings… Busted.

So what’s the lesson from this? Be cautious about type checking even in the most basic situations. Remember that access methods in Sets, Lists and Maps can behave and bite you in the behind. The biggest pain is that the bugs introduced this way are usually hard to spot by a human, so sometimes a dozen of engineers staring at the code can have trouble finding it. There is a way to deal with them though – most of them can be easily found if you are using a static code analysis tool (eg: FindBugs).

Note:
If you are using JDK 6 or higher, its no worries for you. Because in any case you would get the correct result.

Java Rumblings

Pages