Showing posts with label iterator. Show all posts
Showing posts with label iterator. Show all posts

Wednesday, 18 May 2011

Iterating over ArrayList in java

To get successive elements from an ArrayList, there are 4 ways:

Use either a for loop with an integer index to get all the elements from an ArrayList, or go over all elements in a ArrayList using an Iterator (forward) or ListIterator (forward / backward).
  1. foreach loop. This is fast and works for all kinds of lists, but is not entirely flexible (only sequential forward with no deletions, additions, or multiple references). This should be your first choice in programming. Works efficiently with both ArrayList and LinkedList.
    ArrayList<String> a = new ArrayList<String>();
    . . .
    for (String s : a) {
    System.out.println(s);
    }

     

  2. for loop with index. This is fast, but should not be used with a LinkedList. It does allow orders other than sequentially forward by one.

    for (int i = 0; i < a.size(); i++) {
    System.out.println(a.get(i));
    }

     

  3. Iterator<E> - Allows simple forward traversal. Can be used for the largest number of other kinds of data structures. This example uses an Iterator to print all elements (Strings) in an ArrayList. It uses hasNext(), which returns true if there are more elements, and next(), which returns the next element. Works with both ArrayList and LinkedList.

    for (Iterator<String> iter = a.iterator(); iter.hasNext();) 
    {
    System.out.println(iter.next());
    }


  4. ListIterator<E> - Allows traversal of the ArrayList, but it is more general than a simple Iterator, allowing inserts and deletes (although both are very slow for an ArrayList). It also allows bidirectional traversal. Works efficiently with both ArrayList and LinkedList.

Saturday, 14 May 2011

Difference between Enumeration and Iterator interface or Enumeration vs Iterator

Enumeration and Iterator are the interface available in java.util package. The functionality of Enumeration interface is duplicated by the Iterator interface. New implementations should consider using Iterator in preference to Enumeration. Iterators differ from enumerations in following ways:

Enumeration Iterator
Enumeration contains 2 methods namely hasMoreElements() & nextElement(). Iterator contains three methods namely hasNext(), next(),remove().
No such operation supported. It acts as read-only interface, as delete not supported. Iterator adds an optional remove operation, and has shorter method names. Using remove() we can delete the objects.
Enumeration interface is used by legacy classes. Vector.elements() & Hashtable.elements() method returns Enumeration. Iterator is returned by all Java Collections Framework classes. java.util.Collection.iterator() method returns an instance of Iterator.

Difference between iterator and index access

Index based access allow access of the element directly on the basis of index. The cursor of the datastructure can directly goto the ‘n’ location and get the element. It doesnot traverse through n-1 elements. So this is like random access, as it is in case of arrays.

In Iterator based access, the cursor has to traverse through each element to get the desired element.So to reach the ‘n’th element it need to traverse through n-1 elements. So this is the case of linked list, where we have to go through each element to insert something in list.

Insertion,updation or deletion will be faster for iterator based access if the operations are performed on elements present in between the datastructure.

Insertion,updation or deletion will be faster for index based access if the operations are performed on elements present at last of the datastructure.

Traversal or search in index based datastructure is faster.

ArrayList is index access and LinkedList is iterator access.

Sunday, 17 April 2011

Are iterators always fail-safe?

The iterators of the Collection implementations of the Java runtime throw a ConcurrentModificationException[1] when they detect that another thread has modified the Collection while a thread is iterating over it. Such iterators are generally called fail-fast iterators.

Looking at the implementation one can see how this has been made possible. The Collection subclass maintains an integer modCount that is incremented on every operation that structurally modifies the collection (like add, remove, clear). The fail-fast iterators also contain an integer field expectedModCount which is initialized to the modCount while the iterator is created. Later on, during every iteration, the iterator verifies if the expectedModCount is same as the modCount of the collection it is iterating over. A mismatch means that the collection has been modified during the life cycle of the iterator and a ConcurrentModificationException is thrown. The same happens during the collection serialization process. The modification count prior to writing to the stream should be same as the count after writing to the stream.

But is this behavior guaranteed? Are these iterators always fail-fast? I used to think so, until I saw the declaration of the modCount field in the Collection implementations.

The modCount field is not marked as volatile. This would mean - while a thread makes structural changes to the collection and increments the modCount field, the other thread that performs the iteration or serialization might not see the incremented value of the modCount field. It might end up doing a comparison with the stale modCount that is visible to it. This might cause the fail-fast iterators to behave in a different way and failing at a later point. The field is non-volatile in Apache Harmony too.

A bug[2] in Sun Java explains this. It says the modCount was intentionally made non-volatile. Making it volatile would have added contention at every collection modification operation because of the memory fencing that’s required.

Also the Javadoc clearly says “fail-fast behavior cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification”.

This behavior would be the same even on the Synchronized wrapper's of these collections.

References
[1] http://java.sun.com/j2se/1.5.0/docs/api/java/util/ConcurrentModificationException.html
[2] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6625725

Iterators returned by set implementation and concurrent modification

Except EnumSet, the iterators returned by the implemented class's (HashSet/TreeSet) iterator method are fail-fast i.e., if the set is modified at any time after the iterator is created, in any way except through the iterator's own remove method, the Iterator throws a ConcurrentModificationException. But again it is not guaranteed to throw "ConcurrentModificationException" by iterator all the time but will be thrown on best-effort basis. So dont rely on this exception for the correctness of the program.

Wednesday, 2 February 2011

What is iterator?

The point of using an iterator is to allow the the client to iterate over the collection of objects, without exposing implementation details. This gives you the benefit to change what collection the Iterator iterates over and how each element is served, without making any change the client's code.


Making custom collections iterable

Let's say we have our own custom Vector-based collection, GameCollection, that stores Games and we want to make this collection iterable (over its stored Games) by making it implement the Iterator inteface:


  1. public class GameCollection implements Iterable<Game> {  
  2.  private Vector<Game> games;  
  3.    
  4.  public GameCollection() {  
  5.   games = new Vector<Game>();  
  6.  }  
  7.    
  8.  public void add(Game game) {  
  9.   games.add(game);  
  10.  }  
  11.   
  12.  @Override  
  13.  public Iterator<Game> iterator() {  
  14.   return games.iterator();  
  15.  }  
  16. }  

The client can then use the above collection as follows:


  1. GameCollection gc = new GameCollection();  
  2. Iterator<game> gameIterator = gc.iterator();  
  3.   
  4. while (gameIterator.hasNext()) {  
  5.  Game g = gameIterator.next();  
  6.  System.out.println(g.getName());  
  7. }  

Or better yet, make use of Java's for-each statement:


  1. GameCollection gc = new GameCollection();  
  2. for (Game g : gc) {  
  3.  System.out.println(g.getName());  
  4. }  

As you can see from the above code, the client doesn't know that we are using a Vector to store our Games in the GameCollection. Infact, we can later change the GameCollection to store the Games in a LinkedList or an Array, and yet the client wouldn't need to change his iteration code (code above) to suit for our changes.

Multiple iterators for a custom collection

What if we now want our GameCollection to offer two iterable solutions? Say we want our GameCollection to iterate over both Games and also GameConsoles?

The first thing that comes to mind is to try and make the GameCollection implement two Iterator interfaces using different generic arguments:


  1. public class GameCollection implements Iterable<Game>, Iterable<GameConsole>  

but unfortunately, the above doesn't compile. Therefore, as a workaround to such an issue, we can make use of Java's ability to allow for inner classes:


  1. public class GameCollection  {  
  2.  private Vector<Game> games;  
  3.  private Vector<GameConsole> consoles;  
  4.    
  5.  private class Games implements Iterable<Game> {  
  6.   @Override  
  7.   public Iterator<Game> iterator() {  
  8.    return games.iterator();  
  9.   }  
  10.  }  
  11.    
  12.  private class Consoles implements Iterable<GameConsole> {  
  13.   @Override  
  14.   public Iterator<GameConsole> iterator() {  
  15.    return consoles.iterator();  
  16.   }  
  17.     
  18.  }  
  19.    
  20.  public GameCollection() {  
  21.   games = new Vector<Game>();  
  22.   consoles = new Vector<GameConsole>();  
  23.  }  
  24.    
  25.  public void add(Game game) {  
  26.   games.add(game);  
  27.  }  
  28.    
  29.  public void add(GameConsole console) {  
  30.   consoles.add(console);  
  31.  }  
  32.   
  33.  public Games games() {  
  34.   return new Games();  
  35.  }  
  36.    
  37.  public Consoles consoles() {  
  38.   return new Consoles();  
  39.  }  
  40. }  

With the above two inner classes and the public methods games() and consoles(), the client can iterate over the collections like such:


  1. GameCollection gc = new GameCollection();  
  2.   
  3. //Add games and consoles with gc.add()  
  4.   
  5. for (Game g : gc.games()) {  
  6.  System.out.println(g.getName());  
  7. }  
  8.   
  9. for (GameConsole g : gc.consoles()) {  
  10.  System.out.println(g.getName());  
  11. }  

Creating custom iterators

Up till this point, we have only used iterators that are in built with Java's existing collections; but what if we want our own custom iterator? We can use Java's java.util.Iterator interface to build our own iterator.

In the following example, I have created a Circular Iterator:
  1. public class CircularGamesIterator implements Iterator<Game> {  
  2.   
  3.  private Vector<Game> list;  
  4.  private int currentPosition;  
  5.    
  6.  public CircularGamesIterator(Vector<Game> games) {  
  7.   list = games;  
  8.   currentPosition = 0;  
  9.  }  
  10.    
  11.  @Override  
  12.  public boolean hasNext() {  
  13.   return currentPosition < list.size();  
  14.  }  
  15.   
  16.  @Override  
  17.  public Game next() {  
  18.   Game el = list.elementAt(currentPosition);  
  19.   currentPosition = (currentPosition + 1) % list.size();   
  20.   return el;  
  21.  }  
  22.   
  23.  @Override  
  24.  public void remove() { }  
  25. }  


The iterator() method of the GameCollection class can then be modified to return an instance of the CircularGamesIterator:


  1. public class GameCollection implements Iterable<Game> {  
  2.  private Vector<Game> games;  
  3.    
  4.  public GameCollection() {  
  5.   games = new Vector<Game>();  
  6.  }  
  7.    
  8.  public void add(Game game) {  
  9.   games.add(game);  
  10.  }  
  11.   
  12.  @Override  
  13.  public Iterator<Game> iterator() {  
  14.   return new CircularGamesIterator(games);  
  15.  }  
  16. }  

An Iterator should not be Iterable!

I have seen some examples on the internet where people make Iterators iterable by making their Iterators implement the Iterator interface and returning this in the iterator() method:
I have removed the implementation in the methods for the following code for brevity.
  1. public class CustomIterator implements Iterator<Game>, Iterable<Game> {  
  2.   
  3.  public CustomIterator(Vector<Game> games) {  
  4.   list = games;  
  5.  }  
  6.   
  7.  @Override  
  8.  public Iterator<Game> iterator() {  
  9.   return this;  
  10.  }  
  11.   
  12.  @Override  
  13.  public boolean hasNext() {  
  14.    //Implementation  
  15.  }  
  16.   
  17.  @Override  
  18.  public Game next() {  
  19.    //Implementation  
  20.  }  
  21.   
  22.  @Override  
  23.  public void remove() { }  
  24. }  


The problem with the above code can be clearly illustrated with such a method:


  1. public static void iterateTwice(Iterable<Game> games) {  
  2.  for (Game g : games) {  
  3.   System.out.println(g.getName());  
  4.  }  
  5.    
  6.  for (Game g : games) {  
  7.   System.out.println(g.getName());  
  8.  }  
  9.  System.out.println();  
  10. }  

If you directly pass in your iterable iterator to the above method, the Games will only be printed once.

To illustrate this, let's assume that we will use the custom iterator mentioned above, CustomIterator, that implements both Iterator<Game> and Iterable<Game>. Our GameCollection implements Iterable<Game>, and its iterator() method returns new CustomIterator(games);. Now, let's also say that GameCollection also provides a method games() that also returns new CustomIterator(games);.

Here's an example that illustrates this:

  1. GameCollection gc = new GameCollection();  
  2.   
  3. gc.add(new Game("Broken Sword"));  
  4. gc.add(new Game("Grim Fandango"));  
  5. gc.add(new Game("Manic Mansion"));  
  6.   
  7. System.out.println("Printing the iterable collection, GameCollection");  
  8. iterateTwice(gc);  
  9.   
  10. System.out.println("Printing the iterable iterator, CustomIterator");  
  11. CustomIterator gamesIterator = gc.games();  
  12. iterateTwice(gamesIterator);  

The output of the above code is as follows:
Printing the iterable collection, GameCollection
Broken Sword
Grim Fandango
Manic Mansion
Broken Sword
Grim Fandango
Manic Mansion

Printing the iterable iterator, CustomIterator
Broken Sword
Grim Fandango
Manic Mansion

Notice how when passing the iterable CustomIterator iterator to the method, the Games were only printed once, whereas when we passed in the GameCollection that implements Iterable, the Games were printed twice.

This is because when CustomIterator was passed in, the second for loop terminated immediately.

Tuesday, 21 September 2010

Iterators

The List and Set collections provide iterators, which are objects that allow going over all the elements of a collection in sequence. The java.util.Iterator interface provides for one-way traversal and java.util.ListIterator provides two-way traversal. Iterator is a replacement for the older Enumeration class which was used before collections were added to Java.

Creating an Iterator

Iterators are created by calling the iterator() or listIterator() method of a List, Set, or other data collection with iterators.

public void showCollection ( Collection c )
{
    Iterator i = c.iterator();
   while ( i.hasNext() ) {  //returns true if elements are still there
Object o = i.next();         //returns the next element
// Process element, casting to
// appropriate type if necessary
}

Methods in Iterator Interface

Iterator defines three methods, one of which is optional.
ResultMethod Description
b = it.hasNext() true if there are more elements for the iterator.
obj = it.next() Returns the next object. If a generic list is being accessed, the iterator will return something of the list's type. Pre-generic Java iterators always returned type Object, so a downcast was usually required.
void it.remove() Removes the most recent element that was returned by next. Not all collections support delete. An UnsupportedOperationException will be thrown if the collection does not support remove().

Difference from Enumeration Interface

An Iterator is similar to the Enumeration interface, Iterators differ from enumerations in two ways:
1.Iterators allow the caller to remove elements from the underlying collection during the iteration with well-defined semantics.
2. Method names have been improved.  


The hasNext method is identical in function to Enumeration.hasMoreElements, and the next method is identical in function to Enumeration.nextElement. The remove method removes from the underlying Collection the last element that was returned by next. The remove method may be called only once per call to next, and throws an exception if this condition is violated. Note that Iterator.remove is the only safe way to modify a collection during iteration; the behavior is unspecified if the underlying collection is modified in any other way while the iteration is in progress.

The following snippet shows you how to use an Iterator to filter a Collection, that is, to traverse the collection, removing every element that does not satisfy some condition:
static void filter(Collection c) {
for (Iterator i = c.iterator(); i.hasNext(); )
if (!cond(i.next()))
i.remove();
}
Two things should be kept in mind when looking at this simple piece of code:
  • The code is polymorphic: it works for any Collection that supports element removal, regardless of implementation. That's how easy it is to write a polymorphic algorithm under the collections framework!
  • It would have been impossible to write this using Enumeration instead of Iterator, because there's no safe way to remove an element from a collection while traversing it with an Enumeration.

 

Normal syntax

Let there be collection c.

Iterator it = c.iterator();

Example with Java 5 generics

ArrayList<String> alist = new ArrayList<String>();
// . . . Add Strings to alist

for (Iterator<String> it = alist.iterator(); it.hasNext(); ) {
String s = it.next(); // No downcasting required.
System.out.println(s);
}

Example as above but with enhanced Java 5 for loop

for (String s : alist) {
System.out.println(s);
}

Example pre Java 5, with explicit iterator and downcasting

An iterator might be used as follows, wi.
ArrayList alist = new ArrayList(); // This holds type Object.
// . . . Add Strings to alist

for (Iterator it = alist.iterator(); it.hasNext(); ) {
String s = (String)it.next(); // Downcasting is required pre Java 5.
System.out.println(s);
}
 
  • The order in which elements are visited is defined by the underlying implementation class.
  • Should the Collection change while we are iterating through it, the methods hasNext() and next() will throw the RuntimeException –ConcurrentModificationException. This does not need to be caught, but may be useful. When next() goes beyond the end of the Collection, perhaps because the hasNext() method was not used correctly, the exception NoSuchElementException is thrown. This is again a RuntimeException and so there is no need for a try…catch block.