A multimap is like a map, except it can map each key to multiple values. The Collections Framework doesn't include an interface for multimaps, because they aren't used all that commonly. It's a fairly simple matter to use a
There is a standard trick for finding permutation groups: for each word in the dictionary, alphabetize the letters in the word (that is, reorder the word's letters into alphabetical order) and put an entry into a multimap, mapping the alphabetized word to the original word. For example, the word "bad" causes an entry mapping "abd" into "bad" to be put into the multimap. A moment's reflection will show that all of the words to which any given key maps form a permutation group. It's a simple matter to iterate over the keys in the multimap, printing out each permutation group that meets the size constraint.
The following program is a straightforward implementation of this technique. The only tricky part is the
Map
whose values are List
objects as a multimap. This technique is demonstrated in the next code example, which reads a dictionary containing one word per line (all lowercase) and prints out all of the permutation groups in the dictionary that meet a size criterion. A permutation group is a bunch of words all of which contain exactly the same letters, but in a different order. The program takes two arguments on the command line: the name of the dictionary file, and the minimum size of permutation group to print out. Permutation groups containing fewer words than the specified minimum are not printed. There is a standard trick for finding permutation groups: for each word in the dictionary, alphabetize the letters in the word (that is, reorder the word's letters into alphabetical order) and put an entry into a multimap, mapping the alphabetized word to the original word. For example, the word "bad" causes an entry mapping "abd" into "bad" to be put into the multimap. A moment's reflection will show that all of the words to which any given key maps form a permutation group. It's a simple matter to iterate over the keys in the multimap, printing out each permutation group that meets the size constraint.
The following program is a straightforward implementation of this technique. The only tricky part is the
alphabetize
method, which returns a string containing the same characters as its argument, in alphabetical order. This routine (which has nothing to do with the Collections Framework) implements a slick bucket sort. It assumes that the word being alphabetized consists entirely of lowercase alphabetic characters. Running the program on an 80,000 word dictionary takes about 16 seconds on an aging UltraSparc 1. With a minimum permutation group size of eight, it produces the following output:import java.util.*;
import java.io.*;
public class Perm {
public static void main(String[] args) {
int minGroupSize = Integer.parseInt(args[1]);
// Read words from file and put into simulated multimap
Map m = new HashMap();
try {
BufferedReader in =
new BufferedReader(new FileReader(args[0]));
String word;
while((word = in.readLine()) != null) {
String alpha = alphabetize(word);
List l = (List) m.get(alpha);
if (l==null)
m.put(alpha, l=new ArrayList());
l.add(word);
}
} catch(IOException e) {
System.err.println(e);
System.exit(1);
}
// Print all permutation groups above size threshold
for (Iterator i = m.values().iterator(); i.hasNext(); ) {
List l = (List) i.next();
if (l.size() >= minGroupSize)
System.out.println(l.size() + ": " + l);
}
}
private static String alphabetize(String s) {
int count[] = new int[256];
int len = s.length();
for (int i=0; i<len; i++)
count[s.charAt(i)]++;
StringBuffer result = new StringBuffer(len);
for (char c='a'; c<='z'; c++)
for (int i=0; i<count[c]; i++)
result.append(c);
return result.toString();
}
}
Many of these words seem a bit bogus, but that's not the program's fault; they're in the dictionary file.% java Perm dictionary.txt 8
9: [estrin, inerts, insert, inters, niters, nitres, sinter,
triens, trines]
8: [carets, cartes, caster, caters, crates, reacts, recast, traces]
9: [capers, crapes, escarp, pacers, parsec, recaps, scrape, secpar,
spacer]
8: [ates, east, eats, etas, sate, seat, seta, teas]
12: [apers, apres, asper, pares, parse, pears, prase, presa, rapes,
reaps, spare, spear]
9: [anestri, antsier, nastier, ratines, retains, retinas, retsina,
stainer, stearin]
10: [least, setal, slate, stale, steal, stela, taels, tales, teals,
tesla]
8: [arles, earls, lares, laser, lears, rales, reals, seral]
8: [lapse, leaps, pales, peals, pleas, salep, sepal, spale]
8: [aspers, parses, passer, prases, repass, spares, sparse, spears]
8: [earings, erasing, gainers, reagins, regains, reginas, searing,
seringa]
11: [alerts, alters, artels, estral, laster, ratels, salter, slater,
staler, stelar, talers]
9: [palest, palets, pastel, petals, plates, pleats, septal, staple,
tepals]
8: [enters, nester, renest, rentes, resent, tenser, ternes, treens]
8: [peris, piers, pries, prise, ripes, speir, spier, spire]
No comments:
Post a Comment