

We wrote a program to compare the performance of Array_set with the current implementation of Set(C++), as described in the Set tutorial elsewhere in this volume. The latter uses a relatively complicated digital trie data structure. We actually used Set_of_p in the comparison because, of the three unordered collection classes defined in Set(C++), it has the best performance characteristics.
Here are the first few lines of our program:
    #include <Stopwatch.h>          /* see Stopwatch(C++) */
    #include <Array_alg.h>

    Stopwatch w;
    typedef double* DOUBLEP;

    #ifdef SETH
    #include <Set.h>
    typedef Set_of_p<double> SET;
    #else
    #include <Array_set.h>
    typedef Array_set<DOUBLEP> SET;
    #endif

    SET p, q;
The program uses a Stopwatch(C++) for timing. It uses the Array Algorithm shuffle(), described in shuffle(C++), to obtain good averages (see below).
The data used for the comparison consisted of N unique pseudorandom pointers:
    main(int, char* argv[])
    {
        int N = atoi(argv[1]);
        DOUBLEP* data = new DOUBLEP[N];
        for (int i = 0; i < N; i++) {
            do {
                data[i] = (DOUBLEP)rand();    /* see rand(S) */
            } while (p.contains(data[i]));
            p.insert(data[i]);
        }
        ...
    }
Note that uniqueness is guaranteed by testing each pointer for containment before inserting it into p. For our first test, we repeatedly emptied p into q, then q into p, and so on. We timed the repetitions using the Stopwatch and took an average over the number of repetitions. To get a good average, we used shuffle() to apply a random permutation to the data at the end of each pass:
    int NREPS = 10000/N;
    w.start();
    for (i = 0; i < NREPS; i++) {
        for (int ii = 0; ii < N; ii++) {
            assert(p.remove(data[ii]));
            assert(q.insert(data[ii]));
        }
        for (ii = 0; ii < N; ii++) {
            assert(q.remove(data[ii]));
            assert(p.insert(data[ii]));
        }
        shuffle(data, data+N);
    }
    w.stop();
    cout << "insert/remove time = " << w.user()/NREPS << "\n";
This test showed that, for insertion and removal, Array_set outperforms Set_of_p for values of N up to about 8. In terms of order estimates, the O(1) complexity of the single-element Set_of_p operations quickly overtakes the O(N) complexity of the corresponding Array_set operations.
Next, we timed the contains operation for a variety of elements known to be either members (data) or nonmembers (data2):
    DOUBLEP* data2 = new DOUBLEP[N];
    for (i = 0; i < N; i++) {
        do {
            data2[i] = (DOUBLEP)rand();
        } while (p.contains(data2[i]));
    }
    w.reset();
    w.start();
    for (int n = 0; n < NREPS; n++) {
        for (i = 0; i < N; i++) {
            assert(p.contains(data[i]));
        }
        for (i = 0; i < N; i++) {
            assert(!p.contains(data2[i]));
        }
    }
    w.stop();
    cerr << "contains (yes/no) time = " << w.user()/NREPS << "\n";
This test showed that Array_set outperforms Set_of_p for values of N up to about 9.
The story changes dramatically when we compare operations that are O(N) in both implementations. Consider iteration over all elements. Array_set iteration is little more than a for-loop over an array; in terms of order estimates, the constant multiplier for iteration is very close to 1. Set_of_p iteration is a walk over a complicated N-ary tree structure; in terms of order estimates, the constant multiplier is relatively large. We would therefore expect Array_set iteration to outperform Set_of_p iteration even for small values of N, and to become progressively better as N grows larger. We would expect the same for the algebraic operations (union, difference, and so on) and relational operations (equality, subset, superset), which are also O(N) in both implementations. Actual measurements bear this out (times are in milliseconds):
    N = 100          Array_set    Set_of_p
    equality             0.5         11.2
    union                2.5         14.5
    difference           3.3         28.2

    N = 1000         Array_set    Set_of_p
    equality             5.0         71.7
    union               11.6         89.0
    difference          20.0        161.7
In summary, for applications emphasizing O(1) operations (insertion, removal, membership testing), Set_of_p overtakes Array_set at around 10 elements and becomes progressively better as the number of elements increases. For applications emphasizing O(N) operations (iteration, algebraic and relational operations), Array_set is faster than Set_of_p for all values of N and becomes progressively better as the size increases.
A similar comparison of the current implementation of Map(C++) against a Block(C++) and Array_alg(C++) implementation of the same specification gave even more dramatic results: insertion breaks even at 155 elements, membership testing at 80 elements, and iteration in the array-based implementation is always faster.
If your application runs on a machine with limited memory, executable size is probably just as important to you as execution speed. To give an idea of relative sizes of programs using Array_set and the current Set implementation, we compiled and linked the following two programs:
    /* main1, compiled and linked with CC -o main1 -l++ */
    #include <Set.h>
    main() { }

    /* main2, compiled and linked with CC -o main2 -l++ */
    #include <Array_set.h>
    main() { }
On a Sun 3, the stripped executable sizes of main1 and main2 were 82K bytes and 41K bytes, respectively.