No More Array Errors (Part II) - Array_alg(C++)

No More Array Errors (Part II) - Array_alg(C++)

In Part I of this tutorial, we showed how to eliminate array reallocation errors by using Block(C++) as a replacement for arrays. In the second part, we will show how to eliminate another major class of array errors: the algorithmic ones.

Even though arrays are the simplest data structure of all, we still make mistakes when implementing array algorithms. Even after we get the algorithms right, efficiency often remains a problem. For example, when was the last time you implemented an array search algorithm? Was it a linear search algorithm? Or was the array sorted so that you could use a binary search algorithm? If the former, did you get the all the boundary and termination conditions right the first time? If the latter, are you sure that there wasn't a faster algorithm? And finally, was your solution reusable? Or would repeating the exercise for an array with a different element type have required you to copy and edit the source code?

Considerations like these have persuaded many C++ programmers to abandon arrays in favor of library components like Map(C++) and Set(C++) One reason that programmers make fewer algorithmic errors when working with Maps and Sets is that the storage and retrieval algorithms are built in. Moreover, Map and Set have pretty good performance characteristics. Maps have logarithmic time and linear space complexity. That is, Map lookups run in O(lgN) time, where N is the number of elements in the Map, and a Map with N keys takes up O(N) words of storage. Sets have constant time and linear space complexity.

Finally, Maps and Sets are parameterized, supporting genuine code reuse.

There are times, however, when types like Set and Map are overkill. If space is tight, why pay the space overhead of a complicated data structure? Arrays have no overhead. Or, when there are just a few elements, it will be faster to search an array than a complicated data structure, no matter what algorithms are used (how few is ``few'' may be determined by analysis or measurement). Remember that order estimates provide a reliable standard of comparison only for very large values of N. For small values of N, a quadratic algorithm may actually outperform an O(1) algorithm. If your application never exceeds the value of N where the two algorithms break even, it would make sense to use the quadratic algorithm. The section "Performance", presents some actual measurements that bear this point out.

Finally, there are many algorithms that Set and Map simply don't provide, and which would be difficult or inefficient for you to implement on top of these types. For example, Set does not provide any mechanism for partitioning elements into equivalence classes based on some criterion.

This tutorial describes Array_alg(C++) (Array Algorithms), a component implementing nearly 100 algorithms on arrays. These include familiar algorithms for searching, sorting, copying, and removing, and some not-so-familiar ones for partitioning and shuffling. We believe that these algorithms are the most efficient ones known; if you know of a better one, we'd like to hear about it! The algorithms are documented in 33 manual entries, plus an introduction. The documentation includes complexity information from which the cost of using each algorithm can be computed.

Array_alg is an early ancestor of the C++ Standard Template Library.

Next topic: Using Block(C++) with Array Algorithms

© 2005 The SCO Group, Inc. All rights reserved.
SCO OpenServer Release 6.0.0 -- 02 June 2005