A Class for Efficient Key Lookup in C++ - Symbol(C++)

An example

In this section, we will show an example of a program which can be speeded up by using Symbols as keys instead of Strings. The technique for knowing if a program can be speeded up by using Symbols is: try it and see. The speedup that can be obtained by using Symbols as keys is sensitive to the following parameters (among others): (1) The underlying data structure used to implement the tables; (2) The size of the tables; and (3) The total number of times each probe key is looked up in the tables. In general, the more frequently a given probe key is looked up, the bigger the win will be using Symbols, since once a Symbol is constructed the remaining operations are cheap. Our example will perform the following prototypical actions:

  1. Insert some keys into a table.

  2. Construct a number of probe keys, where some fraction of the probe keys are present in the table.

  3. Look up each probe key in the table some number of times.

For our table, we will use a Map (see Map(C++) data structure:

       #include <Map.h>
       #include <String.h>
       Map<String,int> table;

The information associated with each key is unimportant for this example; here we have associated an integer. Here is the set of keys we will insert into our table:

       char* tablekeys[] = { "now", "is", "the", "time",
           "for", "all", "good", "people", "to", "come",
           "aid", "of", "their", "party", 0 };

The following function inserts the above keys into the table:

       void build_table() {
           for (char** p = tablekeys; *p != 0; ++p)
               table[*p] = 0;

To implement step 2 above, we will use the following data structure:

       struct Probe {
           char* s;
           String* k;
       } probe[] = {
           "now", 0, "come", 0, "foo", 0, "bar", 0,
           "all", 0, "symbol", 0, "x", 0, "good", 0,
           "quick", 0, "brown", 0, "fox", 0, "aid", 0,
           "zzz", 0, "probes", 0, "table", 0, "map", 0,
           "their", 0, "x1", 0, "x2", 0, "x3", 0, 0, 0

Each of the elements in the above array contains a pointer to the probe string, and a placeholder to store a pointer to the key we will construct from the string. The following function constructs the probe keys:

       void make_probes() {
           for (Probe* p = probe; p->s != 0; ++p)
               p->k = new String(p->s);

The following function implements step 3 above:

       void lookup() {
           int lookups_per_probe = ...;
           while (lookups_per_probe-- > 0)
               for (Probe* p = probe; p->s != 0; ++p)

The inner loop looks up each probe in the table; the outer loop repeats this the desired number of times. (The reason we nest the while-loop outside the for-loop, rather than the other way around, is to avoid the anomalous behavior that would result from looking up the same key several times in immediate succession.) Here is the sequence of function calls we wish to time:


The time to execute the above is as follows:

lookups per probe time (ms)
1 9.33
2 13.2
5 24.7
10 43.6
50 190
100 380

If we change every occurrence of ``String'' in the above to ``Symbol'', the corresponding times are:

lookups per probe time (ms)
1 10.6
2 12.5
5 15.4
10 21.3
50 72.5
100 129

Notice that when the number of lookups per probe is one, Symbols are actually slightly slower than Strings. This is because the Symbol constructor is slightly slower than the String constructor. When the number of lookups is one, the extra time spent in key construction is not paid off in the time saved by efficient lookup. The remainder of the entries correspond to speedups of 5%, 38%, 50%, 62%, and 66%, respectively.

Recall that the speedup to be gained by using Symbols is sensitive to the parameters mentioned above. Your mileage may vary.

Next topic: Summary comparison
Previous topic: Removing the baggage

© 2005 The SCO Group, Inc. All rights reserved.
SCO OpenServer Release 6.0.0 -- 02 June 2005