entrelacs

Open Addressing and related techniques

The Arrow Space is implemented ag a giant Content-Indexed Hash table featuring:

Content-Indexed Hash Table

In an content-indexed hash-table, there is no key or identifier: the whole value is used as a key. This is the obvious solution to store arrows which are mathematical constructs mapped one-to-one onto the storage.

Solving Collisions with Open Adressing

When assimilating an arrow, one computes an hash code h of its definition.

h gives the default location where an arrow is supposed to be in the storage space.

However, this default location might have already been taken to store another arrow. This is called a hash conflict or a collision.

In such a case, one makes use of the Open Addressing technique. It consists in putting conflicting arrows among neighbor slots.

This set of related slots is called an open block herein. They are not necessary nearby as there may be a big offset between two consecutive slots.

Solving Data Clutering with Double Hashing

Using the same constant offset for probing slots may cause unrelated open blocks to merge as their probing sequences resonate to each other.

It may lead to very long and slow probing sequences.

However the offset used to probe locations is allowed to change depending on the probed data. So it may typically be computed by another hash function. This technique is called Double Hashing.

Slot chaining with Coalesced Hashing

Some data don’t fit in a single slot. One extends the Open Addressing and Double Hashing techniques to ventilate data segments accross the whole storage.

One also uses the Coalesced Hashing technique, which consists in linking together the slots used to store related segments.

Advanced Tricks

Hop-o’-My-Thumb’s “Seven-League” boots

The Coalesced Hashing link is not saved as a full slot address. It is rather stored in the form of a short offset multiplier which allows to jump over unwanted data.

Hop-o’-My-Thumb’s white pebbles

Coalesced Hashing is not used to link slots of the same probing sequence. To avoid open blocks breaking on content removal, one introduces a pebble counter which counts how many probing pathes go through a given slot at a given time.

And more …

This presentation introduces the Hop-o’-My-Thumb tricks in a pleasant way.

A few other tricks related with Open Addressing have been employed to fit even more information in the same storage. See the prototype page for more information.