Any good books on hashing using c/c++, even java

Any good books on hashing using c/c++, even java

Matthew Leverton

Supreme Loser

January 1999

verthex said:

Buckets? Whats that.

You have to group all collided elements together in some way. Most people will use a linear list for collisions under the assumption that collisions are rare.

And collisions are rare ... except when somebody purposely designs keys to collide.

--
RTFM | Follow Me on Google+ | I know 10 people

Thomas Fjellstrom

Member #476

June 2000

verthex said:

Buckets? Whats that.

GOOGLE?????!!!! jk.

A hash table is typically an array with N buckets (positions/slots/entries) in it, often a power of two. Each bucket contains room for one or more items. The hash value happens to be the index into that array. Why is it called a bucket? I'm not sure.

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

verthex

Member #11,340

September 2009

Ok I guess the question should finally be what type of tree structure should I use and what type of hashing function is the fastest for it? Thomas pointed out a well balanced one, but is that something that's a result of a specific type of tree or a b-tree with some special properties would hold as well. Are all trees b-trees btw?

BAF

Member #2,981

December 2002

A bucket is typically a watertight, vertical cylinder or truncated cone, with an open top and a flat bottom, usually attached to a semicircular carrying handle called the bail. In common usage, the terms "bucket" and "pail" are often used interchangeably. As a shipping container, a pail is a package with a sealed top or lid.

There are many types of buckets;

a water bucket is used to carry water
Household and garden uses are often for carrying liquids and granular products
Elaborate ceremonial or ritual buckets in bronze, ivory or other materials are fond in several ancient or medieval cultures and are known by the Latin for bucket, situla.
Large scoops or buckets are attached to loader and telehandler for agricultural and earthmoving purposes.
A lunch box is often called a lunch pail
Buckets can be reused as seats, tool caddies, hydroponic gardens, chamberpots, "street" drums, livestock feeders.

BAF.zone | SantaHack!

Thomas Fjellstrom

Member #476

June 2000

verthex said:

Most often a hash table uses linked lists in each "bucket". A tree can be used, but it increases the complexity quite a bit, and is only useful for a hash table you expect to get very large, with a ton of collisions.

And no, not all trees are binary trees. There are many types of trees. Ask Baf, I'm sure he call tell you all about Trees, like he can buckets.

verthex

Member #11,340

September 2009

Thomas Fjellstrom said:

Most often a hash table uses linked lists in each "bucket".

Most of the data structure books I have ( actually all) have a chapter on hashing and not once do they mention buckets, but after searching google I found some examples.

Quote:

Ask Baf, I'm sure he call tell you all about Trees, like he can buckets.

Yes, his coding style must consist of typing english.

Peter Wang

Member #23

April 2000

Thomas Fjellstrom said:

only useful for a hash table you expect to get very large, with a ton of collisions.

If you're expecting a ton of collisions, you're doing it wrong.

Evert

Member #794

November 2000

Chess programs use hash tables to store search results, with the hash key generated from the board position. Typically, the table is stored as a plain array of 2^N elements and the index is calculated from the lowest N bits of the hash key.
The number of positions searched is very much larger than the number of elements in the table, so you need multiple buckets per entry (in practice, one looks at the next k entries, say 4 (you really would like to stay within the same cache line)) and some replacement strategy to get rid of older entries.

Not sure if that's at all helpful for what you want to do...

verthex

Member #11,340

September 2009

Evert said:

Not sure if that's at all helpful for what you want to do...

Anything is helpful although I have no use for chess algorithms. If there are books on this subject please mention those, maybe someone else will run across this topic or maybe I'll need it one day.

Thomas Fjellstrom

Member #476

June 2000

Peter Wang said:

If you're expecting a ton of collisions, you're doing it wrong.

I suppose. But I can be lazy and don't want to implement a dynamically resizing hash table, and still want tens/hundreds of millions of entries. The resize operation has a rather large amount of overhead on large tables, you have to iterate over every single item and rehash it into the new table. kinda sucky.

verthex

Member #11,340

September 2009

So does anyone know if boost unordered_map is tree based or bucket based?

And is there one example of an stl container that uses buckets?

Bob

Free Market Evangelist

September 2000

Boost's unordered_map uses buckets. See here.

std::unordered_map<> (new in C++11) also uses buckets (it's nearly identical to the boost implementation). Before C++11, no other standard STL container uses buckets.

--
- Bob
[ -- All my signature links are 404 -- ]

verthex

Member #11,340

September 2009

Bob said:

Boost's unordered_map uses buckets.

I'm familiar with iterating through a b-tree but I'm wondering what keyword would I google to find examples of iterating through buckets?

Bob

Free Market Evangelist

September 2000

You don't iterate through buckets, you iterate through elements. boost::unordered_map<> supports forward iterations like any other STL container.

No specified support for backwards iterations though (it may or may not work on your implementation).

--
- Bob
[ -- All my signature links are 404 -- ]

james_lohr

Member #1,947

February 2002

verthex said:

but implementing the code is a little hard to do

Not really. Java uses a ridiculously simple hash function for hashing strings and gets passable results, considering that almost every Java application will be using it somewhere.

What specifically are you trying to hash that you have having trouble with?

verthex

Member #11,340

September 2009

James Lohr said:

What specifically are you trying to hash that you have having trouble with?

I want to make a clone of a boost unordered_map. I'm just doing it to learn the fundamentals of hash tables but I really liked that one so I picked that to learn from.

I did find chapter 23 in Professional C++ full of useful information.