Allegro.cc - Online Community

Allegro.cc Forums » Off-Topic Ordeals » Any good books on hashing using c/c++, even java

Credits go to BAF, Dizzy Egg, Edgar Reynaldo, Steve Terry, and Thomas Fjellstrom for helping out!
This thread is locked; no one can reply to it. rss feed Print
 1   2 
Any good books on hashing using c/c++, even java
Matthew Leverton
Supreme Loser
January 1999
avatar

verthex said:

Buckets? Whats that.

You have to group all collided elements together in some way. Most people will use a linear list for collisions under the assumption that collisions are rare.

And collisions are rare ... except when somebody purposely designs keys to collide.

Thomas Fjellstrom
Member #476
June 2000
avatar

verthex said:

Buckets? Whats that.

GOOGLE?????!!!! jk.

A hash table is typically an array with N buckets (positions/slots/entries) in it, often a power of two. Each bucket contains room for one or more items. The hash value happens to be the index into that array. Why is it called a bucket? I'm not sure.

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

verthex
Member #11,340
September 2009
avatar

Ok I guess the question should finally be what type of tree structure should I use and what type of hashing function is the fastest for it? Thomas pointed out a well balanced one, but is that something that's a result of a specific type of tree or a b-tree with some special properties would hold as well. Are all trees b-trees btw?

BAF
Member #2,981
December 2002
avatar

A bucket is typically a watertight, vertical cylinder or truncated cone, with an open top and a flat bottom, usually attached to a semicircular carrying handle called the bail. In common usage, the terms "bucket" and "pail" are often used interchangeably. As a shipping container, a pail is a package with a sealed top or lid.

There are many types of buckets;

  • a water bucket is used to carry water

  • Household and garden uses are often for carrying liquids and granular products

  • Elaborate ceremonial or ritual buckets in bronze, ivory or other materials are fond in several ancient or medieval cultures and are known by the Latin for bucket, situla.

  • Large scoops or buckets are attached to loader and telehandler for agricultural and earthmoving purposes.

  • A lunch box is often called a lunch pail

  • Buckets can be reused as seats, tool caddies, hydroponic gardens, chamberpots, "street" drums, livestock feeders.

Thomas Fjellstrom
Member #476
June 2000
avatar

verthex said:

Ok I guess the question should finally be what type of tree structure should I use and what type of hashing function is the fastest for it? Thomas pointed out a well balanced one, but is that something that's a result of a specific type of tree or a b-tree with some special properties would hold as well. Are all trees b-trees btw?

Most often a hash table uses linked lists in each "bucket". A tree can be used, but it increases the complexity quite a bit, and is only useful for a hash table you expect to get very large, with a ton of collisions.

And no, not all trees are binary trees. There are many types of trees. Ask Baf, I'm sure he call tell you all about Trees, like he can buckets.

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

verthex
Member #11,340
September 2009
avatar

Most often a hash table uses linked lists in each "bucket".

Most of the data structure books I have ( actually all) have a chapter on hashing and not once do they mention buckets, but after searching google I found some examples.

Quote:

Ask Baf, I'm sure he call tell you all about Trees, like he can buckets.

Yes, his coding style must consist of typing english.

Peter Wang
Member #23
April 2000

only useful for a hash table you expect to get very large, with a ton of collisions.

If you're expecting a ton of collisions, you're doing it wrong. ;)

Evert
Member #794
November 2000
avatar

Chess programs use hash tables to store search results, with the hash key generated from the board position. Typically, the table is stored as a plain array of 2^N elements and the index is calculated from the lowest N bits of the hash key.
The number of positions searched is very much larger than the number of elements in the table, so you need multiple buckets per entry (in practice, one looks at the next k entries, say 4 (you really would like to stay within the same cache line)) and some replacement strategy to get rid of older entries.

Not sure if that's at all helpful for what you want to do...

verthex
Member #11,340
September 2009
avatar

Evert said:

Not sure if that's at all helpful for what you want to do...

Anything is helpful although I have no use for chess algorithms. If there are books on this subject please mention those, maybe someone else will run across this topic or maybe I'll need it one day.

Thomas Fjellstrom
Member #476
June 2000
avatar

If you're expecting a ton of collisions, you're doing it wrong.

I suppose. But I can be lazy and don't want to implement a dynamically resizing hash table, and still want tens/hundreds of millions of entries. The resize operation has a rather large amount of overhead on large tables, you have to iterate over every single item and rehash it into the new table. kinda sucky.

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

verthex
Member #11,340
September 2009
avatar

So does anyone know if boost unordered_map is tree based or bucket based?

And is there one example of an stl container that uses buckets?

Bob
Free Market Evangelist
September 2000
avatar

Boost's unordered_map uses buckets. See here.

std::unordered_map<> (new in C++11) also uses buckets (it's nearly identical to the boost implementation). Before C++11, no other standard STL container uses buckets.

--
- Bob
[ -- All my signature links are 404 -- ]

verthex
Member #11,340
September 2009
avatar

Bob said:

Boost's unordered_map uses buckets.

I'm familiar with iterating through a b-tree but I'm wondering what keyword would I google to find examples of iterating through buckets?

Bob
Free Market Evangelist
September 2000
avatar

You don't iterate through buckets, you iterate through elements. boost::unordered_map<> supports forward iterations like any other STL container.

No specified support for backwards iterations though (it may or may not work on your implementation).

--
- Bob
[ -- All my signature links are 404 -- ]

james_lohr
Member #1,947
February 2002

verthex said:

but implementing the code is a little hard to do

Not really. Java uses a ridiculously simple hash function for hashing strings and gets passable results, considering that almost every Java application will be using it somewhere.

What specifically are you trying to hash that you have having trouble with?

verthex
Member #11,340
September 2009
avatar

What specifically are you trying to hash that you have having trouble with?

I want to make a clone of a boost unordered_map. I'm just doing it to learn the fundamentals of hash tables but I really liked that one so I picked that to learn from.

I did find chapter 23 in Professional C++ full of useful information.

 1   2 


Go to: