Hash table size?

For my coursework at uni I have to read from a text file and put the words into both a linked list (can do this fine) and hash table, so that I can compare the efficiency of each storing method.

I've read and understand how the hash table works, but confused as to how big to make the table?

I understand that the bigger the table size, the less collisions but how big relative to the volume of input?

The text file I have to use has over 28,000 words, but making the table with 28,000+ indexes seems ridiculous? Or is having the same number of indexes as input normal?

I've tried googling but didn't get any answers.

All help is appreciated! :)
 
There is no single right answer. The main thing to be wary of is that when a hash table gets too full (i.e. a large number of collisions are occurring), a larger table has to be created and every element re-inserted. This is called rehashing and is generally a very expensive operation, so it's something to avoid. When a rehash occurs depends on the load factor (the ratio of stored elements to buckets) ~ this usually defaults to 0.75 (in Java at least).
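A quick sketch of how you'd use that in Java: `HashMap` really does take an initial capacity and load factor in its constructor, so sizing it up front for the expected element count means the 0.75 threshold is never crossed and no rehash occurs (the word count and class name here are just for illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class PresizeDemo {
    public static void main(String[] args) {
        // Illustrative figure matching the coursework: ~28,000 words.
        int expectedWords = 28_000;

        // HashMap rehashes once size > capacity * loadFactor, so asking
        // for expected / 0.75 buckets up front avoids any rehash with
        // the default 0.75 load factor.
        int initialCapacity = (int) Math.ceil(expectedWords / 0.75);

        Map<String, Integer> wordCounts = new HashMap<>(initialCapacity, 0.75f);
        wordCounts.merge("hello", 1, Integer::sum);

        System.out.println(initialCapacity); // 37334
    }
}
```

(Internally HashMap rounds the capacity up to a power of two, but the principle is the same: tell it your expected size and it won't rehash.)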

I was always taught, as a general rule of thumb:

No. of buckets = 1.5 * no. of elements expected

And make the number of buckets a prime number, so as to minimise collisions and give a more even distribution across the hash space.

This is especially true when storing strings, as real-world string keys tend to have predictable patterns that can cluster badly in a poorly sized table.
 