Huffman Encoding / Trees

Soldato
Joined
27 Jun 2006
Posts
6,331
Hello folks, going over a few past papers for revision here and was just wondering if anyone would know the answer to this question:

2hdqn10.png


a) is easy enough to understand, you follow the paths using up the numbers in the bit pattern. But for the second one I just can't seem to begin piecing it together in my head.

I could create another tree with all those letters (a.w.o.d.c.u.t.e.r) in it but I assume they want the smallest string of numbers possible?

Appreciate any help. Thanks. :)
 
You'll likely want to represent "wood" as one digit (since it's used 3 times in the setence), and probably "cut" too.

Edit: by "one digit", I mean one "letter".
 
But they're only looking 'a woodcutter'. edit - Actually I'm not so sure now.

And surely this is networking? :p
 
Last edited:
You create another tree based on the number of letters used. In the example, o occurs the most at 7 times so would be the 00 tree, t is next with 4 and made the 01 tree, d is also 4 and would be 10, e is next with 3 and would be 110 and so on. I would put vowels before other letters that occur the same number of times.
Then simply generate the bit pattern for "a woodcutter".
 
The easiest-to-visualise method i think is to count the frequencies of each value (character) and sort into increasing frequency :

f : 1
i : 1
n : 1
a : 2
c : 2
r : 2
s : 2
u : 2
e : 3
w : 3
d : 4
t : 4
o : 7

Take the top two value:frequency pairs (lowest frequencies) and place
on a binary tree next to each other, and create a parent node with a frequency
which is the sum of the two child frequencies :
Code:
    *:2
   /   \
f:1     i:1
Remove the two values which we have used, and insert the new node into the
tree, at the first sorted postition :

n : 1
* : 2
a : 2
c : 2
r : 2
s : 2
u : 2
e : 3
w : 3
d : 4
t : 4
o : 7

( f:1 and i:1 are gone, *:2 is added )

Repeat for all values in the frequency list, creating seperate trees
and joining the tree where necessary :
Code:
    *:3 
   /   \
n:1     *:2
       /   \
    f:1     i:1
a : 2
c : 2
r : 2
s : 2
u : 2
* : 3
e : 3
w : 3
d : 4
t : 4
o : 7
Code:
    *:4             *:3 
   /   \           /   \
a:2     c:2     n:1     *:2 
                       /   \
                    f:1     i:1
r : 2
s : 2
u : 2
* : 3
e : 3
w : 3
* : 4
d : 4
t : 4
o : 7
Code:
    *:4             *:4             *:3 
   /   \           /   \           /   \
a:2     c:2     r:2     s:2     n:1     *:2 
                                       /   \
                                    f:1     i:1
u : 2
* : 3
e : 3
w : 3
* : 4
* : 4
d : 4
t : 4
o : 7
Code:
    *:4             *:4             *:5    
   /   \           /   \           /   \   
a:2     c:2     r:2     s:2     u:2     *:3 
                                       /   \
                                    n:1     *:2 
                                           /   \
                                        f:1     i:1
e : 3
w : 3
* : 4
* : 4
d : 4
t : 4
* : 5
o : 7
Code:
    *:4             *:4             *:5               *:6 
   /   \           /   \           /   \             /   \
a:2     c:2     r:2     s:2     u:2     *:3       e:3     w:3
                                       /   \
                                    n:1     *:2 
                                           /   \
                                        f:1     i:1
* : 4
* : 4
d : 4
t : 4
* : 5
* : 6
o : 7
Code:
            *:8
       ____/   \____
    *:4             *:4             *:5               *:6 
   /   \           /   \           /   \             /   \
a:2     c:2     r:2     s:2     u:2     *:3       e:3     w:3
                                       /   \
                                    n:1     *:2 
                                           /   \
                                        f:1     i:1
d : 4
t : 4
* : 5
* : 6
o : 7
* : 8
Code:
       *:15    
      /    \ 
            *:8                 *:8
       ____/   \____           /   \
    *:4             *:4     d:4     t:4      *:5              *:6 
   /   \           /   \                    /   \            /   \
a:2     c:2     r:2     s:2             u:2     *:3       e:3     w:3
                                               /   \
                                            n:1     *:2 
                                                   /   \
                                                f:1     i:1
* : 5
* : 6
o : 7
* : 8
* : 8
Code:
            *:8                 *:8                  *:11
       ____/   \____           /   \           _____/    \_____ 
    *:4             *:4     d:4     t:4     *:5                *:6 
   /   \           /   \                   /   \              /   \
a:2     c:2     r:2     s:2             u:2     *:3        e:3     w:3
                                               /   \
                                            n:1     *:2 
                                                   /   \
                                                f:1     i:1
o : 7
* : 8
* : 8
* : 11
Code:
       *:15
      /    \
   o:7      *:8                 *:8                  *:11
       ____/   \____           /   \           _____/    \_____ 
    *:4             *:4     d:4     t:4     *:5                *:6 
   /   \           /   \                   /   \              /   \
a:2     c:2     r:2     s:2             u:2     *:3        e:3     w:3
                                               /   \
                                            n:1     *:2 
                                                   /   \
                                                f:1     i:1
* : 8
* : 11
* : 15
<join up the rest>
Code:
                        *:34
           ____________/    \_____________
       *:15                               *:19
      /    \                       ______/    \______
   o:7      *:8                 *:8                  *:11
       ____/   \____           /   \           _____/    \_____ 
    *:4             *:4     d:4     t:4     *:5                *:6 
   /   \           /   \                   /   \              /   \
a:2     c:2     r:2     s:2             u:2     *:3        e:3     w:3
                                               /   \
                                            n:1     *:2 
                                                   /   \
                                                f:1     i:1
(Omitting the obvious 0=left, 1=right for each branch)

Now you can encode the message :

a 0100
w 1111
o 00
o 00
d 100
c 0101
u 1100
t 101
t 101
e 1110
r 0110

edit: I'm assuming they want the number of bits to encode "a woodcutter" using the tree generated for the entire message, rather than using the tree for just that text (which would be shorter)

Into a bitstream :

0100111100001000101110010110111100110 = 37 bits
 
Last edited:
Did the test today and despite my best efforts to revise it, this didn't come up!

I'd just like to thank everyone for their help and particularly matja for going to the effort of explaining all of that.
 
Back
Top Bottom