The easiest-to-visualise method i think is to count the frequencies of each value (character) and sort into increasing frequency :
f : 1
i : 1
n : 1
a : 2
c : 2
r : 2
s : 2
u : 2
e : 3
w : 3
d : 4
t : 4
o : 7
Take the top two value:frequency pairs (lowest frequencies) and place
on a binary tree next to each other, and create a parent node with a frequency
which is the sum of the two child frequencies :
Remove the two values which we have used, and insert the new node into the
tree, at the first sorted postition :
n : 1
* : 2
a : 2
c : 2
r : 2
s : 2
u : 2
e : 3
w : 3
d : 4
t : 4
o : 7
( f:1 and i:1 are gone, *:2 is added )
Repeat for all values in the frequency list, creating seperate trees
and joining the tree where necessary :
Code:
*:3
/ \
n:1 *:2
/ \
f:1 i:1
a : 2
c : 2
r : 2
s : 2
u : 2
* : 3
e : 3
w : 3
d : 4
t : 4
o : 7
Code:
*:4 *:3
/ \ / \
a:2 c:2 n:1 *:2
/ \
f:1 i:1
r : 2
s : 2
u : 2
* : 3
e : 3
w : 3
* : 4
d : 4
t : 4
o : 7
Code:
*:4 *:4 *:3
/ \ / \ / \
a:2 c:2 r:2 s:2 n:1 *:2
/ \
f:1 i:1
u : 2
* : 3
e : 3
w : 3
* : 4
* : 4
d : 4
t : 4
o : 7
Code:
*:4 *:4 *:5
/ \ / \ / \
a:2 c:2 r:2 s:2 u:2 *:3
/ \
n:1 *:2
/ \
f:1 i:1
e : 3
w : 3
* : 4
* : 4
d : 4
t : 4
* : 5
o : 7
Code:
*:4 *:4 *:5 *:6
/ \ / \ / \ / \
a:2 c:2 r:2 s:2 u:2 *:3 e:3 w:3
/ \
n:1 *:2
/ \
f:1 i:1
* : 4
* : 4
d : 4
t : 4
* : 5
* : 6
o : 7
Code:
*:8
____/ \____
*:4 *:4 *:5 *:6
/ \ / \ / \ / \
a:2 c:2 r:2 s:2 u:2 *:3 e:3 w:3
/ \
n:1 *:2
/ \
f:1 i:1
d : 4
t : 4
* : 5
* : 6
o : 7
* : 8
Code:
*:15
/ \
*:8 *:8
____/ \____ / \
*:4 *:4 d:4 t:4 *:5 *:6
/ \ / \ / \ / \
a:2 c:2 r:2 s:2 u:2 *:3 e:3 w:3
/ \
n:1 *:2
/ \
f:1 i:1
* : 5
* : 6
o : 7
* : 8
* : 8
Code:
*:8 *:8 *:11
____/ \____ / \ _____/ \_____
*:4 *:4 d:4 t:4 *:5 *:6
/ \ / \ / \ / \
a:2 c:2 r:2 s:2 u:2 *:3 e:3 w:3
/ \
n:1 *:2
/ \
f:1 i:1
o : 7
* : 8
* : 8
* : 11
Code:
*:15
/ \
o:7 *:8 *:8 *:11
____/ \____ / \ _____/ \_____
*:4 *:4 d:4 t:4 *:5 *:6
/ \ / \ / \ / \
a:2 c:2 r:2 s:2 u:2 *:3 e:3 w:3
/ \
n:1 *:2
/ \
f:1 i:1
* : 8
* : 11
* : 15
<join up the rest>
Code:
*:34
____________/ \_____________
*:15 *:19
/ \ ______/ \______
o:7 *:8 *:8 *:11
____/ \____ / \ _____/ \_____
*:4 *:4 d:4 t:4 *:5 *:6
/ \ / \ / \ / \
a:2 c:2 r:2 s:2 u:2 *:3 e:3 w:3
/ \
n:1 *:2
/ \
f:1 i:1
(Omitting the obvious 0=left, 1=right for each branch)
Now you can encode the message :
a 0100
w 1111
o 00
o 00
d 100
c 0101
u 1100
t 101
t 101
e 1110
r 0110
edit: I'm assuming they want the number of bits to encode "a woodcutter" using the tree generated for the entire message, rather than using the tree for just that text (which would be shorter)
Into a bitstream :
0100111100001000101110010110111100110 = 37 bits