huffman notes
parent
ebc0272dcf
commit
9b21944cf6
35
370/notes/huffman.md
Normal file
@@ -0,0 +1,35 @@
# Huffman codes

Covering: fixed-length encoding and variable-length encoding

# Fixed length encoding

Consider ASCII or a fixed-width Unicode encoding such as UCS-2, where every symbol has the same width (8 bits for ASCII as it is usually stored, 16 bits for UCS-2) no matter how often it occurs.
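
To make the cost of a fixed width concrete, here is a minimal sketch that computes the size of a text in bits. The function names (`min_fixed_width`, `fixed_length_bits`, `distinct_symbols`) are made up for illustration and are not from the notes.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Smallest fixed width (in bits) that can tell `distinct` symbols apart,
// i.e. ceil(log2(distinct)), with a minimum of 1 bit.
std::size_t min_fixed_width(std::size_t distinct) {
    std::size_t bits = 1;
    while ((std::size_t{1} << bits) < distinct) ++bits;
    return bits;
}

// Total size of `text` in bits when every symbol takes `width` bits.
std::size_t fixed_length_bits(const std::string& text, std::size_t width) {
    return text.size() * width;
}

// Number of distinct symbols that actually occur in `text`.
std::size_t distinct_symbols(const std::string& text) {
    return std::set<char>(text.begin(), text.end()).size();
}
```

For `"aaaabbc"` this gives 56 bits at 8 bits per symbol, or 14 bits at the minimal width of 2. Either way the frequent `a` costs exactly as much as the rare `c`, which is the waste that variable-length codes go after.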

# Huffman Trees

We create a tree from character frequencies, where each node holds a character and that character's frequency.

```
#include <stddef.h>  // size_t
#include <stdint.h>  // uint8_t

struct Node {
    uint8_t c;         // the character (symbol)
    size_t frequency;  // how often that character occurs
    ...
};
```

Rules of thumb:

* More frequent characters sit close to the root
* Less frequent characters sit far from the root

We'll end up with a list of nodes, which we can throw into a min-heap keyed on frequency (not a max-heap) to build our Huffman tree: the build repeatedly removes the two least frequent nodes and merges them under a new parent.
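
The heap-based build can be sketched as below. This is one possible version, not the course's reference code: `HuffNode` is a hypothetical stand-in for the notes' `Node` with two child pointers added, and `ByFrequency`, `build_tree`, and `code_lengths` are names invented here. It relies on `std::priority_queue` with a comparator that puts the lowest frequency on top.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <queue>
#include <string>
#include <vector>

// Hypothetical full node: the notes' fields plus two child pointers.
struct HuffNode {
    std::uint8_t c;         // symbol (meaningful only in leaves)
    std::size_t frequency;  // leaf: symbol count; internal: sum of children
    std::shared_ptr<HuffNode> left, right;
};
using NodePtr = std::shared_ptr<HuffNode>;

// Makes std::priority_queue keep the LOWEST frequency on top (a min-heap).
struct ByFrequency {
    bool operator()(const NodePtr& a, const NodePtr& b) const {
        return a->frequency > b->frequency;
    }
};

// Count symbols, then repeatedly merge the two least frequent nodes.
NodePtr build_tree(const std::string& text) {
    std::map<std::uint8_t, std::size_t> freq;
    for (unsigned char ch : text) ++freq[ch];

    std::priority_queue<NodePtr, std::vector<NodePtr>, ByFrequency> heap;
    for (const auto& entry : freq)
        heap.push(std::make_shared<HuffNode>(
            HuffNode{entry.first, entry.second, nullptr, nullptr}));

    while (heap.size() > 1) {
        NodePtr a = heap.top(); heap.pop();
        NodePtr b = heap.top(); heap.pop();
        // The parent has no symbol of its own; 0 is only a placeholder.
        heap.push(std::make_shared<HuffNode>(
            HuffNode{0, a->frequency + b->frequency, a, b}));
    }
    return heap.empty() ? nullptr : heap.top();
}

// Record each leaf's depth, which is the length of that symbol's code.
void code_lengths(const NodePtr& node, std::size_t depth,
                  std::map<std::uint8_t, std::size_t>& out) {
    if (!node) return;
    if (!node->left && !node->right) { out[node->c] = depth; return; }
    code_lengths(node->left, depth + 1, out);
    code_lengths(node->right, depth + 1, out);
}
```

For `"aaaabbc"` the root has frequency 7, `a` lands at depth 1, and `b` and `c` land at depth 2, which matches the rules of thumb. An input with only one distinct symbol yields a lone leaf at depth 0, so a real encoder would need to special-case it.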

The general process, from building the tree to decoding, goes like this:

1. Get the frequencies of all symbols
2. Put those symbol frequencies into a structure like the one above
3. Build a min-heap from the node set and merge nodes until one root remains
   Keep in mind, however, that the root should be agnostic (it carries no character of its own), so that bit strings can start with either 0 or 1
4. When decoding, follow one branch per bit; when we reach a leaf, we drop that char into the result and start again from the root
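
The last step can be sketched as a loop that walks the tree bit by bit. Everything here is an assumption made for illustration: the `DecodeNode` type, the choice that `0` and `1` select the `zero` and `one` children, the use of a string of `'0'`/`'1'` characters instead of packed bits, and the hand-built `example_tree` (codes `a = 1`, `b = 01`, `c = 00`).

```cpp
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical decode-side node: leaves hold a symbol, internal nodes do not.
struct DecodeNode {
    std::uint8_t c = 0;
    std::unique_ptr<DecodeNode> zero;  // child followed on bit '0'
    std::unique_ptr<DecodeNode> one;   // child followed on bit '1'
    bool is_leaf() const { return !zero && !one; }
};

// Follow one branch per bit; emit a symbol at each leaf, then restart
// from the root. Stops early if the input asks for a missing branch.
std::string decode(const DecodeNode& root, const std::string& bits) {
    std::string result;
    const DecodeNode* node = &root;
    for (char bit : bits) {
        node = (bit == '0') ? node->zero.get() : node->one.get();
        if (node == nullptr) break;    // malformed input: no such branch
        if (node->is_leaf()) {
            result.push_back(static_cast<char>(node->c));
            node = &root;              // the next code starts at the root
        }
    }
    return result;
}

// Helper that makes a leaf for the example tree below.
std::unique_ptr<DecodeNode> leaf(char c) {
    auto n = std::make_unique<DecodeNode>();
    n->c = static_cast<std::uint8_t>(c);
    return n;
}

// Hand-built example tree with codes a = 1, b = 01, c = 00.
std::unique_ptr<DecodeNode> example_tree() {
    auto root = std::make_unique<DecodeNode>();
    root->zero = std::make_unique<DecodeNode>();
    root->zero->zero = leaf('c');
    root->zero->one = leaf('b');
    root->one = leaf('a');
    return root;
}
```

With this tree, `decode(*example_tree(), "1101100")` reads the codes `1`, `1`, `01`, `1`, `00` and returns `"aabac"`. No separators are needed between codes because no code is a prefix of another.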