huffman notes

2019-03-28 14:51:54 -07:00
parent ebc0272dcf
commit 9b21944cf6
1 changed files with 35 additions and 0 deletions
--- a/370/notes/huffman.md
+++ b/370/notes/huffman.md
@@ -0,0 +1,35 @@
 # Huffman codes
 Covering: Fixed length encoding & Variable length encoding
 # Fixed length encoding
 Consider ASCII or a fixed-width Unicode encoding such as UCS-2, where every symbol takes the same number of bits (8 or 16) no matter how often it appears.
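 A minimal sketch of what fixed-length encoding costs, to have a baseline to compare against later. The function names `fixed_width_bits` and `fixed_length_cost` are made up for these notes and are not from any library.
 ```cpp
 #include <cassert>
 #include <cstddef>
 #include <string>

 // Smallest width w such that 2^w >= alphabet_size, i.e. the number of
 // bits every symbol needs when all symbols get the same width.
 std::size_t fixed_width_bits(std::size_t alphabet_size) {
 	assert(alphabet_size > 0);
 	std::size_t width = 0;
 	while ((std::size_t{1} << width) < alphabet_size) ++width;
 	return width;
 }

 // Total message size in bits under a fixed-width code.
 std::size_t fixed_length_cost(const std::string& text, std::size_t bits_per_symbol) {
 	return text.size() * bits_per_symbol;
 }
 ```
 For example, the 9-character message `aaaaabbcd` costs 72 bits at 8 bits per symbol, and still 18 bits if we shrink the alphabet to its 4 distinct symbols (2 bits each).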
 # Huffman Trees
 We build a tree from character frequencies, where each node holds a character and that character's frequency.
 ```
 struct Node {
 	uint8_t c;
 	size_t frequency;
 	...
 };
 ```
 Rules of thumb:
 	* The more frequent characters are close to the root
 	* Less frequent characters are found far from the root
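 Why the rules of thumb pay off: a symbol's depth in the tree is the length of its code, so the encoded size is the sum of frequency times depth over all symbols. A small sketch, where `SymbolCost` and `encoded_bits` are hypothetical names used only for illustration:
 ```cpp
 #include <cstddef>
 #include <vector>

 // One symbol's frequency and its depth in the tree (= its code length in bits).
 struct SymbolCost {
 	std::size_t frequency;
 	std::size_t depth;
 };

 // Encoded size in bits: each occurrence of a symbol costs `depth` bits.
 std::size_t encoded_bits(const std::vector<SymbolCost>& symbols) {
 	std::size_t total = 0;
 	for (const SymbolCost& s : symbols) total += s.frequency * s.depth;
 	return total;
 }
 ```
 With frequencies 5, 2, 1, 1 placed at depths 1, 2, 3, 3 the total is 15 bits. Put the same symbols at depths 3, 3, 2, 1 instead (frequent ones far from the root) and it grows to 24 bits.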
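 We'll end up with a list of nodes, which we can throw into a min-heap (keyed on frequency) to build our Huffman tree.
 The general process goes like this:
 	1. Get the frequencies of all symbols
 	2. Put those symbol frequencies into a structure like the one above
 	3. Build a min-heap from the node set, then repeatedly merge the two least frequent nodes under a new parent until a single root remains
 		Keep in mind that the root (like every merged parent) holds no character of its own, so bit strings can start with either 0 or 1
 	4. To decode, follow the bits down from the root; when we reach a leaf we drop that char into the result and start again at the root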
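 The steps above can be sketched roughly as follows. This is one possible reading of the notes, not the course's reference implementation: `HuffNode`, `HuffTree`, the index-based child links, the 0 = left / 1 = right convention, and bits stored as `'0'`/`'1'` characters are all assumptions made for readability. The sketch also assumes the input has at least two distinct symbols.
 ```cpp
 #include <array>
 #include <cassert>
 #include <cstddef>
 #include <cstdint>
 #include <functional>
 #include <queue>
 #include <string>
 #include <utility>
 #include <vector>

 // Assumed node layout: children are indices into HuffTree::nodes, -1 = none.
 struct HuffNode {
 	std::uint8_t c;         // only meaningful in leaves
 	std::size_t frequency;
 	int left;
 	int right;
 };

 struct HuffTree {
 	std::vector<HuffNode> nodes;
 	int root;
 };

 HuffTree build_tree(const std::string& text) {
 	// Step 1: count how often each byte value occurs.
 	std::array<std::size_t, 256> freq{};
 	for (unsigned char ch : text) ++freq[ch];

 	HuffTree tree;
 	// Min-heap of (frequency, node index): smallest frequency on top.
 	using Entry = std::pair<std::size_t, int>;
 	std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;

 	// Step 2: one leaf node per symbol that actually occurs.
 	for (int s = 0; s < 256; ++s) {
 		if (freq[s] == 0) continue;
 		tree.nodes.push_back({static_cast<std::uint8_t>(s), freq[s], -1, -1});
 		heap.push({freq[s], static_cast<int>(tree.nodes.size()) - 1});
 	}
 	assert(heap.size() >= 2 && "sketch assumes at least two distinct symbols");

 	// Step 3: merge the two least frequent nodes until one root remains.
 	while (heap.size() > 1) {
 		Entry a = heap.top(); heap.pop();
 		Entry b = heap.top(); heap.pop();
 		tree.nodes.push_back({0, a.first + b.first, a.second, b.second});
 		heap.push({a.first + b.first, static_cast<int>(tree.nodes.size()) - 1});
 	}
 	tree.root = heap.top().second;
 	return tree;
 }

 // Walk the tree, appending '0' for a left edge and '1' for a right edge.
 void assign_codes(const HuffTree& t, int node, const std::string& prefix,
                   std::array<std::string, 256>& codes) {
 	const HuffNode& n = t.nodes[node];
 	if (n.left < 0) { codes[n.c] = prefix; return; }  // leaf
 	assign_codes(t, n.left, prefix + '0', codes);
 	assign_codes(t, n.right, prefix + '1', codes);
 }

 std::string encode(const HuffTree& t, const std::string& text) {
 	std::array<std::string, 256> codes;
 	assign_codes(t, t.root, "", codes);
 	std::string bits;
 	for (unsigned char ch : text) bits += codes[ch];
 	return bits;
 }

 // Step 4: follow bits from the root; on a leaf, emit its char and restart.
 std::string decode(const HuffTree& t, const std::string& bits) {
 	std::string out;
 	int node = t.root;
 	for (char bit : bits) {
 		node = (bit == '0') ? t.nodes[node].left : t.nodes[node].right;
 		if (t.nodes[node].left < 0) {
 			out += static_cast<char>(t.nodes[node].c);
 			node = t.root;
 		}
 	}
 	return out;
 }
 ```
 Calling `build_tree("aaaaabbcd")`, then `encode` and `decode` with that tree, round-trips the message in 15 bits. A real encoder would pack the bits into bytes and store the tree (or the frequencies) alongside them so the decoder can rebuild it.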