Research Article
A Fast Algorithm for Calculating Minimum Redundancy Prefix Codes with Unsorted Alphabet
@INPROCEEDINGS{10.4108/icst.chinacom.2014.256270, author={Yupeng Tai and Haibin Wang}, title={A Fast Algorithm for Calculating Minimum Redundancy Prefix Codes with Unsorted Alphabet}, proceedings={9th International Conference on Communications and Networking in China}, publisher={IEEE}, proceedings_a={CHINACOM}, year={2015}, month={1}, keywords={huffman coding lossless data compression minimum redundancy code prefix-free code}, doi={10.4108/icst.chinacom.2014.256270} }
- Yupeng Tai
Haibin Wang
Year: 2015
A Fast Algorithm for Calculating Minimum Redundancy Prefix Codes with Unsorted Alphabet
CHINACOM
IEEE
DOI: 10.4108/icst.chinacom.2014.256270
Abstract
Minimum redundancy coding (also known as Huffman code) is one of the most well-known algorithm of data compression. Many efforts have been made to improve the efficiency of it. Most of them are based on the assumption that the input alphabet has been already sorted. In this paper, we propose an algorithm of calculating the minimum-redundancy codes directly with unsorted alphabet. It consumes only O(nlog(n/k)) time in the worst cases, where n is the alphabet size and k is the longest codeword length. It is fast because only a part of the symbols requires to be sorted before the final minimum redundancy code is generated. The theoretical analysis and numerical simulation results show that this algorithm achieves a substantial improvement upon the best previous O(nlogn) algorithms for this problem.