9th International Conference on Communications and Networking in China

Research Article

A Fast Algorithm for Calculating Minimum Redundancy Prefix Codes with Unsorted Alphabet

  • @INPROCEEDINGS{10.4108/icst.chinacom.2014.256270,
        author={Yupeng Tai and Haibin Wang},
        title={A Fast Algorithm for Calculating Minimum Redundancy Prefix Codes with Unsorted Alphabet},
        proceedings={9th International Conference on Communications and Networking in China},
        publisher={IEEE},
        proceedings_a={CHINACOM},
        year={2015},
        month={1},
        keywords={huffman coding lossless data compression minimum redundancy code prefix-free code},
        doi={10.4108/icst.chinacom.2014.256270}
    }
    
  • Yupeng Tai
    Haibin Wang
    Year: 2015
    A Fast Algorithm for Calculating Minimum Redundancy Prefix Codes with Unsorted Alphabet
    CHINACOM
    IEEE
    DOI: 10.4108/icst.chinacom.2014.256270
Yupeng Tai1,*, Haibin Wang1
  • 1: Institute of Acoustics, Chinese Academy of Sciences
*Contact email: typ@mail.ioa.ac.cn

Abstract

Minimum redundancy coding (also known as Huffman code) is one of the most well-known algorithm of data compression. Many efforts have been made to improve the efficiency of it. Most of them are based on the assumption that the input alphabet has been already sorted. In this paper, we propose an algorithm of calculating the minimum-redundancy codes directly with unsorted alphabet. It consumes only O(nlog(n/k)) time in the worst cases, where n is the alphabet size and k is the longest codeword length. It is fast because only a part of the symbols requires to be sorted before the final minimum redundancy code is generated. The theoretical analysis and numerical simulation results show that this algorithm achieves a substantial improvement upon the best previous O(nlogn) algorithms for this problem.