Dictionary-based compression schemes are the most commonly used data compression schemes since they appeared in the foundational paper of Ziv and Lempel in 1977, and generally referred to as LZ77. Their work is the base of Zip, gZip, 7-Zip and many other compression software utilities. Some of these compression schemes use variants of the greedy approach to parse the text into dictionary phrases; others have left the greedy approach to improve the compression ratio. Recently, two bit-optimal parsing algorithms have been presented filling the gap between theory and best practice. We present a survey on the parsing problem for dictionary-based text compression, identifying noticeable results of both a theoretical and practical nature, which have appeared in the last three decades. We follow the historical steps of the Zip scheme showing how the original optimal parsing problem of finding a parse formed by the minimum number of phrases has been replaced by the bit-optimal parsing problem where the goal is to minimise the length in bits of the encoded text.

Langiu, A. (2013). On parsing optimality for dictionary-based text compression— the Zip case. JOURNAL OF DISCRETE ALGORITHMS, 20, 65-70 [10.1016/j.jda.2013.04.001].

On parsing optimality for dictionary-based text compression— the Zip case

LANGIU, Alessio
2013-01-01

Abstract

Dictionary-based compression schemes are the most commonly used data compression schemes since they appeared in the foundational paper of Ziv and Lempel in 1977, and generally referred to as LZ77. Their work is the base of Zip, gZip, 7-Zip and many other compression software utilities. Some of these compression schemes use variants of the greedy approach to parse the text into dictionary phrases; others have left the greedy approach to improve the compression ratio. Recently, two bit-optimal parsing algorithms have been presented filling the gap between theory and best practice. We present a survey on the parsing problem for dictionary-based text compression, identifying noticeable results of both a theoretical and practical nature, which have appeared in the last three decades. We follow the historical steps of the Zip scheme showing how the original optimal parsing problem of finding a parse formed by the minimum number of phrases has been replaced by the bit-optimal parsing problem where the goal is to minimise the length in bits of the encoded text.
2013
Settore INF/01 - Informatica
Langiu, A. (2013). On parsing optimality for dictionary-based text compression— the Zip case. JOURNAL OF DISCRETE ALGORITHMS, 20, 65-70 [10.1016/j.jda.2013.04.001].
File in questo prodotto:
File Dimensione Formato  
The_Zip_case.pdf

Solo gestori archvio

Descrizione: Articolo principale
Dimensione 263.54 kB
Formato Adobe PDF
263.54 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10447/86143
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 14
  • ???jsp.display-item.citation.isi??? ND
social impact