Dictionary-based compression algorithms include a parsing strategy to transform the input text into a sequence of dictionary phrases. Given a text, such process usually is not unique and, for compression purposes, it makes sense to find one of the possible parsing that minimise the final compression ratio. This is the parsing problem. In more than 30 years of history of dictionary-based text compression only few optimal parsing algorithms were presented. Most of the practical dictionary-based compression solutions need or prefer to factorise the input data into a sequence of dictionary-phrases and symbols. Those two output categories are usually encoded via two different encoders producing a compressed output that is a mixture of two compressors. This book contains a review of many dictionary-based compression schemes, their theoretical basis, a focus on the parsing problem and related problems, a recent theoretical model for such compression schemes, and an optimal solution called Dictionary-Symbolwise Flexible Parsing that covers almost all the classic dictionary-based compression schemes and the more general Dictionary-Symbolwise variant, where letters and dictionary references are compressed via different variable-length encoders.

Langiu, a. (2013). Optimal Parsing for Dictionary-Based Compression. berlin : Lap Lambert Academic Publishing.

Optimal Parsing for Dictionary-Based Compression

LANGIU, Alessio
2013-01-01

Abstract

Dictionary-based compression algorithms include a parsing strategy to transform the input text into a sequence of dictionary phrases. Given a text, such process usually is not unique and, for compression purposes, it makes sense to find one of the possible parsing that minimise the final compression ratio. This is the parsing problem. In more than 30 years of history of dictionary-based text compression only few optimal parsing algorithms were presented. Most of the practical dictionary-based compression solutions need or prefer to factorise the input data into a sequence of dictionary-phrases and symbols. Those two output categories are usually encoded via two different encoders producing a compressed output that is a mixture of two compressors. This book contains a review of many dictionary-based compression schemes, their theoretical basis, a focus on the parsing problem and related problems, a recent theoretical model for such compression schemes, and an optimal solution called Dictionary-Symbolwise Flexible Parsing that covers almost all the classic dictionary-based compression schemes and the more general Dictionary-Symbolwise variant, where letters and dictionary references are compressed via different variable-length encoders.
2013
Settore INF/01 - Informatica
978-3-659-51042-7
Langiu, a. (2013). Optimal Parsing for Dictionary-Based Compression. berlin : Lap Lambert Academic Publishing.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10447/86125
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact