Suppose an oracle knows a string S that is unknown to us and that we want to determine. The oracle can answer queries of the form “Is s a substring of S?”. In 1995, Skiena and Sundaram showed that, in the worst case, any algorithm needs to ask the oracle σn/4−O(n) queries in order to be able to reconstruct the hidden string, where σ is the size of the alphabet of S and n its length, and gave an algorithm that spends (σ−1)n+O(σn) queries to reconstruct S. The main contribution of our paper is to improve the above upper-bound in the context where the string is compressible. We first present a universal algorithm that, given a (computable) compressor that compresses the string to τ bits, performs q=O(τ) substring queries; this algorithm, however, runs in exponential time. For this reason, the second part of the paper focuses on more time-efficient algorithms whose number of queries is bounded by specific compressibility measures. We first show that any string of length n over an integer alphabet of size σ with rle runs can be reconstructed with [Formula presented]> substring queries in linear time and space. We then present an algorithm that spends q∈O(σglog⁡n) substring queries and runs in O(n(log⁡n+log⁡σ)+q) time using linear space, where g is the size of a smallest straight-line program generating the string.

Fici G., Prezza N., Venturini R. (2021). Adaptive learning of compressible strings. THEORETICAL COMPUTER SCIENCE, 896, 46-52 [10.1016/j.tcs.2021.10.003].

Adaptive learning of compressible strings

Fici G.
;
Prezza N.;
2021-01-01

Abstract

Suppose an oracle knows a string S that is unknown to us and that we want to determine. The oracle can answer queries of the form “Is s a substring of S?”. In 1995, Skiena and Sundaram showed that, in the worst case, any algorithm needs to ask the oracle σn/4−O(n) queries in order to be able to reconstruct the hidden string, where σ is the size of the alphabet of S and n its length, and gave an algorithm that spends (σ−1)n+O(σn) queries to reconstruct S. The main contribution of our paper is to improve the above upper-bound in the context where the string is compressible. We first present a universal algorithm that, given a (computable) compressor that compresses the string to τ bits, performs q=O(τ) substring queries; this algorithm, however, runs in exponential time. For this reason, the second part of the paper focuses on more time-efficient algorithms whose number of queries is bounded by specific compressibility measures. We first show that any string of length n over an integer alphabet of size σ with rle runs can be reconstructed with [Formula presented]> substring queries in linear time and space. We then present an algorithm that spends q∈O(σglog⁡n) substring queries and runs in O(n(log⁡n+log⁡σ)+q) time using linear space, where g is the size of a smallest straight-line program generating the string.
2021
Fici G., Prezza N., Venturini R. (2021). Adaptive learning of compressible strings. THEORETICAL COMPUTER SCIENCE, 896, 46-52 [10.1016/j.tcs.2021.10.003].
File in questo prodotto:
File Dimensione Formato  
Adaptive learning of compressible strings.pdf

accesso aperto

Tipologia: Versione Editoriale
Dimensione 316.29 kB
Formato Adobe PDF
316.29 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10447/525562
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? 1
social impact