xkcd-repeats: A new Taxonomy of Repeats defined by their Context Diversity

Published by NAVER LABS Europe at 9 November 2017

Matthias Gallé, Matias Tealdi

Journal of Discrete Algorithms, 48, pp. 1-16

@article{galle2018xkcd,
  title={xkcd-repeats: A new taxonomy of repeats defined by their context diversity},
  author={Gall{\'e}, Matthias and Tealdi, Mat{\'\i}as},
  journal={Journal of Discrete Algorithms},
  volume={48},
  pages={1--16},
  year={2018},
  publisher={Elsevier}
}

Careers home

The context in which a substring appears is an important notion to identify – for example – its semantic meaning. However, existing definitions from stringology fail to model the context explicitly. We introduce here xkcd-repeats, a new family of repeats characterized by the number of different symbols at the left and right of their occurrences. These repeats include as special extreme cases the well-known classes of maximal and super-maximal repeats. We give sufficient and necessary condition to bound their number linearly in the size of the original string, and show an optimal algorithm that computes them in linear time – given a suffix array –, independent of the size of the alphabet, as well as two other algorithms that are faster in practice. We extend this in two ways: first we show how to generalize the notion of context. Instead of reducing it to the single symbol before and after an occurrence, we propose the notion of unbounded context, while keeping linear representation and computing time. Secondly, we provide a general framework which allows to compute these (and other) repeats incrementally; opening the door to streaming application.

NAVER FRANCE Gender Equality 2024

NAVER FRANCE Gender Equality 2023

VISION

Perception to help robots understand and interact with the environment.

INTERACTION

Equip robots to interact safely with humans, other robots and systems.

ACTION

Providing embodied agents with sequential decision-making capabilities to safely execute complex tasks in dynamic environments.

Action

All

Publications

Blog

News

Careers

People

xkcd-repeats: A new Taxonomy of Repeats defined by their Context Diversity

All

Publications

Blog

News

Careers

People

Cookie settings