Suffix Tree

The suffix tree data structure was one of the first linear-time solutions for the longest common substring problem. The structure was introduced by P. Weiner in 1973; E.M. McCreight described a simpler, more space-economical construction in 1976, and E. Ukkonen gave an on-line construction in 1995. A suffix tree for an n-character string S is a Patricia trie (a compressed trie) containing all n suffixes of S. Once the tree is built, a pattern of length m can be located in the text in time proportional to m, independent of the length of the text (for a fixed alphabet), and common substrings can be extracted quickly. Some implementations of LZ77-family compression schemes, such as LZSS, use it to find repeated strings. Suffix trees are also useful for string matching applications, such as those that arise when working with DNA sequences.

Each node in a suffix tree contains the following information:

  • an edge label for the edge leading into the node, in the form of a substring of the source string, represented by the start and end positions of the substring;
  • a list of child nodes, often in the form of a linked list;
  • a pointer to the next sibling node;
  • a suffix link, pointing to the node for the immediate suffix of the string represented by the current node.

Suffix links are a key feature for linear-time construction of the tree, since they allow changes to propagate to all suffixes quickly.

The large amount of information at each node makes the suffix tree very memory-intensive: common implementations consume roughly twenty times the memory of the source text. The suffix array reduces this requirement to a factor of about four, and efforts have continued to find smaller indexing structures.
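The node layout described above can be illustrated with a small sketch. It is not McCreight's or Ukkonen's linear-time algorithm: it inserts the suffixes one at a time, which takes quadratic time in the worst case, and it never follows the suffix link, which is kept only to show where a linear-time builder would store it. The names Node, build_suffix_tree and contains are hypothetical, children are held in a dictionary rather than a sibling-linked list, and a terminator character "$" is assumed not to occur in the text or in any pattern.

```python
class Node:
    """One suffix-tree node; the label of the edge entering it is text[start:end]."""

    def __init__(self, start=0, end=0):
        self.start = start       # start position of the edge label in the text
        self.end = end           # end position (exclusive) of the edge label
        self.children = {}       # first character of a child's label -> child Node
        self.suffix_link = None  # unused in this naive sketch (assumption)


def build_suffix_tree(text, terminator="$"):
    """Naive quadratic construction: insert every suffix into a compressed trie."""
    assert terminator not in text
    text += terminator  # guarantees that every suffix ends at its own leaf
    root = Node()
    for i in range(len(text)):
        node, pos = root, i
        while pos < len(text):
            child = node.children.get(text[pos])
            if child is None:
                # No edge starts with this character: add a new leaf.
                node.children[text[pos]] = Node(pos, len(text))
                break
            # Walk along the child's edge label while it matches the suffix.
            k = child.start
            while k < child.end and text[k] == text[pos]:
                k += 1
                pos += 1
            if k == child.end:
                node = child  # whole label matched; continue below the child
                continue
            # Mismatch inside the label: split the edge at position k.
            mid = Node(child.start, k)
            node.children[text[child.start]] = mid
            child.start = k
            mid.children[text[k]] = child
            mid.children[text[pos]] = Node(pos, len(text))
            break
    return root, text


def contains(root, text, pattern):
    """Return True if pattern occurs in the indexed text (time grows with len(pattern))."""
    node, i = root, 0
    while i < len(pattern):
        child = node.children.get(pattern[i])
        if child is None:
            return False
        label = text[child.start:child.end]
        m = min(len(label), len(pattern) - i)
        if label[:m] != pattern[i:i + m]:
            return False
        i += m
        node = child
    return True


root, text = build_suffix_tree("banana")
print(contains(root, text, "nan"))  # True
print(contains(root, text, "nab"))  # False
```

Storing each edge label as a pair of positions rather than a copy of the substring keeps the number of stored values per node constant, which is what makes the tree's total size linear in the length of the text.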

References

  • E.M. McCreight. (1976). A space-economical suffix tree construction algorithm. Journal of the ACM 23(2):262-272.
  • E. Ukkonen. (1995). On-line construction of suffix trees. Algorithmica 14(3):249-260.
