An object to find exact subsequences within a sequence.

Namespace:  BioSharp.Core.Bio.Search
Assembly:  BioSharp.Core (in BioSharp.Core.dll) Version: 0.1.3191.26120 (0.1.0.0)

Syntax

C#
public sealed class KnuthMorrisPrattSearch

Remarks

Reference: KNUTH D.E., MORRIS (Jr) J.H., PRATT V.R., 1977, Fast pattern matching in strings, SIAM Journal on Computing 6(2):323-350.

Once the object has been constructed, call the findMatches() method. It returns an int[] giving the offsets (i.e. the location of the first symbol of each match in the text). The getKMPNextTable() method returns the table of border lengths used by the algorithm. This method is protected, as it is unlikely to be needed except for debugging.
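The relationship between the border-length table and the returned offsets can be illustrated with a minimal, self-contained sketch. It works on plain strings rather than BioSharp sequence types, and the names KmpSketch, BuildBorderTable and FindMatches are hypothetical stand-ins, not the BioSharp API. The table built here is the plain border-length variant, which may differ in detail from the table the class computes internally.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch only: a hypothetical stand-in, not the BioSharp class.
public static class KmpSketch
{
    // next[i] is the length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of it (a "border").
    public static int[] BuildBorderTable(string pattern)
    {
        var next = new int[pattern.Length];
        int k = 0; // length of the border currently being extended
        for (int i = 1; i < pattern.Length; i++)
        {
            while (k > 0 && pattern[i] != pattern[k])
            {
                k = next[k - 1]; // fall back to the next shorter border
            }
            if (pattern[i] == pattern[k])
            {
                k++;
            }
            next[i] = k;
        }
        return next;
    }

    // Returns the offset of the first symbol of every match, overlaps included.
    public static int[] FindMatches(string pattern, string text)
    {
        var offsets = new List<int>();
        if (pattern.Length == 0)
        {
            return offsets.ToArray(); // assumption: an empty pattern matches nothing
        }

        int[] next = BuildBorderTable(pattern);
        int k = 0; // number of pattern symbols matched so far
        for (int i = 0; i < text.Length; i++)
        {
            while (k > 0 && text[i] != pattern[k])
            {
                k = next[k - 1]; // reuse the matched prefix instead of rescanning the text
            }
            if (text[i] == pattern[k])
            {
                k++;
            }
            if (k == pattern.Length)
            {
                offsets.Add(i - pattern.Length + 1);
                k = next[k - 1]; // continue, so overlapping matches are reported
            }
        }
        return offsets.ToArray();
    }
}
```

For the pattern "ACA" this sketch builds the table {0, 0, 1}, and searching the text "GACACAT" yields the offsets {1, 3}, since the two matches overlap at position 3.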

The algorithm finds exact matches, so ambiguity symbols match only themselves. The class does not support regular expressions. It operates on all alphabets and therefore has no DNA-specific behaviour: to find a DNA pattern on both strands, compile both the pattern and its reverse complement.
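A rough sketch of the two-strand search described above, again on plain strings. The names BothStrandsSketch, ReverseComplement and FindAll are hypothetical, and String.IndexOf is swapped in for the Knuth-Morris-Pratt search purely to keep the example short; with the real class you would presumably construct one searcher per pattern instead.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch only: hypothetical helpers, not part of BioSharp.
public static class BothStrandsSketch
{
    // Complements one unambiguous DNA symbol; anything else is rejected.
    private static char Complement(char symbol)
    {
        switch (symbol)
        {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'C': return 'G';
            case 'G': return 'C';
            default: throw new ArgumentException("Unsupported symbol: " + symbol);
        }
    }

    // Reverses the sequence and complements each symbol.
    public static string ReverseComplement(string dna)
    {
        var result = new char[dna.Length];
        for (int i = 0; i < dna.Length; i++)
        {
            result[i] = Complement(dna[dna.Length - 1 - i]);
        }
        return new string(result);
    }

    // Exact-match offsets, overlaps included (String.IndexOf stands in for KMP here).
    public static List<int> FindAll(string pattern, string text)
    {
        var offsets = new List<int>();
        if (pattern.Length == 0)
        {
            return offsets; // assumption: an empty pattern matches nothing
        }
        int at = text.IndexOf(pattern, StringComparison.Ordinal);
        while (at >= 0)
        {
            offsets.Add(at);
            at = text.IndexOf(pattern, at + 1, StringComparison.Ordinal);
        }
        return offsets;
    }
}
```

With the pattern "GAAT", whose reverse complement is "ATTC", and the text "CCGAATGGATTC", FindAll reports offset 2 for the pattern itself and offset 8 for its reverse complement. The second hit corresponds to the pattern occurring on the opposite strand.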

WARNING: the behaviour of a pattern containing gaps is undefined, so searching with one is not recommended.

Original BioJava version by Mark Schreiber. Port to C# by Doug Swisher.

Inheritance Hierarchy

System.Object
  BioSharp.Core.Bio.Search.KnuthMorrisPrattSearch

See Also