Longest Common Substring
Algorithm
Given 2 strings abcdxyz
and xyzabcd
lets say, we want to find the longest common substring between the 2. There are 2 here: abcd
and xyz
. So we're looking for the former.
The most efficient algorithm given 2 strings n
and m
has characteristics:
Space:
Complexity:
And proceeds like so: We keep an in memory matrix where rows, indexed by i2,
correspond to letters in s2
:xyzabcd
and columns, indexed by i1
correspond to letters in s1
: abcdxyz
. We begin with i1 = 0
, i2 = 0
and iterate through both strings in a double for loop style. If s2[i2] == s1[i1]
we updated the matrix: m[i2][i1] = m[i2-1][i1-1] + 1
. The resulting m
looks like so:
a
b
c
d
x
y
z
x
0
0
0
0
1
0
0
y
0
0
0
0
0
2
0
z
0
0
0
0
0
0
3
a
1
0
0
0
0
0
0
b
0
2
0
0
0
0
0
c
0
0
3
0
0
0
0
d
0
0
0
4
0
0
0