See all coding puzzles and patterns here.

Problem Definition

Given 2 strings, find the longest common substrings for the 2 strings. The substrings must be contiguous.


Examples

Some test cases to verify our solution.

assert longest_common_substring("abcdefg", "zzzzzzdefzzz") == "def"
assert longest_common_substring("abcdefg", "gfedcba") == "a" # any other single character is also acceptable

Solution

This is a dynamic programming problem similar to Longest Common Subsequence. However, unlike the subsequence variant, the characters in this solution must be contiguous. dp[i][j] will represent the longest common substring for the strings s1[:i] and s2[:j] that ends at the end of both strings.

def longest_common_substring(s1, s2):
    l1 = len(s1) + 1
    l2 = len(s2) + 1
    dp = [[0 for _ in range(l2)] for _ in range(l1)]
    max_len = 0
    result = ""
    for i1 in range(1, l1):
        for i2 in range(1, l2):
            if s1[i1 - 1] != s2[i2 - 1]:
                dp[i1][i2] = 0
            else:
                dp[i1][i2] = dp[i1-1][i2-1] + 1
                if dp[i1][i2] > max_len:
                    max_len = dp[i1][i2]
                    result = s1[i1-max_len:i1]
    return result

Complexity

Let $n = $ len(s1), $m = $ len(s3).

Runtime complexity should be $O(n \cdot m)$.

Auxiliary space complexity should be $O(n \cdot m)$.