Find Anagrams in String

See all coding puzzles and patterns here.

Difficulty: Medium
Patterns:
- Sliding Window Technique

Problem Definition

Given a string of characters and a target word, return all starting positions in the string of anagrams of the target word.

Examples

Some test cases to verify our solution.

assert anagrams(string="abdcdabdbab", word="bad") == [0, 4, 5, 7] # because of these anagrams of "bad" in the string:
    # abd cdabdbab
    # abdc dab dbab
    # abdcd abd bab
    # abdcdab dba b
assert anagrams(string="abdcdabdbabd", word="bad") == [0, 4, 5, 7, 9] # because of these anagrams of "bad" in the string:
    # abd cdabdbabd
    # abdc dab dbabd
    # abdcd abd babd
    # abdcdab dba bd
    # abdcdabdb abd

Solution

We'll use a sliding window approach using a window of the same length as the target word.

As we move the window through the string, we'll keep a count of all characters inside the window. On each iteration, we will decrement the count for the character that just left the window and increment the count for the character that just entered the window.

Then we will compare the character counts of the window and the target word to see if the window is an anagram of the target word. If it is an anagram, then the character counts should be equal, in which case, we add the start position of the window to the set of results.

def is_char_count_equal(word_count, curr_count):
    for char in word_count:
        if char not in curr_count:
            return False
        if curr_count[char] != word_count[char]:
            return False
    return True

def initialize_char_count(target_word):
    char_count = {}
    for char in target_word:
        if char not in char_count:
            char_count[char] = 0
        char_count[char] += 1
    return char_count

def anagrams(string, word):
    if len(string) < len(word):
        return []
    l = 0
    curr_count = initialize_char_count(string[:len(word)])
    word_count = initialize_char_count(word)
    res = []
    r = len(word)
    if is_char_count_equal(word_count, curr_count):
        res.append(l)
    while r < len(string):
        curr_count[string[l]] -= 1
        l += 1
        if string[r] not in curr_count:
            curr_count[string[r]] = 0
        curr_count[string[r]] += 1
        if is_char_count_equal(word_count, curr_count):
            res.append(l)
        r += 1
    return res

Complexity

Let $n = $ len(string), $m = $ len(word).

Runtime complexity should be $\Theta(n)$.

Auxiliary space complexity should be $\Theta(m)$ because we are storing an auxiliary dictionary of size $m$.

By removing empty entries from curr_count, we can reduce the space complexity to $m$ with an increase in mean and mode runtime but not $O$ runtime.

Coding Puzzle: Find Anagrams in String

Problem Definition

Examples

Solution

Complexity