Lexicographically shortest string of length at most K that is not a substring of a given string (1)

📅 Last modified: 2023-12-03 14:55:17.866000 🧑 Author: Mango

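Problem description

Given a dictionary of strings, a string s, and a positive integer K, find the shortest string in the dictionary that is not a substring of s and in which every character occurs at least K times. In the trie-based solution below, "occurs at least K times" is measured by the count stored on each trie node along the word's path. If no such string exists, return the empty string.

Solution

Approach

A trie fits this problem. First insert every dictionary string into the trie. Then walk the trie, descending only into nodes whose count is at least K, combine the results from their subtrees, and pick the shortest string that is not a substring of the given string.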
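Before building the trie, a brute-force baseline can be useful for cross-checking results on small inputs. The sketch below is not part of the original solution: `brute_force_shortest` is a hypothetical helper, and it assumes that "occurs at least K times" means every prefix of the chosen word is shared by at least K dictionary words, with ties between equally short words broken lexicographically.

```python
from collections import Counter
from typing import List


def brute_force_shortest(words: List[str], s: str, K: int) -> str:
    """Trie-free baseline (hypothetical helper, for cross-checking only)."""
    # Count how many dictionary words start with each non-empty prefix.
    prefix_count = Counter(
        word[:i] for word in words for i in range(1, len(word) + 1)
    )
    candidates = [
        word
        for word in set(words)
        if word
        and word not in s  # must not be a substring of s
        and all(prefix_count[word[:i]] >= K for i in range(1, len(word) + 1))
    ]
    # Shortest first; equally short words are ordered lexicographically (assumption).
    return min(candidates, key=lambda w: (len(w), w), default="")


words = ["ca", "cat", "cab", "dog"]  # illustrative data
print(brute_force_shortest(words, "cab", 1))  # cat
print(brute_force_shortest(words, "dog", 2))  # ca
```

With K = 2, only "ca" survives here because it is the one word whose every prefix is shared by at least two dictionary words.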

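Building the trie

Each trie node represents one character, and the path from the root to a node spells a prefix of one or more dictionary strings. Each node stores the following information:

is_word: bool, whether the string spelled from the root to this node is a complete word
count: int, the number of inserted words whose path passes through this node (that is, the number of words sharing this prefix)
children: dict, mapping each next character to the corresponding child node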

class TrieNode:
    """
    A node of the trie.
    """
    def __init__(self):
        self.is_word = False
        self.count = 0
        self.children = {}

Inserting a string into the trie is implemented as follows:

def insert_word(root: TrieNode, word: str):
    """
    Insert a word into the trie, incrementing count on every node along its path.
    """
    node = root
    for char in word:
        if char not in node.children:
            node.children[char] = TrieNode()
        node = node.children[char]
        node.count += 1
    node.is_word = True
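
To make the meaning of count and is_word concrete, here is a small self-contained run of the same insertion logic. The three words are illustrative only and do not come from the article.

```python
class TrieNode:
    def __init__(self):
        self.is_word = False
        self.count = 0
        self.children = {}


def insert_word(root, word):
    node = root
    for char in word:
        if char not in node.children:
            node.children[char] = TrieNode()
        node = node.children[char]
        node.count += 1  # one more word passes through this node
    node.is_word = True


root = TrieNode()
for w in ["car", "cat", "dog"]:  # illustrative words
    insert_word(root, w)

c = root.children["c"]
a = c.children["a"]
print(c.count, a.count, a.children["t"].count)  # 2 2 1
print(a.is_word, a.children["t"].is_word)       # False True
```

The nodes for "c" and "ca" are shared by two words, so their count is 2, while the leaf for "cat" is reached by one word only. The root itself is never incremented.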

All dictionary strings are then inserted into the trie as follows:

from typing import List

def build_trie(words: List[str]) -> TrieNode:
    """
    Build a trie from the given words.
    """
    root = TrieNode()
    for word in words:
        insert_word(root, word)
    return root

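Merging subtrees

Define a function merge_subtree that takes a node and the threshold K, plus the given string s and the prefix spelled so far (both default to the empty string). It descends only into children whose count is at least K, combines the best result from each qualifying subtree, and returns the shortest suffix that completes a dictionary word which is not a substring of s. Ties are broken lexicographically, and None is returned if no such word exists.

from typing import Optional

def merge_subtree(node: TrieNode, K: int, s: str = "", prefix: str = "") -> Optional[str]:
    """
    Return the shortest suffix below node that completes a dictionary word
    which is not a substring of s, visiting only children with count >= K.
    """
    sub_words = []
    # The path to this node is itself a word that does not occur in s
    if node.is_word and prefix not in s:
        sub_words.append('')
    for char, child_node in node.children.items():
        if child_node.count >= K:
            sub_word = merge_subtree(child_node, K, s, prefix + char)
            if sub_word is not None:
                # Prepend the edge character so the suffix reads left to right
                sub_words.append(char + sub_word)
    if not sub_words:
        return None
    # Shortest first; equally short candidates are ordered lexicographically
    return min(sub_words, key=lambda w: (len(w), w))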
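
To see the effect of the count >= K filter in isolation, the following sketch lists every stored word that survives the pruning. `build` and `words_above_threshold` are hypothetical helpers written for this illustration, not functions from the article, and the word list is made up.

```python
from typing import Iterator, List


class TrieNode:
    def __init__(self):
        self.is_word = False
        self.count = 0
        self.children = {}


def build(words: List[str]) -> TrieNode:
    root = TrieNode()
    for word in words:
        node = root
        for char in word:
            node = node.children.setdefault(char, TrieNode())
            node.count += 1
        node.is_word = True
    return root


def words_above_threshold(node: TrieNode, K: int, prefix: str = "") -> Iterator[str]:
    """Yield every stored word whose whole path has count >= K."""
    if node.is_word and prefix:
        yield prefix
    for char, child in sorted(node.children.items()):
        if child.count >= K:  # skip branches shared by fewer than K words
            yield from words_above_threshold(child, K, prefix + char)


root = build(["ca", "cat", "cab", "dog"])
print(list(words_above_threshold(root, 1)))  # ['ca', 'cab', 'cat', 'dog']
print(list(words_above_threshold(root, 2)))  # ['ca']
```

Raising K from 1 to 2 removes every word that ends in a node reached by a single dictionary word, which is why only "ca" remains.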

The complete solution is as follows:

from typing import List, Optional

class TrieNode:
    """
    A node of the trie.
    """
    def __init__(self):
        self.is_word = False
        self.count = 0
        self.children = {}

def insert_word(root: TrieNode, word: str):
    """
    Insert a word into the trie, incrementing count on every node along its path.
    """
    node = root
    for char in word:
        if char not in node.children:
            node.children[char] = TrieNode()
        node = node.children[char]
        node.count += 1
    node.is_word = True

def build_trie(words: List[str]) -> TrieNode:
    """
    Build a trie from the given words.
    """
    root = TrieNode()
    for word in words:
        insert_word(root, word)
    return root

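def merge_subtree(node: TrieNode, K: int, s: str = "", prefix: str = "") -> Optional[str]:
    """
    Return the shortest suffix below node that completes a dictionary word
    which is not a substring of s, visiting only children with count >= K.
    """
    sub_words = []
    # The path to this node is itself a word that does not occur in s
    if node.is_word and prefix not in s:
        sub_words.append('')
    for char, child_node in node.children.items():
        if child_node.count >= K:
            sub_word = merge_subtree(child_node, K, s, prefix + char)
            if sub_word is not None:
                # Prepend the edge character so the suffix reads left to right
                sub_words.append(char + sub_word)
    if not sub_words:
        return None
    # Shortest first; equally short candidates are ordered lexicographically
    return min(sub_words, key=lambda w: (len(w), w))

def shortest_not_substring(dict_: List[str], s: str, K: int) -> str:
    """
    Return the shortest word in dict_ that is not a substring of s and whose
    trie nodes all have count >= K; return '' if there is no such word.
    """
    root = build_trie(dict_)
    result = merge_subtree(root, K, s)
    return result if result is not None else ''

Testing

Define the following test cases:

def test():
    dict_ = ["apple", "banana", "cat", "dog", "egg", "frog", "goat", "hat", "ink", "jackfruit", "kiwi", "lemon", "mango", "nut", "orange", "pear", "queen", "rabbit", "sun", "tiger", "umbrella", "violin", "watermelon", "xylophone", "yacht", "zebra"]
    # Every word starts with a different letter, so every node has count 1
    # and only K = 1 can produce a result.
    assert shortest_not_substring(dict_, "fox", 1) == 'cat'
    assert shortest_not_substring(dict_, "cat", 1) == 'dog'
    assert shortest_not_substring(dict_, "catdog", 1) == 'egg'
    assert shortest_not_substring(dict_, "", 1) == 'cat'
    assert shortest_not_substring(dict_, "fox", 2) == ''
    assert shortest_not_substring(dict_, "cat", 3) == ''

Run the tests:

test()

The tests pass.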

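Complexity analysis

Time complexity: building the trie takes O(NL), where N is the number of strings in the dictionary and L is their average length. The search visits each trie node at most once, and the trie has at most NL nodes, so the traversal touches O(NL) nodes. K is only a pruning threshold and does not multiply the cost. On top of the traversal comes the cost of testing each candidate word against the given string.
Space complexity: the size of the trie is linear in the total length of the dictionary strings, so the space complexity is O(NL).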
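
The O(NL) bound on trie size can be checked on a small input. The sketch below counts trie nodes with nested dicts instead of the TrieNode class to stay short. `count_trie_nodes` is a hypothetical helper and the word list is illustrative.

```python
from typing import List


def count_trie_nodes(words: List[str]) -> int:
    """Count the non-root nodes of a trie built from words (nested dicts)."""
    root: dict = {}
    nodes = 0
    for word in words:
        node = root
        for char in word:
            if char not in node:
                node[char] = {}
                nodes += 1  # a new node is created only for an unseen prefix
            node = node[char]
    return nodes


words = ["car", "cat", "cab", "dog"]
total_chars = sum(len(w) for w in words)
print(count_trie_nodes(words), total_chars)  # 8 12
```

Shared prefixes ("c" and "ca" here) are stored once, so the node count never exceeds the total number of characters and is often smaller.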