Querying the N-th Smallest Character in a Given Range of a String (1)

📅 Last modified: 2023-12-03 14:55:37.541000 🧑 Author: Mango

Querying the N-th Smallest Character in a Given Range of a String

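When working with strings, we sometimes need the smallest (or largest) character within a range of a string, or more generally the N-th smallest (or largest) character in that range. This article walks through several common approaches with Python implementations. Throughout, left and right are 0-based inclusive indices, k is 1-based, and an empty string is returned when k is out of range.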

Brute Force

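The most direct approach is to take the characters in the given range, sort them, and return the k-th one. Sorting costs O(N log N) time, where N is the number of characters in the range. (If only the smallest character is needed, a single O(N) scan is enough.)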

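def get_smallest_char(s: str, left: int, right: int, k: int) -> str:
    # Sort the characters of s[left..right] (inclusive); k is 1-based.
    chars = sorted(s[left:right+1])
    # Reject k < 1 as well, otherwise chars[k-1] would index from the end.
    return chars[k-1] if 1 <= k <= len(chars) else ""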
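
To make the conventions concrete (0-based inclusive range, 1-based k, duplicates counted as separate ranks), here is a small worked example of the sorting approach. The function name kth_smallest_sorted and the sample string are chosen for illustration only.

```python
def kth_smallest_sorted(s: str, left: int, right: int, k: int) -> str:
    """Return the k-th smallest character of s[left..right], or "" if k is out of range."""
    chars = sorted(s[left:right + 1])
    return chars[k - 1] if 1 <= k <= len(chars) else ""


s = "geeksforgeeks"
# s[2..9] is "eksforge", which sorts to: e e f g k o r s
print(kth_smallest_sorted(s, 2, 9, 1))        # e
print(kth_smallest_sorted(s, 2, 9, 2))        # e (the duplicate takes its own rank)
print(kth_smallest_sorted(s, 2, 9, 3))        # f
print(repr(kth_smallest_sorted(s, 2, 9, 9)))  # '' because the range holds only 8 characters
```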

This becomes slow when the range is large, and the sort is repeated for every query, so more efficient algorithms are worth considering.

Heap-Based Algorithm

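We can put all the characters in the range into a min heap, pop it k-1 times, and then read the k-th smallest character from the top of the heap. If the range contains N characters, pushing them one at a time costs O(N log N), and the pops add O(k log N).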

import heapq

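def get_smallest_char(s: str, left: int, right: int, k: int) -> str:
    heap = []
    for i in range(left, right+1):
        heapq.heappush(heap, s[i])
    # Validate k before popping: after the pops the heap has shrunk,
    # so comparing its length with k would reject valid queries.
    if not 1 <= k <= len(heap):
        return ""
    for _ in range(k-1):
        heapq.heappop(heap)
    return heap[0]  # the k-th smallest character is now on top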
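
The standard library also offers heapq.nsmallest, which returns the k smallest items in ascending order and is documented as working best when k is small relative to the input. The sketch below uses it as an alternative to the manual push and pop loop; the function name kth_smallest_nsmallest is hypothetical, and the code assumes 0 <= left <= right < len(s).

```python
import heapq


def kth_smallest_nsmallest(s: str, left: int, right: int, k: int) -> str:
    """k-th smallest character of s[left..right] via heapq.nsmallest ("" if k is out of range)."""
    n = right - left + 1
    if not 1 <= k <= n:
        return ""
    # Iterate over the range lazily instead of copying the slice.
    window = (s[i] for i in range(left, right + 1))
    # nsmallest returns the k smallest items sorted ascending,
    # so the last element is the k-th smallest.
    return heapq.nsmallest(k, window)[-1]
```

For example, kth_smallest_nsmallest("geeksforgeeks", 2, 9, 3) selects from "eksforge" and returns "f".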

Bucket-Based Algorithm

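If the alphabet is small, we can replace the heap with buckets, as in counting sort. The code below assumes lowercase letters a-z: it counts how often each letter occurs in the range, then walks the buckets in alphabetical order until the running count reaches k. This takes O(N) time and needs only 26 counters of extra space, regardless of the range length.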

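def get_smallest_char(s: str, left: int, right: int, k: int) -> str:
    if k < 1:
        return ""
    buckets = [0] * 26             # one bucket per lowercase letter a-z
    for i in range(left, right+1):
        buckets[ord(s[i]) - ord('a')] += 1
    count = 0                      # characters seen so far, in alphabetical order
    for i in range(26):
        count += buckets[i]
        if count >= k:
            return chr(ord('a') + i)
    return ""                      # k exceeds the number of characters in the range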
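
The 26-bucket version only handles lowercase a-z. If the input may contain other characters, one possible generalization is to count with collections.Counter and walk the distinct characters in sorted order. This is only a sketch of how the idea might be extended; the function name kth_smallest_counter is hypothetical, and "smallest" here means smallest by code point.

```python
from collections import Counter


def kth_smallest_counter(s: str, left: int, right: int, k: int) -> str:
    """k-th smallest character of s[left..right] for arbitrary characters ("" if out of range)."""
    if k < 1:
        return ""
    counts = Counter(s[left:right + 1])
    seen = 0
    # Visit each distinct character once, in ascending code point order.
    for ch in sorted(counts):
        seen += counts[ch]
        if seen >= k:
            return ch
    return ""  # k exceeds the number of characters in the range
```

Sorting the distinct characters costs O(D log D) for D distinct characters, so this stays close to linear as long as the alphabet is small compared with the range.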

Binary Search Algorithm

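We can also binary search on the answer itself. First count the occurrences of each letter in the range. Then pick a pivot letter and compute how many characters in the range are less than or equal to it. If that count is smaller than k, the k-th smallest character must be a larger letter, so the search continues in the right half; otherwise the answer is the pivot or a smaller letter, so it continues in the left half. Because the count is cumulative, k does not need to be adjusted between steps. The counting pass still costs O(N), so for a single query this is no faster than the bucket scan.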

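def get_smallest_char(s: str, left: int, right: int, k: int) -> str:
    count_chars = [0] * 26
    for i in range(left, right+1):
        count_chars[ord(s[i])-ord('a')] += 1
    # Without this check an out-of-range k would return a wrong character.
    if not 1 <= k <= sum(count_chars):
        return ""
    l, r = 0, 25                            # search over letter indices a..z
    while l < r:
        mid = l + (r-l) // 2
        count = sum(count_chars[:mid+1])    # characters <= letter mid
        if count < k:
            l = mid + 1
        else:
            r = mid
    return chr(ord('a') + l)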
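
Binary search pays off more clearly when many ranges are queried on the same string. One way to set that up, sketched here under the assumption of lowercase a-z input, is to precompute for every prefix how many characters are less than or equal to each letter; each query then needs only a handful of table lookups. The class name RangeKthChar and its query method are hypothetical.

```python
class RangeKthChar:
    """Answers k-th smallest character queries on ranges of a fixed lowercase string."""

    def __init__(self, s: str) -> None:
        # le[i][c] = number of characters in s[:i] that are <= chr(ord('a') + c)
        self.le = [[0] * 26]
        for ch in s:
            row = self.le[-1][:]
            for c in range(ord(ch) - ord('a'), 26):
                row[c] += 1
            self.le.append(row)

    def query(self, left: int, right: int, k: int) -> str:
        lo_row, hi_row = self.le[left], self.le[right + 1]
        total = hi_row[25] - lo_row[25]      # characters in s[left..right]
        if not 1 <= k <= total:
            return ""
        l, r = 0, 25
        while l < r:
            mid = (l + r) // 2
            # Characters in the range that are <= letter mid.
            if hi_row[mid] - lo_row[mid] < k:
                l = mid + 1
            else:
                r = mid
        return chr(ord('a') + l)


index = RangeKthChar("geeksforgeeks")
print(index.query(2, 9, 3))   # f
print(index.query(0, 12, 5))  # f
```

The table takes O(26 * len(s)) time and memory to build, which is the price for making each query independent of the range length.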

Summary

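This article covered four common ways to find the smallest or the N-th smallest character in a range of a string. Their time complexity ranges from O(N) for the counting-based methods to O(N log N) for sorting and the heap, so the right choice depends on the alphabet size, the length of the range, and how many queries need to be answered.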