📌  相关文章
📜  给定范围内的最大出现次数

📅  最后修改于: 2021-04-17 12:19:33             🧑  作者: Mango

以非降序给出n个整数数组。查找给定范围内最频繁出现的值的数量。

例子:

Input : arr[] = {-5, -5, 2, 2, 2, 2, 3, 7, 7, 7}
        Query 1: start = 0, end = 9
        Query 2: start = 4, end = 9
Output : 4
         3
Explanation:  
Query 1: '2' occurred the most number of times
with a frequency of 4 within given range.
Query 2: '7' occurred the most number of times
with a frequency of 3 within given range.

段树可用于有效解决此问题。
有关分段树的实现,请参见此处
这个问题背后的关键思想是给定的数组以非降序排列,这意味着当数组按排序顺序排列时,所有出现的数字都连续放置在数组中。
可以构造一个分段树,其中每个节点将存储其各自范围[i,j]的最大计数。为此,我们将构建频率阵列,并在该阵列上调用RMQ(范围最大查询)。例如

arr[] =  {-5, -5, 2, 2, 2, 2, 3, 7, 7, 7}
freq_arr[] = {2, 2, 4, 4, 4, 4, 1, 3, 3, 3}
where, freq_arr[i] = frequency(arr[i])

现在有两种情况需要考虑,
情况1:给定范围的索引i和j处的数字值相同,即arr [i] = arr [j]。
解决这种情况非常容易。由于arr [i] = arr [j],因此这些索引之间的所有数字都是相同的(因为数组是非递减的)。因此,这种情况的答案就是简单地计算i和j(包括两者)之间的所有数字,即(j – i + 1)
例如

arr[] =  {-5, -5, 2, 2, 2, 2, 3, 7, 7, 7}
if the given query range is [3, 5], answer would 
be (5 - 3 + 1) = 3, as 2 occurs 3 times within 
given range

情况2:给定范围的索引i和j处的数字值不同,即arr [i]!= arr [j]
如果arr [i]!= arr [j],则存在一个索引k,其中arr [i] = arr [k]和arr [i]!= arr [k +1]。这可能是部分重叠的情况,其中某些特定数字出现在给定范围的最左侧,而某些出现在范围开始之前。在这里,简单地调用RMQ将导致错误的答案。
例如

arr[] =  {-5, -5, 2, 2, 2, 2, 3, 7, 7, 7}
freq_arr[] = {2, 2, 4, 4, 4, 4, 1, 3, 3, 3}
if the given query is [4, 9], calling RMQ on 
freq_arr[] will give us 4 as answer which 
is incorrect as some occurrences of 2 are 
lying outside the range. Correct answer 
is 3.

在给定范围的最右边部分可能会发生类似情况,其中某些特定数字出现在该范围内,而某些出现在该范围结束之后。
因此,对于这种情况,在给定范围内,我们必须计算直到索引i的最左边相同的数字,以及从索引j到范围末尾的最右边的相同数字。然后在索引i和j之间调用RMQ(范围最大查询),并取这三个值中的最大值。
例如

arr[] =  {-5, -5, 2, 2, 2, 2, 3, 7, 7, 7}
freq_arr[] = {2, 2, 4, 4, 4, 4, 1, 3, 3, 3}
if the given query is [4, 7], counting leftmost
same numbers i.e 2 which occurs 2 times inside 
the range and rightmost same numbers i.e. 3 
which occur only 1 time and RMQ on [6, 6] is 
1. Hence maximum would be 2.

下面是上述方法的实现

// C++ Program to find the occurrence
// of the most frequent number within
// a given range
#include 
using namespace std;
  
// A utility function to get the middle index
// from corner indexes.
int getMid(int s, int e) { return s + (e - s) / 2; }
  
/*  A recursive function to get the maximum value in
    a given range  of array indexes. The following
    are parameters for this function.
   
    st    --> Pointer to segment tree
    index --> Index of current node in the segment 
              tree. Initially 0 is passed as root is
              always at index 0
    ss & se  --> Starting and ending indexes of the 
                 segment represented by current node,
                  i.e., st[index]
    qs & qe  --> Starting and ending indexes of query 
                 range */
int RMQUtil(int* st, int ss, int se, int qs, int qe, 
                                          int index)
{
    // If segment of this node is a part of given range,
    //  then return the min of the segment
    if (qs <= ss && qe >= se)
        return st[index];
  
    // If segment of this node is outside the
    // given range
    if (se < qs || ss > qe)
        return 0;
  
    // If a part of this segment overlaps 
    // with the given range
    int mid = getMid(ss, se);
    return max(RMQUtil(st, ss, mid, qs, qe, 2 * index + 1),
               RMQUtil(st, mid + 1, se, qs, qe, 2 * index + 2));
}
  
// Return minimum of elements in range from
// index qs (query start) to
// qe (query end).  It mainly uses RMQUtil()
int RMQ(int* st, int n, int qs, int qe)
{
    // Check for erroneous input values
    if (qs < 0 || qe > n - 1 || qs > qe) {
        printf("Invalid Input");
        return -1;
    }
  
    return RMQUtil(st, 0, n - 1, qs, qe, 0);
}
  
// A recursive function that constructs Segment Tree
// for array[ss..se]. si is index of current node in
// segment tree st
int constructSTUtil(int arr[], int ss, int se, int* st, 
                                               int si)
{
    // If there is one element in array, store it in
    //  current node of segment tree and return
    if (ss == se) {
        st[si] = arr[ss];
        return arr[ss];
    }
  
    // If there are more than one elements, then 
    // recur for left and right subtrees and store 
    // the minimum of two values in this node
    int mid = getMid(ss, se);
    st[si] = max(constructSTUtil(arr, ss, mid, st, si * 2 + 1),
                 constructSTUtil(arr, mid + 1, se, st, si * 2 + 2));
    return st[si];
}
  
/* Function to construct segment tree from given 
   array. This function allocates memory for segment
   tree and calls constructSTUtil() to fill the 
   allocated memory */
int* constructST(int arr[], int n)
{
    // Allocate memory for segment tree
  
    // Height of segment tree
    int x = (int)(ceil(log2(n)));
  
    // Maximum size of segment tree
    int max_size = 2 * (int)pow(2, x) - 1;
  
    int* st = new int[max_size];
  
    // Fill the allocated memory st
    constructSTUtil(arr, 0, n - 1, st, 0);
  
    // Return the constructed segment tree
    return st;
}
  
int maximumOccurrence(int arr[], int n, int qs, int qe)
{
    // Declaring a frequency array
    int freq_arr[n + 1];
  
    // Counting frequencies of all array elements.
    unordered_map cnt;
    for (int i = 0; i < n; i++)
        cnt[arr[i]]++; 
  
    // Creating frequency array by replacing the 
    // number in array to the number of times it 
    // has appeared in the array
    for (int i = 0; i < n; i++)
        freq_arr[i] = cnt[arr[i]];
  
    // Build segment tree from this frequency array
    int* st = constructST(freq_arr, n);
  
    int maxOcc; // to store the answer
  
    // Case 1: numbers are same at the starting 
    // and ending index of the query
    if (arr[qs] == arr[qe])
        maxOcc = (qe - qs + 1);
  
    // Case 2: numbers are different
    else {
        int leftmost_same = 0, righmost_same = 0;
  
        // Partial Overlap Case of a number with some
        // occurrences lying inside the leftmost
        //  part of the range and some just before the
        // range starts
        while (qs > 0 && qs <= qe && arr[qs] == arr[qs - 1]) {
            qs++;
            leftmost_same++;
        }
  
        // Partial Overlap Case of a number with some 
        // occurrences lying inside the rightmost part of 
        // the range and some just after the range ends
        while (qe >= qs && qe < n - 1 && arr[qe] == arr[qe + 1]) {
            qe--;
            righmost_same++;
        }
        // Taking maximum of all three
        maxOcc = max({leftmost_same, righmost_same, 
                                RMQ(st, n, qs, qe)});
    }
    return maxOcc;
}
  
// Driver Code
int main()
{
    int arr[] = { -5, -5, 2, 2, 2, 2, 3, 7, 7, 7 };
    int n = sizeof(arr) / sizeof(arr[0]);
  
    int qs = 0; // Starting index of query range
    int qe = 9; // Ending index of query range
  
    // Print occurrence of most frequent number 
    // within given range
    cout << "Maximum Occurrence in range is = " 
         << maximumOccurrence(arr, n, qs, qe) << endl;
  
    qs = 4; // Starting index of query range
    qe = 9; // Ending index of query range
  
    // Print occurrence of most frequent number
    // within given range
    cout << "Maximum Occurrence in range is = " 
         << maximumOccurrence(arr, n, qs, qe) << endl;
  
    return 0;
}
输出:
Maximum Occurrence in range is = 4
Maximum Occurrence in range is = 3

进一步的优化:对于部分重叠的情况,我们必须运行一个循环以计算两边相同数字的数量。为了避免该循环并在O(1)中执行此操作,我们可以将每个数字的首次出现的索引存储在给定数组中,因此,通过进行一些预计算,我们可以在O(1)中找到所需的计数。

时间复杂度:
树构建的时间复杂度为O(n)。查询的时间复杂度为O(Log n)。