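Given an array of n integers in non-decreasing order, find the number of occurrences of the most frequent value within a given range.
Examples: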
Input : arr[] = {-5, -5, 2, 2, 2, 2, 3, 7, 7, 7}
Query 1: start = 0, end = 9
Query 2: start = 4, end = 9
Output : 4
3
Explanation:
Query 1: '2' occurred the most number of times
with a frequency of 4 within given range.
Query 2: '7' occurred the most number of times
with a frequency of 3 within given range.
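A segment tree can be used to solve this problem efficiently.
The approach builds on the standard array-based segment tree implementation.
The key idea is that the given array is in non-decreasing order, so all occurrences of any one number sit next to each other in the array.
A segment tree can be constructed in which each node stores the maximum count over its range [i, j]. To do this, we build a frequency array and run RMQ (range maximum query) on it. For example: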
arr[] = {-5, -5, 2, 2, 2, 2, 3, 7, 7, 7}
freq_arr[] = {2, 2, 4, 4, 4, 4, 1, 3, 3, 3}
where, freq_arr[i] = frequency(arr[i])
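Because equal values are contiguous, the frequency array can also be filled in a single pass over the runs, without a hash map. The following is a minimal sketch under that assumption; the function name `buildFrequencyArray` is illustrative and does not appear in the implementation further down.

```cpp
#include <cstddef>
#include <vector>

// For a non-decreasing array, equal values form contiguous runs, so each
// run can be measured and written out in one left-to-right pass.
std::vector<int> buildFrequencyArray(const std::vector<int>& arr) {
    std::vector<int> freq(arr.size());
    std::size_t runStart = 0;
    while (runStart < arr.size()) {
        // Extend runEnd to the last index holding the same value.
        std::size_t runEnd = runStart;
        while (runEnd + 1 < arr.size() && arr[runEnd + 1] == arr[runStart])
            ++runEnd;
        // Every position in the run gets the run's length.
        int runLength = static_cast<int>(runEnd - runStart + 1);
        for (std::size_t k = runStart; k <= runEnd; ++k)
            freq[k] = runLength;
        runStart = runEnd + 1;
    }
    return freq;
}
```

Called on the example array, this should reproduce the freq_arr[] shown above.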
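There are two cases to consider.
Case 1: The values at indexes i and j of the given range are the same, i.e. arr[i] = arr[j].
This case is easy. Since arr[i] = arr[j] and the array is non-decreasing, every number between these indexes is the same. The answer is therefore the number of elements between i and j, both inclusive, i.e. (j - i + 1).
For example: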
arr[] = {-5, -5, 2, 2, 2, 2, 3, 7, 7, 7}
if the given query range is [3, 5], answer would
be (5 - 3 + 1) = 3, as 2 occurs 3 times within
given range
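Case 2: The values at indexes i and j of the given range are different, i.e. arr[i] != arr[j].
If arr[i] != arr[j], there exists an index k such that arr[i] = arr[k] and arr[i] != arr[k + 1]. This can be a partial overlap: some occurrences of a number lie in the leftmost part of the given range while others lie before the range starts. Simply calling RMQ here gives a wrong answer.
For example: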
arr[] = {-5, -5, 2, 2, 2, 2, 3, 7, 7, 7}
freq_arr[] = {2, 2, 4, 4, 4, 4, 1, 3, 3, 3}
if the given query is [4, 9], calling RMQ on
freq_arr[] will give us 4 as answer which
is incorrect as some occurrences of 2 are
lying outside the range. Correct answer
is 3.
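The overcount is easy to reproduce. The sketch below compares a plain maximum over freq_arr[] with a brute-force count restricted to the range; both helper names (`naiveRangeMax`, `bruteForceMostFrequent`) are hypothetical and serve only as a reference for testing.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Naive answer: maximum of the whole-array frequencies over [l, r].
// This overcounts when a run extends beyond the query range.
int naiveRangeMax(const std::vector<int>& freq, int l, int r) {
    assert(0 <= l && l <= r && r < static_cast<int>(freq.size()));
    return *std::max_element(freq.begin() + l, freq.begin() + r + 1);
}

// Reference answer: the longest run of equal values, counted only
// inside [l, r]. Runs in O(r - l) time per query.
int bruteForceMostFrequent(const std::vector<int>& arr, int l, int r) {
    assert(0 <= l && l <= r && r < static_cast<int>(arr.size()));
    int best = 1, current = 1;
    for (int i = l + 1; i <= r; ++i) {
        current = (arr[i] == arr[i - 1]) ? current + 1 : 1;
        best = std::max(best, current);
    }
    return best;
}
```

For the query [4, 9] on the example data, the naive maximum is 4 while the brute-force count is 3, which matches the discrepancy described above.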
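A similar situation can occur at the rightmost part of the given range, where some occurrences of a number lie inside the range and others lie after the range ends.
So for this case we count, inside the given range, the run of identical numbers starting at index i and the run of identical numbers ending at index j. We then call RMQ (range maximum query) on the indexes that remain between those two runs and take the maximum of the three values.
For example: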
arr[] = {-5, -5, 2, 2, 2, 2, 3, 7, 7, 7}
freq_arr[] = {2, 2, 4, 4, 4, 4, 1, 3, 3, 3}
if the given query is [4, 7], the leftmost same
number is 2, which occurs 2 times inside the
range, and the rightmost same number is 7, which
occurs only 1 time. RMQ on the remaining part
[6, 6] is 1. Hence the maximum is 2.
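Below is the implementation of the above approach.
// C++ Program to find the occurrence
// of the most frequent number within
// a given range
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <iostream>
#include <unordered_map>
#include <vector>
using namespace std;

// A utility function to get the middle index
// from corner indexes.
int getMid(int s, int e) { return s + (e - s) / 2; }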

/* A recursive function to get the maximum value in
   a given range of array indexes. The following
   are parameters for this function.

   st      --> Pointer to segment tree
   index   --> Index of current node in the segment
               tree. Initially 0 is passed as root is
               always at index 0
   ss & se --> Starting and ending indexes of the
               segment represented by current node,
               i.e., st[index]
   qs & qe --> Starting and ending indexes of query
               range */
int RMQUtil(int* st, int ss, int se, int qs, int qe,
            int index)
{
    // If segment of this node is a part of given range,
    // then return the max of the segment
    if (qs <= ss && qe >= se)
        return st[index];

    // If segment of this node is outside the
    // given range, return 0 (frequencies are positive,
    // so 0 never wins the max)
    if (se < qs || ss > qe)
        return 0;

    // If a part of this segment overlaps
    // with the given range
    int mid = getMid(ss, se);
    return max(RMQUtil(st, ss, mid, qs, qe, 2 * index + 1),
               RMQUtil(st, mid + 1, se, qs, qe, 2 * index + 2));
}

// Return maximum of elements in range from
// index qs (query start) to
// qe (query end). It mainly uses RMQUtil()
int RMQ(int* st, int n, int qs, int qe)
{
    // Check for erroneous input values
    if (qs < 0 || qe > n - 1 || qs > qe) {
        printf("Invalid Input");
        return -1;
    }

    return RMQUtil(st, 0, n - 1, qs, qe, 0);
}

// A recursive function that constructs Segment Tree
// for array[ss..se]. si is index of current node in
// segment tree st
int constructSTUtil(int arr[], int ss, int se, int* st,
                    int si)
{
    // If there is one element in array, store it in
    // current node of segment tree and return
    if (ss == se) {
        st[si] = arr[ss];
        return arr[ss];
    }

    // If there is more than one element, then
    // recur for left and right subtrees and store
    // the maximum of two values in this node
    int mid = getMid(ss, se);
    st[si] = max(constructSTUtil(arr, ss, mid, st, si * 2 + 1),
                 constructSTUtil(arr, mid + 1, se, st, si * 2 + 2));
    return st[si];
}

/* Function to construct segment tree from given
   array. This function allocates memory for segment
   tree and calls constructSTUtil() to fill the
   allocated memory */
int* constructST(int arr[], int n)
{
    // Allocate memory for segment tree

    // Height of segment tree
    int x = (int)(ceil(log2(n)));

    // Maximum size of segment tree
    int max_size = 2 * (int)pow(2, x) - 1;

    int* st = new int[max_size];

    // Fill the allocated memory st
    constructSTUtil(arr, 0, n - 1, st, 0);

    // Return the constructed segment tree
    return st;
}

// Returns the count of the most frequent value in arr[qs..qe].
// The tree is rebuilt on every call to keep the example short;
// build it once and reuse it when answering many queries.
int maximumOccurrence(int arr[], int n, int qs, int qe)
{
    // Counting frequencies of all array elements.
    unordered_map<int, int> cnt;
    for (int i = 0; i < n; i++)
        cnt[arr[i]]++;

    // Creating frequency array by replacing each
    // number in the array with the number of times
    // it appears in the array
    vector<int> freq_arr(n);
    for (int i = 0; i < n; i++)
        freq_arr[i] = cnt[arr[i]];

    // Build segment tree from this frequency array
    int* st = constructST(freq_arr.data(), n);

    int maxOcc; // to store the answer

    // Case 1: numbers are same at the starting
    // and ending index of the query
    if (arr[qs] == arr[qe])
        maxOcc = (qe - qs + 1);

    // Case 2: numbers are different
    else {
        int leftmost_same = 0, rightmost_same = 0;

        // Partial Overlap Case of a number with some
        // occurrences lying inside the leftmost
        // part of the range and some just before the
        // range starts
        while (qs > 0 && qs <= qe && arr[qs] == arr[qs - 1]) {
            qs++;
            leftmost_same++;
        }

        // Partial Overlap Case of a number with some
        // occurrences lying inside the rightmost part of
        // the range and some just after the range ends
        while (qe >= qs && qe < n - 1 && arr[qe] == arr[qe + 1]) {
            qe--;
            rightmost_same++;
        }

        // Taking maximum of all three. The middle part is
        // empty (qs > qe) when the two boundary runs are
        // adjacent, so RMQ is called only on a valid range.
        maxOcc = max(leftmost_same, rightmost_same);
        if (qs <= qe)
            maxOcc = max(maxOcc, RMQ(st, n, qs, qe));
    }

    delete[] st; // release the segment tree
    return maxOcc;
}

// Driver Code
int main()
{
    int arr[] = { -5, -5, 2, 2, 2, 2, 3, 7, 7, 7 };
    int n = sizeof(arr) / sizeof(arr[0]);

    int qs = 0; // Starting index of query range
    int qe = 9; // Ending index of query range

    // Print occurrence of most frequent number
    // within given range
    cout << "Maximum Occurrence in range is = "
         << maximumOccurrence(arr, n, qs, qe) << endl;

    qs = 4; // Starting index of query range
    qe = 9; // Ending index of query range

    // Print occurrence of most frequent number
    // within given range
    cout << "Maximum Occurrence in range is = "
         << maximumOccurrence(arr, n, qs, qe) << endl;

    return 0;
}
Output:
Maximum Occurrence in range is = 4
Maximum Occurrence in range is = 3
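Further optimization: In the partial overlap case we run a loop to count the identical numbers on both sides. To avoid that loop and do the counting in O(1), we can store the index of the first occurrence of each number in the given array; with this precomputation the required counts can be found in O(1).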
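One possible shape for that precomputation is sketched below. It is an assumption rather than the article's own code: it stores, for every index, where its run starts and also where it ends (the text mentions only first occurrences), and the names `RunIndex`, `QuerySplit` and `split` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// The three pieces of a query after splitting off the boundary runs.
struct QuerySplit {
    int leftCount;   // occurrences of arr[l] inside [l, r]
    int rightCount;  // occurrences of arr[r] inside [l, r]
    int midStart;    // first index of the middle part
    int midEnd;      // last index of the middle part (empty if midStart > midEnd)
};

struct RunIndex {
    std::vector<int> first;  // first[i]: index where the run containing i starts
    std::vector<int> last;   // last[i]: index where the run containing i ends

    // O(n) precomputation over a non-decreasing array.
    explicit RunIndex(const std::vector<int>& arr)
        : first(arr.size()), last(arr.size()) {
        int n = static_cast<int>(arr.size());
        for (int i = 0; i < n; ++i)
            first[i] = (i > 0 && arr[i] == arr[i - 1]) ? first[i - 1] : i;
        for (int i = n - 1; i >= 0; --i)
            last[i] = (i + 1 < n && arr[i] == arr[i + 1]) ? last[i + 1] : i;
    }

    // O(1) per query: clip the boundary runs to [l, r].
    QuerySplit split(int l, int r) const {
        assert(0 <= l && l <= r && r < static_cast<int>(first.size()));
        int leftEnd = std::min(last[l], r);
        int rightStart = std::max(first[r], l);
        return {leftEnd - l + 1, r - rightStart + 1, leftEnd + 1, rightStart - 1};
    }
};
```

The answer to a query would then be the maximum of leftCount, rightCount and, when the middle part is non-empty, the RMQ over freq_arr on [midStart, midEnd]. The middle part contains only complete runs, so the whole-array frequencies are exact there. When arr[l] = arr[r], both counts equal r - l + 1 and the middle part is empty, so Case 1 needs no separate branch in this sketch.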
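Time complexity:
Building the segment tree takes O(n), and each range maximum query takes O(log n). The boundary-counting loops add time proportional to the length of the boundary runs unless the optimization above is applied. Note that maximumOccurrence() as written rebuilds the tree on every call; to get O(log n) per query, build the tree once and reuse it across queries.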