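Background:
A hash table (or dictionary) must support three fundamental operations:
- Lookup(key): return true if key is in the table, else false
- Insert(key): add the item "key" to the table if it is not already present
- Delete(key): remove "key" from the table
Collisions are very likely even if we have a big table to store keys. By the birthday paradox, with only 23 people the probability that two of them share a birthday is about 50%! There are two general strategies for resolving hash collisions:
- Closed addressing or chaining: store colliding elements in an auxiliary data structure such as a linked list or a binary search tree.
- Open addressing: allow elements to overflow out of their target bucket and into other slots.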
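The birthday figure quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming 365 equally likely birthdays; the class and method names are made up for illustration. For hashing, read "days" as buckets and "people" as keys.

```java
// Sketch: probability that at least two of `people` land on the same one of
// `days` equally likely slots (an idealized model, assumed for illustration).
class BirthdayParadox {
    static double collisionProbability(int people, int days) {
        double allDistinct = 1.0;
        for (int i = 0; i < people; i++) {
            // the (i+1)-th person must avoid the i slots already taken
            allDistinct *= (double) (days - i) / days;
        }
        return 1.0 - allDistinct;
    }
}
```

Calling `BirthdayParadox.collisionProbability(23, 365)` gives a value just above 0.5, while 22 people stay just below it.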
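Although the above solutions provide an expected lookup cost of O(1), the expected worst-case cost of a lookup is Ω(log n) in open addressing (with linear probing) and Θ(log n / log log n) in simple chaining (source: Stanford lecture notes). To close the gap between expected time and worst-case expected time, two ideas are used:
- Multiple-choice hashing: give each element multiple choices for positions where it can reside in the hash table
- Relocation hashing: allow elements in the hash table to move after being placed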
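Cuckoo hashing:
Cuckoo hashing applies the ideas of multiple choice and relocation together and guarantees O(1) worst-case lookup time!
- Multiple choice: we give a key two choices, h1(key) and h2(key), for where it can reside.
- Relocation: it may happen that both h1(key) and h2(key) are occupied. This is resolved by imitating the cuckoo bird, whose chick pushes the other eggs or young out of the nest when it hatches. Analogously, inserting a new key into a cuckoo hash table may push an older key to a different location. This leaves us with the problem of re-placing the older key.
  - If the alternate position of the older key is vacant, there is no problem.
  - Otherwise, the older key displaces another key. This continues until the procedure finds a vacant position or enters a cycle. In case of a cycle, new hash functions are chosen and the whole data structure is "rehashed". Multiple rehashes might be necessary before cuckoo hashing succeeds.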
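The relocation steps above can be sketched as a simple loop. This is a minimal sketch rather than this article's implementation: the class name, the `maxSteps` bound used to detect a probable cycle, and the choice to return the key left without a slot are all assumptions made for illustration. It uses the hash functions key % size and (key / size) % size from the illustration further down and assumes non-negative keys.

```java
import java.util.Arrays;

// Sketch of cuckoo insertion with displacement (names are hypothetical).
class CuckooInsertSketch {
    static final int EMPTY = Integer.MIN_VALUE; // sentinel for a vacant slot
    final int size;
    final int[][] table; // table[0] is indexed by h1, table[1] by h2

    CuckooInsertSketch(int size) {
        this.size = size;
        this.table = new int[2][size];
        for (int[] row : table) Arrays.fill(row, EMPTY);
    }

    // which == 0 selects h1, which == 1 selects h2 (non-negative keys assumed)
    int hash(int which, int key) {
        return which == 0 ? key % size : (key / size) % size;
    }

    boolean contains(int key) {
        return table[0][hash(0, key)] == key || table[1][hash(1, key)] == key;
    }

    // Returns EMPTY when every key has a slot. Otherwise returns the key that
    // was left without a slot after maxSteps placements; the caller is then
    // expected to choose new hash functions and rehash.
    int insert(int key, int maxSteps) {
        if (contains(key)) return EMPTY;
        int current = key;
        int which = 0; // always start in the first table
        for (int step = 0; step < maxSteps; step++) {
            int slot = hash(which, current);
            int evicted = table[which][slot];
            table[which][slot] = current;      // take the slot
            if (evicted == EMPTY) return EMPTY; // it was vacant: done
            current = evicted;                 // re-place the displaced key
            which = 1 - which;                 // in its alternate table
        }
        return current;
    }
}
```

A caller that receives a value other than `EMPTY` would pick new hash functions and reinsert every stored key together with the returned one; that rehashing step is left out of this sketch.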
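Insertion is expected O(1) (amortized) with high probability, even considering the possibility of rehashing, as long as the number of keys is kept below half of the capacity of the hash table, i.e., the load factor is below 50%.
Deletion is O(1) worst case, as it requires inspecting just two locations in the hash table.
Illustration:
Input:
{20, 50, 53, 75, 100, 67, 105, 3, 36, 39}
Hash functions: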
h1(key) = key%11
h2(key) = (key/11)%11
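To make the two functions concrete, the sketch below evaluates them for the input keys. The division in h2 is integer division. The class and method names are hypothetical, and non-negative keys are assumed (Java's % can return a negative result for negative operands).

```java
// Sketch: the two hash functions of the illustration for a table of size 11.
class CuckooHashFunctions {
    static final int TABLE_SIZE = 11;

    static int h1(int key) { return key % TABLE_SIZE; }

    static int h2(int key) { return (key / TABLE_SIZE) % TABLE_SIZE; } // integer division

    // One line per key, e.g. "20: h1=9, h2=1"
    static String describe(int[] keys) {
        StringBuilder sb = new StringBuilder();
        for (int key : keys) {
            sb.append(key).append(": h1=").append(h1(key))
              .append(", h2=").append(h2(key)).append('\n');
        }
        return sb.toString();
    }
}
```

For the input above this gives 20 → (9, 1), 50 → (6, 4), 53 → (9, 4), 75 → (9, 6), 100 → (1, 9), 67 → (1, 6), 105 → (6, 9), 3 → (3, 0), 36 → (3, 3), 39 → (6, 3), written as (h1, h2).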
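Let's start by inserting 20 at its possible position in the first table, determined by h1(20) = 9:
Next: 50. h1(50) = 6.
Next: 53. h1(53) = 9, but 20 is already at 9. We place 53 in table 1 and move 20 to table 2 at h2(20) = 1.
Next: 75. h1(75) = 9, but 53 is already at 9. We place 75 in table 1 and move 53 to table 2 at h2(53) = 4.
Next: 100. h1(100) = 1.
Next: 67. h1(67) = 1, but 100 is already at 1. We place 67 in table 1 and move 100 to table 2 at h2(100) = 9.
Next: 105. h1(105) = 6, but 50 is already at 6. We place 105 in table 1 and move 50 to table 2 at h2(50) = 4. That displaces 53, which goes back to table 1 at h1(53) = 9. That in turn displaces 75, which moves to table 2 at h2(75) = 6.
Next: 3. h1(3) = 3.
Next: 36. h1(36) = 3, so 3 is displaced and moves to table 2 at h2(3) = 0.
Next: 39. h1(39) = 6, which starts a chain of displacements: h2(105) = 9, h1(100) = 1, h2(67) = 6, h1(75) = 9, h2(53) = 4, h1(50) = 6, h2(39) = 3.
Here, the new key 39 is itself displaced later in the chain of recursive calls made to re-place 105, the key it originally displaced.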
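Once the keys are placed, lookup and deletion only ever inspect two slots: h1(key) in table 1 and h2(key) in table 2. The sketch below hard-codes the final tables reached at the end of the walkthrough above to show this; the class and method names are hypothetical, and non-negative keys are assumed.

```java
// Sketch: constant-time lookup and delete on the tables from the walkthrough.
class CuckooLookupSketch {
    static final int EMPTY = Integer.MIN_VALUE; // sentinel for a vacant slot
    static final int SIZE = 11;
    private static final int E = EMPTY;

    // Final state after inserting 20, 50, 53, 75, 100, 67, 105, 3, 36, 39
    final int[][] table = {
        {E, 100, E, 36, E, E, 50, E, E, 75, E},
        {3, 20, E, 39, 53, E, 67, E, E, 105, E}
    };

    static int h1(int key) { return key % SIZE; }
    static int h2(int key) { return (key / SIZE) % SIZE; }

    // At most two array reads, regardless of how many keys are stored
    boolean lookup(int key) {
        return table[0][h1(key)] == key || table[1][h2(key)] == key;
    }

    // Clears whichever of the two candidate slots holds the key
    boolean delete(int key) {
        if (table[0][h1(key)] == key) { table[0][h1(key)] = EMPTY; return true; }
        if (table[1][h2(key)] == key) { table[1][h2(key)] = EMPTY; return true; }
        return false;
    }
}
```

For example, `lookup(39)` checks table 1 at index 6 (which holds 50) and table 2 at index 3 (which holds 39) and returns true, while `lookup(6)` finds 50 and 3 in its two candidate slots and returns false.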
Implementation:
Below is the implementation of cuckoo hashing.
C++
// C++ program to demonstrate working of Cuckoo
// hashing.
#include <climits> // INT_MIN
// upper bound on number of elements in our set
#define MAXN 11
// choices for position
#define ver 2
// Auxiliary space bounded by a small multiple
// of MAXN, minimizing wastage
int hashtable[ver][MAXN];
// Array to store possible positions for a key
int pos[ver];
/* function to fill hash table with dummy value
 * dummy value: INT_MIN
 * number of hashtables: ver */
void initTable()
{
    for (int j = 0; j < MAXN; j++)
        for (int i = 0; i < ver; i++)
            hashtable[i][j] = INT_MIN;
}
// ... (the rest of the C++ listing is cut off in the source;
// the Java listing below is complete)
Java
// Java program to demonstrate working of
// Cuckoo hashing.
import java.util.*;

class GFG
{
    // upper bound on number of elements in our set
    static int MAXN = 11;

    // choices for position
    static int ver = 2;

    // Auxiliary space bounded by a small multiple
    // of MAXN, minimizing wastage
    static int [][]hashtable = new int[ver][MAXN];

    // Array to store possible positions for a key
    static int []pos = new int[ver];

    /* function to fill hash table with dummy value
     * dummy value: Integer.MIN_VALUE
     * number of hashtables: ver */
    static void initTable()
    {
        for (int j = 0; j < MAXN; j++)
            for (int i = 0; i < ver; i++)
                hashtable[i][j] = Integer.MIN_VALUE;
    }

    /* return hashed value for a key
     * function: ID of hash function according to which
       key has to be hashed
     * key: item to be hashed */
    static int hash(int function, int key)
    {
        switch (function)
        {
            case 1: return key % MAXN;
            case 2: return (key / MAXN) % MAXN;
        }
        return Integer.MIN_VALUE;
    }

    /* function to place a key in one of its possible positions
     * tableID: table in which key has to be placed, also equal
       to function according to which key must be hashed
     * cnt: number of times function has already been called
       in order to place the first input key
     * n: maximum number of times function can be recursively
       called before stopping and declaring presence of cycle */
    static void place(int key, int tableID, int cnt, int n)
    {
        /* if function has been recursively called max number
           of times, stop and declare cycle. Rehash. */
        if (cnt == n)
        {
            System.out.printf("%d unpositioned\n", key);
            System.out.printf("Cycle present. REHASH.\n");
            return;
        }

        /* calculate and store possible positions for the key.
         * check if key already present at any of the positions.
           If YES, return. */
        for (int i = 0; i < ver; i++)
        {
            pos[i] = hash(i + 1, key);
            if (hashtable[i][pos[i]] == key)
                return;
        }

        /* check if another key is already present at the
           position for the new key in the table
         * If YES: place the new key in its position
         * and place the older key in an alternate position
           for it in the next table */
        if (hashtable[tableID][pos[tableID]] != Integer.MIN_VALUE)
        {
            int dis = hashtable[tableID][pos[tableID]];
            hashtable[tableID][pos[tableID]] = key;
            place(dis, (tableID + 1) % ver, cnt + 1, n);
        }
        else // else: place the new key in its position
            hashtable[tableID][pos[tableID]] = key;
    }

    /* function to print hash table contents */
    static void printTable()
    {
        System.out.printf("Final hash tables:\n");

        for (int i = 0; i < ver; i++, System.out.printf("\n"))
            for (int j = 0; j < MAXN; j++)
                if (hashtable[i][j] == Integer.MIN_VALUE)
                    System.out.printf("- ");
                else
                    System.out.printf("%d ", hashtable[i][j]);

        System.out.printf("\n");
    }

    /* function for Cuckoo-hashing keys
     * keys[]: input array of keys
     * n: size of input array */
    static void cuckoo(int keys[], int n)
    {
        // initialize hash tables to a dummy value
        // (Integer.MIN_VALUE) indicating empty position
        initTable();

        // start with placing every key at its position in
        // the first hash table according to first hash
        // function
        for (int i = 0, cnt = 0; i < n; i++, cnt = 0)
            place(keys[i], 0, cnt, n);

        // print the final hash tables
        printTable();
    }

    // Driver Code
    public static void main(String[] args)
    {
        /* following array doesn't have any cycles and
           hence all keys will be inserted without any
           rehashing */
        int keys_1[] = {20, 50, 53, 75, 100,
                        67, 105, 3, 36, 39};

        int n = keys_1.length;

        cuckoo(keys_1, n);

        /* following array has a cycle and hence we will
           have to rehash to position every key */
        int keys_2[] = {20, 50, 53, 75, 100,
                        67, 105, 3, 36, 39, 6};

        int m = keys_2.length;

        cuckoo(keys_2, m);
    }
}

// This code is contributed by Princi Singh
C#
// C# program to demonstrate working of
// Cuckoo hashing.
using System;

class GFG
{
    // upper bound on number of
    // elements in our set
    static int MAXN = 11;

    // choices for position
    static int ver = 2;

    // Auxiliary space bounded by a small
    // multiple of MAXN, minimizing wastage
    static int [,]hashtable = new int[ver, MAXN];

    // Array to store
    // possible positions for a key
    static int []pos = new int[ver];

    /* function to fill hash table
       with dummy value
     * dummy value: int.MinValue
     * number of hashtables: ver */
    static void initTable()
    {
        for (int j = 0; j < MAXN; j++)
            for (int i = 0; i < ver; i++)
                hashtable[i, j] = int.MinValue;
    }

    /* return hashed value for a key
     * function: ID of hash function
       according to which key has to be hashed
     * key: item to be hashed */
    static int hash(int function, int key)
    {
        switch (function)
        {
            case 1: return key % MAXN;
            case 2: return (key / MAXN) % MAXN;
        }
        return int.MinValue;
    }

    /* function to place a key in one of
       its possible positions
     * tableID: table in which key
       has to be placed, also equal to function
       according to which key must be hashed
     * cnt: number of times function has already
       been called in order to place the first input key
     * n: maximum number of times function
       can be recursively called before stopping and
       declaring presence of cycle */
    static void place(int key, int tableID,
                      int cnt, int n)
    {
        /* if function has been recursively
           called max number of times,
           stop and declare cycle. Rehash. */
        if (cnt == n)
        {
            Console.Write("{0} unpositioned\n", key);
            Console.Write("Cycle present. REHASH.\n");
            return;
        }

        /* calculate and store possible positions
         * for the key. Check if key already present
           at any of the positions. If YES, return. */
        for (int i = 0; i < ver; i++)
        {
            pos[i] = hash(i + 1, key);
            if (hashtable[i, pos[i]] == key)
                return;
        }

        /* check if another key is already present
           at the position for the new key in the table
         * If YES: place the new key in its position
         * and place the older key in an alternate position
           for it in the next table */
        if (hashtable[tableID, pos[tableID]] != int.MinValue)
        {
            int dis = hashtable[tableID, pos[tableID]];
            hashtable[tableID, pos[tableID]] = key;
            place(dis, (tableID + 1) % ver, cnt + 1, n);
        }
        else // else: place the new key in its position
            hashtable[tableID, pos[tableID]] = key;
    }

    /* function to print hash table contents */
    static void printTable()
    {
        Console.Write("Final hash tables:\n");

        for (int i = 0; i < ver; i++, Console.Write("\n"))
            for (int j = 0; j < MAXN; j++)
                if (hashtable[i, j] == int.MinValue)
                    Console.Write("- ");
                else
                    Console.Write("{0} ", hashtable[i, j]);

        Console.Write("\n");
    }

    /* function for Cuckoo-hashing keys
     * keys[]: input array of keys
     * n: size of input array */
    static void cuckoo(int []keys, int n)
    {
        // initialize hash tables to a
        // dummy value (int.MinValue)
        // indicating empty position
        initTable();

        // start with placing every key
        // at its position in the first
        // hash table according to first
        // hash function
        for (int i = 0, cnt = 0; i < n; i++, cnt = 0)
            place(keys[i], 0, cnt, n);

        // print the final hash tables
        printTable();
    }

    // Driver Code
    public static void Main(String[] args)
    {
        /* following array doesn't have
           any cycles and hence all keys
           will be inserted without any rehashing */
        int []keys_1 = {20, 50, 53, 75, 100,
                        67, 105, 3, 36, 39};

        int n = keys_1.Length;

        cuckoo(keys_1, n);

        /* following array has a cycle and
           hence we will have to rehash to
           position every key */
        int []keys_2 = {20, 50, 53, 75, 100,
                        67, 105, 3, 36, 39, 6};

        int m = keys_2.Length;

        cuckoo(keys_2, m);
    }
}

// This code is contributed by PrinciRaj1992
Output:
Final hash tables:
- 100 - 36 - - 50 - - 75 -
3 20 - 39 53 - 67 - - 105 -
105 unpositioned
Cycle present. REHASH.
Final hash tables:
- 67 - 3 - - 39 - - 53 -
6 20 - 36 50 - 75 - - 100 -
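Generalizations of cuckoo hashing that use more than 2 alternative hash functions can be expected to utilize a larger part of the capacity of the hash table efficiently, while sacrificing some lookup and insertion speed. Example: with 3 hash functions, it is safe to load the table to 91% and still operate within the expected bounds.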
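The cost side of that trade-off is easy to see in code: with d hash functions, a lookup has to inspect d candidate slots instead of 2. The following is a minimal sketch under stated assumptions, not a full d-ary cuckoo table. The class name, the multiplier constants, and the hash family itself are hypothetical choices made for illustration, and placement here only uses vacant candidate slots (no eviction).

```java
import java.util.Arrays;

// Sketch: d candidate slots per key, one per table (all names hypothetical).
class DAryCuckooSketch {
    static final int EMPTY = Integer.MIN_VALUE; // sentinel for a vacant slot
    // Arbitrary constants for an illustrative hash family (an assumption)
    static final long[] MULTIPLIERS = {1L, 31L, 131L, 1313L};

    final int d;
    final int size;
    final int[][] table;
    int count = 0;

    DAryCuckooSketch(int d, int size) {
        if (d < 2 || d > MULTIPLIERS.length) {
            throw new IllegalArgumentException("d must be between 2 and " + MULTIPLIERS.length);
        }
        this.d = d;
        this.size = size;
        this.table = new int[d][size];
        for (int[] row : table) Arrays.fill(row, EMPTY);
    }

    // i-th illustrative hash function; floorMod keeps the index non-negative
    int hash(int i, int key) {
        return (int) Math.floorMod((long) key * MULTIPLIERS[i] + i, (long) size);
    }

    // Lookup inspects exactly d slots in the worst case
    boolean contains(int key) {
        for (int i = 0; i < d; i++) {
            if (table[i][hash(i, key)] == key) return true;
        }
        return false;
    }

    // Places the key in its first vacant candidate slot; returns false when
    // all d candidates are occupied (a full table would evict at this point)
    boolean placeIfVacant(int key) {
        if (contains(key)) return true;
        for (int i = 0; i < d; i++) {
            int slot = hash(i, key);
            if (table[i][slot] == EMPTY) {
                table[i][slot] = key;
                count++;
                return true;
            }
        }
        return false;
    }

    double loadFactor() {
        return (double) count / (d * size);
    }
}
```

More candidate slots mean a key is more likely to find a vacancy at a high load factor, at the price of up to d probes per lookup.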