Katz 中心性(中心性度量)
在图论中,节点的 Katz 中心性是衡量网络中心性的指标。它由 Leo Katz 于 1953 年引入,用于衡量一个演员(或节点)在社交网络中的相对影响程度。与仅考虑一对参与者之间的最短路径(测地线)的典型中心性度量不同,Katz 中心性度量通过考虑一对参与者之间的步行总数来衡量影响。
它类似于 Google 的 PageRank 和特征向量中心性。
测量 Katz 中心性
Katz 中心性通过测量直接邻居(一级节点)以及网络中通过这些直接邻居连接到所考虑节点的所有其他节点的数量来计算网络中节点的相对影响。然而,与远邻建立的连接会受到衰减因子的影响 .一对节点之间的每条路径或连接都分配了一个权重,由下式确定和节点之间的距离为 .
例如,在右图中,假设正在测量 John 的中心性,并且 .分配给连接 John 与其直接邻居 Jane 和 Bob 的每个链接的权重将是 .由于 Jose 通过 Bob 间接连接到 John,分配给该连接(由两个链接组成)的权重将为 .同样,通过 Aziz 和 Jane 分配给 Agneta 和 John 之间的连接的权重将是并且通过 Diego、Jose 和 Bob 分配给 Agneta 和 John 之间的连接的权重将是 .
数学公式
令 A 为所考虑网络的邻接矩阵。元素的 A 是变量,如果节点 i 连接到节点 j,则取值为 1,否则取值为 0。 A 的幂表示两个节点之间通过中介存在(或不存在)链接。例如,在矩阵 , 如果元素 ,它表示节点 2 和节点 12 通过节点 2 的一些一级和二级邻居连接。如果表示节点 i 的 Katz 中心性,然后在数学上:
请注意,上面的定义使用了元素在 location 邻接矩阵被提升到权力 (IE ) 反映了总人数节点之间的度数连接和 .衰减因子的值必须选择它使其小于邻接矩阵 A 的最大特征值的绝对值的倒数。在这种情况下,可以使用以下表达式来计算 Katz 中心性:
这里是单位矩阵, 是一个大小为 n 的单位向量(n 是节点数),由 1 组成。 表示 A 的转置矩阵和 ( 表示项的矩阵求逆 ( )。
以下是计算图及其各个节点的 Katz 中心性的代码。
def katz_centrality(G, alpha=0.1, beta=1.0,
max_iter=1000, tol=1.0e-6,
nstart=None, normalized=True,
weight = 'weight'):
"""Compute the Katz centrality for the nodes
of the graph G.
Katz centrality computes the centrality for a node
based on the centrality of its neighbors. It is a
generalization of the eigenvector centrality. The
Katz centrality for node `i` is
.. math::
x_i = \alpha \sum_{j} A_{ij} x_j + \beta,
where `A` is the adjacency matrix of the graph G
with eigenvalues `\lambda`.
The parameter `\beta` controls the initial centrality and
.. math::
\alpha < \frac{1}{\lambda_{max}}.
Katz centrality computes the relative influence of
a node within a network by measuring the number of
the immediate neighbors (first degree nodes) and
also all other nodes in the network that connect
to the node under consideration through these
immediate neighbors.
Extra weight can be provided to immediate neighbors
through the parameter :math:`\beta`. Connections
made with distant neighbors are, however, penalized
by an attenuation factor `\alpha` which should be
strictly less than the inverse largest eigenvalue
of the adjacency matrix in order for the Katz
centrality to be computed correctly.
Parameters
----------
G : graph
A NetworkX graph
alpha : float
Attenuation factor
beta : scalar or dictionary, optional (default=1.0)
Weight attributed to the immediate neighborhood.
If not a scalar, the dictionary must have an value
for every node.
max_iter : integer, optional (default=1000)
Maximum number of iterations in power method.
tol : float, optional (default=1.0e-6)
Error tolerance used to check convergence in
power method iteration.
nstart : dictionary, optional
Starting value of Katz iteration for each node.
normalized : bool, optional (default=True)
If True normalize the resulting values.
weight : None or string, optional
If None, all edge weights are considered equal.
Otherwise holds the name of the edge attribute
used as weight.
Returns
-------
nodes : dictionary
Dictionary of nodes with Katz centrality as
the value.
Raises
------
NetworkXError
If the parameter `beta` is not a scalar but
lacks a value for at least one node
Notes
-----
This algorithm it uses the power method to find
the eigenvector corresponding to the largest
eigenvalue of the adjacency matrix of G.
The constant alpha should be strictly less than
the inverse of largest eigenvalue of the adjacency
matrix for the algorithm to converge.
The iteration will stop after max_iter iterations
or an error tolerance ofnumber_of_nodes(G)*tol
has been reached.
When `\alpha = 1/\lambda_{max}` and `\beta=0`,
Katz centrality is the same as eigenvector centrality.
For directed graphs this finds "left" eigenvectors
which corresponds to the in-edges in the graph.
For out-edges Katz centrality first reverse the
graph with G.reverse().
"""
from math import sqrt
if len(G) == 0:
return {}
nnodes = G.number_of_nodes()
if nstart is None:
# choose starting vector with entries of 0
x = dict([(n,0) for n in G])
else:
x = nstart
try:
b = dict.fromkeys(G,float(beta))
except (TypeError,ValueError,AttributeError):
b = beta
if set(beta) != set(G):
raise nx.NetworkXError('beta dictionary '
'must have a value for every node')
# make up to max_iter iterations
for i in range(max_iter):
xlast = x
x = dict.fromkeys(xlast, 0)
# do the multiplication y^T = Alpha * x^T A - Beta
for n in x:
for nbr in G[n]:
x[nbr] += xlast[n] * G[n][nbr].get(weight, 1)
for n in x:
x[n] = alpha*x[n] + b[n]
# check convergence
err = sum([abs(x[n]-xlast[n]) for n in x])
if err < nnodes*tol:
if normalized:
# normalize vector
try:
s = 1.0/sqrt(sum(v**2 for v in x.values()))
# this should never be zero?
except ZeroDivisionError:
s = 1.0
else:
s = 1
for n in x:
x[n] *= s
return x
raise nx.NetworkXError('Power iteration failed to converge in '
'%d iterations.' % max_iter)
上面的函数是使用 networkx 库调用的,一旦安装了该库,您最终可以使用它,并且必须用Python编写以下代码来实现节点的 katz 中心性。
>>> import networkx as nx
>>> import math
>>> G = nx.path_graph(4)
>>> phi = (1+math.sqrt(5))/2.0 # largest eigenvalue of adj matrix
>>> centrality = nx.katz_centrality(G,1/phi-0.01)
>>> for n,c in sorted(centrality.items()):
... print("%d %0.2f"%(n,c))
上述代码的输出是:
0 0.37
1 0.60
2 0.60
3 0.37
上面的结果是一个字典,描述了每个节点的 katz 中心性值。以上是我关于中心性度量的系列文章的延伸。继续联网!!!
参考
http://networkx.readthedocs.io/en/networkx-1.10/index.html
https://en.wikipedia.org/wiki/Katz_centrality