Kullback–Leibler divergence is the standard measure of error when a true probability distribution p is approximated by a probability distribution q. Its efficient computation is essential in many tasks, such as in approximate computation or as a measure of error when learning a probability distribution. For high-dimensional distributions, such as those associated with Bayesian networks, a direct computation can be infeasible. This paper considers the problem of efficiently computing the Kullback–Leibler divergence between two probability distributions, each encoded by a different Bayesian network, possibly with different structures. The approach is based on an auxiliary deletion algorithm to compute the necessary marginal distributions, combined with a cache of operations on potentials so that past computations can be reused whenever needed. The algorithms are tested on Bayesian networks from the bnlearn repository. Computer code in Python is provided, built on pgmpy, a library for working with probabilistic graphical models.
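To fix the quantity being computed, the sketch below evaluates KL(p || q) by brute-force enumeration of the joint distributions of two tiny Bayesian networks with different structures (in network p, A is a parent of B; in network q, A and B are independent). This is only a minimal illustration of the definition, not the paper's cached deletion algorithm, and the two networks and their probability values are made up for the example; full enumeration is exactly the direct computation that becomes infeasible in high dimensions.

```python
import math
from itertools import product

def joint_prob_p(a, b):
    """Joint distribution of network p (structure A -> B), illustrative CPTs."""
    p_a = [0.7, 0.3]              # P(A)
    p_b_given_a = [[0.8, 0.2],    # P(B | A=0)
                   [0.3, 0.7]]    # P(B | A=1)
    return p_a[a] * p_b_given_a[a][b]

def joint_prob_q(a, b):
    """Joint distribution of network q (A and B independent), illustrative CPTs."""
    q_a = [0.7, 0.3]              # P(A)
    q_b = [0.65, 0.35]            # P(B)
    return q_a[a] * q_b[b]

def kl_divergence(p_fn, q_fn, states):
    """KL(p || q) = sum_x p(x) * log(p(x) / q(x)), by full enumeration."""
    total = 0.0
    for x in product(*states):
        p = p_fn(*x)
        if p > 0.0:               # terms with p(x) = 0 contribute nothing
            total += p * math.log(p / q_fn(*x))
    return total

kl = kl_divergence(joint_prob_p, joint_prob_q, [(0, 1), (0, 1)])
print(f"KL(p || q) = {kl:.4f}")  # prints KL(p || q) = 0.1139
```

The divergence is nonnegative and vanishes only when the two joints coincide, which is why it serves as an error measure; the paper's contribution is to obtain the required marginals without ever materializing the full joint.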