NTNU
Bayesian Networks in Reliability: A primer
Helge Langseth
Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim, Norway
MMR 2004 – p.1/13

Outline
  Basics of the Bayesian network framework
    Definition
    Representation
    Calculation scheme
  Apparent shortcomings:
    Sparser representations
    Knowledge acquisition
    Continuous variables
  Some cute features
    Causal models
    Model estimation
  Reliability applications

A simple example: “Car start”
  B: Battery
  F: Fuel level
  H: Head-lights
  G: Fuel gauge
  E: Engine turns
  C: Car starts

Conditional probability table P(E | pa(E)):
  E      B = empty   B ≠ empty
  yes    .01         .97
  no     .99         .03

pa(E) = {B}
nd(E) = {H, G, F, B}
Local Markov property: X ⊥⊥ nd(X) \ pa(X) | pa(X), e.g., E ⊥⊥ {H, G, F} | B
Other d-separation rules: Pearl (1988)

P(B, F, H, G, E, C) = P(B) P(F) P(H | B) P(G | F) · P(E | B) P(C | E, F)
Markov properties ⇔ Factorization property
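
To make the benefit of the factorization concrete, the sketch below counts free parameters for the “Car start” network and compares that with an unrestricted joint table. It assumes every variable is binary, which the slide does not state; the names PARENTS, free_parameters and full_joint_parameters are illustrative only. The single CPT that the slide does give, P(E | B), is included and checked for normalization.

```python
# Parent sets read off the factorization
# P(B) P(F) P(H|B) P(G|F) P(E|B) P(C|E,F).
PARENTS = {
    "B": [],
    "F": [],
    "H": ["B"],
    "G": ["F"],
    "E": ["B"],
    "C": ["E", "F"],
}


def free_parameters(parents, n_states=2):
    """Count free CPT parameters, assuming every node has n_states states."""
    return sum((n_states - 1) * n_states ** len(pa) for pa in parents.values())


def full_joint_parameters(n_nodes, n_states=2):
    """Free parameters of an unrestricted joint table over n_nodes variables."""
    return n_states ** n_nodes - 1


# The one CPT given on the slide: P(E | B), one distribution per battery state.
P_E_GIVEN_B = {
    "empty": {"yes": 0.01, "no": 0.99},
    "not empty": {"yes": 0.97, "no": 0.03},
}

factorized = free_parameters(PARENTS)
unrestricted = full_joint_parameters(len(PARENTS))
print(factorized, unrestricted)
```

Under the binary assumption the factorized model needs far fewer numbers than the full joint, and the gap widens quickly as variables are added.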

They crop up everywhere

Mixture models
  M latent, multinomial
  [Figure: nodes M and Y]
  p(y | M = m)

Factor analyzers
  X latent, N(0, I)
  [Figure: nodes X1, X2 and Y1, Y2, Y3; linear regression]
  Y | x ∼ N(µ + Lx, Ψ), Ψ diagonal

Mixture of factor analyzers
  X latent, N(0, I); M latent, multinomial
  [Figure: nodes X1, X2, M and Y1, Y2, Y3; linear regression given M = m]
  Y | {x, M = m} ∼ N(µ_m + L_m x, Ψ_m), Ψ_m diagonal
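
Reading the mixture of factor analyzers as a Bayesian network suggests a simple way to simulate from it: sample the parentless nodes first, then the children. The sketch below is a minimal illustration of that ancestral sampling. All numerical values (mixture weights, means, loadings, noise variances) and the function name sample_mfa are hypothetical and not taken from the slides.

```python
import random


def sample_mfa(weights, mus, loadings, psis, rng):
    """Draw one (m, x, y) from a mixture of factor analyzers.

    weights[m]  : mixture probability of component m
    mus[m]      : mean vector mu_m
    loadings[m] : loading matrix L_m, stored as a list of rows
    psis[m]     : diagonal of Psi_m (noise variances)
    """
    m = rng.choices(range(len(weights)), weights=weights)[0]  # M ~ multinomial
    n_factors = len(loadings[m][0])
    x = [rng.gauss(0.0, 1.0) for _ in range(n_factors)]       # X ~ N(0, I)
    y = []
    for mu_i, row, psi_i in zip(mus[m], loadings[m], psis[m]):
        mean_i = mu_i + sum(l * xj for l, xj in zip(row, x))  # (mu_m + L_m x)_i
        # Psi_m is diagonal, so each Y_i gets independent Gaussian noise.
        y.append(rng.gauss(mean_i, psi_i ** 0.5))
    return m, x, y


# Hypothetical parameters: 2 components, 2 latent factors, 3 observed variables.
WEIGHTS = [0.3, 0.7]
MUS = [[0.0, 0.0, 0.0], [5.0, 5.0, 5.0]]
LOADINGS = [
    [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]],
]
PSIS = [[0.1, 0.1, 0.1], [0.2, 0.2, 0.2]]

m, x, y = sample_mfa(WEIGHTS, MUS, LOADINGS, PSIS, random.Random(0))
print(m, x, y)
```

Fixing the weights to a single component recovers the plain factor analyzer, and dropping X recovers an ordinary mixture model.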

Calculation scheme
[Figure: the network over X1, ..., X5 shown in four successive stages, ending in a chain of cliques and separators]
  Clique {X3, X4, X5}: φ1(x3, x4, x5)
  Separator {X3, X4}: ψ1(x3, x4)
  Clique {X1, X3, X4}: φ2(x1, x3, x4)
  Separator {X1, X4}: ψ2(x1, x4)
  Clique {X1, X2, X4}: φ3(x1, x2, x4)
‘Divide-and-conquer’ strategy: we can look at 3 variables at a time instead of 5. This is important, as the cost is exponential in the number of variables.
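
The divide-and-conquer idea can be sketched for the three cliques listed above. The example assumes binary variables and a joint distribution proportional to φ1 · φ2 · φ3, with the separators used only to carry messages; the potential values are random placeholders, since the slides give no numbers. It computes the marginal of X2 twice: once by summing over all 2^5 configurations, and once by passing messages so that no table ever involves more than three variables.

```python
import itertools
import random

STATES = (0, 1)
rng = random.Random(0)


def random_potential(n_vars):
    """A strictly positive table over n_vars binary variables (placeholder values)."""
    return {cfg: rng.uniform(0.1, 1.0) for cfg in itertools.product(STATES, repeat=n_vars)}


phi1 = random_potential(3)  # indexed as phi1[(x3, x4, x5)]
phi2 = random_potential(3)  # indexed as phi2[(x1, x3, x4)]
phi3 = random_potential(3)  # indexed as phi3[(x1, x2, x4)]


def normalize(table):
    z = sum(table.values())
    return {key: value / z for key, value in table.items()}


def brute_force_marginal_x2():
    """Sum the full joint over all five variables at once."""
    totals = {s: 0.0 for s in STATES}
    for x1, x2, x3, x4, x5 in itertools.product(STATES, repeat=5):
        totals[x2] += phi1[(x3, x4, x5)] * phi2[(x1, x3, x4)] * phi3[(x1, x2, x4)]
    return normalize(totals)


def message_passing_marginal_x2():
    """Same marginal, touching at most three variables at a time."""
    # Message over separator {X3, X4}: sum X5 out of clique {X3, X4, X5}.
    msg1 = {(x3, x4): sum(phi1[(x3, x4, x5)] for x5 in STATES)
            for x3 in STATES for x4 in STATES}
    # Message over separator {X1, X4}: absorb msg1, sum X3 out of {X1, X3, X4}.
    msg2 = {(x1, x4): sum(phi2[(x1, x3, x4)] * msg1[(x3, x4)] for x3 in STATES)
            for x1 in STATES for x4 in STATES}
    # Absorb msg2 into clique {X1, X2, X4} and sum out X1 and X4.
    totals = {x2: sum(phi3[(x1, x2, x4)] * msg2[(x1, x4)]
                      for x1 in STATES for x4 in STATES)
              for x2 in STATES}
    return normalize(totals)


brute = brute_force_marginal_x2()
local = message_passing_marginal_x2()
print(brute, local)
```

Both routes give the same marginal up to floating-point rounding; only the second scales when the network grows while the cliques stay small.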

Cheaper representations
Consider a binary node Y with m binary parents Z1, ..., Zm. The CPT requires 2^m parameters. This must be bad, right? Wrong!
All parameters are required if we do not make additional assumptions! But:
  Functional relations (if-then-else, AND-gates, ...).
  Sparser representations than CPTs, e.g., probability trees.
[Figure: probability tree for p(y | z1, ..., zm), branching on zi = 0 / zi = 1 and then on zj = 1 / zj = 0, with leaves p1, p2, ...]
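
The two sparser representations mentioned above can be sketched in a few lines. The AND-gate needs no free parameters at all, and a probability tree stores one number per leaf instead of one per parent configuration. The tree encoding (nested tuples), the leaf values and the function names are hypothetical choices made for this illustration.

```python
from itertools import product


def and_gate(z):
    """Deterministic AND: P(Y = 1 | z) is 1 if all parents are 1, else 0."""
    return 1.0 if all(z) else 0.0


# Probability tree for P(Y = 1 | z).
# Internal node: (parent index, subtree if parent is 0, subtree if parent is 1).
# Leaf: a probability. The values below are made up.
TREE = (0, 0.05, (1, 0.30, 0.90))


def tree_lookup(tree, z):
    """Walk down the tree following the parent values in z."""
    while isinstance(tree, tuple):
        index, if_zero, if_one = tree
        tree = if_one if z[index] else if_zero
    return tree


def count_leaves(tree):
    if not isinstance(tree, tuple):
        return 1
    _, if_zero, if_one = tree
    return count_leaves(if_zero) + count_leaves(if_one)


M = 4  # number of binary parents
# Expanding the tree to a full CPT shows how much repetition the tree avoids.
full_cpt = {z: tree_lookup(TREE, z) for z in product((0, 1), repeat=M)}
print(len(full_cpt), count_leaves(TREE))
```

With four parents the expanded table has 16 rows while the tree has 3 leaves; the saving comes from contexts in which the remaining parents do not matter.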

Continuous variables
Not all families of distributions can be handled by the calculation scheme. It works for:
  Multinomial variables
  Multivariate Gaussian distributions
  Conditional Gaussian distributions
What can we do when these models are unrealistic?
  Discretization: difficult tradeoff between precision and model complexity.
  Mixtures of truncated exponentials: a new family of distributions that can cope with the calculation scheme. Can approximate any distribution arbitrarily well.
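
The discretization tradeoff can be illustrated with a small experiment. The sketch below discretizes a standard Gaussian into k equal-width bins on [-4, 4] (the range, the bin midpoints as representative values, and the use of the variance as a precision measure are all assumptions made here) and reports how many CPT entries a node would need if it and each of its parents were discretized this way.

```python
import math


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def discretize_standard_normal(k, lo=-4.0, hi=4.0):
    """Return bin midpoints and renormalized bin probabilities for N(0, 1)."""
    width = (hi - lo) / k
    edges = [lo + i * width for i in range(k + 1)]
    mass = [normal_cdf(b) - normal_cdf(a) for a, b in zip(edges, edges[1:])]
    total = sum(mass)
    mids = [(a + b) / 2.0 for a, b in zip(edges, edges[1:])]
    return mids, [p / total for p in mass]


def discretized_variance(k):
    """Variance of the discretized variable; the true value is 1."""
    mids, probs = discretize_standard_normal(k)
    mean = sum(m * p for m, p in zip(mids, probs))
    return sum((m - mean) ** 2 * p for m, p in zip(mids, probs))


def cpt_entries(k, n_parents):
    """Table size for a k-state node with n_parents parents of k states each."""
    return k ** (n_parents + 1)


for k in (4, 16, 64):
    print(k, discretized_variance(k), cpt_entries(k, n_parents=2))
```

Finer grids bring the variance closer to 1, but the table size grows polynomially in k and exponentially in the number of parents, which is the tradeoff the slide refers to.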

Knowledge acquisition (KA): When p(xi | pa(xi)) is not available
All elements of the set {p(xi | pa(xi))}, i = 1, ..., n, are required to fully specify a BN.
Experts sometimes prefer to give the marginals {p(xi)}, i = 1, ..., n, and correlations (e.g., in the form of cross-product ratios) instead.
Alternative frameworks: Vines, Chain graphs, ...
Iterative proportional fitting procedure:
  1. $p_0(x_i, x_j)$ is initialized to obtain the correct correlation.
  2. For k = 1, 2, ...:
     (i)  $p_k(x_i, x_j) \leftarrow p_{k-1}(x_i, x_j) \cdot \frac{p(x_i)}{\sum_{x_j'} p_{k-1}(x_i, x_j')}$
     (ii) $p_k(x_i, x_j) \leftarrow p_k(x_i, x_j) \cdot \frac{p(x_j)}{\sum_{x_i'} p_k(x_i', x_j)}$
In BNs: Work with the cliques! Iterate over cliques l:
     $p_k(x) \leftarrow p_{k-1}(x) \cdot \frac{p(x_l)}{p_{k-1}(x_l)}$
  Gives the minimum information model.
  It also works for inconsistent input.
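
The two-variable version of iterative proportional fitting can be sketched as follows. The starting table is chosen so that its cross-product ratio has the desired value; each sweep then rescales rows and columns towards the target marginals. The target marginals, the starting table and the function names are hypothetical. Row and column rescaling leave the cross-product ratio unchanged, which is why step 1 of the procedure fixes the correlation up front.

```python
def ipf(table, row_target, col_target, iterations=100):
    """Fit a two-way table to given row and column marginals."""
    p = [row[:] for row in table]
    for _ in range(iterations):
        # Step (i): rescale each row so the row sums match p(x_i).
        for i, row in enumerate(p):
            row_sum = sum(row)
            p[i] = [value * row_target[i] / row_sum for value in row]
        # Step (ii): rescale each column so the column sums match p(x_j).
        for j in range(len(col_target)):
            col_sum = sum(p[i][j] for i in range(len(p)))
            for i in range(len(p)):
                p[i][j] *= col_target[j] / col_sum
    return p


def cross_product_ratio(t):
    return (t[0][0] * t[1][1]) / (t[0][1] * t[1][0])


START = [[3.0, 1.0], [1.0, 1.0]]  # cross-product ratio 3 (made-up expert input)
ROW_TARGET = [0.7, 0.3]           # p(x_i), made up
COL_TARGET = [0.6, 0.4]           # p(x_j), made up

fitted = ipf(START, ROW_TARGET, COL_TARGET)
print(fitted, cross_product_ratio(fitted))
```

The clique-based variant on the slide follows the same pattern, with the row and column sums replaced by clique marginals.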

Calculating causal (!) effects
[Figure: nodes Cause and Effect]
  Can we estimate causal strength?
[Figure: nodes Management, Planned PM, TTF]
  Can we estimate causal strength? NO! Destroyed by confounder.
[Figure: nodes Management, Planned PM, Actual PM, TTF]
  Can we estimate causal strength? Yes! Intermediate (observable) effect saves the day!
The key is that p(x | observe(Y = y)) = p(x | do(Y ← y)) does not hold in general!
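
A small numerical sketch may help separate observing from intervening. It assumes a structure in which an unobserved U (think Management) influences both Y (Planned PM) and X (a binarized TTF), while Y affects X only through an observed intermediate Z (Actual PM). That structure, the binarization and all probabilities are assumptions made for illustration. The intermediate variable is exploited with the front-door adjustment from Pearl (2000), which uses only the observable joint over (Y, Z, X).

```python
from itertools import product

# Hypothetical model parameters (all variables binary).
P_U = {0: 0.6, 1: 0.4}                       # P(U = u), U is never observed
P_Y1_GIVEN_U = {0: 0.2, 1: 0.8}              # P(Y = 1 | u)
P_Z1_GIVEN_Y = {0: 0.1, 1: 0.9}              # P(Z = 1 | y)
P_X1_GIVEN_ZU = {(0, 0): 0.3, (0, 1): 0.5,   # P(X = 1 | z, u)
                 (1, 0): 0.6, (1, 1): 0.9}


def bern(p_one, value):
    return p_one if value == 1 else 1.0 - p_one


# Observable joint over (Y, Z, X), obtained by summing U out.
OBS = {}
for u, y, z, x in product((0, 1), repeat=4):
    p = (P_U[u] * bern(P_Y1_GIVEN_U[u], y) * bern(P_Z1_GIVEN_Y[y], z)
         * bern(P_X1_GIVEN_ZU[(z, u)], x))
    OBS[(y, z, x)] = OBS.get((y, z, x), 0.0) + p


def prob(y=None, z=None, x=None):
    """Marginal probability of the given values under the observable joint."""
    return sum(p for (yy, zz, xx), p in OBS.items()
               if y in (None, yy) and z in (None, zz) and x in (None, xx))


def observe(y):
    """P(X = 1 | observe(Y = y)): ordinary conditioning."""
    return prob(y=y, x=1) / prob(y=y)


def do_truth(y):
    """P(X = 1 | do(Y <- y)), computed with access to the hidden U."""
    return sum(P_U[u] * bern(P_Z1_GIVEN_Y[y], z) * P_X1_GIVEN_ZU[(z, u)]
               for u, z in product((0, 1), repeat=2))


def do_front_door(y):
    """Front-door estimate of P(X = 1 | do(Y <- y)) from observables only."""
    total = 0.0
    for z in (0, 1):
        p_z_given_y = prob(y=y, z=z) / prob(y=y)
        inner = sum(prob(y=y2, z=z, x=1) / prob(y=y2, z=z) * prob(y=y2)
                    for y2 in (0, 1))
        total += p_z_given_y * inner
    return total


print(observe(1), do_truth(1), do_front_door(1))
```

In this toy model conditioning overstates the effect of Y on X because U raises both, whereas the front-door estimate agrees with the interventional value even though U is never used.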

Estimating models from data
Estimating parameters:
  No missing values: Counting
  Missing values: EM-algorithm
Estimating structure:
  Only discrete (or discretized) variables:
    Constraint-based (hypothesis testing)
    Fully Bayesian approach
  Continuous/mixed variables:
    Purely Gaussian and conditional Gaussian models: “Simple”
    General distributions: Difficult
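
For complete data, "counting" means that each CPT entry is estimated by a relative frequency. The sketch below does this for a single parent-child pair, using P(E | B) from the car example as the target; the data records and the function name estimate_cpt are made up for illustration.

```python
from collections import Counter

# Hypothetical complete data: (battery state, engine turns) for 15 cars.
RECORDS = (
    [("not empty", "yes")] * 8
    + [("not empty", "no")] * 2
    + [("empty", "yes")] * 1
    + [("empty", "no")] * 4
)


def estimate_cpt(records):
    """Maximum-likelihood estimate of P(child | parent) by counting."""
    pair_counts = Counter(records)
    parent_counts = Counter(parent for parent, _ in records)
    return {
        (child, parent): n / parent_counts[parent]
        for (parent, child), n in pair_counts.items()
    }


cpt = estimate_cpt(RECORDS)
for (child, parent), p in sorted(cpt.items()):
    print(f"P(E = {child} | B = {parent}) = {p:.2f}")
```

With missing values the counts are no longer available directly, which is where the EM-algorithm replaces them by expected counts.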

Troubleshooting
Find a “useful” repair strategy; i.e., a sequence of steps with a low expected cost of repair.
Example: The BATS system (developed by HP). Can be employed in many domains; initially intended for troubleshooting printers.
Bobbio et al.’s FTA ⇒ BN algorithm gives troubleshooter systems new expressive power:
  Refined system models
  Common cause failures
  NOT-events
  Modeling user interaction
  Non-perfect repair actions
  Questions
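
The expected cost of repair of a sequence can be sketched under strong simplifying assumptions: exactly one component is faulty, each action repairs one component perfectly, and costs do not depend on the order. Under those assumptions (which are narrower than the models discussed on the slide) the cost of each action is weighted by the probability that all earlier actions failed, and sorting by probability-to-cost ratio is known to be optimal. The fault probabilities and costs below are invented.

```python
from itertools import permutations

FAULT_PROB = {"A": 0.5, "B": 0.3, "C": 0.2}     # P(component is the faulty one)
REPAIR_COST = {"A": 10.0, "B": 2.0, "C": 5.0}   # cost of repairing each component


def expected_cost_of_repair(sequence):
    """Sum of action costs, each weighted by P(system still broken)."""
    ecr = 0.0
    p_still_broken = 1.0
    for component in sequence:
        ecr += REPAIR_COST[component] * p_still_broken
        p_still_broken -= FAULT_PROB[component]
    return ecr


# Exhaustive search over all orderings (feasible only for tiny examples).
best_by_search = min(permutations(FAULT_PROB), key=expected_cost_of_repair)

# Greedy rule: repair in order of decreasing probability-to-cost ratio.
best_by_ratio = tuple(
    sorted(FAULT_PROB, key=lambda c: FAULT_PROB[c] / REPAIR_COST[c], reverse=True)
)

print(best_by_search, best_by_ratio, expected_cost_of_repair(best_by_ratio))
```

Once actions can fail, faults share common causes, or questions are allowed, this simple ordering rule no longer applies directly, which is where the BN-based models come in.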

Safety-critical software
Safety-critical software is special:
  No-fault criterion: bugs are immediately removed.
  Testing in the traditional way is not sufficient (PIE algorithm; 50% of tested locations hide their faults).
Safety assessment requires:
  Disparate sources of information; several types of evidence, many of which are not quantitative in nature.
  The relation between evidence and safety assessment is not always direct or quantifiable.
  We need a framework to combine these disparate sources of information in a transparent way.
[Figure: top-level model with nodes Qsystem, Complexity, Fault tol., Reliability, Testing, Experience, Consequences, System safety]
Each node in the top-level model is refined using a “sub-net”.
The safety standard for safety-critical software in aviation (RTCA/DO-178B) was implemented in this way.
See Gran (2002) for details.

References
Bobbio, A., L. Portinale, M. Minichino, and E. Ciancamerla (2001). Improving the analysis of dependable systems by mapping fault trees into Bayesian networks. Reliability Engineering and System Safety 71(3), 249–260.
Gran, B. A. (2002). The use of Bayesian Belief Networks for combining disparate sources of information in the safety assessment of software based systems. Ph.D. thesis, Department of Mathematical Sciences, Norwegian University of Science and Technology. Doktor Ingeniør avhandling 2002:35.
Jensen, F. V. (2001). Bayesian Networks and Decision Graphs. New York: Springer-Verlag.
Langseth, H. and F. V. Jensen (2003). Decision theoretic troubleshooting in coherent systems. Reliability Engineering and System Safety 80(1), 49–61.
Lauritzen, S. L. (1995). The EM-algorithm for graphical association models with missing data. Computational Statistics and Data Analysis 19, 191–201.
Moral, S., R. Rumí, and A. Salmerón (2001). Mixtures of truncated exponentials in hybrid Bayesian networks. In Sixth European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty, Volume 2143 of Lecture Notes in Artificial Intelligence, pp. 145–167. Springer-Verlag.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann Publishers.
Pearl, J. (2000). Causality – Models, Reasoning, and Inference. Cambridge, UK: Cambridge University Press.