Bayesian Networks
Bayesian Networks are probabilistic graphical models that represent variables and their conditional dependencies using a directed acyclic graph (DAG). They are fundamental to reasoning under uncertainty in AI.
Probability and Uncertainty in AI
Real-world AI systems must reason under uncertainty. Not all facts are known with certainty - symptoms may or may not indicate a disease, sensor readings may be noisy, and future events are inherently probabilistic.
Bayes' Theorem is the mathematical foundation:
``` P(H | E) = P(E | H) * P(H) / P(E) ```
Where:
- P(H | E) - Posterior: probability of hypothesis H given evidence E
- P(E | H) - Likelihood: probability of the evidence given the hypothesis
- P(H) - Prior: initial probability of the hypothesis
- P(E) - Marginal likelihood (normalizing constant)
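As a quick numeric sketch of the theorem in action, consider a rare-disease test (the numbers below are illustrative assumptions, not from the text): even a fairly accurate test yields a modest posterior when the prior is small.

```python
# Worked Bayes' Theorem example (illustrative numbers)
p_disease = 0.01             # prior P(H): 1% prevalence
p_pos_given_disease = 0.90   # likelihood P(E | H): test sensitivity
p_pos_given_healthy = 0.05   # false-positive rate P(E | not H)

# Marginal likelihood P(E) by the law of total probability
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Posterior P(H | E) via Bayes' Theorem
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(f"P(disease | positive test) = {p_disease_given_pos:.3f}")  # ≈ 0.154
```

Despite a 90% sensitive test, the posterior is only about 15%, because positives from the large healthy population dominate.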
What is a Bayesian Network?
A Bayesian Network (BN) is a Directed Acyclic Graph (DAG) where:
- Nodes represent random variables (e.g., Disease, Symptom, Test Result)
- Edges represent conditional dependencies (A -> B means A influences B)
- Conditional Probability Tables (CPTs) quantify the strength of each relationship
Bayesian Networks compactly represent the joint probability distribution over all variables, enabling efficient probabilistic inference.
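Concretely, the compactness comes from the chain-rule factorization implied by the graph, where each variable is conditioned only on its parents:

``` P(X1, ..., Xn) = ∏ P(Xi | Parents(Xi)) ```

For a network with n binary variables, this replaces a table of 2^n joint entries with one small CPT per node.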
Building a Simple Bayesian Network
Consider a classic example with three variables:
- Rain -> Sprinkler (rain affects whether the sprinkler is on)
- Rain -> Wet Grass
- Sprinkler -> Wet Grass
This is the famous "Wet Grass" Bayesian Network.
```
# Bayesian Network: Wet Grass Problem
# Using manual probability tables (no external library needed)

# Prior probabilities
P_rain = 0.2     # P(Rain = True)
P_no_rain = 0.8  # P(Rain = False)

# Conditional: P(Sprinkler = True | Rain)
P_sprinkler_given_rain = {True: 0.01, False: 0.40}

# Conditional: P(WetGrass = True | Rain, Sprinkler)
P_wet_given = {
    (True, True): 0.99,
    (True, False): 0.80,
    (False, True): 0.90,
    (False, False): 0.001,
}

def compute_joint(rain: bool, sprinkler: bool, wet: bool) -> float:
    """Compute joint probability P(Rain, Sprinkler, WetGrass)."""
    p_r = P_rain if rain else P_no_rain
    p_s_given_r = P_sprinkler_given_rain[rain] if sprinkler else (1 - P_sprinkler_given_rain[rain])
    p_w_given_rs = P_wet_given[(rain, sprinkler)] if wet else (1 - P_wet_given[(rain, sprinkler)])
    return p_r * p_s_given_r * p_w_given_rs

# Query: P(Rain | WetGrass = True)
# Sum over all combinations where WetGrass = True
p_wet = sum(
    compute_joint(r, s, True)
    for r in [True, False]
    for s in [True, False]
)
p_rain_and_wet = sum(
    compute_joint(True, s, True)
    for s in [True, False]
)
p_rain_given_wet = p_rain_and_wet / p_wet
print(f"P(Rain | WetGrass=True) = {p_rain_given_wet:.4f}")
# Output: P(Rain | WetGrass=True) ~= 0.3573
```

Inference in Bayesian Networks
Exact Inference:
- Variable Elimination - systematically sum out irrelevant variables.
- Belief Propagation - message-passing algorithm for tree-structured networks.
- Complexity: NP-hard in general, polynomial for polytrees.
Approximate Inference:
- Monte Carlo sampling (MCMC)
- Variational inference
- Loopy belief propagation
Approximate methods are used for large, complex networks where exact inference is intractable.
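To make the sampling idea concrete, here is a minimal rejection-sampling sketch for the Wet Grass network (function names are illustrative): draw full samples from the priors and CPTs, discard those inconsistent with the evidence, and estimate the query from the survivors.

```python
import random

# CPTs from the Wet Grass network above
P_RAIN = 0.2
P_SPRINKLER_GIVEN_RAIN = {True: 0.01, False: 0.40}
P_WET_GIVEN = {
    (True, True): 0.99,
    (True, False): 0.80,
    (False, True): 0.90,
    (False, False): 0.001,
}

def sample_once(rng: random.Random) -> tuple:
    """Draw one full sample (rain, sprinkler, wet) by ancestral sampling."""
    rain = rng.random() < P_RAIN
    sprinkler = rng.random() < P_SPRINKLER_GIVEN_RAIN[rain]
    wet = rng.random() < P_WET_GIVEN[(rain, sprinkler)]
    return rain, sprinkler, wet

def estimate_rain_given_wet(n: int = 200_000, seed: int = 0) -> float:
    """Rejection sampling: keep only samples consistent with WetGrass=True."""
    rng = random.Random(seed)
    kept = rain_count = 0
    for _ in range(n):
        rain, _, wet = sample_once(rng)
        if wet:  # reject samples that contradict the evidence
            kept += 1
            rain_count += rain
    return rain_count / kept

print(f"Estimate of P(Rain | WetGrass=True): {estimate_rain_given_wet():.3f}")
```

With enough samples the estimate converges toward the exact answer (~0.357); in large networks with unlikely evidence, however, most samples are rejected, which is why smarter schemes such as likelihood weighting or MCMC are preferred.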
Applications of Bayesian Networks
Bayesian Networks are used across many domains:
- Medical Diagnosis - Inferring diseases from symptoms and test results.
- Spam Filtering - Naive Bayes classifier (a simple BN) for email classification.
- Fault Diagnosis - Identifying root causes in complex systems.
- Natural Language Processing - Probabilistic parsing and disambiguation.
- Bioinformatics - Gene regulatory network modeling.
- Risk Assessment - Financial and insurance risk modeling.
Naive Bayes Classifier
The Naive Bayes classifier is a special case of a Bayesian Network that assumes all features are conditionally independent given the class. Despite this "naive" assumption, it works surprisingly well in practice.
```
from collections import defaultdict
import math

class NaiveBayesClassifier:
    def __init__(self):
        self.class_probs = {}
        self.feature_probs = defaultdict(lambda: defaultdict(dict))

    def train(self, X: list, y: list):
        n = len(y)
        classes = set(y)
        # Compute class priors P(class)
        for c in classes:
            self.class_probs[c] = y.count(c) / n
        # Compute P(feature=value | class)
        for c in classes:
            class_docs = [X[i] for i in range(n) if y[i] == c]
            for feature_idx in range(len(X[0])):
                values = [doc[feature_idx] for doc in class_docs]
                unique_vals = set(values)
                for val in unique_vals:
                    self.feature_probs[feature_idx][c][val] = values.count(val) / len(values)

    def predict(self, x: list) -> str:
        best_class, best_score = None, float('-inf')
        for c, prior in self.class_probs.items():
            # Work in log space to avoid underflow from multiplying small probabilities
            score = math.log(prior)
            for i, val in enumerate(x):
                p = self.feature_probs[i][c].get(val, 1e-6)  # small floor for unseen values
                score += math.log(p)
            if score > best_score:
                best_score, best_class = score, c
        return best_class

# Example: Spam detection
X_train = [
    ["free", "money", "click"],
    ["free", "offer", "win"],
    ["meeting", "tomorrow", "office"],
    ["project", "deadline", "office"],
]
y_train = ["spam", "spam", "ham", "ham"]

clf = NaiveBayesClassifier()
clf.train(X_train, y_train)
print(clf.predict(["free", "win", "click"]))          # spam
print(clf.predict(["meeting", "office", "project"]))  # ham
```

Key Takeaways
- Bayesian Networks represent probabilistic relationships between variables as a DAG.
- Each node has a Conditional Probability Table (CPT) quantifying its dependencies.
- Bayes' Theorem enables updating beliefs when new evidence is observed.
- Inference can be exact (variable elimination) or approximate (MCMC, variational).
- Naive Bayes is a simple but effective BN used for text classification and spam filtering.