Logistic Regression
Despite its name, Logistic Regression is a classification algorithm. It uses the sigmoid function to predict probabilities and is the go-to baseline for binary and multi-class classification problems.
What is Logistic Regression?
Logistic Regression is a supervised learning algorithm used for classification tasks. Unlike linear regression which predicts continuous values, logistic regression predicts the probability that an input belongs to a particular class.
For binary classification (two classes), the output is a probability between 0 and 1. If the probability exceeds a threshold (typically 0.5), the input is classified as class 1; otherwise class 0.
Common applications include:
- Email spam detection
- Disease diagnosis (positive/negative)
- Credit risk assessment
- Customer churn prediction
The Sigmoid Function
The key to logistic regression is the sigmoid (logistic) function, which maps any real number to a value between 0 and 1:
sigma(z) = 1 / (1 + e^(-z))
Where z = w1*x1 + w2*x2 + ... + b (the linear combination of inputs).
Properties of the sigmoid:
- Output is always between 0 and 1 (interpretable as a probability)
- sigma(0) = 0.5 (the decision boundary)
- Smooth and differentiable (required for gradient descent)
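These properties are easy to verify numerically; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def sigmoid(z):
    """Map any real number to a value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0))    # 0.5 -- the decision boundary
print(sigmoid(5))    # close to 1
print(sigmoid(-5))   # close to 0
```

Note the symmetry sigma(-z) = 1 - sigma(z), which is why the two class probabilities always sum to 1.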
Binary Cross-Entropy Loss
Logistic regression uses Binary Cross-Entropy (also called Log Loss) as its cost function:
Loss = -(1/n) * sum[y*log(y_pred) + (1-y)*log(1-y_pred)]
This function:
- Heavily penalizes confident wrong predictions (e.g., predicting 0.99 when the true label is 0)
- Is convex for logistic regression, guaranteeing a global minimum
- Is the negative log-likelihood of a Bernoulli model, so minimizing it is maximum likelihood estimation
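To make the penalty on confident mistakes concrete, here is a hand-rolled version of the loss (scikit-learn's `log_loss` computes the same quantity; the example probabilities are illustrative):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # Clip predictions to avoid log(0)
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1, 0, 1, 0])
confident_right = np.array([0.95, 0.05, 0.90, 0.10])
confident_wrong = np.array([0.05, 0.95, 0.10, 0.90])

print(binary_cross_entropy(y_true, confident_right))  # small loss (~0.08)
print(binary_cross_entropy(y_true, confident_wrong))  # large loss (~2.65)
```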
Multi-Class Classification
For problems with more than two classes, logistic regression extends in two ways:
- One-vs-Rest (OvR): Train one binary classifier per class. Each classifier predicts "is this class X or not?"
- Softmax Regression (Multinomial): Extends sigmoid to multiple classes. Outputs a probability distribution over all classes that sums to 1.
- Scikit-learn handles multi-class targets automatically; with the lbfgs solver, LogisticRegression fits a multinomial (softmax) model by default (the older multi_class parameter is deprecated in recent versions).
- Softmax is the foundation of the output layer in neural network classifiers.
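Softmax itself is short to write down; a minimal sketch (subtracting the max before exponentiating is the standard numerical-stability trick and does not change the result):

```python
import numpy as np

def softmax(z):
    # Shift by the max so np.exp never overflows; ratios are unchanged
    z = z - np.max(z)
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

scores = np.array([2.0, 1.0, 0.1])  # raw class scores (logits)
probs = softmax(scores)
print(probs)        # one probability per class, highest score wins
print(probs.sum())  # 1.0
```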
Implementing Logistic Regression in Python
Complete example with binary and multi-class classification:
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_breast_cancer, load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
# ---- Binary Classification: Breast Cancer ----
data = load_breast_cancer()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Feature scaling is important for logistic regression
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
model = LogisticRegression(max_iter=1000, C=1.0) # C = 1/lambda (regularization)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print("Binary Classification (Breast Cancer)")
print(f"Accuracy: {accuracy_score(y_test, y_pred):.4f}")
print(classification_report(y_test, y_pred, target_names=data.target_names))
# ---- Multi-Class: Iris ----
iris = load_iris()
X2, y2 = iris.data, iris.target
X2_train, X2_test, y2_train, y2_test = train_test_split(X2, y2, test_size=0.2, random_state=42)
multi_model = LogisticRegression(solver='lbfgs', max_iter=200)  # multinomial (softmax) by default
multi_model.fit(X2_train, y2_train)
print(f"\nIris Multi-Class Accuracy: {multi_model.score(X2_test, y2_test):.4f}")
# ---- Predict probabilities ----
probs = model.predict_proba(X_test[:3])
print(f"\nProbabilities for first 3 test samples:\n{probs}")

Decision Boundary and Threshold Tuning
The default decision threshold is 0.5, but this can be adjusted based on the problem:
- High precision needed (e.g., spam filter - avoid false positives): raise the threshold to 0.7+
- High recall needed (e.g., cancer screening - avoid false negatives): lower the threshold to 0.3
The ROC curve and AUC score help evaluate model performance across all thresholds. An AUC of 1.0 is perfect; 0.5 is random guessing.
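A sketch of threshold tuning and AUC on the breast cancer data from the earlier example (the 0.3 threshold is illustrative, not a recommendation):

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import roc_auc_score, recall_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
probs = model.predict_proba(X_test)[:, 1]  # P(class 1) for each sample

# AUC summarizes performance across all possible thresholds
print(f"AUC: {roc_auc_score(y_test, probs):.4f}")

# Lowering the threshold trades precision for recall (fewer false negatives)
for threshold in (0.5, 0.3):
    preds = (probs >= threshold).astype(int)
    print(f"Threshold {threshold}: recall = {recall_score(y_test, preds):.4f}")
```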
Key Takeaways
- Logistic regression is a classification algorithm that outputs probabilities via the sigmoid function.
- It is trained by minimizing Binary Cross-Entropy loss using gradient descent.
- Feature scaling (StandardScaler) significantly improves convergence and performance.
- The decision threshold (default 0.5) can be tuned to balance precision and recall.
- Despite its simplicity, logistic regression is a strong baseline and highly interpretable.