Autoencoders
Autoencoders are neural networks trained to reconstruct their input through a compressed bottleneck representation. They learn efficient data encodings without labels and are used for dimensionality reduction, anomaly detection, and generative modeling.
What is an Autoencoder?
An autoencoder is a type of neural network that learns to compress data into a lower-dimensional representation and then reconstruct it back to the original form.
The architecture has two parts:
- Encoder - Compresses the input X into a latent representation Z (the bottleneck)
- Decoder - Reconstructs an approximation X' of the input from Z
The network is trained to minimize reconstruction loss (difference between X and X'). Since the bottleneck forces compression, the encoder must learn the most important features of the data.
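As a concrete illustration of the reconstruction loss, here is the mean squared error between a hypothetical 4-dimensional input X and its reconstruction X' (the values are made up for the example):

```python
import numpy as np

# Hypothetical input and its reconstruction
x = np.array([0.2, 0.8, 0.5, 0.1])
x_hat = np.array([0.25, 0.70, 0.55, 0.05])

# Mean squared error: average of squared per-dimension differences
mse = np.mean((x - x_hat) ** 2)
print(f"Reconstruction loss (MSE): {mse:.4f}")  # -> 0.0044
```

Training pushes this number toward zero, which is only possible if the bottleneck retains the information needed to rebuild X.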
Autoencoder Architecture
A typical autoencoder architecture:
- Input Layer - Same size as the original data (e.g., 784 for 28x28 MNIST images).
- Encoder Layers - Progressively smaller layers that compress the representation.
- Bottleneck (Latent Space) - The compressed representation Z. Its size determines compression ratio.
- Decoder Layers - Mirror of encoder, progressively expanding back to original size.
- Output Layer - Same size as the input. Uses a sigmoid activation when inputs are normalized to [0, 1].
- The bottleneck dimension is the key hyperparameter - smaller = more compression but higher reconstruction error.
Types of Autoencoders
Several variants extend the basic autoencoder:
- Denoising Autoencoder - Input is corrupted with noise; model learns to reconstruct the clean version. Produces more robust representations.
- Sparse Autoencoder - Adds L1 penalty on activations, forcing most neurons to be inactive. Learns sparse, interpretable features.
- Variational Autoencoder (VAE) - Encodes to a probability distribution (mean and variance) rather than a fixed point. Enables generation of new samples.
- Convolutional Autoencoder - Uses Conv layers instead of Dense. Ideal for image data.
- LSTM Autoencoder - Uses recurrent layers for sequence data like time series.
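A minimal sketch of a denoising training step, assuming a toy two-layer model and inputs in [0, 1]; the 0.3 noise level and layer sizes are illustrative choices, not tuned values:

```python
import torch
import torch.nn as nn

# Toy autoencoder (untrained, illustrative sizes)
model = nn.Sequential(
    nn.Linear(784, 64), nn.ReLU(),    # encoder
    nn.Linear(64, 784), nn.Sigmoid()  # decoder
)
criterion = nn.MSELoss()

clean = torch.rand(16, 784)                    # a batch of "clean" inputs
noisy = clean + 0.3 * torch.randn_like(clean)  # corrupt with Gaussian noise
noisy = noisy.clamp(0.0, 1.0)                  # keep inputs in [0, 1]

output = model(noisy)            # reconstruct from the corrupted input
loss = criterion(output, clean)  # compare against the CLEAN target
loss.backward()
```

The key detail is the asymmetry: the model sees the noisy input but is scored against the clean target, so it cannot simply copy its input and must learn structure that survives the corruption.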
Implementing an Autoencoder in Python
Complete autoencoder for MNIST digit compression using PyTorch:
```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# ---- Define Autoencoder ----
class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Encoder: progressively compress to the latent dimension
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 64),
            nn.ReLU(),
            nn.Linear(64, latent_dim),
        )
        # Decoder: mirror of the encoder, expanding back to input size
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 256),
            nn.ReLU(),
            nn.Linear(256, input_dim),
            nn.Sigmoid(),  # Output in [0, 1] for normalized images
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z)

    def encode(self, x):
        return self.encoder(x)

# ---- Load MNIST ----
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Lambda(lambda x: x.view(-1)),  # flatten 28x28 -> 784
])
train_data = datasets.MNIST('./data', train=True, download=True, transform=transform)
loader = DataLoader(train_data, batch_size=256, shuffle=True)

# ---- Train ----
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = Autoencoder(input_dim=784, latent_dim=32).to(device)
optimizer = optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

for epoch in range(10):
    total_loss = 0
    for batch, _ in loader:
        batch = batch.to(device)
        output = model(batch)
        loss = criterion(output, batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    print(f"Epoch {epoch+1}/10 | Loss: {total_loss/len(loader):.6f}")

print(f"\nCompression: 784 -> 32 dimensions ({784/32:.0f}x reduction)")
```

Anomaly Detection with Autoencoders
One of the most powerful applications of autoencoders is anomaly detection:
1. Train the autoencoder on normal data only.
2. For new data, compute the reconstruction error.
3. High reconstruction error means the model has never seen this pattern, i.e., a likely anomaly.
This works because the autoencoder learns to efficiently reconstruct normal patterns. Anomalies, being unlike anything in training, are reconstructed poorly.
Applications include fraud detection, network intrusion detection, and manufacturing defect detection.
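The three-step recipe above can be sketched as follows. The model here is an untrained stand-in (in practice it would be an autoencoder trained on normal data), and the 95th-percentile threshold is one common heuristic, not a fixed rule:

```python
import torch
import torch.nn as nn

# Untrained stand-in; in practice, an autoencoder trained on normal data only
model = nn.Sequential(
    nn.Linear(784, 32), nn.ReLU(),
    nn.Linear(32, 784), nn.Sigmoid()
)
model.eval()

def reconstruction_errors(model, x):
    """Per-sample mean squared reconstruction error (the anomaly score)."""
    with torch.no_grad():
        x_hat = model(x)
    return ((x - x_hat) ** 2).mean(dim=1)

new_data = torch.rand(8, 784)
errors = reconstruction_errors(model, new_data)

# In practice the threshold is chosen from the error distribution on
# held-out NORMAL data (e.g., its 95th percentile), not the test batch
threshold = torch.quantile(errors, 0.95)
anomalies = errors > threshold
print(f"Flagged {anomalies.sum().item()} of {len(errors)} samples")
```

Because the score is just reconstruction error, this approach needs no labeled anomalies at training time, which is exactly why it suits fraud and intrusion detection, where anomalous examples are rare.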
Variational Autoencoders (VAE)
A Variational Autoencoder extends the basic autoencoder by encoding inputs to a probability distribution (mean mu and variance sigma) rather than a fixed point:
- The encoder outputs mu and log(sigma^2) for each latent dimension.
- A sample z is drawn from N(mu, sigma^2) during training (the reparameterization trick).
- The loss combines reconstruction loss with a KL divergence term that regularizes the latent space.
The continuous, structured latent space of VAEs enables generation of new samples by sampling from the prior distribution N(0, 1).
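A minimal sketch of the VAE-specific pieces (the encoder sizes mirror the earlier example; the class and function names are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAEEncoder(nn.Module):
    """Maps input to a distribution: a mean and log-variance per latent dim."""
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.hidden = nn.Linear(input_dim, 256)
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)

    def forward(self, x):
        h = F.relu(self.hidden(x))
        return self.mu(h), self.logvar(h)

def reparameterize(mu, logvar):
    # z = mu + sigma * eps keeps the sampling step differentiable w.r.t. mu, sigma
    std = torch.exp(0.5 * logvar)
    eps = torch.randn_like(std)
    return mu + std * eps

def kl_divergence(mu, logvar):
    # KL(N(mu, sigma^2) || N(0, 1)), summed over latent dims, averaged over batch
    return (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1)).mean()

encoder = VAEEncoder()
x = torch.rand(16, 784)
mu, logvar = encoder(x)
z = reparameterize(mu, logvar)  # stochastic latent code fed to the decoder
kl = kl_divergence(mu, logvar)  # added to the reconstruction loss
```

The KL term pulls each latent distribution toward N(0, 1), which is what makes sampling z ~ N(0, 1) at generation time produce decodable points.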
Key Takeaways
- Autoencoders compress data through a bottleneck and reconstruct it, learning efficient representations without labels.
- The bottleneck dimension controls the compression ratio and the richness of learned features.
- Denoising autoencoders learn more robust features by training on corrupted inputs.
- Reconstruction error is a powerful anomaly score - high error means the input is unlike training data.
- Variational Autoencoders add probabilistic encoding, enabling generation of new realistic samples.