Experiment: Information Bottleneck in Deep Neural Networks

Category: Machine Learning

Summary: Tracking how hidden-layer information about inputs and labels changes during training to test the neural-network compression hypothesis.


The information-bottleneck view of deep learning proposes that useful hidden representations should retain label-relevant structure while compressing away irrelevant details of the input. This experiment asks whether that compression stage is visible in a concrete multilayer network trained on a controlled classification problem.

The script trains a five-layer NumPy MLP and periodically estimates the information-plane coordinates I(X;T) and I(T;Y) for each hidden layer using binned mutual-information approximations. It then examines whether layers first gain information and later compress it while preserving or increasing label relevance.
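The binned estimator described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the bin count, the activation range, and the convention that I(X;T) = H(T) (which holds when inputs are distinct and the network is deterministic) are all assumptions.

```python
import numpy as np

def discretize(activations, bins=30, lo=-1.0, hi=1.0):
    # Map each unit's activation to a bin index, then collapse each
    # binned activation vector into a single discrete state per sample.
    # The (lo, hi) range assumes tanh-like bounded activations.
    edges = np.linspace(lo, hi, bins + 1)
    idx = np.digitize(activations, edges)
    _, states = np.unique(idx, axis=0, return_inverse=True)
    return states

def entropy(states):
    # Plug-in entropy (in bits) of a discrete sample.
    p = np.bincount(states) / len(states)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def layer_information(activations, y, bins=30):
    """Return estimates of (I(X;T), I(T;Y)) for one hidden layer.

    Assumes distinct inputs and a deterministic network, so that
    I(X;T) = H(T); I(T;Y) is computed as H(T) - H(T|Y).
    """
    t = discretize(activations, bins)
    h_t = entropy(t)
    h_t_given_y = 0.0
    for label in np.unique(y):
        mask = y == label
        h_t_given_y += mask.mean() * entropy(t[mask])
    return h_t, h_t - h_t_given_y
```

Calling `layer_information` on each layer's activations at a given training step yields one point per layer on the information plane; repeating this over checkpoints traces the trajectories.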

That question matters because the compression claim has been influential but contentious. The experiment therefore records full information-plane trajectories rather than only final accuracy, making the issue about representational dynamics instead of benchmark performance alone.

Method: NumPy backpropagation training of a five-layer MLP with periodic mutual-information estimation for each hidden layer using binned information-plane diagnostics.
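A pure-NumPy training loop of the kind the method describes might look like the sketch below. The layer widths, learning rate, and tanh/softmax choices are illustrative assumptions; the summary does not specify the script's architecture or hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative layer widths for a five-layer tanh MLP with a softmax
# output; the actual experiment's sizes are not given in this summary.
sizes = [12, 10, 8, 6, 4, 2]
W = [rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))
     for m, n in zip(sizes[:-1], sizes[1:])]
b = [np.zeros(n) for n in sizes[1:]]

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def forward(x):
    # Return every layer's activation; the hidden activations are
    # what the binned MI estimator would later discretize.
    acts, h = [], x
    for i in range(len(W)):
        z = h @ W[i] + b[i]
        h = softmax(z) if i == len(W) - 1 else np.tanh(z)
        acts.append(h)
    return acts

def cross_entropy(x, y1h):
    p = forward(x)[-1]
    return float(-np.mean(np.sum(y1h * np.log(p + 1e-12), axis=1)))

def train_step(x, y1h, lr=0.5):
    # One full-batch gradient-descent step via manual backpropagation.
    acts = forward(x)
    grad = (acts[-1] - y1h) / len(x)        # d(loss)/d(logits)
    for i in reversed(range(len(W))):
        inp = x if i == 0 else acts[i - 1]
        gW, gb = inp.T @ grad, grad.sum(axis=0)
        if i > 0:                           # backprop through tanh
            grad = (grad @ W[i].T) * (1.0 - acts[i - 1] ** 2)
        W[i] -= lr * gW
        b[i] -= lr * gb
```

Interleaving calls to `train_step` with periodic information-plane measurements of `forward(x)` is the structure the method describes: training and estimation share the same activations.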

What is measured: Training and test accuracy, information-plane trajectories I(X;T) and I(T;Y) by layer, detected compression events, compression ratios, and whether the Tishby-style compression hypothesis is supported.
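One way to operationalize "detected compression events" and "compression ratios" from a layer's trajectory is sketched below. The drop threshold and the peak-relative rule are hypothetical; the summary does not state the script's actual detection criterion.

```python
import numpy as np

def detect_compression(ixt, ity, drop=0.05):
    """Flag a compression event for one layer's trajectory.

    Illustrative rule (an assumption, not the script's): the event
    fires when final I(X;T) sits more than `drop` bits below its
    running peak while I(T;Y) has not fallen by more than `drop`
    from its value at that peak. The ratio of peak to final I(X;T)
    serves as a simple compression ratio.
    """
    ixt, ity = np.asarray(ixt, float), np.asarray(ity, float)
    peak = int(np.argmax(ixt))
    compressed = ixt[-1] < ixt[peak] - drop
    label_kept = ity[-1] >= ity[peak] - drop
    ratio = float(ixt[peak] / max(ixt[-1], 1e-12))
    return bool(compressed and label_kept), ratio
```

Applying such a rule per layer is what turns raw trajectories into a yes/no verdict on the Tishby-style hypothesis: compression is "supported" only if I(X;T) shrinks after its peak while I(T;Y) is preserved.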

