Category: Machine Learning
Summary: Testing whether the classic information-bottleneck picture of fitting followed by compression remains visible in much deeper feedforward networks.
The information-bottleneck view of learning proposes that neural networks first absorb task-relevant structure and later compress away input details that do not matter for prediction. This experiment extends that question to networks with up to seven hidden layers, asking whether the same two-phase picture still appears once the architecture is substantially deeper than in the original small demonstrations.
The script trains deep multilayer perceptrons on a noisy binary Gaussian-cluster task and, over training time, tracks proxy measures of the mutual information between each layer's activations and the inputs, and between the activations and the labels. It focuses on whether task-relevant (label) information rises early while input-related information peaks and then declines, and on how those turning points depend on depth.
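A minimal sketch of such an information proxy, assuming the simple equal-width binning estimator used in the original information-bottleneck demonstrations (the actual script's estimator and bin count may differ):

```python
import numpy as np

def binned_mi(x, y, n_bins=30):
    """Proxy for I(X; Y): discretize continuous activations x into
    equal-width bins, treat each row of bin indices as one discrete
    symbol, and compute discrete mutual information against y.

    This binning estimator is biased but cheap enough to evaluate at
    every epoch for every layer.
    """
    # Discretize each activation dimension into equal-width bins.
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    codes = np.digitize(x, edges[1:-1])
    # Collapse each row of bin indices into a single discrete state.
    _, xd = np.unique(codes, axis=0, return_inverse=True)
    _, yd = np.unique(y, return_inverse=True)

    # Empirical joint distribution over (activation state, label/input code).
    joint = np.zeros((xd.max() + 1, yd.max() + 1))
    np.add.at(joint, (xd, yd), 1.0)
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)
    py = p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float((p[nz] * np.log2(p[nz] / (px @ py)[nz])).sum())
```

Tracking this quantity per layer and per epoch, once against the input codes and once against the labels, yields the two curves whose rise-then-fall shape the experiment is testing.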
That makes the project a mechanistic study of internal representation dynamics rather than a pure accuracy benchmark. The value is in testing whether a widely discussed explanatory story for shallow networks survives in deeper models where optimization and representation geometry can differ substantially.
Method: Repeated deep-MLP training on Gaussian-cluster classification with layerwise information proxies tracked across epochs and depth.
What is measured: Input-information and label-information proxies, peak compression epoch, layerwise compression trends, training and test accuracy, and depth dependence of the two-phase pattern.
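The task itself can be sketched as follows; all parameter names and values here (dimension, separation, noise rate) are illustrative assumptions, not the script's actual settings. The label noise matters: it gives the task irreducible error, so a compressing layer has input detail that is genuinely irrelevant to discard.

```python
import numpy as np

def make_noisy_clusters(n=4000, dim=12, sep=2.0, label_noise=0.1, seed=0):
    """Binary Gaussian-cluster task: two unit-variance clusters whose
    means differ along one axis, with a fraction of labels flipped.
    Parameter values are illustrative, not the experiment's settings.
    """
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, n)
    means = np.zeros((2, dim))
    means[1, 0] = sep                    # clusters separated along axis 0
    x = rng.normal(means[y], 1.0)        # sample around each cluster mean
    flip = rng.random(n) < label_noise   # inject label noise
    y_obs = np.where(flip, 1 - y, y)
    return x.astype(np.float32), y_obs
```

Repeating training runs over depths (up to seven hidden layers) on data like this, while logging the layerwise information proxies and accuracies each epoch, produces the depth-dependence measurements listed above.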
