In the paper "Representation Compression and Generalization in Deep Neural Networks" (Shwartz-Ziv et al., 2019), the following conjecture is stated.

Conjecture 1. (Informal Version)

With probability $1 - \delta$ over the training data, $m$ i.i.d. samples drawn from the same distribution as a random variable pair $(X, Y)$, the generalization error $\epsilon(f)$ admits a bound of the following form:

$$\epsilon(f) \;\le\; O\!\left(\sqrt{\frac{2^{I(X;\,T_k)} + \log(1/\delta)}{m}}\right),$$

where $f$ is the full model obtained by training and $T_k$ is the output of an intermediate $k$-layer encoder of the model, i.e., the representation obtained after passing the input through the first $k$ layers.
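
To see the qualitative content of the conjecture, here is a minimal sketch in Python that evaluates the right-hand side for a few values of $I(X; T_k)$. It assumes the constant hidden in the $O(\cdot)$ is $1$ and measures the mutual information in bits; both are illustrative assumptions, not choices made in the paper.

```python
import math

def input_compression_bound(mi_bits: float, m: int, delta: float = 0.05) -> float:
    """Evaluate sqrt((2**I(X;T_k) + log(1/delta)) / m).

    mi_bits: mutual information I(X; T_k) in bits (assumed known or estimated).
    m:       number of i.i.d. training samples.
    delta:   confidence parameter; the bound holds with probability 1 - delta.
    """
    return math.sqrt((2 ** mi_bits + math.log(1 / delta)) / m)

# Compressing the representation tightens the bound exponentially:
# each bit removed from I(X; T_k) halves the dominant 2**I(X;T_k) term.
for mi in (20, 15, 10):
    print(f"I(X;T_k) = {mi:2d} bits -> bound ~ {input_compression_bound(mi, m=10**6):.4f}")
```

With $m = 10^6$ samples, the evaluated bound drops from roughly $1.02$ (vacuous) at $20$ bits to about $0.03$ at $10$ bits, which is the sense in which compression of the intermediate representation is conjectured to drive generalization.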