Oliver Eberle, BCCN Berlin / TU Berlin

Disentangling high- and low-level contributions in visual saliency prediction models

Models of visual saliency aim to explain experimentally observed data, to increase our understanding of attentional mechanisms, and to reveal which features our visual system is tuned to detect. Unfortunately, simple and interpretable models built on a set of low-level representations such as color or orientation features have shown limited overall performance. In contrast, image features that exploit high-level characteristics, such as rich object or face detectors, combined with a pointwise readout network have boosted performance at the cost of the interpretability of the inner operations. Exchanging such high-level features for low-level intensity-contrast features has recently been shown to outperform all previous pre-deep-learning saliency models, providing a framework for disentangling high- and low-level contributions to saliency. Building on this work, a new low-level model using a set of learnable Gabor orientation filters is proposed here. Computational experiments show that such low-level intensity-contrast and/or orientation features can explain up to 45% of image-based saliency. Furthermore, analyses of the resulting saliency maps give an intuitive understanding of how salient representations are identified. In addition, the classic and influential low-level saliency model by Itti & Koch is revisited to investigate the causes of its limited performance on natural stimuli. To this end, its feature extraction stage is combined with the same readout architecture as in all models considered here. Results indicate that local feature contrast is most predictive, whereas conspicuity maps and map normalization decrease performance.
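The Gabor front end mentioned above can be illustrated with a minimal sketch. Note this is an assumption-laden toy version: the thesis learns its Gabor filters, whereas the bank below uses fixed, hand-picked parameters (kernel size, sigma, wavelength are illustrative choices, not values from the work), and orientation energy is just one simple way to pool filter responses.

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(size, theta, sigma=2.0, lam=4.0, psi=0.0):
    """Real part of a Gabor filter at orientation theta (radians).

    size, sigma, lam are illustrative parameters, not the thesis's.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates into the filter's orientation
    xr = x * np.cos(theta) + y * np.sin(theta)
    # Gaussian envelope times a cosine carrier along xr
    envelope = np.exp(-(xr**2 + (-x * np.sin(theta) + y * np.cos(theta))**2)
                      / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / lam + psi)
    k = envelope * carrier
    return k - k.mean()  # zero mean: flat image regions give no response

# A small bank of 4 orientations, as in classic low-level saliency front ends
bank = [gabor_kernel(15, t) for t in np.linspace(0, np.pi, 4, endpoint=False)]

def orientation_energy(img, kernels):
    """Sum of squared filter responses across orientations (same-size output)."""
    return sum(convolve2d(img, k, mode="same")**2 for k in kernels)
```

A learnable version would treat the kernels (or their Gabor parameters) as trainable weights in a convolutional layer feeding the pointwise readout network.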

Additional Information

MSc thesis defence as part of the International Master Program Computational Neuroscience.

Organized by

Klaus Obermayer / Robert Martin
