Visual saliency detection, which aims to simulate the human visual system (HVS), has drawn wide attention in recent decades. Reconstruction-based models are an established approach to saliency detection: they predict unexpected regions via linear combinations or auto-encoder networks. However, these models handle images poorly because the conversion from image matrices to vectors discards spatial information. In this paper, a novel approach is proposed to solve this problem. Its core is a deep reconstruction model, i.e., a convolutional neural network for reconstruction stacked with an auto-encoder (CNNR for short). On the one hand, the CNN takes two-dimensional data directly as input, rather than converting each image matrix into a series of vectors as conventional reconstruction-based saliency detection methods do. On the other hand, the training of the CNN is augmented by initializing its weights with those learned through unsupervised training of a convolutional auto-encoder (CAE). In this way, the CNNR model can be trained on limited labeled data, with the CNN weights meaningfully initialized by the CAE instead of randomly. Performance evaluations through comprehensive experiments on four benchmark datasets, with comparisons against eight state-of-the-art saliency detection models, show that the proposed deep reconstruction model outperforms most of them.
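The CAE-based initialization scheme described above can be sketched minimally: pretrain encoder filters without labels, then copy them into the supervised CNN instead of drawing random weights. The sketch below is illustrative only; the layer names, shapes, and the placeholder "pretraining" step are all hypothetical stand-ins, not the paper's actual architecture or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical conv-layer shapes: (out_channels, in_channels, kh, kw).
conv_shapes = {"conv1": (8, 1, 3, 3), "conv2": (16, 8, 3, 3)}

def random_init(shapes):
    # Baseline: He-style random initialization for every conv layer.
    return {name: rng.normal(0.0, np.sqrt(2.0 / np.prod(s[1:])), size=s)
            for name, s in shapes.items()}

def pretrain_cae(shapes):
    # Stand-in for unsupervised CAE training: in the real method the encoder
    # filters are learned by reconstructing unlabeled images; here we simply
    # return deterministic placeholder weights of the right shapes.
    return {name: np.full(s, 0.01) for name, s in shapes.items()}

def init_cnn_from_cae(shapes):
    # Core idea: copy the pretrained CAE encoder weights into the supervised
    # CNN, so fine-tuning starts from meaningful filters, not random ones.
    cae_weights = pretrain_cae(shapes)
    return {name: cae_weights[name].copy() for name in shapes}

cnn_weights = init_cnn_from_cae(conv_shapes)
```

After this transfer, the CNN would be fine-tuned on the (limited) labeled saliency data, which is the design choice that lets the model avoid purely random initialization.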
- convolutional neural network
- deep learning
- saliency detection
Lin, X., Tang, Y., Tianfield, H., Qian, F., & Zhong, W. (2019). A novel approach to reconstruction based saliency detection via convolutional neural network stacked with auto-encoder. Neurocomputing, 349, 145-155. https://doi.org/10.1016/j.neucom.2019.01.041