A novel approach to reconstruction based saliency detection via convolutional neural network stacked with auto-encoder

Xinchen Lin, Yang Tang, Huaglory Tianfield, Feng Qian, Weimin Zhong

Research output: Contribution to journal › Article

Abstract

Visual saliency detection, which aims to simulate the human visual system (HVS), has drawn much attention in recent decades. Reconstruction-based saliency detection models predict unexpected regions via linear combination or an auto-encoder network. However, these models are ineffective in dealing with images because spatial information is lost when images are converted into vectors. In this paper, a novel approach is proposed to solve this problem. Its core is a deep reconstruction model, i.e., a convolutional neural network for reconstruction stacked with an auto-encoder (CNNR). On the one hand, the CNN takes two-dimensional data directly as input, instead of converting each image matrix into a series of vectors as in conventional reconstruction-based saliency detection methods. On the other hand, the training of the CNN is augmented with an initialization obtained by the unsupervised learning of a convolutional auto-encoder (CAE). In this way, our CNNR model can be trained on limited labeled data, with the weights of the CNN meaningfully initialized by the CAE instead of randomly. Performance evaluations are conducted through comprehensive experiments on four benchmark datasets, and comparisons with eight state-of-the-art saliency detection models show that our proposed deep reconstruction model outperforms most of them.
Original language: English
Number of pages: 31
Journal: Neurocomputing
DOI: 10.1016/j.neucom.2019.01.041
Publication status: Published - 23 Jan 2019
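The reconstruction principle summarized in the abstract — regions the model fails to reconstruct are the "unexpected", salient ones — can be sketched as follows. This is a minimal illustration of reconstruction-error saliency in general, not the paper's CNNR implementation; the helper `reconstruction_saliency` and its toy inputs are hypothetical.

```python
import numpy as np

def reconstruction_saliency(image, reconstruction, eps=1e-12):
    """Saliency as per-pixel reconstruction error.

    Regions a reconstruction model predicts poorly get high error,
    hence high saliency. `image` and `reconstruction` are arrays of
    the same shape.
    """
    err = (image - reconstruction) ** 2
    # Normalize to [0, 1] so the map can be viewed or thresholded.
    return (err - err.min()) / (err.max() - err.min() + eps)

# Toy usage: a flat background reconstructed perfectly everywhere
# except a small patch the model failed to predict.
image = np.zeros((8, 8))
reconstruction = image.copy()
reconstruction[2:4, 2:4] = 1.0  # reconstruction error concentrated here
saliency = reconstruction_saliency(image, reconstruction)
```

The mismatched patch receives saliency near 1 while the perfectly reconstructed background stays near 0; the CNNR model in the paper learns the reconstruction itself, rather than receiving it as an input.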

Keywords

  • convolutional neural network
  • auto-encoder
  • deep learning
  • reconstruction
  • saliency detection

Cite this

@article{87db3c8e589847dbb59a2998c67e50e1,
title = "A novel approach to reconstruction based saliency detection via convolutional neural network stacked with auto-encoder",
abstract = "Visual saliency detection, which aims to simulate the human visual system (HVS), has drawn much attention in recent decades. Reconstruction-based saliency detection models predict unexpected regions via linear combination or an auto-encoder network. However, these models are ineffective in dealing with images because spatial information is lost when images are converted into vectors. In this paper, a novel approach is proposed to solve this problem. Its core is a deep reconstruction model, i.e., a convolutional neural network for reconstruction stacked with an auto-encoder (CNNR). On the one hand, the CNN takes two-dimensional data directly as input, instead of converting each image matrix into a series of vectors as in conventional reconstruction-based saliency detection methods. On the other hand, the training of the CNN is augmented with an initialization obtained by the unsupervised learning of a convolutional auto-encoder (CAE). In this way, our CNNR model can be trained on limited labeled data, with the weights of the CNN meaningfully initialized by the CAE instead of randomly. Performance evaluations are conducted through comprehensive experiments on four benchmark datasets, and comparisons with eight state-of-the-art saliency detection models show that our proposed deep reconstruction model outperforms most of them.",
keywords = "convolutional neural network, auto-encoder, deep learning, reconstruction, saliency detection",
author = "Xinchen Lin and Yang Tang and Huaglory Tianfield and Feng Qian and Weimin Zhong",
note = "Acceptance from webpage File marked as 'submitted version' - query to author. ET 15/2/19 AAM provided, 12m embargo.",
year = "2019",
month = "1",
day = "23",
doi = "10.1016/j.neucom.2019.01.041",
language = "English",
journal = "Neurocomputing",
}

A novel approach to reconstruction based saliency detection via convolutional neural network stacked with auto-encoder. / Lin, Xinchen; Tang, Yang; Tianfield, Huaglory; Qian, Feng; Zhong, Weimin.

In: Neurocomputing, 23.01.2019.

Research output: Contribution to journal › Article

TY - JOUR

T1 - A novel approach to reconstruction based saliency detection via convolutional neural network stacked with auto-encoder

AU - Lin, Xinchen

AU - Tang, Yang

AU - Tianfield, Huaglory

AU - Qian, Feng

AU - Zhong, Weimin

N1 - Acceptance from webpage File marked as 'submitted version' - query to author. ET 15/2/19 AAM provided, 12m embargo.

PY - 2019/1/23

Y1 - 2019/1/23

JO - Neurocomputing

N2 - Visual saliency detection, which aims to simulate the human visual system (HVS), has drawn much attention in recent decades. Reconstruction-based saliency detection models predict unexpected regions via linear combination or an auto-encoder network. However, these models are ineffective in dealing with images because spatial information is lost when images are converted into vectors. In this paper, a novel approach is proposed to solve this problem. Its core is a deep reconstruction model, i.e., a convolutional neural network for reconstruction stacked with an auto-encoder (CNNR). On the one hand, the CNN takes two-dimensional data directly as input, instead of converting each image matrix into a series of vectors as in conventional reconstruction-based saliency detection methods. On the other hand, the training of the CNN is augmented with an initialization obtained by the unsupervised learning of a convolutional auto-encoder (CAE). In this way, our CNNR model can be trained on limited labeled data, with the weights of the CNN meaningfully initialized by the CAE instead of randomly. Performance evaluations are conducted through comprehensive experiments on four benchmark datasets, and comparisons with eight state-of-the-art saliency detection models show that our proposed deep reconstruction model outperforms most of them.

AB - Visual saliency detection, which aims to simulate the human visual system (HVS), has drawn much attention in recent decades. Reconstruction-based saliency detection models predict unexpected regions via linear combination or an auto-encoder network. However, these models are ineffective in dealing with images because spatial information is lost when images are converted into vectors. In this paper, a novel approach is proposed to solve this problem. Its core is a deep reconstruction model, i.e., a convolutional neural network for reconstruction stacked with an auto-encoder (CNNR). On the one hand, the CNN takes two-dimensional data directly as input, instead of converting each image matrix into a series of vectors as in conventional reconstruction-based saliency detection methods. On the other hand, the training of the CNN is augmented with an initialization obtained by the unsupervised learning of a convolutional auto-encoder (CAE). In this way, our CNNR model can be trained on limited labeled data, with the weights of the CNN meaningfully initialized by the CAE instead of randomly. Performance evaluations are conducted through comprehensive experiments on four benchmark datasets, and comparisons with eight state-of-the-art saliency detection models show that our proposed deep reconstruction model outperforms most of them.

KW - convolutional neural network

KW - auto-encoder

KW - deep learning

KW - reconstruction

KW - saliency detection

U2 - 10.1016/j.neucom.2019.01.041

DO - 10.1016/j.neucom.2019.01.041

M3 - Article

ER -