Constructing adversarial examples to investigate the plausibility of explanations in deep audio and image classifiers

License: CC BY

Bibliographic Details
Main Authors: Katharina Hoedt, Verena Praher, Arthur Flexer
Format: Article
Language: English
Published: Springer, 2023
Subjects: black-box nature; deep audio and image classifiers
Online Access:https://link.springer.com/article/10.1007/s00521-022-07918-7
https://dlib.phenikaa-uni.edu.vn/handle/PNK/8325
Description: Given the rise of deep learning and its inherent black-box nature, the desire to interpret these systems and explain their behaviour has become increasingly prominent. The main idea of so-called explainers is to identify which features of a particular sample have the most influence on a classifier's prediction and to present these features as explanations. Evaluating explainers, however, is difficult, for reasons such as a lack of ground truth. In this work, we construct adversarial examples to check the plausibility of explanations, deliberately perturbing the input to change a classifier's prediction. This allows us to investigate whether explainers are able to detect the perturbed regions as the parts of an input that strongly influence a particular classification. Our results from the audio and image domains suggest that the investigated explainers often fail to identify the input regions most relevant for a prediction; hence, it remains questionable whether explanations are useful or potentially misleading.
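For illustration only, here is a minimal sketch of the idea described in the abstract, not the authors' implementation: a perturbation confined to a chosen input region is optimised until the classifier's prediction changes, a plain input-gradient saliency map stands in for an explainer, and the overlap between the most salient pixels and the perturbed region is measured. The framework (PyTorch), the attack, the explainer stand-in, and the overlap metric are all assumed, illustrative choices.

```python
# Minimal sketch (PyTorch); illustrative assumptions throughout, not the paper's code.
import torch
import torch.nn.functional as F

def masked_adversarial(model, x, y, mask, eps=0.1, steps=100, lr=0.01):
    """Optimise a perturbation confined to `mask` that pushes the prediction away from label y."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        x_adv = x + delta * mask                      # perturb the masked region only
        loss = -F.cross_entropy(model(x_adv), y)      # maximise loss w.r.t. the true label
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)                   # keep the perturbation small
    return (x + delta * mask).detach()

def gradient_saliency(model, x, target):
    """Plain input-gradient saliency, used here as a stand-in for an explainer."""
    x = x.clone().requires_grad_(True)
    model(x)[0, target].backward()
    return x.grad.abs().sum(dim=1)                    # aggregate channels -> (1, H, W)

def topk_overlap(saliency, mask, k=100):
    """Fraction of the k most salient pixels that fall inside the perturbed region."""
    sal_flat = saliency.flatten()
    mask_flat = mask[:, 0].flatten()                  # mask assumed shaped like the input
    hits = mask_flat[sal_flat.topk(k).indices]
    return hits.float().mean().item()

# Usage sketch (a trained model, an image x of shape (1, 3, H, W), its label y,
# and a binary region mask of the same shape are assumed to exist):
#   x_adv = masked_adversarial(model, x, y, mask)
#   new_label = model(x_adv).argmax(dim=1)
#   sal = gradient_saliency(model, x_adv, new_label.item())
#   print("top-k overlap with perturbed region:", topk_overlap(sal, mask))
```

A low overlap score in such a setup would correspond to the paper's concern: the explanation fails to highlight the very region that was manipulated to cause the prediction.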