Design and evaluation of GAN-based models for adversarial training robustness in deep learning

Date

2023-04-01

Abstract

Adversarial attacks expose one of the generalization weaknesses of current deep learning models: poor performance on data drawn from a specially shifted distribution. The adversarial samples generated by an attack algorithm can induce malicious behavior in any deep learning system and undermine the consistency of the model's predictions. This thesis presents the design and evaluation of several candidate component architectures for a GAN that offer a new direction for training a robust convolutional classifier. Each component relates to a different aspect of the GAN that affects the generalization and robustness outcomes. The best formulation achieves around 45% accuracy under an L∞ PGD attack with a perturbation budget of 8/255 and 60% accuracy under an L2 PGD attack with a budget of 128/255, which outperforms L2 PGD adversarial training. Other contributions include studies of gradient masking, the transferability of robustness across attack constraints, and the limitations of generalization.
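
For readers unfamiliar with the PGD attacks named in the abstract, the following is a minimal sketch of projected gradient descent under an L∞ or L2 budget. It is illustrative only and is not the thesis's implementation: the toy logistic model, the function names (`pgd_attack`, `project`, `loss_grad_wrt_input`), the step size, and the iteration count are all assumptions, since the abstract states only the budgets 8/255 and 128/255.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_grad_wrt_input(w, b, x, y):
    """Gradient of binary cross-entropy w.r.t. input x for a logistic model.

    The logistic model is a stand-in (assumption) for a real classifier.
    """
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return [(sigmoid(z) - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def project(delta, eps, norm):
    """Project the perturbation back into the eps-ball of the chosen norm."""
    if norm == "linf":
        return [max(-eps, min(eps, d)) for d in delta]
    length = math.sqrt(sum(d * d for d in delta))
    if length <= eps:
        return delta
    return [d * eps / length for d in delta]

def pgd_attack(w, b, x, y, eps, step, iters, norm="linf"):
    """Iteratively ascend the loss, projecting onto the eps-ball around x."""
    x_adv = list(x)
    for _ in range(iters):
        g = loss_grad_wrt_input(w, b, x_adv, y)
        if norm == "linf":
            direction = [sign(gi) for gi in g]          # steepest ascent in L-inf
        else:
            g_len = math.sqrt(sum(gi * gi for gi in g)) or 1.0
            direction = [gi / g_len for gi in g]        # normalized gradient for L2
        moved = [xa + step * d for xa, d in zip(x_adv, direction)]
        delta = project([m - xi for m, xi in zip(moved, x)], eps, norm)
        # Keep the adversarial input in the valid pixel range [0, 1].
        x_adv = [min(1.0, max(0.0, xi + d)) for xi, d in zip(x, delta)]
    return x_adv

# Hypothetical toy example: three "pixels", true label 1.
w, b = [1.0, -2.0, 0.5], 0.0
x, y = [0.5, 0.5, 0.5], 1
x_linf = pgd_attack(w, b, x, y, eps=8 / 255, step=2 / 255, iters=10, norm="linf")
x_l2 = pgd_attack(w, b, x, y, eps=128 / 255, step=0.1, iters=10, norm="l2")
```

Robust accuracy figures such as those quoted above are obtained by running an attack of this kind against every test input and counting how many perturbed inputs are still classified correctly.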

Keywords

Adversarial attacks, Adversarial samples, Adversarial robustness, Adversarial training, Generative adversarial networks
