Revisiting ensemble adversarial attack

Abstract

Deep neural networks have shown vulnerability to adversarial attacks. Adversarial examples generated with an ensemble of source models can effectively attack unseen target models, posing a security threat to practical applications. In this paper, we investigate how ensemble adversarial attacks work from the viewpoint of network gradients with respect to inputs. We observe that most ensemble adversarial attacks simply average the gradients of the source models, ignoring their different contributions to the ensemble. To remedy this problem, we propose two novel ensemble strategies: the Magnitude-Agnostic Bagging Ensemble (MABE) strategy and the Gradient-Grouped Bagging And Stacking Ensemble (G2BASE) strategy. The former builds on a bagging ensemble and leverages a gradient normalization module to rebalance the ensemble weights. The latter divides diverse models into different groups according to their gradient magnitudes and combines an intragroup bagging ensemble with an intergroup stacking ensemble. Experimental results show that the proposed methods enhance the success rate in white-box attacks and further boost transferability in black-box attacks.
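The core observation above — that plain gradient averaging lets models with larger gradient magnitudes dominate the ensemble — can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; it only contrasts a naive average with a magnitude-agnostic variant (in the spirit of MABE) that normalizes each source model's gradient before averaging. The function names and the choice of L1 normalization here are illustrative assumptions.

```python
import numpy as np

def naive_ensemble_gradient(grads):
    # Plain average of per-model input gradients: models whose
    # gradients have larger magnitudes dominate the result.
    return np.mean(grads, axis=0)

def magnitude_agnostic_gradient(grads, eps=1e-12):
    # Illustrative sketch (not the paper's exact module): rescale each
    # model's gradient to unit L1 norm before averaging, so every model
    # contributes equally regardless of its gradient magnitude.
    normed = [g / (np.abs(g).sum() + eps) for g in grads]
    return np.mean(normed, axis=0)

# Two toy "source model" gradients: model B's magnitudes are 100x larger.
g_a = np.array([1.0, -1.0])
g_b = np.array([100.0, 100.0])

print(naive_ensemble_gradient([g_a, g_b]))      # dominated by model B
print(magnitude_agnostic_gradient([g_a, g_b]))  # both models weigh in
```

In the naive average the sign of every component follows model B, so model A's direction is effectively ignored; after normalization the two models contribute equally, which is the imbalance the proposed strategies are designed to correct.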

Publication
Signal Processing: Image Communication
何子文 (Ziwen He)
Ph.D. candidate, co-supervised

His research focuses on AI security and adversarial examples.

王伟 (Wei Wang)
Associate Professor, Master's supervisor

His research focuses on multimedia content security, AI security, and multimodal content analysis and understanding.

董晶 (Jing Dong)
Professor, Master's supervisor

Her research focuses on multimedia content security, AI security, and multimodal content analysis and understanding. For details, visit: http://cripac.ia.ac.cn/people/jdong

谭铁牛 (Tieniu Tan)
Professor, Ph.D. supervisor

His research covers image processing, computer vision, and pattern recognition, with current focus on three directions: biometrics, image and video understanding, and information content security.