Exploring Adversarial Fake Images on Face Manifold

Adversarial images generated by different methods

Abstract

Images synthesized by powerful generative adversarial network (GAN) based methods have raised moral and privacy concerns. Although image forensic models have achieved strong performance in distinguishing fake images from real ones, they can be easily fooled by a simple adversarial attack. However, such noise-adding adversarial samples also arouse suspicion. In this paper, instead of adding adversarial noise, we optimally search for adversarial points on the face manifold to generate anti-forensic fake face images. We iteratively perform gradient descent in small steps in the latent space of a generative model, e.g. Style-GAN, to find an adversarial latent vector; this resembles a norm-based adversarial attack, but operates in latent space. The fake images generated from these adversarial latent vectors then defeat mainstream forensic models. For example, they drop the accuracy of deepfake detection models based on Xception or EfficientNet from over 90% to nearly 0%, while maintaining high visual quality. In addition, we find that manipulating the style vector $z$ or the noise vectors $n$ at different levels affects the attack success rate. The generated adversarial images mainly exhibit changes in facial texture or face attributes.

Publication
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR, Oral)

We propose a novel method to generate adversarial anti-forensic images that can bypass deep forensic models.

Figure 1: Pipeline of our method

  • The overall pipeline of our method. We perform gradient descent on the latent vector and noise inputs of Style-GAN, separately or jointly, maximizing the loss function of the target forensic model(s).
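The latent-space search above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `generator` and `detector` are hypothetical stand-ins (in the paper they would be Style-GAN and an Xception/EfficientNet forensic model), and the toy networks at the bottom exist only so the loop runs end to end.

```python
import torch

torch.manual_seed(0)

def latent_attack(generator, detector, z, steps=50, lr=0.05):
    """Gradient descent on the latent vector: instead of adding pixel
    noise, nudge z so the generated image is scored as real."""
    z = z.clone().detach().requires_grad_(True)
    for _ in range(steps):
        fake_score = detector(generator(z)).sum()  # "fake" score to minimize
        grad, = torch.autograd.grad(fake_score, z)
        with torch.no_grad():
            z -= lr * grad                         # small step in latent space
    return z.detach()

# Toy demo with a linear "generator" and a sigmoid "detector" (stand-ins).
g = torch.nn.Linear(8, 16)
d = lambda x: torch.sigmoid(x.sum(dim=-1))
z0 = torch.randn(1, 8)
z_adv = latent_attack(g, d, z0)
```

Because each step moves the latent vector rather than the pixels, every intermediate image still lies on the generator's face manifold, which is why the results stay visually clean.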

Figure 2: Generated images from different methods

  • Figure 2 shows adversarial images generated by different methods. Upper left: the original Style-GAN-generated image. Upper right: the image generated by our method. Lower left and lower right: adversarial images generated by the FGSM [8] and PGD [21] Linf norm-based attacks, respectively, at the same perturbation level. Although all of these images can bypass the target forensic model, the perturbations introduced by our method are far less perceptible to the human eye.

Table 1: Accuracy of image forensic models before and after attack

  • Table 1 shows the accuracy of different forensic models under our method and other adversarial attacks: PGD L2, PGD Linf, and FGSM. Our method bypasses the forensic detectors as effectively as the norm-based PGD Linf attack and shows stronger adversarial strength than the FGSM and PGD L2 attacks. The PGD L2 attack performs poorly on both models because of its limited perturbation scale.
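For reference, the norm-based baselines in Table 1 perturb pixels directly. A minimal PGD Linf sketch, again with a hypothetical `detector` stand-in (values for `eps` and `alpha` are illustrative, not the paper's settings):

```python
import torch

def pgd_linf(detector, img, eps=8 / 255, alpha=2 / 255, steps=10):
    """Pixel-space PGD under an L-inf budget: unlike the latent-space
    attack, the perturbation is added directly to the image."""
    x = img.clone().detach()
    for _ in range(steps):
        x.requires_grad_(True)
        fake_score = detector(x).sum()           # "fake" score to minimize
        grad, = torch.autograd.grad(fake_score, x)
        with torch.no_grad():
            x = x - alpha * grad.sign()          # signed gradient step
            x = img + (x - img).clamp(-eps, eps) # project onto L-inf ball
            x = x.clamp(0, 1)                    # keep valid pixel range
    return x.detach()

# Toy demo on a random "image" with a sigmoid "detector" (stand-ins).
d = lambda x: torch.sigmoid(x.mean(dim=(1, 2, 3)))
x0 = torch.rand(1, 3, 8, 8)
x_adv = pgd_linf(d, x0)
```

The projection step is what bounds the perturbation; it is also why, at a fixed budget, these attacks leave the visible noise that the latent-space search avoids.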

Table 2: Image quality metrics

  • Table 2 reports metrics measuring the distortion between adversarial images and their reference images. The proposed method performs comparably to FGSM in MSE, PSNR, and SSIM, while surpassing the other methods in LPIPS and the user study by a large margin.
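Of the metrics in Table 2, MSE and PSNR are simple pixel-wise quantities; a short sketch of how they are computed (SSIM and LPIPS are omitted, as LPIPS requires a learned network and SSIM a windowed computation):

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two images with values in [0, 1]."""
    return float(np.mean((a - b) ** 2))

def psnr(a, b, max_val=1.0):
    """Peak signal-to-noise ratio in dB; higher means less distortion."""
    m = mse(a, b)
    return float("inf") if m == 0 else 10 * np.log10(max_val ** 2 / m)

# Toy example: a reference "image" and a slightly perturbed copy.
rng = np.random.default_rng(0)
ref = rng.random((8, 8, 3))
adv = np.clip(ref + 0.01, 0, 1)
```

A uniform perturbation of 0.01 on a [0, 1] image gives an MSE of roughly 1e-4, i.e. a PSNR of about 40 dB, which is the order of magnitude at which distortions become hard to see.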

Citation

@InProceedings{Li_2021_CVPR,
    author    = {Li, Dongze and Wang, Wei and Fan, Hongxing and Dong, Jing},
    title     = {Exploring Adversarial Fake Images on Face Manifold},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {5789-5798}
}
Dongze Li
Master's Student

Master's student (class of 2020) at the Institute of Automation, Chinese Academy of Sciences.

Wei Wang
Associate Professor, Master's Supervisor

Research interests: multimedia content security, AI security, and multimodal content analysis and understanding.

Hongxing Fan
Master's Degree

Research interests: video forensics, image and video processing, and contactless physiological signal measurement.

Jing Dong
Professor, Master's Supervisor

Research interests: multimedia content security, AI security, and multimodal content analysis and understanding. For details, visit: http://cripac.ia.ac.cn/people/jdong