A Unified Framework for High Fidelity Face Swap and Expression Reenactment

Abstract

Face manipulation techniques have advanced rapidly with the development of powerful image generation models. Two face manipulation methods in particular, face swap and expression reenactment, have attracted much attention for their flexibility and their ability to generate high-quality synthesis results, and both have been actively studied in recent years. However, most existing methods treat the two tasks separately, ignoring their underlying similarity. In this paper, we propose to tackle both problems within a unified framework that achieves high-quality synthesis results. The enabling component of our unified framework is the clean disentanglement of the 3D pose, shape, and expression factors, which are then recombined according to the task at hand. The same set of 2D representations is used for both the face swap and expression reenactment tasks and fed into a common image translation model that directly generates the final synthetic images. Once trained, the proposed model can accomplish both face swap and expression reenactment for previously unseen subjects. Comprehensive experiments and comparisons show that the proposed method achieves high-fidelity results in multiple respects; it is especially good at faithfully preserving the source facial shape in the face swap task and at accurately transferring facial movements in the expression reenactment task.
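To make the recombination step concrete, the following is a minimal Python sketch of how the disentangled factors might be reassembled for each task. The function and field names (recombine_factors, shape, expression, pose) are illustrative assumptions, not the paper's actual interface, and the 3D estimation and rendering stages that surround this step are omitted.

def recombine_factors(src: dict, drv: dict, task: str) -> dict:
    """Recombine disentangled 3D factors for a given task.

    src: factors estimated from the source subject's image.
    drv: factors estimated from the driving/target image.
    Factor names and layout are assumptions for illustration only.
    """
    if task == "face_swap":
        # Preserve the source facial shape; take pose and expression
        # from the target frame so the swapped face fits the scene.
        return {"shape": src["shape"],
                "expression": drv["expression"],
                "pose": drv["pose"]}
    if task == "reenactment":
        # Keep the target subject's shape; drive it with the source
        # actor's expression and head pose.
        return {"shape": drv["shape"],
                "expression": src["expression"],
                "pose": src["pose"]}
    raise ValueError(f"unknown task: {task!r}")

# Example usage with dummy 1-D coefficient lists:
src = {"shape": [0.1, 0.2], "expression": [0.0, 0.5], "pose": [0.3]}
drv = {"shape": [0.4, 0.1], "expression": [0.9, 0.2], "pose": [0.7]}
swapped = recombine_factors(src, drv, "face_swap")

In the framework described above, the recombined factors would then be rendered into the shared 2D representations that the common image translation model consumes.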

Publication
IEEE Transactions on Circuits and Systems for Video Technology
Bo Peng
Associate Professor

Hongxing Fan
Master's student

Main research interests: video forensics, image and video processing, and non-contact physiological signal measurement.

Wei Wang
Associate Professor, Master's supervisor

Research focuses on multimedia content security, artificial intelligence security, and multimodal content analysis and understanding.

Jing Dong
Professor, Master's supervisor

Research focuses on multimedia content security, artificial intelligence security, and multimodal content analysis and understanding. For details, visit: http://cripac.ia.ac.cn/people/jdong