Face manipulation techniques have advanced rapidly with the development of powerful image generation models. Two face manipulation methods in particular, face swap and expression reenactment, have attracted much attention and active study for their flexibility and the ease with which they generate high-quality synthesis results. However, most existing methods treat the two tasks separately, ignoring their underlying similarity. In this paper, we propose to tackle both problems within a unified framework that achieves high-quality synthesis results. The enabling component of our unified framework is the clean disentanglement of 3D pose, shape, and expression factors, which are then recombined according to the task at hand. Both face swap and expression reenactment then use the same set of 2D representations, which are fed to a common image translation model that directly generates the final synthetic images. Once trained, the proposed model can perform both face swap and expression reenactment for previously unseen subjects. Comprehensive experiments and comparisons show that the proposed method achieves high-fidelity results in multiple aspects; it is especially good at faithfully preserving the source facial shape in the face swap task and at accurately transferring facial movements in the expression reenactment task.