EditYourself: Audio-Driven Generation and Manipulation of Talking Head Videos with Diffusion Transformers [TOP LAB](arxiv.org)
Current generative video models excel at producing novel content from text and image prompts, but they leave a critical gap in editing pre-recorded videos, where minor alterations to the spoken sc...