DiT-Head: High Resolution Talking Head Synthesis using Diffusion Transformers

Mir, A.; Alonso, E.; Mondragon, E.

Mir, A., Alonso, E. ORCID: 0000-0002-3306-695X & Mondragon, E. ORCID: 0000-0003-4180-1261 (2024). DiT-Head: High Resolution Talking Head Synthesis using Diffusion Transformers. In: Proceedings of the 16th International Conference on Agents and Artificial Intelligence. 16th International Conference on Agents and Artificial Intelligence (ICAART 2024), 24-26 Feb 2024, Rome, Italy. doi: 10.5220/0012312200003636

Abstract

We propose a novel talking head synthesis pipeline called "DiT-Head", which is based on diffusion transformers and uses audio as a condition to drive the denoising process of a diffusion model. Our method is scalable and can generalise to multiple identities while producing high-quality results. We train and evaluate our proposed approach and compare it against existing methods of talking head synthesis. We show that our model can compete with these methods in terms of visual quality and lip-sync accuracy. Our results highlight the potential of our proposed approach to be used for a wide range of applications, including virtual assistants, entertainment, and education. For a video demonstration of results and our user study, please refer to our supplementary material.

Publication Type:	Conference or Workshop Item (Paper)
Additional Information:	This paper will be presented at the 16th International Conference on Agents and Artificial Intelligence (ICAART 2024).
Publisher Keywords:	Talking Head Synthesis, Diffusion Transformers
Subjects:	Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Departments:	School of Science & Technology > Department of Computer Science

Text - Accepted Version

Creators:	Mir, A.; Alonso, E. (ORCID: 0000-0002-3306-695X); Mondragon, E. (ORCID: 0000-0003-4180-1261)
Event Title:	16th International Conference on Agents and Artificial Intelligence (ICAART 2024)
Event Type:	Conference
Event Location:	Rome, Italy
Event Dates:	24-26 Feb 2024
Status:	Published
Refereed:	Yes
Journal or Publication Title:	Proceedings of the 16th International Conference on Agents and Artificial Intelligence
URI:	https://openaccess.city.ac.uk/id/eprint/31834
Date available in CRO:	08 Dec 2023 10:36
Date deposited:	7 December 2023
Dates:	7 December 2023 (Accepted); 26 February 2024 (Published)