IIIT-Delhi Institutional Repository

Text-to-talking face: emotionally rich facial animation synthesis


dc.contributor.author Maini, Sarthak
dc.contributor.author Shukla, Jainendra (Advisor)
dc.date.accessioned 2024-05-07T13:22:27Z
dc.date.available 2024-05-07T13:22:27Z
dc.date.issued 2023-11-29
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/1399
dc.description.abstract Given the facial image of an individual along with audio, talking face generation aims to synthesize portrait videos of that individual conditioned on the given audio. Existing methods focus on generating talking face videos conditioned on audio together with a portrait image or a driving video. Moreover, these methods struggle to produce realistic head movements and facial expressions that align with the audio content. To tackle these issues, we introduce a novel framework for synthesizing expressive talking faces based solely on textual input, where the facial characteristics or name of the subject is provided along with the audio content to be spoken. Our work uses Facial Action Units (FAUs) to explicitly model the facial characteristics of the subject, along with other implicit parameters responsible for talking face synthesis. The framework explicitly models lip synchronization with the audio, head motion, and facial expressions, resulting in a photo-realistic, emotionally expressive talking face. en_US
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Talking Face Generation en_US
dc.subject Facial Action Units en_US
dc.subject Text-to-Face Synthesis en_US
dc.subject Emotionally Expressive Avatars en_US
dc.title Text-to-talking face: emotionally rich facial animation synthesis en_US
dc.type Other en_US
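
For illustration only, the sketch below shows one possible way to organize the inputs and per-frame parameters described in the abstract (explicit FAU intensities plus implicit head-motion and lip parameters). Every name in it is a hypothetical placeholder and does not reflect the thesis implementation.

    # Minimal, self-contained Python sketch; all names are hypothetical.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class TalkingFaceRequest:
        subject_description: str   # facial characteristics or name of the subject (text input)
        speech_content: str        # content to be spoken, from which driving audio would be derived

    @dataclass
    class FrameParameters:
        action_units: List[float]  # explicit Facial Action Unit intensities
        head_pose: List[float]     # implicit head-motion parameters (yaw, pitch, roll)
        lip_shape: List[float]     # lip parameters to be synchronized with the audio

    def synthesize(request: TalkingFaceRequest) -> List[FrameParameters]:
        """Toy stand-in: emit one neutral parameter set per word of the speech content."""
        n_frames = len(request.speech_content.split())
        return [FrameParameters(action_units=[0.0] * 17,
                                head_pose=[0.0, 0.0, 0.0],
                                lip_shape=[0.0] * 20)
                for _ in range(n_frames)]

    if __name__ == "__main__":
        req = TalkingFaceRequest(subject_description="young adult, short hair, glasses",
                                 speech_content="Hello, welcome to the demonstration.")
        print(len(synthesize(req)), "frame parameter sets generated")

In an actual system, the neutral parameters returned by synthesize would instead be predicted by learned models and passed to a renderer that produces the final video frames.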

