Model that generates full-body animated video from a single image. Produces realistic gestures and body motion conditioned on speech audio.
Open source