Phenaki is an AI model that generates videos, potentially several minutes long, directly from text. It can also generate video from a still image plus a prompt. According to its authors, the proposed video encoder-decoder outperforms all per-frame baselines currently used in the literature in terms of spatio-temporal quality and number of tokens per video. To generate video tokens from text, the authors use a bidirectional masked transformer conditioned on pre-computed text tokens. The generated video tokens are then de-tokenized to produce the actual video.
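To make the token-generation step more concrete, here is a minimal toy sketch of iterative masked decoding, the general technique a bidirectional masked transformer relies on: start with every video token masked, predict all positions at once, and keep only the most confident predictions each round. Everything here is an assumption for illustration, not Phenaki's actual code: `toy_predict` is a hypothetical random stand-in for the transformer, the cosine unmasking schedule is borrowed from MaskGIT-style decoding, and the names, token counts, and vocabulary size are made up.

```python
import math
import random

MASK = -1  # placeholder id for a video token that has not been generated yet


def toy_predict(tokens, text_tokens, vocab_size, rng):
    """Hypothetical stand-in for the transformer.

    A real model would attend to all video tokens (masked or not) and to the
    text tokens. This toy version returns a random (token, confidence) pair
    for every position and ignores its inputs.
    """
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def masked_decode(text_tokens, num_video_tokens=16, vocab_size=8, steps=4, seed=0):
    rng = random.Random(seed)
    tokens = [MASK] * num_video_tokens  # begin fully masked

    for step in range(1, steps + 1):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break

        predictions = toy_predict(tokens, text_tokens, vocab_size, rng)

        # Assumed cosine schedule: how many tokens stay masked after this step.
        keep_masked = max(
            0, math.floor(num_video_tokens * math.cos(math.pi / 2 * step / steps))
        )
        num_to_fill = max(1, len(masked) - keep_masked)

        # Commit the most confident predictions; the rest stay masked.
        masked.sort(key=lambda i: predictions[i][1], reverse=True)
        for i in masked[:num_to_fill]:
            tokens[i] = predictions[i][0]

    return tokens


video_tokens = masked_decode(text_tokens=[101, 7, 42])
print(video_tokens)
```

In a real system the resulting token ids would be passed to the video decoder (the de-tokenizer) to produce frames; that stage is omitted here.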