Video-Infinity1
Distributed Long Video Generation

Generates videos with 2,300 frames in 5 minutes, approximately 100 times faster than prior methods *

COMPARISON

Maximum Frames

Video-Infinity: 2,300
Streaming T2V: 1,200 *
OpenSora V1.1: 128
Free Noise: 120

Time Cost (120 frames)

Video-Infinity: 20s
Free Noise: 187s
OpenSora V1.1: 217s
Streaming T2V: 1,604s
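As a quick check on the timings above, the following sketch computes how many times faster Video-Infinity is than each baseline at 120 frames. The dictionary values are copied from the list above; the names `times_s` and `speedup_over` are illustrative and not part of any released code.

```python
# Reported wall-clock times (seconds) to generate 120 frames,
# copied from the "Time Cost (120 frames)" list.
times_s = {
    "Video-Infinity": 20,
    "Free Noise": 187,
    "OpenSora V1.1": 217,
    "Streaming T2V": 1604,
}


def speedup_over(baseline: str, reference: str = "Video-Infinity") -> float:
    """Return how many times faster `reference` is than `baseline`."""
    return times_s[baseline] / times_s[reference]


if __name__ == "__main__":
    for name in times_s:
        if name != "Video-Infinity":
            print(f"{name}: {speedup_over(name):.2f}x slower than Video-Infinity")
```

At 120 frames the ratios come out to roughly 9x, 11x, and 80x. The "100 times faster" headline figure refers to a different setting (Streaming T2V generating 1,024 frames, per footnote [2]), so it is not expected to match the 120-frame ratio exactly.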
[Video comparison: Video-Infinity *, Free Noise *, OpenSora V1.1, Streaming T2V; panel labels: GPU1, GPU2]

ABLATION

Ablating Clip Parallelism and Dual-scope Attention to assess their effects.

Clip Parallelism: Attention, Conv & GroupNorm

Dual-scope Attention: Global-scope, Local-scope

MULTI-PROMPTS

ACKNOWLEDGEMENT

[1] Our method generates videos with 2,300 frames in 5 minutes, a generation rate of 7.6 fps, using 30 sampling steps.
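The rate in footnote [1] follows from simple division. The sketch below reproduces it, assuming the run took exactly 5 minutes (300 s); the helper name `throughput_fps` is illustrative only.

```python
def throughput_fps(frames: int, seconds: float) -> float:
    """Generated frames per second of wall-clock time."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return frames / seconds


# Assumption: "5 minutes" is taken as exactly 300 seconds.
frames = 2300
seconds = 5 * 60

fps = throughput_fps(frames, seconds)
seconds_per_frame = seconds / frames

if __name__ == "__main__":
    print(f"{fps:.2f} frames per second")
    print(f"{seconds_per_frame:.3f} seconds per frame")
```

Under that assumption the ratio is about 7.67 frames per second, which agrees with the quoted 7.6 fps if the figure was truncated or the actual run time was slightly over 5 minutes.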

[2] ... which is approximately 100 times faster than previous methods. The comparison is against the time taken by the 'Streaming T2V' method under its settings for generating extremely long videos with 1,024 frames.

[3] The maximum frame count for 'Streaming T2V' is listed as 1,200 because that is the longest frame sequence mentioned in the original text, and no videos generated by it currently exceed 1,200 frames.

[4] Our comparison experiments were conducted on 8 × NVIDIA Ada 6000 GPUs.

[5] The methods 'Free Noise' and 'Video-Infinity' are based on the 'VideoCrafter2' model.