
Video Generation【Article Roundup】: SVD, Sora, Latte, VideoCrafter1/2, DiT, ...

Datasets

Metrics

【arXiv 2024】MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions

Authors: Xuan Ju, Yiming Gao, Zhaoyang Zhang, Ziyang Yuan, Xintao Wang, Ailing Zeng, Yu Xiong, Qiang Xu, Ying Shan

Abstract: Sora's high-motion intensity and long consistent videos have significantly impacted the field of video generation, attracting unprecedented attention. However, existing publicly available datasets are inadequate for generating Sora-like videos, as they mainly contain short videos with low motion intensity and brief captions. To address these issues, we propose MiraData, a high-quality video dataset that surpasses previous ones in video duration, caption detail, motion strength, and visual quality. We curate MiraData from diverse, manually selected sources and meticulously process the data to obtain semantically consistent clips. GPT-4V is employed to annotate structured captions, providing detailed descriptions from four different perspectives along with a summarized dense caption. To better assess temporal consistency and motion intensity in video generation, we introduce MiraBench, which enhances existing benchmarks by adding 3D consistency and tracking-based motion strength metrics. MiraBench includes 150 evaluation prompts and 17 metrics covering temporal consistency, motion strength, 3D consistency, visual quality, text-video alignment, and distribution similarity. To demonstrate the utility and effectiveness of MiraData, we conduct experiments using our DiT-based video generation model, MiraDiT. The experimental results on MiraBench demonstrate the superiority of MiraData, especially in motion strength.

【Paper】 > 【Github_Code】 > 【Project】 > 【Commentary (coming soon)】
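The abstract does not spell out how MiraBench's tracking-based motion strength metric works. As a rough illustration of the idea, here is a minimal sketch that scores motion strength as the mean dense optical-flow magnitude across consecutive frames. This is a common proxy, not the paper's actual implementation; the function name `motion_strength` and the OpenCV-based pipeline are assumptions for illustration only.

```python
# Minimal sketch of a motion-strength proxy: mean dense optical-flow
# magnitude across consecutive frames. NOT MiraBench's actual metric,
# just a common stand-in for "how much does this video move".
import cv2
import numpy as np

def motion_strength(video_path: str, max_frames: int = 64) -> float:
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        raise ValueError(f"cannot read {video_path}")
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    magnitudes = []
    for _ in range(max_frames - 1):
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Farneback dense optical flow between consecutive frames.
        flow = cv2.calcOpticalFlowFarneback(
            prev_gray, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        # Per-pixel displacement magnitude, averaged over the frame.
        magnitudes.append(np.linalg.norm(flow, axis=2).mean())
        prev_gray = gray
    cap.release()
    # Average over all frame pairs: higher = stronger motion.
    return float(np.mean(magnitudes)) if magnitudes else 0.0
```

On this proxy a static clip scores near zero, while camera pans or fast subject motion push the score up, which matches the motion-intensity axis the paper emphasizes.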

【CVPR 2024】VBench: Comprehensive Benchmark Suite for Video Generative Models

Authors: Ziqi Huang, Yinan He, Jiashuo Yu, Fan Zhang, Chenyang Si, Yuming Jiang, Yuanhan Zhang, Tianxing Wu, Qingyang Jin, Nattapol Chanpaisit, Yaohui Wang, Xinyuan Chen, Limin Wang, Dahua Lin, Yu Qiao, Ziwei Liu

Abstract: Video generation has witnessed significant advancements, yet evaluating these models remains a challenge. A comprehensive evaluation benchmark for video generation is indispensable for two reasons: 1) Existing metrics do not fully align with human perceptions; 2) An ideal evaluation system should provide insights to inform future developments of video generation. To this end, we present VBench, a comprehensive benchmark suite that dissects "video generation quality" into specific, hierarchical, and disentangled dimensions, each with tailored prompts and evaluation methods. VBench has three appealing properties: 1) Comprehensive Dimensions: VBench comprises 16 dimensions in video generation (e.g., subject identity inconsistency, motion smoothness, temporal flickering, and spatial relationship, etc.). The evaluation metrics with fine-grained levels reveal individual models' strengths and weaknesses. 2) Human Alignment: We also provide a dataset of human preference annotations to validate our benchmarks' alignment with human perception, for each evaluation dimension respectively. 3) Valuable Insights: We look into current models' ability across various evaluation dimensions, and various content types. We also investigate the gaps between video and image generation models. We will open-source VBench, including all prompts, evaluation methods, generated videos, and human preference annotations, and also include more video generation models in VBench to drive forward the field of video generation.

【Paper】 > 【Github_Code】 > 【Project】 > 【Commentary (coming soon)】
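Each of VBench's 16 dimensions comes with its own tailored evaluation method. As one self-contained illustration (a simplified stand-in, not VBench's official code), the sketch below scores a temporal-flickering-style dimension as the mean absolute difference between adjacent frames, normalized so that 1.0 means a perfectly stable clip. The function name `temporal_flicker_score` is hypothetical.

```python
# Simplified flickering score in the spirit of VBench's "temporal
# flickering" dimension (not the official implementation): mean absolute
# difference between adjacent frames, where 1.0 = perfectly stable.
import numpy as np

def temporal_flicker_score(frames: np.ndarray) -> float:
    """frames: (T, H, W, C) uint8 array of decoded video frames."""
    if frames.shape[0] < 2:
        return 1.0
    f = frames.astype(np.float32)
    # Mean absolute per-pixel change between consecutive frames.
    diffs = np.abs(f[1:] - f[:-1]).mean(axis=(1, 2, 3))
    # Map the mean difference (0..255) to a score in [0, 1];
    # higher means steadier, less flickery video.
    return float(1.0 - diffs.mean() / 255.0)
```

A perfectly static clip yields all-zero differences and a score of 1.0; heavy frame-to-frame flicker drives the score toward 0.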

【arXiv 2024】xxx

Authors:

Abstract

【Paper】 > 【Github_Code】 > 【Project】 > 【Commentary (coming soon)】


Original article: https://blog.csdn.net/qq_45934285/article/details/140653460
