arXiv Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning

Name
Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
Home page
https://yiyibooks.cn/arxiv/2303.14369v1/index.html
Original URL
https://arxiv.org/pdf/2303.14369
Description
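Contrastive learning-based video-language representation methods, such as CLIP, have achieved outstanding performance by pursuing semantic interaction over predefined video-text pairs ...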