AutoTVG: A New Vision-language Pre-training Paradigm for Temporal Video Grounding

By Xing Zhang and others
Temporal Video Grounding (TVG) aims to localize a moment in an untrimmed video given a language description. Since annotating TVG data is labor-intensive, TVG under limited supervision has received attention in recent years. The great success of vision-language pre-training guides TVG to follow the traditional "pre-training + fine-tuning" paradigm,...
June 11, 2024