A Two-Stage Progressive Pre-training using Multi-Modal Contrastive Masked Autoencoders

By Muhammad Abdullah Jamal and Omid Mohareri
In this paper, we propose a new progressive pre-training method for image understanding tasks that leverages RGB-D datasets. The method uses Multi-Modal Contrastive Masked Autoencoder and Denoising techniques. Our proposed approach consists of two stages. In the first stage, we pre-train the model using contrastive learning to learn cross-modal representations....
September 16, 2024
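To make the two-stage schedule described in the abstract concrete, the sketch below shows one plausible way such a pipeline could be organized: stage 1 aligns paired RGB and depth embeddings with a contrastive objective, and stage 2 trains masked (and noised) reconstruction on both modalities. This is a minimal illustrative sketch only; the encoder/decoder classes, the `info_nce` and `mask_tokens` helpers, the masking on flattened features, and all hyperparameters are assumptions and not the paper's actual implementation.

```python
# Hypothetical two-stage RGB-D pre-training sketch (PyTorch).
# Stage 1: cross-modal contrastive alignment of RGB and depth embeddings.
# Stage 2: masked reconstruction with input noising (generic MAE-style recipe).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityEncoder(nn.Module):
    """Toy per-modality encoder standing in for a ViT backbone (hypothetical)."""
    def __init__(self, in_dim=768, emb_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, emb_dim), nn.GELU(), nn.Linear(emb_dim, emb_dim)
        )

    def forward(self, x):
        return self.net(x)

def info_nce(z_rgb, z_depth, temperature=0.07):
    """Symmetric InfoNCE loss between paired RGB and depth embeddings."""
    z_rgb = F.normalize(z_rgb, dim=-1)
    z_depth = F.normalize(z_depth, dim=-1)
    logits = z_rgb @ z_depth.t() / temperature
    targets = torch.arange(z_rgb.size(0), device=z_rgb.device)
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

def mask_tokens(x, mask_ratio=0.75):
    """Randomly zero out a fraction of features (stand-in for patch masking)."""
    mask = (torch.rand_like(x) > mask_ratio).float()
    return x * mask, mask

# Hypothetical models: one encoder per modality plus a shared reconstruction head.
rgb_enc, depth_enc = ModalityEncoder(), ModalityEncoder()
decoder = nn.Linear(256, 768)

params = list(rgb_enc.parameters()) + list(depth_enc.parameters()) + list(decoder.parameters())
opt = torch.optim.AdamW(params, lr=1e-4)

# Fake paired RGB-D batch (flattened patch features), purely for illustration.
rgb = torch.randn(32, 768)
depth = torch.randn(32, 768)

# Stage 1: contrastive pre-training to learn cross-modal representations.
for step in range(10):
    loss = info_nce(rgb_enc(rgb), depth_enc(depth))
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: masked reconstruction on each modality, with denoising-style corruption.
for step in range(10):
    noisy_rgb = rgb + 0.1 * torch.randn_like(rgb)       # additive noise on RGB input
    masked_rgb, _ = mask_tokens(noisy_rgb)
    masked_depth, _ = mask_tokens(depth)
    rec_loss = F.mse_loss(decoder(rgb_enc(masked_rgb)), rgb) \
             + F.mse_loss(decoder(depth_enc(masked_depth)), depth)
    opt.zero_grad(); rec_loss.backward(); opt.step()
```

In this sketch the stages share encoders so that stage 2 refines the representations learned contrastively in stage 1; whether the paper reuses or reinitializes components between stages is not specified in the excerpt shown here.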