VL-JEPA: Joint Embedding Predictive Architecture for Vision-Language

(arxiv.org)

2 points | by andsoitis 12 hours ago

No comments yet.