Meta reposted this
Introducing DINOv3: a state-of-the-art computer vision model trained with self-supervised learning (SSL) that produces powerful, high-resolution image features. For the first time, a single frozen vision backbone outperforms specialized solutions on multiple long-standing dense prediction tasks. Learn more here: https://lnkd.in/giU_-6_M A few highlights of DINOv3: 1️⃣SSL enables 1.7B-image, 7B-param training without labels, supporting annotation-scarce scenarios including satellite imagery 2️⃣Produces excellent high-resolution features and state-of-the art performance on dense prediction tasks 3️⃣Versatile application across vision tasks and domains, all with a frozen backbone (no fine-tuning required) 4️⃣ Includes distilled smaller models (ViT-B, ViT-L) and ConvNeXt variants for deployment flexibility To help foster innovation and collaboration in the computer vision community, we’re releasing DINOv3 under a commercial license with a full suite of pre-trained models, adapters, training and evaluation code, and (much!) more. Find them here: https://lnkd.in/gEptEtVR
Démo impressionnante !
DINOv3 is revolutionary—high-resolution vision features are now faster, smarter, and versatile!
Ness, indeed.
Niicce, I was waiting for it!! Thanks for Having ConvNeXt variant this time, and evaluating satellite imagery!
Congrats! 🎉
Excited for this 🔥
WOW!!
Douglas Sammur Lizandro Kenedy Railson Lima
It's more than a technical upgrade. With 7B parameters and 1.7B training images, DINOv3's pivot to visual in-context prompting marks a conceptual shift combining scale and simplicity to advance intuitive, open-set segmentation and multimodal alignment. Excited to see real-world business cases emerge, especially in healthcare and medical imaging, where semantic segmentation and resource-constrained diagnostics could benefit immensely.