CVPR 24 RealNet

CVPR'24 RealNet

This is the official video demonstration for the

[CVPR 2026]

Disentangle-then-Align: Non-Iterative Hybrid Multimodal Image Registration via Cross-Scale Feature Disentanglement.

CVPR 2026 Paper Pre

Adapting In-context Generation for Enhanced Composed Image Retrieval.

CVPR 2026 paper: RAP

This is the presentation of the paper RAP: Fast Feedforward Rendering-Free Attribute-Guided Primitive Importance Score Prediction ...

[CVPR 2026] Explicit Recovery Behavior for Diffusion Policies (REACH)

Are diffusion policies in robot learning too brittle for the real world? In this video, we introduce REACH (Recovery through ...

CVPR 2026

CVPR '26 | R2VLM

CVPR26 Poster: Recurrent Reasoning with Vision-Language Models for Estimating Long-Horizon Embodied Task Progress.

[CVPR 2026] Fine-Grained Token Grounding as a Robust Detector of LVLM Hallucinations

CVPR

CVPR 2026 AGENTSAFE

AGENTSAFE: Benchmarking the Safety of Embodied Agents on Hazardous Instructions.

[CVPR 2026] VIMCAN

VIMCAN: Visual-Inertial 3D Human Pose Estimation with Hybrid Mamba-Cross-Attention Network.

CVPR 2026 Poster Presentation

In this video, we introduce a novel video object detection framework called D2FANet. D2FANet is the first framework to jointly ...

[CVPR 2026] MUST

MUST: Modality-Specific Representation-Aware Transformer for Diffusion-Enhanced Survival Prediction with Missing Modality.

[CVPR 2026] RealVLG-R1

[CVPR 2026] Spatial-Frequency Aligned Diffusion Features for Cross-Sparsity Correspondence

[CVPR 2026] Virtual Full-stack Scanning of Brain MRI via Imputing Any Quantised Code

[CVPR 2026] 44354_MMCP-GEN_YouTube video

[CVPR 2026] REL-SF4PASS

REL-SF4PASS: Panoramic Semantic Segmentation with REL Depth Representation and Spherical Fusion.

[CVPR 2026] LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling

A 5-min short video introducing our published work at