Multi-Modal Robotic World Modeling via Physically Consistent Video Generation and Cross-View Representation Alignment