LDA-1B: Scaling Latent Dynamics Action Model via Universal Embodied Data Ingestion

Arxiv Github RSS (under submission)

Recent robot foundation models largely rely on large-scale behavior cloning, which imitates expert …

NavSpace: How Navigation Agents Follow Spatial Intelligence Instructions

Hao Dong1,2,‡

Instruction-following navigation is a key step toward embodied intelligence.

Neural Force Field: Few shot learning of generalized physical reasoning

Ruihong Shen

Arxiv Github ICLR

We present NFF, a modeling framework built on NODE that learns interpretable force field …

MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

Lingjun Chen

Michael Qizhe Shieh

Arxiv Github ICLR 2026

The MCP standardizes how LLMs interact with external systems, forming the foundation for general …

Learning Physics-Grounded 4D Dynamics with Neural Gaussian Force Fields

Ruihong Shen

Arxiv Github ICLR

Predicting physical dynamics from visual data remains a fundamental challenge in AI, as it requires …

Luminark: Training-free, Probabilistically-Certified Watermarking for General Vision Generative Models

In this paper, we introduce Luminark, a training-free and probabilistically-certified …

CorrectNav: Self-Correction Flywheel Empowers Vision-Language-Action Navigation Model

Arxiv AAAI CCF A

Existing vision-and-language navigation models often deviate from the correct trajectory when …

SimLauncher: Launching Sample-Efficient Real-world Robotic Reinforcement Learning via Simulation Pre-training

Arxiv IROS 2025

Autonomous learning of dexterous, long-horizon robotic skills has been a longstanding pursuit of …

Playing with Transformer at 30+ FPS via Next-Frame Diffusion

Arxiv NeurIPS 2025

In this work, we present Next-Frame Diffusion (NFD), an autoregressive diffusion transformer that …

Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation

Arxiv CVPR 2025

OmniPhysGS: 3D Constitutive Gaussians for General Physics-based Dynamics Generation

Arxiv ICLR 2025

ChemAgent: Self-updating Memories in Large Language Models Improves Chemical Reasoning

Wangchunshu Zhou

Zhuosheng Zhang

Arxiv ICLR 2025

We present ChemAgent, a novel framework designed to improve the performance of LLMs through a …

Autonomous Character-Scene Interaction Synthesis from Text Instruction

Siyuan Huang†

Arxiv SIGGRAPH Asia 2024

DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes

Jialiang Zhang*

Scaling up dynamic human-scene interaction modeling

Siyuan Huang†

Arxiv Github CVPR 2024