
Arxiv

Playing with Transformer at 30+ FPS via Next-Frame Diffusion

Arxiv NeurIPS 2025

In this work, we present Next-Frame Diffusion (NFD), an autoregressive diffusion transformer that incorporates block-wise causal attention, enabling iterative sampling and efficient inference via parallel token generation within each frame.

Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation

Arxiv CVPR 2025

OmniPhysGS: 3D Constitutive Gaussians for General Physics-based Dynamics Generation

Arxiv ICLR 2025

ChemAgent: Self-updating Memories in Large Language Models Improves Chemical Reasoning

Wangchunshu Zhou

Zhuosheng Zhang

Arxiv ICLR 2025

We present ChemAgent, a novel framework designed to improve the performance of LLMs through a dynamic, self-updating library.

Autonomous Character-Scene Interaction Synthesis from Text Instruction

3 December 2024

Siyuan Huang†

Arxiv SIGGRAPH Asia 2024

DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes

6 November 2024

Jialiang Zhang*

Scaling Up Dynamic Human-Scene Interaction Modeling

Siyuan Huang†

Arxiv Github CVPR 2024