Arxiv

Playing with Transformer at 30+ FPS via Next-Frame Diffusion
Xinle Cheng
Tianyu He†
Jiayi Xu
Junliang Guo
Di He
Jiang Bian
Arxiv NeurIPS 2025
In this work, we present Next-Frame Diffusion (NFD), an autoregressive diffusion transformer that incorporates block-wise causal attention, enabling iterative sampling and efficient inference via parallel token generation within each frame.
Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation
Yiming Qin
Zhu Xu
Yang Liu†
Arxiv CVPR 2025
OmniPhysGS: 3D Constitutive Gaussians for General Physics-based Dynamics Generation
Yuchen Lin
Chenguo Lin†
Jianjin Xu
Yadong Mu‡
Arxiv ICLR 2025
ChemAgent: Self-updating Memories in Large Language Models Improves Chemical Reasoning
Xiangru Tang*
Tianyu Hu*
Muyang Ye*
Yanjun Shao*
Xunjian Yin
Siru Ouyang
Wangchunshu Zhou
Pan Lu
Zhuosheng Zhang
Yilun Zhao
Arman Cohan
Mark Gerstein
Arxiv ICLR 2025
We present ChemAgent, a framework designed to improve LLM performance on chemical reasoning through a dynamic, self-updating memory library.
Autonomous Character-Scene Interaction Synthesis from Text Instruction
Nan Jiang
Zimo He
Zi Wang
Hongjie Li
Yixin Chen
Siyuan Huang†
Yixin Zhu†
Arxiv SIGGRAPH Asia 2024
DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes
Jialiang Zhang*
Haoran Liu*
Danshi Li*
Xinqiang Yu*
Haoran Geng
Yufei Ding
Jiayi Chen
He Wang†
Arxiv CoRL 2024
Scaling Up Dynamic Human-Scene Interaction Modeling
Nan Jiang
Zhiyuan Zhang
Hongjie Li
Xiaoxuan Ma
Zan Wang
Yixin Chen
Tengyu Liu
Yixin Zhu†
Siyuan Huang†
Arxiv Github CVPR 2024