Ready-to-React: Online Reaction Policy for
Two-Character Interaction Generation

ICLR 2025


Zhi Cen¹, Huaijin Pi², Sida Peng¹, Qing Shuai¹, Yujun Shen³, Hujun Bao¹, Xiaowei Zhou¹, Ruizhen Hu⁴*

¹Zhejiang University   ²The University of Hong Kong   ³Ant Group   ⁴Shenzhen University
*Corresponding Author

Abstract


This paper addresses the task of generating two-character online interactions. Previously, two main settings existed for two-character interaction generation: (1) generating one character’s motion from the counterpart’s complete motion sequence, and (2) jointly generating both characters’ motions from specified conditions. We argue that these settings fail to model real-life two-character interactions, where humans react to their counterparts in real time and act as independent individuals. In contrast, we propose an online reaction policy, called Ready-to-React, that generates the next character pose based on past observed motions. Each character has its own reaction policy as its “brain”, enabling them to interact like real humans in a streaming manner. Our policy is implemented by incorporating a diffusion head into an auto-regressive model, which can dynamically respond to the counterpart’s motions while effectively mitigating error accumulation throughout the generation process. We conduct comprehensive experiments on the challenging boxing task. Experimental results demonstrate that our method outperforms existing baselines and can generate extended motion sequences. Additionally, we show that our approach can be controlled by sparse signals, making it well suited for VR and other online interactive environments.

Method Overview


Overview of our online reaction policy. Consider the boxing scene in the leftmost figure, where the blue agent is deciding its next move. The reaction policy proceeds in three steps: first, the history encoder encodes the agent’s current state and its observations of the opponent; then, the next latent predictor predicts the upcoming motion latent; finally, an online motion decoder decodes this motion latent into the actual next pose. The same reaction policy is applied to the pink agent. By running both agents’ policies in a streaming manner, our method continuously generates two-character motion sequences without a length limit.
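For concreteness, the Python sketch below shows one plausible way to wire up such a policy in PyTorch. The module names (HistoryEncoder, NextLatentPredictor, OnlineMotionDecoder, ReactionPolicy), tensor shapes, and hyperparameters are illustrative assumptions rather than the released implementation; in particular, the next-latent predictor here is a simple stand-in for the paper’s diffusion head.

# Illustrative sketch of the reaction policy; all names and sizes are assumptions.
import torch
import torch.nn as nn

POSE_DIM = 64      # assumed per-frame pose representation size
LATENT_DIM = 128   # assumed motion-latent size
HISTORY_LEN = 4    # assumed; the demos below are seeded with the first four frames


class HistoryEncoder(nn.Module):
    """Encodes the agent's own past motion and its observation of the opponent."""
    def __init__(self):
        super().__init__()
        self.gru = nn.GRU(input_size=2 * POSE_DIM, hidden_size=LATENT_DIM, batch_first=True)

    def forward(self, own_hist, opp_hist):
        # own_hist, opp_hist: (batch, HISTORY_LEN, POSE_DIM)
        x = torch.cat([own_hist, opp_hist], dim=-1)
        _, h = self.gru(x)
        return h[-1]                      # (batch, LATENT_DIM) context vector


class NextLatentPredictor(nn.Module):
    """Predicts the upcoming motion latent from the history context.

    Stand-in for the diffusion head: a real diffusion head would iteratively
    denoise a latent conditioned on this context instead of a single forward pass.
    """
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT_DIM, 256), nn.SiLU(), nn.Linear(256, LATENT_DIM))

    def forward(self, context):
        return self.net(context)          # (batch, LATENT_DIM)


class OnlineMotionDecoder(nn.Module):
    """Decodes a motion latent into the actual next pose."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT_DIM, 256), nn.SiLU(), nn.Linear(256, POSE_DIM))

    def forward(self, latent):
        return self.net(latent)           # (batch, POSE_DIM)


class ReactionPolicy(nn.Module):
    """History encoder -> next-latent predictor -> online motion decoder."""
    def __init__(self):
        super().__init__()
        self.encoder = HistoryEncoder()
        self.predictor = NextLatentPredictor()
        self.decoder = OnlineMotionDecoder()

    def forward(self, own_hist, opp_hist):
        context = self.encoder(own_hist, opp_hist)
        latent = self.predictor(context)
        return self.decoder(latent)       # next pose for this agent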

Reactive Motion Generation


Here we show our method in the reactive-motion setting: the opponent’s motion is provided as ground truth, and the reaction policy generates the other character’s response.
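A minimal sketch of this setting, assuming the hypothetical ReactionPolicy interface from the overview above: the opponent’s poses come from a ground-truth sequence, and only the reacting agent is rolled out autoregressively. Function and argument names are our own assumptions.

import torch

@torch.no_grad()
def generate_reactive(policy, own_init, opp_gt):
    """Roll out reactive motion against a ground-truth opponent sequence.

    own_init: (HISTORY_LEN, POSE_DIM) seed frames of the reacting agent.
    opp_gt:   (T, POSE_DIM) full ground-truth opponent motion, T >= HISTORY_LEN.
    Returns the reacting agent's generated poses for frames HISTORY_LEN..T-1.
    """
    hist_len = own_init.shape[0]
    own = list(own_init)                                      # generated-so-far frames
    out = []
    for t in range(hist_len, opp_gt.shape[0]):
        own_hist = torch.stack(own[-hist_len:]).unsqueeze(0)  # (1, hist_len, POSE_DIM)
        opp_hist = opp_gt[t - hist_len:t].unsqueeze(0)        # ground-truth opponent window
        next_pose = policy(own_hist, opp_hist).squeeze(0)     # (POSE_DIM,)
        own.append(next_pose)
        out.append(next_pose)
    return torch.stack(out)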

Two-character Motion Generation


Our method generates motion for both agents simultaneously. Starting from the first four frames, each agent’s subsequent motion is generated from its own and its opponent’s past motions, as in the sketch below.
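Continuing the same hypothetical interface, the two-character setting runs one reaction policy per agent and feeds each policy the other agent’s generated history; names and shapes remain illustrative assumptions.

import torch

@torch.no_grad()
def generate_two_character(policy_a, policy_b, init_a, init_b, num_frames):
    """Stream two-character motion: each agent has its own reaction policy.

    init_a, init_b: (HISTORY_LEN, POSE_DIM) seed frames of each character.
    Returns two tensors of shape (num_frames, POSE_DIM).
    """
    hist_len = init_a.shape[0]
    motion_a = list(init_a)
    motion_b = list(init_b)
    for _ in range(num_frames):
        hist_a = torch.stack(motion_a[-hist_len:]).unsqueeze(0)
        hist_b = torch.stack(motion_b[-hist_len:]).unsqueeze(0)
        # Each agent reacts to the other's past motion; both sequences are generated.
        next_a = policy_a(hist_a, hist_b).squeeze(0)
        next_b = policy_b(hist_b, hist_a).squeeze(0)
        motion_a.append(next_a)
        motion_b.append(next_b)
    return torch.stack(motion_a[hist_len:]), torch.stack(motion_b[hist_len:])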

Long-term Two-character Motion Generation


Given only the first four frames of a two-character sequence, our method generates 1800 frames of two-character motion.
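Long-horizon generation is the same streaming loop run for more steps; with the hypothetical helpers sketched above, the 1800-frame result would correspond to something like the following (seed frames and trained weights assumed).

# Hypothetical usage of the sketches above: 1800 generated frames from 4 seed frames.
policy_a, policy_b = ReactionPolicy(), ReactionPolicy()   # trained weights assumed
seed_a = torch.zeros(4, POSE_DIM)                         # placeholder seed frames
seed_b = torch.zeros(4, POSE_DIM)
motion_a, motion_b = generate_two_character(policy_a, policy_b, seed_a, seed_b, num_frames=1800)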

Application: Generating Reactive Motion with Sparse Signals


Sparse control is essential for making our method practical in VR and other online interactive environments. Our approach generates realistic motion while closely following the sparse signals: the blue agent is driven by a combination of sparse signals and our reaction policy.
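One plausible way to expose such sparse control, again purely as a sketch on top of the hypothetical ReactionPolicy above: the policy’s context is augmented with a per-frame sparse signal (e.g., headset and hand-controller poses), so the generated pose both reacts to the opponent and tracks the signal. The extra argument, its dimensionality, and the injection point are our assumptions, not the paper’s interface.

import torch
import torch.nn as nn

SPARSE_DIM = 18    # assumed: 6-DoF poses of a headset and two hand controllers
LATENT_DIM = 128   # must match the base policy's context size (assumed)


class SparseConditionedPolicy(nn.Module):
    """Reaction policy whose context additionally encodes a sparse control signal."""
    def __init__(self, base_policy):
        super().__init__()
        self.base = base_policy                          # a ReactionPolicy-like module
        self.sparse_proj = nn.Linear(SPARSE_DIM, LATENT_DIM)

    def forward(self, own_hist, opp_hist, sparse_signal):
        # sparse_signal: (batch, SPARSE_DIM) current tracker readings
        context = self.base.encoder(own_hist, opp_hist)
        context = context + self.sparse_proj(sparse_signal)   # inject the sparse control
        latent = self.base.predictor(context)
        return self.base.decoder(latent)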

Supplementary Video




Citation


@inproceedings{cen2025ready_to_react,
  title={Ready-to-React: Online Reaction Policy for Two-Character Interaction Generation},
  author={Cen, Zhi and Pi, Huaijin and Peng, Sida and Shuai, Qing and Shen, Yujun and Bao, Hujun and Zhou, Xiaowei and Hu, Ruizhen},
  booktitle={ICLR},
  year={2025}
}