Preprint

APEX: Adaptive Policy Execution for Precise Manipulation

Mengfei Zhao1,* Chenxi Jiang1,* Tuo An1 Jindou Jia1,† Jianfei Yang1,†
1MARS Lab, Nanyang Technological University
*Equal contribution Corresponding authors
Overview of APEX as an online correction layer between a learned policy and a low-level controller.

APEX bridges the test-time execution gap between learned policies and black-box low-level controllers.

Video Overview

Abstract

Modern imitation learning methods, including visuomotor and Vision-Language-Action (VLA) policies, typically output high-level action references that are executed by low-level controllers. However, the absence of higher-order reference signals, together with the policy's lack of awareness of the underlying low-level control dynamics during training, inevitably induces an execution gap. As a result, realized actions deviate systematically from policy-commanded ones, with a critical impact on precision-sensitive manipulation. Prior work either modifies the policy architecture or the low-level controller, both requiring intrusive changes to the pretrained policy or packaged controller. This raises a natural question: when the policy and controller are both treated as inaccessible black boxes, can we bridge the execution gap? We propose Adaptive Policy EXecution (APEX), a plug-and-play framework inserted between the policy and the controller that reconstructs a dynamically feasible reference from policy outputs and adapts at test-time according to low-level state feedback, with a provable convergence guarantee. Extensive empirical studies show that APEX reduces controller-induced tracking error by 41.2% on demonstration replay and improves manipulation success by 4.8--25.8 percentage points across four visuomotor and VLA policy classes.

Test-Time Execution Gap

Learned policies output high-level action references, but the robot executes them through a low-level controller. At test time, missing higher-order references and unknown controller dynamics can make the realized motion deviate from the commanded motion, which is especially harmful for precision-sensitive manipulation.

Method

APEX has two components. First, an adaptive passive filter turns policy outputs into a smooth, dynamically feasible reference with velocity- and acceleration-like signals.

Second, feedback-driven adaptive terms refine this reference during execution to compensate for unknown dynamics and controller parameters.

1

Black-Box Interface

No policy retraining or controller modification.

2

Reference Reconstruction

Recover smooth reference signals from policy outputs.

3

Online Adaptation

Update correction terms from real-time feedback.

Experiments

We evaluate APEX by isolating controller-induced replay error, deploying frozen learned policies, and testing across policy classes and platforms.

Expert replay benchmark showing APEX reduces tracking error and improves success rates across simulated tasks.
Expert-replay benchmark: APEX reduces tracking error and improves success under controller mismatch.
ACT policy deployment on PegInsertion showing tracking-error reduction and success-rate gains across controller settings.
ACT deployment on PegInsertion across different controller gains.

Real-Robot Deployment

On a physical UR5, APEX improves execution under the robot's native position servo. The largest gain appears on PegInsertion, where the flow matching policy improves from 20% to 35%. Rollouts are shown at 4x speed.

UR5 real-world deployment sequences and tracking-error comparison for Push, Pick, and Insert tasks.

DP + APEX Rollouts

Raw DP vs DP + APEX

PushCube

PickCube

PegInsert

Adaptive Update

The adaptive update outperforms the best fixed correction and increases correction strength when tracking degrades.

Adaptive-update ablation comparing convergence behavior and success rates.

VLA Generalization

With π0.5 on LIBERO Spatial, APEX reduces end-effector tracking error and improves mean success from 72.1% to 97.9%.

LIBERO Spatial results for pi0.5 showing success-rate and tracking-error gains.

BibTeX

@article{zhao2026apex,
  title   = {APEX: Adaptive Policy Execution for Precise Manipulation},
  author  = {Zhao, Mengfei and Jiang, Chenxi and An, Tuo and Jia, Jindou and Yang, Jianfei},
  journal = {arXiv preprint},
  year    = {2026}
}