
Reinforcement learning provides high generalization and robustness
- Efficient training with massive simulation data
- Bidirectional training framework connects multiple skills
- Alignment with high-quality real-world data improves the success rate on long-horizon tasks

Powerful scene deployment capability
- End-to-end technical architecture: After receiving an instruction, the upper-level Vision Language Model (VLM) analyzes the cluttered items on the table and plans the operation sequence. The lower-level manipulation model then breaks the plan down into subtasks for each item and executes them sequentially.
- High generalization ability: With only 20 real-world data points, the model achieved a grasp success rate of over 99%.
- Strong adaptability to flexible objects: Various operation scenarios for flexible objects are reproduced in simulation, and the model is then fine-tuned and optimized with real-world data. Even when interrupted or disturbed, it can adaptively adjust its strategy and resume the packing action.
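
The two-level flow described above (an upper-level planner that orders the items, a lower-level model that splits each item into subtasks and runs them in sequence, and recovery after a disturbance) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the actual system: `plan_sequence` stands in for the VLM with a simple alphabetical ordering, `decompose` assumes a fixed grasp/move/place breakdown, and a plain retry loop stands in for the adaptive strategy adjustment.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Subtask:
    item: str    # object on the table
    action: str  # primitive the low-level model should perform


def plan_sequence(items: List[str]) -> List[str]:
    """Placeholder for the upper-level VLM (assumption): a real planner
    would reason over camera images; here we just sort alphabetically."""
    return sorted(items)


def decompose(item: str) -> List[Subtask]:
    """Placeholder for the lower-level breakdown (assumption): every item
    gets the same three primitives."""
    return [Subtask(item, action) for action in ("grasp", "move", "place")]


def run_task(
    items: List[str],
    execute: Callable[[Subtask], bool],
    max_retries: int = 3,
) -> List[Subtask]:
    """Execute subtasks item by item in the planned order.

    `execute` returns True on success and False when the attempt was
    interrupted or disturbed. A failed subtask is retried up to
    `max_retries` times before the whole task is aborted.
    """
    completed: List[Subtask] = []
    for item in plan_sequence(items):
        for subtask in decompose(item):
            for _attempt in range(max_retries):
                if execute(subtask):
                    completed.append(subtask)
                    break
            else:
                # All attempts failed: give up instead of looping forever.
                raise RuntimeError(
                    f"could not complete {subtask.action} for {subtask.item}"
                )
    return completed


# Example run with an executor that always succeeds.
log = run_task(["cup", "apple", "book"], execute=lambda subtask: True)
print([f"{s.action}:{s.item}" for s in log])
```

Keeping the planner and the executor behind separate functions mirrors the split between the VLM and the manipulation model: either side could be swapped for a learned component without changing the control loop.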