The Second Wave of Real VLA: Psi R1 Achieves Generalized Intelligence at the Brain Level!

Psi R1 marks the release of the first reinforcement learning–driven Vision-Language-Action (VLA) model, successfully demonstrating dexterous, long-horizon task execution in an open-ended Mahjong environment.
· Millimeter-Level Dexterous Manipulation:
PsiBot’s dexterous hand achieves precise coordination across vision, language, and action modalities—executing human-like movements such as flipping, grasping, and organizing tiles with millimeter-level accuracy.
· Human-Robot Interaction and Cognitive Decision-Making:
The robot must understand Mahjong rules, interpret evolving game states, and dynamically generate strategies for discarding, melding (Pung/Kong), and reactive play, completing a full pipeline from rule interpretation to real-time reasoning and execution.
· Reinforcement Learning–Enhanced Long-Horizon Execution:
Validated in the Mahjong scenario, Psi R1 sustains a Chain of Action Thought (CoAT) for more than 30 minutes, exceeding the usual limits on robotic task duration and reasoning continuity in complex environments.
· Tri-Layered Multimodal Interaction:
Psi R1 supports human–robot, robot–robot, and robot–environment interaction. Robots not only share information but also physically collaborate, such as passing tiles to each other, enabling multi-agent embodied teamwork in real-world scenarios.
With this, Psi R1 delivers not just true VLA capability, but a generalized cognitive system, laying the groundwork for real-world, long-horizon embodied intelligence.
In our demonstration video, the robot is able to play a full game of Chinese Standard Mahjong lasting over 30 minutes—from start to finish. It not only performs high-precision actions such as drawing, discarding, and placing tiles, but also understands human intent to execute complex interactions like Pung and Kong.
Even in the face of human interference or environmental disruption, the robot is able to comprehend context and resume the game autonomously, showcasing robust, human-level perception, reasoning, and dexterous execution in a truly open and dynamic task setting.
This series of technologies is not limited to Mahjong—it can be generalized to a wide range of dexterous manipulation tasks, laying a solid foundation for real-world deployment in practical scenarios such as grasping and sealing takeaway bags, restocking in retail environments, and more.

Solving the Last-Mile Dexterity Challenge in Logistics with a Truly Dexterous Hand

The Psi R1 VLA model, independently developed by PsiBot, has demonstrated outstanding closed-loop control capabilities in logistics scenarios such as bag lifting and ring threading.
The system accurately recognizes various types of delivery bags in diverse orientations and executes multi-angle ring insertion and lifting with precision. Every movement is smooth, coordinated, and robust, showcasing Psi R1’s ability to handle unstructured, real-world logistics tasks with the dexterity and adaptability required for commercial deployment.
Even when facing challenging conditions such as shifting ring positions or sagging, deformable bags, Psi R1 leverages its VLA-based reasoning capabilities to dynamically adjust the posture of its dexterous hand in real time, ensuring high task success rates.
With its exceptional generalization and adaptive control performance, Psi R1 not only handles complex manipulation tasks, but also excels in interacting with deformable objects—a critical requirement for real-world logistics and takeaway delivery.
By offering robust, intelligent solutions for the “last 100 meters” of logistics, Psi R1 is bringing robotic service into everyday life, pushing embodied AI from the lab into practical, human-facing applications.

A Precision Intelligence Solution for Retail Restocking

In retail restocking scenarios, tasks often involve placing goods onto shelf compartments or hook-based displays. To ensure stable and reliable placement, the system must overcome several key challenges.
Supported by the Psi R1 model, PsiBot’s dexterous hand with tactile sensing is capable of:
· Identifying product type and planning the corresponding restocking target position
· Executing precise, adaptive placement based on the item’s form factor and handling requirements
· Achieving millimeter-level end-effector alignment with the target position—especially crucial when placing products onto narrow hooks
The robotic arm and dexterous hand feature extremely high repeatability and positioning accuracy. During operation, the system precisely adjusts for factors such as:
· Center of mass distribution
· Geometric characteristics of the object
· Gripping force allocation
Throughout the task, it continuously applies visual and force feedback correction to compensate for even the slightest deviations, ensuring stable, precise, and reliable manipulation in real-world scenarios.
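The feedback correction described above can be illustrated with a minimal closed-loop sketch. It is not PsiBot's controller: the `Offset` type, the proportional `gain`, and the 1 mm tolerance are hypothetical, and the sketch assumes each commanded correction is applied exactly, whereas a real system would re-measure the offset from camera and force sensors on every cycle.

```python
from dataclasses import dataclass


@dataclass
class Offset:
    """Planar end-effector offset from the target hook, in millimeters."""
    x: float
    y: float


def correct_alignment(offset, gain=0.5, tolerance_mm=1.0, max_steps=50):
    """Shrink the offset with a proportional correction until within tolerance.

    Returns (steps_taken, final_offset).
    """
    steps = 0
    while steps < max_steps and max(abs(offset.x), abs(offset.y)) > tolerance_mm:
        # Assumption: the correction is executed perfectly, so the new
        # offset is simply the old one reduced by the gain.
        offset = Offset(offset.x - gain * offset.x, offset.y - gain * offset.y)
        steps += 1
    return steps, offset
```

A lower gain converges more slowly but is less likely to overshoot when the measured offset is noisy, which matters when the target is a narrow hook.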
With its high-degree-of-freedom dexterous hand and multi-sensor fusion capabilities, PsiBot has significantly improved not only the efficiency of restocking operations, but also the reliability and adaptability of the system. Even when faced with diverse product forms and constantly changing environments, it consistently executes restocking tasks with precision and stability.

Reinforcement Learning in Simulation and Teleoperation

PsiBot integrates simulation environments with teleoperation technology to effectively support training for long-horizon dexterous manipulation within its VLA system, demonstrating exceptional performance in complex tasks such as Mahjong.
By deeply combining imitation learning with reinforcement learning, the system not only greatly enhances the efficiency of simulated data collection and utilization, but also enables rapid accumulation of successful experience data, building a powerful data flywheel.
This Sim-to-Real technical pipeline allows robots to quickly acquire high-difficulty manipulation skills, significantly reducing training costs and risks in the physical world. As a result, complex dexterous operations move from theory into practice—paving a new path for the evolution of robotic intelligence.
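One way to picture the data flywheel is as a loop that keeps only successful episodes and feeds them back into the training set. The sketch below is an illustration under that assumption, not PsiBot's pipeline; the `Episode` fields and the `flywheel_round` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    source: str    # e.g. "teleop" demonstration or "sim_rl" rollout
    success: bool  # whether the task was completed
    steps: int     # episode length in control steps


def flywheel_round(dataset, rollouts):
    """Return a new dataset extended with the successful rollouts only."""
    return dataset + [ep for ep in rollouts if ep.success]


def success_rate(rollouts):
    """Fraction of rollouts that succeeded (0.0 for an empty batch)."""
    return sum(ep.success for ep in rollouts) / len(rollouts) if rollouts else 0.0
```

In this picture, teleoperated demonstrations seed the dataset for imitation learning, and each round of simulated reinforcement learning rollouts adds more successful experience for the next round of training.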
Full-Body Systems
PsiBot offers a highly economical configuration combining a single-arm manipulator with a mobile base, designed to meet real-world deployment demands at minimal cost.
PsiBot V1 integrates a wheeled mobile base with a humanoid upper body, paired with PsiBot’s proprietary five-finger tactile dexterous hand—an optimal blend of efficient mobility and precise manipulation.
Featuring 32 degrees of freedom, PsiBot V1 is designed for enterprise-level deployment and can be rapidly scaled across service, logistics, and manufacturing industries. It represents one of the most practical and deployment-ready robotic solutions available.
 
Teleoperation
PsiBot’s latest teleoperation device is an isomorphic dexterous hand exoskeleton with a 1:1 joint mapping design. By simply wearing the glove, the operator can naturally and intuitively control every joint of the robotic hand.
This innovative design dramatically lowers the barrier to human-robot interaction, offering a fluid and responsive control experience, with precise, real-time motion output—enabling both expert and non-expert users to command dexterous robots with ease.
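A 1:1 joint mapping can be sketched as copying each measured exoskeleton joint angle to the matching robot joint, clamped to that joint's range. The function name and the joint limits below are hypothetical; the source does not describe the actual control interface.

```python
def map_joints(exo_angles_rad, joint_limits_rad):
    """Map exoskeleton joint angles 1:1 onto robot hand joints.

    exo_angles_rad: measured operator joint angles, one per joint.
    joint_limits_rad: (lower, upper) limit per robot joint, in the same order.
    """
    if len(exo_angles_rad) != len(joint_limits_rad):
        raise ValueError("exoskeleton and hand must have the same joint count")
    # Clamp each commanded angle so the robot joint never exceeds its range.
    return [
        min(max(angle, lower), upper)
        for angle, (lower, upper) in zip(exo_angles_rad, joint_limits_rad)
    ]
```

Because the two kinematic structures are isomorphic, no retargeting or inverse kinematics is needed in this sketch; the only processing step is the safety clamp.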
In real-world applications, PsiBot’s dexterous hand is capable of executing complex grasping tasks—such as gripping the handle ring of a delivery bag and lifting it smoothly and securely, while preventing slippage or tearing throughout the motion. The system adapts to various bag sizes, supporting multi-angle ring threading and dynamic grip posture adjustment, demonstrating human-level flexible manipulation.
High-DOF Exoskeleton | 16 DOF Precision Control with RL-Compatible Simulation
The dexterous hand is paired with a 16-degree-of-freedom exoskeleton system, enabling precise synchronization with the user’s intricate hand movements. This high-precision control allows the system to go beyond basic gestures and perform fine-grained, high-complexity operations.
Additionally, a high-fidelity simulation model supports reinforcement learning (RL) training in virtual environments, accelerating the development and refinement of manipulation strategies through self-supervised learning and continuous performance optimization.
3D Tactile Perception & Force Feedback | Human-Level Teleoperation (Teleoperation)
The dexterous hand is equipped with high-resolution 3D tactile sensors at the fingertips, alongside real-time force feedback integrated into the exoskeleton system.
These innovations allow the operator to perceive object shape, texture, and resistance during teleoperation, enabling true “perceptive control”.
For example, in testing, the dexterous hand demonstrated the ability to stably grasp and transport delicate objects such as puffed corn snacks—extremely light, soft, and fragile—reliably and gently, showcasing human-like precision and tactile sensitivity in robotic manipulation.
Full-Hand Tactile and Force Feedback | Material & Geometry Perception via Blind Grasping (Teleoperation)
Even in the absence of visual input, PsiBot’s dexterous hand—equipped with full-hand tactile sensing and real-time force feedback—can autonomously perceive object geometry and surface material properties.
In testing, the system successfully performed blind grasps on objects with varying geometries, such as rectangular boxes and baseballs, and dynamically adjusted its grasping strategy based on tactile signals from the hand.
This enables natural, stable, and reliable manipulation, closely mimicking human tactile perception and reflex-based control.

Precision Force Control | Safe Grasping of Fragile Objects (Force Control, Non-Teleoperated)

The dexterous hand has a high-precision force control mode that adjusts grip strength to the fragility of the target object. In coordination with the central control system (the “brain”), the hand modulates grip force dynamically and with fine granularity.
For example, when grasping extremely delicate items such as tofu, the system autonomously modulates the gripping force to the minimum necessary pressure, ensuring the object remains intact throughout the grasp and transport process—achieving stable manipulation without damage.
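The idea of a "minimum necessary" grip force can be made concrete with the textbook Coulomb friction model: an object of mass m held between n contacts with friction coefficient μ stays put when n·μ·F ≥ m·g. The sketch below applies that model; it is an assumption for illustration, not PsiBot's force controller, and the safety factor and damage limit are hypothetical parameters.

```python
G = 9.81  # gravitational acceleration, m/s^2


def min_grip_force(mass_kg, friction_coeff, contacts=2, safety_factor=1.2):
    """Normal force per contact (N) needed to hold an object by friction alone."""
    if mass_kg <= 0 or friction_coeff <= 0 or contacts <= 0:
        raise ValueError("mass, friction coefficient and contact count must be positive")
    return safety_factor * mass_kg * G / (contacts * friction_coeff)


def choose_grip_force(mass_kg, friction_coeff, max_safe_force_n, **kwargs):
    """Return the minimum holding force, or None if it exceeds the damage limit."""
    needed = min_grip_force(mass_kg, friction_coeff, **kwargs)
    return needed if needed <= max_safe_force_n else None
```

For a 0.3 kg block with an assumed friction coefficient of 0.5, the default settings give roughly 3.5 N per contact. If that exceeds what the object can withstand, a friction-only pinch is not viable and a supporting grasp from below would be needed.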
Slip-Locking Mechanism | Adaptive Resistance to External Disturbance (Force Control, Non-Teleoperated)
During grasping and transport, if the dexterous hand detects signs of object slippage—such as a bottle beginning to slide—the system automatically increases grip force, activating a slip-locking mechanism to rapidly stabilize the object and prevent it from falling.
This feature ensures high-reliability performance even in dynamic or unstable environments, significantly enhancing the safety and stability of robotic manipulation during real-world operation.
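The slip-locking behavior can be sketched as a simple rule: when the tactile sensors report slip above a threshold, raise the grip force, up to a safe maximum. The threshold, boost factor, and force cap below are hypothetical values, and measuring slip as a speed in mm/s is an assumption about the sensor output.

```python
def update_grip_force(current_force_n, slip_speed_mm_s,
                      slip_threshold_mm_s=0.5, boost_factor=1.5,
                      max_force_n=20.0):
    """Return the next grip force given the latest slip measurement.

    If slip exceeds the threshold, the force is multiplied by boost_factor
    and capped at max_force_n; otherwise it is left unchanged.
    """
    if slip_speed_mm_s > slip_threshold_mm_s:
        return min(current_force_n * boost_factor, max_force_n)
    return current_force_n
```

Calling this once per control cycle makes the force ramp up while slip persists and hold steady once the object is stable. The cap keeps the reflex from crushing the object.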
