estimate_pose_multi_view
Skill class for ai.intrinsic.estimate_pose_multi_view skill.
The estimate_pose_multi_view skill can be used to estimate the pose
of objects in the scene given at least 2 cameras
(and a maximum of 4 cameras). We recommend using 3
cameras in a triangular configuration for the best estimation performance.
The skill uses the input images obtained from the cameras to
estimate the 6 DOF pose of every instance of the
desired object in the scene. The CAD model of the
object to be estimated is defined at training time, which
is why it is important that the object matches the
CAD model provided. The skill succeeds if the number of detections is
greater than or equal to the specified min_num_instances. The skill returns
the pose, score, and ID of each detected object instance.
This skill does not update the pose of the detected object
instances in the belief world. Instead, skills such as update_world
or sync_product_objects can use the output of this skill to
update the belief world. The skill can use any ML multi-view pose estimator
or One Shot pose estimator, as long as the pose estimator was created with
the same camera configuration.
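As a sketch of the success criterion and the per-instance output described above (the `PoseEstimate` class and its field names below are illustrative stand-ins, not the actual Flowstate message definitions):

```python
from dataclasses import dataclass

# Illustrative stand-in for one detected instance: pose, score, and ID.
# Field names are assumptions, not the real Flowstate API.
@dataclass
class PoseEstimate:
    pose: tuple      # 6 DOF pose, e.g. (x, y, z, rx, ry, rz)
    score: float     # detection confidence
    instance_id: int

def skill_succeeds(estimates, min_num_instances=0):
    """The skill succeeds when at least min_num_instances objects are detected."""
    return len(estimates) >= min_num_instances

detections = [
    PoseEstimate((0.1, 0.2, 0.3, 0.0, 0.0, 0.0), 0.9, 0),
    PoseEstimate((0.4, 0.2, 0.3, 0.0, 0.0, 0.0), 0.7, 1),
]
print(skill_succeeds(detections, min_num_instances=2))  # True
print(skill_succeeds(detections, min_num_instances=3))  # False
```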
Prerequisites
Required assets:
- Cameras: The skill can be run with a minimum of 2 cameras and a maximum of 4. If fewer than 4 cameras are available, the remaining slots can be filled with a camera that has already been passed in. The cameras should be placed at similar distances to the objects to be detected. The distances and angles of all cameras w.r.t. the objects in the region of interest should fall within the pose range specified at training time.
- Pose estimator: The pose estimator for the object to be detected should be created in advance, following the ML Multi-view pose estimator training process or the One Shot pose estimator creation process.
The cameras must be extrinsically calibrated w.r.t. each other. This means that, if the solution is running in the real world, the relative positions and orientations of the cameras should match (up to the calibration accuracy) their real-world counterparts. Follow the camera-to-camera calibration process for more information on how to calibrate multiple cameras.
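"Extrinsically calibrated w.r.t. each other" means the relative transform between any two cameras is known. A minimal sketch with NumPy, using hypothetical world-from-camera transforms (real values come from the camera-to-camera calibration process):

```python
import numpy as np

def make_pose(rotation, translation):
    """Build a 4x4 homogeneous world-from-camera transform."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

# Hypothetical camera placements, 0.5 m apart at 1 m height.
T_world_cam1 = make_pose(np.eye(3), [0.0, 0.0, 1.0])
T_world_cam2 = make_pose(np.eye(3), [0.5, 0.0, 1.0])

# The extrinsic calibration between the two cameras: camera-1-from-camera-2.
T_cam1_cam2 = np.linalg.inv(T_world_cam1) @ T_world_cam2
print(T_cam1_cam2[:3, 3])  # [0.5 0.  0. ]
```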
Usage Example
- Set the Cameras that need to be used for pose estimation. The skill provides up to 4 possible slots. However, if fewer than 4 cameras are available, the remaining slots can be filled with a camera that has already been passed in.
- Set the Pose estimator to be used for the object that needs to be detected. The Pose estimator should be of type ML Multi-view pose estimator or One Shot pose estimator.
(Optional parameters):
- Set the Min num instances parameter to define the minimum number of instances that the pose estimator should detect in the scene. The skill will fail if not enough parts are detected in the scene. Default: 0.
- Set Update object pose to update the pose of an object in the belief world with the first returned pose of the pose estimator.
- Set Visibility score params to additionally calculate the visibility score of each detected part in the scene. The visibility score defines how visible the part is, and it can be used for grasp planning to tell the robot which part to pick up first.
- Set Region of interest to limit the search space for the pose estimator. Only objects within the region of interest will be considered for pose estimation. This also reduces the runtime for one shot pose estimation.
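The camera-slot rule above can be sketched as a small helper. The function and camera names are hypothetical, purely to illustrate how 2 or 3 cameras can fill the skill's 4 slots by repeating a camera that has already been passed in:

```python
# Sketch (not the Flowstate API): fill the skill's 4 camera slots when fewer
# than 4 physical cameras exist, by repeating an already-passed-in camera.
def fill_camera_slots(cameras, num_slots=4):
    if not 2 <= len(cameras) <= num_slots:
        raise ValueError("the skill needs between 2 and 4 cameras")
    return cameras + [cameras[-1]] * (num_slots - len(cameras))

# The recommended triangular 3-camera setup, padded to 4 slots.
print(fill_camera_slots(["cam_left", "cam_right", "cam_top"]))
# ['cam_left', 'cam_right', 'cam_top', 'cam_top']
```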
When the skill is run, the skill dialog will display two images:
- A raw image from the first camera
- An annotated image where every detected object instance is highlighted.
Update the digital twin (belief world)
To update the digital twin (belief world) with all the pose estimates returned by this skill, you can use the sync_product_objects skill or the update_world skill, as described in the [estimate_pose skill documentation](https://flowstate.intrinsic.ai/docs/skill_guides/perception/estimate_pose/#update-the-digital-twin-belief-world).
Parameters
pose_estimator
Id of the pose estimator.
min_num_instances
Minimum number of instances that must be detected. The skill fails if not enough instances can be found. Defaults to 0 if unset.
update_object_pose
Update pose for a single object. If multiple instances of the same object are in the scene, the pose with the highest detection score will be updated.
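The "highest detection score wins" rule can be sketched as follows; the (pose, score) tuples are illustrative values, not real skill output:

```python
# Sketch: when multiple instances of the same object are detected, the pose
# with the highest detection score is the one used to update the object.
detections = [
    ((0.1, 0.2, 0.3), 0.72),
    ((0.4, 0.1, 0.3), 0.91),
    ((0.7, 0.2, 0.3), 0.65),
]

best_pose, best_score = max(detections, key=lambda d: d[1])
print(best_pose)  # (0.4, 0.1, 0.3)
```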
visibility_score_params
Optional parameter to compute visibility score for pose estimates.
log_debug_data
Optional parameter to log images and metadata for debugging purposes.
capture_data
If specified, the skill uses the capture_result_locations within to
retrieve images captured earlier.
If not specified, the skill captures new images directly from the cameras.
inference_timeout_sec
Maximum allowed time in seconds to run the pose estimation inference. This includes the pose estimator creation and the actual pose estimation.
region_of_interest
If set, the pose estimator will restrict the search space to the region of interest.
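A minimal sketch of the effect of a region of interest, modeled here as an axis-aligned box (the real skill applies the restriction during inference; positions and bounds below are made up):

```python
# Sketch: only estimates whose position falls inside the region of interest
# are considered for pose estimation.
def in_roi(position, roi_min, roi_max):
    return all(lo <= p <= hi for p, lo, hi in zip(position, roi_min, roi_max))

estimates = [(0.1, 0.2, 0.05), (0.9, 0.2, 0.05), (0.3, 0.4, 0.05)]
roi_min, roi_max = (0.0, 0.0, 0.0), (0.5, 0.5, 0.2)

filtered = [p for p in estimates if in_roi(p, roi_min, roi_max)]
print(filtered)  # [(0.1, 0.2, 0.05), (0.3, 0.4, 0.05)]
```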
Returns
estimates
Pose estimates returned by the pose estimator.
root_ts_target
Poses of the returned pose estimates. Note that this is redundant with the
poses returned in 'estimates', but it allows easier use in combination with
other skills such as sync_product_objects.
Error Code
The skill does not define any error codes yet.