Annotation data
Currently, SceneFun3D provides three categories of annotated data:
- Functional interactive element annotations
- Language task descriptions
- Motion annotations
In the sections below, we describe the data provided for each category. Each annotation is accompanied by a unique identifier of the form xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.
Functional interactive elements
The format of the functional interactive element annotations can be seen below.
{
"visit_id": the identifier of the scene,
"annotations": [
{
"annot_id": unique id of the annotation,
"indices": the mask indices of the original laser scan point cloud ({visit_id}_laser_scan.ply) that comprise the functional interactive element instance,
"label": affordance label
},
...
]
}
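As a rough illustration of how this structure can be consumed, the sketch below parses an annotation document and uses the `indices` field to select the points of one instance. The sample ids, the `points` list (a stand-in for the vertices of the laser scan) and the helper names `index_annotations` and `mask_points` are assumptions made for this example and are not part of the dataset toolkit.

```python
import json

# Hypothetical sample that follows the schema above; ids and indices are made up.
sample = """
{
  "visit_id": "000000",
  "annotations": [
    {"annot_id": "a1", "indices": [0, 2], "label": "tip_push"},
    {"annot_id": "a2", "indices": [1], "label": "exclude"}
  ]
}
"""


def index_annotations(doc):
    """Map each annot_id to its annotation record."""
    return {a["annot_id"]: a for a in doc["annotations"]}


def mask_points(points, indices):
    """Select the points of one instance from the full scan by index."""
    return [points[i] for i in indices]


doc = json.loads(sample)
by_id = index_annotations(doc)

# Stand-in for the laser scan vertices (normally read from {visit_id}_laser_scan.ply).
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
element_points = mask_points(points, by_id["a1"]["indices"])
```

Keying the records by `annot_id` is convenient because the language and motion files refer to annotations through that identifier.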
Currently, the SceneFun3D dataset contains interactions with the following affordance labels:
Label | Description |
---|---|
rotate | functionalities that are adjusted by a rotary switch knob, e.g., thermostat |
key_press | surfaces that consist of keys that can be pressed, e.g., remote control, keyboard |
tip_push | functionalities that can be triggered by the tip of the finger, e.g., light switch |
hook_pull | surfaces that can be pulled by hooking the fingers around them, e.g., fridge handle |
pinch_pull | surfaces that can be pulled through a pinch movement, e.g., drawer knob |
hook_turn | surfaces that can be turned by hooking the fingers around them, e.g., door handle |
foot_push | surfaces that can be pushed by foot, e.g., foot pedal of a trash can |
plug_in | surfaces that comprise electrical power sources |
unplug | removing a plug from a socket |
In addition to these affordance categories, we have annotated functionalities whose geometry, or whose parent object's geometry, is not well captured in the laser scans (e.g., reflective or transparent surfaces) under the label exclude. These cases are excluded during the evaluation process.
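One way to mirror this convention when preparing data is to separate the exclude instances from the rest before computing any metric. The sketch below is only an assumption about how such a filter could look; it does not reproduce the official evaluation code, and `split_excluded` and the sample records are hypothetical.

```python
def split_excluded(annotations):
    """Return (kept, excluded) lists based on the 'exclude' label."""
    kept = [a for a in annotations if a["label"] != "exclude"]
    excluded = [a for a in annotations if a["label"] == "exclude"]
    return kept, excluded


# Hypothetical annotation records following the schema of this section.
annotations = [
    {"annot_id": "a1", "indices": [0, 2], "label": "tip_push"},
    {"annot_id": "a2", "indices": [1], "label": "exclude"},
    {"annot_id": "a3", "indices": [3, 4], "label": "hook_pull"},
]

kept, excluded = split_excluded(annotations)
kept_ids = [a["annot_id"] for a in kept]
```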
Language task descriptions
The format of the natural language task descriptions can be seen below.
{
"visit_id": the identifier of the scene,
"descriptions": [
{
"desc_id": unique id of the description,
"annot_id": [
list of the associated annotation ids in the *annotations.json* file
],
"description": language instruction of the task
},
...
]
}
We highlight that, in some cases, more than one instance of functional interactive elements may correspond to a single language task description.
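Because `annot_id` is a list here, a description may need to be resolved to several instances. The following sketch, under the assumption that annotations have already been keyed by `annot_id`, looks up every referenced instance and also builds the union of their point indices. The helper names and sample values are hypothetical.

```python
def resolve_description(desc, annotations_by_id):
    """Return the annotation records referenced by one description."""
    return [annotations_by_id[i] for i in desc["annot_id"]]


def combined_indices(records):
    """Union of the point indices of several instances, sorted and deduplicated."""
    return sorted({i for r in records for i in r["indices"]})


# Hypothetical data following the two schemas described in this document.
annotations_by_id = {
    "a1": {"annot_id": "a1", "indices": [4, 5], "label": "hook_pull"},
    "a2": {"annot_id": "a2", "indices": [5, 9], "label": "hook_pull"},
}
desc = {
    "desc_id": "d1",
    "annot_id": ["a1", "a2"],
    "description": "open the cabinet doors",
}

targets = resolve_description(desc, annotations_by_id)
target_indices = combined_indices(targets)
```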
Motion annotations
The format of the motion annotations can be seen below.
{
"visit_id": the identifier of the scene,
"motions": [
{
"motion_id": unique id of the motion,
"annot_id": the associated annotation id in the *annotations.json* file,
"motion_type": motion type (rotational or translational),
"motion_dir": motion direction (three element array),
"motion_origin_idx": index of the point in the original laser scan point cloud ({visit_id}_laser_scan.ply) that serves as the motion axis origin,
"motion_viz_orient": motion visualization orientation (optional)
},
...
]
}
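To make the roles of `motion_dir` and `motion_origin_idx` concrete, the sketch below moves a single point according to a motion record: a shift along the direction for translational motions, and a rotation about the axis through the origin point (Rodrigues' formula) for rotational ones. Several details are assumptions: the string values `"translational"` and `"rotational"` for `motion_type`, the unit conventions (scan units and radians), and the helper `apply_motion`, none of which are confirmed by the description above.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0:
        raise ValueError("motion_dir must be non-zero")
    return tuple(c / n for c in v)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def apply_motion(point, motion, points, amount):
    """Move `point` by `amount` (distance or angle in radians) along/about the axis."""
    origin = points[motion["motion_origin_idx"]]
    axis = normalize(motion["motion_dir"])
    # Assumed string values; check them against the actual files.
    if motion["motion_type"] == "translational":
        return tuple(p + amount * a for p, a in zip(point, axis))
    if motion["motion_type"] == "rotational":
        r = tuple(p - o for p, o in zip(point, origin))
        c, s = math.cos(amount), math.sin(amount)
        k_x_r = cross(axis, r)
        k_dot_r = sum(a * b for a, b in zip(axis, r))
        rotated = tuple(r[i] * c + k_x_r[i] * s + axis[i] * k_dot_r * (1 - c)
                        for i in range(3))
        return tuple(o + x for o, x in zip(origin, rotated))
    raise ValueError("unknown motion_type: " + str(motion["motion_type"]))


# Hypothetical scan points and motion records.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
slide = {"motion_type": "translational", "motion_dir": [0, 0, 2], "motion_origin_idx": 0}
hinge = {"motion_type": "rotational", "motion_dir": [0, 0, 1], "motion_origin_idx": 0}

moved = apply_motion(points[1], slide, points, 0.5)
turned = apply_motion(points[1], hinge, points, math.pi / 2)
```

The direction is normalized before use so that `amount` keeps a consistent meaning even if the stored vector is not of unit length.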