costar_plan

Integrating learning and task planning for robots with Keras, including simulation, real robot, and multiple dataset support.

View the Project on GitHub jhu-lcsr/costar_plan

Task Learning

The goal of our hierarchical task learning is to be able to use symbolic high-level planning grounded by learned neural net models of the world.

Our training data is sequences of task execution created with the costar_bullet simulation tool, as per the figure below.

Training Data

The tool lets us generate large numbers of task executions, with randomized high-level (and eventually low-level) actions.

ABOUT THIS DOCUMENT: This mostly consists of notes by Chris Paxton on how to get these methods to work. Some of these notes will be good; some may be out of date.

Task Definition

The task is defined as a set of high- and low-level actions, given by a task plan such as that shown below. For now, we will consider the simple version of the “blocks” task.

Here, the robot can grab one of several blocks. Grabbing a block is divided between aligning, approaching, and closing the gripper. We wish to define a task architecture like this one:

First stage of blocks task

The code below defines this simplest version of the task, creating both the action generators themselves (called “options”) and their associated conditions. There are three different things we need to create:

The key parts of the Option are the constructor and the samplePolicy(world) function, which takes a current world and returns a policy functor and a condition.
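To make this concrete, here is a minimal sketch of what such an Option subclass could look like. The class and helper names (MoveToBlockOption, get_pose, controller_toward, distance_to) are hypothetical stand-ins, not the actual CTP API; the point is only the shape of samplePolicy(world) returning a (policy, condition) pair:

```python
class MoveToBlockOption:
    """Hypothetical option: move the arm toward a named block."""

    def __init__(self, world, goal_name):
        self.world = world
        self.goal_name = goal_name

    def samplePolicy(self, world):
        # Look up the goal's current pose in this particular world state.
        goal_pose = world.get_pose(self.goal_name)

        def policy(state):
            # Return a control command toward the goal pose (details elided).
            return state.controller_toward(goal_pose)

        def condition(state):
            # Terminate when the end effector is close enough to the goal.
            return state.distance_to(goal_pose) < 0.01

        return policy, condition
```

The policy is sampled against a particular world, so the same option can produce different concrete motions depending on where the target block currently is.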

The block of code below, taken from CTP’s Blocks task, defines the choice between different objects. Take a look at the basic simulation options for examples of options and the samplePolicy() function.

AlignOption = lambda goal: GoalDirectedMotionOption(
    self.world,
    goal,
    pose=((0.05, 0, 0.05), self.grasp_q),
    pose_tolerance=(0.03, 0.025),
    joint_velocity_tolerance=0.05,)
align_args = {
    "constructor": AlignOption,
    "args": ["block"],
    "remap": {"block": "goal"},
}
GraspOption = lambda goal: GoalDirectedMotionOption(
    self.world,
    goal,
    pose=((0.0, 0, 0.0), self.grasp_q),
    pose_tolerance=(0.03, 0.025),
    joint_velocity_tolerance=0.05,)
grasp_args = {
    "constructor": GraspOption,
    "args": ["block"],
    "remap": {"block": "goal"},
}
LiftOption = lambda: GeneralMotionOption(
    pose=(self.over_final_stack_pos, self.grasp_q),
    pose_tolerance=(0.05, 0.025),
    joint_velocity_tolerance=0.05,)
lift_args = {
    "constructor": LiftOption,
    "args": []
}
PlaceOption = lambda: GeneralMotionOption(
    pose=(self.final_stack_pos, self.grasp_q),
    pose_tolerance=(0.05, 0.025),
    joint_velocity_tolerance=0.05,)
place_args = {
    "constructor": PlaceOption,
    "args": []
}
close_gripper_args = {
    "constructor": CloseGripperOption,
    "args": []
}
open_gripper_args = {
    "constructor": OpenGripperOption,
    "args": []
}
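Each arg_dict above pairs a constructor with the world-state arguments it consumes, and the "remap" field renames an argument before it reaches the constructor (here, the chosen "block" becomes the constructor's "goal" parameter). A plausible sketch of how such a dictionary might be applied — this helper is illustrative, not the actual CTP implementation:

```python
def instantiate_option(arg_dict, available_args):
    """Build an option from an arg_dict, applying any remapping.

    available_args maps argument names (e.g. "block") to concrete
    values drawn from the current world.
    """
    kwargs = {}
    for name in arg_dict["args"]:
        # Rename e.g. "block" -> "goal" if a remap entry exists.
        target = arg_dict.get("remap", {}).get(name, name)
        kwargs[target] = available_args[name]
    return arg_dict["constructor"](**kwargs)

# Example with a stand-in constructor that just records its goal.
align_args = {
    "constructor": lambda goal: ("align", goal),
    "args": ["block"],
    "remap": {"block": "goal"},
}
option = instantiate_option(align_args, {"block": "red_block"})
```

This is why the gripper options above need neither "args" nor "remap": they take no world-state arguments at all.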

When we create a task model, we use the add(name, parent, arg_dict) function. Its three parameters are the name of the new action, the name of its parent action (None for the root), and the dictionary of constructor arguments described above.

# Create a task model
task = Task()
task.add("align", None, align_args)
task.add("grasp", "align", grasp_args)
task.add("close_gripper", "grasp", close_gripper_args)
task.add("lift", "close_gripper", lift_args)
task.add("place", "lift", place_args)
task.add("open_gripper", "place", open_gripper_args)
task.add("done", "open_gripper", lift_args)

The task.compile() function is then called automatically to create the task model.
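Because each add() call names a single parent, the calls above define a linear chain from align through done. A small sketch of recovering that execution order from the parent links (the real compile() does more than this; linearize is a hypothetical helper for illustration):

```python
def linearize(parents):
    """Order actions so each one follows its parent (None = root)."""
    order = []
    node = None  # the root action's parent is None
    remaining = dict(parents)
    while remaining:
        # Find the action whose parent is the last action we emitted.
        child = next(name for name, p in remaining.items() if p == node)
        order.append(child)
        del remaining[child]
        node = child
    return order

# Parent links matching the task.add() calls above.
parents = {
    "align": None,
    "grasp": "align",
    "close_gripper": "grasp",
    "lift": "close_gripper",
    "place": "lift",
    "open_gripper": "place",
    "done": "open_gripper",
}
```

In the general case a parent can have several children (e.g. one grasp option per block), which is what makes this a task graph rather than a fixed script.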

Creating a Data Set

We use three different data sets: a navigation task, a PyBullet stacking task, and the JIGSAWS suturing dataset.

You can create a small data set in the normal way:

rosrun costar_bullet start --robot ur5 --task stack1 --agent task \
  --features multi --save -i 10 --data_file small.npz --success_only

To make a larger training data set, increase the number of iterations with the -i flag, as seen below.

rosrun costar_bullet start --robot ur5 --task stack1 --agent task \
  --features multi --save -i 100 --data_file bigger.npz --success_only

Some notes:

Current Best Practice

Create some data for predictor learning (best practice as of 2017-09-12):

rosrun costar_bullet start --robot ur5 --task stack1 --agent task -i 5000 \
  --features multi  --verbose --seed 0 --success_only  --cpu --save \
  --data_file stack.npz

Creating the suturing data set uses the “convert_jigsaws” tool:

./costar_plan/costar_models/scripts/convert_jigsaws.py Suturing \
    --out_dir costar_plan/suturing_data_jpg --drop_chance 0

where “Suturing” is the name of the directory copied from the downloaded JIGSAWS files.

Learning

Models

Current models of note:

Older models include:

The current preferred model is Predictor. You should use that one when possible.

Predictor Model

The predictor model learns to generate multiple possible futures.

Useful notes:

Pretraining the Image Encoder

Pretraining has been a critical part of much of the previous work we examined when designing the predictor networks. To leverage these models, we have a separate training pipeline that (1) learns encoders and decoders to generate possible goals, and (2) then uses this learned representation for the predictor network.
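The two-stage idea can be sketched with a toy linear autoencoder: first train encoder and decoder weights on reconstruction, then keep only the encoder as a feature extractor for the downstream predictor. This is a minimal numpy illustration of the pipeline's structure, not the actual Keras models:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 8))          # stand-in "images"
W_enc = rng.normal(size=(8, 3)) * 0.1  # encoder weights (8 -> 3 code)
W_dec = rng.normal(size=(3, 8)) * 0.1  # decoder weights (3 -> 8)

def mse(A, B):
    return float(np.mean((A - B) ** 2))

# Stage 1: pretrain encoder + decoder on reconstruction.
lr = 0.01
initial = mse(X @ W_enc @ W_dec, X)
for _ in range(500):
    code = X @ W_enc
    err = code @ W_dec - X
    # Gradient-descent updates for the reconstruction error.
    grad_dec = code.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc
final = mse(X @ W_enc @ W_dec, X)

# Stage 2: reuse the pretrained encoder as a learned representation.
features = X @ W_enc  # (256, 3) codes fed to the downstream model
```

In the real pipeline both stages are convolutional Keras networks and stage (1) is the pretrain_image_encoder model trained by the command below.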

Updated command:

rosrun costar_models ctp_model_tool --data_file rpy.npz \
  --model pretrain_image_encoder \
  -e 500 --features multi \
  --batch_size 24  \
  --optimizer adam \
  --lr 0.001 \
  --upsampling conv_transpose \
  --steps_per_epoch 500 \
  --dropout_rate 0.2 --skip_connections 1

Notes:

Add Dropout

It’s also possible to add dropout to the “decoder” network.

To do so:

rosrun costar_models ctp_model_tool --data_file rpy.npz \
  --model pretrain_image_encoder \
  -e 1000 --features multi \
  --batch_size 48  \
  --optimizer adam \
  --lr 0.001 \
  --upsampling conv_transpose \
  --steps_per_epoch 300 \
  --dropout_rate 0.1 \
  --skip_connections 1 \
  --hypothesis_dropout 1 --decoder_dropout_rate 0.1

As of Nov. 14, 2017, I would not recommend doing this. Recommended practice is to use dropout on the encoder only.

Goal Sampler Model

This is a stripped-down version of the predictor model that predicts the distribution over goals first, then predicts images separately. It is intended to reduce some of the issues we see with learning.
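The split can be sketched as: one head outputs a categorical distribution over discrete goals, and a separate decoder predicts an image only for the sampled goal. A numpy toy with hypothetical names (goal_logits, predict_goal_image), not the model's real architecture:

```python
import numpy as np

def softmax(logits):
    z = np.exp(logits - np.max(logits))
    return z / z.sum()

# Stage 1: predict a distribution over discrete goals from a state code.
goal_logits = np.array([2.0, 0.5, -1.0])  # e.g. from a goal-classifier head
p_goal = softmax(goal_logits)

# Stage 2: sample a goal, then predict an image for that goal only.
rng = np.random.default_rng(0)
goal = int(rng.choice(len(p_goal), p=p_goal))

def predict_goal_image(goal_index):
    # Stand-in for a decoder conditioned on the sampled goal.
    return np.full((4, 4), float(goal_index))

image = predict_goal_image(goal)
```

Separating the two stages means the image decoder never has to average over competing goals, which is one source of the blurry predictions the full predictor can produce.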

rosrun costar_models ctp_model_tool --data_file rpy.npz --model goal_sampler \
  -e 1000 \
  --features multi \
  --batch_size 64  \
  --optimizer adam \
  --lr 0.001 \
  --upsampling conv_transpose \
  --steps_per_epoch 300 \
  --cpu

Hierarchical Model

This version is old and will eventually no longer be supported. The hierarchical model learns an encoding for predicting both the next set of features and an associated action.

We can now use the standard CoSTAR bullet tool to train a model:

rosrun costar_bullet start --robot ur5 --task blocks --agent null --features multi \
  -i 1000 --model hierarchical --data_file small.npz --load --si 5 --lr 0.001

Replace the options as necessary. Interesting things to change:

It takes several thousand iterations to get this right. Below is an example at about 500 iterations:

Example intermediate results

You should be able to see the model learning something useful fairly early on, at least with a small data set. It just won’t be perfect.

Testing

To run the basic model with no planning, use the same ff agent as you would for anything else:

rosrun costar_bullet start --robot ur5 --task blocks --agent ff --features multi \
  -i 1 --model hierarchical --load_model  --gui