Visibility Planning Using the Baxter Robot

For more details about this project, click here for the project report. For the poster, click here.


The principal idea behind this project was to have a robotic manipulator reach a target while avoiding obstacles in its configuration space, using visual servoing techniques. The project deals with one of the fundamental problems of visibility planning: the robot needs to find a pose for its hand camera that gives a clear view of the target object, which in turn helps localize that object. After the object has been localized, a regular planner can be used to plan a trajectory to it. The visibility planning problem has been tackled in this project using various AI optimization approaches.

Fig 1. The Baxter robot with its right arm task space.


Work in the field of visibility planning has been done before at various universities. Kuffner and Vahrenkamp tackled the problem by splitting the motion planning task into two parts: first plan a motion to a position near the target object that keeps the object in the camera's field of view, then use visual servoing to grasp the object. Tsai and Allen have done significant work on a model-based, task-driven vision system that automatically plans vision sensor parameters so that task requirements are satisfied. In one of their earlier works, they identified the properties that a view of an object should satisfy, such as whether the object is in focus, its magnification, and other parameters. Cheng and Tsai approached the problem by defining various parameters and simulating the entire model in a graphics engine.

Object Reconstruction

Reconstruction recovers the object features from the initial partial view of the object through the hand camera. Initially, the user is presented with a video feed from the hand camera. The user then selects the feature points that are visible (corner points). The known geometric model of the target object and the selected feature points are given as input to a generic PnP (Perspective-n-Point) solver, which finds the transformation matrix between the object frame and the camera frame. Using this transformation matrix, the approximate position of the object is found and drawn in a visualizer.
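As a rough illustration of the last step only, the sketch below applies an object-to-camera transform to the model corners and takes their centroid. The cube model, the matrix `T_cam_obj`, and the helper names are hypothetical placeholders, not the project's data; in practice the matrix would come from the PnP solver.

```python
def transform_point(T, p):
    """Apply a 4x4 homogeneous transform T (nested lists) to a 3D point p."""
    x, y, z = p
    return tuple(T[r][0] * x + T[r][1] * y + T[r][2] * z + T[r][3] for r in range(3))


def centroid(points):
    """Arithmetic mean of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


# Hypothetical model: corners of a 10 cm cube in the object frame (metres).
model_corners = [(x, y, z) for x in (0.0, 0.1) for y in (0.0, 0.1) for z in (0.0, 0.1)]

# Hypothetical PnP result: the object frame sits 0.5 m along the camera z axis
# and is rotated 90 degrees about that axis.
T_cam_obj = [
    [0.0, -1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
]

corners_in_camera = [transform_point(T_cam_obj, p) for p in model_corners]
object_centroid = centroid(corners_in_camera)
print(object_centroid)
```

The centroid computed this way is the kind of reference point that the camera can later be aimed at.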

Simulated Annealing

Simulated annealing (SA) is a generic metaheuristic for global optimization: it looks for a good approximation to the global optimum of a given function in a large search space. It is modelled on annealing in metallurgy, which involves heating and controlled cooling of a material to increase the size of its crystals and reduce its defects.

Initially, a random state is chosen and the energy of this state is calculated. At each step, the temperature is decreased according to the temperature schedule, the next state is sampled, and its energy is calculated. If the new energy is less than the current energy, the new state is accepted. If it is more, a probability function that takes the temperature and the energies of the current and next states gives the probability of accepting the new state; if this is over a certain threshold, the state is accepted, otherwise it is rejected. This goes on until the temperature reaches its minimum (effectively 0). The best energy state seen is stored, and at the end this state is chosen.
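The loop described above can be sketched roughly as follows. This is a generic illustration, not the project's code: the function and parameter names are made up, the "threshold" is taken to be a uniform random draw (the usual Metropolis-style reading, which is an assumption here), and the toy 1-D energy stands in for the real energy function.

```python
import math
import random


def simulated_annealing(energy, sample_neighbor, initial_state, temperatures, rng):
    """Minimise `energy`, visiting the given positive `temperatures` in order."""
    current = best = initial_state
    e_current = e_best = energy(current)
    for T in temperatures:
        candidate = sample_neighbor(current, rng)
        e_candidate = energy(candidate)
        if e_candidate < e_current:
            accept = True  # strictly better states are always accepted
        else:
            # Assumption: accept a non-improving state when a uniform random
            # draw falls below exp(-(E_new - E_prev) / T).
            p = math.exp(-(e_candidate - e_current) / T)
            accept = rng.random() < p
        if accept:
            current, e_current = candidate, e_candidate
        if e_current < e_best:
            best, e_best = current, e_current  # remember the best state seen
    return best, e_best


# Toy usage: minimise (x - 3)^2 starting from x = 0.
rng = random.Random(0)
temperatures = [math.exp(-0.005 * i) for i in range(2000)]
best_x, best_e = simulated_annealing(
    energy=lambda x: (x - 3.0) ** 2,
    sample_neighbor=lambda x, r: x + r.uniform(-1.0, 1.0),
    initial_state=0.0,
    temperatures=temperatures,
    rng=rng,
)
print(best_x, best_e)
```

Keeping the best state separately from the current one matters because the walker is allowed to move uphill, so the final state is not necessarily the best one visited.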

Part A: Neighborhood Sampling

The sampling space is the task space of the robot's hand camera. Essentially, the task space consists of all the positions and orientations attainable by the hand camera, irrespective of collisions with the robot itself or the environment. Random position samples are generated. Sampling orientations is difficult, but in this problem it is known that the camera should always point towards the object, so the desired orientation is calculated from the vector joining the camera centre to the centroid of the reconstructed object.
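One way to turn a sampled position into an orientation is a "look-at" construction, sketched below. It assumes the camera's optical axis is its z axis and uses a world "up" hint to fix the remaining roll; both conventions, the box-shaped sampling bounds, and all function names are assumptions for illustration rather than details from the project.

```python
import math
import random


def _normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("zero-length vector")
    return tuple(c / n for c in v)


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def look_at_rotation(camera_pos, target, up_hint=(0.0, 0.0, 1.0)):
    """Return the camera frame axes (x, y, z) in world coordinates,
    with z pointing from the camera centre towards the target."""
    z = _normalize(tuple(t - c for t, c in zip(target, camera_pos)))
    x = _cross(up_hint, z)
    if sum(c * c for c in x) < 1e-12:
        # Viewing direction is parallel to the up hint; use another reference.
        x = _cross((1.0, 0.0, 0.0), z)
    x = _normalize(x)
    y = _cross(z, x)  # completes a right-handed frame
    return x, y, z


def sample_camera_pose(lower, upper, target, rng):
    """Sample a position uniformly in an axis-aligned box and aim at target."""
    position = tuple(rng.uniform(lo, hi) for lo, hi in zip(lower, upper))
    return position, look_at_rotation(position, target)


rng = random.Random(1)
position, axes = sample_camera_pose((0.3, -0.8, 0.0), (1.0, 0.0, 1.0),
                                    target=(0.7, -0.3, 0.1), rng=rng)
print(position, axes)
```

A pose sampled this way still has to pass an inverse kinematics check before it can be used, since not every task-space pose is reachable.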

Fig 2. Configurations being sampled. Obstacle – Red box, Reconstructed object – Green dots.

Part B: Temperature schedule and probability function

The temperature schedule used here is an exponentially decreasing function.

    \[ T = T_0 e^{-Ai} \]

Here i is the step index. The coefficient A determines how quickly the schedule decays. The value of A was chosen using the following formula, where T_0 is the initial temperature, T_N is the lowest attainable temperature and N is the number of steps desired to reach the minimum temperature T_N.

    \[A = \frac{1}{N}\log{\frac{T_0}{T_N}}\]
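As a quick numerical check of this formula, the sketch below assumes the schedule has the form T = T_0 e^{-Ai}, which reduces to T = e^{-Ai} when T_0 = 1. The specific values of T_0, T_N and N are illustrative, not the ones used in the project.

```python
import math


def decay_coefficient(t0, t_min, n_steps):
    """A = (1/N) * log(T_0 / T_N)."""
    return math.log(t0 / t_min) / n_steps


def temperature(i, t0, a):
    """Exponential schedule, assumed to be T = T_0 * exp(-A * i)."""
    return t0 * math.exp(-a * i)


# Illustrative values only.
T0, T_MIN, N = 1.0, 1e-3, 1000
A = decay_coefficient(T0, T_MIN, N)

print(A)                       # decay rate per step
print(temperature(0, T0, A))   # starts at T_0
print(temperature(N, T0, A))   # reaches T_N after N steps
```

Choosing A this way means N and T_N can be set directly, rather than tuning the decay rate by hand.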

The probability function is the one used by Kirkpatrick et al., which has been widely accepted over the years. The probability is 1 when E_{new} is less than E_{prev}, 0 when E_{new} = E_{prev}, and when E_{new} is greater than E_{prev} it is:

    \[ P = e^{\frac{-(E_{new} - E_{prev})}{T}} \]
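The three cases can be written as a small function. This is a sketch following the rule exactly as stated in the text, including the zero probability for equal energies; the function name and the sample values are illustrative.

```python
import math


def acceptance_probability(e_prev, e_new, temperature):
    """Probability of accepting a state with energy e_new, given the current
    energy e_prev and a strictly positive temperature."""
    if e_new < e_prev:
        return 1.0   # improvement: always accept
    if e_new == e_prev:
        return 0.0   # equal energies: rejected, as stated in the text
    return math.exp(-(e_new - e_prev) / temperature)


# The same uphill move becomes much less likely as the temperature drops.
for T in (1.0, 0.1, 0.01):
    print(T, acceptance_probability(1.0, 1.2, T))
```

At high temperature the search accepts many worse states and explores broadly; as the temperature falls the probability shrinks towards zero and the search behaves more like plain descent.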

Part C: Energy function

A function was devised that encompasses all the essential attributes of a good configuration together with a quantitative representation of the visible object features. Manipulability and the distance of the object from the sampled camera position were used, along with collision-check parameters obtained by ray tracing.

    \[ M = \min\left(\mathrm{eig}(JJ^T)\right) \]

J is the Jacobian of the arm at a configuration and the manipulability score M of that configuration is the minimum eigenvalue of the product of the Jacobian matrix and its transpose.
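To make the score concrete, the sketch below evaluates it for a toy planar two-link arm, where JJ^T is a 2x2 symmetric matrix and its eigenvalues have a closed form. The two-link arm, its link lengths, and the function names are assumptions chosen for illustration; the Baxter arm has more joints and would need a numerical eigenvalue routine.

```python
import math


def planar_2link_jacobian(q1, q2, l1=1.0, l2=1.0):
    """Position Jacobian of a planar two-link arm (toy stand-in for the real arm)."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]


def manipulability(J):
    """Minimum eigenvalue of J J^T for a 2x2 Jacobian J."""
    a = J[0][0] ** 2 + J[0][1] ** 2               # (J J^T)[0][0]
    b = J[0][0] * J[1][0] + J[0][1] * J[1][1]     # (J J^T)[0][1]
    d = J[1][0] ** 2 + J[1][1] ** 2               # (J J^T)[1][1]
    # Eigenvalues of [[a, b], [b, d]] are (a + d)/2 +/- sqrt(((a - d)/2)^2 + b^2).
    return (a + d) / 2.0 - math.sqrt(((a - d) / 2.0) ** 2 + b * b)


m_bent = manipulability(planar_2link_jacobian(0.0, math.pi / 2))
m_straight = manipulability(planar_2link_jacobian(0.0, 0.0))
print(m_bent, m_straight)
```

With the elbow bent at a right angle the score is clearly positive, while the fully stretched arm is singular and scores zero, which is why a higher M indicates a configuration that is easier to move away from.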


The results obtained were quite promising. This method was compared with two naive approaches: random sampling and random neighbourhood sampling. The criterion for success was the visibility of six feature points on the object after moving to the “supposedly” better viewing position determined by the method used. Random sampling’s biggest disadvantage was that it could not sample valid IK configurations. Random neighbourhood sampling gave more valid IK poses, because the samples were taken in the neighbourhood of a correct pose sample, but it still did not perform well and usually timed out with very poor scores. Simulated annealing succeeded on 17 occasions with an average restart rate of 4.96. On average, simulated annealing took about 4 minutes.

Fig 3. Optimal configuration for a better visibility generated using SA.


Work done at the ARC (Autonomous Robotic Collaboration) Lab under the supervision of Professor Dmitry Berenson.
