Training Data Overview

Our training dataset consists of 16 robotic nephrectomy procedures recorded on da Vinci Xi systems in porcine labs. The original video was recorded at 60 Hz; to reduce labelling cost, we subsample it to 2 Hz. Sequences with little or no motion are manually removed, leaving 149 frames per procedure. Video frames are 1280x1024. We provide the left and right eye camera images as well as the stereo camera calibration parameters. Labels are provided only for the left image.
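Going from 60 Hz to 2 Hz amounts to keeping one frame in every 30. The sketch below illustrates only that arithmetic; the function name and the choice to start at frame 0 are assumptions for illustration, not a description of the organisers' tooling.

```python
def subsample_indices(num_frames, source_hz=60, target_hz=2):
    """Return the indices of the frames kept when subsampling a video.

    Assumes the source rate is an integer multiple of the target rate
    and that sampling starts at frame 0 (both are illustrative choices).
    """
    if source_hz % target_hz != 0:
        raise ValueError("source_hz must be an integer multiple of target_hz")
    step = source_hz // target_hz  # 60 // 2 = 30
    return list(range(0, num_frames, step))


# Two seconds of 60 Hz video (120 frames) yields four frames at 2 Hz.
kept = subsample_indices(120)
print(kept)  # [0, 30, 60, 90]
```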

In each frame we hand-label several man-made and anatomical objects:

  • da Vinci robotic surgical instruments
    • Individual articulating parts (shaft, wrist, jaws)
  • Suturing needles
  • Suturing thread
  • Clips/clamps
  • Kidney parenchyma
    • Fascia-covered and uncovered
  • Small bowel
  • Background tissue

Each class will have a distinct numerical label in the ground truth image. A supplied JSON file will contain the mapping from class name to numerical label.
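As a rough illustration of how the class name to numerical label mapping might be consumed, the sketch below parses a JSON string and builds lookups in both directions. The layout of the supplied file is not specified here, so the flat name-to-integer object and the class names in the example are hypothetical.

```python
import json


def load_label_map(json_text):
    """Parse a class-name -> numerical-label mapping and build its inverse.

    Assumes (hypothetically) that the JSON is a flat object such as
    {"class name": 3, ...}; the real file may be structured differently.
    """
    name_to_id = json.loads(json_text)
    id_to_name = {label: name for name, label in name_to_id.items()}
    # Labels are described as distinct, so the inverse must not lose entries.
    if len(id_to_name) != len(name_to_id):
        raise ValueError("numerical labels are not distinct")
    return name_to_id, id_to_name


# Hypothetical example content, not the actual challenge file.
example = '{"background-tissue": 0, "instrument-shaft": 1, "kidney-parenchyma": 2}'
name_to_id, id_to_name = load_label_map(example)
print(id_to_name[1])  # instrument-shaft
```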

Test Data Overview

We will release two 500-frame sequences as the test set. They will come from procedures similar to those in the training data and will contain no instrument types or objects that are unseen in the training data.

Evaluation Criteria

The challenge will be ranked on the mean intersection over union (IoU) metric. For each frame, we compute the IoU for every class that is present in that frame and then average these per-class scores to obtain a per-frame score.
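The per-frame score can be sketched as follows. This is a minimal pure-Python illustration, not the official evaluation code: it treats a class as "present" when it appears in the ground truth label image, which is an assumption, and the function name and nested-list image representation are hypothetical.

```python
def per_frame_mean_iou(gt, pred, class_ids):
    """Mean IoU over the classes present in one frame.

    gt, pred: label images given as equally sized 2D lists of integers.
    class_ids: iterable of numerical labels to score.
    Assumption: a class counts as present if it occurs in the ground truth.
    """
    gt_flat = [v for row in gt for v in row]
    pred_flat = [v for row in pred for v in row]
    if len(gt_flat) != len(pred_flat):
        raise ValueError("ground truth and prediction differ in size")

    scores = []
    for c in class_ids:
        if c not in gt_flat:
            continue  # class absent from this frame, so it is not scored
        intersection = sum(1 for g, p in zip(gt_flat, pred_flat) if g == c and p == c)
        union = sum(1 for g, p in zip(gt_flat, pred_flat) if g == c or p == c)
        scores.append(intersection / union)

    if not scores:
        return None  # no scored class appears in this frame
    return sum(scores) / len(scores)


# Tiny 2x2 example: class 0 has IoU 1/2, class 1 has IoU 2/3, class 2 is absent.
gt = [[0, 0], [1, 1]]
pred = [[0, 1], [1, 1]]
score = per_frame_mean_iou(gt, pred, class_ids=[0, 1, 2])
```

In this example the per-frame score is the average of 1/2 and 2/3, that is 7/12. How per-frame scores are aggregated into a final ranking is not covered by this sketch.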