The assessment of surgical skills for Minimally Invasive Surgery (MIS) has traditionally been conducted with visual observation and objective scoring. This paper presents a practical framework for detecting instrument/tissue interaction from MIS video sequences by incorporating multiple visual cues. The proposed technique investigates the characteristics of four major events involved in MIS procedures: idle, retraction, cauterisation and suturing. Constant instrument tracking is maintained, and multiple visual cues related to shape, deformation, changes in light reflection and other low-level image features are combined in a Bayesian framework, achieving an overall frame-by-frame classification accuracy of 77% and an episode classification accuracy of 85%.
Benny P. L. Lo, Ara Darzi, Guang-Zhong Yang