Amy Hurst, Scott E. Hudson, Jennifer Mankoff

Information about the location and size of the targets that users interact with in real-world settings can enable new innovations in human performance assessment and software usability analysis. Accessibility APIs provide some information about the size and location of targets. However, this information is incomplete because it does not support all targets found in modern interfaces, and the reported sizes can be inaccurate. These accessibility APIs access the size and location of targets through low-level hooks into the operating system or an application. We have developed an alternative solution for target identification that leverages visual affordances in the interface and the visual cues produced as users interact with targets. We have used our novel target identification technique in a hybrid solution that combines machine learning, computer vision, and accessibility API data to find the size and location of the targets users select with 89% accuracy. Our hybrid approach is superior to...
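To make the hybrid architecture named in the abstract concrete, the Python sketch below shows one plausible shape for such a pipeline: query the accessibility API for the element under a click, look for visual affordances around the same point, and let a learned classifier arbitrate between the two estimates. This is an illustrative assumption, not the paper's actual implementation; `api`, `vision`, `classifier`, and their methods (`bounds_at`, `choose`) are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)

@dataclass
class TargetBounds:
    """Bounding box of an on-screen target, plus which subsystem found it."""
    x: int
    y: int
    width: int
    height: int
    source: str

def identify_target(click_x: int, click_y: int, screenshot,
                    api, vision, classifier) -> Optional[TargetBounds]:
    """Hypothetical hybrid pipeline combining accessibility API data,
    computer vision, and a learned arbitration step.

    `api`, `vision`, and `classifier` stand in for the three components
    named in the abstract; their interfaces here are assumptions.
    """
    # 1. Ask the accessibility layer for the element under the click.
    #    (A real system would reach this through OS-level hooks such as
    #    UI Automation/MSAA on Windows or AT-SPI on Linux.) The result
    #    may be None for unsupported targets, or inaccurately sized.
    api_bounds: Optional[Box] = api.bounds_at(click_x, click_y)

    # 2. Search the screenshot for visual affordances and interaction
    #    cues (e.g., button borders, hover highlights) near the click.
    vision_bounds: Optional[Box] = vision.bounds_at(screenshot, click_x, click_y)

    # 3. Let a trained classifier decide which estimate, if either,
    #    to trust, based on features of both candidates.
    choice = classifier.choose(api_bounds, vision_bounds)
    if choice == "api" and api_bounds is not None:
        return TargetBounds(*api_bounds, source="accessibility_api")
    if choice == "vision" and vision_bounds is not None:
        return TargetBounds(*vision_bounds, source="computer_vision")
    return None  # neither subsystem produced a credible target
```

In the system the abstract describes, the machine-learning component would be trained on labeled interaction data; the `classifier.choose` call above is merely a stand-in for that arbitration step.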