An unmanned aerial vehicle-based assessment method for quantifying computer vision models
Abstract
Computer vision is a growing field within computer science. With the advancement of machine learning, computer vision solutions have proliferated, and machine learning models are now widely used in robotics. A problem arises when evaluation methods designed for static datasets are applied to robotic vision systems. The accuracy metric, although valuable from a data-driven perspective, offers little benefit in robotics: the accuracy of a Convolutional Neural Network (CNN) computed on an evaluation dataset is relevant only to that dataset. It does not define the distance at which the CNN's accuracy falls below a required threshold, it does not reveal the network's strengths and weaknesses with respect to the orientation of the object, and it does not report accuracy for a specific combination of orientation and distance. Orientation and distance are key factors when considering a computer vision solution for robotics. A popular example is Tesla, which combines multiple systems to produce its self-driving capabilities; one of these is a camera feed that uses machine learning to interpret the context of an image. Tesla needs that system to perform across a wide range of object distances and orientations [9]. A single accuracy metric is not enough to define the limitations of such a system. This thesis proposes an evaluation method capable of defining the spatial limitations of a CNN for 3D objects. The approach uses an Unmanned Aerial Vehicle (UAV) as a mobile sensor to generate the desired distances and orientations relative to the object being evaluated.
Multiple flight sequences provide the data needed to identify the exact point at which accuracy begins to decrease and the orientations at which the network is weakest. The approach was tested with a two-class CNN that detects whether a Ford Ranger is present in an image. The experimental results show that the UAV-based evaluation can characterize the CNN's dependencies, such as distance from the object, altitude, and object orientation, and quantify the impact each dependency has on accuracy. A UAV was chosen because of its innate capability as a mobile sensor able to produce any required perspective and distance.
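The core of the evaluation described above can be sketched as a simple aggregation: tag each evaluation image with the distance and orientation at which it was captured, bin the CNN's per-image correctness by distance, and report the first bin whose accuracy falls below the required threshold. The sketch below is illustrative only, not the thesis implementation; the records, bin size, and threshold are hypothetical.

```python
# Illustrative sketch (not the thesis code): given per-image evaluation
# records tagged with capture geometry, bin accuracy by distance and
# report the nearest distance bin where accuracy drops below a threshold.
from collections import defaultdict

# Hypothetical records: (distance_m, orientation_deg, prediction_correct)
records = [
    (5, 0, True), (5, 90, True), (10, 0, True), (10, 90, True),
    (15, 0, True), (15, 90, False), (20, 0, False), (20, 90, False),
]

def accuracy_by_distance(records, bin_size=5):
    """Group records into distance bins and compute per-bin accuracy."""
    bins = defaultdict(list)
    for dist, _orient, correct in records:
        bins[(dist // bin_size) * bin_size].append(correct)
    # dict built from sorted items iterates in increasing distance order
    return {d: sum(v) / len(v) for d, v in sorted(bins.items())}

def falloff_distance(acc_by_dist, threshold=0.75):
    """Return the nearest distance bin whose accuracy is below threshold."""
    for dist, acc in acc_by_dist.items():
        if acc < threshold:
            return dist
    return None  # accuracy never fell below the threshold

acc = accuracy_by_distance(records)
print(falloff_distance(acc, threshold=0.75))  # prints 15 for this data
```

The same binning can be applied per orientation (or jointly over distance and orientation) to expose the weakest viewing angles, which is the kind of dependency map the abstract describes.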