Abstract:
Convolutional Neural Networks have proved to be a powerful tool for solving a wide range of Computer Vision tasks, especially where it is difficult to implement a solution in a purely algorithmic way. In industry, the availability of powerful deep models for classification, detection, and image segmentation now offers new possibilities for automating not only production but also the quality assessment of final products. Unfortunately, industrial applications face some limitations, especially in so-called "Embedded Vision" solutions, where such models have to be transferred and run directly on-camera. Indeed, limited memory and computational capability impose significant constraints on the architecture of the model to use. In recent years, a large body of research has addressed these problems, proposing various compression algorithms to reduce the number of parameters of neural networks. The purpose of this thesis is to explore the existing literature and to provide general guidelines that can be beneficial during the deployment of industrial applications. Among all the problems and challenges that industrial settings present, two very common issues concern the complexity of the models and the scarcity of data. This work examines some of the most widely used techniques for tackling these problems and contributes two novel methods: a data augmentation technique to compensate for heavily imbalanced classes, and a filter pruning algorithm that greatly reduces the inference time and memory footprint of a model. Both algorithms have been implemented in real-world scenarios and compared with other methods from the existing literature.