Drones with mounted cameras offer significant advantages over fixed cameras in object detection and visual tracking scenarios. Their growing adoption in the wild, together with recent advances in computer vision models, has led to the introduction of many aerial datasets.
In this talk, we’ll explore recent advances in object detection, comparing the challenges posed by natural images with those posed by images recorded by drones. Building on the success of pretraining image classifiers on large datasets and transferring the learned representations, we’ll present and explain a set of object detectors fine-tuned on publicly available aerial datasets. We’ll highlight existing libraries that mitigate the cost of training large models from scratch by providing pretrained weights and model variants found in the literature. Both Convolutional Neural Networks and the more recent Transformers applied to vision will be covered and compared, outlining the main features of each architecture. The presentation will be accompanied by code snippets to aid understanding and provide practical examples.
This talk is aimed at a general audience familiar with Python. Knowledge of Computer Vision is a plus but not a requirement, as we’ll introduce the necessary concepts. We’ll ground the presented model architectures and libraries in the task of object detection applied to aerial datasets and demonstrate that state-of-the-art methods are within everyone’s reach.