Birdsong Phrase Classification With Siamese Neural Networks

Recognizing birdsong patterns with machine learning

1 July 2020


  • Deep Learning
  • Bioacoustics

As part of my masters thesis I developed a “Shazam” for birdsong based on siamese neural networks, a few-shot machine learning technique capable of recognizing Cassin’s Vireo song elements. Below you will find the recorded live stream of my thesis defense.

Abstract: The process of learning good features to discriminate among numerous and different bird phrases is computationally expensive. Moreover, it might be impossible to achieve acceptable performance in cases where training data is scarce and classes are unbalanced. To address this issue, we propose a few-shot learning task in which an algorithm must make predictions given only a few instances of each class. We compared the performance of different Siamese Neural Networks at metric learning over the set of Cassini’s Vireo syllables. Then, the network features were reused for the few-shot classification task. With this approach we overcame the limitations of data scarcity and class imbalance while achieving state-of-the-art performance.

Santiago Rentería
Computer scientist and audio engineer working at the intersection of machine listening and acoustic ecology.
Interests: sound art, media archaeology and process philosophy.