NIX Solutions: Google Learned to Recognize Sung and Whistled Songs

Google Assistant has learned to recognize not only recorded songs but also songs that are sung or whistled. After processing the recording with machine learning algorithms, the application displays the most likely songs and indicates the probability of a match for each, N+1 reports. The feature is available in the Google app or widget for Android and iOS.

Smartphone programs that can recognize songs playing nearby have existed for many years. The implementations vary from program to program, but the general principle is the same: analyze the peaks in the spectrogram of the audio recording, convert this data into an acoustic fingerprint, and compare it with the fingerprints in a database.
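
The peak-based approach described above can be illustrated with a toy sketch. It is not the algorithm of any particular application: the frame size, the pairing of consecutive peaks into landmarks, the Jaccard similarity measure and the names `frame_peaks`, `fingerprint`, `similarity` and `tone_sequence` are all assumptions made for illustration, and the "songs" are synthetic tone sequences rather than real audio.

```python
import cmath
import math

def frame_peaks(samples, frame_size=64):
    """Return the index of the strongest frequency bin in each frame."""
    peaks = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        best_bin, best_mag = 0, -1.0
        # Naive DFT over the positive-frequency bins (skipping DC at k=0).
        for k in range(1, frame_size // 2):
            coeff = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n, x in enumerate(frame))
            if abs(coeff) > best_mag:
                best_bin, best_mag = k, abs(coeff)
        peaks.append(best_bin)
    return peaks

def fingerprint(samples, frame_size=64):
    """Turn consecutive peak pairs into a compact set of landmarks."""
    peaks = frame_peaks(samples, frame_size)
    return set(zip(peaks, peaks[1:]))

def similarity(fp_a, fp_b):
    """Jaccard overlap between two fingerprints, in the range [0, 1]."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)

def tone_sequence(bins, frame_size=64):
    """Synthesize a test signal: one pure tone per frame at the given bins."""
    return [math.sin(2 * math.pi * k * n / frame_size)
            for k in bins for n in range(frame_size)]

# Hypothetical two-song database built from synthetic tones.
database = {
    "song_a": fingerprint(tone_sequence([4, 7, 9, 7, 4, 12])),
    "song_b": fingerprint(tone_sequence([10, 3, 10, 15, 3, 6])),
}
query = fingerprint(tone_sequence([7, 9, 7, 4]))  # an excerpt of song_a
best_match = max(database, key=lambda name: similarity(query, database[name]))
print(best_match)
```

Because the fingerprint keeps only which frequency peak follows which, a short excerpt still shares landmarks with the full song it was taken from, which is what makes lookup against a database possible.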

Since the data is compared with the database in a highly compressed form, which essentially reflects the main melody of the song, the same method could potentially be applied to sung melodies. In practice, this is a difficult task: the data may be incomplete (the person has forgotten part of the melody) and distorted, and a hummed recording contains only one “instrument” instead of several.
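
To make the difficulty concrete, here is a minimal classical sketch of comparing a hummed melody with a reference. It is not how Google's feature works, since the article says that relies on trained neural network models. The sketch assumes the melody has already been transcribed into MIDI note numbers, compares intervals between notes so that singing in a different key does not matter, and uses edit distance so that a forgotten note costs only one step. All function names and note sequences are hypothetical.

```python
def intervals(pitches):
    """Successive pitch differences in semitones; unchanged by transposition."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def edit_distance(a, b):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # drop an element of a
                            curr[j - 1] + 1,      # drop an element of b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]

def melody_distance(hummed, reference):
    """Smaller means more similar; 0 means identical interval sequences."""
    return edit_distance(intervals(hummed), intervals(reference))

# Hypothetical melodies as MIDI note numbers.
reference = [60, 60, 67, 67, 69, 69, 67]
hummed_low = [p - 5 for p in reference]   # same tune, five semitones lower
hummed_gap = [55, 55, 62, 64, 64, 62]     # lower key, one note forgotten
other = [60, 62, 64, 65, 67, 69, 71]      # a different tune (a rising scale)

print(melody_distance(hummed_low, reference))
print(melody_distance(hummed_gap, reference))
print(melody_distance(other, reference))
```

Real humming is also off-pitch and unevenly timed, which exact interval matching does not tolerate; that gap is one reason a learned model is a natural fit for this task.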

A few years ago, the developers of the SoundHound application implemented a similar function, and now it has also appeared in Google Search. To use it, you need to ask the voice assistant what song is currently playing. The developers trained new neural network models on recordings of people whistling and humming various songs, with or without words. As a result, the song recognition service learned to match such recordings against regular songs in its database.

NIX Solutions notes that at the time of launch, the function is available in the latest versions of the Google applications on Android and iOS. On Android it works with more than 20 languages, while on iOS it is so far available only in English, but the company promises to expand the list of languages. After the algorithms have listened to the melody, the application displays not a single song but three candidates to choose from, with the probability of a match for each and, in some cases, a button for additional results.