The main goal of the AudioHelix (a.k.a. RAABSPM: Rapid and Accurate Audio Browsing by Structural Pattern Modelling) project is to develop methods for comparing and searching sound, together with several end-user applications that use these methods to retrieve sound or music samples based on specified properties of their content. The methods will be implemented in software that can quickly and accurately search a large database of audio samples and retrieve those that are similar to a given query. The software will also support searching on other selected properties of the sample content. The audio samples may be any kind of music, songs, movie soundtracks, radio recordings, speeches or other types of sound, and the database may contain millions of them. The patterns we use are relevant to modelling musical structure and allow searches to take into account both spectral similarity and the succession of timbres.

Audionamix has developed technology that separates many different parts of a monaural sound signal. The signal is first divided into packets, each containing data from a certain interval in time. Each of these packets is then decomposed into sound elements that relate to the sound sources present in the audio signal. These sound elements are in some ways analogous to DNA.

Sencel Bioinformatics AS has for many years been at the forefront of the development of rapid and sensitive software for searching huge public databases of DNA and protein sequences. Sencel has employed a range of parallel computing technologies to achieve high-speed searches and is currently developing the fastest search tool that uses the gold-standard Smith-Waterman sequence comparison algorithm on common microprocessors.

Sound signals and DNA sequences share common features that make search algorithms from genetics very attractive for audio applications. Several general signal processing techniques may be employed in the analysis of both. In both cases we are interested in identifying parts of one signal that resemble a part of another signal. In both cases the general sequential order of the signals needs to be conserved. And in both cases, parts of the signals may be missing, or a new part may have been inserted between the conserved parts. These properties are common to both types of data, and both tasks are forms of the local alignment problem.
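
As an illustration of local alignment, the following is a minimal Python sketch of Smith-Waterman scoring applied to sequences of symbols. It is not the project's implementation: the function name `smith_waterman`, the use of text labels to stand in for sound elements, the linear gap penalty and the score values are all assumptions made for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    a and b are sequences of comparable symbols. Here they are hypothetical
    labels for sound elements; the real representation is not specified.
    Scores and the linear gap penalty are illustrative assumptions.
    A gap in either aligned sequence is represented by None.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; a cell never drops below zero, which is
    # what makes the alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # element of a has no partner
                score[i][j - 1] + gap,      # element of b has no partner
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append(None)
            i -= 1
        else:
            out_a.append(None)
            out_b.append(b[j - 1])
            j -= 1
    return best, out_a[::-1], out_b[::-1]


if __name__ == "__main__":
    # Hypothetical sound-element labels: the recording contains the query
    # with one extra element ("noise") inserted in the middle.
    query = ["kick", "snare", "hat", "kick", "bass", "snare"]
    recording = ["pad", "kick", "snare", "hat", "noise", "kick", "bass", "snare", "pad"]
    best_score, aligned_query, aligned_recording = smith_waterman(query, recording)
    print(best_score)
    print(aligned_query)
    print(aligned_recording)
```

The order of the matched elements is preserved, the unrelated elements at either end of the recording are ignored, and the inserted element is absorbed as a gap, which corresponds to the three properties listed above. This plain version takes time proportional to the product of the two sequence lengths, so searching millions of samples would depend on the kind of parallel, optimised implementation mentioned earlier.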

The outcomes of this project will enable novel audio search tools, sound detection technologies, and the recommendation of audio samples from databases.