Source-Filter Based Clustering for Monaural Blind Source Separation

DAFx 2009


Abstract

In monaural blind audio source separation, a signal mixture is usually separated into more signals than there are active sources. It is therefore necessary to group the separated signals into the final source estimates. Traditionally, grouping methods are supervised and thus require a learning step on appropriate training data. In contrast, we discuss unsupervised clustering of the separated channels based on Mel-frequency cepstral coefficients (MFCC). We show that replacing the decorrelation step of the MFCC computation with non-negative matrix factorization (NMF) improves the separation quality significantly. The algorithms were evaluated on a large test set consisting of melodies played with different instruments, vocals, speech, and noise.
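The grouping idea from the abstract can be illustrated with a minimal sketch. It is not the released Matlab implementation and may differ from the paper in important details. It assumes that each separated channel has already been reduced to a non-negative feature vector (for example a Mel-filtered, log-compressed spectral envelope), stacked as the columns of a matrix V. In place of a decorrelating transform, V is factorized by NMF with as many components as sources, and each channel is assigned to the component with the strongest weighted activation. The function names (`nmf_euclidean`, `cluster_channels`), the Euclidean multiplicative updates, and the toy data are all assumptions made for illustration.

```python
import random

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf_euclidean(V, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate non-negative V (features x channels) by W * H using
    multiplicative updates for the Euclidean cost."""
    rng = random.Random(seed)
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in V]
    H = [[rng.random() + 0.1 for _ in V[0]] for _ in range(k)]
    for _ in range(n_iter):
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(*rows)] for rows in zip(H, num, den)]
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(*rows)] for rows in zip(W, num, den)]
    return W, H

def cluster_channels(V, n_sources, **nmf_kwargs):
    """Return one cluster label per channel (column of V)."""
    W, H = nmf_euclidean(V, n_sources, **nmf_kwargs)
    # Weight each activation by the size of its basis vector so that the
    # arbitrary scaling between W and H does not bias the comparison.
    scale = [sum(row[k] for row in W) for k in range(n_sources)]
    n_channels = len(V[0])
    return [max(range(n_sources), key=lambda k: scale[k] * H[k][j])
            for j in range(n_channels)]

# Toy data (hypothetical): four separated channels, two per source,
# sharing a spectral envelope but differing in gain.
envelope_a = [3.0, 2.0, 0.0, 0.0]
envelope_b = [0.0, 0.0, 1.0, 4.0]
gains = [(1.0, envelope_a), (2.0, envelope_b), (0.5, envelope_a), (1.0, envelope_b)]
V = transpose([[g * e for e in env] for g, env in gains])  # features x channels

labels = cluster_channels(V, n_sources=2)
print(labels)
```

On this toy input, channels 0 and 2 should receive one label and channels 1 and 3 the other; which label is 0 and which is 1 depends on the random initialization. Real feature vectors overlap far more than these envelopes, so the choice of cost function and initialization matters much more in practice.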

Keywords:

Clustering, Monaural Blind Sound Source Separation, NMF, Audio

Paper: SpGn09a.pdf

Slides: Talk_DAFx09.pdf

Matlab Code:

An example implementation is available for download under the GNU General Public License.

 

Sound Examples with 2 active Sources

Source 1  Source 2  P_rand  P_MFCC  P_NMF,Div  P_NMF,Euc  P_ref
Bass      Guitar      4.53   20.30      13.59      13.59  20.30
Bass      Keyboard    1.61   14.37      14.37      14.37  14.45
Bass      Drums       1.67    1.76       1.76       1.76   3.15
Guitar    Keyboard    3.19    2.85       1.47       4.83   5.88
Guitar    Drums       1.53    8.34       8.34      18.60  19.14
Keyboard  Drums       4.32    8.71      15.87      15.88  15.92


Sound Examples with 2 active Sources and dynamic differences

DD     Source   P_rand  P_MFCC  P_NMF,Div  P_NMF,Euc  P_ref
0 dB   Piccolo    3.50    3.63       9.60       9.63  10.06
0 dB   Horn       3.53    3.89       9.71       9.71  10.15
10 dB  Piccolo    6.01    3.44      17.12       5.69  17.43
10 dB  Horn      -3.96   -6.55       7.17      -4.29   7.42


Sound Examples with 3 active Sources: Bass, Harp, and Piccolo

Source   P_rand  P_MFCC  P_NMF,Div  P_NMF,Euc  P_MFCC,Hier  P_NMF,Div,Hier  P_NMF,Euc,Hier  P_ref
mean       2.83    7.98      20.60      20.62         1.92           20.57           20.62  20.95
Bass       4.94   18.75      18.69      18.69         2.66           18.72           18.69  18.85
Harp       1.35    2.52      17.84      17.93         3.13           17.75           17.93  18.68
Piccolo    2.22    2.67      25.26      25.25        -0.01           25.26           25.25  25.32


Sound Examples with 3 active Sources: Castanets, Violoncello, and Flute

Source       P_rand  P_MFCC  P_NMF,Div  P_NMF,Euc  P_MFCC,Hier  P_NMF,Div,Hier  P_NMF,Euc,Hier  P_ref
mean           0.30   10.66       3.74      11.04         2.32           10.77            6.09  11.25
Castanets     -0.28   14.10      10.89      14.93         5.46           14.31            5.46  14.94
Violoncello    1.74    8.63       0.04       8.85         1.43            8.65            8.64   9.20
Flute         -0.57    9.25       0.29       9.35         0.07            9.35            4.16   9.60


(C) by Martin Spiertz - 07. September 2009 - spiertz@ient.rwth-aachen.de