Institut für Nachrichtentechnik

Source-Filter Based Clustering for Monaural Blind Source Separation

DAFx 2009

Abstract

In monaural blind audio source separation scenarios, a signal mixture is usually separated into more signals than there are active sources. It is therefore necessary to group the separated signals into the final source estimates. Traditionally, grouping methods are supervised and thus need a learning step on appropriate training data. In contrast, we discuss unsupervised clustering of the separated channels by Mel frequency cepstrum coefficients (MFCC). We show that replacing the decorrelation step of the MFCC by non-negative matrix factorization (NMF) improves the separation quality significantly. The algorithms have been evaluated on a large test set consisting of melodies played with different instruments, vocals, speech, and noise.
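The grouping step mentioned in the abstract can be illustrated with a short sketch. It assumes that some clustering stage has already assigned one label to each separated channel, and that a source estimate is the sample-wise sum of the channels in its cluster. The function name `group_channels` and the toy signals are hypothetical and are not taken from the published Matlab code.

```python
from collections import defaultdict


def group_channels(channels, labels):
    """Sum all separated channels that share a cluster label.

    channels: list of equal-length lists of samples, one per separated channel
    labels:   list of cluster indices, one per channel (assumed to be given)
    Returns a dict mapping each label to its summed source estimate.
    """
    if len(channels) != len(labels):
        raise ValueError("need exactly one label per channel")
    # Each new label starts with an all-zero signal of the right length.
    sources = defaultdict(lambda: [0.0] * len(channels[0]))
    for signal, label in zip(channels, labels):
        estimate = sources[label]
        for n, sample in enumerate(signal):
            estimate[n] += sample
    return dict(sources)


# Toy data: four separated channels clustered into two sources.
channels = [[1.0, 0.0, 2.0], [0.5, 0.5, 0.5], [0.0, 1.0, 0.0], [2.0, 2.0, 2.0]]
labels = [0, 1, 0, 1]
estimates = group_channels(channels, labels)
print(estimates)
```

With more separated channels than sources, the quality of the final estimates depends entirely on how the labels are chosen, which is the problem the clustering variants compared on this page address.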

Keywords:

Clustering, Monaural Blind Sound Source Separation, NMF, Audio

Paper : SpGn09a.pdf

Slides : Talk_DAFx09.pdf

Matlab-Code:

An example implementation is available under the GNU General Public License:
download

Sound Examples with 2 active Sources

Mixture	P_rand	P_MFCC	P_NMF,Div	P_NMF,Euc	P_ref
Bass Guitar	4.53	20.30	13.59	13.59	20.30
Bass Keyboard	1.61	14.37	14.37	14.37	14.45
Bass Drums	1.67	1.76	1.76	1.76	3.15
Guitar Keyboard	3.19	2.85	1.47	4.83	5.88
Guitar Drums	1.53	8.34	8.34	18.60	19.14
Keyboard Drums	4.32	8.71	15.87	15.88	15.92

Remarks:

Results are shown in dB.
Mixtures are created with a dynamic difference of 0 dB.
For such mixing scenarios, P_NMF,Euc generally leads to good clustering results, as mentioned in the paper.
The mixture of bass and drums is separated well except for the bass drum, which is separated but clustered into the bass output. The very low SER can be explained by the high energy of the bass drum.
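The dB figures in the table can be made concrete with a small sketch. It assumes that SER stands for signal-to-error ratio and is computed as the energy of the reference signal divided by the energy of the estimation error. This page does not spell out the definition, so the formula, the function name `ser_db`, and the toy signals are assumptions.

```python
import math


def ser_db(reference, estimate):
    """Signal-to-error ratio in dB: 10*log10(sum(s^2) / sum((s - s_hat)^2))."""
    signal_energy = sum(s * s for s in reference)
    error_energy = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    if error_energy == 0.0:
        return math.inf   # perfect reconstruction
    if signal_energy == 0.0:
        return -math.inf  # silent reference, nonzero error
    return 10.0 * math.log10(signal_energy / error_energy)


reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9, -0.9, 0.9, -0.9]  # 10 % amplitude error on every sample
print(round(ser_db(reference, estimate), 2))
```

Because the error energy sits in the denominator, a single high-energy component assigned to the wrong output dominates the result, which fits the remark about the bass drum.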

Sound Examples with 2 active Sources and dynamic differences

Dynamic difference	Instrument	P_rand	P_MFCC	P_NMF,Div	P_NMF,Euc	P_ref
DD 0 dB	Piccolo	3.50	3.63	9.60	9.63	10.06
DD 0 dB	Horn	3.53	3.89	9.71	9.71	10.15
DD 10 dB	Piccolo	6.01	3.44	17.12	5.69	17.43
DD 10 dB	Horn	-3.96	-6.55	7.17	-4.29	7.42

Remarks:

DD stands for the dynamic difference between the two input signals.
Results are shown in dB.
Sound files can be found here.
For a dynamic difference of 0 dB, P_NMF,Euc leads to slightly better separation results than P_NMF,Div.
For a dynamic difference of 10 dB, P_NMF,Div is significantly better than P_NMF,Euc.
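How a mixture with a given dynamic difference might be created can be sketched as follows. The sketch assumes that the dynamic difference is the energy ratio of the two input signals in dB, and that the second signal is attenuated before both are added. Which signal is scaled, the function name `mix_with_dynamic_difference`, and the toy signals are assumptions and not taken from the paper.

```python
import math


def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)


def mix_with_dynamic_difference(s1, s2, dd_db):
    """Scale s2 so that energy(s1)/energy(s2) equals dd_db in dB, then add."""
    target_ratio = 10.0 ** (dd_db / 10.0)  # desired energy(s1) / energy(s2)
    gain = math.sqrt(energy(s1) / (target_ratio * energy(s2)))
    s2_scaled = [gain * v for v in s2]
    mixture = [a + b for a, b in zip(s1, s2_scaled)]
    return mixture, s2_scaled


s1 = [1.0, -1.0, 1.0, -1.0]
s2 = [0.5, 0.5, -0.5, -0.5]
mixture, s2_scaled = mix_with_dynamic_difference(s1, s2, 10.0)

# Check: the energy ratio of the two inputs after scaling, in dB.
measured_dd = 10.0 * math.log10(energy(s1) / energy(s2_scaled))
print(round(measured_dd, 2))
```

Setting `dd_db` to 0 gives both inputs equal energy, which corresponds to the 0 dB rows of the table.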

Sound Examples with 3 active Sources: Bass, Harp, and Piccolo

	P_rand	P_MFCC	P_NMF,Div	P_NMF,Euc	P_MFCC,Hier	P_NMF,Div,Hier	P_NMF,Euc,Hier	P_ref
mean	2.83	7.98	20.60	20.62	1.92	20.57	20.62	20.95
Bass	4.94	18.75	18.69	18.69	2.66	18.72	18.69	18.85
Harp	1.35	2.52	17.84	17.93	3.13	17.75	17.93	18.68
Piccolo	2.22	2.67	25.26	25.25	-0.01	25.26	25.25	25.32

Remarks:

Results are shown in dB.
Sound files can be found here.

Sound Examples with 3 active Sources: Castanets, Violoncello, and Flute

	P_rand	P_MFCC	P_NMF,Div	P_NMF,Euc	P_MFCC,Hier	P_NMF,Div,Hier	P_NMF,Euc,Hier	P_ref
mean	0.30	10.66	3.74	11.04	2.32	10.77	6.09	11.25
Castanets	-0.28	14.10	10.89	14.93	5.46	14.31	5.46	14.94
Violoncello	1.74	8.63	0.04	8.85	1.43	8.65	8.64	9.20
Flute	-0.57	9.25	0.29	9.35	0.07	9.35	4.16	9.60

Remarks:

Results are shown in dB.
Sound files can be found here.
For P_NMF,Div, the violoncello and the flute could not be separated.
The hierarchical clustering P_NMF,Div,Hier increases the separation quality by first separating an obvious source (the castanets). After that, the remaining channels are clustered again into two other sources (violoncello and flute).
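The two-stage idea in the last remark can be sketched with a toy example. The sketch uses plain 2-means on hypothetical per-channel feature vectors and treats the more compact of the first two clusters as the "obvious" source. Both the clustering routine and this compactness criterion are assumptions chosen for illustration, and the paper's hierarchical method may differ.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def two_means(features, indices, iterations=20):
    """Split the channels in `indices` (at least two) into two clusters."""
    # Deterministic initialisation: the two mutually farthest channels.
    a, b = max(((i, j) for i in indices for j in indices if i < j),
               key=lambda p: sq_dist(features[p[0]], features[p[1]]))
    centers = [features[a], features[b]]
    for _ in range(iterations):
        clusters = [[], []]
        for i in indices:
            d0 = sq_dist(features[i], centers[0])
            d1 = sq_dist(features[i], centers[1])
            clusters[0 if d0 <= d1 else 1].append(i)
        if not clusters[0] or not clusters[1]:
            break  # degenerate split, e.g. all features identical
        new_centers = [centroid([features[i] for i in c]) for c in clusters]
        if new_centers == centers:
            break  # assignments no longer change
        centers = new_centers
    return clusters


def scatter(features, cluster):
    """Mean squared distance of a cluster's members to its centroid."""
    c = centroid([features[i] for i in cluster])
    return sum(sq_dist(features[i], c) for i in cluster) / len(cluster)


def hierarchical_three_sources(features):
    """First split off one source, then split the remaining channels again."""
    first, second = two_means(features, list(range(len(features))))
    # Assumption: the tighter cluster is the "obvious" source; the looser
    # cluster still holds two sources and is clustered a second time.
    tight, loose = sorted((first, second), key=lambda c: scatter(features, c))
    return [tight] + two_means(features, loose)


# Hypothetical 2-D features for six separated channels: two channels far away
# from the rest (percussive), and two pairs that lie close to each other.
features = [(10.0, 10.0), (10.5, 9.5), (0.0, 0.0), (0.2, 0.1), (1.0, 1.2), (1.1, 1.0)]
groups = hierarchical_three_sources(features)
print(groups)
```

In this toy data the first split isolates the two distant channels, and only the second split has to resolve the two similar groups, mirroring the castanets-first behaviour described in the remark.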

(C) by Martin Spiertz - 07. September 2009 - spiertz@ient.rwth-aachen.de
