16.1.07

Empirical Mode Decomposition (EMD) and Head Related Transfer Function (HRTF) - Si, es verdad, primer post tematico

Bien, es hora de escribir sobre ciencia

Principlamente son tres los objetivos:
1) Divulgar
2) Que lo que se presente sea criticado
3) Sugerencias del lector

Como no tengo mucho remordimiento, estos post seran mezcla entre español, ingles y los idiomas que sean necesarias, incluyendo obviamente la matematica dentro de esa categoria.

En este post abordaremos rapidamente la tecnica llamada Empirical Mode Decomposition (EMD) porpuesta por Huang et al [1]. Esta tecnica descompe cualquier señal en sus denominados modos intrinsecos, estos son dependientes de la señal y sirven para reconstruir la señal original. El principio es similar a la serie (transformada) de Fourier, la cual utiliza una base de funciones ortogonales (senos y cosenos) para escribir como combinacion lineal cualquier señal que cumpla ciertas caracteristicas, si ese serie de Fourier que sea periodica, si es transformada de fourier se hace periodica a la fuerza y otra importante es que la señal a descomponer tenga un numero finito de discontinuidades para asegurar la convergencia (ver series de fourier y transformada de Fourier).

Cual es el "problema": Transformada de Fourier asume linealidad y estacionariedad de la señal.

El combinar EMD con la transformada de Hilbert da paso a la transformada de Hilbert-Huang la cual no asume absolutamente nada acerca de la naturaleza de la señal y por tanto es ideal para trabajar con señales no lineales y no estacionarias [2], como por ejemplo AUDIO.

Mi objetivo es aplicar EMD al analisis de HRTF. Siendo especifico a cada HRIR, la idea es descomponer las HRIR con dicha técnica en sus modos intrinsecos (IMF) y correlacionar los IMF con data antropomorfica de cada persona, es decir, asociar cada IMF con la influencia del torso, pinna, cabeza, etc...

Vamos al temita en sí:

Brief Introduction

The empirical Mode Decomposition is a relatively new method for analysing nonlinear and non-stationary data. The key part of the method is the `empirical mode decomposition' method with which any complicated data set can be decomposed into a finite and often small number of `intrinsic mode functions' (IMF) that admit well-behaved Hilbert transforms. In this work, the first approach considers the aplication of the Fourier Transform on each Intrinsic Mode Functions in order to analyze the decomposition in the frequency domain and correlated these spectrums with anthropomorphic data.

Given such a signal, the method adaptively decomposes it into a number of modes (IMFs) that are topologically equivalent to amplitude and frequency modulated, sinusoidal signals. In the analytic signal representation, the modes correspond to proper rotations [3]. Thus, the EMD method naturally yields estimates of the significant instantaneous frequencies embedded in the signal, by performing the Hilbert transform on each IMF, in this work the first approach consist in applying the Fourier Transform on each IMFs.

Let x(t) a time dependent signal, the EMD method focuses on the level of local oscillations and decomposes the signal into a finite and often a small number of fundamental oscillatory modes. The bases (IMFs) into which the signal is decomposed are obtained from the signal itself, and they are defined in the time domain. They have the same length as the original signal and preserve the frequency variations with time. The base modes can be made approximately complete and nearly orthogonal with respect to each other. Here, completeness implies that the original signal can be reconstructed without any loss of data by simply adding up the IMFs. Therefore the IMFs can be viewed as linear components of the original or source signal x(t)

In order to achieve this, two conditions need to be satisfied:

(a) The total number of extrema of IMF(t) must be equal to the number of zero crossings (or only differing by at most one)

(b) The mean of the upper envelope IMFu(t) and the lower envelope IMFl(t) be zero.

The IMFs calculation process is called sifting process and is composed by the following steps:

1) Identify all the extrema of x(t)

2) Interpolate between successive maxima and minima (say cubic splines), respectively, to obtain upper and lower envelopes (say, by using cubic splines), IMFl(t) y IMFu(t) respectively.

3) Calculate the local mean m(t) between the envelopes

4) Subtraction of the average from the original to yield d(t)=x(t)-m(t)

5) Repite steps (1-3) until d(t) satisfies the two conditions for being an IMF. Once an IMF is generated, the residual r(t)=x(t)-IMF1(t) is regarded as the original signal, and steps (1-4) are repeated to yield the second IMF, and so on

Huang [2] propose two stopping criteria:

a) The S number: S is defined as the consecutive number of siftings, in which the numbers of zero-crossing and extrema are the same for these S siftings.

b) SD is smaller than a pre-set value, where

Another criteria is proposed by Rilling et al [3] [4]. This criteria is based on two thresholds, teta1 and teta2, and aimed at guaranteeing globally small fluctuations in the mean while taking into account locally large excursions. This amounts to introduce the mode amplitude

And the evaluation function

Thus, the sifting is iterated until

for some preescribed fraction (1-alfa) of the total duration, while

One can tipically set alfa=0.05; teta1=0.05 and teta2=10*teta1 (default values in emd.m at [4])

The procedure is complete when either the residual function becomes monotonic, or when the amplitude of the residue falls below a pre-determined small value so that further sifting would not yield any useful components. These features guarantee the computation of a finite number of IMFs within a finite number of iterations. The outcome of the EMD procedure is the following decomposition of the original signal:

Applying the Fourier Transform (The Fourier Transform is the first stage of this paper. If there are bad results, the second stage considers the Hilbert Transform and consequently to apply the Huang-Hilbert Transform to the HRTF analysis)

Then, the hypotheses are:

1) It is possible to recognize the contributions of the different human organs, such as pinnae, torso, head, etc; to the spectral changes of the HRTF using the empirical mode decomposition method

2) It is possible to label each IMF to morphological data.


Method and Preliminary Results

This work uses the CIPIC HRTF Database [5] and the EMD implementation by Rilling et al [3] [4]

The emd options were setted as:

opt.display=2;
opt.maxiterations=2000;
opt.stop=[0.01,0.1,0.02];

The figure 1 shows the left ear Hrir for 0º azimuth and 0º elevation, the figure 2 for the right ear. (Sampling frequency was 44.1 kHz)

Figure 1 - HRIR (0,0) - Left ear - Subject003 CIPIC HRTF Database

Figure 2 - HRIR (0,0) - Right ear - Subject003 CIPIC HRTF Database

The figures 3 and 4 show the respective HRTFs.

Figure 3 - HRTF (0,0) - Left ear - Subject003 CIPIC HRTF Database

Figure 4 - HRTF (0,0) - Right ear - Subject003 CIPIC HRTF Database

The figure 5 shows the first five IMF of the HRIR (0,0) related to the left ear, the figure 6 show the IMF6 to IMF9 and the residual.

Figure 5 - IMF1 to IMF5 of the HRTF (0,0) - Left ear - Subject003 CIPIC HRTF Database

Figure 6 - IMF6 to IMF9 and the residual of the HRTF (0,0) - Left ear - Subject003 CIPIC HRTF Database

The figure 7 and 8 displays the samen information for HRIR (0,0) related to the rigth ear.

Figure 7 - IMF1 to IMF5 of the HRTF (0,0) -Right ear - Subject003 CIPIC HRTF Database

Figure 8 - IMF6 to IMF9 and the residual of the HRTF (0,0) - Right ear - Subject003 CIPIC HRTF Database

The figure 9 shows the absolute difference between the original signal and the signal constructed by mean of the equation

Figure 9 - Absolute differences between the original signal and the reconstructed signal of the HRIR (0,0) Left ear (above) and HRIR (0,0) Right ear (below)- Subject003 CIPIC HRTF Database

The difference are negligible (order 1e-16). With this result, it is possible to assure that the decomposition process is at least suitable.

The figure 10 shows the HRTF calculated from the original signal (above left = left ear; above right = right ear) and form the reconstructed signal by using the IMF and the residual (below left = left ear; below rigth = rigth ear). Again, the difference are negligible (see figure 1 in the time domain)

Figure 10 - HRTF from the original data (above) and the reconstructed data (below)

The figures 11 and 12, show the fourier transform of the IMF1 to IMF3 for the HRIR (0,0) of the left ear and right ear respectively
Figure 11 - Fourier Transform of the IMF1 (blue), IMF2(green) and IMF3 (red) of the decomposed signal (HRIR (0,0) left ear - Subject 003 - CIPIC HRTF Database)

Figure 12 - Fourier Transform of the IMF1 (blue), IMF2(green) and IMF3 (red) of the decomposed signal (HRIR (0,0) right ear - Subject 003 - CIPIC HRTF Database)

The results indicate that the interpretation of each IMF spetrum to the contribution to the respective HRTF is not direct which can be associated to the nonlinear process involved in the magnitude spectrum calculation process (absolute value and logarithm). It is necessary to carry out more test in order to interpretate each IMF spectrums better.

Finally, the figure 13 and 14 show the Hilbert spectrum of the IMFs (Hilbert-Huang Transform)

Figure 13 - Hilbert-Huang spectrum of the HRIR (0,0) Left ear - Subject 003 - CIPIC HRTF Database)
Figure 14 - Hilbert-Huang spectrum of the HRIR (0,0) Right ear - Subject 003 - CIPIC HRTF Database)

Preliminary Conclussions

1) Es posible descomponer las HRIR en sus modos intinsecos IMF por medio de la tecnica EMD con gran precisión.
2) La intepretacion de los espectros de cada IMF no es inmediata
3) Posiblemente seria adecuado evitar el proceso de calculo de magnitud del espectro (eludir la no linealidad de ese proceso en si) y trabajar con graficas de cada espectro (Fourier o Huang-Hilbert) de cada IMF en las cuales el eje x corresponde a la parte real y el eje y a la parte imaginaria. Salen monos rebonitos, como los que vienen a continuacion:

Los 8 primeros espectros de Hilbert-Huang de los IMF del oido derecho (HRIR(0,0) )


BUENO, ESO ES TO ESO ES TO ESO ES TODO AMIGOS, CUALQUIER SUGERENCIA, CRITICA, COMENTARIO, AYUDA SERA BIENVENID@

Suggestions and comments are always welcome, see you later

By The "serious" FLAYTECH

Se me habia olvidado la bibliografía
[1] Huang, Norden E. et al, The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis, Royal Society of London Proceedings Series A, 1998.
[2] Huang, Norden E., Beyond the Fourier Transform :Coping with Nonlinear, Nonstationary Time Series, Johns Hopkins University,1998.
[3] G. Rilling, P. Flandrin and P. Gonçalvès, On empirical mode decomposition and its algorithms, IEEE-EURASIP Workshop on Nonlinear Signal and Image Processing NSIP-03, Grado (I), 2003
[4] Flandrin, P, Empirical Mode Decomposition, Matlab codes
[5] V. R. Algazi, R. O. Duda, D. M. Thompson and C. Avendano, "The CIPIC HRTF Database," Proc. 2001 IEEE Workshop on Applications of Signal Processing to Audio and Electroacoustics, pp. 99-102, Mohonk Mountain House, New Paltz, NY, Oct. 21-24, 2001.