In this paper, we propose and evaluate two parallel implementations of Multi-dimensional Ensemble Empirical Mode Decomposition (MEEMD) for multi-core (CPU) and many-core (GPU) architectures. Relative to a sequential C implementation, our double precision GPU implementation, using the CUDA programming model, achieves up to 48.6x speedup on NVIDIA Tesla C2050. Our multi-core CPU implementation, using the OpenMP programming model,