Music Source Separation and Interactive Enhancement
Project Abstract
Recent advances in deep learning have enabled a range of innovative music source separation models. However, the source estimates these models produce still suffer from limitations with regard to noise, reverb, and perceived quality. In this project we address this with a novel approach to source separation enhancement, building on existing work through a synthesis of state-of-the-art deep neural network models, a dataset better suited to enhancement, and an interactive process for improving low-quality recordings of music. We use a fine-tuned Hybrid Transformer Demucs as our source separation model and train a Mel2Mel DiffWave model on the MUSDB18-HQ dataset for music enhancement. A Google Colaboratory notebook provides the interactive enhancement workflow. Our evaluation using subjective measurements shows that this approach outperforms the baseline, and in listening tests the processed audio is preferred over that of other methods. Overall, the project succeeded in developing a solution that improves the perceived quality of a state-of-the-art music source separation model.
Keywords: Artificial Intelligence, Deep Neural Networks, Music Enhancement
Conference Details
Session: Poster Session B at Poster Stand 25
Location: Sir Stanley Clarke Auditorium, Wednesday 8th, 09:00 – 12:30
Markers: Giedre Sabaliauskaite (Siraj), Manlio Valenti
Course: BSc Computer Science FI, 3rd Year
Future Plans: I’m looking for work