Deep Learning: Facial Dynamic Analysis and Language Fluency Prediction
Project Abstract
This thesis explores the application of machine learning techniques to evaluating the fluency of second-language speakers, focusing specifically on Welsh. The research uses the High Deformation Facial Dynamics (HDFD) dataset, a unique 4D facial dynamics dataset of spoken Welsh phrases, which makes it possible to study the facial dynamics of speakers with varying levels of Welsh proficiency. The primary goal is to find a novel approach to training a deep learning model on the HDFD dataset that achieves considerably higher accuracy than baseline models on the Welsh fluency prediction task. The broader goal is to contribute to the field of machine learning by demonstrating effective strategies for working with small datasets, showing that fewer samples can still yield robust and accurate models when combined with innovative training techniques. Because the dataset is small, the research will explore methods such as data augmentation, transfer learning, and one-shot learning to maximize its utility. Model performance will be evaluated through comparison with baseline models, direct measurement of F-score and accuracy, and peer review through academic channels. By achieving these goals, the research aims not only to improve model performance but also to contribute to the preservation of the Welsh language, an important part of Welsh heritage. The findings could potentially support the Welsh Government’s campaign to increase the number of Welsh speakers to one million by 2050.
Keywords: Computer Vision, Deep Learning, Natural Language Processing
Conference Details
Session: Presentation Stream 4, Presentation Slot 1
Location: GH001 on Tuesday 7th, 13:30 – 17:00
Markers: Neil Carter, Lu Zhang
Course: MSc Data Science, Masters PG
Future Plans: I’m looking for an industry placement