Explainable Artificial Intelligence in Decoding Human Emotions Through Vision Transformers




Abstract:
Understanding how artificial intelligence interprets human emotions from facial expressions remains a challenge in both artificial intelligence and psychological research. Emotion recognition has therefore become a crucial task in many computer vision applications and in the development of emotionally intelligent artificial intelligence systems. Although the field has advanced significantly, with many deep learning models achieving high accuracy, a gap remains in developing models that are both highly accurate and explainable, particularly models whose behavior aligns with the psychological processes humans use to recognize emotions. To address this gap, our research explores the capabilities of vision transformers, focusing on whether these models mimic human attention to key facial features, such as the eyes and mouth, during emotion recognition. The experiments are conducted on the well-known Karolinska Directed Emotional Faces (KDEF) dataset. The attention maps of the proposed model are analyzed to determine whether artificial intelligence can replicate human-like processing when interpreting emotions. The findings reveal that the model's attention aligns with psychologically significant facial regions, suggesting a degree of human-like processing. In addition, the model achieves a test accuracy of 95%. This research contributes to both artificial intelligence and psychology by demonstrating the potential of vision transformers to interpret human emotions accurately from facial expressions.
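
One simple way to quantify whether an attention map concentrates on regions such as the eyes and mouth is to compare the share of attention inside those regions with the share of image area they cover. The sketch below illustrates that idea only; it is not the authors' analysis method. The function names, the 4x4 patch grid, the toy attention values, and the region boxes are all hypothetical and chosen for illustration.

```python
from typing import List, Tuple

# A box is (row_start, row_end, col_start, col_end) in patch units, end-exclusive.
Box = Tuple[int, int, int, int]


def _in_any_box(r: int, c: int, boxes: List[Box]) -> bool:
    """True if patch (r, c) lies inside at least one box."""
    return any(r0 <= r < r1 and c0 <= c < c1 for r0, r1, c0, c1 in boxes)


def region_attention_share(attn: List[List[float]], boxes: List[Box]) -> float:
    """Fraction of the total attention mass that falls inside the boxes."""
    total = sum(sum(row) for row in attn)
    if total <= 0:
        raise ValueError("attention map must have positive total mass")
    inside = sum(
        w
        for r, row in enumerate(attn)
        for c, w in enumerate(row)
        if _in_any_box(r, c, boxes)
    )
    return inside / total


def region_area_share(n_rows: int, n_cols: int, boxes: List[Box]) -> float:
    """Fraction of patches covered by the boxes (baseline for uniform attention)."""
    covered = sum(
        1 for r in range(n_rows) for c in range(n_cols) if _in_any_box(r, c, boxes)
    )
    return covered / (n_rows * n_cols)


# Hypothetical 4x4 patch-level attention map (for example, attention from the
# classification token to each image patch, averaged over heads).
toy_attention = [
    [0.01, 0.01, 0.01, 0.01],
    [0.15, 0.10, 0.10, 0.15],  # row assumed to cover the eyes
    [0.02, 0.02, 0.02, 0.02],
    [0.02, 0.15, 0.15, 0.06],  # middle patches assumed to cover the mouth
]
eyes: Box = (1, 2, 0, 4)   # assumed eye region
mouth: Box = (3, 4, 1, 3)  # assumed mouth region

attention_share = region_attention_share(toy_attention, [eyes, mouth])
area_share = region_area_share(4, 4, [eyes, mouth])
```

An attention share well above the area share would indicate that the map favors the chosen regions more than uniform attention would. In practice the boxes would have to come from facial landmarks or manual annotation, which this sketch does not cover.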

Citation:

IEEE format

B. Gutić, T. Bezdan, H. Farahani, P. Watson, and M. Marjanović, “Explainable Artificial Intelligence in Decoding Human Emotions Through Vision Transformers,” in Sinteza 2024 - International Scientific Conference on Information Technology, Computer Science, and Data Science, Belgrade, Singidunum University, Serbia, 2024, pp. 168-174. doi:10.15308/Sinteza-2024-168-174

APA format

Gutić, B., Bezdan, T., Farahani, H., Watson, P., & Marjanović, M. (2024). Explainable Artificial Intelligence in Decoding Human Emotions Through Vision Transformers. Paper presented at Sinteza 2024 - International Scientific Conference on Information Technology, Computer Science, and Data Science. doi:10.15308/Sinteza-2024-168-174
