Multi-Explainable TemporalNet: An Interpretable Multimodal Approach using Temporal Convolutional Network for User-level Depression Detection

Anas Zafar, Danyal Aftab, Rizwan Qureshi, Yaofeng Wang, Hong Yan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Multimodal depression detection from internet-based data, such as social media posts, is an important research problem that aims to predict human mental states in support of societal wellbeing. Recently, attention-based networks have gained significant popularity for depression detection. However, existing multimodal methods rely primarily on images and text and ignore temporal information, such as the relative timing of different posts or tweets, which is crucial for deriving depression-related behavior patterns. Moreover, they lack interpretability, which limits understanding of how different features contribute to the model's final prediction. In this paper, we propose Multi-Explainable TemporalNet (METN), a Temporal Convolutional Network (TCN) based multimodal transformer network with relative timestamp embeddings. We leverage pretrained foundation models for text and image embeddings, and attention maps for model interpretability. We perform extensive experiments and ablation studies to validate the performance of METN on the user-level depression detection task. Our model achieves state-of-the-art results on several benchmarks, including an F1 score of 0.945 on a multimodal Twitter dataset and 0.913 on a multimodal Reddit dataset. We further demonstrate that our model improves the accuracy of identifying depression in individuals who post publicly on social media, while also improving interpretability. Code and models are available on GitHub.
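
The abstract names two ingredients, relative timestamps for a user's posts and a temporal convolution over the post sequence. The following is a minimal illustrative sketch of those two ideas only, not the authors' implementation: the function names `relative_timestamps` and `causal_dilated_conv1d` are hypothetical, the kernel weights are fixed rather than learned, and the timestamps are used as raw hours rather than being mapped to learned embeddings.

```python
from datetime import datetime


def relative_timestamps(post_times):
    """Seconds elapsed since the user's earliest post, in chronological order.

    Hypothetical helper: the paper's actual embedding scheme is not shown here.
    """
    ordered = sorted(post_times)
    if not ordered:
        return []
    first = ordered[0]
    return [(t - first).total_seconds() for t in ordered]


def causal_dilated_conv1d(sequence, kernel, dilation=1):
    """1-D causal convolution with dilation, the basic building block of a TCN.

    output[t] depends only on sequence[t], sequence[t - dilation], ...;
    positions before the start of the sequence are treated as zeros.
    """
    out = []
    for t in range(len(sequence)):
        total = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # implicit left zero-padding
                total += weight * sequence[idx]
        out.append(total)
    return out


# Three made-up posts from one user, given out of order.
posts = [
    datetime(2024, 1, 3, 9, 0),
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 21, 0),
]
hours = [s / 3600 for s in relative_timestamps(posts)]
smoothed = causal_dilated_conv1d(hours, kernel=[0.5, 0.5], dilation=1)
print(hours)
print(smoothed)
```

In a full model the kernel weights would be learned, several such layers would be stacked with increasing dilation, and the input at each step would be a feature vector (for example, a text or image embedding combined with a timestamp embedding) rather than a single number.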

Original language: English
Title of host publication: Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024
Publisher: IEEE Computer Society
Pages: 2258-2265
Number of pages: 8
ISBN (Electronic): 9798350365474
DOIs
Publication status: Published - 2024
Event: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024 - Seattle, United States
Duration: 16 Jun 2024 to 22 Jun 2024

Publication series

Name: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
ISSN (Print): 2160-7508
ISSN (Electronic): 2160-7516

Conference

Conference: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024
Country/Territory: United States
City: Seattle
Period: 16/06/24 to 22/06/24
