Original scientific article

APPLICATION OF HYBRID & NOVEL DEEP LEARNING APPROACHES FOR MULTIMODAL SENTIMENT FUSION IN IMAGES & AUDIO ANALYSIS

By
Jayaprakash Vattikundala,

Research Scholar, Department of ECM, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh, India

M. Siva Ganga Prasad

Professor & Coordinator (FED), Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh, India

Abstract

This paper proposes a hybrid multimodal sentiment analysis (MSA) model that improves sentiment-prediction accuracy by combining textual, auditory, and visual information. Traditional sentiment analysis models have often struggled with multimodal data because of redundant, overlapping features and weak fusion methods. To overcome these problems, we propose a supervised contrastive learning-based methodology that improves data representation and exploits multimodal feature fusion. The technique pre-processes Twitter data through tokenization, stemming, and feature extraction, then classifies it with a Particle Swarm Optimization-Deep Learning Based Modified Neural Network (PSO-DLBMNN). The experimental results, evaluated on accuracy, precision, recall, and F1-score, show that the proposed model outperforms conventional deep learning approaches such as Bi-LSTM and Bi-GRU. In particular, the PSO-DLBMNN model achieved an accuracy of 95.48%, a precision of 96.57%, a recall of 94.87%, and an F1-score of 93.45%, a substantial improvement over the baseline models. These results indicate that the model can integrate multimodal data while mitigating redundancy and noise. The proposed method thus offers a fresh perspective on improving sentiment analysis through enhanced multimodal feature fusion. In summary, the model has the potential to be applied to real-time social media analysis and human-computer interaction systems, and it provides insight into how multimodal data can improve sentiment prediction and emotional perception.
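The abstract names Particle Swarm Optimization as the search component of the classifier but gives no implementation details. As an illustration only, the sketch below shows a minimal, standard PSO loop minimizing a toy two-parameter "validation loss"; the quadratic stand-in objective, particle count, inertia, and acceleration coefficients are all assumptions for demonstration, not the authors' configuration:

```python
import random

def pso(loss, dim=2, n_particles=10, iters=50, bounds=(-5.0, 5.0), seed=0):
    """Minimal particle swarm optimizer: minimize `loss` over a box."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_val = [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive and social coefficients
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: pull toward personal and global bests.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position update, clamped to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            v = loss(pos[i])
            if v < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], v
                if v < gbest_val:
                    gbest, gbest_val = pos[i][:], v
    return gbest, gbest_val

# Toy stand-in for a model's validation loss over two hyperparameters;
# the minimum is at (1.0, -2.0).
best, val = pso(lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2)
```

In the paper's setting, `loss` would instead evaluate the DLBMNN on held-out data for a candidate hyperparameter or weight vector; this sketch only shows the swarm dynamics themselves.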


Citation

This is an open access article distributed under the Creative Commons Attribution Non-Commercial (CC BY-NC) License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


The statements, opinions, and data contained in the journal are solely those of the individual authors and contributors and not of the publisher or the editor(s). The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.