Original scientific article

A PMI-DRIVEN APPROACH WITH CONVENTIONAL BERT FOR OPTIMIZING TEXT SUMMARIZATION

By
R. Ramesh
Assistant Professor, Department of Computer Applications, Thanthai Periyar Government Arts & Sciences College, Trichy, Tamil Nadu, India

N. Subalakshmi
Assistant Professor, Department of Computer and Information Science, Annamalai University, Chidambaram, Tamil Nadu, India

S. Selvarani
Assistant Professor, Department of Computer Science, Alagappa Government Arts College, Karaikudi, Tamil Nadu, India

K. Kavitha
Assistant Professor (Selection Grade), Department of Electrical & Electronics Engineering, Annamalai University, Annamalai Nagar, Chidambaram, Tamil Nadu, India

M. Jeyakarthic
Assistant Professor, Department of Computer and Information Science, Annamalai University, Annamalai Nagar, Chidambaram, Tamil Nadu, India

Abstract

Text summarization plays a crucial role in natural language processing by condensing large volumes of textual information into concise and meaningful summaries. With the rapid growth of digital content, existing summarization approaches often struggle to balance contextual understanding and semantic relevance. This paper presents a PMI-driven, BERT-based text summarization framework that integrates Pointwise Mutual Information (PMI) as a statistical pre-processing mechanism with a fine-tuned Conventional BERT model to enhance summary quality. PMI is employed to identify and rank semantically significant terms based on co-occurrence patterns, enabling effective keyword and phrase prioritization before summarization. The ranked textual representation is then processed by a summarization-specific decoder layer added on top of the BERT encoder to generate coherent and context-aware summaries. The proposed framework is evaluated on the CNN/Daily Mail dataset, comprising over 300,000 news articles, using the ROUGE-1, ROUGE-2, and ROUGE-L metrics for performance assessment. Experimental results demonstrate that the proposed method achieves ROUGE-1, ROUGE-2, and ROUGE-L scores of 46.9, 27.61, and 45.68, respectively, outperforming baseline models such as Seq2Seq, Seq2Sick, and Prefix-Tuning by an average margin of 2–3%. The experiments were conducted using Python with the PyTorch deep learning framework in a CPU-based environment. The results indicate that PMI-based pre-processing significantly improves contextual relevance and semantic consistency in the generated summaries. The framework demonstrates robustness and scalability, making it suitable for large-scale document summarization tasks.
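
A minimal sketch of the PMI-based term-ranking idea described in the abstract, assuming a simple sliding-window co-occurrence count; the tokenization, window size, score aggregation, and top-k cutoff here are illustrative choices, not the authors' exact configuration:

```python
# Illustrative PMI-based term ranking (a sketch, not the paper's exact pipeline).
import math
import re
from collections import Counter

def pmi_rank_terms(text, window=10, top_k=10):
    """Rank terms by aggregated pointwise mutual information with co-occurring terms."""
    tokens = re.findall(r"[a-z']+", text.lower())
    n = len(tokens)
    unigram = Counter(tokens)
    pair = Counter()
    # Count co-occurrences of distinct terms inside a sliding window.
    for i in range(n):
        for j in range(i + 1, min(i + window, n)):
            if tokens[i] != tokens[j]:
                pair[tuple(sorted((tokens[i], tokens[j])))] += 1
    total_pairs = sum(pair.values()) or 1
    scores = Counter()
    for (w1, w2), count in pair.items():
        p_xy = count / total_pairs
        p_x = unigram[w1] / n
        p_y = unigram[w2] / n
        # PMI = log2( p(x, y) / (p(x) * p(y)) )
        pmi = math.log2(p_xy / (p_x * p_y))
        # Credit both terms of the pair so strongly associated terms
        # accumulate higher scores.
        scores[w1] += pmi
        scores[w2] += pmi
    return scores.most_common(top_k)

if __name__ == "__main__":
    sample = ("The central bank raised interest rates again as inflation stayed high; "
              "analysts expect interest rates to rise further this year.")
    print(pmi_rank_terms(sample, window=8, top_k=5))
```

In the full framework, terms ranked in this way would guide which keywords and phrases are emphasized in the input passed to the fine-tuned BERT encoder and its summarization decoder; that neural stage is outside the scope of this sketch.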


License

This is an open access article distributed under the Creative Commons Attribution Non-Commercial (CC BY-NC) License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

