Due to the advanced developments of the Internet and information technologies,a massive quantity of electronic data in the biomedical sector has been exponentially increased.To handle the huge amount of biomedical dat...Due to the advanced developments of the Internet and information technologies,a massive quantity of electronic data in the biomedical sector has been exponentially increased.To handle the huge amount of biomedical data,automated multi-document biomedical text summarization becomes an effective and robust approach of accessing the increased amount of technical and medical literature in the biomedical sector through the summarization of multiple source documents by retaining the significantly informative data.So,multi-document biomedical text summarization acts as a vital role to alleviate the issue of accessing precise and updated information.This paper presents a Deep Learning based Attention Long Short Term Memory(DLALSTM)Model for Multi-document Biomedical Text Summarization.The proposed DL-ALSTM model initially performs data preprocessing to convert the available medical data into a compatible format for further processing.Then,the DL-ALSTM model gets executed to summarize the contents from the multiple biomedical documents.In order to tune the summarization performance of the DL-ALSTM model,chaotic glowworm swarm optimization(CGSO)algorithm is employed.Extensive experimentation analysis is performed to ensure the betterment of the DL-ALSTM model and the results are investigated using the PubMed dataset.Comprehensive comparative result analysis is carried out to showcase the efficiency of the proposed DL-ALSTM model with the recently presented models.展开更多
The rapid advancement of large language models(LLMs)has driven the pervasive adoption of AI-generated content(AIGC),while also raising concerns about misinformation,academic misconduct,biased or harmful content,and ot...The rapid advancement of large language models(LLMs)has driven the pervasive adoption of AI-generated content(AIGC),while also raising concerns about misinformation,academic misconduct,biased or harmful content,and other risks.Detecting AI-generated text has thus become essential to safeguard the authenticity and reliability of digital information.This survey reviews recent progress in detection methods,categorizing approaches into passive and active categories based on their reliance on intrinsic textual features or embedded signals.Passive detection is further divided into surface linguistic feature-based and language model-based methods,whereas active detection encompasses watermarking-based and semantic retrieval-based approaches.This taxonomy enables systematic comparison of methodological differences in model dependency,applicability,and robustness.A key challenge for AI-generated text detection is that existing detectors are highly vulnerable to adversarial attacks,particularly paraphrasing,which substantially compromises their effectiveness.Addressing this gap highlights the need for future research on enhancing robustness and cross-domain generalization.By synthesizing current advances and limitations,this survey provides a structured reference for the field and outlines pathways toward more reliable and scalable detection solutions.展开更多
Spam emails remain one of the most persistent threats to digital communication,necessitating effective detection solutions that safeguard both individuals and organisations.We propose a spam email classification frame...Spam emails remain one of the most persistent threats to digital communication,necessitating effective detection solutions that safeguard both individuals and organisations.We propose a spam email classification frame-work that uses Bidirectional Encoder Representations from Transformers(BERT)for contextual feature extraction and a multiple-window Convolutional Neural Network(CNN)for classification.To identify semantic nuances in email content,BERT embeddings are used,and CNN filters extract discriminative n-gram patterns at various levels of detail,enabling accurate spam identification.The proposed model outperformed Word2Vec-based baselines on a sample of 5728 labelled emails,achieving an accuracy of 98.69%,AUC of 0.9981,F1 Score of 0.9724,and MCC of 0.9639.With a medium kernel size of(6,9)and compact multi-window CNN architectures,it improves performance.Cross-validation illustrates stability and generalization across folds.By balancing high recall with minimal false positives,our method provides a reliable and scalable solution for current spam detection in advanced deep learning.By combining contextual embedding and a neural architecture,this study develops a security analysis method.展开更多
With the rapid development of digital culture,a large number of cultural texts are presented in the form of digital and network.These texts have significant characteristics such as sparsity,real-time and non-standard ...With the rapid development of digital culture,a large number of cultural texts are presented in the form of digital and network.These texts have significant characteristics such as sparsity,real-time and non-standard expression,which bring serious challenges to traditional classification methods.In order to cope with the above problems,this paper proposes a new ASSC(ALBERT,SVD,Self-Attention and Cross-Entropy)-TextRCNN digital cultural text classification model.Based on the framework of TextRCNN,the Albert pre-training language model is introduced to improve the depth and accuracy of semantic embedding.Combined with the dual attention mechanism,the model’s ability to capture and model potential key information in short texts is strengthened.The Singular Value Decomposition(SVD)was used to replace the traditional Max pooling operation,which effectively reduced the feature loss rate and retained more key semantic information.The cross-entropy loss function was used to optimize the prediction results,making the model more robust in class distribution learning.The experimental results indicate that,in the digital cultural text classification task,as compared to the baseline model,the proposed ASSC-TextRCNN method achieves an 11.85%relative improvement in accuracy and an 11.97%relative increase in the F1 score.Meanwhile,the relative error rate decreases by 53.18%.This achievement not only validates the effectiveness and advanced nature of the proposed approach but also offers a novel technical route and methodological underpinnings for the intelligent analysis and dissemination of digital cultural texts.It holds great significance for promoting the in-depth exploration and value realization of digital culture.展开更多
This paper presents a new approach to determining whether an interested personal name across doeuments refers to the same entity. Firstly,three vectors for each text are formed: the personal name Boolean vectors deno...This paper presents a new approach to determining whether an interested personal name across doeuments refers to the same entity. Firstly,three vectors for each text are formed: the personal name Boolean vectors denoting whether a personal name occurs the text the biographical word Boolean vector representing title, occupation and so forth, and the feature vector with real values. Then, by combining a heuristic strategy based on Boolean vectors with an agglomeratie clustering algorithm based on feature vectors, it seeks to resolve multi-document personal name coreference. Experimental results show that this approach achieves a good performance by testing on "Wang Gang" corpus.展开更多
Automatic text summarization involves reducing a text document or a larger corpus of multiple documents to a short set of sentences or paragraphs that convey the main meaning of the text. In this paper, we discuss abo...Automatic text summarization involves reducing a text document or a larger corpus of multiple documents to a short set of sentences or paragraphs that convey the main meaning of the text. In this paper, we discuss about multi-document summarization that differs from the single one in which the issues of compression, speed, redundancy and passage selection are critical in the formation of useful summaries. Since the number and variety of online medical news make them difficult for experts in the medical field to read all of the medical news, an automatic multi-document summarization can be useful for easy study of information on the web. Hence we propose a new approach based on machine learning meta-learner algorithm called AdaBoost that is used for summarization. We treat a document as a set of sentences, and the learning algorithm must learn to classify as positive or negative examples of sentences based on the score of the sentences. For this learning task, we apply AdaBoost meta-learning algorithm where a C4.5 decision tree has been chosen as the base learner. In our experiment, we use 450 pieces of news that are downloaded from different medical websites. Then we compare our results with some existing approaches.展开更多
This paper reports part of a study to develop a method for automatic multi-document summarization. The current focus is on dissertation abstracts in the field of sociology. The summarization method uses macro-level an...This paper reports part of a study to develop a method for automatic multi-document summarization. The current focus is on dissertation abstracts in the field of sociology. The summarization method uses macro-level and micro-level discourse structure to identify important information that can be extracted from dissertation abstracts, and then uses a variable-based framework to integrate and organize extracted information across dissertation abstracts. This framework focuses more on research concepts and their research relationships found in sociology dissertation abstracts and has a hierarchical structure. A taxonomy is constructed to support the summarization process in two ways: (1) helping to identify important concepts and relations expressed in the text, and (2) providing a structure for linking similar concepts in different abstracts. This paper describes the variable-based framework and the summarization process, and then reports the construction of the taxonomy for supporting the summarization process. An example is provided to show how to use the constructed taxonomy to identify important concepts and integrate the concepts extracted from different abstracts.展开更多
We present a novel unsupervised integrated score framework to generate generic extractive multi- document summaries by ranking sentences based on dynamic programming (DP) strategy. Considering that cluster-based met...We present a novel unsupervised integrated score framework to generate generic extractive multi- document summaries by ranking sentences based on dynamic programming (DP) strategy. Considering that cluster-based methods proposed by other researchers tend to ignore informativeness of words when they generate summaries, our proposed framework takes relevance, diversity, informativeness and length constraint of sentences into consideration comprehensively. We apply Density Peaks Clustering (DPC) to get relevance scores and diversity scores of sentences simultaneously. Our framework produces the best performance on DUC2004, 0.396 of ROUGE-1 score, 0.094 of ROUGE-2 score and 0.143 of ROUGE-SU4 which outperforms a series of popular baselines, such as DUC Best, FGB [7], and BSTM [10].展开更多
Text summarization creates subset that represents the most important or relevant information in the original content,which effectively reduce information redundancy.Recently neural network method has achieved good res...Text summarization creates subset that represents the most important or relevant information in the original content,which effectively reduce information redundancy.Recently neural network method has achieved good results in the task of text summarization both in Chinese and English,but the research of text summarization in low-resource languages is still in the exploratory stage,especially in Tibetan.What’s more,there is no large-scale annotated corpus for text summarization.The lack of dataset severely limits the development of low-resource text summarization.In this case,unsupervised learning approaches are more appealing in low-resource languages as they do not require labeled data.In this paper,we propose an unsupervised graph-based Tibetan multi-document summarization method,which divides a large number of Tibetan news documents into topics and extracts the summarization of each topic.Summarization obtained by using traditional graph-based methods have high redundancy and the division of documents topics are not detailed enough.In terms of topic division,we adopt two level clustering methods converting original document into document-level and sentence-level graph,next we take both linguistic and deep representation into account and integrate external corpus into graph to obtain the sentence semantic clustering.Improve the shortcomings of the traditional K-Means clustering method and perform more detailed clustering of documents.Then model sentence clusters into graphs,finally remeasure sentence nodes based on the topic semantic information and the impact of topic features on sentences,higher topic relevance summary is extracted.In order to promote the development of Tibetan text summarization,and to meet the needs of relevant researchers for high-quality Tibetan text summarization datasets,this paper manually constructs a Tibetan summarization dataset and carries out relevant experiments.The experiment results show that our method can effectively improve the quality of summarization and our method is competitive to previous unsupervised methods.展开更多
As a fundamental and effective tool for document understanding and organization, multi-document summarization enables better information services by creating concise and informative reports for large collections of do...As a fundamental and effective tool for document understanding and organization, multi-document summarization enables better information services by creating concise and informative reports for large collections of documents. In this paper, we propose a sentence-word two layer graph algorithm combining with keyword density to generate the multi-document summarization, known as Graph & Keywordp. The traditional graph methods of multi-document summarization only consider the influence of sentence and word in all documents rather than individual documents. Therefore, we construct multiple word graph and extract right keywords in each document to modify the sentence graph and to improve the significance and richness of the summary. Meanwhile, because of the differences in the words importance in documents, we propose to use keyword density for the summaries to provide rich content while using a small number of words. The experiment results show that the Graph & Keywordp method outperforms the state of the art systems when tested on the Duc2004 data set. Key words: multi-document, graph algorithm, keyword density, Graph & Keywordp, Due2004展开更多
Compared with the traditional method of adding sentences to get summary in multi-document summarization,a two-stage sentence selection approach based on deleting sentences in acandidate sentence set to generate summar...Compared with the traditional method of adding sentences to get summary in multi-document summarization,a two-stage sentence selection approach based on deleting sentences in acandidate sentence set to generate summary is proposed,which has two stages,the acquisition of acandidate sentence set and the optimum selection of sentence.At the first stage,the candidate sentenceset is obtained by redundancy-based sentence selection approach.At the second stage,optimum se-lection of sentences is proposed to delete sentences in the candidate sentence set according to itscontribution to the whole set until getting the appointed summary length.With a test corpus,theROUGE value of summaries gotten by the proposed approach proves its validity,compared with thetraditional method of sentence selection.The influence of the token chosen in the two-stage sentenceselection approach on the quality of the generated summaries is analyzed.展开更多
A multi-document summarization method based on Latent Semantic Indexing (LSI) is proposed. The method combines several reports on the same issue into a matrix of terms and sentences, and uses a Singular Value Decompos...A multi-document summarization method based on Latent Semantic Indexing (LSI) is proposed. The method combines several reports on the same issue into a matrix of terms and sentences, and uses a Singular Value Decomposition (SVD) to reduce the dimension of the matrix and extract features, and then the sentence similarity is computed. The sentences are clustered according to similarity of sentences. The centroid sentences are selected from each class. Finally, the selected sentences are ordered to generate the summarization. The evaluation and results are presented, which prove that the proposed methods are efficient.展开更多
This paper proposes an extractive generic text summarization model that generates summaries by selecting sentences according to their scores. Sentence scores are calculated using their extensive coverage of the main c...This paper proposes an extractive generic text summarization model that generates summaries by selecting sentences according to their scores. Sentence scores are calculated using their extensive coverage of the main content of the text, and summaries are created by extracting the highest scored sentences from the original document. The model formalized as a multiobjective integer programming problem. An advantage of this model is that it can cover the main content of source (s) and provide less redundancy in the generated sum- maries. To extract sentences which form a summary with an extensive coverage of the main content of the text and less redundancy, have been used the similarity of sentences to the original document and the similarity between sentences. Performance evaluation is conducted by comparing summarization outputs with manual summaries of DUC2004 dataset. Experiments showed that the proposed approach outperforms the related methods.展开更多
基金This work is funded byDeanship of Scientific Research atKingKhalid University under Grant Number(RGP 1/279/42).www.kku.edu.sa.
文摘Due to the advanced developments of the Internet and information technologies,a massive quantity of electronic data in the biomedical sector has been exponentially increased.To handle the huge amount of biomedical data,automated multi-document biomedical text summarization becomes an effective and robust approach of accessing the increased amount of technical and medical literature in the biomedical sector through the summarization of multiple source documents by retaining the significantly informative data.So,multi-document biomedical text summarization acts as a vital role to alleviate the issue of accessing precise and updated information.This paper presents a Deep Learning based Attention Long Short Term Memory(DLALSTM)Model for Multi-document Biomedical Text Summarization.The proposed DL-ALSTM model initially performs data preprocessing to convert the available medical data into a compatible format for further processing.Then,the DL-ALSTM model gets executed to summarize the contents from the multiple biomedical documents.In order to tune the summarization performance of the DL-ALSTM model,chaotic glowworm swarm optimization(CGSO)algorithm is employed.Extensive experimentation analysis is performed to ensure the betterment of the DL-ALSTM model and the results are investigated using the PubMed dataset.Comprehensive comparative result analysis is carried out to showcase the efficiency of the proposed DL-ALSTM model with the recently presented models.
基金supported in part by the Science and Technology Innovation Program of Hunan Province under Grant 2025RC3166the National Natural Science Foundation of China under Grant 62572176the National Key R&D Program of China under Grant 2024YFF0618800.
文摘The rapid advancement of large language models(LLMs)has driven the pervasive adoption of AI-generated content(AIGC),while also raising concerns about misinformation,academic misconduct,biased or harmful content,and other risks.Detecting AI-generated text has thus become essential to safeguard the authenticity and reliability of digital information.This survey reviews recent progress in detection methods,categorizing approaches into passive and active categories based on their reliance on intrinsic textual features or embedded signals.Passive detection is further divided into surface linguistic feature-based and language model-based methods,whereas active detection encompasses watermarking-based and semantic retrieval-based approaches.This taxonomy enables systematic comparison of methodological differences in model dependency,applicability,and robustness.A key challenge for AI-generated text detection is that existing detectors are highly vulnerable to adversarial attacks,particularly paraphrasing,which substantially compromises their effectiveness.Addressing this gap highlights the need for future research on enhancing robustness and cross-domain generalization.By synthesizing current advances and limitations,this survey provides a structured reference for the field and outlines pathways toward more reliable and scalable detection solutions.
基金funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number(PNURSP2026R234)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.
文摘Spam emails remain one of the most persistent threats to digital communication,necessitating effective detection solutions that safeguard both individuals and organisations.We propose a spam email classification frame-work that uses Bidirectional Encoder Representations from Transformers(BERT)for contextual feature extraction and a multiple-window Convolutional Neural Network(CNN)for classification.To identify semantic nuances in email content,BERT embeddings are used,and CNN filters extract discriminative n-gram patterns at various levels of detail,enabling accurate spam identification.The proposed model outperformed Word2Vec-based baselines on a sample of 5728 labelled emails,achieving an accuracy of 98.69%,AUC of 0.9981,F1 Score of 0.9724,and MCC of 0.9639.With a medium kernel size of(6,9)and compact multi-window CNN architectures,it improves performance.Cross-validation illustrates stability and generalization across folds.By balancing high recall with minimal false positives,our method provides a reliable and scalable solution for current spam detection in advanced deep learning.By combining contextual embedding and a neural architecture,this study develops a security analysis method.
基金funded by China National Innovation and Entrepreneurship Project Fund Innovation Training Program(202410451009).
文摘With the rapid development of digital culture,a large number of cultural texts are presented in the form of digital and network.These texts have significant characteristics such as sparsity,real-time and non-standard expression,which bring serious challenges to traditional classification methods.In order to cope with the above problems,this paper proposes a new ASSC(ALBERT,SVD,Self-Attention and Cross-Entropy)-TextRCNN digital cultural text classification model.Based on the framework of TextRCNN,the Albert pre-training language model is introduced to improve the depth and accuracy of semantic embedding.Combined with the dual attention mechanism,the model’s ability to capture and model potential key information in short texts is strengthened.The Singular Value Decomposition(SVD)was used to replace the traditional Max pooling operation,which effectively reduced the feature loss rate and retained more key semantic information.The cross-entropy loss function was used to optimize the prediction results,making the model more robust in class distribution learning.The experimental results indicate that,in the digital cultural text classification task,as compared to the baseline model,the proposed ASSC-TextRCNN method achieves an 11.85%relative improvement in accuracy and an 11.97%relative increase in the F1 score.Meanwhile,the relative error rate decreases by 53.18%.This achievement not only validates the effectiveness and advanced nature of the proposed approach but also offers a novel technical route and methodological underpinnings for the intelligent analysis and dissemination of digital cultural texts.It holds great significance for promoting the in-depth exploration and value realization of digital culture.
文摘This paper presents a new approach to determining whether an interested personal name across doeuments refers to the same entity. Firstly,three vectors for each text are formed: the personal name Boolean vectors denoting whether a personal name occurs the text the biographical word Boolean vector representing title, occupation and so forth, and the feature vector with real values. Then, by combining a heuristic strategy based on Boolean vectors with an agglomeratie clustering algorithm based on feature vectors, it seeks to resolve multi-document personal name coreference. Experimental results show that this approach achieves a good performance by testing on "Wang Gang" corpus.
文摘Automatic text summarization involves reducing a text document or a larger corpus of multiple documents to a short set of sentences or paragraphs that convey the main meaning of the text. In this paper, we discuss about multi-document summarization that differs from the single one in which the issues of compression, speed, redundancy and passage selection are critical in the formation of useful summaries. Since the number and variety of online medical news make them difficult for experts in the medical field to read all of the medical news, an automatic multi-document summarization can be useful for easy study of information on the web. Hence we propose a new approach based on machine learning meta-learner algorithm called AdaBoost that is used for summarization. We treat a document as a set of sentences, and the learning algorithm must learn to classify as positive or negative examples of sentences based on the score of the sentences. For this learning task, we apply AdaBoost meta-learning algorithm where a C4.5 decision tree has been chosen as the base learner. In our experiment, we use 450 pieces of news that are downloaded from different medical websites. Then we compare our results with some existing approaches.
文摘This paper reports part of a study to develop a method for automatic multi-document summarization. The current focus is on dissertation abstracts in the field of sociology. The summarization method uses macro-level and micro-level discourse structure to identify important information that can be extracted from dissertation abstracts, and then uses a variable-based framework to integrate and organize extracted information across dissertation abstracts. This framework focuses more on research concepts and their research relationships found in sociology dissertation abstracts and has a hierarchical structure. A taxonomy is constructed to support the summarization process in two ways: (1) helping to identify important concepts and relations expressed in the text, and (2) providing a structure for linking similar concepts in different abstracts. This paper describes the variable-based framework and the summarization process, and then reports the construction of the taxonomy for supporting the summarization process. An example is provided to show how to use the constructed taxonomy to identify important concepts and integrate the concepts extracted from different abstracts.
文摘We present a novel unsupervised integrated score framework to generate generic extractive multi- document summaries by ranking sentences based on dynamic programming (DP) strategy. Considering that cluster-based methods proposed by other researchers tend to ignore informativeness of words when they generate summaries, our proposed framework takes relevance, diversity, informativeness and length constraint of sentences into consideration comprehensively. We apply Density Peaks Clustering (DPC) to get relevance scores and diversity scores of sentences simultaneously. Our framework produces the best performance on DUC2004, 0.396 of ROUGE-1 score, 0.094 of ROUGE-2 score and 0.143 of ROUGE-SU4 which outperforms a series of popular baselines, such as DUC Best, FGB [7], and BSTM [10].
基金This work was supported in part by the National Science Foundation Project of P.R.China 484 under Grant No.52071349partially supported by Young and Middle-aged Talents Project of the State Ethnic Affairs 487 Commission.
文摘Text summarization creates subset that represents the most important or relevant information in the original content,which effectively reduce information redundancy.Recently neural network method has achieved good results in the task of text summarization both in Chinese and English,but the research of text summarization in low-resource languages is still in the exploratory stage,especially in Tibetan.What’s more,there is no large-scale annotated corpus for text summarization.The lack of dataset severely limits the development of low-resource text summarization.In this case,unsupervised learning approaches are more appealing in low-resource languages as they do not require labeled data.In this paper,we propose an unsupervised graph-based Tibetan multi-document summarization method,which divides a large number of Tibetan news documents into topics and extracts the summarization of each topic.Summarization obtained by using traditional graph-based methods have high redundancy and the division of documents topics are not detailed enough.In terms of topic division,we adopt two level clustering methods converting original document into document-level and sentence-level graph,next we take both linguistic and deep representation into account and integrate external corpus into graph to obtain the sentence semantic clustering.Improve the shortcomings of the traditional K-Means clustering method and perform more detailed clustering of documents.Then model sentence clusters into graphs,finally remeasure sentence nodes based on the topic semantic information and the impact of topic features on sentences,higher topic relevance summary is extracted.In order to promote the development of Tibetan text summarization,and to meet the needs of relevant researchers for high-quality Tibetan text summarization datasets,this paper manually constructs a Tibetan summarization dataset and carries out relevant experiments.The experiment results show that our method can effectively improve the quality of summarization and our method is competitive to previous unsupervised methods.
文摘As a fundamental and effective tool for document understanding and organization, multi-document summarization enables better information services by creating concise and informative reports for large collections of documents. In this paper, we propose a sentence-word two layer graph algorithm combining with keyword density to generate the multi-document summarization, known as Graph & Keywordp. The traditional graph methods of multi-document summarization only consider the influence of sentence and word in all documents rather than individual documents. Therefore, we construct multiple word graph and extract right keywords in each document to modify the sentence graph and to improve the significance and richness of the summary. Meanwhile, because of the differences in the words importance in documents, we propose to use keyword density for the summaries to provide rich content while using a small number of words. The experiment results show that the Graph & Keywordp method outperforms the state of the art systems when tested on the Duc2004 data set. Key words: multi-document, graph algorithm, keyword density, Graph & Keywordp, Due2004
基金the National Natural Science Foundation of China(No.60575041)the High Technology Researchand Development Program of China(No.2006AA01Z150).
文摘Compared with the traditional method of adding sentences to get summary in multi-document summarization,a two-stage sentence selection approach based on deleting sentences in acandidate sentence set to generate summary is proposed,which has two stages,the acquisition of acandidate sentence set and the optimum selection of sentence.At the first stage,the candidate sentenceset is obtained by redundancy-based sentence selection approach.At the second stage,optimum se-lection of sentences is proposed to delete sentences in the candidate sentence set according to itscontribution to the whole set until getting the appointed summary length.With a test corpus,theROUGE value of summaries gotten by the proposed approach proves its validity,compared with thetraditional method of sentence selection.The influence of the token chosen in the two-stage sentenceselection approach on the quality of the generated summaries is analyzed.
文摘A multi-document summarization method based on Latent Semantic Indexing (LSI) is proposed. The method combines several reports on the same issue into a matrix of terms and sentences, and uses a Singular Value Decomposition (SVD) to reduce the dimension of the matrix and extract features, and then the sentence similarity is computed. The sentences are clustered according to similarity of sentences. The centroid sentences are selected from each class. Finally, the selected sentences are ordered to generate the summarization. The evaluation and results are presented, which prove that the proposed methods are efficient.
文摘This paper proposes an extractive generic text summarization model that generates summaries by selecting sentences according to their scores. Sentence scores are calculated using their extensive coverage of the main content of the text, and summaries are created by extracting the highest scored sentences from the original document. The model formalized as a multiobjective integer programming problem. An advantage of this model is that it can cover the main content of source (s) and provide less redundancy in the generated sum- maries. To extract sentences which form a summary with an extensive coverage of the main content of the text and less redundancy, have been used the similarity of sentences to the original document and the similarity between sentences. Performance evaluation is conducted by comparing summarization outputs with manual summaries of DUC2004 dataset. Experiments showed that the proposed approach outperforms the related methods.