748 articles found
An Incremental Algorithm of Text Clustering Based on Semantic Sequences (Cited by: 1)
1
Authors: FENG Zhonghui, SHEN Junyi, BAO Junpeng 《Wuhan University Journal of Natural Sciences》 CAS, 2006, Issue 5, pp. 1340-1344 (5 pages)
This paper proposes an incremental text clustering algorithm based on semantic sequences. Using the similarity relation of semantic sequences and calculating the cover of the set of similar semantic sequences, the algorithm selects the candidate cluster with the minimum entropy overlap value as a result cluster at each step. Comparative experiments show that the precision of the algorithm is higher than that of other algorithms under the same conditions, especially on sets of long documents.
Keywords: text clustering; semantic sequence; entropy
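FH Sequences Selected Based on Clustering Analysis (Cited by: 1)
2
Authors: Huabin Yang, Deyu Wang 《通讯和计算机(中英文版)》 2010, Issue 8, pp. 58-61 (4 pages)
Keywords: clustering analysis algorithm; frequency-hopping sequence; basis; spatial structure characteristics; radio network; spatial mapping; frequency-hopping communication; collision probability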
Modeling Bacterial Species: Using Sequence Similarity with Clustering Techniques
3
Authors: Miguel-Angel Sicilia, Elena García-Barriocanal, Marçal Mora-Cantallops, Salvador Sánchez-Alonso, Lino González 《Computers, Materials & Continua》 SCIE EI, 2021, Issue 8, pp. 1661-1672 (12 pages)
Existing studies have challenged the current definition of named bacterial species, especially in the case of highly recombinogenic bacteria. This has led to considering the use of computational procedures to examine potential bacterial clusters that are not identified by species naming. This paper describes the use of sequence data obtained from MLST databases as input for a k-means algorithm extended to deal with housekeeping gene sequences as a metric of similarity for the clustering process. An implementation of the k-means algorithm has been developed based on an existing source code implementation, and it has been evaluated against MLST data. The results point to potential bacterial clusters that are close to more than one named species and thus may become candidates for alternative classifications accounting for genotypic information. The use of hierarchical clustering with sequence comparison as the similarity metric has the potential to find clusters different from named species by using a more informed cluster formation strategy than a conventional nominal variant of the algorithm.
Keywords: clustering; bacterial species; k-means; sequence alignment
Genetic Diversity among Parents of Hybrid Rice Based on Cluster Analysis of Morphological Traits and Simple Sequence Repeat Markers (Cited by: 3)
4
Authors: WANG Sheng-jun, Lu Zuo-mei, WAN Jian-min 《Rice science》 SCIE, 2006, Issue 3, pp. 155-160 (6 pages)
The genetic diversity of 41 parental lines popularized in commercial hybrid rice production in China was studied by using cluster analysis of morphological traits and simple sequence repeat (SSR) markers. The 41 entries were assigned to two clusters (i.e., an early or medium-maturing cluster and a medium or late-maturing cluster) and further assigned to six sub-clusters based on morphological trait cluster analysis. The early or medium-maturing cluster was composed of 15 maintainer lines, four early-maturing restorer lines and two thermo-sensitive genic male sterile lines, and the medium or late-maturing cluster included 16 restorer lines and 4 medium or late-maturing maintainer lines. Moreover, the SSR cluster analysis classified the 41 entries into two groups (i.e., a maintainer line group and a restorer line group) and seven sub-groups. The maintainer line group consisted of all 19 maintainer lines and two thermo-sensitive genic male sterile lines, while the restorer line group was composed of all 20 restorer lines. The SSR analysis fitted better with the pedigree information. From the viewpoint of hybrid rice breeding, the results suggest that SSR analysis might be a better method to study the diversity of parental lines in indica hybrid rice.
Keywords: parental lines; hybrid rice; morphological trait; simple sequence repeats; clustering analysis; genetic diversity; pedigree
Scaling behavior of nucleotide cluster in DNA sequences
5
Authors: CHENG Jun, TONG Zi-shuang, ZHANG Lin-xi 《Journal of Zhejiang University-Science B(Biomedicine & Biotechnology)》 SCIE CAS CSCD, 2007, Issue 5, pp. 359-364 (6 pages)
In this paper we study the scaling behavior of nucleotide clusters in the 11 chromosomes of the Encephalitozoon cuniculi genome. The statistical distribution of nucleotide clusters for the 11 chromosomes is characterized by the scaling behavior $P(S) \propto e^{-\alpha S}$, where $S$ represents nucleotide cluster size. The cluster-size distribution $P(S_1+S_2)$, where $S_1+S_2$ is the total size of a sequential C-G cluster and A-T cluster, was also studied. $P(S_1+S_2)$ follows exponential decay. Large C-G clusters are not followed by large A-T clusters, nor are large A-T clusters followed by large C-G clusters. We also discuss the relatively random walk length function $L(n)$ and the local compositional complexity of nucleotide sequences based on a new model. These investigations may provide some insight into the nucleotide clusters of DNA sequences.
Keywords: Nucleotide cluster; DNA sequences; Scaling behavior
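The exponential cluster-size law quoted in this abstract can be illustrated with a short sketch. It assumes, since the abstract does not spell this out, that a nucleotide cluster is a maximal run of consecutive bases from one set (C/G or A/T), and it estimates α with an ordinary least-squares fit of ln P(S) against S. The function names `cluster_sizes` and `fit_alpha` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import groupby


def cluster_sizes(seq, bases):
    """Return the lengths of maximal runs whose characters all belong to `bases`.

    Assumption: a "cluster" is a maximal run of consecutive bases from one set,
    e.g. bases="CG" for C-G clusters or bases="AT" for A-T clusters.
    """
    return [
        sum(1 for _ in run)
        for in_set, run in groupby(seq.upper(), key=lambda b: b in bases)
        if in_set
    ]


def fit_alpha(sizes):
    """Estimate alpha in P(S) ~ exp(-alpha * S) from observed cluster sizes.

    Fits a straight line to ln P(S) versus S by least squares and returns
    the negated slope.
    """
    counts = Counter(sizes)
    if len(counts) < 2:
        raise ValueError("need at least two distinct cluster sizes to fit a slope")
    total = sum(counts.values())
    xs = sorted(counts)
    ys = [math.log(counts[s] / total) for s in xs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx
```

Calling `fit_alpha(cluster_sizes(chromosome, "CG"))` and `fit_alpha(cluster_sizes(chromosome, "AT"))` on the same sequence would give the two exponents to compare. A binned or maximum-likelihood fit may be preferable for real genomes, where large cluster sizes are sparsely sampled.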
Statistical properties of nucleotide clusters in DNA sequences
6
Authors: CHENG Jun (成军), ZHANG Lin-xi (章林溪) 《Journal of Zhejiang University-Science B(Biomedicine & Biotechnology)》 SCIE EI CAS CSCD, 2005, Issue 5, pp. 408-412 (5 pages)
Using the complete genome of Plasmodium falciparum 3D7, which has 14 chromosomes, as an example, we have examined the distribution functions for the amount of C or G and A or T in consecutive, non-overlapping blocks of m bases. The function $P(S)$ for the number of consecutive C-G or A-T content clusters conforms to the relation $P(S) \propto e^{-\alpha S}$; values of the scaling exponent $\alpha_{CG}$ are much larger than $\alpha_{AT}$; and $\alpha_{AT}$ hardly changes across the 14 chromosomes, whereas $\alpha_{CG}$ fluctuates. We found that the maximum A-T cluster size is much larger than the maximum C-G cluster size, which implies the existence of large A-T clusters. Our study of the width function $\xi(m)$ of cluster C-G content showed that it follows a good power law, $\xi(m) \propto m^{-\gamma}$. The average $\gamma$ for the 14 chromosomes is 0.931. These investigations provide some insight into the nucleotide clusters of DNA sequences, and help us understand other properties of DNA sequences.
Keywords: DNA sequence; Plasmodium falciparum 3D7; Nucleotide clusters; Power law
Fuzzy Cluster Analysis of Alzheimer’s Disease-Related Gene Sequences
7
Authors: Jing Yang, Jiarui Si, Xiaoxuan Gu, Ouyan Shi 《Engineering(科研)》 2013, Issue 10, pp. 530-533 (4 pages)
The objective of this paper is to analyze the relationship among the interrelated gene sequences of Alzheimer's disease (AD). The paper also studies the genetic factors in the occurrence of Alzheimer's disease, so as to provide more information for its prevention, clinical diagnosis and gene therapy. The AD-related gene sequences were aligned with those in the National Center for Biotechnology Information (NCBI) database, and the measurement relationship of these sequences was identified and analyzed by the method of fuzzy clustering. The result of the fuzzy cluster analysis indicates that gene sequences within one group are consistently more closely related to each other than to sequences in another group.
Keywords: Alzheimer's Disease; Gene; mRNA sequence; Alignment; Fuzzy cluster
A Novel Method of Deinterleaving Radar Pulse Sequences Based on a Modified DBSCAN Algorithm (Cited by: 10)
8
Authors: Abolfazl Dadgarnia, Mohammad Taghi Sadeghi 《China Communications》 SCIE CSCD, 2023, Issue 2, pp. 198-215 (18 pages)
A modified DBSCAN algorithm is presented for deinterleaving radar pulses in modern EW environments. A main characteristic of the proposed method is that it can sort the pulses efficiently using only their time of arrival. Other PDW information, such as rise time, carrier frequency, pulse width, modulation on pulse, fall time and direction of arrival, is not required. To identify the valid PRIs in a set of interleaved pulses, an innovative modification of the DBSCAN algorithm is introduced which is accurate and easy to implement. The proposed method determines valid PRIs more accurately and rejects spurious ones more efficiently than classical histogram-based algorithms such as SDIF. Furthermore, without specifying any input parameter, the proposed method can deinterleave radar pulses while up to 30% jitter is present in the associated PRI. The accuracy and efficiency of the proposed method are verified by computer simulations and real data results. The experimental simulations are based on different real and operational scenarios in which the presence of missing and spurious pulses is also considered, so the simulation results can be of practical significance.
Keywords: deinterleaving; radar pulse sequences; density based clustering; pulse descriptor word
Assessment of genetic diversity by simple sequence repeat markers among forty elite varieties in the germplasm for malting barley breeding (Cited by: 5)
9
Authors: Jun-mei WANG, Jian-ming YANG, Jing-huan ZHU, Qiao-jun JIA, Yue-zhi TAO 《Journal of Zhejiang University-Science B(Biomedicine & Biotechnology)》 SCIE CAS CSCD, 2010, Issue 10, pp. 792-800 (9 pages)
The genetic diversity and relationship among 40 elite barley varieties were analyzed based on simple sequence repeat (SSR) genotyping data. The amplified fragments from SSR primers were highly polymorphic in the barley accessions investigated. A total of 85 alleles were detected at 35 SSR loci, and allelic variations existed at 29 SSR loci. The allele number per locus ranged from 1 to 5, with an average of 2.4 alleles per locus detected from the 40 barley accessions. A cluster analysis based on the genetic similarity coefficients was conducted, and the 40 varieties were classified into two groups. Seven malting barley varieties from China fell into the same subgroup. The genetic diversity within the Chinese malting barley varieties was found to be narrower than that in other barley germplasm sources, suggesting the importance and feasibility of introducing elite genotypes from different origins for malting barley breeding in China.
Keywords: Barley (Hordeum vulgare L.); Genetic similarity; Simple sequence repeat (SSR) marker; Cluster analysis; Genetic diversity
Mitochondrial targeting sequence of magnetoreceptor MagR: More than just targeting (Cited by: 2)
10
Authors: Yanqi Zhang, Peng Zhang, Junjun Wang, Jing Zhang, Tianyang Tong, Xiujuan Zhou, Yajie Zhou, Mengke Wei, Chuanlin Feng, Jinqian Li, Xin Zhang, Can Xie, Tiantian Cai 《Zoological Research》 SCIE CSCD, 2024, Issue 3, pp. 468-477 (10 pages)
Iron-sulfur clusters (ISC) are essential cofactors for proteins involved in various biological processes, such as electron transport, biosynthetic reactions, DNA repair, and gene expression regulation. ISC assembly protein IscA1 (or MagR) is found within the mitochondria of most eukaryotes. Magnetoreceptor (MagR) is a highly conserved A-type iron and iron-sulfur cluster-binding protein, characterized by two distinct types of iron-sulfur clusters, [2Fe-2S] and [3Fe-4S], each conferring unique magnetic properties. MagR forms a rod-like polymer structure in complex with photoreceptive cryptochrome (Cry) and serves as a putative magnetoreceptor for retrieving geomagnetic information in animal navigation. Although the N-terminal sequences of MagR vary among species, their specific function remains unknown. In the present study, we found that the N-terminal sequences of pigeon MagR, previously thought to serve as a mitochondrial targeting signal (MTS), were not cleaved following mitochondrial entry but instead modulated the efficiency with which iron-sulfur clusters and irons are bound. Moreover, the N-terminal region of MagR was required for the formation of a stable MagR/Cry complex. Thus, the N-terminal sequences in pigeon MagR fulfil more important functional roles than just mitochondrial targeting. These results further extend our understanding of the function of MagR and provide new insights into the origin of magnetoreception from an evolutionary perspective.
Keywords: Magnetoreceptor (MagR); N-terminal sequence; Mitochondrial targeting signal; Iron-sulfur cluster
Unsupervised Binary Protocol Clustering Based on Maximum Sequential Patterns (Cited by: 2)
11
Authors: Jiaxin Shi, Lin Ye, Zhongwei Li, Dongyang Zhan 《Computer Modeling in Engineering & Sciences》 SCIE EI, 2022, Issue 1, pp. 483-498 (16 pages)
With the rapid development of the Internet, a large number of private protocols have emerged on the network. However, some of them are constructed by attackers to avoid being analyzed, posing a threat to computer network security. The blockchain uses the P2P protocol to implement various functions across the network. Furthermore, the P2P protocol format of blockchain may differ from the standard format specification, so sniffing tools such as Wireshark and Fiddler cannot recognize it. Therefore, the ability to distinguish different types of unknown network protocols is vital for network security. In this paper, we propose an unsupervised clustering algorithm based on maximum frequent sequences for binary protocols, which can distinguish various unknown protocols to provide support for analyzing unknown protocol formats. We mine the maximum frequent sequences of protocol message sets in bytes. We then calculate the fuzzy membership of each protocol message to each maximum frequent sequence, based on fuzzy set theory, and construct the fuzzy membership vector for each protocol message. Finally, we adopt K-means++ to split different types of protocol messages into several clusters and evaluate the performance by calculating homogeneity, integrity, and the Fowlkes and Mallows Index (FMI). In addition, clustering algorithms based on Needleman–Wunsch and on the fixed-length prefix are compared with the algorithm presented in this paper. Compared with these traditional clustering methods, our work shows a certain improvement in clustering performance.
Keywords: Binary protocol; blockchain; maximum frequent sequence; protocol message clustering; protocol reverse engineering
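One of the evaluation measures named in this abstract, the Fowlkes and Mallows Index, can be sketched by pair counting. The code below uses the standard definition FMI = TP / sqrt((TP + FP)(TP + FN)), where TP counts pairs of messages placed together in both the reference labelling and the clustering. It is an illustration only, not the authors' evaluation code, and the function name `fowlkes_mallows` is hypothetical.

```python
import math
from itertools import combinations


def fowlkes_mallows(labels_true, labels_pred):
    """Pair-counting Fowlkes-Mallows index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1  # pair grouped together in both labelings
        elif same_pred:
            fp += 1  # grouped together only by the clustering
        elif same_true:
            fn += 1  # grouped together only in the reference
    if tp == 0:
        return 0.0
    return tp / math.sqrt((tp + fp) * (tp + fn))
```

The index depends only on which items share a cluster, so renaming cluster labels does not change it. The loop is quadratic in the number of messages; a contingency-table formulation would scale better for large captures.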
Research on the Application of Time Structure Variation Analysis to the Jiashi-Bachu Earthquake Swarm Sequence (Cited by: 2)
12
Authors: Yang Xin, Long Haiying, Shangguan Wenming, Nie Xiaohong 《Earthquake Research in China》 2008, Issue 3, pp. 251-264 (14 pages)
From 1997 to 2003, 27 earthquakes with M ≥ 5.0 occurred in the Jiashi-Bachu area of Xinjiang, a rare strong earthquake swarm activity. The earthquake swarm has three time segments of activity with different magnitudes, in 1997, 1998 and 2003. In the different time segments, the seismic activity showed strengthening-quiet changes to various degrees before earthquakes with M ≥ 5.0. In order to delimit effectively the precursory meaning of the clustering (strengthening)-quiet change in the sequence and to seek a time criterion for impending prediction, the nonlinear characteristics of seismic activity have been used to analyze the time structure characteristics of the earthquake swarm sequence, and further to forecast the development tendency of earthquake sequences in the future. Using the sequence catalogue recorded by the Kashi Station, and taking the earthquakes with Ms ≥ 5.0 in the sequence as the starting point and the next earthquake with Ms = 5.0 as the end, statistical analysis has been performed on the time structure relations of the earthquake sequence in different stages. The main results are as follows: (1) Before the major earthquakes with M ≥ 5.0 in the swarm sequence, the time variation coefficient (δ-value) shows anomalies of different degrees. (2) Within 10 days after δ = 1, the occurrence of earthquakes with M ≥ 5.0 in the swarm is very likely. (3) The time variation coefficient has three types of change. (4) The change process before earthquakes with M5.0 is similar to that before earthquakes with M6.0, with little difference in the threshold value. In the earthquake swarm sequence, it is difficult to determine accurately the attribute of the current sequence (foreshock or aftershock sequence) and to judge the magnitude of the follow-up earthquake from the δ-value. We can only judge that earthquakes with M5.0 are likely to occur in the sequence. (5) The critical clustering characteristics of the sequence are hierarchical. Only with respect to a certain magnitude can the sequence have the variation state of critical clustering. (6) The time variation coefficient has a clear physical meaning. After the clustering-quiet state of earthquake activity has appeared, it can describe clearly the randomness of the seismogenic system. Furthermore, it can efficiently clarify whether or not the clustering-quiescence variation has prognostic meaning. When the earthquake frequency attenuation is essentially normal (h > 1) and there is no remarkable clustering-quiescence state, it is still possible to discover the abnormal change of the sequence from the time variation coefficient. Conversely, in the later period of swarm activity, after the appearance of many seismic quiescence phenomena, this coefficient did not become anomalous, even when h < 1, suggesting that the δ-value diagnosis is more universal.
Keywords: Time variation coefficient; Earthquake clustering; Randomness; Time structure of earthquake sequence; Jiashi earthquake swarm
Similarity Measurement of Web Sessions Based on Sequence Alignment
13
Authors: LI Chaofeng, LU Yansheng 《Wuhan University Journal of Natural Sciences》 CAS, 2007, Issue 5, pp. 814-818 (5 pages)
The task of clustering Web sessions is to group Web sessions based on similarity, maximizing the intra-group similarity while minimizing the inter-group similarity. The first and foremost question to be considered in clustering Web sessions is how to measure the similarity between Web sessions. However, traditional measurements have many shortcomings. This paper introduces a new method for measuring similarities between Web pages that takes into account not only the URL but also the viewing time of the visited Web page. We then give, in detail, a new method to measure the similarity of Web sessions using sequence alignment and the similarity of Web page access. Experiments show that our method is valid and efficient.
Keywords: Web usage mining; clustering; Web session; sequence alignment
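The idea in this abstract, aligning two sessions as sequences while scoring each page pair by URL and viewing time, can be sketched as follows. The abstract gives no formulas, so every detail here is an assumption made for illustration: the shared-path-prefix URL measure, the viewing-time ratio, the gap penalty of -0.5, and the normalisation by the longer session. The function names are hypothetical, and the alignment is a plain Needleman-Wunsch global alignment rather than the authors' method.

```python
def url_similarity(url_a, url_b):
    """Fraction of leading path segments shared by two URLs (assumed measure)."""
    a = [p for p in url_a.strip("/").split("/") if p]
    b = [p for p in url_b.strip("/").split("/") if p]
    if not a and not b:
        return 1.0
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return shared / max(len(a), len(b))


def page_similarity(page_a, page_b):
    """Combine URL similarity with the ratio of viewing times (assumed weighting).

    Each page is a (url, viewing_time_in_seconds) tuple.
    """
    (url_a, time_a), (url_b, time_b) = page_a, page_b
    longest = max(time_a, time_b)
    time_ratio = min(time_a, time_b) / longest if longest > 0 else 1.0
    return url_similarity(url_a, url_b) * time_ratio


def session_similarity(session_a, session_b, gap=-0.5):
    """Global alignment score of two sessions, normalised by the longer session."""
    n, m = len(session_a), len(session_b)
    if n == 0 or m == 0:
        return 0.0
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            aligned = score[i - 1][j - 1] + page_similarity(
                session_a[i - 1], session_b[j - 1]
            )
            score[i][j] = max(aligned, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[n][m] / max(n, m)
```

Two identical sessions score 1.0 under these assumptions, and sessions with no shared URL prefixes score at or below zero because only gaps contribute. A pairwise matrix of such scores could then be fed to any similarity-based clustering method.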
Spatio-temporal epidemic type aftershock sequence model for Tangshan aftershock sequence
14
Authors: Shaochuan Lue, Yong Li 《Earthquake Science》 CSCD, 2011, Issue 5, pp. 401-408 (8 pages)
Shallow earthquakes usually show obvious spatio-temporal clustering patterns. In this study, several spatio-temporal point process models are applied to investigate the clustering characteristics of the well-known Tangshan sequence, based on classical empirical laws and a few assumptions. The relative fit of competing models is compared using the Akaike Information Criterion. The spatial clustering pattern is well characterized by the model that gives the best fit to the data. A simulated aftershock sequence is generated by a thinning algorithm and compared with the real seismicity.
Keywords: spatio-temporal model; Tangshan aftershock sequence; Laplace type clustering; thinning simulation; Akaike information criterion
Fault Identification of Power Grid Based on Wide-Area Differential Current and K-Means Clustering
15
Authors: Hao Wu, Qunzhan Li 《Energy and Power Engineering》 2017, Issue 4, pp. 19-29 (11 pages)
A new method of fault domain identification is proposed, based on K-means clustering analysis and using the wide-area information of the power grid. In the method, the associated domain of a node Intelligent Electronic Device (IED) is defined, the relationship of the positive sequence current fault component at the boundaries of the association domain is sought, and the concept of the positive sequence fault component differential current for node IED association domains is then introduced. The positive sequence fault component differential current gathered by the node IEDs is selected as the object of K-means clustering. The node IEDs of fault associated domains are classified into one category, and the node IEDs of non-fault associated domains into another. With the fault area minimum principle, the group of node IEDs of fault associated domains can be obtained. The overlap of the fault associated domains of different nodes is the fault area. A large number of simulations show that the proposed algorithm identifies fault domains with high accuracy and is not affected by the operating mode of the system or by topological changes.
Keywords: Positive Sequence Fault Component; Differential Current; K-Means Clustering; Fault Association Domain; The Node IED; Fault Domain Identification
Proteome-Based Clustering Approaches Reveal Phylogenetic Insights into Amphistegina
16
Authors: Marleen Stuhr, Bernhard Blank-Landeshammer, Achim Meyer, Vera Baumeister, Jörg Rahnenführer, Albert Sickmann, Hildegard Westphal 《Journal of Earth Science》 SCIE CAS CSCD, 2022, Issue 6, pp. 1469-1479 (11 pages)
Foraminifera are highly diverse and have a long evolutionary history. As key bioindicators, their phylogenetic schemes are of great importance for paleogeographic applications, but may be hard to recognize correctly. The phylogenetic relationships within the prominent genus Amphistegina are still uncertain. Molecular studies on Amphistegina have so far focused only on genetic diversity within single species and suggested a cryptic diversity that demands further investigation. Besides molecular sequencing-based approaches, different mass spectrometry-based proteomics approaches are increasingly used to give insights into the relationship between samples and organisms, especially as these do not require reference databases. To better understand the relationship of amphisteginids and to test different proteomics-based approaches, we applied de novo peptide sequencing and similarity clustering to several populations of Amphistegina lobifera, A. lessonii and A. gibbosa. We also analyzed the dominant photosymbiont community to study its influence on holobiont proteomes. Our analyses indicate that de novo peptide sequencing in particular allows the relationship among foraminiferal holobionts to be reconstructed, although the detected separation of A. gibbosa from A. lessonii and A. lobifera may be partly influenced by their different photosymbiont types. The resulting dendrograms reflect the previously suggested separation into two lineages and provide a basis for future studies.
Keywords: large benthic foraminifera; de novo peptide sequencing; tandem mass spectra clustering; LC-MS/MS runs; proteomics; symbiont diversity; phylogeography; Fragilariales
Complete Genome Sequence of Bacillus thuringiensis Serovar coreanensis ST7 with Toxicity to Human Cancer Cells
17
Authors: Jing Zhang, Yiping Liu, Rui Liu, Xu Liu, Baoli Zhang, Jun Zhu 《Advances in Microbiology》 2020, Issue 12, pp. 673-680 (8 pages)
Bacillus thuringiensis (Bt) parasporal crystal proteins are well known to be toxic to certain insects and to have cytocidal activity against various human cancer cells. Bt serovar coreanensis ST7, which is non-pathogenic to insects and non-hemolytic, has an important parasporin, PS4Aa1 (Cry45Aa1), with potential toxicity to human cancer cells. In this study, we report the features of the complete genome sequence of ST7 and the functional classification of its proteins by clusters of orthologous groups. The evolution of ST7 was also studied. The genome data of ST7 will contribute strongly to a better understanding of genomic diversity and evolution, and will enrich the Bt genome database.
Keywords: Bacillus thuringiensis Serovar Coreanensis ST7; Gapless Chromosome; Plasmid Sequences; Genomic Feature; Cluster of Orthologous Groups of Proteins
An Efficient Agglomerative Clustering Algorithm for Web Navigation Pattern Identification
18
Authors: A. Anitha 《Circuits and Systems》 2016, Issue 9, pp. 2349-2356 (9 pages)
Web log mining is the analysis of web log files containing web page sequences. Discovering user access patterns from web access logs is necessary for building adaptive web servers, improving e-commerce, carrying out cross-marketing, personalizing the web, predicting web access sequences, and so on. In this paper, a new agglomerative clustering technique is proposed to identify users with similar interests and to determine the motivation for visiting a website. Using this approach, web usage mining is done through several stages, namely data cleaning, preprocessing, pattern discovery and pattern analysis. Results are given to explain how this approach produces tighter usage clusters than existing web usage mining techniques. Rather than traditional distance-based clustering, a similarity measure is used during the clustering process in order to reduce computational complexity. This paper also deals with the problem of assessing the quality of user session clusters; cluster validity is measured using a statistical test that measures the distances between cluster distributions to infer their dissimilarity and level of distinction. Using such statistical measures, cluster accuracy is shown to improve to 0.83, compared with existing k-means clustering (validity measure 0.26), FCM (Fuzzy C Means) clustering (validity measure 0.56) and rough set based clustering (validity measure 0.54). Generation of dense clusters is essential for finding the interesting patterns needed for further mining and analysis.
Keywords: Agglomerative Clustering; Similarity Measure; Cluster Validity; Clickstream Sequence; Transaction
Research on the Source Parameters and Correlation Coefficients of Focal Mechanisms for the Yingjiang Earthquake Sequences in 2008
19
Authors: Deng Fei, Liu Jie 《Earthquake Research in China》 CSCD, 2015, Issue 1, pp. 68-83 (16 pages)
The source parameters of the Yingjiang earthquake sequences in 2008 are obtained by applying spectral analysis and Brune's source model to the digital waveform data recorded by the Yunnan Digital Seismic Network. The correlation coefficients are calculated using the low-frequency spectral amplitudes of two events recorded by the same station; events with similar focal mechanisms are then grouped using the clustering analysis method. Comparison with the obtained focal mechanisms shows good correlation with the azimuths of the P axes within each clustering group: the larger the correlation coefficient, the closer the azimuths of the P axes. We divide the Yingjiang area into three regions to analyze the stress level and stress direction by combining the source parameters and the mean focal mechanism of each group. The results show that the change and transformation of the focal mechanism types at different stages can represent the temporal characteristics of the regional stress field. If the earthquake focal mechanism types are concentrated in a time period and switch to the direction of the regional stress field, this may be a sign of a strong earthquake. There is some relationship between the stress drop and the type of focal mechanism. Earthquakes whose focal mechanisms reveal stress fields closer to the regional tectonic stress field have a higher stress drop, while those whose focal mechanisms reveal stress fields differing greatly from the regional tectonic stress field generally have a lower stress drop.
Keywords: Stress drop; Correlation of focal mechanisms; Clustering group; Yingjiang earthquake sequences in 2008