Journal Articles — 4,746 articles found
1. Semi-supervised Affinity Propagation Clustering Based on Subtractive Clustering for Large-Scale Data Sets
Authors: Qi Zhu, Huifu Zhang, Quanqin Yang. 《国际计算机前沿大会会议论文集》, 2015, No. 1, pp. 76-77.
Faced with a growing number of large-scale data sets, the affinity propagation (AP) clustering algorithm must build a full similarity matrix, which incurs enormous storage and computation costs. This paper therefore proposes an improved affinity propagation clustering algorithm. First, subtractive clustering is added, using the density value of each data point to obtain initial cluster points. Then, the similarity distances between the initial cluster points are calculated and, borrowing the idea of semi-supervised clustering, pairwise constraint information is added to construct a sparse similarity matrix. Finally, AP clustering is run on the cluster representative points until a suitable cluster division is reached. Experimental results show that the algorithm greatly reduces the amount of computation and the storage required for the similarity matrix, and outperforms the original algorithm in both clustering quality and processing speed.
Keywords: subtractive clustering; initial cluster; affinity propagation clustering; semi-supervised clustering; large-scale data sets
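The subtractive-clustering step this abstract describes seeds AP with the densest points. A minimal sketch of the density (mountain-method) computation — the neighborhood radius `ra` and the toy points are assumptions, not the authors' settings:

```python
import numpy as np

def subtractive_density(X, ra=1.0):
    # Mountain-method density potential: each point's density is a sum of
    # Gaussian kernels over all points; the densest point seeds a cluster.
    alpha = 4.0 / ra ** 2
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-alpha * sq_dists).sum(axis=1)

# Three tightly packed points and one outlier: the packed points score highest.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
density = subtractive_density(X)
first_center = int(np.argmax(density))  # index of the first initial cluster point
```

In the full method, the selected point's neighborhood would then be suppressed and the next center chosen, before AP runs on the resulting representatives.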
2. Large-scale spatial data visualization method based on augmented reality
Authors: Xiaoning QIAO, Wenming XIE, Xiaodong PENG, Guangyun LI, Dalin LI, Yingyi GUO, Jingyi REN. 《虚拟现实与智能硬件(中英文)》 (EI), 2024, No. 2, pp. 132-147.
Background: A task assigned to space exploration satellites involves detecting the physical environment within a certain space. However, space detection data are complex and abstract, and are not conducive to researchers' visual perception of the evolution and interaction of events in the space environment. Methods: A time-series dynamic data sampling method for large-scale space was proposed to sample detection data in space and time, and the corresponding relationships between data location features and other attribute features were established. A tone-mapping method based on statistical histogram equalization was proposed and applied to the final attribute feature data. The visualization process was optimized for rendering by merging materials, reducing the number of patches, and performing other operations. Results: Sampling, feature extraction, and uniform visualization were achieved for detection data of complex types, long time spans, and uneven spatial distributions. The real-time visualization of large-scale spatial structures on augmented reality devices, particularly low-performance devices, was also investigated. Conclusions: The proposed visualization system can reconstruct the three-dimensional structure of a large-scale space, express the structure and changes of the spatial environment through augmented reality, and help users intuitively discover spatial environmental events and evolutionary rules.
Keywords: large-scale spatial data analysis; visual analysis technology; augmented reality; 3D reconstruction; space environment
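The histogram-equalization tone mapping mentioned in the Methods section maps raw attribute values through their empirical distribution so that rendering contrast is spread evenly. A generic sketch, with an assumed bin count and synthetic data (the paper's exact pipeline is not specified here):

```python
import numpy as np

def equalize(values, bins=256):
    # Histogram-equalization tone mapping: send each value to its empirical
    # CDF level, so equally populated value ranges get equal display range.
    hist, edges = np.histogram(values, bins=bins)
    cdf = np.cumsum(hist).astype(float)
    cdf /= cdf[-1]
    idx = np.clip(np.digitize(values, edges[1:-1]), 0, bins - 1)
    return cdf[idx]

# 90 identical low readings plus 10 spread-out high readings.
data = np.concatenate([np.full(90, 1.0), np.linspace(10, 100, 10)])
mapped = equalize(data)  # values in (0, 1], usable as a colormap coordinate
```

The mapped values can then be fed to a colormap; dense clusters of raw values no longer collapse into a single display tone.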
3. AOL4PS: A Large-scale Data Set for Personalized Search
Authors: Qian Guo, Wei Chen, Huaiyu Wan. 《Data Intelligence》 (EI), 2021, No. 4, pp. 548-567.
Personalized search is a promising way to improve the quality of Web search, and it has attracted much attention from both the academic and industrial communities. Much of the current related research is based on commercial search engine data, which cannot be released publicly for reasons such as privacy protection and information security. This leads to a serious lack of accessible public data sets in this field. The few publicly available data sets have not become widely used in academia because of the complex processing required to study personalized search methods. The lack of data sets, together with the difficulty of data processing, has hindered fair comparison and evaluation of personalized search models. In this paper, we construct AOL4PS, a large-scale data set for evaluating personalized search methods, collected and processed from AOL query logs. We present the complete and detailed data processing and construction pipeline. Specifically, to address the processing-time and storage demands of massive data volumes, we optimize the data set construction process and propose an improved BM25 algorithm. Experiments on AOL4PS with both classic and state-of-the-art personalized search methods demonstrate that AOL4PS can measure the effect of personalized search models.
Keywords: personalized search; text data processing; data set construction
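The abstract builds on BM25 for relevance scoring. Below is the baseline Okapi BM25 formula, not the paper's improved variant; `k1`, `b`, and the toy corpus are assumptions for illustration:

```python
import math

def bm25_score(query_terms, doc, corpus, k1=1.5, b=0.75):
    # Okapi BM25: sum over query terms of idf * saturated, length-normalized tf.
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    score = 0.0
    for t in query_terms:
        df = sum(1 for d in corpus if t in d)
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)
        tf = doc.count(t)
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
    return score

corpus = [["personalized", "search", "query"],
          ["web", "search"],
          ["privacy", "policy"]]
s = bm25_score(["search"], corpus[0], corpus)  # positive: term occurs in doc
```

A term absent from the document contributes zero, so scoring a query against a candidate document only costs term lookups plus the shared corpus statistics.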
4. Incidence and Survivability of Acute Lymphocytic Leukemia Patients in the United States: Analysis of SEER Data Set from 2000-2019
Authors: Ishan Ghosh, Sudipto Mukherjee. 《Journal of Cancer Therapy》, 2024, No. 4, pp. 141-163.
The main goal of this research is to assess the impact of race, age at diagnosis, sex, and phenotype on the incidence and survivability of acute lymphocytic leukemia (ALL) among patients in the United States. By taking these factors into account, the study aims to explore how existing cancer registry data can aid in the early detection and effective treatment of ALL. Our hypothesis was that statistically significant correlations exist between race, age at diagnosis, sex, and phenotype of ALL patients and their rates of incidence and survivability. Data were evaluated using the SEER*Stat statistical software from the National Cancer Institute. Analysis of the incidence data revealed a higher prevalence of ALL among the Caucasian population. The majority of ALL cases (59%) occurred in patients aged 0 to 19 years at the time of diagnosis, and 56% of the affected individuals were male. The B-cell phenotype was predominantly associated with ALL cases (73%). When analyzing survivability data, it was observed that the 5-year survival rates slightly exceeded the 10-year survival rates for the respective demographics. Survivability rates of African American patients were the lowest compared to Caucasian, Asian, Pacific Islander, Alaskan Native, Native American, and other patients, and survivability rates progressively decreased for older patients. Moreover, this study investigated the typical treatment methods applied to ALL patients, mainly chemotherapy, with occasional supplementation of radiation therapy as required. The study demonstrated the considerable efficacy of chemotherapy in enhancing patients' chances of survival, while those who remained untreated faced a less favorable prognosis. Although a significant amount of data and information already exists, this study can help doctors diagnose patients with certain characteristics, assist health care professionals in screening potential patients and detecting cases early, and could thereby save the lives of elderly patients, who have a higher mortality rate from this disease.
Keywords: acute lymphocytic leukemia; survivability; incidence; demography; SEER data set
5. Regularized focusing inversion for large-scale gravity data based on GPU parallel computing
Authors: WANG Haoran, DING Yidan, LI Feida, LI Jing. 《Global Geology》, 2019, No. 3, pp. 179-187.
Processing large-scale 3-D gravity data is an important topic in geophysics. Many existing inversion methods lack the capacity to process massive data and to be applied in practice. This study applies GPU parallel processing technology to the focusing inversion method, aiming to improve inversion accuracy while speeding up computation and reducing memory consumption, thus obtaining fast and reliable inversion results for large complex models. In this paper, equivalent storage of the geometric trellis is used to calculate the sensitivity matrix, and the inversion is based on GPU parallel computing technology. The parallel program, optimized by reducing data transfer, access restrictions, and instruction restrictions as well as by latency hiding, greatly reduces memory usage and speeds up computation, making fast inversion of large models possible. By comparing the computing speed of the traditional single-threaded CPU method with that of CUDA-based GPU parallel technology, the excellent acceleration performance of GPU parallel computing is verified, which provides ideas for the practical application of theoretical inversion methods otherwise restricted by computing speed and computer memory. The model test verifies that the focusing inversion method can overcome the severe skin effect and the ambiguity of geological body boundaries. Moreover, increasing the number of model cells and inversion data can more clearly depict the boundary position of the abnormal body and delineate its specific shape.
Keywords: large-scale gravity data; GPU parallel computing; CUDA; equivalent geometric trellis; focusing inversion
6. Trend Analysis of Large-Scale Twitter Data Based on Witnesses during a Hazardous Event: A Case Study on California Wildfire Evacuation
Authors: Syed A. Morshed, Khandakar Mamun Ahmed, Kamar Amine, Kazi Ashraf Moinuddin. 《World Journal of Engineering and Technology》, 2021, No. 2, pp. 229-239.
Social media data have created a paradigm shift in assessing situational awareness during natural disasters or emergencies such as wildfires, hurricanes, and tropical storms. Twitter, as an emerging data source, is an effective and innovative digital platform for observing trends from the perspective of social media users who are direct or indirect witnesses of a calamitous event. This paper collects and analyzes Twitter data related to the recent wildfire in California to perform a trend analysis by classifying firsthand and credible information from Twitter users. The work investigates tweets on the recent California wildfire and classifies them based on witnesses into two types: 1) direct witnesses and 2) indirect witnesses. The collected and analyzed information can be useful to law enforcement agencies and humanitarian organizations for communication and verification of situational awareness during wildfire hazards. Trend analysis is an aggregated approach that includes sentiment analysis and topic modeling performed through domain-expert manual annotation and machine learning. Trend analysis ultimately builds a fine-grained analysis to assess evacuation routes and provide valuable information to firsthand emergency responders.
Keywords: wildfire; evacuation; Twitter; large-scale data; topic model; sentiment analysis; trend analysis
7. Fast3VmrMLM: A fast algorithm that integrates genome-wide scanning with machine learning to accelerate gene mining and breeding by design for polygenic traits in large-scale GWAS datasets
Authors: Jingtian Wang, Ying Chen, Guoping Shu, Miaomiao Zhao, Ao Zheng, Xiaoyu Chang, Guiqi Li, Yibo Wang, Yuan-Ming Zhang. 《Plant Communications》, 2025, No. 7, pp. 42-56.
Genetic dissection and breeding by design for polygenic traits remain substantial challenges. To address these challenges, it is important to identify as many genes as possible, including key regulatory genes. Here, we developed a genome-wide scanning plus machine learning framework, integrated with advanced computational techniques, to propose a novel algorithm named Fast3VmrMLM. This algorithm aims to enhance the identification of abundant and key genes for polygenic traits in the era of big data and artificial intelligence. The algorithm was extended to identify haplotype (Fast3VmrMLM-Hap) and molecular (Fast3VmrMLM-mQTL) variants. In simulation studies, Fast3VmrMLM outperformed existing methods in detecting dominant, small, and rare variants, requiring only 3.30 and 5.43 h (20 threads) to analyze the 18K rice and UK Biobank-scale datasets, respectively. Fast3VmrMLM identified more known (211) and candidate (384) genes for 14 traits in the 18K rice dataset than FarmCPU (100 known genes). Additionally, it identified 26 known and 24 candidate genes for seven yield-related traits in a maize NC II design, and Fast3VmrMLM-mQTL identified two known soybean genes near structural variants. We demonstrated that this novel two-step framework outperformed genome-wide scanning alone. In breeding by design, a genetic network constructed via machine learning using all known and candidate genes identified in this study revealed 21 key genes associated with rice yield-related traits. All associated markers yielded high prediction accuracies in rice (0.7443) and maize (0.8492), enabling the development of superior hybrid combinations. A new breeding-by-design strategy based on the identified key genes was also proposed. This study provides an effective method for gene mining and breeding by design.
Keywords: Fast3VmrMLM; machine learning; large-scale data; polygenic trait; efficient gene mining; breeding by design
8. 3D Seismic Data Reconstruction Based on a Weighted Fast Iterative Shrinkage Thresholding Algorithm
Authors: Zhang Hua, Qiu Da-Xing, Mo Zi-Fen, Hao Ya-Ju, Wu Zhao-Qi, Dai Meng-Xue. 《Applied Geophysics》, 2025, No. 1, pp. 22-34, 231-232.
Data reconstruction is a crucial step in seismic data preprocessing. To improve reconstruction speed and save memory, the commonly used three-dimensional (3D) seismic data reconstruction method divides the missing data into a series of time slices and reconstructs each time slice independently. However, this strategy ignores the potential correlations between adjacent time slices, which degrades reconstruction performance. Therefore, this study proposes the use of a two-dimensional curvelet transform and the fast iterative shrinkage thresholding algorithm for data reconstruction. Based on the significant overlap between the curvelet coefficient support sets of two adjacent time slices, a weighted operator is constructed in the curvelet domain using the prior support set provided by the previously reconstructed time slice to delineate the main energy distribution range, effectively providing prior information for reconstructing adjacent slices. The resulting weighted fast iterative shrinkage thresholding algorithm can then be used to reconstruct 3D seismic data. The processing of synthetic and field data shows that the proposed method achieves higher reconstruction accuracy and faster computation than the conventional fast iterative shrinkage thresholding algorithm for handling missing 3D seismic data.
Keywords: data reconstruction; fast iterative shrinkage thresholding; prior support set; weighted operator
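The weighting idea in this abstract — shrink less on coefficients that the previous slice's support set marks as signal — can be sketched with FISTA on a toy problem. Here an identity transform stands in for the paper's curvelet transform, and the weights, penalty, and test signal are all assumptions:

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def weighted_fista(mask, x_obs, w, lam=0.1, step=1.0, n_iter=100):
    # FISTA for min 0.5*||mask*(x - x_obs)||^2 + lam*||w*x||_1.
    # `w` < 1 on the prior support set weakens shrinkage there, mimicking the
    # curvelet-domain weighted operator built from the previous time slice.
    x = np.zeros_like(x_obs)
    y, t = x.copy(), 1.0
    for _ in range(n_iter):
        grad = mask * (y - x_obs)
        x_new = soft_threshold(y - step * grad, w * lam * step)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x

x_obs = np.array([0.0, 0.0, 3.0, 0.0, 0.0])  # fully observed toy signal
mask = np.ones_like(x_obs)
plain = weighted_fista(mask, x_obs, w=np.ones(5))  # uniform threshold
w = np.ones(5); w[2] = 0.1                          # index 2 is in the prior support
weighted = weighted_fista(mask, x_obs, w=w)         # weaker shrinkage at index 2
```

The prior-supported coefficient comes back with less amplitude bias (2.99 vs. 2.9 here), which is the benefit the weighted operator is after.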
9. Multi-View Picture Fuzzy Clustering: A Novel Method for Partitioning Multi-View Relational Data
Authors: Pham Huy Thong, Hoang Thi Canh, Luong Thi Hong Lan, Nguyen Tuan Huy, Nguyen Long Giang. 《Computers, Materials & Continua》, 2025, No. 6, pp. 5461-5485.
Multi-view clustering is a critical research area in computer science aimed at effectively extracting meaningful patterns from complex, high-dimensional data that single-view methods cannot capture. Traditional fuzzy clustering techniques, such as Fuzzy C-Means (FCM), face significant challenges in handling uncertainty and the dependencies between different views. To overcome these limitations, we introduce a new multi-view fuzzy clustering approach, termed Multi-view Picture Fuzzy Clustering (MPFC), that integrates picture fuzzy sets with a dual-anchor graph method for multi-view data, aiming to enhance clustering accuracy and robustness. In particular, picture fuzzy set theory extends the capability to represent uncertainty by modeling three membership levels: membership degrees, neutral degrees, and refusal degrees. This allows a more flexible representation of uncertain and conflicting data than traditional fuzzy models. Meanwhile, dual-anchor graphs exploit the similarity relationships between data points and integrate information across views. This combination improves stability, scalability, and robustness when handling noisy and heterogeneous data. Experimental results on several benchmark datasets demonstrate significant improvements in clustering accuracy and efficiency, outperforming traditional methods. Specifically, the MPFC algorithm attains a Purity (PUR) score of 0.6440 and an Accuracy (ACC) score of 0.6213 on the 3 Sources dataset, underscoring its robustness and efficiency. The proposed approach contributes significantly to fields such as pattern recognition, multi-view relational data analysis, and large-scale clustering problems. Future work will focus on extending the method to semi-supervised multi-view clustering, aiming to enhance adaptability, scalability, and performance in real-world applications.
Keywords: multi-view clustering; picture fuzzy sets; dual-anchor graph; fuzzy clustering; multi-view relational data
10. Question classification in question answering based on real-world web data sets
Authors: 袁晓洁, 于士涛, 师建兴, 陈秋双. 《Journal of Southeast University (English Edition)》 (EI, CAS), 2008, No. 3, pp. 272-275.
To improve question answering (QA) performance based on real-world web data sets, a new set of question classes and a general answer re-ranking model are defined. With a pre-defined dictionary and grammatical analysis, the question classifier draws both semantic and grammatical information into information retrieval and machine learning methods in the form of various training features, including the question word, the main verb of the question, the dependency structure, the position of the main auxiliary verb, the main noun of the question, the top hypernym of the main noun, etc. The QA query results are then re-ranked using question class information. Experiments show that the questions in real-world web data sets can be accurately classified by the classifier, and that the QA results after re-ranking are clearly improved. This proves that, with both semantic and grammatical information, applications such as QA built upon real-world web data sets can achieve better performance.
Keywords: question classification; question answering; real-world web data sets; question and answer web forums; re-ranking model
11. Reconstruction of incomplete satellite SST data sets based on EOF method (cited 2 times)
Authors: DING Youzhuan, WEI Zhihui, MAO Zhihua, WANG Xiaofei, PAN Delu. 《Acta Oceanologica Sinica》 (SCIE, CAS, CSCD), 2009, No. 2, pp. 36-44.
For satellite remote sensing data obtained in the visible and infrared bands, cloud coverage over the ocean often results in large-scale missing data in inversion products, and thin clouds that are difficult to detect can make the inversion product data abnormal. Alvera et al. (2005) proposed a method for the reconstruction of missing data based on an Empirical Orthogonal Functions (EOF) decomposition, but that method cannot process images with extreme cloud coverage (more than 95%), requires a long time for reconstruction, and is strongly affected by abnormal data in the images. Therefore, this paper improves on that result by reconstructing missing data sets through two applications of the EOF decomposition method. Firstly, abnormal times are detected by analyzing the temporal modes of the EOF decomposition, and the abnormal data are eliminated. Secondly, the data sets, excluding the abnormal data, are analyzed using EOF decomposition, and the temporal modes then undergo a filtering process to enhance the ability to reconstruct images containing little or no data. Finally, this method was applied to a large data set, i.e., 43 Sea Surface Temperature (SST) satellite images of the Changjiang River (Yangtze River) estuary and its adjacent areas, and the total reconstruction root mean square error (RMSE) is 0.82℃. It is thus proved that this improved EOF reconstruction method is robust for reconstructing missing and unreliable satellite data.
Keywords: EOF; SST; Changjiang River estuary; missing data sets
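The core EOF infilling loop behind this family of methods (iteratively replacing missing values with a truncated-EOF reconstruction, as in DINEOF-style approaches) can be sketched as follows; the toy rank-1 "SST field" and mode count are assumptions:

```python
import numpy as np

def eof_reconstruct(data, mask, n_modes=1, n_iter=200):
    # Fill missing entries (mask == False) by iterating a truncated SVD (EOF)
    # reconstruction of the space-time matrix until the fill values stabilize.
    filled = np.where(mask, data, 0.0)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :n_modes] * s[:n_modes]) @ Vt[:n_modes]
        filled = np.where(mask, data, approx)  # keep observed values fixed
    return filled

# Rank-1 toy field with one cloud-masked value at (2, 1); true value is 6.
truth = np.outer([1.0, 2.0, 3.0], [1.0, 2.0, 1.0])
mask = np.ones_like(truth, dtype=bool)
mask[2, 1] = False
recovered = eof_reconstruct(truth, mask)
```

The paper's contribution layers onto this loop: detecting abnormal times from the temporal modes, discarding them, and filtering the modes before the final reconstruction.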
12. Traffic Flow Data Forecasting Based on Interval Type-2 Fuzzy Sets Theory (cited 5 times)
Authors: Runmei Li, Chaoyang Jiang, Fenghua Zhu, Xiaolong Chen. 《IEEE/CAA Journal of Automatica Sinica》 (SCIE, EI), 2016, No. 2, pp. 141-148.
This paper proposes a long-term forecasting scheme and implementation method based on interval type-2 fuzzy sets theory for traffic flow data. Type-2 fuzzy sets have advantages in modeling uncertainties because their membership functions are themselves fuzzy. The scheme includes a traffic flow data preprocessing module, a type-2 fuzzification operation module, and a long-term traffic flow data forecasting output module, in which the Interval Approach acts as the core algorithm. The central limit theorem is adopted to convert point data of mass traffic flow in a time range into interval data of the same time range (also called confidence interval data), which is used as the input of the Interval Approach. The confidence interval data retain the uncertainty and randomness of traffic flow while reducing the influence of noise in the detection data. The proposed scheme not only produces the traffic flow forecast but can also show the possible range of traffic flow variation with high precision using upper- and lower-limit forecasting results. The effectiveness of the proposed scheme is verified in an actual sample application. © 2014 Chinese Association of Automation.
Keywords: data handling; forecasting; fuzzy sets; membership functions; uncertainty analysis
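The central-limit-theorem step described above, turning point traffic counts in a time window into confidence-interval data, can be sketched as follows (z = 1.96 for a 95% interval; the sample counts are invented):

```python
import math
import statistics

def to_confidence_interval(samples, z=1.96):
    # Collapse raw point measurements from one time window into an interval
    # via the CLT: mean +/- z * s / sqrt(n)  (95% confidence by default).
    n = len(samples)
    m = statistics.mean(samples)
    half = z * statistics.stdev(samples) / math.sqrt(n)
    return (m - half, m + half)

# Vehicle counts observed in the same window across days (made-up values).
flows = [102, 98, 110, 95, 101, 99, 104, 97, 100, 94]
lo, hi = to_confidence_interval(flows)
```

The interval, rather than the raw points, then feeds the type-2 fuzzification module, so detector noise is averaged out while the window's spread is preserved.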
13. An Evaluation of the Reliability of Complex Systems Using Shadowed Sets and Fuzzy Lifetime Data (cited 3 times)
Authors: Olgierd Hryniewicz. 《International Journal of Automation and Computing》 (EI), 2006, No. 2, pp. 145-150.
In this paper, we consider the problem of evaluating system reliability using statistical data obtained from reliability tests of its elements, in which the lifetimes of elements are described using an exponential distribution. We assume that this lifetime data may be reported imprecisely and that this lack of precision may be described using fuzzy sets. As the direct application of the fuzzy sets methodology leads in this case to very complicated and time-consuming calculations, we propose simple approximations of fuzzy numbers using the shadowed sets introduced by Pedrycz (1998). The proposed methodology may be easily extended to the case of general lifetime probability distributions.
Keywords: estimation of reliability; fuzzy reliability data; shadowed sets
14. A generalized rough set-based information filling technique for failure analysis of thruster experimental data (cited 1 time)
Authors: Han Shan, Zhu Qiang, Li Jianxun, Chen Lin. 《Chinese Journal of Aeronautics》 (SCIE, EI, CAS, CSCD), 2013, No. 5, pp. 1182-1194.
Interval-valued data and incomplete data are two key problems in the failure analysis of thruster experimental data, and both are essentially solved by the methods proposed in this paper. Firstly, information data acquired from the simulation and evaluation system, formed as an interval-valued information system (IIS), is classified by the interval similarity relation. Then, as an improvement of the classical rough set, a new kind of generalized information entropy called H'-information entropy is suggested for measuring the uncertainty and classification ability of an IIS. An innovative information filling technique uses the properties of H'-information entropy to replace missing data with smaller estimation intervals. Finally, an improved failure analysis method synthesizing the above achievements is presented to classify the thruster experimental data, complete the information, and extract the failure rules. The feasibility and advantages of this method are demonstrated in an actual failure analysis application, whose performance is evaluated by the quantification of E-condition entropy.
Keywords: data acquisition; data classification; failure analysis; information filling; rough set
15. A Direct Data-Cluster Analysis Method Based on Neutrosophic Set Implication (cited 2 times)
Authors: Sudan Jha, Gyanendra Prasad Joshi, Lewis Nkenyereya, Dae Wan Kim, Florentin Smarandache. 《Computers, Materials & Continua》 (SCIE, EI), 2020, No. 11, pp. 1203-1220.
Raw data are classified using clustering techniques in a reasonable manner to create disjoint clusters. Many clustering algorithms based on specific parameters have been proposed to process high volumes of data. This paper focuses on cluster analysis based on neutrosophic set implication, i.e., a k-means algorithm with a threshold-based clustering technique. This algorithm addresses the shortcomings of the k-means clustering algorithm by overcoming the limitations of the threshold-based clustering algorithm. To evaluate the validity of the proposed method, several validity measures and validity indices are applied to the Iris dataset (from the University of California, Irvine, Machine Learning Repository) along with the k-means and threshold-based clustering algorithms. The proposed method results in more segregated datasets with compact clusters, thus achieving higher validity indices, and it eliminates the limitations of the threshold-based clustering algorithm while validating the measures and respective indices alongside the k-means and threshold-based clustering algorithms.
Keywords: data clustering; data mining; neutrosophic set; k-means; validity measures; cluster-based classification; hierarchical clustering
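The threshold-based clustering component that the abstract pairs with k-means can be sketched in one dimension; the threshold value and data points are assumptions for illustration:

```python
def threshold_cluster(points, threshold):
    # Greedy threshold-based clustering: a point joins the first cluster whose
    # running centroid lies within `threshold`; otherwise it starts a new cluster.
    clusters = []  # per-cluster [sum, count] for incremental 1-D centroids
    labels = []
    for p in points:
        for i, (s, c) in enumerate(clusters):
            if abs(p - s / c) <= threshold:
                clusters[i] = [s + p, c + 1]
                labels.append(i)
                break
        else:
            clusters.append([p, 1])
            labels.append(len(clusters) - 1)
    return labels

labels = threshold_cluster([1.0, 1.2, 0.9, 5.0, 5.1], threshold=1.0)
```

Unlike k-means, the number of clusters is not fixed in advance — it emerges from the threshold — which is exactly the behavior the hybrid method exploits.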
16. Frequent item sets mining from high-dimensional dataset based on a novel binary particle swarm optimization (cited 2 times)
Authors: 张中杰, 黄健, 卫莹. 《Journal of Central South University》 (SCIE, EI, CAS, CSCD), 2016, No. 7, pp. 1700-1708.
A novel binary particle swarm optimization for frequent item sets mining from high-dimensional datasets (BPSO-HD) was proposed, incorporating two improvements. Firstly, dimensionality reduction of the initial particles was designed to ensure reasonable initial fitness; then, dynamic dimensionality cutting of the dataset was built to decrease the search space. Based on four high-dimensional datasets, BPSO-HD was compared with Apriori to test its reliability, and with ordinary BPSO and quantum swarm evolutionary (QSE) algorithms to prove its advantages. The experiments show that the results given by BPSO-HD are reliable and better than those generated by BPSO and QSE.
Keywords: data mining; frequent item sets; particle swarm optimization
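In frequent-itemset mining, a BPSO particle encodes a candidate itemset as a 0/1 mask over items, and its fitness rests on the itemset's support. The support computation the particles would be scored on (with an invented toy transaction set) looks like:

```python
def support(itemset, transactions):
    # Support = fraction of transactions containing every item of the itemset.
    # A BPSO particle's 0/1 position vector decodes to such an itemset.
    return sum(itemset <= t for t in transactions) / len(transactions)

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
s_ab = support({"a", "b"}, transactions)  # contained in 2 of 4 transactions
```

An itemset is "frequent" when its support clears a user-set minimum; the swarm searches the 2^n itemset space for masks that do.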
17. A Generalized Rough Set Approach to Attribute Generalization in Data Mining (cited 4 times)
Authors: 李天瑞, 徐扬. 《Journal of Modern Transportation》, 2000, No. 1, pp. 69-75.
This paper presents a generalized method for updating approximations of a concept incrementally, which can be used as an effective tool to deal with dynamic attribute generalization. By combining this method with the LERS inductive learning algorithm, it also introduces a generalized quasi-incremental algorithm for learning classification rules from databases.
Keywords: rough set; data mining; inductive learning
18. Influence of image data set noise on classification with a convolutional network (cited 2 times)
Authors: Wei Tao, Shuai Liguo, Zhang Yulu. 《Journal of Southeast University (English Edition)》 (EI, CAS), 2019, No. 1, pp. 51-56.
To evaluate the influence of data set noise, the network-in-network (NIN) model is introduced and the negative effects of different types and proportions of noise on deep convolutional models are studied. Different types and proportions of data noise are added to two reference data sets, Cifar-10 and Cifar-100. The noisy data is then used to train deep convolutional models, which classify the validation data set. The experimental results show that noise in the data set has obvious adverse effects on deep convolutional network classification models. The adverse effects of random noise are small, but cross-category noise among categories can significantly reduce the recognition ability of the model. Therefore, a solution is proposed to improve the quality of data sets contaminated by a single noise category: the model trained on the noisy data set is used to evaluate the current training data and reclassify the anomalous samples to form a new data set. Repeating these steps can greatly reduce the noise ratio, so the influence of cross-category noise can be effectively avoided.
Keywords: image recognition; data set noise; deep convolutional network; filtering of cross-category noise
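The cross-category label noise studied above can be injected into a clean label set as follows; the 10-class setup mirrors Cifar-10, while the ratio and seed are assumptions:

```python
import random

def add_label_noise(labels, n_classes, ratio, cross_category=True, seed=0):
    # Corrupt a fraction `ratio` of labels. Cross-category noise flips a label
    # to a *different* class; otherwise a uniformly random class is assigned.
    rng = random.Random(seed)
    noisy = list(labels)
    idx = rng.sample(range(len(labels)), int(ratio * len(labels)))
    for i in idx:
        if cross_category:
            noisy[i] = rng.choice([c for c in range(n_classes) if c != noisy[i]])
        else:
            noisy[i] = rng.randrange(n_classes)
    return noisy

clean = [i % 10 for i in range(1000)]                     # balanced 10-class labels
noisy = add_label_noise(clean, n_classes=10, ratio=0.2)   # 20% cross-category noise
flips = sum(a != b for a, b in zip(clean, noisy))
```

Training on `noisy` versus `clean` at varying ratios reproduces the kind of controlled comparison the paper runs on Cifar-10 and Cifar-100.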
19. Are China's Classes Predominantly Centered Around Teacher-Presentation Instruction? A Large-Scale Data Analysis Based on Classroom Intelligent Analysis Systems
Authors: Yihe Gao, Xiaozhe Yang. 《ECNU Review of Education》, 2025, No. 2, pp. 349-355.
System architecture: The Intelligent Teaching Team of the Shanghai Institute (Laboratory) of AI Education and the Institute of Curriculum and Instruction of East China Normal University collaborated to develop the High-Quality Classroom Intelligent Analysis Standard system. This system measures classes along the dimensions of Class Efficiency, Equity, and Democracy, and is referred to as the CEED system.
Keywords: large-scale data analysis; Chinese class; classroom intelligent analysis systems