China's Conversion of Cropland to Forest Program (CCFP) is one of the largest national ecological construction programs; it has effectively improved the ecological environment and produced large ecological benefits. To provide references for further improving the ecological benefits of the CCFP, we analyzed the features, differences, and relationships of the categorized forest ecological "benefit values" (B-Vs) for three forest restoration methods in different regions of the CCFP, using data from the Chinese Forest Ecosystem Research Network (CFERN) from 1999 to 2013 and the methods of the national standards LY/T 1606-2003, LY/T 1721-2008, and LY/T 1952-2011. The results showed that annual B-Vs per unit area ranged from 3.5 × 10^4 to 10.0 × 10^4 RMB/(hm2·a). Water conservation B-Vs and species conservation B-Vs were the two largest constituents, and the nutrient accumulation B-V was the smallest component of the total B-Vs. The B-Vs differed among the forest restoration methods and among regions. Ranked from high to low, the average annual total B-Vs per unit area were "hillside forest conservation", "returning cropland to forest", and "afforestation on suitable barren hills and wasteland". Species conservation B-Vs and water conservation B-Vs in southern regions were higher than those in northern and northwestern China. Hot and rainy regions could produce higher species conservation B-Vs. Regression analysis indicated that water conservation B-Vs had a significant positive correlation with the corresponding total B-Vs and a positive correlation with the corresponding atmosphere purification B-Vs at both the regional and unit-area scales. Unit-area species conservation B-Vs were negatively correlated with the corresponding nutrient accumulation B-Vs, except for "afforestation on suitable barren hills and wasteland". Regional total species conservation B-Vs had a significant negative correlation with the corresponding nutrient accumulation B-Vs, except for "hillside forest conservation". We suggest that forest restoration methods be selected according to regional characteristics, B-V features, and local ecological goals.
Don't be surprised if someone tells you that you can win a world championship by playing online games. Competitive online gaming, or eSports, has debuted as a demonstration sport at the 18th Asian Games held in Jakarta.
Objective: To explore the compatibility law governing the cold and hot nature of the complex prescriptions Mahuang Decoction (麻黄汤, MHD) and Maxing Shigan Decoction (麻杏石甘汤, MXSGD), two formulas of the same category but with different hot/cold natures. Methods: Oxygen consumption of mice was determined in three groups: MHD, MXSGD, and the control. A cold-hot pad differentiating assay was used to observe the variability of temperature tropism among groups of mice treated with MHD, MXSGD, and their compositions. The total anti-oxidant capability (T-AOC) activity was also measured. Results: After administration of MHD, the mice showed increased oxygen consumption (P<0.01). Compared with the MHD group, the remaining rate of MXSGD mice on the hot pad was significantly increased in the cold-hot pad differentiating assay (P<0.05). There was no significant difference (P>0.05) among the remaining rates of the MXSGD group, the MXSGD with high-dose Gypsum Fibrosum (MXHGF) group, and the MXSGD with low-dose Gypsum Fibrosum (MXLGF) group. Compared with the MHD group, T-AOC activity of the mice in the Consensus Compositions group was significantly decreased (P=0.0494). Compared with the MXSGD group, T-AOC activity of the Gypsum Fibrosum (GF) group was significantly increased (P=0.0013). Conclusions: The difference in cold and hot nature between MHD (hot nature) and MXSGD (cold nature) could be represented objectively. The reason may be that Gypsum Fibrosum decreased the efficacy of the consensus compositions. However, increasing or decreasing the dose of Gypsum Fibrosum does not change the cold and hot nature of MXSGD.
Neural activities differentiating bodies versus non-body stimuli have been identified in the occipitotemporal cortex of both humans and nonhuman primates. However, the neural mechanisms that code the similarity of different individuals' bodies of the same species to support their categorical representations remain unclear. Using electroencephalography (EEG) and magnetoencephalography (MEG), we investigated the temporal and spatial characteristics of neural processes shared by different individual body silhouettes of the same species by quantifying the repetition suppression of neural responses to human and animal (chimpanzee, dog, and bird) body silhouettes showing different postures. Our EEG results revealed significant repetition suppression of the amplitudes of early frontal/central activity at 180–220 ms (P2) and late occipitoparietal activity at 220–320 ms (P270) in response to animal (but not human) body silhouettes of the same species. Our MEG results further localized the repetition suppression effect related to animal body silhouettes in the left supramarginal gyrus and left frontal cortex at 200–440 ms after stimulus onset. Our findings suggest two neural processes that are involved in spontaneous categorical representations of animal body silhouettes as a cognitive basis of human-animal interactions.
The sparsity of ground gauges poses a significant challenge for evaluating and merging satellite-based and reanalysis-based precipitation datasets in lake regions. While the standard triple collocation (TC) method offers a solution without access to ground-based observations, it fails to address rain/no-rain classification, and its suitability for assessing and merging lake precipitation has not been explored. This study combines categorical triple collocation (CTC) with standard TC to create an integrated framework (CTC-TC) tailored to evaluate and merge global gridded precipitation products (GPPs). We assess the efficacy of CTC-TC using six GPPs (ERA5-Land, SM2RAIN-ASCAT, IMERG-Early, IMERG-Late, GSMaP-MVK, and PERSIANN-CCS) across the five largest freshwater lakes in China. CTC-TC effectively captures the spatial patterns of metrics for all GPPs and precisely estimates the correlation coefficient and root mean square error for satellite-based datasets apart from SM2RAIN-ASCAT, but it overestimates the classification accuracy indicator V for all GPPs. Regarding multi-source fusion, CTC-TC leverages the strengths of the individual products in each triplet, improving the critical success index (CSI) by over 11.9% and the modified Kling-Gupta efficiency (KGE') by more than 13.3%. Compared to baseline models, including standard TC, simple model averaging, one outlier removal, and Bayesian model averaging, CTC-TC achieves gains in CSI and KGE' of no less than 24.7% and 3.6%, respectively. In conclusion, the CTC-TC framework offers a thorough evaluation and efficient fusion of GPPs, addressing both categorical and continuous accuracy in data-scarce regions such as lakes.
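To make the standard TC step in the abstract above concrete, here is a minimal sketch of the usual covariance-based estimate of each product's error variance. It is not the CTC-TC framework described in the abstract. It assumes three collocated series whose errors are zero-mean and mutually uncorrelated; the function names and the synthetic series are hypothetical and chosen only so the arithmetic can be checked by hand.

```python
def covariance(a, b):
    """Population covariance of two equal-length series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / n


def tc_error_variances(x, y, z):
    """Covariance-notation triple collocation.

    Assumes the three products have mutually uncorrelated errors that are
    also uncorrelated with the (unknown) truth.
    Returns the estimated error variance of x, y and z.
    """
    c_xy, c_xz, c_yz = covariance(x, y), covariance(x, z), covariance(y, z)
    var_x = covariance(x, x) - c_xy * c_xz / c_yz
    var_y = covariance(y, y) - c_xy * c_yz / c_xz
    var_z = covariance(z, z) - c_xz * c_yz / c_xy
    return var_x, var_y, var_z


# Synthetic example: a "truth" signal plus three mutually orthogonal,
# zero-mean error patterns scaled by 0.5, 0.2 and 0.1.
truth = [1, 1, 1, 1, -1, -1, -1, -1]
e1 = [1, 1, -1, -1, 1, 1, -1, -1]
e2 = [1, -1, 1, -1, 1, -1, 1, -1]
e3 = [1, -1, -1, 1, 1, -1, -1, 1]
x = [t + 0.5 * e for t, e in zip(truth, e1)]
y = [t + 0.2 * e for t, e in zip(truth, e2)]
z = [t + 0.1 * e for t, e in zip(truth, e3)]

# Error variances come out at approximately 0.25, 0.04 and 0.01.
print(tc_error_variances(x, y, z))
```

Because no ground reference enters the calculation, the estimate depends entirely on the error-independence assumption; products that share a retrieval algorithm or input data would violate it.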
Anomaly detection is an important research area in a diverse range of real-world applications. Although many algorithms have been proposed to address anomaly detection for numerical datasets, categorical and mixed datasets remain a significant challenge, primarily because a natural distance metric is lacking. Consequently, the methods proposed in the literature implement entirely different assumptions regarding the definition of categorical anomalies. This paper presents a novel categorical anomaly detection approach, offering two key contributions to existing methods. First, a novel surprisal-based anomaly score is introduced, which provides a more accurate assessment of anomalies by considering the full distribution of categorical values. Second, the proposed method considers complex correlations in the data beyond the pairwise interactions of features. This study proposed and tested the novel categorical surprisal anomaly detection algorithm (CSAD) by comparing and evaluating it against six competitors. The experimental results indicate that CSAD produced the best overall performance, achieving the highest average ROC-AUC and PR-AUC values of 0.8 and 0.443, respectively. Furthermore, CSAD's execution time is satisfactory even when processing large, high-dimensional datasets.
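The idea of a surprisal-based score in the abstract above can be illustrated with a deliberately simplified sketch. This is not the CSAD algorithm: it treats features as independent, whereas the abstract states that CSAD models correlations beyond pairwise interactions. The function name and the additive smoothing parameter are assumptions made for illustration.

```python
import math
from collections import Counter


def surprisal_scores(rows, smoothing=1.0):
    """Score each row by the summed surprisal (-log2 p) of its values.

    Simplifying assumption: features are independent, so a row's score is
    the sum of per-feature surprisals. Probabilities are estimated from
    value frequencies with additive smoothing.
    """
    n = len(rows)
    n_features = len(rows[0])
    # One frequency table per feature (column).
    counts = [Counter(row[j] for row in rows) for j in range(n_features)]
    scores = []
    for row in rows:
        score = 0.0
        for j, value in enumerate(row):
            n_categories = len(counts[j])
            p = (counts[j][value] + smoothing) / (n + smoothing * n_categories)
            score += -math.log2(p)
        scores.append(score)
    return scores


# Nine common records and one rare combination of values.
data = [("a", "x")] * 9 + [("b", "y")]
scores = surprisal_scores(data)
print(scores.index(max(scores)))  # index of the most surprising row
```

Rare values receive low probability and therefore high surprisal, so the last record gets the highest score in this toy dataset.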
Improving early diagnosis of autism spectrum disorder (ASD) in children increasingly relies on predictive models that are reliable and accessible to non-experts. This study aims to develop such models using Python-based tools to improve ASD diagnosis in clinical settings. We performed exploratory data analysis to ensure data quality and identify key patterns in pediatric ASD data. We selected the categorical boosting (CatBoost) algorithm to effectively handle the large number of categorical variables. We used the PyCaret automated machine learning (AutoML) tool to make the models user-friendly for clinicians without extensive machine learning expertise. In addition, we applied Shapley additive explanations (SHAP), an explainable artificial intelligence (XAI) technique, to improve the interpretability of the models. Models developed using CatBoost and other AI algorithms showed high accuracy in diagnosing ASD in children. SHAP provided clear insights into the influence of each variable on diagnostic outcomes, making model decisions transparent and understandable to healthcare professionals. By integrating robust machine learning methods with user-friendly tools such as PyCaret and leveraging XAI techniques such as SHAP, this study contributes to the development of reliable, interpretable, and accessible diagnostic tools for ASD. These advances hold great promise for supporting informed decision-making in clinical settings, ultimately improving early identification and intervention strategies for ASD in the pediatric population. However, the study is limited by the dataset's demographic imbalance and the lack of external clinical validation, which should be addressed in future research.
The conventional fuzzy class-association method classifies new texts by repeatedly scanning the classifier, which is inefficient. To address this problem, a new approach to text categorization based on the FCR-tree (fuzzy classification rules tree) is proposed. The compactness of the FCR-tree saves significant space when storing a large set of rules in which many words are repeated. Unlike ordinary classification rules, fuzzy classification rules contain not only words but also the fuzzy sets corresponding to the frequencies with which the words appear in texts. Therefore, the construction and structure of an FCR-tree differ from those of a CR-tree. To reduce the difficulty of FCR-tree construction and rule retrieval, multiple k-FCR-trees are built. When classifying a new text, it is not necessary to search the paths of sub-trees led by words that do not appear in the text, which reduces the number of rules traversed. Experimental results show that the proposed approach clearly outperforms the conventional method in efficiency.
Recent developments in cognitive theories may provide a better interpretation, rather than merely a description, for studies of translation. This paper applies categorization and metaphor to the process of translating and to translators' psychology so as to produce a more powerful interpretation.
It is important to quantitatively analyze the effects of protecting important ecological spaces in China to ensure national ecological security. By considering changes in ecological land, this study examines the effects of protecting three types of important natural ecological spaces in China from 1980 to 2018. Moreover, comparing important ecological spaces with their surroundings reveals differences in the effects of protection between internal and external spaces, which can provide a scientific basis for the categorization and zoning of China's land. The results show the following: (1) In 2018, ecological land accounted for 92.64% of the important natural ecological spaces. This land had a good ecological background that reflects the developmental orientation of important ecological spaces. (2) From 1980 to 2018, the area of ecological land in important ecological spaces shrank, but the rate of reduction was lower than the national average, which shows the positive effect of regulating construction in natural ecological spaces. The restorative effects of ecological projects to convert farmland into forests and grasslands have been prominent. The expanded ecological land is mainly distributed in areas where such projects have been implemented, and the reduced area is concentrated in the grain-producing areas of the Northeast China Plain and the agricultural oases of Xinjiang. In the future, the government should focus on strengthening the management and control of these areas. (3) The area ratio of ecological land was the highest in national nature reserves. The rate of reduction in its area was the lowest and the trend of reduction was the smallest in national nature reserves, which reflects differences in the status of ecological protection among different spaces. (4) The ratio of ecological land in important ecological spaces was higher than that in the surrounding external space, and its rate of reduction was lower. Thus, the effects of internal and external protection showed a clear gradient.
Support vector machines (SVMs) are a popular class of supervised learning algorithms and are particularly applicable to large and high-dimensional classification problems. Like most machine learning methods for data classification and information retrieval, they require manually labeled data samples in the training stage. However, manual labeling is a time-consuming and error-prone task. One possible solution to this issue is to exploit the large number of unlabeled samples that are easily accessible via the internet. This paper presents a novel active learning method for text categorization. The main objective of active learning is to reduce the labeling effort, without compromising the accuracy of classification, by intelligently selecting which samples should be labeled. The proposed method selects a batch of informative samples using the posterior probabilities provided by a set of multi-class SVM classifiers, and these samples are then manually labeled by an expert. Experimental results indicate that the proposed active learning method significantly reduces the labeling effort while simultaneously enhancing the classification accuracy.
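The batch-selection step in the abstract above can be sketched as follows. The abstract does not state which informativeness criterion the authors use, so this sketch substitutes margin sampling, a common choice: it picks the samples whose two highest posterior probabilities are closest. The function name and the example posteriors are hypothetical; in practice the posteriors would come from the trained classifiers.

```python
def select_batch(posteriors, batch_size):
    """Return indices of the most ambiguous unlabeled samples.

    posteriors: one list of class probabilities per unlabeled sample.
    Informativeness is measured by the margin between the two largest
    probabilities (assumed criterion); a small margin means the
    classifier is undecided, so the sample is worth labeling.
    """
    def margin(probs):
        first, second = sorted(probs, reverse=True)[:2]
        return first - second

    ranked = sorted(range(len(posteriors)), key=lambda i: margin(posteriors[i]))
    return ranked[:batch_size]


# Hypothetical posteriors for four unlabeled documents and three classes.
unlabeled_posteriors = [
    [0.90, 0.05, 0.05],  # confident prediction
    [0.60, 0.30, 0.10],
    [0.50, 0.45, 0.05],  # two classes nearly tied
    [0.34, 0.33, 0.33],  # almost uniform
]
print(select_batch(unlabeled_posteriors, 2))  # indices to send to the expert
```

The selected samples would then be labeled by the expert, added to the training set, and the classifiers retrained before the next round.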
Pseudogenes are genomic remnants of ancient protein-coding genes that have lost their coding potential through evolution. Although widespread, pseudogenes were long considered junk or relics of genomes and did not draw much attention from biologists until recent years. With the broad application of high-throughput experimental techniques, growing lines of evidence strongly suggest that some pseudogenes possess special functions, including regulating parental gene expression and participating in the regulation of many biological processes. In this review, we summarize some basic features of pseudogenes and their functions in regulating development and diseases. All of these observations indicate that pseudogenes are not purely dead fossils of genomes but warrant further exploration of their distribution, expression regulation, and functions. A new nomenclature is desirable for what are currently called "pseudogenes" to better describe their functions.
This paper proposes a new approach to feature selection for text categorization (TC) based on a measure of independence between features. A fundamental hypothesis widely used in probabilistic models for TC, that the occurrences of terms in documents are independent of each other, is discussed. However, this basic hypothesis is incomplete with respect to the independence of the feature set. From the viewpoint of feature selection, a new measure of independence between features is designed, and a feature selection algorithm based on it is given to obtain a feature subset. The selected subset is highly relevant to the categories and strongly independent among its features, and so satisfies the basic hypothesis to the greatest possible degree. Compared with traditional feature selection methods in TC, which take only relevance into account, the feature subset selected by our method performs better in experiments on the 20 Newsgroups benchmark dataset.
This paper summarizes several automatic text categorization algorithms in common use and analyzes and compares their advantages and disadvantages. It provides guidance for choosing appropriate automatic classification algorithms in different fields. Finally, some evaluations and summaries of these algorithms are discussed, and directions for further research are pointed out. Key words: text categorization; naive Bayes; KNN; SVM; neural network. CLC number: TP 391. Foundation item: Supported by the National Natural Science Foundation of China (70031010) and the Research Foundation of Beijing Institute of Technology. Biography: SHI Yong-feng (1980-), male, Master's candidate; research direction: web information mining.
As information technology develops rapidly, many network applications appear whose communication protocols are unknown. Although many protocol reverse engineering methods based on protocol keyword recognition have been proposed, most keyword recognition algorithms are time-consuming. This paper first uses the traffic clustering method F-DBSCAN to cluster the unknown protocol traffic. Then an improved CFSM (Closed Frequent Sequence Mining) algorithm is used to mine closed frequent sequences from the messages and identify protocol keywords. Finally, the CFGM (Closed Frequent Group Mining) algorithm is proposed to explore the parallel, sequential, and hierarchical relations between the protocol keywords and obtain accurate protocol message formats. Experimental results show that the proposed protocol format extraction method is better than the Apriori algorithm and the sequence alignment algorithm in terms of time complexity and achieves high keyword recognition accuracy. Additionally, based on the relations between the keywords, the method can obtain accurate protocol formats. Compared with the protocol formats obtained by existing methods, our protocol formats better capture the overall structure of target protocols, and the results perform better in protocol reverse engineering applications such as fuzz testing.
BACKGROUND: According to the latest American Joint Committee on Cancer and Union for International Cancer Control manuals, cystic duct cancer (CC) is categorized as a type of gallbladder cancer (GC), which has the worst prognosis among all types of biliary cancers. We hypothesized that this categorization could be verified by using taxonomic methods. AIM: To investigate the categorization of CC based on population-level data. METHODS: Cases of biliary cancers were identified from the Surveillance, Epidemiology, and End Results 18 registries database. Together with routinely used statistical methods, three taxonomic methods, including Fisher's discriminant, binary logistic, and artificial neural network (ANN) models, were used to clarify the categorization of CC. RESULTS: The T staging system of perihilar cholangiocarcinoma [a type of extrahepatic cholangiocarcinoma (EC)] discriminated CC prognosis better than that of GC. After adjusting for other covariates, the hazard ratio of CC tended to be closer to that of EC, although this did not reach statistical significance. To differentiate EC from GC, three taxonomic models were built, and all showed good accuracy. The ANN model had an area under the receiver operating characteristic curve of 0.902. Using the three models, the majority (75.0%-77.8%) of CC cases were categorized as EC. CONCLUSION: Our study suggests that CC should be categorized as a type of EC, not GC. An aggressive surgical approach might be considered in CC cases to see whether long-term prognosis could be substantially improved, as it has been in EC.
Given the importance of analyzing public opinion on the Internet, the authors propose a high-performance extraction method for public opinion. In this method, a space model for classification is adopted to describe the relationship between words and categories. A combined feature selection method is used to remove noisy words from the original feature space effectively. The category weight of each word is then calculated by an improved formula that combines the frequency and the distribution of words. Finally, the class weights of uncategorized documents are obtained from the category weights of their words to realize opinion extraction. Experimental results show that the method has comparatively high classification performance and good stability.
The scientific evidence that the climate is changing due to greenhouse gas emissions is now incontestable, and this change may put many social, biological, and geophysical systems in the world at risk. In this paper, we first identified the main risks induced or aggravated by climate change. We then categorized them by applying a new risk categorization system put forward by Renn within the framework of the International Risk Governance Council. We proposed that "uncertainty" could be treated as the classification criterion. On this basis, we established a quantitative method using fuzzy set theory, in which "confidence" and "likelihood", the main quantitative terms for expressing uncertainties in the IPCC, were used as the feature parameters to construct the fuzzy membership functions of four risk types. According to the maximum principle, most of the climate change risks identified were classified into the appropriate risk types. Meanwhile, given that not all the quantitative terms are available, a qualitative approach was also adopted as a complementary classification method. Finally, we obtained preliminary results of climate change risk categorization, which might lay the foundation for future integrated risk management of climate change.
This paper provides a brief introduction to the methods for generating fuzzy categorical maps from remotely sensed images (in graphical and digital forms). This is followed by a description of the slicing process for deriving fuzzy boundaries from fuzzy categorical maps, which can be based on the maximum fuzzy membership values, the confusion index, or a measure of entropy. Results from an empirical test performed in an Edinburgh suburb show that fuzzy boundaries of land cover can be derived from aerial photographs and satellite images by using the three criteria with small differences, and that slicing based on the maximum fuzzy membership values is the easiest and most straightforward solution. This, in turn, implies the suitability of maintaining both a crisp classification and its underlying certainty map for deriving fuzzy boundaries at different thresholds, which is a flexible and compact way to manage categorical map data and their uncertainty.
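The three slicing criteria named in the abstract above can be sketched per pixel as follows. The exact definitions used in the paper are not given in the abstract, so the formulas here are assumptions: the confusion index is taken as one minus the gap between the two largest memberships, and entropy is normalized by the logarithm of the number of classes. All function names and the 0.6 threshold are hypothetical.

```python
import math


def max_membership(memberships):
    """Largest fuzzy membership value of a pixel (its certainty)."""
    return max(memberships)


def confusion_index(memberships):
    """Assumed definition: 1 - (largest - second-largest membership)."""
    first, second = sorted(memberships, reverse=True)[:2]
    return 1.0 - (first - second)


def normalized_entropy(memberships):
    """Shannon entropy of the memberships, scaled to the range [0, 1]."""
    h = -sum(m * math.log(m) for m in memberships if m > 0)
    return h / math.log(len(memberships))


def is_boundary(memberships, threshold=0.6):
    """Slice on maximum membership: low certainty marks a fuzzy boundary."""
    return max_membership(memberships) < threshold


core_pixel = [0.90, 0.05, 0.05]   # clearly one land-cover class
edge_pixel = [0.45, 0.45, 0.10]   # two classes equally plausible

for pixel in (core_pixel, edge_pixel):
    print(is_boundary(pixel), confusion_index(pixel), normalized_entropy(pixel))
```

Changing the threshold changes the width of the boundary zone, which is why keeping the crisp class together with its maximum membership value is enough to re-derive boundaries at any threshold.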
Funding (CCFP forest ecological benefit study): Hebei Provincial Science & Technology Supporting Program (No. 15227652D); guided by the Observation Methodology for Long-term Forest Ecosystem Research, National Standard of the People's Republic of China (GB/T 33027-2016).
Funding (Mahuang Decoction and Maxing Shigan Decoction study): Supported by the Major State Basic Research Development Program of China (973 Program, No. 2007CB512607), the National Science Fund for Distinguished Young Scholars (No. 30625042), and the National Science and Technology Major Project of the Ministry of Science and Technology of China (No. 2009ZX10005-017).
Funding (EEG/MEG body silhouette study): Supported by the National Natural Science Foundation of China (32230043 and 32371092), the Ministry of Science and Technology of China (2019YFA0707103), Das Chinesisch-Deutsche Zentrum für Wissenschaftsförderung (M-0093), and the High-performance Computing Platform of Peking University.
Funding: National Key R&D Program of China, No. 2022YFC3202802; National Natural Science Foundation of China, No. 52009081, No. 52121006, No. 52279071; Special Funded Project for Basic Scientific Research Operation Expenses of the Central Public Welfare Scientific Research Institutes of China, No. Y524017.
Abstract: The sparsity of ground gauges poses a significant challenge for evaluating and merging satellite-based and reanalysis-based precipitation datasets in lake regions. While the standard triple collocation (TC) method offers a solution that does not require ground-based observations, it fails to address rain/no-rain classification, and its suitability for assessing and merging lake precipitation has not been explored. This study combines categorical triple collocation (CTC) with standard TC to create an integrated framework (CTC-TC) tailored to evaluate and merge global gridded precipitation products (GPPs). We assess the efficacy of CTC-TC using six GPPs (ERA5-Land, SM2RAIN-ASCAT, IMERG-Early, IMERG-Late, GSMaPMVK, and PERSIANN-CCS) across the five largest freshwater lakes in China. CTC-TC effectively captures the spatial patterns of metrics for all GPPs and precisely estimates the correlation coefficient and root mean square error for satellite-based datasets apart from SM2RAIN-ASCAT, but overestimates the classification accuracy indicator V for all GPPs. Regarding multi-source fusion, CTC-TC leverages the strengths of the individual products in each triplet, resulting in significant improvements in the critical success index (CSI) by over 11.9% and the modified Kling-Gupta efficiency (KGE') by more than 13.3%. Compared to baseline models, including standard TC, simple model averaging, one outlier removal, and Bayesian model averaging, CTC-TC achieves gains in CSI and KGE' of no less than 24.7% and 3.6%, respectively. In conclusion, the CTC-TC framework offers a thorough evaluation and efficient fusion of GPPs, addressing both categorical and continuous accuracy in data-scarce regions such as lakes.
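As a rough illustration of the standard TC step named in the abstract above, the sketch below estimates the error variance of each of three collocated products from their sample covariances alone, with no ground truth. It uses the covariance-notation form of TC and assumes that the three products are linearly related to the unknown truth and that their errors are mutually independent. The function names and the synthetic data are hypothetical; the categorical (CTC) part and the merging step of the paper's framework are not reproduced here.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)


def tc_error_variances(x, y, z):
    """Covariance-notation triple collocation.

    Assumption: errors of x, y, z are independent of each other and of
    the truth, and each product is a linear function of the truth.
    Returns the estimated error variance of x, y and z.
    """
    q_xx, q_yy, q_zz = covariance(x, x), covariance(y, y), covariance(z, z)
    q_xy, q_xz, q_yz = covariance(x, y), covariance(x, z), covariance(y, z)
    return (
        q_xx - q_xy * q_xz / q_yz,
        q_yy - q_xy * q_yz / q_xz,
        q_zz - q_xz * q_yz / q_xy,
    )


def make_synthetic_triplet(n=20000, seed=0):
    """Hypothetical test data: one hidden truth, three noisy products."""
    rng = random.Random(seed)
    truth = [rng.gauss(0.0, 2.0) for _ in range(n)]
    x = [t + rng.gauss(0.0, 0.5) for t in truth]        # error variance 0.25
    y = [0.8 * t + rng.gauss(0.0, 1.0) for t in truth]  # error variance 1.00
    z = [1.2 * t + rng.gauss(0.0, 1.5) for t in truth]  # error variance 2.25
    return x, y, z


if __name__ == "__main__":
    x, y, z = make_synthetic_triplet()
    print(tc_error_variances(x, y, z))
```

Because the estimates depend only on covariances among the three products, the same computation can be applied per grid cell where no gauge exists, which is the property that makes TC attractive in data-scarce regions.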
Abstract: Anomaly detection is an important research area in a diverse range of real-world applications. Although many algorithms have been proposed to address anomaly detection for numerical datasets, categorical and mixed datasets remain a significant challenge, primarily because a natural distance metric is lacking. Consequently, the methods proposed in the literature implement entirely different assumptions regarding the definition of categorical anomalies. This paper presents a novel categorical anomaly detection approach, offering two key contributions to existing methods. First, a novel surprisal-based anomaly score is introduced, which provides a more accurate assessment of anomalies by considering the full distribution of categorical values. Second, the proposed method considers complex correlations in the data beyond the pairwise interactions of features. This study proposed and tested the novel categorical surprisal anomaly detection algorithm (CSAD) by comparing and evaluating it against six competitors. The experimental results indicate that CSAD produced the best overall performance, achieving the highest average ROC-AUC and PR-AUC values of 0.8 and 0.443, respectively. Furthermore, CSAD's execution time is satisfactory even when processing large, high-dimensional datasets.
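To make the idea of a surprisal-based score concrete, the sketch below assigns each record the sum of the surprisals, -log2 p(value), of its categorical values, where p is the empirical frequency of that value in its column. This is a deliberately simplified stand-in that treats features as independent; it is not the CSAD algorithm, which according to the abstract also models correlations beyond pairwise interactions. The function name and the toy records are hypothetical.

```python
import math
from collections import Counter


def surprisal_scores(rows):
    """Score each row by the summed surprisal of its categorical values.

    Assumption: features are scored independently, using the empirical
    frequency of each value in its column. Rare values yield high scores.
    """
    n = len(rows)
    n_features = len(rows[0])
    # One frequency table per column.
    counts = [Counter(row[j] for row in rows) for j in range(n_features)]
    return [
        sum(-math.log2(counts[j][row[j]] / n) for j in range(n_features))
        for row in rows
    ]


if __name__ == "__main__":
    # Nine identical records and one record made of rare values.
    data = [("red", "small")] * 9 + [("blue", "large")]
    for row, score in zip(data, surprisal_scores(data)):
        print(row, round(score, 3))
```

Using the full empirical distribution, rather than a fixed rarity cut-off, means that a value seen in 10% of records contributes more to the score than one seen in 40%, without any distance metric being defined.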
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. RS-2023-00218176) and the Soonchunhyang University Research Fund.
Abstract: Improving early diagnosis of autism spectrum disorder (ASD) in children increasingly relies on predictive models that are reliable and accessible to non-experts. This study aims to develop such models using Python-based tools to improve ASD diagnosis in clinical settings. We performed exploratory data analysis to ensure data quality and identify key patterns in pediatric ASD data. We selected the categorical boosting (CatBoost) algorithm to effectively handle the large number of categorical variables. We used the PyCaret automated machine learning (AutoML) tool to make the models user-friendly for clinicians without extensive machine learning expertise. In addition, we applied Shapley additive explanations (SHAP), an explainable artificial intelligence (XAI) technique, to improve the interpretability of the models. Models developed using CatBoost and other AI algorithms showed high accuracy in diagnosing ASD in children. SHAP provided clear insights into the influence of each variable on diagnostic outcomes, making model decisions transparent and understandable to healthcare professionals. By integrating robust machine learning methods with user-friendly tools such as PyCaret and leveraging XAI techniques such as SHAP, this study contributes to the development of reliable, interpretable, and accessible diagnostic tools for ASD. These advances hold great promise for supporting informed decision-making in clinical settings, ultimately improving early identification and intervention strategies for ASD in the pediatric population. However, the study is limited by the dataset's demographic imbalance and the lack of external clinical validation, which should be addressed in future research.
Funding: The National Natural Science Foundation of China (No. 60473045), the Technology Research Project of Hebei Province (No. 05213573), and the Research Plan of Education Office of Hebei Province (No. 2004406).
Abstract: To address the low efficiency of the conventional fuzzy class-association method, which repeatedly scans the classifier to classify new texts, a new approach to text categorization based on the FCR-tree (fuzzy classification rules tree) is proposed. The compactness of the FCR-tree saves significant space in storing a large set of rules when there are many repeated words in the rules. In comparison with classification rules, the fuzzy classification rules contain not only words but also the fuzzy sets corresponding to the frequencies of words appearing in texts. Therefore, the construction of an FCR-tree and its structure differ from those of a CR-tree. To reduce the difficulty of FCR-tree construction and rule retrieval, multiple k-FCR-trees are built. When classifying a new text, it is not necessary to search the paths of the sub-trees led by words that do not appear in the text, which reduces the number of rules traversed. Experimental results show that the proposed approach clearly outperforms the conventional method in efficiency.
Abstract: Recent developments in cognitive theories may provide a better interpretation, rather than a mere description, for studies of translation. The paper tries to apply categorization and metaphor to the process of translating and to translators' psychology so as to produce a more powerful interpretation.
Funding: National Key Research and Development Program of China, No. 2017YFC0506506, No. 2016YFC0500206.
Abstract: It is important to quantitatively analyze the effects of protecting important ecological spaces in China to ensure national ecological security. By considering changes in ecological land, this study examines the effects of protecting three types of important natural ecological spaces in China from 1980 to 2018. Moreover, comparing important ecological spaces with their surroundings reveals differences in the effects of protection between internal and external spaces, which can provide a scientific basis for the categorization and zoning of China's land. The results show the following: (1) In 2018, ecological land accounted for 92.64% of important natural ecological spaces. This land had a good ecological background that reflects the developmental orientation of important ecological spaces. (2) From 1980 to 2018, the area of ecological land in important ecological spaces shrank, but the rate of reduction was lower than the national average, which shows the positive effect of regulating construction in natural ecological spaces. The restorative effects of ecological projects to convert farmland into forests and grasslands have been prominent. The expanded ecological land is mainly distributed in areas where such projects have been implemented, and the reduced area is concentrated in grain-producing areas of the Northeast China Plain and agricultural oases of Xinjiang. In the future, the government should focus on strengthening the management and control of these areas. (3) The area ratio of ecological land was the highest in national nature reserves. The rate of reduction in its area was the lowest and the trend of reduction was the smallest in national nature reserves, which reflects differences in the status of ecological protection among different spaces. (4) The ratio of ecological land in important ecological spaces was higher than that in the surrounding external space, and its rate of reduction was lower. Thus, the effects of internal and external protection had clear differences in terms of gradient.
Abstract: Support vector machines (SVMs) are a popular class of supervised learning algorithms, and are particularly applicable to large and high-dimensional classification problems. Like most machine learning methods for data classification and information retrieval, they require manually labeled data samples in the training stage. However, manual labeling is a time-consuming and error-prone task. One possible solution to this issue is to exploit the large number of unlabeled samples that are easily accessible via the internet. This paper presents a novel active learning method for text categorization. The main objective of active learning is to reduce the labeling effort, without compromising the accuracy of classification, by intelligently selecting which samples should be labeled. The proposed method selects a batch of informative samples using the posterior probabilities provided by a set of multi-class SVM classifiers, and these samples are then manually labeled by an expert. Experimental results indicate that the proposed active learning method significantly reduces the labeling effort, while simultaneously enhancing the classification accuracy.
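The batch-selection step described in the abstract above can be sketched as follows. The abstract does not state which informativeness criterion the authors use, so this example assumes a common one: pick the unlabeled samples whose two highest class posteriors are closest (smallest margin). The function name and the posterior values are hypothetical, and in practice the posteriors would come from the trained multi-class classifiers rather than being typed in.

```python
def select_batch(posteriors, batch_size):
    """Return indices of the most ambiguous unlabeled samples.

    posteriors: list of per-sample class-probability lists.
    Assumption: informativeness is measured by the margin between the
    two largest posteriors; a small margin means the classifier is unsure.
    """
    def margin(probs):
        first, second = sorted(probs, reverse=True)[:2]
        return first - second

    ranked = sorted(range(len(posteriors)), key=lambda i: margin(posteriors[i]))
    return ranked[:batch_size]


if __name__ == "__main__":
    unlabeled_posteriors = [
        [0.90, 0.05, 0.05],  # confident prediction, low value for labeling
        [0.50, 0.30, 0.20],
        [0.45, 0.40, 0.15],  # two classes nearly tied
        [0.34, 0.33, 0.33],  # almost uniform
    ]
    # These indices would be handed to the expert for manual labeling.
    print(select_batch(unlabeled_posteriors, batch_size=2))
```

Selecting a batch rather than a single sample per round reduces the number of retraining cycles, at the cost of possibly choosing samples that are informative in the same way.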
Funding: Supported by grants from the Ministry of Science and Technology of China (No. 2011CBA01101) to X.-J.W. and from the Chinese Academy of Sciences (Nos. XDA01020105, KSCX2-EW-R-01-03 and 2010-Biols-CAS0303) to X.-J.W.
Abstract: Pseudogenes are genomic remnants of ancient protein-coding genes that have lost their coding potential through evolution. Although widespread, pseudogenes used to be considered junk or relics of genomes and did not draw much attention from biologists until recent years. With the broad application of high-throughput experimental techniques, growing lines of evidence strongly suggest that some pseudogenes possess special functions, including regulating parental gene expression and participating in the regulation of many biological processes. In this review, we summarize some basic features of pseudogenes and their functions in regulating development and diseases. All of these observations indicate that pseudogenes are not purely dead fossils of genomes, but warrant further exploration of their distribution, expression regulation and functions. A new nomenclature is desirable for what are currently called 'pseudogenes' to better describe their functions.
Funding: Supported by the National Natural Science Foundation of China (60373066, 60503020), the Outstanding Young Scientist's Fund (60425206), and the Doctor Foundation of Nanjing University of Posts and Telecommunications (2003-02).
Abstract: This paper proposes a new approach to feature selection for text categorization based on a measure of independence between features. A fundamental hypothesis widely used in probabilistic models for text categorization (TC), that the occurrences of terms in documents are independent of each other, is discussed. However, this basic hypothesis is incomplete with respect to the independence of the feature set. From the viewpoint of feature selection, a new measure of independence between features is designed, and a feature selection algorithm based on it is given to obtain a feature subset. The selected subset is highly relevant to the category and strongly independent across features, so it satisfies the basic hypothesis to the greatest possible degree. In experiments on the 20 Newsgroups benchmark dataset, the feature subset selected by our method performs better than those selected by traditional feature selection methods in TC, which take only relevance into account.
Abstract: This paper summarizes several automatic text categorization algorithms in common use and analyzes and compares their advantages and disadvantages. It provides guidance on choosing appropriate automatic classification algorithms in different fields. Finally, some evaluations and summaries of these algorithms are discussed, and directions for further research are pointed out. Key words: text categorization; naive Bayes; KNN; SVM; neural network. CLC number: TP 391. Foundation item: Supported by the National Natural Science Foundation of China (70031010) and the Research Foundation of Beijing Institute of Technology. Biography: SHI Yong-feng (1980-), male, Master candidate, research direction: web information mining.
Funding: Supported by the National Key R&D Subsidized Project (No. 2017YFB0802900).
Abstract: As information technology develops rapidly, many network applications appear whose communication protocols are unknown. Although many protocol reverse engineering methods based on protocol keyword recognition have been proposed, most of the keyword recognition algorithms are time-consuming. This paper first uses the traffic clustering method F-DBSCAN to cluster the unknown protocol traffic. Then an improved CFSM (Closed Frequent Sequence Mining) algorithm is used to mine closed frequent sequences from the messages and identify protocol keywords. Finally, the CFGM (Closed Frequent Group Mining) algorithm is proposed to explore the parallel, sequential and hierarchical relations between the protocol keywords and obtain accurate protocol message formats. Experimental results show that the proposed protocol format extraction method is better than the Apriori algorithm and the sequence alignment algorithm in terms of time complexity, and that it can achieve high keyword recognition accuracy. Additionally, based on the relations between the keywords, the method can obtain accurate protocol formats. Compared with the protocol formats obtained from existing methods, our protocol formats better capture the overall structure of target protocols, and the results perform better in applications of protocol reverse engineering such as fuzz testing.
Funding: Supported by Zhejiang Provincial Natural Science Foundation of China, No. LQ17H030003.
Abstract: BACKGROUND: According to the latest American Joint Committee on Cancer and Union for International Cancer Control manuals, cystic duct cancer (CC) is categorized as a type of gallbladder cancer (GC), which has the worst prognosis among all types of biliary cancers. We hypothesized that this categorization could be verified by using taxonomic methods. AIM: To investigate the categorization of CC based on population-level data. METHODS: Cases of biliary cancers were identified from the Surveillance, Epidemiology, and End Results 18 registries database. Together with routinely used statistical methods, three taxonomic methods, including Fisher's discriminant, binary logistic, and artificial neural network (ANN) models, were used to clarify the categorization of CC. RESULTS: The T staging system of perihilar cholangiocarcinoma [a type of extrahepatic cholangiocarcinoma (EC)] discriminated CC prognosis better than that of GC. After adjusting for other covariates, the hazard ratio of CC tended to be closer to that of EC, although this did not reach statistical significance. To differentiate EC from GC, three taxonomic models were built, and all showed good accuracy. The ANN model had an area under the receiver operating characteristic curve of 0.902. Using the three models, the majority (75.0%-77.8%) of CC cases were categorized as EC. CONCLUSION: Our study suggested that CC should be categorized as a type of EC, not GC. An aggressive surgical approach might be considered in CC cases, to see whether long-term prognosis could be substantially improved, as it has been in EC.
Funding: Supported by the National High Technology Research and Development Program of China (2005AA147030).
Abstract: Given the importance of analyzing public opinion on the Internet, the authors propose a high-performance extraction method for public opinion. In this method, a space model for classification is adopted to describe the relationship between words and categories. A combined feature selection method is used to remove noisy words from the original feature space effectively. The category weight of each word is then calculated by an improved formula that combines the frequency and the distribution of words. Finally, the class weights of the uncategorized documents are obtained from the category weights of words to realize opinion extraction. Experimental results show that the method has comparatively high classification performance and good stability.
Funding: Under the auspices of the National Science & Technology Pillar Program during the 11th Five-Year Plan Period (No. 2006BAD20B05).
Abstract: The scientific evidence that climate is changing due to greenhouse gas emissions is now incontestable, and this change may put many social, biological, and geophysical systems in the world at risk. In this paper, we first identified the main risks induced or aggravated by climate change. We then categorized them by applying a new risk categorization system put forward by Renn within the framework of the International Risk Governance Council. We proposed that "uncertainty" could be treated as the classification criterion. Based on this, we established a quantitative method using fuzzy set theory, in which "confidence" and "likelihood", the main quantitative terms for expressing uncertainties in IPCC, were used as the feature parameters to construct the fuzzy membership functions of four risk types. According to the maximum principle, most of the climate change risks identified were classified into the appropriate risk types. Meanwhile, given that not all the quantitative terms are available, a qualitative approach was also adopted as a complementary classification method. Finally, we obtained preliminary results of climate change risk categorization, which might lay the foundation for the future integrated risk management of climate change.
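The classification step in the abstract above (fuzzy membership functions over "confidence" and "likelihood", followed by the maximum principle) can be illustrated with a small sketch. Everything specific in it is an assumption made for illustration: the triangular shape of the membership functions, their breakpoints, the use of the minimum to combine the two parameters, and the placeholder labels type_A to type_D. The abstract gives none of these details.

```python
def triangular(x, left, peak, right):
    """Triangular membership function: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets on a 0..1 scale.
LOW = (-0.5, 0.0, 1.0)
HIGH = (0.0, 1.0, 1.5)

# Hypothetical risk types, each defined by a (confidence, likelihood) pair.
RISK_TYPES = {
    "type_A": (HIGH, HIGH),
    "type_B": (HIGH, LOW),
    "type_C": (LOW, HIGH),
    "type_D": (LOW, LOW),
}


def classify(confidence, likelihood):
    """Assign the risk type with the largest membership degree."""
    memberships = {
        name: min(triangular(confidence, *conf_set),
                  triangular(likelihood, *like_set))
        for name, (conf_set, like_set) in RISK_TYPES.items()
    }
    best = max(memberships, key=memberships.get)
    return best, memberships


if __name__ == "__main__":
    print(classify(0.9, 0.9))
    print(classify(0.1, 0.2))
```

A risk whose largest membership is only marginally above the others is a natural candidate for the complementary qualitative assessment that the abstract mentions.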
Abstract: This paper provides a brief introduction to the methods for generating fuzzy categorical maps from remotely sensed images (in graphical and digital forms). This is followed by a description of the slicing process for deriving fuzzy boundaries from fuzzy categorical maps, which can be based on the maximum fuzzy membership values, the confusion index, or a measure of entropy. Results from an empirical test performed in an Edinburgh suburb show that fuzzy boundaries of land cover can be derived from aerial photographs and satellite images by using the three criteria, with small differences among them, and that slicing based on the maximum fuzzy membership values is the easiest and most straightforward solution. This, in turn, implies the suitability of maintaining both a crisp classification and its underlying certainty map for deriving fuzzy boundaries at different thresholds, which is a flexible and compact way to manage categorical map data and their uncertainty.
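The three slicing criteria named in the abstract above can be sketched for a grid of per-pixel fuzzy membership vectors. The definitions used here are common ones and are assumptions with respect to this paper: the confusion index is taken as one minus the difference between the two largest memberships, and entropy is normalised by the logarithm of the number of classes. The function names, the threshold, and the toy grid are hypothetical.

```python
import math


def confusion_index(mu):
    """1 - (largest membership - second largest); near 1 means ambiguous."""
    first, second = sorted(mu, reverse=True)[:2]
    return 1.0 - (first - second)


def normalized_entropy(mu):
    """Shannon entropy of the normalised memberships, scaled to [0, 1]."""
    total = sum(mu)
    probs = [m / total for m in mu if m > 0]
    return -sum(p * math.log(p) for p in probs) / math.log(len(mu))


def slice_by_max_membership(grid, threshold):
    """Flag a pixel as fuzzy boundary when its best class is below threshold."""
    return [[max(mu) < threshold for mu in row] for row in grid]


if __name__ == "__main__":
    # One row of three pixels with memberships in two land-cover classes.
    grid = [[(0.9, 0.1), (0.55, 0.45), (0.1, 0.9)]]
    print(slice_by_max_membership(grid, threshold=0.7))
    print([round(confusion_index(mu), 2) for mu in grid[0]])
    print([round(normalized_entropy(mu), 2) for mu in grid[0]])
```

Slicing on the maximum membership needs only the crisp class map and one certainty value per pixel, which is consistent with the abstract's remark that it is the most straightforward of the three criteria.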