China’s Conversion Cropland to Forest Program (CCFP) is one of the largest national ecological construction programs; it has effectively improved the ecological environment and produced large ecological benefits. To provide references for further improving the ecological benefits of the CCFP, we analyzed the features, differences, and relationships of the categorized forest ecological “benefit values” (B-Vs) for three forest restoration approaches in different CCFP regions, using data from the Chinese Forest Ecosystem Research Network (CFERN) from 1999 to 2013 and the methods of the national standards LY/T1606-2003, LY/T1721-2008, and LY/T1952-2011. The results showed that the annual B-V per unit area ranged from 3.5×10^4 to 10.0×10^4 RMB/(hm²·a). Water conservation and species conservation B-Vs were the two largest components of the total B-V, and the nutrient accumulation B-V was the smallest. The B-Vs were inconsistent among forest restoration approaches and regions. Ranked from high to low by average annual total B-V per unit area, the approaches were “hillside forest conservation”, “returning cropland to forest”, and “afforestation on suitable barren hills and wasteland”. Species conservation and water conservation B-Vs were higher in southern regions than in northern and northwestern regions of China, and hot, rainy regions could produce higher species conservation B-Vs. Regression analysis indicated that water conservation B-Vs had a significant positive correlation with the corresponding total B-Vs and a positive correlation with the corresponding atmosphere purification B-Vs at both the regional and the unit-area scale. The unit-area species conservation B-V was negatively correlated with the corresponding nutrient accumulation B-V, except for “afforestation on suitable barren hills and wasteland”. The regional total species conservation B-V had a significant negative correlation with the corresponding nutrient accumulation B-V, except for “hillside forest conservation”. We suggest that forest restoration approaches be selected according to regional characteristics, B-V features, and local ecological goals.
Don't be surprised if someone tells you that you can win a world championship by playing online games. Competitive online gaming, or eSports, has debuted as a demonstration sport at the 18th Asian Games held in Jakarta.
Objective: To explore the complex prescription compatibility law of the cold and hot nature of Mahuang Decoction (麻黄汤, MHD) and Maxing Shigan Decoction (麻杏石甘汤, MXSGD), two formulas of the same category but with different hot/cold natures. Methods: Oxygen consumption of mice was determined in three groups: MHD, MXSGD, and control. A cold-hot pad differentiating assay was used to observe the variability of temperature tropism among groups of mice treated with MHD, MXSGD, and their compositions. Total anti-oxidant capability (T-AOC) activity was also measured. Results: After administration of MHD, the mice showed increased oxygen consumption (P<0.01). Compared with the MHD group, the remaining rate of MXSGD mice on the hot pad was significantly increased in the cold-hot pad differentiating assay (P<0.05). There was no significant difference (P>0.05) among the remaining rates of the MXSGD group, the MXSGD with high-dose Gypsum Fibrosum (MXHGF) group, and the MXSGD with low-dose Gypsum Fibrosum (MXLGF) group. Compared with the MHD group, T-AOC activity in the Consensus Compositions group was significantly decreased (P=0.0494). Compared with the MXSGD group, T-AOC activity in the Gypsum Fibrosum (GF) group was significantly increased (P=0.0013). Conclusions: The difference in cold and hot nature between MHD (hot nature) and MXSGD (cold nature) could be represented objectively. The reason may be that Gypsum Fibrosum decreased the efficacy of the consensus compositions. However, increasing or decreasing the dose of Gypsum Fibrosum does not change the cold and hot nature of MXSGD.
Neural activities differentiating bodies versus non-body stimuli have been identified in the occipitotemporal cortex of both humans and nonhuman primates. However, the neural mechanisms of coding the similarity of different individuals’ bodies of the same species to support their categorical representations remain unclear. Using electroencephalography (EEG) and magnetoencephalography (MEG), we investigated the temporal and spatial characteristics of neural processes shared by different individual body silhouettes of the same species by quantifying the repetition suppression of neural responses to human and animal (chimpanzee, dog, and bird) body silhouettes showing different postures. Our EEG results revealed significant repetition suppression of the amplitudes of early frontal/central activity at 180–220 ms (P2) and late occipitoparietal activity at 220–320 ms (P270) in response to animal (but not human) body silhouettes of the same species. Our MEG results further localized the repetition suppression effect related to animal body silhouettes in the left supramarginal gyrus and left frontal cortex at 200–440 ms after stimulus onset. Our findings suggest two neural processes that are involved in spontaneous categorical representations of animal body silhouettes as a cognitive basis of human-animal interactions.
Anomaly detection is an important research area in a diverse range of real-world applications. Although many algorithms have been proposed for anomaly detection in numerical datasets, categorical and mixed datasets remain a significant challenge, primarily because a natural distance metric is lacking. Consequently, the methods proposed in the literature make entirely different assumptions about the definition of categorical anomalies. This paper presents a novel categorical anomaly detection approach that offers two key contributions to existing methods. First, a novel surprisal-based anomaly score is introduced, which provides a more accurate assessment of anomalies by considering the full distribution of categorical values. Second, the proposed method considers complex correlations in the data beyond the pairwise interactions of features. The study proposes the categorical surprisal anomaly detection algorithm (CSAD) and evaluates it against six competitors. The experimental results indicate that CSAD produced the best overall performance, achieving the highest average ROC-AUC and PR-AUC values of 0.8 and 0.443, respectively. Furthermore, CSAD's execution time is satisfactory even when processing large, high-dimensional datasets.
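The surprisal idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the CSAD algorithm: it assumes, as a simplification, that features are scored independently, whereas the abstract states that CSAD also considers correlations beyond pairwise interactions. The function name `surprisal_scores` and the toy rows are hypothetical.

```python
import math
from collections import Counter


def surprisal_scores(rows):
    """Score each row by the summed surprisal of its categorical values.

    Assumption: features are treated independently (a simplification,
    not the CSAD method described in the abstract).
    """
    n = len(rows)
    n_features = len(rows[0])
    # Empirical frequency of every value, counted per feature column.
    counts = [Counter(row[j] for row in rows) for j in range(n_features)]
    scores = []
    for row in rows:
        # Surprisal of a value is -log2 of its empirical probability.
        total = sum(-math.log2(counts[j][row[j]] / n) for j in range(n_features))
        scores.append(total)
    return scores


# Toy data: the last row uses values that are rare in both columns.
rows = [
    ("red", "small"),
    ("red", "small"),
    ("red", "small"),
    ("blue", "large"),
]
scores = surprisal_scores(rows)
```

Rows made of rare values receive higher scores, so ranking by score surfaces candidate anomalies without needing a distance metric.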
Improving early diagnosis of autism spectrum disorder (ASD) in children increasingly relies on predictive models that are reliable and accessible to non-experts. This study aims to develop such models using Python-based tools to improve ASD diagnosis in clinical settings. We performed exploratory data analysis to ensure data quality and identify key patterns in pediatric ASD data. We selected the categorical boosting (CatBoost) algorithm to effectively handle the large number of categorical variables. We used the PyCaret automated machine learning (AutoML) tool to make the models user-friendly for clinicians without extensive machine learning expertise. In addition, we applied Shapley additive explanations (SHAP), an explainable artificial intelligence (XAI) technique, to improve the interpretability of the models. Models developed using CatBoost and other AI algorithms showed high accuracy in diagnosing ASD in children. SHAP provided clear insights into the influence of each variable on diagnostic outcomes, making model decisions transparent and understandable to healthcare professionals. By integrating robust machine learning methods with user-friendly tools such as PyCaret and leveraging XAI techniques such as SHAP, this study contributes to the development of reliable, interpretable, and accessible diagnostic tools for ASD. These advances hold great promise for supporting informed decision-making in clinical settings, ultimately improving early identification and intervention strategies for ASD in the pediatric population. However, the study is limited by the dataset’s demographic imbalance and the lack of external clinical validation, which should be addressed in future research.
Congenital cataract (CC) is one of the most common causes of pediatric visual impairment. As our understanding of CC's etiology, clinical manifestations, and pathogenic genes deepens, various CC categorization systems based on different classification criteria have been proposed. Regrettably, the application of CC categories in clinical practice and scientific research is limited. It is challenging to obtain, from a specific CC classification, precise information that could guide timely treatment decisions for pediatric cataract patients or predict their prognosis. This review discusses the status quo of CC categorization systems, focusing on categorization principles and scientific application in clinical practice, and proposes potential directions for future research in this field.
This article presents an innovative approach to automatic rule discovery for data transformation tasks leveraging XGBoost, a machine learning algorithm renowned for its efficiency and performance. The proposed framework fuses diversified feature formats, specifically metadata, textual, and pattern features. The goal is to enhance the system’s ability to discern and generalize transformation rules from source to destination formats in varied contexts. First, the article describes the methodology for extracting these distinct features from raw data and the pre-processing steps undertaken to prepare the data for the model. Subsequent sections explain the mechanism of feature optimization using Recursive Feature Elimination (RFE) with linear regression, which aims to retain the most contributive features and eliminate redundant or less significant ones. The core of the research is the training of the XGBoost model on the prepared and optimized feature sets. The article presents a detailed overview of the mathematical model and algorithmic steps behind this procedure. Finally, the process of rule discovery (the prediction phase) by the trained XGBoost model is explained, underscoring its role in real-time, automated data transformations. By employing machine learning, and particularly the XGBoost model, in the context of Business Rule Engine (BRE) data transformation, the article underscores a paradigm shift towards more scalable, efficient, and less human-dependent data transformation systems. This research opens doors for further exploration into automated rule discovery systems and their applications in various sectors.
As digital technologies have advanced more rapidly, the number of paper documents converted into a digital format has increased exponentially. To respond to the urgent need to categorize the growing number of digitized documents, the real-time classification of digitized documents has been identified as the primary goal of our study. Document classification is the first stage in automating document control and efficient knowledge discovery with little or no human involvement. Artificial intelligence methods such as deep learning are now combined with segmentation to study and interpret document traits in ways that were not conceivable ten years ago. Deep learning aids in comprehending input patterns so that object classes may be predicted. The segmentation process divides the input image into separate segments for a more thorough image study. This study proposes a deep learning-enabled framework for automated document classification, which can be implemented in higher education. To this end, a dataset was developed that includes seven categories: Diplomas, Personal documents, Journal of Accounting of higher education diplomas, Service letters, Orders, Production orders, and Student orders. Subsequently, a deep learning model based on Conv2D layers is proposed for the document classification process. In the final part of this research, the proposed model is evaluated and compared with other machine learning techniques. The results demonstrate that the proposed deep learning model performs well in document categorization, overtaking the other machine learning models by reaching 94.84%, 94.79%, 94.62%, 94.43%, and 94.07% in accuracy, precision, recall, F-score, and AUC-ROC, respectively. The achieved results indicate that the proposed deep model is acceptable for use in practice as an assistant to an office worker.
Deep learning has achieved excellent results in various computer vision tasks, especially in fine-grained visual categorization, which aims to distinguish the subordinate categories of label-level categories. Due to high intra-class variance and high inter-class similarity, fine-grained visual categorization is extremely challenging. This paper first briefly introduces and analyzes the related public datasets. After that, some of the latest methods are reviewed. Based on the feature types, the feature processing methods, and the overall structure used in the model, we divide them into three types: methods based on a general convolutional neural network (CNN) and strong supervision of parts, methods based on single feature processing, and methods based on multiple feature processing. Most methods of the first type have a relatively simple structure and are the result of the initial research. The methods of the other two types include models with special structures and training processes, which help to obtain discriminative features. We conduct a specific analysis of several methods with high accuracy on public datasets. In addition, we argue that future research should focus on reducing the demand of existing methods for large amounts of data and computing power. In terms of technology, the extraction of subtle feature information with the burgeoning vision transformer (ViT) network is also an important research direction.
The school placement process of students from immigrant backgrounds considered to be in “difficulty” is an international concern at the intersection of work on special education and work on the school experiences of students from immigrant backgrounds or racialized groups. The research problem of this article concerns the identification of these students as disabled or as having adjustment or learning difficulties. From a perspective anchored in Disability Critical Race Studies, this ethnographic study documents different interpretations of perceived difficulties made by school actors with regard to seven primary school students from immigrant backgrounds. Five interpretation types are presented: (1) medicalization by dismissal of cultural markers, (2) medicalization by professional constraint, (3) medicalization by cultural deficit, (4) precautionary wait, and (5) cultural differentialism. Our results help to shed light on the overrepresentation of these students in special education and to understand how ableism and (neo)racism contribute to it.
To promote behavioral change among adolescents in Zambia, the National HIV/AIDS/STI/TB Council, in collaboration with UNICEF, developed the Zambia U-Report platform. This platform provides young people with improved access to information on various Sexual Reproductive Health topics through Short Messaging Service (SMS) messages. Over the years, the platform has accumulated millions of incoming and outgoing messages, which need to be categorized into key thematic areas for better tracking of sexual reproductive health knowledge gaps among young people. The current manual categorization of these text messages is inefficient and time-consuming, and this study aims to automate the process for improved analysis using text-mining techniques. First, the study investigates the current text message categorization process and identifies a list of categories adopted by counselors over time, which are then used to build and train a categorization model. Second, the study presents a proof-of-concept tool that automates the categorization of U-Report messages into key thematic areas using the developed categorization model. Finally, it compares the performance and effectiveness of the proof-of-concept tool against the manual system. The study used a dataset comprising 206,625 text messages. The current process would take roughly 2.82 years to categorize this dataset, whereas the trained SVM model would require only 6.4 minutes while achieving an accuracy of 70.4%, demonstrating that the automated method is significantly faster, more scalable, and more consistent than the current manual categorization. These advantages make the SVM model a more efficient and effective tool for categorizing large unstructured text datasets. These results and the proof-of-concept tool demonstrate the potential for enhancing the efficiency and accuracy of message categorization on the Zambia U-Report platform and other similar text message-based platforms.
The conventional fuzzy class-association method scans the classifier repeatedly to classify new texts, which is inefficient. To address this problem, a new approach to text categorization based on the FCR-tree (fuzzy classification rules tree) is proposed. The compactness of the FCR-tree saves significant space in storing a large set of rules when there are many repeated words in the rules. In comparison with classification rules, fuzzy classification rules contain not only words but also the fuzzy sets corresponding to the frequencies of words appearing in texts. Therefore, the construction and structure of an FCR-tree differ from those of a CR-tree. To reduce the difficulty of FCR-tree construction and rule retrieval, multiple k-FCR-trees are built. When classifying a new text, it is not necessary to search the paths of the sub-trees led by words that do not appear in the text, which reduces the number of rules traversed. Experimental results show that the proposed approach clearly outperforms the conventional method in efficiency.
It is important to quantitatively analyze the effects of protecting important ecological spaces in China to ensure national ecological security. By considering changes in ecological land, this study examines the effects of protecting three types of important natural ecological spaces in China from 1980 to 2018. Moreover, comparing important ecological spaces with their surroundings reveals differences in the effects of protection between internal and external spaces, which can provide a scientific basis for the categorization and zoning of China’s land. The results show the following: (1) In 2018, ecological land accounted for 92.64% of the area of important natural ecological spaces. This land had a good ecological background that reflects the developmental orientation of important ecological spaces. (2) From 1980 to 2018, the area of ecological land in important ecological spaces shrank, but the rate of reduction was lower than the national average, which shows the positive effect of regulating construction in natural ecological spaces. The restorative effects of ecological projects to convert farmland into forests and grasslands have been prominent. The expanded ecological land is mainly distributed in areas where such projects have been implemented, and the reduced area is concentrated in grain-producing areas of the Northeast China Plain and agricultural oases of Xinjiang. In the future, the government should focus on strengthening the management and control of these areas. (3) The area ratio of ecological land was the highest in national nature reserves. The rate of reduction in its area was the lowest and the trend of reduction was the smallest in national nature reserves, which reflects differences in the status of ecological protection among different spaces. (4) The ratio of ecological land in important ecological spaces was higher than that in the surrounding external space, and its rate of reduction was lower. Thus, the effects of internal and external protection differed clearly along a gradient.
Recent developments in cognitive theories may offer an interpretation of translation rather than merely a description of it. This paper applies categorization and metaphor to the process of translating and to translators’ psychology so as to produce a more powerful interpretation.
Support vector machines (SVMs) are a popular class of supervised learning algorithms and are particularly applicable to large and high-dimensional classification problems. Like most machine learning methods for data classification and information retrieval, they require manually labeled data samples in the training stage. However, manual labeling is a time-consuming and error-prone task. One possible solution to this issue is to exploit the large number of unlabeled samples that are easily accessible via the internet. This paper presents a novel active learning method for text categorization. The main objective of active learning is to reduce the labeling effort, without compromising classification accuracy, by intelligently selecting which samples should be labeled. The proposed method selects a batch of informative samples using the posterior probabilities provided by a set of multi-class SVM classifiers, and these samples are then manually labeled by an expert. Experimental results indicate that the proposed active learning method significantly reduces the labeling effort while simultaneously enhancing classification accuracy.
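Batch selection from posterior probabilities, as described in this abstract, could be sketched as follows. The abstract does not state the selection rule, so this sketch assumes margin sampling (smallest gap between the two most probable classes) as a stand-in; `select_batch` and the sample posteriors are hypothetical, and in practice the posteriors would come from the trained classifiers.

```python
def select_batch(posteriors, batch_size):
    """Return indices of the samples with the smallest top-two probability gap.

    Assumption: margin sampling is used as the informativeness criterion;
    the abstract does not specify the actual rule.
    """
    margins = []
    for index, probs in enumerate(posteriors):
        # The two largest class probabilities for this sample.
        best, second = sorted(probs, reverse=True)[:2]
        margins.append((best - second, index))
    # Small margin means the classifier is unsure between two classes.
    margins.sort()
    return [index for _, index in margins[:batch_size]]


# Hypothetical posteriors for four unlabeled texts over three classes.
posteriors = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.34, 0.33, 0.33],
    [0.70, 0.20, 0.10],
]
picked = select_batch(posteriors, 2)
```

The selected indices would then be handed to the expert for labeling, and the classifiers retrained on the enlarged labeled set.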
Pseudogenes are genomic remnants of ancient protein-coding genes that have lost their coding potential through evolution. Although widespread, pseudogenes were long regarded as junk or relics of genomes and did not draw much attention from biologists until recent years. With the broad application of high-throughput experimental techniques, growing lines of evidence strongly suggest that some pseudogenes possess special functions, including regulating parental gene expression and participating in the regulation of many biological processes. In this review, we summarize some basic features of pseudogenes and their functions in regulating development and diseases. These observations indicate that pseudogenes are not purely dead fossils of genomes and warrant further exploration of their distribution, expression regulation, and functions. A new nomenclature is desirable for what are currently called 'pseudogenes' to better describe their functions.
This paper proposes a new feature selection approach for text categorization (TC) based on a measure of independence between features. A fundamental hypothesis widely used in probabilistic models for TC, that the occurrences of terms in documents are independent of each other, is discussed. However, this basic hypothesis is incomplete with respect to the independence of the feature set. From the viewpoint of feature selection, a new measure of independence between features is designed, and a feature selection algorithm based on it is given to obtain a feature subset. The selected subset is highly relevant to the categories and strongly independent among its features, so it satisfies the basic hypothesis to the greatest possible degree. Compared with traditional feature selection methods in TC, which take only relevance into account, the feature subset selected by our method performs better in experiments on the 20 Newsgroups benchmark dataset.
Funding (CCFP benefit-value study): Hebei Provincial Science & Technology Supporting Program (No. 15227652D); guided by the Observation Methodology for Long-term Forest Ecosystem Research, National Standards of the People’s Republic of China (GB/T 33027-2016).
Funding (MHD and MXSGD study): Supported by the Major State Basic Research Development Program of China (973 Program, No. 2007CB512607), the National Science Fund for Distinguished Young Scholars (No. 30625042), and the National Science and Technology Major Project of the Ministry of Science and Technology of China (No. 2009ZX10005-017).
Funding (body silhouette EEG/MEG study): Supported by the National Natural Science Foundation of China (32230043 and 32371092), the Ministry of Science and Technology of China (2019YFA0707103), Das Chinesisch-Deutsche Zentrum für Wissenschaftsförderung (M-0093), and the High-performance Computing Platform of Peking University.
Abstract: Neural activities differentiating bodies versus non-body stimuli have been identified in the occipitotemporal cortex of both humans and nonhuman primates. However, the neural mechanisms that code the similarity of different individuals' bodies of the same species to support their categorical representations remain unclear. Using electroencephalography (EEG) and magnetoencephalography (MEG), we investigated the temporal and spatial characteristics of neural processes shared by different individual body silhouettes of the same species by quantifying the repetition suppression of neural responses to human and animal (chimpanzee, dog, and bird) body silhouettes showing different postures. Our EEG results revealed significant repetition suppression of the amplitudes of early frontal/central activity at 180–220 ms (P2) and late occipitoparietal activity at 220–320 ms (P270) in response to animal (but not human) body silhouettes of the same species. Our MEG results further localized the repetition suppression effect related to animal body silhouettes in the left supramarginal gyrus and left frontal cortex at 200–440 ms after stimulus onset. Our findings suggest two neural processes that are involved in spontaneous categorical representations of animal body silhouettes as a cognitive basis of human-animal interactions.
Abstract: Anomaly detection is an important research area in a diverse range of real-world applications. Although many algorithms have been proposed to address anomaly detection for numerical datasets, categorical and mixed datasets remain a significant challenge, primarily because a natural distance metric is lacking. Consequently, the methods proposed in the literature implement entirely different assumptions regarding the definition of categorical anomalies. This paper presents a novel categorical anomaly detection approach, offering two key contributions to existing methods. First, a novel surprisal-based anomaly score is introduced, which provides a more accurate assessment of anomalies by considering the full distribution of categorical values. Second, the proposed method considers complex correlations in the data beyond the pairwise interactions of features. This study proposed and tested the novel categorical surprisal anomaly detection algorithm (CSAD) by comparing and evaluating it against six competitors. The experimental results indicate that CSAD produced the best overall performance, achieving the highest average ROC-AUC and PR-AUC values of 0.8 and 0.443, respectively. Furthermore, CSAD's execution time is satisfactory even when processing large, high-dimensional datasets.
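The idea of a surprisal-based score can be illustrated with a minimal Python sketch. This is not the CSAD algorithm from the abstract: it assumes each feature is scored independently by its empirical surprisal, -log2 p(value), whereas the abstract states that CSAD also models correlations beyond pairwise feature interactions. The function name `surprisal_scores` and the toy data are hypothetical.

```python
import math
from collections import Counter


def surprisal_scores(rows):
    """Score each row by the summed surprisal -log2 p(value) of its values.

    Assumption: features are treated independently and p(value) is the
    empirical frequency of that value within its own column.
    """
    n = len(rows)
    n_features = len(rows[0])
    # Count how often every value occurs in each feature column.
    counts = [Counter(row[j] for row in rows) for j in range(n_features)]
    scores = []
    for row in rows:
        score = 0.0
        for j, value in enumerate(row):
            p = counts[j][value] / n
            score += -math.log2(p)  # rare values contribute more surprisal
        scores.append(score)
    return scores


# Toy categorical dataset: the last row holds two rare values.
rows = [
    ("red", "small"),
    ("red", "small"),
    ("red", "small"),
    ("blue", "large"),
]
scores = surprisal_scores(rows)
```

Rows made of rare values receive higher scores than rows made of common ones, so ranking by score gives a simple anomaly ordering.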
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. RS-2023-00218176) and the Soonchunhyang University Research Fund.
Abstract: Improving early diagnosis of autism spectrum disorder (ASD) in children increasingly relies on predictive models that are reliable and accessible to non-experts. This study aims to develop such models using Python-based tools to improve ASD diagnosis in clinical settings. We performed exploratory data analysis to ensure data quality and identify key patterns in pediatric ASD data. We selected the categorical boosting (CatBoost) algorithm to effectively handle the large number of categorical variables. We used the PyCaret automated machine learning (AutoML) tool to make the models user-friendly for clinicians without extensive machine learning expertise. In addition, we applied Shapley additive explanations (SHAP), an explainable artificial intelligence (XAI) technique, to improve the interpretability of the models. Models developed using CatBoost and other AI algorithms showed high accuracy in diagnosing ASD in children. SHAP provided clear insights into the influence of each variable on diagnostic outcomes, making model decisions transparent and understandable to healthcare professionals. By integrating robust machine learning methods with user-friendly tools such as PyCaret and leveraging XAI techniques such as SHAP, this study contributes to the development of reliable, interpretable, and accessible diagnostic tools for ASD. These advances hold great promise for supporting informed decision-making in clinical settings, ultimately improving early identification and intervention strategies for ASD in the pediatric population. However, the study is limited by the dataset's demographic imbalance and the lack of external clinical validation, which should be addressed in future research.
Funding: Supported by the Joint Funding Project of Municipal Schools (Colleges) of Science and Technology Program of Guangzhou, China (2023A03J0188) and the Guangzhou Municipal Science and Technology Project (202201011815).
Abstract: Congenital cataract (CC) is one of the most common causes of pediatric visual impairment. As our understanding of CC's etiology, clinical manifestations, and pathogenic genes deepens, various CC categorization systems based on different classification criteria have been proposed. Regrettably, the application of CC categories in clinical practice and scientific research is limited. It is challenging to obtain, from a specific CC classification, precise information that could guide timely treatment decisions for pediatric cataract patients or predict their prognosis. This review discusses the status quo of CC categorization systems, focusing on categorization principles and scientific application in clinical practice, and proposes potential directions for future research in this field.
Abstract: This article presents an innovative approach to automatic rule discovery for data transformation tasks leveraging XGBoost, a machine learning algorithm renowned for its efficiency and performance. The proposed framework fuses diversified feature formats, specifically metadata, textual, and pattern features. The goal is to enhance the system's ability to discern and generalize transformation rules from source to destination formats in varied contexts. First, the article describes the methodology for extracting these distinct features from raw data and the pre-processing steps undertaken to prepare the data for the model. Subsequent sections explain the mechanism of feature optimization using Recursive Feature Elimination (RFE) with linear regression, which aims to retain the most contributive features and eliminate redundant or less significant ones. The core of the research is the deployment of the XGBoost model for training, using the prepared and optimized feature sets. The article presents a detailed overview of the mathematical model and algorithmic steps behind this procedure. Finally, the process of rule discovery (prediction phase) by the trained XGBoost model is explained, underscoring its role in real-time, automated data transformations. By employing machine learning, and particularly the XGBoost model, in the context of Business Rule Engine (BRE) data transformation, the article underscores a paradigm shift towards more scalable, efficient, and less human-dependent data transformation systems. This research opens doors for further exploration into automated rule discovery systems and their applications in various sectors.
Abstract: As digital technologies have advanced more rapidly, the number of paper documents converted into a digital format has increased exponentially. To respond to the urgent need to categorize the growing number of digitized documents, the classification of digitized documents in real time has been identified as the primary goal of our study. Paper classification is the first stage in automating document control and efficient knowledge discovery with little or no human involvement. Artificial intelligence methods such as deep learning are now combined with segmentation to study and interpret those traits, which was not conceivable ten years ago. Deep learning aids in comprehending input patterns so that object classes may be predicted. The segmentation process divides the input image into separate segments for a more thorough image study. This study proposes a deep learning-enabled framework for automated document classification, which can be implemented in higher education. To further this goal, a dataset was developed that includes seven categories: Diplomas, Personal documents, Journal of Accounting of higher education diplomas, Service letters, Orders, Production orders, and Student orders. Subsequently, a deep learning model based on Conv2D layers is proposed for the document classification process. In the final part of this research, the proposed model is evaluated and compared with other machine learning techniques. The results demonstrate that the proposed deep learning model performs well in document categorization, surpassing the other machine learning models by reaching 94.84%, 94.79%, 94.62%, 94.43%, and 94.07% in accuracy, precision, recall, F-score, and AUC-ROC, respectively. The achieved results indicate that the proposed deep model is acceptable for use in practice as an assistant to an office worker.
Funding: Supported by the National Natural Science Foundation of China (61571453, 61806218).
Abstract: Deep learning has achieved excellent results in various tasks in the field of computer vision, especially in fine-grained visual categorization, which aims to distinguish the subordinate categories of the label-level categories. Due to high intra-class variance and high inter-class similarity, fine-grained visual categorization is extremely challenging. This paper first briefly introduces and analyzes the related public datasets. After that, some of the latest methods are reviewed. Based on the feature types, the feature processing methods, and the overall structure used in the model, we divide them into three types: methods based on a general convolutional neural network (CNN) and strong supervision of parts, methods based on single feature processing, and methods based on multiple feature processing. Most methods of the first type have a relatively simple structure, reflecting their origin in the initial research. The methods of the other two types include models with special structures and training processes, which help to obtain discriminative features. We conduct a specific analysis of several methods with high accuracy on public datasets. In addition, we suggest that future research should focus on reducing the demand of existing methods for large amounts of data and computing power. In terms of technology, the extraction of subtle feature information with the burgeoning vision transformer (ViT) network is also an important research direction.
Abstract: The school placement process of students from immigrant backgrounds considered to be in "difficulty" is an international concern at the intersection of works relating to special education and those concerning the school experiences of students from immigrant backgrounds or racialized groups. The research problem of this article concerns the identification of these students as disabled or as having adjustment or learning difficulties. From a perspective anchored in Disability Critical Race Studies, this ethnographic study documents different interpretations of perceived difficulties made by school actors with regard to seven primary school students from immigrant backgrounds. Five interpretation types are presented: (1) medicalization by dismissal of cultural markers, (2) medicalization by professional constraint, (3) medicalization by cultural deficit, (4) precautionary wait, and (5) cultural differentialism. Our results help to shed light on the overrepresentation of these students in special education and to understand how ableism and (neo)racism contribute to it.
Abstract: To promote behavioral change among adolescents in Zambia, the National HIV/AIDS/STI/TB Council, in collaboration with UNICEF, developed the Zambia U-Report platform. This platform provides young people with improved access to information on various Sexual Reproductive Health topics through Short Messaging Service (SMS) messages. Over the years, the platform has accumulated millions of incoming and outgoing messages, which need to be categorized into key thematic areas for better tracking of sexual reproductive health knowledge gaps among young people. The current manual categorization of these text messages is inefficient and time-consuming, and this study aims to automate the process for improved analysis using text-mining techniques. First, the study investigates the current text message categorization process and identifies a list of categories adopted by counselors over time, which are then used to build and train a categorization model. Second, the study presents a proof-of-concept tool that automates the categorization of U-Report messages into key thematic areas using the developed categorization model. Finally, it compares the performance and effectiveness of the proof-of-concept tool against the manual system. The study used a dataset comprising 206,625 text messages. The current process would take roughly 2.82 years to categorize this dataset, whereas the trained SVM model would require only 6.4 minutes while achieving an accuracy of 70.4%, demonstrating that the automated method is significantly faster, more scalable, and more consistent than the current manual categorization. These advantages make the SVM model a more efficient and effective tool for categorizing large unstructured text datasets. These results and the proof-of-concept tool demonstrate the potential for enhancing the efficiency and accuracy of message categorization on the Zambia U-Report platform and other similar text-message-based platforms.
Funding: The National Natural Science Foundation of China (No. 60473045), the Technology Research Project of Hebei Province (No. 05213573), and the Research Plan of Education Office of Hebei Province (No. 2004406).
Abstract: The conventional fuzzy class-association method classifies new texts by repeatedly scanning the classifier, which is inefficient. To address this problem, a new approach to text categorization based on the FCR-tree (fuzzy classification rules tree) is proposed. The compactness of the FCR-tree saves significant space in storing a large set of rules when there are many repeated words in the rules. In comparison with classification rules, fuzzy classification rules contain not only words but also the fuzzy sets corresponding to the frequencies of words appearing in texts. Therefore, the construction and structure of an FCR-tree differ from those of a CR-tree. To reduce the difficulty of FCR-tree construction and rule retrieval, multiple k-FCR-trees are built. When classifying a new text, it is not necessary to search the paths of the sub-trees led by words that do not appear in the text, which reduces the number of rules traversed. Experimental results show that the proposed approach clearly outperforms the conventional method in efficiency.
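The pruning step described above, skipping sub-trees led by words absent from the text, can be sketched with an ordinary prefix tree of rules in Python. This is an illustrative simplification, not the FCR-tree itself: the fuzzy sets over word frequencies and the k-FCR-tree split are omitted, and `RuleNode`, `build_rule_tree`, and `matching_labels` are hypothetical names.

```python
class RuleNode:
    """One node of a rule prefix tree (a simplified, non-fuzzy stand-in)."""

    def __init__(self):
        self.children = {}  # word -> RuleNode
        self.labels = []    # categories of rules whose last word ends here


def build_rule_tree(rules):
    """Insert each (words, label) rule; shared word prefixes share nodes."""
    root = RuleNode()
    for words, label in rules:
        node = root
        for word in sorted(words):  # fixed order so common prefixes merge
            node = node.children.setdefault(word, RuleNode())
        node.labels.append(label)
    return root


def matching_labels(root, text_words):
    """Collect labels of all rules whose words all occur in the text."""
    present = set(text_words)
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        found.extend(node.labels)
        for word, child in node.children.items():
            if word in present:  # sub-trees led by absent words are skipped
                stack.append(child)
    return found


rules = [
    (("goal", "match"), "sports"),
    (("goal", "match", "team"), "sports"),
    (("bank", "loan"), "finance"),
    (("bank", "rate"), "finance"),
]
tree = build_rule_tree(rules)
labels = matching_labels(tree, ["the", "team", "won", "the", "match", "goal"])
```

Because "bank" does not occur in the example text, neither finance rule is visited, which is the source of the efficiency gain the abstract describes.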
Funding: National Key Research and Development Program of China, No. 2017YFC0506506, No. 2016YFC0500206.
Abstract: It is important to quantitatively analyze the effects of protecting important ecological spaces in China to ensure national ecological security. By considering changes in ecological land, this study examines the effects of protecting three types of important natural ecological spaces in China from 1980 to 2018. Moreover, comparing important ecological spaces with their surroundings reveals differences in the effects of protection between internal and external spaces, which can provide a scientific basis for the categorization and zoning of China's land. The results show the following: (1) In 2018, ecological land accounted for 92.64% of the area of important natural ecological spaces. This land had a good ecological background that reflects the developmental orientation of important ecological spaces. (2) From 1980 to 2018, the area of ecological land in important ecological spaces shrank, but the rate of reduction was lower than the national average, which shows the positive effect of regulating construction in natural ecological spaces. The restorative effects of ecological projects to convert farmland into forests and grasslands have been prominent. The expanded ecological land is mainly distributed in areas where such projects have been implemented, and the reduced area is concentrated in grain-producing areas of the Northeast China Plain and agricultural oases of Xinjiang. In the future, the government should focus on strengthening the management and control of these areas. (3) The area ratio of ecological land was the highest in national nature reserves. The rate of reduction in its area was the lowest and the trend of reduction was the smallest in national nature reserves, which reflects differences in the status of ecological protection among different spaces. (4) The ratio of ecological land in important ecological spaces was higher than that in the surrounding external space, and its rate of reduction was lower. Thus, the effects of internal and external protection differed clearly along a gradient.
Abstract: Recent developments in cognitive theories may offer studies of translation a better interpretation rather than a mere description. This paper applies categorization and metaphor to the process of translating and to translators' psychology so as to produce a more powerful interpretation.
Abstract: Support vector machines (SVMs) are a popular class of supervised learning algorithms, and are particularly applicable to large and high-dimensional classification problems. Like most machine learning methods for data classification and information retrieval, they require manually labeled data samples in the training stage. However, manual labeling is a time-consuming and error-prone task. One possible solution to this issue is to exploit the large number of unlabeled samples that are easily accessible via the internet. This paper presents a novel active learning method for text categorization. The main objective of active learning is to reduce the labeling effort, without compromising the accuracy of classification, by intelligently selecting which samples should be labeled. The proposed method selects a batch of informative samples using the posterior probabilities provided by a set of multi-class SVM classifiers, and these samples are then manually labeled by an expert. Experimental results indicate that the proposed active learning method significantly reduces the labeling effort, while simultaneously enhancing the classification accuracy.
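Selecting a batch of informative samples from posterior probabilities can be sketched in Python as follows. The abstract does not state the selection criterion, so this sketch assumes predictive entropy as the informativeness measure; the function names and the hand-written posteriors are hypothetical, and in practice the probabilities would come from the trained multi-class SVM classifiers.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one posterior distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_batch(posteriors, batch_size):
    """Return indices of the `batch_size` most uncertain unlabeled samples.

    Assumption: uncertainty is measured by predictive entropy; the paper
    may use a different criterion over its SVM posteriors.
    """
    ranked = sorted(
        range(len(posteriors)),
        key=lambda i: entropy(posteriors[i]),
        reverse=True,
    )
    return ranked[:batch_size]


# Hypothetical class posteriors for four unlabeled texts (three classes).
posteriors = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # uncertain
    [0.70, 0.20, 0.10],  # moderately confident
    [0.34, 0.33, 0.33],  # nearly uniform, most uncertain
]
batch = select_batch(posteriors, 2)
```

The selected indices would be handed to the expert for labeling, after which the classifiers are retrained and the selection repeats.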
Funding: Supported by grants from the Ministry of Science and Technology of China (No. 2011CBA01101) to X.-J.W. and from the Chinese Academy of Sciences (Nos. XDA01020105, KSCX2-EW-R-01-03, and 2010-Biols-CAS0303) to X.-J.W.
Abstract: Pseudogenes are genomic remnants of ancient protein-coding genes that have lost their coding potential through evolution. Although widespread, pseudogenes were long considered junk or relics of genomes and did not draw much attention from biologists until recent years. With the broad application of high-throughput experimental techniques, growing lines of evidence strongly suggest that some pseudogenes possess special functions, including regulating parental gene expression and participating in the regulation of many biological processes. In this review, we summarize some basic features of pseudogenes and their functions in regulating development and diseases. All of these observations indicate that pseudogenes are not purely dead fossils of genomes, but warrant further exploration of their distribution, expression regulation, and functions. A new nomenclature is desirable for what are currently called "pseudogenes" to better describe their functions.
Funding: Supported by the National Natural Science Foundation of China (60373066, 60503020), the Outstanding Young Scientist's Fund (60425206), and the Doctor Foundation of Nanjing University of Posts and Telecommunications (2003-02).
Abstract: This paper proposes a new approach to feature selection for text categorization (TC) based on an independence measure between features. It first discusses a fundamental hypothesis widely used in probabilistic models for TC, namely that the occurrences of terms in documents are independent of each other. However, this basic hypothesis does not by itself guarantee the independence of the feature set. From the viewpoint of feature selection, a new independence measure between features is designed, and a feature selection algorithm based on it is given to obtain a feature subset. The selected subset is highly relevant to the category and strongly independent between features, and thus satisfies the basic hypothesis to the maximum degree. Compared with traditional feature selection methods in TC, which take only relevance into account, the feature subset selected by our method performs better in experiments on the 20 Newsgroups benchmark dataset.