Policy training against diverse opponents remains a challenge when using Multi-Agent Reinforcement Learning (MARL) in multiple Unmanned Combat Aerial Vehicle (UCAV) air combat scenarios. In view of this, this paper proposes a novel Dominant and Non-dominant strategy sample selection (DoNot) mechanism and a Local Observation Enhanced Multi-Agent Proximal Policy Optimization (LOE-MAPPO) algorithm to train the multi-UCAV air combat policy and improve its generalization. Specifically, the LOE-MAPPO algorithm adopts a mixed state that concatenates the global state and each agent's local observation to enable efficient value function learning in multi-UCAV air combat. The DoNot mechanism classifies opponents into dominant or non-dominant strategy opponents and samples from easier to more challenging opponents to form an adaptive training curriculum. Empirical results demonstrate that the proposed LOE-MAPPO algorithm outperforms baseline MARL algorithms in multi-UCAV air combat scenarios, and that the DoNot mechanism leads to stronger policy generalization when facing diverse opponents. These results pave the way for the fast generation of cooperative strategies for air combat agents with MARL algorithms.
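The two mechanisms lend themselves to a short sketch. Below is a hedged Python illustration with hypothetical names (mixed_state, OpponentPool, win_rate); the win-rate-based weighting is only one plausible reading of the easy-to-hard curriculum, not the paper's exact DoNot rule.

```python
import numpy as np

def mixed_state(global_state: np.ndarray, local_obs: np.ndarray) -> np.ndarray:
    """Concatenate the global state with one agent's local observation,
    giving the centralized critic both shared and agent-specific context."""
    return np.concatenate([global_state, local_obs], axis=-1)

class OpponentPool:
    """Adaptive curriculum over opponent identifiers (e.g., strings):
    sampling weight peaks where the estimated win rate is near 0.5,
    shifting pressure toward harder opponents as the policy improves."""
    def __init__(self, opponents):
        self.opponents = list(opponents)
        self.win_rate = {o: 0.5 for o in self.opponents}  # running estimates

    def update(self, opponent, won: bool, lr: float = 0.05):
        self.win_rate[opponent] += lr * (float(won) - self.win_rate[opponent])

    def sample(self, rng: np.random.Generator):
        w = np.array([1.0 - abs(self.win_rate[o] - 0.5) for o in self.opponents])
        return rng.choice(self.opponents, p=w / w.sum())
```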
In the task of Facial Expression Recognition (FER), data uncertainty has been a critical factor affecting performance, typically arising from the ambiguity of facial expressions, low-quality images, and the subjectivity of annotators. Tracking the training history reveals that misclassified samples often exhibit high confidence yet excessive uncertainty in the early stages of training. To address this issue, we propose an uncertainty-based robust sample selection strategy that combines confidence error with RandAugment to improve image diversity, effectively reducing the overfitting caused by uncertain samples during deep learning model training. To validate the effectiveness of the proposed method, extensive experiments were conducted on public FER benchmarks. The accuracies obtained were 89.08% on RAF-DB, 63.12% on AffectNet, and 88.73% on FERPlus.
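A hedged sketch of the selection step follows; the margin threshold and the exact definition of confidence error are assumptions, and torchvision's RandAugment stands in for the paper's augmentation setup.

```python
import torch
import torchvision.transforms as T

# Samples whose confidence on the wrong class far exceeds their
# confidence on the true label are treated as unreliable and excluded
# from the loss; RandAugment adds diversity for the retained ones.
augment = T.Compose([T.RandAugment(), T.ToTensor()])

def select_reliable(logits: torch.Tensor, labels: torch.Tensor, margin: float = 0.15):
    """Keep samples where P(true label) is within `margin` of the top
    predicted probability; a large 'confidence error' marks uncertainty."""
    probs = logits.softmax(dim=1)
    top_prob = probs.max(dim=1).values
    true_prob = probs.gather(1, labels.unsqueeze(1)).squeeze(1)
    confidence_error = top_prob - true_prob   # 0 when prediction matches label
    return confidence_error < margin          # boolean mask of retained samples
```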
In engineering applications, most traditional early warning radars estimate only one set of adaptive weights for adaptive interference suppression within a pulse repetition interval (PRI). Therefore, if the training samples used to calculate the weight vector do not contain the jamming, the jamming cannot be removed by adaptive spatial filtering. If the weight vector is instead updated continuously along the range dimension, the training data may contain target echo signals, resulting in a signal cancellation effect. To cope with the situation in which the training samples are contaminated by the target signal, an iterative training sample selection method based on a non-homogeneous detector (NHD) is proposed in this paper for updating the weight vector over the entire range dimension. The principle is presented, and its validity is proven by simulation results.
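The generalized inner product (GIP) is a common statistic behind non-homogeneous detectors; the sketch below illustrates iterative screening with it, under assumed parameters (n_iter, keep) that are not from the paper.

```python
import numpy as np

def gip_screen(samples: np.ndarray, n_iter: int = 3, keep: float = 0.8) -> np.ndarray:
    """samples: (K, N) complex snapshots, one row per range cell.
    Iteratively drop cells with outlying GIP values (likely containing
    target signal), so the covariance estimate stays target-free."""
    idx = np.arange(samples.shape[0])
    for _ in range(n_iter):
        x = samples[idx]
        R = x.conj().T @ x / len(idx)            # sample covariance matrix
        R_inv = np.linalg.pinv(R)
        gip = np.einsum('kn,nm,km->k', x.conj(), R_inv, x).real
        order = np.argsort(np.abs(gip - np.median(gip)))
        idx = idx[order[: max(2, int(keep * len(idx)))]]  # keep most homogeneous
    return idx                                   # indices of clean training cells
```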
Cotton is one of the most significant cash crops in the world and the main source of natural fiber for textiles. For cotton management, it is crucial to identify the spatiotemporal distribution of cotton planting areas in a timely and accurate manner at a fine scale. However, previous studies have predominantly concentrated on specific years using remote sensing data, and extracting cotton areas over long time series with high accuracy remains challenging. To address this issue, a novel cotton sample selection method is proposed, and a machine learning method is employed to identify long-time-series cotton planting areas at a 30-m resolution. Bortala and Shuanghe in Xinjiang, China, were selected as the study cases to demonstrate the approach. Specifically, cropland was extracted using an object-oriented classification method with Landsat images, and the results were optimized as vectorized cropland boundaries. Then, cotton samples were selected using the Normalized Difference Vegetation Index (NDVI) series of the Moderate Resolution Imaging Spectroradiometer (MODIS) based on cotton's phenological characteristics. Next, cotton was identified within the croplands from 2000 to 2020 using the machine learning model. Finally, the performance was evaluated, and the spatiotemporal distribution characteristics of the cotton planting areas were analyzed. The results showed that the proposed approach achieves high accuracy at a fine spatial resolution. The performance evaluation indicated the applicability and suitability of the method: there is a good correlation between the extracted cotton areas and statistical data, and the cotton area of the study region showed an increasing trend. The cotton spatial distribution pattern developed from dispersion to agglomeration. The proposed approach and the derived 30-m cotton maps can provide a scientific reference for the optimization of agricultural management.
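A hedged sketch of phenology-based sample picking: cotton typically shows a late, single-peaked NDVI curve, so pixels can be screened by peak timing and amplitude. The day-of-year windows and thresholds below are illustrative placeholders, not the paper's values.

```python
import numpy as np

def pick_cotton_samples(ndvi: np.ndarray, doy: np.ndarray) -> np.ndarray:
    """ndvi: (P, T) per-pixel MODIS NDVI series; doy: (T,) day of year.
    Returns a boolean mask of pixels matching a cotton-like phenology."""
    peak_val = ndvi.max(axis=1)
    peak_doy = doy[ndvi.argmax(axis=1)]
    early = ndvi[:, doy < 120].mean(axis=1)      # low canopy before sowing
    return (peak_val > 0.6) & (peak_doy > 180) & (peak_doy < 240) & (early < 0.3)
```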
We propose interest-driven progressive visual analytics. The core idea is to filter, from the given dataset, samples with features of interest to analysts. The approach relies on a generative model (GM) trained using the given dataset as the training set. The GM's characteristics make it convenient to find ideal generated samples in its latent space. We then filter the original samples similar to the ideal generated ones to explore patterns. Our research involves two methods for achieving and applying this idea. First, we present a method to explore ideal samples in a GM's latent space. Second, we integrate the method into a system to form an embedding-based analytical workflow. Patterns found on open datasets in case studies, the results of quantitative experiments, and positive feedback from experts illustrate the general usability and effectiveness of the approach.
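A minimal sketch of the filtering step, assuming precomputed embeddings and cosine similarity (both assumptions; the paper's similarity measure is not specified here):

```python
import numpy as np

def filter_by_interest(embeddings: np.ndarray, ideal: np.ndarray, k: int = 50) -> np.ndarray:
    """embeddings: (N, D) encodings of the original dataset;
    ideal: (D,) encoding of the generated sample of interest.
    Returns indices of the k original samples most similar to it."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    q = ideal / np.linalg.norm(ideal)
    return np.argsort(e @ q)[::-1][:k]
```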
Near-infrared (NIR) spectroscopy has been widely employed as a process analytical tool (PAT) in various fields; the most important reason for its use is the ability to record spectra in real time to capture process properties. In quantitative online applications, the robustness of the established NIR model is often deteriorated by process condition variations, the nonlinearity of the properties, or the high dimensionality of the NIR data set. To cope with such situations, a novel method based on principal component analysis (PCA) and an artificial neural network (ANN) is proposed, together with a new sample-selection method. The advantage of the presented approach is that it can select proper calibration samples and establish a robust model effectively. The performance of the method was tested on a spectroscopic data set from a refinery process. Compared with traditional partial least squares (PLS), principal component regression (PCR), and several other modeling methods, the proposed approach was found to achieve good accuracy in the prediction of gasoline properties. An application of the proposed method is also reported.
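A minimal scikit-learn sketch of the PCA-plus-ANN calibration idea; the component count and layer sizes are placeholders, not the paper's settings.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor

# Compress the collinear NIR spectra to a few scores, then fit a small
# network on them to predict the property of interest.
model = make_pipeline(
    StandardScaler(),                 # spectra vary widely in scale
    PCA(n_components=10),             # remove collinearity and noise
    MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000, random_state=0),
)
# Usage: model.fit(X_calibration, y_property); model.predict(X_new)
```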
Background: Functional mapping, despite its proven efficiency, suffers from a “chicken or egg” scenario, in that poor spatial features lead to inadequate spectral alignment and vice versa during training, often resulting in slow convergence, high computational costs, and learning failures, particularly when small datasets are used. Methods: A novel method is presented for dense-shape correspondence, whereby the spatial information transformed by neural networks is combined with the projections onto spectral maps to overcome the “chicken or egg” challenge by selectively sampling only points with high confidence in their alignment. These points then contribute to the alignment and spectral loss terms, boosting training and accelerating convergence by a factor of five. To ensure fully unsupervised learning, the Gromov–Hausdorff distance metric was used to select the points with the maximal alignment score, i.e., those displaying the most confidence. Results: The effectiveness of the proposed approach was demonstrated on several benchmark datasets, with results reported as superior to those of spectral- and spatial-based methods. Conclusions: The proposed method provides a promising new approach to dense-shape correspondence, addressing key challenges in the field and offering significant advantages over current methods, including faster convergence, improved accuracy, and reduced computational costs.
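A hedged stand-in for the confidence-based sampling: build a soft correspondence from per-point descriptors and keep the points whose best match clearly dominates. The Gromov–Hausdorff-based score itself is not reproduced here.

```python
import numpy as np

def confident_points(feat_a: np.ndarray, feat_b: np.ndarray, frac: float = 0.2):
    """feat_a: (Na, D), feat_b: (Nb, D) learned per-point descriptors.
    Returns indices on shape A with the most confident alignments."""
    sim = feat_a @ feat_b.T
    p = np.exp(sim - sim.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)            # row-stochastic soft map
    confidence = p.max(axis=1)                   # peakedness of each row
    k = max(1, int(frac * len(confidence)))
    return np.argsort(confidence)[::-1][:k]
```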
Selection of negative samples significantly influences landslide susceptibility assessment, especially when establishing the relationship between landslides and environmental factors in regions with complex geological conditions. Traditional sampling strategies commonly used in landslide susceptibility models can misrepresent the distribution of negative samples, causing a deviation from actual geological conditions. This, in turn, negatively affects the discriminative ability and generalization performance of the models. To address this issue, we propose a novel approach for selecting negative samples to enhance the quality of machine learning models. We choose the Liangshan Yi Autonomous Prefecture, located in southwestern Sichuan, China, as the case study. This area, characterized by complex terrain, frequent tectonic activities, and steep slope erosion, experiences recurrent landslides, making it an ideal setting for validating the proposed method. We calculate the contribution values of the environmental factors using the relief algorithm to construct the feature space, apply the Target Space Exteriorization Sampling (TSES) method to select negative samples, calculate landslide probability values by Random Forest (RF) modeling, and then create regional landslide susceptibility maps. We evaluate the performance of the RF model optimized by the Environmental Factor Selection-based TSES (EFSTSES) method using standard performance metrics. The results indicate that the model achieved an accuracy (ACC) of 0.962, a precision (PRE) of 0.961, and an area under the curve (AUC) of 0.962. These findings demonstrate that the EFSTSES-based model effectively mitigates the negative-sample imbalance issue, enhances the differentiation between landslide and non-landslide samples, and reduces misclassification, particularly in geologically complex areas. These improvements offer valuable insights for disaster prevention, land use planning, and risk mitigation strategies.
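A hedged sketch of exteriorized negative sampling in a weighted feature space; the distance threshold and weighting scheme are illustrative, with relief-style factor contributions assumed as the weights.

```python
import numpy as np

def exteriorized_negatives(candidates, landslides, weights, d_min=2.0, n=500, seed=0):
    """candidates, landslides: (Nc, D), (Nl, D) standardized factor vectors;
    weights: (D,) factor contributions (e.g., from the relief algorithm).
    Draw negatives only from cells far from every known landslide."""
    w = np.sqrt(weights / weights.sum())
    d = np.linalg.norm(candidates[:, None] * w - landslides[None] * w, axis=2)
    exterior = d.min(axis=1) > d_min             # outside the landslide envelope
    rng = np.random.default_rng(seed)
    pool = np.flatnonzero(exterior)
    return rng.choice(pool, size=min(n, pool.size), replace=False)
```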
This study examines the impact of farmer cooperative participation and technology adoption on farmers' economic welfare in China. A double selectivity model (DSM) is applied to correct for sample selection bias stemming from both observed and unobserved factors, and a propensity score matching (PSM) method is applied to calculate the agricultural income difference through counterfactual analysis, using survey data from 396 farmers in 15 provinces in China. The findings indicate that farmers who join farmer cooperatives and adopt agricultural technology can increase agricultural income by 2.77% and 2.35%, respectively, compared with non-participants and non-adopters. Interestingly, the effect on agricultural income is more significant for low-income farmers than for high-income ones, with income increasing by 5.45% and 4.51% when participating in farmer cooperatives and adopting agricultural technology, respectively. Our findings highlight the positive role of farmer cooperatives and agricultural technology in promoting farmers' economic welfare. Based on the findings, government policy implications are also discussed.
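A minimal PSM sketch for the income comparison (the double selectivity model is not shown); variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def psm_att(X, treated, y):
    """Estimate the propensity to participate from covariates X, match
    each participant to the nearest non-participant on that score, and
    average outcome gaps (average treatment effect on the treated)."""
    ps = LogisticRegression(max_iter=1000).fit(X, treated).predict_proba(X)[:, 1]
    t, c = np.flatnonzero(treated == 1), np.flatnonzero(treated == 0)
    matches = c[np.abs(ps[t][:, None] - ps[c][None, :]).argmin(axis=1)]
    return float(np.mean(y[t] - y[matches]))
```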
Landslide susceptibility mapping is a crucial tool for disaster prevention and management. The performance of conventional data-driven models is greatly influenced by the quality of the sample data, and the random selection of negative samples leaves the assessment process lacking interpretability. To address this limitation and construct a high-quality negative sample database, this study introduces a physics-informed machine learning approach, combining the random forest model with Scoops 3D, to optimize the negative sample selection strategy and assess the landslide susceptibility of the study area. Scoops 3D is employed to determine the factor of safety value using Bishop's simplified method. Instead of conventional random selection, negative samples are extracted from areas with a high factor of safety. Subsequently, the results of the conventional random forest model and the physics-informed data-driven model are analyzed and discussed, focusing on model performance and prediction uncertainty. In comparison to conventional methods, the physics-informed model, set with a safety area threshold of 3, improves the mean AUC value by 36.7%, coupled with a reduced prediction uncertainty. It is evident that the choice of the safety area threshold affects both prediction uncertainty and model performance.
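A hedged sketch of the physics-informed sampling step, assuming a precomputed per-cell factor of safety (fos) from Scoops 3D:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def physics_informed_rf(features, fos, pos_idx, threshold=3.0, seed=0):
    """Draw negatives only from cells whose factor of safety exceeds the
    threshold (3 in the paper), then fit a random forest as usual."""
    rng = np.random.default_rng(seed)
    safe = np.flatnonzero(fos > threshold)          # physically stable cells
    neg_idx = rng.choice(safe, size=len(pos_idx), replace=False)
    X = np.vstack([features[pos_idx], features[neg_idx]])
    y = np.r_[np.ones(len(pos_idx)), np.zeros(len(neg_idx))]
    rf = RandomForestClassifier(n_estimators=500, random_state=seed).fit(X, y)
    return rf    # rf.predict_proba(features)[:, 1] gives susceptibility scores
```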
A vision-based color analysis system was developed for rapid estimation of copper content in the secondary copper smelting process. First, cross-section images of secondary copper samples were captured by the designed vision system. After the preprocessing and segmentation procedures, images were selected according to the grayscale standard deviations of their pixels and the percentages of edge pixels in the luminance component. The selected images were then used to extract improved color vector angles, from which the copper content estimation model was developed based on the least squares support vector regression (LSSVR) method. For comparison, three additional LSSVR models were developed: one with sample selection only, one with the improved color vector angle only, and one with neither sample selection nor the improved color vector angle. In addition, two exponential models were developed, one with and one without sample selection. Experimental results indicate that the proposed method is more effective at improving the copper content estimation accuracy, particularly when the sample size is small.
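A hedged sketch of the image-screening rule; the thresholds and the gradient-based edge detector are placeholders for the paper's exact criteria.

```python
import numpy as np

def keep_image(gray: np.ndarray, std_min: float = 12.0, edge_min: float = 0.02) -> bool:
    """gray: 2-D luminance array in [0, 255]. Keep images with enough
    contrast (grayscale standard deviation) and a sufficient share of
    edge pixels in the luminance component."""
    gy, gx = np.gradient(gray.astype(float))
    edge_frac = np.mean(np.hypot(gx, gy) > 30.0)    # crude edge detector
    return gray.std() > std_min and edge_frac > edge_min
```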
Special core analysis (SCAL) measurements play a noteworthy role in reservoir engineering. Due to the time-consuming and costly character of these measurements, routine core analysis (RCAL) data should be inspected thoroughly to select a representative subset of samples for SCAL. There are no comprehensive guidelines on how representative samples should be selected. In this study, a new framework is presented for the selection of representative samples for SCAL. The foundation of this framework is using the PSRTI, FZI* (FZI-star), and TEM-function methods for the early estimation of petrophysical static, dynamic, and pseudo-static rock types at the RCAL stage. The global hydraulic element (GHE) approach is leveraged, and an FZI*-based GHE method (i.e., GHE*) is presented for partitioning the data. The framework takes into consideration laboratory, reservoir engineering, geological, petrophysical, and statistical factors. A carbonate reservoir case is presented to support our methodology. We also show that the current forms of the Lorenz and Stratigraphic Modified Lorenz Plots in reservoir engineering are not appropriate, and we present new forms of them.
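For orientation, the classical flow-zone indicator (Amaefule et al.) behind such rock typing can be computed as below; the paper's FZI* and GHE* definitions are not reproduced, and the GHE bin boundaries shown are placeholders.

```python
import numpy as np

def fzi(k_md: np.ndarray, phi: np.ndarray) -> np.ndarray:
    """Classical flow-zone indicator; k in mD, porosity as a fraction."""
    rqi = 0.0314 * np.sqrt(k_md / phi)      # reservoir quality index, um
    phi_z = phi / (1.0 - phi)               # normalized porosity
    return rqi / phi_z

def ghe_class(fzi_vals, bounds=(0.1, 0.3, 0.75, 1.9, 4.8, 12.0)):
    """GHE-style partitioning: assign each plug to a discrete class by
    binning FZI on predefined boundaries (placeholder values here)."""
    return np.digitize(fzi_vals, bounds)
```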
A prediction method for protein disulfide bonds based on a support vector machine and sample selection is proposed in this paper. First, the selected protein sequences are encoded according to a certain encoding scheme, generating input data for the disulfide bond prediction model. Then, a sample selection technique is used to select a portion of the input data as training samples for the support vector machine. Finally, the prediction model trained on the selected samples is used to predict protein disulfide bonds. Simulation experiments show that the prediction model based on the support vector machine and sample selection can increase the prediction accuracy of protein disulfide bonds.
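A hedged sketch of the pipeline; the one-hot window encoding and the random subset rule are illustrative stand-ins for the paper's encoding and sample selection technique.

```python
import numpy as np
from sklearn.svm import SVC

AA = 'ACDEFGHIKLMNPQRSTVWY'

def encode_window(window: str) -> np.ndarray:
    """One-hot encode a fixed-length window of standard residues
    around a cysteine."""
    x = np.zeros((len(window), len(AA)))
    for i, aa in enumerate(window):
        x[i, AA.index(aa)] = 1.0
    return x.ravel()

def fit_with_subset(X, y, keep=0.5, seed=0):
    """Train an SVM on a selected subset of the encoded samples."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=int(keep * len(X)), replace=False)
    return SVC(kernel='rbf', C=1.0).fit(X[idx], y[idx])
```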
A novel dynamic batch selective sampling algorithm based on version space analysis is presented. In traditional batch selective sampling, example selection is entirely determined by the existing, unreliable classification boundary; meanwhile, within a batch, examples labeled previously fail to provide instructive information for the selection of the rest. As a result, using the examples selected in batch mode for model refinement can jeopardize classification performance. Based on the duality between feature space and parameter space under the SVM active learning framework, dynamic batch selective sampling is proposed to address this problem. We select a batch of examples dynamically, using the examples labeled previously as guidance for further selection. In this way, the selection of feedback examples is determined by both the existing classification model and the examples labeled previously. Encouraging experimental results demonstrate the effectiveness of the proposed algorithm.
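A hedged sketch of dynamic batch selection for an SVM-style learner: points near the current boundary are preferred, and each pick penalizes closeness to points already chosen, so earlier picks guide later ones. The version-space analysis itself is not reproduced.

```python
import numpy as np

def dynamic_batch(clf, X_pool, batch_size=10, diversity=1.0):
    """clf: fitted classifier exposing decision_function (e.g., an SVM).
    Grow the batch one point at a time rather than all at once."""
    margin = np.abs(clf.decision_function(X_pool))   # distance to boundary
    chosen = []
    for _ in range(batch_size):
        score = margin.copy()
        for j in chosen:
            dist = np.linalg.norm(X_pool - X_pool[j], axis=1)
            score += diversity / (dist + 1e-8)       # repel near-duplicates
            score[j] = np.inf                        # never re-pick
        chosen.append(int(score.argmin()))           # most informative next pick
    return chosen
```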
Breast mass identification is of great significance for the early screening of breast cancer, but existing detection methods have high missed-detection and misdiagnosis rates for small masses. We propose a small-target breast mass detection network named Residual asymmetric dilated convolution-Cross layer attention-Mean standard deviation adaptive selection-You Only Look Once (RCM-YOLO). It improves the identifiability of small masses by increasing the resolution of feature maps, adopts residual asymmetric dilated convolution to expand the receptive field and optimize the number of parameters, and introduces cross-layer attention that transfers deep semantic information to the shallow layers as auxiliary information to locate key features. In the training process, we propose an adaptive positive sample selection algorithm that selects positive samples automatically, considering the statistics of the intersection-over-union sets to ensure the validity of the training set and the detection accuracy of the model. To verify the performance of our model, we conducted experiments on public datasets. The results showed that the mean Average Precision (mAP) of RCM-YOLO reached 90.34%; compared with YOLOv5, the missed detection rate for small masses was reduced to 11%, and the single-image detection time was reduced to 28 ms. Detection accuracy and speed can thus be effectively improved by strengthening the feature expression of small masses and the relationships between features. Our method can help doctors screen breast images in batches, significantly raising the detection rate of small masses and reducing misdiagnosis.
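The mean-plus-standard-deviation rule can be sketched compactly (in the spirit of ATSS-style adaptive thresholding; the paper's exact statistics may differ):

```python
import numpy as np

def adaptive_positives(ious: np.ndarray) -> np.ndarray:
    """ious: (num_anchors,) IoU of candidate anchors with one GT box.
    Anchors whose IoU exceeds the box's IoU mean plus standard deviation
    become positives, so the threshold adapts to each object."""
    thr = ious.mean() + ious.std()
    return np.flatnonzero(ious >= thr)
```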
Introduction: This study analyzes farmers' perception of and adaptation to climate change in the Dabus watershed. It is based on data collected from 734 randomly selected farm household heads, substantiated with Focus Group Discussions and field observations. Methods: The study employed descriptive methods to assess farmers' perception of climate change, local indicators of climate change, and the types of adaptation measures exercised to cope with the risks of a changing climate. The study also employed the Heckman sample selection model to analyze the two-step process of adaptation to climate change, which first requires farmers to perceive that the climate is changing before they respond through adaptation measures. Results: Based on the model results, educational attainment, the age of the head of the household, the number of past crop failures, and changes in temperature and precipitation significantly influenced farmers' perception of climate change in the wet lowland parts of the study area. In the dry lowland condition, farming experience, climate information, the duration of food shortage, and the number of crop failures experienced determined farmers' perception of climate change. Farmers' adaptation decisions in both the wet and dry lowland conditions are influenced by household size, the gender of the household head, cultivated land size, education, farm experience, non-farm income, income from livestock, climate information, extension advice, farm-home distance, and the number of parcels. However, the direction of influence and the significance level of most explanatory variables vary between the two parts of the study area. Conclusions: In line with these results, any intervention that promotes the use of adaptation measures to climate change should account for the location-specific factors that determine farmers' perception of climate change and their adaptive responses.
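A minimal Heckman two-step sketch, assuming statsmodels and illustrative variable names: a probit for whether climate change is perceived, then an outcome regression on the perceiving subsample augmented with the inverse Mills ratio.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

def heckman_two_step(Z, perceived, X, y):
    """Z: (N, k) selection covariates; perceived: (N,) 0/1 indicator;
    X, y: outcome covariates and response (used on the selected rows)."""
    probit = sm.Probit(perceived, sm.add_constant(Z)).fit(disp=0)
    xb = probit.fittedvalues                      # linear index Z'gamma
    mills = norm.pdf(xb) / norm.cdf(xb)           # inverse Mills ratio
    sel = perceived == 1
    X2 = sm.add_constant(np.column_stack([X[sel], mills[sel]]))
    return sm.OLS(y[sel], X2).fit()               # last coefficient = selection term
```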
A Brain-Computer Interface (BCI) aims to provide a new way for people to communicate with computers. Brain signal classification is challenging owing to high-dimensional data and a low Signal-to-Noise Ratio (SNR). In this paper, a novel method is proposed to cope with this problem through sparse representation for the P300 speller paradigm. This work makes two key contributions. First, we investigate sparse coding and its feasibility for brain signal classification: training signals are used to learn the dictionaries, and test signals are classified according to their sparse representations and reconstruction errors. Second, sample selection and a channel-aware dictionary are proposed to reduce the effect of noise, which improves performance and enhances computing efficiency simultaneously. A novel classification method from the sample set perspective is proposed to exploit channel correlations. Specifically, the brain signal of each channel is classified jointly with its spatially neighboring channels, and a novel weighted regulation strategy is proposed to overcome outliers in the group. Experimental results demonstrate that our methods are highly effective: we achieve a state-of-the-art recognition rate of 72.5%, 88.5%, and 98.5% at 5, 10, and 15 epochs, respectively, on BCI Competition Ⅲ Dataset Ⅱ.
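A hedged sketch of classification by sparse representation and reconstruction error; the channel-aware dictionary and weighted regulation are not reproduced, and orthogonal matching pursuit stands in for the paper's sparse solver.

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

def src_classify(D: np.ndarray, atom_labels: np.ndarray, x: np.ndarray, k: int = 10):
    """D: (d, n) dictionary with unit-norm columns (training signals);
    atom_labels: (n,) class of each column; x: (d,) test signal.
    Assign the class whose atoms give the smallest reconstruction error."""
    coef = orthogonal_mp(D, x, n_nonzero_coefs=k)     # sparse code of x
    errors = {}
    for c in np.unique(atom_labels):
        coef_c = np.where(atom_labels == c, coef, 0.0)
        errors[c] = np.linalg.norm(x - D @ coef_c)    # class-wise residual
    return min(errors, key=errors.get)
```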
This paper proposes a Bayesian semiparametric accelerated failure time model for doubly censored data with errors in covariates. The authors model the distributions of the unobserved covariates and the regression errors via Dirichlet processes. Moreover, the authors extend the Bayesian Lasso approach to the semiparametric model for variable selection, and develop Markov chain Monte Carlo strategies for posterior computation. Simulation studies are conducted to show the performance of the proposed method, and its implementation is demonstrated using analyses of the PBC data and the ACTG 175 data.
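A heavily simplified sketch of the Bayesian Lasso Gibbs sampler (Park and Casella) for a plain linear model, conveying the variable-selection prior; the Dirichlet-process errors, double censoring, and covariate measurement error are all omitted.

```python
import numpy as np
from scipy.stats import invgauss, invgamma

def bayesian_lasso_gibbs(X, y, lam=1.0, n_iter=2000, seed=0):
    """Gibbs sampler for y = X beta + eps with Laplace (lasso) priors
    on beta; returns posterior draws of the coefficients."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    sigma2, tau2 = 1.0, np.ones(p)
    draws = np.empty((n_iter, p))
    for it in range(n_iter):
        A_inv = np.linalg.inv(X.T @ X + np.diag(1.0 / tau2))
        beta = rng.multivariate_normal(A_inv @ X.T @ y, sigma2 * A_inv)
        resid = y - X @ beta
        sigma2 = invgamma.rvs((n - 1 + p) / 2,
                              scale=(resid @ resid + beta @ (beta / tau2)) / 2,
                              random_state=rng)
        # 1/tau_j^2 | rest ~ InverseGaussian(mean=lam*sigma/|beta_j|, shape=lam^2)
        mean_j = lam * np.sqrt(sigma2) / (np.abs(beta) + 1e-12)
        tau2 = 1.0 / invgauss.rvs(mean_j / lam**2, scale=lam**2, random_state=rng)
        draws[it] = beta
    return draws
```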