Activity cliffs(ACs)are generally defined as pairs of similar compounds that only differ by a minor structural modification but exhibit a large difference in their binding affinity for a given target.ACs offer crucial...Activity cliffs(ACs)are generally defined as pairs of similar compounds that only differ by a minor structural modification but exhibit a large difference in their binding affinity for a given target.ACs offer crucial insights that aid medicinal chemists in optimizing molecular structures.Nonetheless,they also form a major source of prediction error in structure-activity relationship(SAR)models.To date,several studies have demonstrated that deep neural networks based on molecular images or graphs might need to be improved further in predicting the potency of ACs.In this paper,we integrated the triplet loss in face recognition with pre-training strategy to develop a prediction model ACtriplet,tailored for ACs.Through extensive comparison with multiple baseline models on 30 benchmark datasets,the results showed that ACtriplet was significantly better than those deep learning(DL)models without pretraining.In addition,we explored the effect of pre-training on data representation.Finally,the case study demonstrated that our model's interpretability module could explain the prediction results reasonably.In the dilemma that the amount of data could not be increased rapidly,this innovative framework would better make use of the existing data,which would propel the potential of DL in the early stage of drug discovery and optimization.展开更多
Deep neural networks provide accurate results for most applications.However,they need a big dataset to train properly.Providing a big dataset is a significant challenge in most applications.Image augmentation refers t...Deep neural networks provide accurate results for most applications.However,they need a big dataset to train properly.Providing a big dataset is a significant challenge in most applications.Image augmentation refers to techniques that increase the amount of image data.Common operations for image augmentation include changes in illumination,rotation,contrast,size,viewing angle,and others.Recently,Generative Adversarial Networks(GANs)have been employed for image generation.However,like image augmentation methods,GAN approaches can only generate images that are similar to the original images.Therefore,they also cannot generate new classes of data.Texture images presentmore challenges than general images,and generating textures is more complex than creating other types of images.This study proposes a gradient-based deep neural network method that generates a new class of texture.It is possible to rapidly generate new classes of textures using different kernels from pre-trained deep networks.After generating new textures for each class,the number of textures increases through image augmentation.During this process,several techniques are proposed to automatically remove incomplete and similar textures that are created.The proposed method is faster than some well-known generative networks by around 4 to 10 times.In addition,the quality of the generated textures surpasses that of these networks.The proposed method can generate textures that surpass those of someGANs and parametric models in certain image qualitymetrics.It can provide a big texture dataset to train deep networks.A new big texture dataset is created artificially using the proposed method.This dataset is approximately 2 GB in size and comprises 30,000 textures,each 150×150 pixels in size,organized into 600 classes.It is uploaded to the Kaggle site and Google Drive.This dataset is called BigTex.Compared to other texture datasets,the proposed dataset is the largest and can serve as a comprehensive texture dataset for training more powerful deep neural networks and mitigating overfitting.展开更多
Intelligent sorting is an important prerequisite for the full quantitative consumption and harmless disposal of kitchen waste.The existing object detection method based on an ImageNet pre-trained model is an effective...Intelligent sorting is an important prerequisite for the full quantitative consumption and harmless disposal of kitchen waste.The existing object detection method based on an ImageNet pre-trained model is an effective way of sorting.Owing to significant domain gaps between natural images and kitchen waste images,it is difficult to reflect the characteristics of diverse scales and dense distribution in kitchen waste based on an ImageNet pre-trained model,leading to poor generalisation.In this article,the authors propose the first pre-trained model for kitchen waste sorting called KitWaSor,which combines both contrastive learning(CL)and masked image modelling(MIM)through self-supervised learning(SSL).First,to address the issue of diverse scales,the authors propose a mixed masking strategy by introducing an incomplete masking branch based on the original random masking branch.It prevents the complete loss of small-scale objects while avoiding excessive leakage of large-scale object pixels.Second,to address the issue of dense distribution,the authors introduce semantic consistency constraints on the basis of the mixed masking strategy.That is,object semantic reasoning is performed through semantic consistency constraints to compensate for the lack of contextual information.To train KitWaSor,the authors construct the first million-level kitchen waste dataset across seasonal and regional distributions,named KWD-Million.Extensive experiments show that KitWaSor achieves state-of-the-art(SOTA)performance on the two most relevant downstream tasks for kitchen waste sorting(i.e.image classification and object detection),demonstrating the effectiveness of the proposed KitWaSor.展开更多
Temperature-programmed desorption(TPD)is a fundamental technique in surface science and heterogeneous catalysis for characterizing adsorption behavior,and for extracting key parameters such as adsorption energy.Howeve...Temperature-programmed desorption(TPD)is a fundamental technique in surface science and heterogeneous catalysis for characterizing adsorption behavior,and for extracting key parameters such as adsorption energy.However,the majority of existing TPD data is accessible in the form of published images,which lacks structured and quantitative datasets.This constrains its utility for rigorous quantitative analysis and computational modelling.Using carbon monoxide(CO)which is a widely adopted probe molecule,a curated and standardized dataset of CO-TPD is constructed,encompassing 14 transition-metal single-crystal surfaces,including copper(Cu)and ruthenium(Ru).By systematically extracting numerical data points from published spectra and applying normalization,essential spectral features such as peak shape are fully preserved.The dataset also documents relevant experimental parameters,including heating rates,and was developed using a standardized protocol for data collection and quality control.This resource serves as both a reference library to support the deconvolution of TPD spectra from complex catalysts and an experimental benchmark for calibrating parameters in theoretical models.By providing a reliable and accessible data function,this work advances the microscopic understanding and the rational design of catalyst active centers.展开更多
Parkinson’s disease(PD)is a debilitating neurological disorder affecting over 10 million people worldwide.PD classification models using voice signals as input are common in the literature.It is believed that using d...Parkinson’s disease(PD)is a debilitating neurological disorder affecting over 10 million people worldwide.PD classification models using voice signals as input are common in the literature.It is believed that using deep learning algorithms further enhances performance;nevertheless,it is challenging due to the nature of small-scale and imbalanced PD datasets.This paper proposed a convolutional neural network-based deep support vector machine(CNN-DSVM)to automate the feature extraction process using CNN and extend the conventional SVM to a DSVM for better classification performance in small-scale PD datasets.A customized kernel function reduces the impact of biased classification towards the majority class(healthy candidates in our consideration).An improved generative adversarial network(IGAN)was designed to generate additional training data to enhance the model’s performance.For performance evaluation,the proposed algorithm achieves a sensitivity of 97.6%and a specificity of 97.3%.The performance comparison is evaluated from five perspectives,including comparisons with different data generation algorithms,feature extraction techniques,kernel functions,and existing works.Results reveal the effectiveness of the IGAN algorithm,which improves the sensitivity and specificity by 4.05%–4.72%and 4.96%–5.86%,respectively;and the effectiveness of the CNN-DSVM algorithm,which improves the sensitivity by 1.24%–57.4%and specificity by 1.04%–163%and reduces biased detection towards the majority class.The ablation experiments confirm the effectiveness of individual components.Two future research directions have also been suggested.展开更多
With the deep integration of smart manufacturing and IoT technologies,higher demands are placed on the intelligence and real-time performance of industrial equipment fault detection.For industrial fans,base bolt loose...With the deep integration of smart manufacturing and IoT technologies,higher demands are placed on the intelligence and real-time performance of industrial equipment fault detection.For industrial fans,base bolt loosening faults are difficult to identify through conventional spectrum analysis,and the extreme scarcity of fault data leads to limited training datasets,making traditional deep learning methods inaccurate in fault identification and incapable of detecting loosening severity.This paper employs Bayesian Learning by training on a small fault dataset collected from the actual operation of axial-flow fans in a factory to obtain posterior distribution.This method proposes specific data processing approaches and a configuration of Bayesian Convolutional Neural Network(BCNN).It can effectively improve the model’s generalization ability.Experimental results demonstrate high detection accuracy and alignment with real-world applications,offering practical significance and reference value for industrial fan bolt loosening detection under data-limited conditions.展开更多
Accurate purchase prediction in e-commerce critically depends on the quality of behavioral features.This paper proposes a layered and interpretable feature engineering framework that organizes user signals into three ...Accurate purchase prediction in e-commerce critically depends on the quality of behavioral features.This paper proposes a layered and interpretable feature engineering framework that organizes user signals into three layers:Basic,Conversion&Stability(efficiency and volatility across actions),and Advanced Interactions&Activity(crossbehavior synergies and intensity).Using real Taobao(Alibaba’s primary e-commerce platform)logs(57,976 records for 10,203 users;25 November–03 December 2017),we conducted a hierarchical,layer-wise evaluation that holds data splits and hyperparameters fixed while varying only the feature set to quantify each layer’s marginal contribution.Across logistic regression(LR),decision tree,random forest,XGBoost,and CatBoost models with stratified 5-fold cross-validation,the performance improvedmonotonically fromBasic to Conversion&Stability to Advanced features.With LR,F1 increased from 0.613(Basic)to 0.962(Advanced);boosted models achieved high discrimination(0.995 AUC Score)and an F1 score up to 0.983.Calibration and precision–recall analyses indicated strong ranking quality and acknowledged potential dataset and period biases given the short(9-day)window.By making feature contributions measurable and reproducible,the framework complements model-centric advances and offers a transparent blueprint for production-grade behavioralmodeling.The code and processed artifacts are publicly available,and future work will extend the validation to longer,seasonal datasets and hybrid approaches that combine automated feature learning with domain-driven design.展开更多
Objective To investigate methods for constructing a high-quality instructional dataset for traditional Chinese medicine(TCM)mental disorders and to validate its efficacy.Methods We proposed the Fine-Med-Mental-T&P...Objective To investigate methods for constructing a high-quality instructional dataset for traditional Chinese medicine(TCM)mental disorders and to validate its efficacy.Methods We proposed the Fine-Med-Mental-T&P methodology for constructing high-quality instruction datasets in TCM mental disorders.This approach integrates theoretical knowledge and practical case studies through a dual-track strategy.(i)Theoretical track:textbooks and guidelines on TCM mental disorders were manually segmented.Initial responses were generated using DeepSeek-V3,followed by refinement by the Qwen3-32B model to align the expression with human preferences.A screening algorithm was then applied to select 16000 high-quality instruction pairs.(ii)Practical track:starting from over 600 real clinical case seeds,diagnostic and therapeutic instruction pairs were generated using DeepSeek-V3 and subsequently screened through manual evaluation,resulting in 4000 high-quality practiceoriented instruction pairs.The integration of both tracks yielded the Med-Mental-Instruct-T&P dataset,comprising a total of 20000 instruction pairs.To validate the dataset’s effectiveness,three experimental evaluations(both manual and automated)were conducted:(i)comparative studies to compare the performance of models fine-tuned on different datasets;(ii)benchmarking to compare against mainstream TCM-specific large language models(LLMs);(iii)data ablation study to investigate the relationship between data volume and model performance.Results Experimental results demonstrate the superior performance of T&P-model finetuned on the Med-Mental-Instruct-T&P dataset.In the comparative study,the T&P-model significantly outperformed the baseline models trained solely on self-generated or purely human-curated baseline data.This superiority was evident in both automated metrics(ROUGEL>0.55)and expert manual evaluations(scoring above 7/10 across accuracy).In benchmark comparisons,the T&P-model also excelled against existing mainstream TCM LLMs(e.g.,HuatuoGPT and ZuoyiGPT).It showed particularly strong capabilities in handling diverse clinical presentations,including challenging disorders such as insomnia and coma,showcasing its robustness and versatility.Data ablation studies showed that T&P-model performance had an overall upward trend with minor fluctuations when training data increased from 10%to 50%;beyond 50%,performance improvement slowed significantly,with metrics plateauing and approaching a saturation point.展开更多
Dear Editor,This letter presents techniques to simplify dataset generation for instance segmentation of raw meat products,a critical step toward automating food production lines.Accurate segmentation is essential for ...Dear Editor,This letter presents techniques to simplify dataset generation for instance segmentation of raw meat products,a critical step toward automating food production lines.Accurate segmentation is essential for addressing challenges such as occlusions,indistinct edges,and stacked configurations,which demand large,diverse datasets.To meet these demands,we propose two complementary approaches:a semi-automatic annotation interface using tools like the segment anything model(SAM)and GrabCut and a synthetic data generation pipeline leveraging 3D-scanned models.These methods reduce reliance on real meat,mitigate food waste,and improve scalability.Experimental results demonstrate that incorporating synthetic data enhances segmentation model performance and,when combined with real data,further boosts accuracy,paving the way for more efficient automation in the food industry.展开更多
How to recognize targets with similar appearances from remote sensing images(RSIs) effectively and efficiently has become a big challenge. Recently, convolutional neural network(CNN) is preferred in the target classif...How to recognize targets with similar appearances from remote sensing images(RSIs) effectively and efficiently has become a big challenge. Recently, convolutional neural network(CNN) is preferred in the target classification due to the powerful feature representation ability and better performance. However,the training and testing of CNN mainly rely on single machine.Single machine has its natural limitation and bottleneck in processing RSIs due to limited hardware resources and huge time consuming. Besides, overfitting is a challenge for the CNN model due to the unbalance between RSIs data and the model structure.When a model is complex or the training data is relatively small,overfitting occurs and leads to a poor predictive performance. To address these problems, a distributed CNN architecture for RSIs target classification is proposed, which dramatically increases the training speed of CNN and system scalability. It improves the storage ability and processing efficiency of RSIs. Furthermore,Bayesian regularization approach is utilized in order to initialize the weights of the CNN extractor, which increases the robustness and flexibility of the CNN model. It helps prevent the overfitting and avoid the local optima caused by limited RSI training images or the inappropriate CNN structure. In addition, considering the efficiency of the Na¨?ve Bayes classifier, a distributed Na¨?ve Bayes classifier is designed to reduce the training cost. Compared with other algorithms, the proposed system and method perform the best and increase the recognition accuracy. The results show that the distributed system framework and the proposed algorithms are suitable for RSIs target classification tasks.展开更多
Vision-Language-Navigation(VLN) task is a cross-modality task that combines natural language processing and computer vision. This task requires the agent to automatically move to the destination according to the natur...Vision-Language-Navigation(VLN) task is a cross-modality task that combines natural language processing and computer vision. This task requires the agent to automatically move to the destination according to the natural language instruction and the observed surrounding visual information. To make the best decision, in every step during the navigation, the agent should pay more attention to understanding the objects, the object attributes, and the object relationships. But most current methods process all received textual and visual information equally. Therefore, this paper integrates more detailed semantic connections between visual and textual information through three pre-training tasks(object prediction, object attributes prediction, and object relationship prediction). The model will learn better fusion representation and alignment between these two types of information to improve the success rate(SR) and generalization. The experiments show that compared with the former baseline models, the SR on the unseen validation set(Val Unseen) increased by 7%, and the SR weighted by path length(SPL) increased by 7%;the SR on the test set(Test) increased 4%, SPL increased by 3%.展开更多
Handheld ultrasound devices are known for their portability and affordability,making them widely utilized in underdeveloped areas and community healthcare for rapid diagnosis and early screening.However,the image qual...Handheld ultrasound devices are known for their portability and affordability,making them widely utilized in underdeveloped areas and community healthcare for rapid diagnosis and early screening.However,the image quality of handheld ultrasound devices is not always satisfactory due to the limited equipment size,which hinders accurate diagnoses by doctors.At the same time,paired ultrasound images are difficult to obtain from the clinic because imaging process is complicated.Therefore,we propose a modified cycle generative adversarial network(cycleGAN) for ultrasound image enhancement from multiple organs via unpaired pre-training.We introduce an ultrasound image pre-training method that does not require paired images,alleviating the requirement for large-scale paired datasets.We also propose an enhanced block with different structures in the pre-training and fine-tuning phases,which can help achieve the goals of different training phases.To improve the robustness of the model,we add Gaussian noise to the training images as data augmentation.Our approach is effective in obtaining the best quantitative evaluation results using a small number of parameters and less training costs to improve the quality of handheld ultrasound devices.展开更多
Web-based training is growing quickly in popularit y for professionals in industrial organizations and large enterprises. The savings in cost and time are significant. The instructor-led trainings are bounded by time ...Web-based training is growing quickly in popularit y for professionals in industrial organizations and large enterprises. The savings in cost and time are significant. The instructor-led trainings are bounded by time and place, not to mention the cost involved in traveling, accommodation and training venue. However, in the most online training courses, all trainees are given same training materials and teaching paradigms. The problem of differentia ting the trainees’ abilities is the main concern. We need a pre-training test t o identify and classify of the weaknesses and strengths of differentiate trainee s so as to devise an appropriate training programs for the trainees. Adaptation of a Web-based Computer adaptive Test (CAT) for the pre-training test make the web-based training more efficient. The advantages of CAT are self-pacing, eff iciency, time and cost saving, immediate scoring and feedback, accuracy and secu rity, etc (Rudner, 1998; UMN, 1999; Novell, 2000; Linacre, 2000; Windowsglore, 2 000). Moreover, Web-based CAT also gives greater flexibility and convenience. T his paper describes how this CAT tool is built, how it helps instructor identify the strengths and weaknesses of trainees, and how to assure quality on the CAT system.展开更多
As important geological data,a geological report contains rich expert and geological knowledge,but the challenge facing current research into geological knowledge extraction and mining is how to render accurate unders...As important geological data,a geological report contains rich expert and geological knowledge,but the challenge facing current research into geological knowledge extraction and mining is how to render accurate understanding of geological reports guided by domain knowledge.While generic named entity recognition models/tools can be utilized for the processing of geoscience reports/documents,their effectiveness is hampered by a dearth of domain-specific knowledge,which in turn leads to a pronounced decline in recognition accuracy.This study summarizes six types of typical geological entities,with reference to the ontological system of geological domains and builds a high quality corpus for the task of geological named entity recognition(GNER).In addition,Geo Wo BERT-adv BGP(Geological Word-base BERTadversarial training Bi-directional Long Short-Term Memory Global Pointer)is proposed to address the issues of ambiguity,diversity and nested entities for the geological entities.The model first uses the fine-tuned word granularitybased pre-training model Geo Wo BERT(Geological Word-base BERT)and combines the text features that are extracted using the Bi LSTM(Bi-directional Long Short-Term Memory),followed by an adversarial training algorithm to improve the robustness of the model and enhance its resistance to interference,the decoding finally being performed using a global association pointer algorithm.The experimental results show that the proposed model for the constructed dataset achieves high performance and is capable of mining the rich geological information.展开更多
Knowing the influence of the size of datasets for regression models can help in improving the accuracy of a solar power forecast and make the most out of renewable energy systems.This research explores the influence o...Knowing the influence of the size of datasets for regression models can help in improving the accuracy of a solar power forecast and make the most out of renewable energy systems.This research explores the influence of dataset size on the accuracy and reliability of regression models for solar power prediction,contributing to better forecasting methods.The study analyzes data from two solar panels,aSiMicro03036 and aSiTandem72-46,over 7,14,17,21,28,and 38 days,with each dataset comprising five independent and one dependent parameter,and split 80–20 for training and testing.Results indicate that Random Forest consistently outperforms other models,achieving the highest correlation coefficient of 0.9822 and the lowest Mean Absolute Error(MAE)of 2.0544 on the aSiTandem72-46 panel with 21 days of data.For the aSiMicro03036 panel,the best MAE of 4.2978 was reached using the k-Nearest Neighbor(k-NN)algorithm,which was set up as instance-based k-Nearest neighbors(IBk)in Weka after being trained on 17 days of data.Regression performance for most models(excluding IBk)stabilizes at 14 days or more.Compared to the 7-day dataset,increasing to 21 days reduced the MAE by around 20%and improved correlation coefficients by around 2.1%,highlighting the value of moderate dataset expansion.These findings suggest that datasets spanning 17 to 21 days,with 80%used for training,can significantly enhance the predictive accuracy of solar power generation models.展开更多
Scientific knowledge on the chemical compositions of fine particulate matter(PM_(2.5)) is essential for properly assessing its health and climate effects,and for decisionmakers to develop efficient mitigation strategi...Scientific knowledge on the chemical compositions of fine particulate matter(PM_(2.5)) is essential for properly assessing its health and climate effects,and for decisionmakers to develop efficient mitigation strategies.A high-resolution PM_(2.5) chemical composition dataset(CAQRA-aerosol)is developed in this study,which provides hourly maps of organic carbon,black carbon,ammonium,nitrate,and sulfate in China from 2013 to 2020 with a horizontal resolution of 15 km.This paper describes the method,access,and validation results of this dataset.It shows that CAQRA-aerosol has good consistency with observations and achieves higher or comparable accuracy with previous PM_(2.5) composition datasets.Based on CAQRA-aerosol,spatiotemporal changes of different PM_(2.5) compositions were investigated from a national viewpoint,which emphasizes different changes of nitrate from other compositions.The estimated annual rate of population-weighted concentrations of nitrate is 0.23μg m^(−3)yr^(−1) from 2015 to 2020,compared with−0.19 to−1.1μg m^(−3)yr^(−1) for other compositions.The whole dataset is freely available from the China Air Pollution Data Center(https://doi.org/10.12423/capdb_PKU.2023.DA).展开更多
Accurate fine-grained geospatial scene classification using remote sensing imagery is essential for a wide range of applications.However,existing approaches often rely on manually zooming remote sensing images at diff...Accurate fine-grained geospatial scene classification using remote sensing imagery is essential for a wide range of applications.However,existing approaches often rely on manually zooming remote sensing images at different scales to create typical scene samples.This approach fails to adequately support the fixed-resolution image interpretation requirements in real-world scenarios.To address this limitation,we introduce the million-scale fine-grained geospatial scene classification dataset(MEET),which contains over 1.03 million zoom-free remote sensing scene samples,manually annotated into 80 fine-grained categories.In MEET,each scene sample follows a scene-in-scene layout,where the central scene serves as the reference,and auxiliary scenes provide crucial spatial context for fine-grained classification.Moreover,to tackle the emerging challenge of scene-in-scene classification,we present the context-aware transformer(CAT),a model specifically designed for this task,which adaptively fuses spatial context to accurately classify the scene samples.CAT adaptively fuses spatial context to accurately classify the scene samples by learning attentional features that capture the relationships between the center and auxiliary scenes.Based on MEET,we establish a comprehensive benchmark for fine-grained geospatial scene classification,evaluating CAT against 11 competitive baselines.The results demonstrate that CAT significantly outperforms these baselines,achieving a 1.88%higher balanced accuracy(BA)with the Swin-Large backbone,and a notable 7.87%improvement with the Swin-Huge backbone.Further experiments validate the effectiveness of each module in CAT and show the practical applicability of CAT in the urban functional zone mapping.The source code and dataset will be publicly available at https://jerrywyn.github.io/project/MEET.html.展开更多
This data set collects,compares and contrasts the capacities and structures of a series of hard carbon materials,and then searches for correlations between structure and electrochemical performance.The capacity data o...This data set collects,compares and contrasts the capacities and structures of a series of hard carbon materials,and then searches for correlations between structure and electrochemical performance.The capacity data of the hard carbons were obtained by charge/discharge tests and the materials were characterized by XRD,gas adsorption,true density tests and SAXS.In particular,the fitting of SAXS gave a series of structural parameters which showed good characterization.The related test details are given with the structural data of the hard carbons and the electrochemical performance of the sodium-ion batteries.展开更多
The automatic classification of thyroid nodules in ultrasound images is a critical research focus in medical imaging.However,publicly available thyroid ultrasound datasets remain scarce.In this study,we developed the ...The automatic classification of thyroid nodules in ultrasound images is a critical research focus in medical imaging.However,publicly available thyroid ultrasound datasets remain scarce.In this study,we developed the Ultrasound Dataset for Thyroid Nodules(UD-TN),a comprehensive dataset containing 10,495 labeled images classified as benign or malignant based on pathology-confirmed results.To establish a benchmark,we proposed the Thyroid Ultrasound Image Neural Network(ThyUNet),a deep learning model designed for accurate nodule classification.By incorporating high-resolution feature enhancement,instance normalization,and dilated convolutions into residual blocks,ThyUNet excels in extracting fine-grained features,particularly for small nodules.Experimental results demonstrate that ThyUNet achieves state-of-the-art performance,with an accuracy of 89.7%,a sensitivity of 0.879,and a specificity of 0.910 on the testing set.These results surpass those of other advanced architectures,highlighting the model’s effectiveness.UD-TN and ThyUNet contribute significantly to advancing intelligent medical diagnostics.Dataset details and access instructions are available at https://github.com/18811755633/Sample-of-UD-TN.展开更多
基金supported by the National Natural Science Foundation of China(Grant Nos.:U23A20530,82273858,and 82173746)the National Key Research and Development Programof China(Grant No.:2023YFF1204904)Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism(Shanghai Municipal Education Commission,China).
文摘Activity cliffs(ACs)are generally defined as pairs of similar compounds that only differ by a minor structural modification but exhibit a large difference in their binding affinity for a given target.ACs offer crucial insights that aid medicinal chemists in optimizing molecular structures.Nonetheless,they also form a major source of prediction error in structure-activity relationship(SAR)models.To date,several studies have demonstrated that deep neural networks based on molecular images or graphs might need to be improved further in predicting the potency of ACs.In this paper,we integrated the triplet loss in face recognition with pre-training strategy to develop a prediction model ACtriplet,tailored for ACs.Through extensive comparison with multiple baseline models on 30 benchmark datasets,the results showed that ACtriplet was significantly better than those deep learning(DL)models without pretraining.In addition,we explored the effect of pre-training on data representation.Finally,the case study demonstrated that our model's interpretability module could explain the prediction results reasonably.In the dilemma that the amount of data could not be increased rapidly,this innovative framework would better make use of the existing data,which would propel the potential of DL in the early stage of drug discovery and optimization.
基金supported via funding from Prince Sattam bin Abdulaziz University(PSAU/2025/R/1446)Princess Nourah bint Abdulrahman University(PNURSP2025R300)Prince Sultan University.
文摘Deep neural networks provide accurate results for most applications.However,they need a big dataset to train properly.Providing a big dataset is a significant challenge in most applications.Image augmentation refers to techniques that increase the amount of image data.Common operations for image augmentation include changes in illumination,rotation,contrast,size,viewing angle,and others.Recently,Generative Adversarial Networks(GANs)have been employed for image generation.However,like image augmentation methods,GAN approaches can only generate images that are similar to the original images.Therefore,they also cannot generate new classes of data.Texture images presentmore challenges than general images,and generating textures is more complex than creating other types of images.This study proposes a gradient-based deep neural network method that generates a new class of texture.It is possible to rapidly generate new classes of textures using different kernels from pre-trained deep networks.After generating new textures for each class,the number of textures increases through image augmentation.During this process,several techniques are proposed to automatically remove incomplete and similar textures that are created.The proposed method is faster than some well-known generative networks by around 4 to 10 times.In addition,the quality of the generated textures surpasses that of these networks.The proposed method can generate textures that surpass those of someGANs and parametric models in certain image qualitymetrics.It can provide a big texture dataset to train deep networks.A new big texture dataset is created artificially using the proposed method.This dataset is approximately 2 GB in size and comprises 30,000 textures,each 150×150 pixels in size,organized into 600 classes.It is uploaded to the Kaggle site and Google Drive.This dataset is called BigTex.Compared to other texture datasets,the proposed dataset is the largest and can serve as a comprehensive texture dataset for training more powerful deep neural networks and mitigating overfitting.
基金National Key Research and Development Program of China,Grant/Award Number:2021YFC1910402。
文摘Intelligent sorting is an important prerequisite for the full quantitative consumption and harmless disposal of kitchen waste.The existing object detection method based on an ImageNet pre-trained model is an effective way of sorting.Owing to significant domain gaps between natural images and kitchen waste images,it is difficult to reflect the characteristics of diverse scales and dense distribution in kitchen waste based on an ImageNet pre-trained model,leading to poor generalisation.In this article,the authors propose the first pre-trained model for kitchen waste sorting called KitWaSor,which combines both contrastive learning(CL)and masked image modelling(MIM)through self-supervised learning(SSL).First,to address the issue of diverse scales,the authors propose a mixed masking strategy by introducing an incomplete masking branch based on the original random masking branch.It prevents the complete loss of small-scale objects while avoiding excessive leakage of large-scale object pixels.Second,to address the issue of dense distribution,the authors introduce semantic consistency constraints on the basis of the mixed masking strategy.That is,object semantic reasoning is performed through semantic consistency constraints to compensate for the lack of contextual information.To train KitWaSor,the authors construct the first million-level kitchen waste dataset across seasonal and regional distributions,named KWD-Million.Extensive experiments show that KitWaSor achieves state-of-the-art(SOTA)performance on the two most relevant downstream tasks for kitchen waste sorting(i.e.image classification and object detection),demonstrating the effectiveness of the proposed KitWaSor.
基金Supported by the Robotic AI-Scientist Platform of Chinese Academy of SciencesNational Natural Science Foundation of China(22372185)+2 种基金Youth Talent Development Program of SKLCC(2025BWZ009)Natural Science Foundation of Shanxi Province(202203021221219)Research on the Construction of Scientific and Technological Innovation Think Tank of Shanxi Association for Science and Technology(KXKT202542)。
文摘Temperature-programmed desorption(TPD)is a fundamental technique in surface science and heterogeneous catalysis for characterizing adsorption behavior,and for extracting key parameters such as adsorption energy.However,the majority of existing TPD data is accessible in the form of published images,which lacks structured and quantitative datasets.This constrains its utility for rigorous quantitative analysis and computational modelling.Using carbon monoxide(CO)which is a widely adopted probe molecule,a curated and standardized dataset of CO-TPD is constructed,encompassing 14 transition-metal single-crystal surfaces,including copper(Cu)and ruthenium(Ru).By systematically extracting numerical data points from published spectra and applying normalization,essential spectral features such as peak shape are fully preserved.The dataset also documents relevant experimental parameters,including heating rates,and was developed using a standardized protocol for data collection and quality control.This resource serves as both a reference library to support the deconvolution of TPD spectra from complex catalysts and an experimental benchmark for calibrating parameters in theoretical models.By providing a reliable and accessible data function,this work advances the microscopic understanding and the rational design of catalyst active centers.
基金The work described in this paper was fully supported by a grant from Hong Kong Metropolitan University(RIF/2021/05).
文摘Parkinson’s disease(PD)is a debilitating neurological disorder affecting over 10 million people worldwide.PD classification models using voice signals as input are common in the literature.It is believed that using deep learning algorithms further enhances performance;nevertheless,it is challenging due to the nature of small-scale and imbalanced PD datasets.This paper proposed a convolutional neural network-based deep support vector machine(CNN-DSVM)to automate the feature extraction process using CNN and extend the conventional SVM to a DSVM for better classification performance in small-scale PD datasets.A customized kernel function reduces the impact of biased classification towards the majority class(healthy candidates in our consideration).An improved generative adversarial network(IGAN)was designed to generate additional training data to enhance the model’s performance.For performance evaluation,the proposed algorithm achieves a sensitivity of 97.6%and a specificity of 97.3%.The performance comparison is evaluated from five perspectives,including comparisons with different data generation algorithms,feature extraction techniques,kernel functions,and existing works.Results reveal the effectiveness of the IGAN algorithm,which improves the sensitivity and specificity by 4.05%–4.72%and 4.96%–5.86%,respectively;and the effectiveness of the CNN-DSVM algorithm,which improves the sensitivity by 1.24%–57.4%and specificity by 1.04%–163%and reduces biased detection towards the majority class.The ablation experiments confirm the effectiveness of individual components.Two future research directions have also been suggested.
基金funded by the Zhejiang Provincial Key Science and Technology“LingYan”Project Foundation,grant number 2023C01145Zhejiang Gongshang University Higher Education Research Projects,grant number Xgy22028.
文摘With the deep integration of smart manufacturing and IoT technologies,higher demands are placed on the intelligence and real-time performance of industrial equipment fault detection.For industrial fans,base bolt loosening faults are difficult to identify through conventional spectrum analysis,and the extreme scarcity of fault data leads to limited training datasets,making traditional deep learning methods inaccurate in fault identification and incapable of detecting loosening severity.This paper employs Bayesian Learning by training on a small fault dataset collected from the actual operation of axial-flow fans in a factory to obtain posterior distribution.This method proposes specific data processing approaches and a configuration of Bayesian Convolutional Neural Network(BCNN).It can effectively improve the model’s generalization ability.Experimental results demonstrate high detection accuracy and alignment with real-world applications,offering practical significance and reference value for industrial fan bolt loosening detection under data-limited conditions.
基金supported by the research fund of Hanyang University(HY-202500000001616).
文摘Accurate purchase prediction in e-commerce critically depends on the quality of behavioral features.This paper proposes a layered and interpretable feature engineering framework that organizes user signals into three layers:Basic,Conversion&Stability(efficiency and volatility across actions),and Advanced Interactions&Activity(crossbehavior synergies and intensity).Using real Taobao(Alibaba’s primary e-commerce platform)logs(57,976 records for 10,203 users;25 November–03 December 2017),we conducted a hierarchical,layer-wise evaluation that holds data splits and hyperparameters fixed while varying only the feature set to quantify each layer’s marginal contribution.Across logistic regression(LR),decision tree,random forest,XGBoost,and CatBoost models with stratified 5-fold cross-validation,the performance improvedmonotonically fromBasic to Conversion&Stability to Advanced features.With LR,F1 increased from 0.613(Basic)to 0.962(Advanced);boosted models achieved high discrimination(0.995 AUC Score)and an F1 score up to 0.983.Calibration and precision–recall analyses indicated strong ranking quality and acknowledged potential dataset and period biases given the short(9-day)window.By making feature contributions measurable and reproducible,the framework complements model-centric advances and offers a transparent blueprint for production-grade behavioralmodeling.The code and processed artifacts are publicly available,and future work will extend the validation to longer,seasonal datasets and hybrid approaches that combine automated feature learning with domain-driven design.
基金Key Scientific Research Project of the Hunan Provincial Department of Education(23A312).
文摘Objective To investigate methods for constructing a high-quality instructional dataset for traditional Chinese medicine(TCM)mental disorders and to validate its efficacy.Methods We proposed the Fine-Med-Mental-T&P methodology for constructing high-quality instruction datasets in TCM mental disorders.This approach integrates theoretical knowledge and practical case studies through a dual-track strategy.(i)Theoretical track:textbooks and guidelines on TCM mental disorders were manually segmented.Initial responses were generated using DeepSeek-V3,followed by refinement by the Qwen3-32B model to align the expression with human preferences.A screening algorithm was then applied to select 16000 high-quality instruction pairs.(ii)Practical track:starting from over 600 real clinical case seeds,diagnostic and therapeutic instruction pairs were generated using DeepSeek-V3 and subsequently screened through manual evaluation,resulting in 4000 high-quality practiceoriented instruction pairs.The integration of both tracks yielded the Med-Mental-Instruct-T&P dataset,comprising a total of 20000 instruction pairs.To validate the dataset’s effectiveness,three experimental evaluations(both manual and automated)were conducted:(i)comparative studies to compare the performance of models fine-tuned on different datasets;(ii)benchmarking to compare against mainstream TCM-specific large language models(LLMs);(iii)data ablation study to investigate the relationship between data volume and model performance.Results Experimental results demonstrate the superior performance of T&P-model finetuned on the Med-Mental-Instruct-T&P dataset.In the comparative study,the T&P-model significantly outperformed the baseline models trained solely on self-generated or purely human-curated baseline data.This superiority was evident in both automated metrics(ROUGEL>0.55)and expert manual evaluations(scoring above 7/10 across accuracy).In benchmark comparisons,the T&P-model also excelled against existing mainstream TCM LLMs(e.g.,HuatuoGPT and ZuoyiGPT).It showed particularly strong capabilities in handling diverse clinical presentations,including challenging disorders such as insomnia and coma,showcasing its robustness and versatility.Data ablation studies showed that T&P-model performance had an overall upward trend with minor fluctuations when training data increased from 10%to 50%;beyond 50%,performance improvement slowed significantly,with metrics plateauing and approaching a saturation point.
基金supported by European Union’s Horizon Europe research and innovation programme,project AGILEHAND(Smart Grading,Handling and Packaging Solutions for Soft and Deformable Products in Agile and Reconfigurable Lines)(101092043).
文摘Dear Editor,This letter presents techniques to simplify dataset generation for instance segmentation of raw meat products,a critical step toward automating food production lines.Accurate segmentation is essential for addressing challenges such as occlusions,indistinct edges,and stacked configurations,which demand large,diverse datasets.To meet these demands,we propose two complementary approaches:a semi-automatic annotation interface using tools like the segment anything model(SAM)and GrabCut and a synthetic data generation pipeline leveraging 3D-scanned models.These methods reduce reliance on real meat,mitigate food waste,and improve scalability.Experimental results demonstrate that incorporating synthetic data enhances segmentation model performance and,when combined with real data,further boosts accuracy,paving the way for more efficient automation in the food industry.
基金supported by the National Natural Science Foundation of China(U1435220)
文摘How to recognize targets with similar appearances from remote sensing images(RSIs) effectively and efficiently has become a big challenge. Recently, convolutional neural network(CNN) is preferred in the target classification due to the powerful feature representation ability and better performance. However,the training and testing of CNN mainly rely on single machine.Single machine has its natural limitation and bottleneck in processing RSIs due to limited hardware resources and huge time consuming. Besides, overfitting is a challenge for the CNN model due to the unbalance between RSIs data and the model structure.When a model is complex or the training data is relatively small,overfitting occurs and leads to a poor predictive performance. To address these problems, a distributed CNN architecture for RSIs target classification is proposed, which dramatically increases the training speed of CNN and system scalability. It improves the storage ability and processing efficiency of RSIs. Furthermore,Bayesian regularization approach is utilized in order to initialize the weights of the CNN extractor, which increases the robustness and flexibility of the CNN model. It helps prevent the overfitting and avoid the local optima caused by limited RSI training images or the inappropriate CNN structure. In addition, considering the efficiency of the Na¨?ve Bayes classifier, a distributed Na¨?ve Bayes classifier is designed to reduce the training cost. Compared with other algorithms, the proposed system and method perform the best and increase the recognition accuracy. The results show that the distributed system framework and the proposed algorithms are suitable for RSIs target classification tasks.
基金Supported by the National Natural Science Foundation of China (62006150)Songjiang District Science and Technology Research Project (19SJKJGG83)Shanghai Young Science and Technology Talents Sailing Program (19YF1418400)。
文摘Vision-Language-Navigation(VLN) task is a cross-modality task that combines natural language processing and computer vision. This task requires the agent to automatically move to the destination according to the natural language instruction and the observed surrounding visual information. To make the best decision, in every step during the navigation, the agent should pay more attention to understanding the objects, the object attributes, and the object relationships. But most current methods process all received textual and visual information equally. Therefore, this paper integrates more detailed semantic connections between visual and textual information through three pre-training tasks(object prediction, object attributes prediction, and object relationship prediction). The model will learn better fusion representation and alignment between these two types of information to improve the success rate(SR) and generalization. The experiments show that compared with the former baseline models, the SR on the unseen validation set(Val Unseen) increased by 7%, and the SR weighted by path length(SPL) increased by 7%;the SR on the test set(Test) increased 4%, SPL increased by 3%.
文摘Handheld ultrasound devices are known for their portability and affordability,making them widely utilized in underdeveloped areas and community healthcare for rapid diagnosis and early screening.However,the image quality of handheld ultrasound devices is not always satisfactory due to the limited equipment size,which hinders accurate diagnoses by doctors.At the same time,paired ultrasound images are difficult to obtain from the clinic because imaging process is complicated.Therefore,we propose a modified cycle generative adversarial network(cycleGAN) for ultrasound image enhancement from multiple organs via unpaired pre-training.We introduce an ultrasound image pre-training method that does not require paired images,alleviating the requirement for large-scale paired datasets.We also propose an enhanced block with different structures in the pre-training and fine-tuning phases,which can help achieve the goals of different training phases.To improve the robustness of the model,we add Gaussian noise to the training images as data augmentation.Our approach is effective in obtaining the best quantitative evaluation results using a small number of parameters and less training costs to improve the quality of handheld ultrasound devices.
文摘Web-based training is growing quickly in popularit y for professionals in industrial organizations and large enterprises. The savings in cost and time are significant. The instructor-led trainings are bounded by time and place, not to mention the cost involved in traveling, accommodation and training venue. However, in the most online training courses, all trainees are given same training materials and teaching paradigms. The problem of differentia ting the trainees’ abilities is the main concern. We need a pre-training test t o identify and classify of the weaknesses and strengths of differentiate trainee s so as to devise an appropriate training programs for the trainees. Adaptation of a Web-based Computer adaptive Test (CAT) for the pre-training test make the web-based training more efficient. The advantages of CAT are self-pacing, eff iciency, time and cost saving, immediate scoring and feedback, accuracy and secu rity, etc (Rudner, 1998; UMN, 1999; Novell, 2000; Linacre, 2000; Windowsglore, 2 000). Moreover, Web-based CAT also gives greater flexibility and convenience. T his paper describes how this CAT tool is built, how it helps instructor identify the strengths and weaknesses of trainees, and how to assure quality on the CAT system.
基金financially supported by the Natural Science Foundation of China(Grant No.42301492)the National Key R&D Program of China(Grant Nos.2022YFF0711600,2022YFF0801201,2022YFF0801200)+3 种基金the Major Special Project of Xinjiang(Grant No.2022A03009-3)the Open Fund of Key Laboratory of Urban Land Resources Monitoring and Simulation,Ministry of Natural Resources(Grant No.KF-2022-07014)the Opening Fund of the Key Laboratory of the Geological Survey and Evaluation of the Ministry of Education(Grant No.GLAB 2023ZR01)the Fundamental Research Funds for the Central Universities。
文摘As important geological data,a geological report contains rich expert and geological knowledge,but the challenge facing current research into geological knowledge extraction and mining is how to render accurate understanding of geological reports guided by domain knowledge.While generic named entity recognition models/tools can be utilized for the processing of geoscience reports/documents,their effectiveness is hampered by a dearth of domain-specific knowledge,which in turn leads to a pronounced decline in recognition accuracy.This study summarizes six types of typical geological entities,with reference to the ontological system of geological domains and builds a high quality corpus for the task of geological named entity recognition(GNER).In addition,Geo Wo BERT-adv BGP(Geological Word-base BERTadversarial training Bi-directional Long Short-Term Memory Global Pointer)is proposed to address the issues of ambiguity,diversity and nested entities for the geological entities.The model first uses the fine-tuned word granularitybased pre-training model Geo Wo BERT(Geological Word-base BERT)and combines the text features that are extracted using the Bi LSTM(Bi-directional Long Short-Term Memory),followed by an adversarial training algorithm to improve the robustness of the model and enhance its resistance to interference,the decoding finally being performed using a global association pointer algorithm.The experimental results show that the proposed model for the constructed dataset achieves high performance and is capable of mining the rich geological information.
文摘Knowing the influence of the size of datasets for regression models can help in improving the accuracy of a solar power forecast and make the most out of renewable energy systems.This research explores the influence of dataset size on the accuracy and reliability of regression models for solar power prediction,contributing to better forecasting methods.The study analyzes data from two solar panels,aSiMicro03036 and aSiTandem72-46,over 7,14,17,21,28,and 38 days,with each dataset comprising five independent and one dependent parameter,and split 80–20 for training and testing.Results indicate that Random Forest consistently outperforms other models,achieving the highest correlation coefficient of 0.9822 and the lowest Mean Absolute Error(MAE)of 2.0544 on the aSiTandem72-46 panel with 21 days of data.For the aSiMicro03036 panel,the best MAE of 4.2978 was reached using the k-Nearest Neighbor(k-NN)algorithm,which was set up as instance-based k-Nearest neighbors(IBk)in Weka after being trained on 17 days of data.Regression performance for most models(excluding IBk)stabilizes at 14 days or more.Compared to the 7-day dataset,increasing to 21 days reduced the MAE by around 20%and improved correlation coefficients by around 2.1%,highlighting the value of moderate dataset expansion.These findings suggest that datasets spanning 17 to 21 days,with 80%used for training,can significantly enhance the predictive accuracy of solar power generation models.
基金support from the National Key Scientific and Technological Infrastructure project “Earth System Science Numerical Simulator Facility” (Earth Lab)sponsored by the National Natural Science Foundation of China (Grant Nos. 42175132, 92044303, and 42205119)+2 种基金the National Key R&D Program (Grant Nos. 2020YFA0607802 and 2022YFC3703003)the CAS Information Technology Program (Grant No. CAS-WX2021SF-0107-02)the fellowship of China Postdoctoral Science Foundation (Grant No. 2022M723093)
文摘Scientific knowledge on the chemical compositions of fine particulate matter(PM_(2.5)) is essential for properly assessing its health and climate effects,and for decisionmakers to develop efficient mitigation strategies.A high-resolution PM_(2.5) chemical composition dataset(CAQRA-aerosol)is developed in this study,which provides hourly maps of organic carbon,black carbon,ammonium,nitrate,and sulfate in China from 2013 to 2020 with a horizontal resolution of 15 km.This paper describes the method,access,and validation results of this dataset.It shows that CAQRA-aerosol has good consistency with observations and achieves higher or comparable accuracy with previous PM_(2.5) composition datasets.Based on CAQRA-aerosol,spatiotemporal changes of different PM_(2.5) compositions were investigated from a national viewpoint,which emphasizes different changes of nitrate from other compositions.The estimated annual rate of population-weighted concentrations of nitrate is 0.23μg m^(−3)yr^(−1) from 2015 to 2020,compared with−0.19 to−1.1μg m^(−3)yr^(−1) for other compositions.The whole dataset is freely available from the China Air Pollution Data Center(https://doi.org/10.12423/capdb_PKU.2023.DA).
基金supported by the National Natural Science Foundation of China(42030102,42371321).
文摘Accurate fine-grained geospatial scene classification using remote sensing imagery is essential for a wide range of applications.However,existing approaches often rely on manually zooming remote sensing images at different scales to create typical scene samples.This approach fails to adequately support the fixed-resolution image interpretation requirements in real-world scenarios.To address this limitation,we introduce the million-scale fine-grained geospatial scene classification dataset(MEET),which contains over 1.03 million zoom-free remote sensing scene samples,manually annotated into 80 fine-grained categories.In MEET,each scene sample follows a scene-in-scene layout,where the central scene serves as the reference,and auxiliary scenes provide crucial spatial context for fine-grained classification.Moreover,to tackle the emerging challenge of scene-in-scene classification,we present the context-aware transformer(CAT),a model specifically designed for this task,which adaptively fuses spatial context to accurately classify the scene samples.CAT adaptively fuses spatial context to accurately classify the scene samples by learning attentional features that capture the relationships between the center and auxiliary scenes.Based on MEET,we establish a comprehensive benchmark for fine-grained geospatial scene classification,evaluating CAT against 11 competitive baselines.The results demonstrate that CAT significantly outperforms these baselines,achieving a 1.88%higher balanced accuracy(BA)with the Swin-Large backbone,and a notable 7.87%improvement with the Swin-Huge backbone.Further experiments validate the effectiveness of each module in CAT and show the practical applicability of CAT in the urban functional zone mapping.The source code and dataset will be publicly available at https://jerrywyn.github.io/project/MEET.html.
基金supported by the National Natural Science Foundation of China(22379157)CAS Project for Young Scientists in Basic Research(YSBR-102)+2 种基金Institute of Coal Chemistry,Chinese Academy of Sciences(SCJC-XCL-2023-13,SCJCXCL-2023-10)Talent Projects for Outstanding Doctoral Students to Work in Shanxi Province(E3SWR4791Z)Fundamental Research Program of Shanxi Province(202403021222485).
文摘This data set collects,compares and contrasts the capacities and structures of a series of hard carbon materials,and then searches for correlations between structure and electrochemical performance.The capacity data of the hard carbons were obtained by charge/discharge tests and the materials were characterized by XRD,gas adsorption,true density tests and SAXS.In particular,the fitting of SAXS gave a series of structural parameters which showed good characterization.The related test details are given with the structural data of the hard carbons and the electrochemical performance of the sodium-ion batteries.
基金the Young Scientists Fund of the National Natural Science Foundation of China(Grant No.82402274 and 82272008)the Science&Technology Development Fund of Tianjin Education Commission for Higher Education(Grant No.2021KJ194).
文摘The automatic classification of thyroid nodules in ultrasound images is a critical research focus in medical imaging.However,publicly available thyroid ultrasound datasets remain scarce.In this study,we developed the Ultrasound Dataset for Thyroid Nodules(UD-TN),a comprehensive dataset containing 10,495 labeled images classified as benign or malignant based on pathology-confirmed results.To establish a benchmark,we proposed the Thyroid Ultrasound Image Neural Network(ThyUNet),a deep learning model designed for accurate nodule classification.By incorporating high-resolution feature enhancement,instance normalization,and dilated convolutions into residual blocks,ThyUNet excels in extracting fine-grained features,particularly for small nodules.Experimental results demonstrate that ThyUNet achieves state-of-the-art performance,with an accuracy of 89.7%,a sensitivity of 0.879,and a specificity of 0.910 on the testing set.These results surpass those of other advanced architectures,highlighting the model’s effectiveness.UD-TN and ThyUNet contribute significantly to advancing intelligent medical diagnostics.Dataset details and access instructions are available at https://github.com/18811755633/Sample-of-UD-TN.