In this study, eight different varieties of maize seeds were used as the research objects, and 81 combinations of preprocessing methods were applied to the original spectra. Through comparison, Savitzky-Golay (SG) smoothing combined with multiplicative scatter correction (MSC) and maximum-minimum normalization (MN) was identified as the optimal preprocessing technique. The competitive adaptive reweighted sampling (CARS) method, the successive projections algorithm (SPA), and their combination were employed to extract feature wavelengths. Classification models based on back propagation (BP), support vector machine (SVM), random forest (RF), and partial least squares (PLS) were established using both full-band data and the feature wavelengths. Among all models, the (CARS-SPA)-BP model achieved the highest accuracy, 98.44%. This study offers novel insights and methodologies for the rapid and accurate identification of maize seeds as well as other crop seeds.
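The SG-MSC-MN chain identified above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the window length, polynomial order, and the synthetic spectra are assumptions, and Savitzky-Golay smoothing is built from first principles with NumPy.

```python
import numpy as np

def savgol(y, window=11, poly=2):
    """Savitzky-Golay smoothing via least-squares filter coefficients."""
    half = window // 2
    x = np.arange(-half, half + 1)
    A = np.vander(x, poly + 1, increasing=True)   # columns: x^0, x^1, x^2, ...
    coeffs = np.linalg.pinv(A)[0]                 # weights giving the fitted value at x = 0
    pad = np.pad(y, half, mode="edge")
    return np.convolve(pad, coeffs[::-1], mode="valid")

def sg_msc_mn(spectra, window=11, poly=2):
    """SG smoothing -> multiplicative scatter correction -> per-spectrum min-max."""
    smoothed = np.apply_along_axis(savgol, 1, spectra, window, poly)
    ref = smoothed.mean(axis=0)                   # MSC reference: the mean spectrum
    corrected = np.empty_like(smoothed)
    for i, s in enumerate(smoothed):
        slope, intercept = np.polyfit(ref, s, 1)  # regress each spectrum on the reference
        corrected[i] = (s - intercept) / slope
    mn = corrected.min(axis=1, keepdims=True)
    mx = corrected.max(axis=1, keepdims=True)
    return (corrected - mn) / (mx - mn)

rng = np.random.default_rng(0)
base = np.sin(np.linspace(0, 3, 200)) ** 2        # synthetic "spectrum" shape
spectra = 1.5 * base + 0.2 + 0.01 * rng.standard_normal((8, 200))
out = sg_msc_mn(spectra)
print(out.shape)   # (8, 200), each row scaled to [0, 1]
```

After the three steps, every spectrum spans exactly [0, 1] with scatter offsets removed, which is the form the classifiers above would consume.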
A new method based on the iterative adaptive algorithm (IAA) and blocking matrix preprocessing (BMP) is proposed for suppressing multi-mainlobe interference. The algorithm precisely estimates the spatial spectrum and the directions of arrival (DOA) of interferences to overcome the drawbacks of conventional adaptive beamforming (ABF) methods. Mainlobe interferences are identified by calculating the correlation coefficients between direction steering vectors (SVs) and are rejected by the BMP pretreatment. IAA is then employed to reconstruct a sidelobe interference-plus-noise covariance matrix for improved ABF and residual interference suppression. Simulation results demonstrate the superiority of the proposed method over conventional methods based on BMP and eigen-projection matrix preprocessing (EMP) under both uncorrelated and coherent conditions.
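The BMP step described above rests on a simple construction: a blocking matrix that projects array snapshots onto the subspace orthogonal to the mainlobe-interference steering vector. A minimal sketch (a uniform linear array with half-wavelength spacing is assumed; this is not the authors' implementation):

```python
import numpy as np

def steering(n, theta_deg, d=0.5):
    """Steering vector of an n-element ULA, element spacing d in wavelengths."""
    k = 2 * np.pi * d * np.sin(np.deg2rad(theta_deg))
    return np.exp(1j * k * np.arange(n))

def blocking_matrix(sv):
    """Projector onto the orthogonal complement of the steering vector sv."""
    n = sv.size
    return np.eye(n) - np.outer(sv, sv.conj()) / (sv.conj() @ sv)

n = 16
a_int = steering(n, 5.0)          # estimated mainlobe-interference direction
B = blocking_matrix(a_int)
# Applying B to the snapshots nulls that direction:
print(np.linalg.norm(B @ a_int))  # numerically zero
```

Signals from other directions pass through `B` largely unchanged, which is why the residual sidelobe interference can still be handled by the IAA-reconstructed covariance matrix afterwards.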
Gas hydrate (GH) is an unconventional resource estimated at 1000-120,000 trillion m^(3) worldwide. Research on GH is ongoing to determine its geological and flow characteristics for commercial production. After two large-scale drilling expeditions to study the GH-bearing zone in the Ulleung Basin, the mineral composition of 488 sediment samples was analyzed using X-ray diffraction (XRD). Because the analysis is costly and dependent on experts, a machine learning model was developed to predict the mineral composition using XRD intensity profiles as input data. However, the model's performance was limited by improper preprocessing of the intensity profile: because preprocessing was applied to each feature, the intensity trend was not preserved, even though this trend is the most important factor when analyzing mineral composition. In this study, the profile was preprocessed for each sample using min-max scaling, because relative intensity is critical for mineral analysis. For 49 test data among the 488 data, the convolutional neural network (CNN) model improved the average absolute error and coefficient of determination by 41% and 46%, respectively, compared with the CNN model using feature-based preprocessing. This study confirms that combining per-sample preprocessing with a CNN is the most efficient approach for analyzing XRD data. The developed model can be used for the compositional analysis of sediment samples from the Ulleung Basin and the Korea Plateau. In addition, the overall procedure can be applied to any XRD data of sediments worldwide.
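The per-sample versus per-feature distinction at the heart of this abstract is easy to demonstrate. A sketch with invented toy "profiles" (the data are illustrative only):

```python
import numpy as np

def minmax_per_sample(X):
    """Scale each row (sample) to [0, 1] independently; preserves relative peaks."""
    mn = X.min(axis=1, keepdims=True)
    mx = X.max(axis=1, keepdims=True)
    return (X - mn) / (mx - mn)

def minmax_per_feature(X):
    """Scale each column (feature) to [0, 1]; destroys the within-profile trend."""
    mn = X.min(axis=0, keepdims=True)
    mx = X.max(axis=0, keepdims=True)
    return (X - mn) / (mx - mn)

# Two "XRD profiles" with the same peak pattern at different overall intensities
x1 = np.array([1.0, 5.0, 2.0, 10.0, 1.0])
x2 = 3.0 * x1                       # same relative intensities, stronger signal
X = np.vstack([x1, x2])
per_sample = minmax_per_sample(X)
print(np.allclose(per_sample[0], per_sample[1]))   # True: relative shape preserved
```

Per-feature scaling maps the two physically identical profiles to an all-zeros and an all-ones row, erasing exactly the intensity trend the mineral analysis depends on; per-sample scaling makes them coincide.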
This study examines the Big Data Collection and Preprocessing course at Anhui Institute of Information Engineering, implementing a hybrid teaching reform using the Bosi Smart Learning Platform. The proposed hybrid model follows a "three-stage" and "two-subject" framework, incorporating a structured design for teaching content and assessment methods before, during, and after class. Practical results indicate that this approach significantly enhances teaching effectiveness and improves students' learning autonomy.
An improved version of the sparse A^(*) algorithm is proposed to address two common issues of traditional path planning algorithms: excessive expansion of nodes and failure to consider the current ship status and parameters. The algorithm considers factors such as the initial position and orientation of the ship, its safety range, and its draft to determine the optimal obstacle-avoiding route from the current point to the destination. A coordinate transformation algorithm is also applied to convert the latitude and longitude coordinates commonly used for ship travel paths into Cartesian coordinates that are easier to use and analyze. The algorithm incorporates a hierarchical chart processing algorithm to handle multilayered chart data. Furthermore, when gridding the chart, the algorithm adjusts the grid size and density according to ship length. Simulation results show that, compared with traditional path planning algorithms, the improved sparse A^(*) algorithm reduces the average number of path points by 25%, decreases the average maximum number of stored nodes by 17%, and raises the average path turning angle by approximately 10°, effectively improving the safety of planned ship paths.
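The latitude/longitude to Cartesian conversion mentioned above can be approximated with an equirectangular projection for short routes. A sketch (the projection choice is an assumption; the abstract does not specify which transformation the authors use):

```python
import math

EARTH_R = 6371000.0  # mean Earth radius, metres

def latlon_to_xy(lat, lon, lat0, lon0):
    """Project (lat, lon) to local Cartesian metres around a reference point
    using an equirectangular approximation (adequate over short distances)."""
    x = math.radians(lon - lon0) * EARTH_R * math.cos(math.radians(lat0))
    y = math.radians(lat - lat0) * EARTH_R
    return x, y

x, y = latlon_to_xy(37.01, 126.0, 37.0, 126.0)
print(round(y))   # 0.01 deg of latitude is roughly 1112 m
```

Once waypoints are in metres, grid cell sizes can be set directly in proportion to ship length, as the abstract describes.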
Zn vapour is easily generated on the surface when fusion welding galvanized steel sheets, resulting in the formation of defects. Rapidly developing computer vision sensing technology collects weld images during the welding process, obtains laser fringe information through digital image processing, identifies welding defects, and finally realizes online control of weld defects. The performance of a convolutional neural network is related to its structure and the quality of the input image. The acquired original images were labeled with LabelMe, and repeated trials were made to determine appropriate filtering and edge detection methods for image preprocessing. Two-stage convolutional neural networks with different structures were built on the TensorFlow deep learning framework, different intersection-over-union thresholds were set, and deep learning methods were used to evaluate the collected original images and the preprocessed images separately. Comparison of the test results shows that the comprehensive performance of the improved feature pyramid network algorithm based on the VGG16 backbone is lower than that based on ResNet101. Edge detection of the image significantly improves the accuracy of the model. Adding blur reduces the accuracy of the model slightly; however, the overall performance of the improved algorithm is still relatively good, which demonstrates the stability of the algorithm. The self-developed software inspection system can be used for image preprocessing and defect recognition, and can record the number and location of typical defects in continuous welds.
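The intersection-over-union thresholds mentioned above score the overlap between predicted and ground-truth defect boxes. A minimal IoU computation for axis-aligned boxes (illustrative; not the authors' evaluation code):

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))   # 1/7 ≈ 0.1429
```

A detection counts as correct when its IoU with a ground-truth box exceeds the chosen threshold, which is why varying the threshold changes the reported accuracy.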
The big data generated by tunnel boring machines (TBMs) are widely used to reveal complex rock-machine interactions through machine learning (ML) algorithms. Data preprocessing plays a crucial role in improving ML accuracy. To this end, a TBM big data preprocessing method for ML was proposed in the present study. It emphasizes the accurate division of the TBM tunneling cycle and an optimized method of feature extraction. Its effectiveness was demonstrated by application to predicting TBM performance, based on data collected from a TBM water conveyance tunnel in China. Firstly, the Score-Kneedle (S-K) method was proposed to divide a TBM tunneling cycle into five phases. Applied to 500 TBM tunneling cycles, the S-K method accurately divided all five phases in 458 cycles (accuracy of 91.6%), which is superior to the conventional duration division method (accuracy of 74.2%). Additionally, the S-K method accurately divided the stable phase in 493 cycles (accuracy of 98.6%), which is superior to two state-of-the-art division methods, namely the histogram discriminant method (accuracy of 94.6%) and the cumulative sum change point detection method (accuracy of 92.8%). Secondly, features were extracted from the divided phases. Specifically, TBM tunneling resistances were extracted from the free rotating phase and free advancing phase, and these resistances were subtracted from the total forces to represent the true rock-fragmentation forces. The secant slope and the mean value were extracted as features of the increasing phase and stable phase, respectively. Finally, an ML model integrating a deep neural network and a genetic algorithm (GA-DNN) was established to learn the preprocessed data. The GA-DNN used six secant slope features extracted from the increasing phase to predict the mean field penetration index (FPI) and torque penetration index (TPI) in the stable phase, guiding TBM drivers to make better decisions in advance. The results indicate that the proposed TBM big data preprocessing method can improve prediction accuracy significantly, raising the R² of TPI and FPI on the test dataset from 0.7716 to 0.9178 and from 0.7479 to 0.8842, respectively.
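The secant-slope and mean-value features can be sketched as follows. Synthetic ramp-and-plateau data stand in for a real tunneling cycle, and the phase split index is assumed already known from the S-K division:

```python
import numpy as np

def secant_slope(t, y):
    """Secant slope over a phase: (last value - first value) / elapsed time."""
    return (y[-1] - y[0]) / (t[-1] - t[0])

def phase_features(t, y, split):
    """Secant slope of the increasing phase, mean of the stable phase."""
    return secant_slope(t[:split], y[:split]), float(np.mean(y[split:]))

t = np.arange(10.0)
y = np.concatenate([np.linspace(0, 8, 5), np.full(5, 8.0)])  # ramp then plateau
slope, stable_mean = phase_features(t, y, 5)
print(slope, stable_mean)   # 2.0 8.0
```

In the study's setting, six such slope features from the increasing phase form the GA-DNN input, and the stable-phase means of FPI and TPI are the prediction targets.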
With the rapid development of computer vision technology, artificial intelligence algorithms, and high-performance computing platforms, machine vision technology has gradually shown great potential in automated production lines, especially in defect detection. Machine vision can be applied in many industries, such as semiconductors, automobile manufacturing, aerospace, food, and drugs, where it can significantly improve detection efficiency and accuracy, reduce labor costs, improve product quality, enhance market competitiveness, and provide strong support for the arrival of the Industry 4.0 era. This article briefly describes the concept, advantages, and disadvantages of machine vision and the algorithmic framework of machine vision in defect detection systems, aiming to promote the rapid development of industry and strengthen China's industrial capacity.
AIM: To find an effective contrast enhancement method for retinal images that supports effective segmentation of retinal features.
METHODS: A novel image preprocessing method using neighbourhood-based improved contrast limited adaptive histogram equalization (NICLAHE) was proposed to improve retinal image contrast, aiding the accurate identification of retinal disorders and improving the visibility of fine retinal structures. Additionally, a minimal-order filter was applied to denoise the images without compromising important retinal structures. The NICLAHE algorithm was inspired by the classical CLAHE algorithm but enhances it by selecting the clip limits and tile sizes dynamically from the pixel values in an image rather than using fixed values. It was evaluated on the Drive and high-resolution fundus (HRF) datasets using conventional quality measures.
RESULTS: The proposed preprocessing technique was applied to two retinal image databases, Drive and HRF, with four quality metrics: root mean square error (RMSE), peak signal to noise ratio (PSNR), root mean square contrast (RMSC), and overall contrast. The technique performed better on both datasets than traditional enhancement methods. To assess the compatibility of the method with automated diagnosis, a deep learning framework, ResNet, was applied to the segmentation of retinal blood vessels, with sensitivity, specificity, precision, and accuracy used to analyse performance. NICLAHE-enhanced images outperformed the traditional techniques on both datasets with improved accuracy.
CONCLUSION: NICLAHE provides better results than traditional methods, with less error and improved contrast-related values. The enhanced images yield better sensitivity, specificity, precision, and accuracy on both datasets.
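The clip-limited equalization at the core of CLAHE (and hence NICLAHE) can be sketched for a single tile. This shows only the fixed-clip-limit core with an invented low-contrast patch; NICLAHE's dynamic selection of clip limit and tile size, and the between-tile interpolation of full CLAHE, are not reproduced here:

```python
import numpy as np

def clipped_hist_equalize(tile, clip_limit=0.02):
    """Histogram equalization with a clip limit: bin mass above the limit is
    clipped and redistributed uniformly before building the mapping CDF."""
    hist, _ = np.histogram(tile, bins=256, range=(0, 256))
    hist = hist / tile.size                         # normalise to probabilities
    excess = np.clip(hist - clip_limit, 0, None).sum()
    hist = np.minimum(hist, clip_limit) + excess / 256
    cdf = np.cumsum(hist)
    lut = np.round(255 * cdf).astype(np.uint8)      # grey-level mapping table
    return lut[tile]

rng = np.random.default_rng(1)
tile = rng.integers(90, 110, size=(32, 32), dtype=np.uint8)  # low-contrast patch
out = clipped_hist_equalize(tile)
print(int(out.max()) - int(out.min()), int(tile.max()) - int(tile.min()))
```

The clip limit caps how steep the mapping can become, so contrast is stretched without the noise over-amplification of plain histogram equalization.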
Enhancing the accuracy of real-time ship roll prediction is crucial for maritime safety and operational efficiency. To address the challenge of accurately predicting the ship roll status with nonlinear time-varying dynamic characteristics, a real-time ship roll prediction scheme is proposed on the basis of a data preprocessing strategy and a novel stochastic trainer-based feedforward neural network. The sliding data window serves as a ship time-varying dynamic observer to enhance model prediction stability. The variational mode decomposition method extracts effective information on ship roll motion and reduces the non-stationary characteristics of the series. The energy entropy method reconstructs the mode components into high-frequency, medium-frequency, and low-frequency series to reduce model complexity. An improved black widow optimization algorithm trainer-based feedforward neural network with enhanced local optimal avoidance predicts the high-frequency component, enabling accurate tracking of abrupt signals. Additionally, the deterministic algorithm trainer-based neural network, characterized by rapid processing speed, predicts the remaining two mode components. Thus, real-time ship roll forecasting can be achieved through the reconstruction of mode component prediction results. The feasibility and effectiveness of the proposed hybrid prediction scheme for ship roll motion are demonstrated through the measured data of a full-scale ship trial. The proposed scheme achieves real-time ship roll prediction with superior accuracy.
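The sliding data window that feeds such a predictor can be sketched as follows (the window width, horizon, and the sine-wave "roll" series are illustrative choices, not the paper's settings):

```python
import numpy as np

def sliding_windows(series, width, horizon=1):
    """Build (input window, target) pairs for horizon-step-ahead prediction."""
    X, y = [], []
    for i in range(len(series) - width - horizon + 1):
        X.append(series[i:i + width])               # most recent `width` samples
        y.append(series[i + width + horizon - 1])   # value `horizon` steps ahead
    return np.array(X), np.array(y)

roll = np.sin(np.linspace(0, 6, 20))   # toy roll-angle series
X, y = sliding_windows(roll, width=5)
print(X.shape, y.shape)   # (15, 5) (15,)
```

As new samples arrive, the window slides forward, so the model is always trained and queried on the most recent dynamics, which is what makes the observer "time-varying".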
Predicting NO_(x) in the sintering process of iron ore powder in advance is helpful for adjusting the denitrification process in time. Taking NO_(x) in the sintering process of iron ore powder as the object, the boxplot, the empirical mode decomposition algorithm, the Pearson correlation coefficient, the maximum information coefficient, and other methods were used to preprocess the sintering data, and the naive Bayes classification algorithm was used to identify the sintering conditions. A regression prediction model with high accuracy and good stability was selected as the sub-model for each sintering condition, and the sub-models were combined into an integrated prediction model. Based on actual operational data, the approach proved the superiority and effectiveness of the developed model in predicting NO_(x), yielding an accuracy of 96.17% and an absolute error of 5.56, thereby providing valuable foresight for on-site sintering operations.
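The boxplot-based cleaning step can be sketched with the standard 1.5×IQR whisker rule (the multiplier actually used in the study is not stated, so 1.5 is an assumption, and the data are invented):

```python
import numpy as np

def iqr_filter(x, k=1.5):
    """Keep values inside the boxplot whiskers [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return x[(x >= q1 - k * iqr) & (x <= q3 + k * iqr)]

x = np.array([10.0, 11.0, 10.5, 9.8, 10.2, 99.0])   # one obvious spike
filtered = iqr_filter(x)
print(filtered)   # the 99.0 outlier is removed
```

Removing such spikes before decomposition and correlation analysis keeps sensor glitches from distorting the downstream NO_(x) sub-models.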
Convolutional neural networks (CNNs) exhibit superior performance in image feature extraction, making them extensively used in the area of traffic sign recognition. However, the design of existing traffic sign recognition algorithms often relies on expert knowledge to enhance the image feature extraction networks, necessitating image preprocessing and model parameter tuning. This increases the complexity of the model design process. This study introduces an evolutionary neural architecture search (ENAS) algorithm for the automatic design of neural network models tailored for traffic sign recognition. By integrating the construction parameters of the residual network (ResNet) into evolutionary algorithms (EAs), we automatically generate lightweight networks for traffic sign recognition, utilizing blocks as the fundamental building units. Experimental evaluations on the German traffic sign recognition benchmark (GTSRB) dataset reveal that the algorithm attains a recognition accuracy of 99.32% with a mere 2.8×10^(6) parameters. Experimental results comparing the proposed method with other traffic sign recognition algorithms demonstrate that the method can more efficiently discover neural network architectures, significantly reducing the number of network parameters while maintaining recognition accuracy.
The intrinsic heterogeneity of metabolic dysfunction-associated steatotic liver disease (MASLD) and its intricate pathogenesis have impeded the advancement and clinical implementation of therapeutic interventions, underscoring the critical demand for novel treatments. A recent publication by Li et al proposes mesenchymal stem cells as promising effectors for the treatment of MASLD. This editorial is a continuation of the article published by Jiang et al and focuses on strategies to enhance the functionality of mesenchymal stem cells so as to improve their efficacy in treating MASLD, including physical pretreatment, drug or chemical pretreatment, pretreatment with bioactive substances, and genetic engineering.
The proliferation of Internet of Things (IoT) technology has exponentially increased the number of devices interconnected over networks, thereby escalating the potential vectors for cybersecurity threats. In response, this study rigorously applies and evaluates deep learning models, namely Convolutional Neural Networks (CNN), Autoencoders, and Long Short-Term Memory (LSTM) networks, to engineer an advanced Intrusion Detection System (IDS) specifically designed for IoT environments. Utilizing the comprehensive UNSW-NB15 dataset, which encompasses 49 distinct features representing varied network traffic characteristics, our methodology focused on meticulous data preprocessing, including cleaning, normalization, and strategic feature selection, to enhance model performance. A robust comparative analysis highlights the CNN model's outstanding performance, achieving an accuracy of 99.89%, precision of 99.90%, recall of 99.88%, and an F1 score of 99.89% in binary classification tasks, significantly outperforming the other evaluated models. These results not only confirm the superior detection capabilities of CNNs in distinguishing between benign and malicious network activities but also illustrate the model's effectiveness in multiclass classification tasks, addressing various attack vectors prevalent in IoT setups. The empirical findings demonstrate deep learning's transformative potential in fortifying network security infrastructures against sophisticated cyber threats, providing a scalable, high-performance solution across increasingly complex IoT ecosystems. These outcomes are critical for security practitioners and researchers focusing on the next generation of cyber defense mechanisms, offering a data-driven foundation for future advancements in IoT security strategies.
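The reported accuracy, precision, recall, and F1 all follow from the binary confusion matrix. A from-scratch computation (toy labels for illustration, not the UNSW-NB15 results):

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from binary labels (1 = attack)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = (tp + tn) / len(y_true)
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    f1 = 2 * prec * rec / (prec + rec)
    return acc, prec, rec, f1

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))   # (0.75, 0.75, 0.75, 0.75)
```

Reporting all four matters for IDS work: with imbalanced traffic, accuracy alone can look high while recall on the attack class is poor.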
Cancer is one of the most dangerous diseases, with high mortality. One of the principal treatments is radiotherapy, which uses radiation beams to destroy cancer cells, and this workflow requires a great deal of experience and skill from doctors and technicians. In our study, we focused on the 3D dose prediction problem in radiotherapy by applying a deep learning approach to computed tomography (CT) images of cancer patients. Medical image data has more complex characteristics than ordinary image data, and this research aims to explore the effectiveness of data preprocessing and augmentation in the context of the 3D dose prediction problem. We proposed four strategies to examine our hypothesis from different aspects of applying data preprocessing and augmentation. In these strategies, we trained a custom convolutional neural network model with a structure inspired by the U-net, with residual blocks also applied to the architecture. The output of the network is passed through a rectified linear unit (ReLU) for each pixel to ensure there are no negative values, which would be physically meaningless for radiation doses. Our experiments were conducted on the dataset of the Open Knowledge-Based Planning Challenge, which was collected from head and neck cancer patients treated with radiation therapy. The results of the four strategies show that our hypothesis is rational when evaluated in terms of the Dose-score and the Dose-volume histogram score (DVH-score). In the best training cases, the Dose-score is 3.08 and the DVH-score is 1.78. In addition, we compared our results with those of another study in the same context of loss function use.
Analyzing colon cancer data is essential for improving early detection, treatment outcomes, public health initiatives, research efforts, and overall patient care, ultimately leading to better outcomes and a reduced burden associated with this disease. The prediction of any disease depends on the quality of the available dataset, so it is important to analyze its characteristics before applying a prediction algorithm. This research presents a comprehensive framework for addressing data imbalance in colon cancer datasets, which, together with their high dimensionality, has been a significant challenge in previous studies on colon cancer prediction. Both characteristics are important concerns of preprocessing: balancing adjusts the proportion of data points in each class label, and feature selection chooses the strongest features from the available data space. This study aims to improve the performance of popular tree, rule, and lazy (K nearest neighbor (KNN)) classifiers and the support vector machine (SVM) algorithm after addressing the imbalance issue and applying various feature selection methods, such as chi-square, symmetrical uncertainty, correlation-based feature selection (CFS) subset, and classifier subset evaluators. The proposed research framework shows that after balancing the dataset, all the algorithms performed better with all applied feature selection methods. Among all methods, Jrip records 85.71% accuracy with classifier subset evaluators, Ridor marks 84.52% accuracy with CFS, J48 produces 83.33% accuracy with both CFS and classifier subset evaluators, simple cart achieves 84.52% with classifier subset evaluators, KNN records 91.66% accuracy with chi-square and CFS, and SVM produces 92.85% with symmetrical uncertainty.
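One simple way to address the class imbalance discussed above is random oversampling of the minority class. The study evaluates balancing and feature selection together; this sketch shows only the simplest balancing variant, on invented data:

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Balance a dataset by resampling every class up to the majority count."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_max = counts.max()
    Xs, ys = [], []
    for c in classes:
        idx = np.flatnonzero(y == c)
        picked = rng.choice(idx, size=n_max, replace=True)
        Xs.append(X[picked])
        ys.append(y[picked])
    return np.vstack(Xs), np.concatenate(ys)

X = np.arange(12, dtype=float).reshape(6, 2)
y = np.array([0, 0, 0, 0, 1, 1])          # 4 vs 2: imbalanced
Xb, yb = random_oversample(X, y)
print(np.bincount(yb))   # [4 4]
```

After balancing, a classifier no longer gains accuracy simply by predicting the majority class, which is why the abstract reports improvements across all algorithms.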
One of the most dangerous forms of cancer, skin cancer has been on the rise over the past ten years. Melanoma detection methods use deep learning algorithms to analyze images and accurately diagnose melanoma; early diagnosis may lead to better treatment outcomes, and with deep learning skin cancer can be identified in a matter of seconds. In this research, a deep learning-based automatic skin cancer detection method is proposed. Data were taken from the ISIC database, which has 2357 images. A few preprocessing approaches were used to obtain average color information and normalize all color channels. Next, the images were reshaped and organized for categorization. To avoid overfitting, data augmentation was also employed. Finally, a convolutional neural network was used to achieve our goal, improving prediction accuracy. Using the ResNet50 architecture, the accuracy rate rose to 98%.
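The augmentation step can be sketched with simple flips (the paper does not list its exact transforms, so flips are an assumed example on a toy array standing in for an image):

```python
import numpy as np

def augment_flips(img):
    """Return the image plus its horizontal and vertical flips."""
    return [img, img[:, ::-1], img[::-1, :]]

img = np.arange(9).reshape(3, 3)          # toy 3x3 "image"
batch = augment_flips(img)
print(len(batch), batch[1][0].tolist())   # 3 [2, 1, 0]
```

Each training image yields several label-preserving variants, so the network sees more diverse inputs without new data, which is what curbs overfitting.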
Over the past ten years, there has been an increase in cardiovascular disease, one of the most dangerous types of disease. Cardiovascular detection analyzes data and precisely diagnoses cardiovascular disease using machine learning algorithms; early diagnosis may lead to better outcomes for heart treatment, and machine learning can then detect cardiac disease in a couple of seconds. This study proposes an automatic way of detecting cardiovascular diseases such as heart disease using machine learning. A physician's accurate and thorough evaluation of a patient's cardiovascular risk plays a critical role in lowering the incidence and severity of heart attacks and strokes as well as improving cardiovascular protection. To develop technology for the early detection of cardiovascular disease, a Kaggle dataset was gathered, and certain preprocessing techniques were used to improve accuracy and outcomes. Ultimately, we employed decision trees, logistic regression, and random forests; of these, random forest yielded the highest accuracy, 96%, making it useful for obtaining high-quality results with greater precision.
Network intrusion detection systems need to be updated due to the rise in cyber threats. In order to improve detection accuracy, this research presents a strong strategy that makes use of a stacked ensemble method, which combines the advantages of several machine learning models. The ensemble is made up of various base models, such as Decision Trees, K-Nearest Neighbors (KNN), Multi-Layer Perceptrons (MLP), and Naive Bayes, each of which offers a distinct perspective on the properties of the data. The research adheres to a methodical workflow that begins with thorough data preprocessing to guarantee the accuracy and applicability of the data. Feature engineering is used to extract useful attributes from network traffic data, which are essential for efficient model training. The ensemble approach combines these models by training a Logistic Regression meta-learner on the base models' predictions. In addition to increasing prediction accuracy, this tiered approach helps overcome the drawbacks that come with using individual models. The model's evaluation on a network intrusion dataset shows high accuracy, precision, and recall, indicating its efficacy in identifying malicious activity. Cross-validation is used to make sure the models are reliable and generalize well to new, untested data. In addition to advancing cybersecurity, the research establishes a foundation for the implementation of flexible and scalable intrusion detection systems. This hybrid, stacked ensemble model has substantial potential for improving cyberattack prevention, lowering the likelihood of successful cyberattacks, and offering a scalable solution that can be adjusted to meet new threats and technological advancements.
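The stacking mechanic (base-model scores fed to a logistic-regression meta-learner) can be sketched from scratch. The base models here (a nearest-centroid score and a centroid-axis projection) and the Gaussian toy data are stand-ins, not the paper's Decision Tree/KNN/MLP/Naive Bayes ensemble:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class data: two Gaussian blobs standing in for benign/attack traffic
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(3.0, 1.0, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Base model 1: signed nearest-centroid score
c0, c1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
score1 = np.linalg.norm(X - c0, axis=1) - np.linalg.norm(X - c1, axis=1)

# Base model 2: projection onto the axis joining the centroids
score2 = (X - (c0 + c1) / 2) @ (c1 - c0)

# Meta-learner: logistic regression on the stacked base-model scores
M = np.column_stack([score1, score2, np.ones(len(X))])
beta = np.zeros(3)
for _ in range(1000):                        # plain gradient descent
    p = 1.0 / (1.0 + np.exp(-M @ beta))
    beta -= 0.05 * M.T @ (p - y) / len(y)

pred = (1.0 / (1.0 + np.exp(-M @ beta)) > 0.5).astype(int)
print(float((pred == y).mean()))             # training accuracy on separable blobs
```

In a full pipeline the meta-features would come from out-of-fold base-model predictions (cross-validation, as the abstract notes) so the meta-learner is not trained on leaked information.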
In order to reduce the risk of non-performing loans and losses and improve loan approval efficiency, it is necessary to establish an intelligent loan risk and approval prediction system. A hybrid deep learning model with a 1DCNN-attention network and enhanced preprocessing techniques is proposed for loan approval prediction. The proposed model consists of enhanced data preprocessing and stacking of multiple hybrid modules. Initially, the enhanced data preprocessing combines methods such as standardization, SMOTE oversampling, feature construction, recursive feature elimination (RFE), information value (IV), and principal component analysis (PCA), which not only eliminates the effects of data jitter and imbalance but also removes redundant features while improving their representation. Subsequently, a hybrid module that combines a 1DCNN with an attention mechanism is proposed to extract local and global spatio-temporal features. Finally, comprehensive experiments validate that the proposed model surpasses state-of-the-art baseline models across various performance metrics, including accuracy, precision, recall, F1 score, and AUC. The proposed model helps to automate the loan approval process and provides scientific guidance to financial institutions for loan risk control.
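The SMOTE oversampling named above interpolates between minority samples and their nearest minority-class neighbours rather than duplicating rows. A minimal sketch (the value of k and the toy points are illustrative, and a brute-force neighbour search stands in for a proper k-NN index):

```python
import numpy as np

def smote(X_min, n_new, k=3, seed=0):
    """Generate synthetic minority samples: pick a sample, pick one of its k
    nearest minority neighbours, and interpolate a random point between them."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        nbrs = np.argsort(d)[1:k + 1]          # skip the point itself
        j = rng.choice(nbrs)
        lam = rng.random()                     # interpolation factor in [0, 1)
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)

X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synth = smote(X_min, n_new=4)
print(synth.shape)   # (4, 2), new points lie on segments between minority samples
```

Because the synthetic points fill the space between real minority samples, the classifier sees a denser minority region instead of exact copies, which reduces the overfitting that plain duplication causes.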
Funding (maize seed classification study): supported by the Science and Technology Development Plan Project of the Jilin Provincial Department of Science and Technology (No. 20220203112S) and the Jilin Provincial Department of Education Science and Technology Research Project (No. JJKH20210039KJ).
Funding: The National Natural Science Foundation of China (No. U19B2031).
Abstract: A new method based on the iterative adaptive algorithm (IAA) and blocking matrix preprocessing (BMP) is proposed to suppress multi-mainlobe interference. The algorithm precisely estimates the spatial spectrum and the directions of arrival (DOA) of interferences to overcome the drawbacks of conventional adaptive beamforming (ABF) methods. Mainlobe interferences are identified by calculating the correlation coefficients between direction steering vectors (SVs) and are rejected by the BMP pretreatment. IAA is then employed to reconstruct a sidelobe interference-plus-noise covariance matrix for improved ABF and residual interference suppression. Simulation results demonstrate the superiority of the proposed method over conventional methods based on BMP and eigen-projection matrix preprocessing (EMP) under both uncorrelated and coherent circumstances.
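The core of BMP is an orthogonal projection that nulls the steering vector of an identified mainlobe interferer. A minimal sketch for a uniform linear array (array size, spacing, and interference angle are illustrative assumptions, not values from the paper):

```python
import numpy as np

def steering_vector(theta_deg, n_elements=8, d=0.5):
    """Plane-wave steering vector for a uniform linear array (d in wavelengths)."""
    n = np.arange(n_elements)
    return np.exp(-2j * np.pi * d * n * np.sin(np.deg2rad(theta_deg)))

def blocking_matrix(theta_deg, n_elements=8):
    """Projection B = I - a a^H / (a^H a) that nulls the interference SV."""
    a = steering_vector(theta_deg, n_elements)[:, None]
    return np.eye(n_elements) - a @ a.conj().T / (a.conj().T @ a).real

# A mainlobe interferer at 5 degrees is removed by applying B to the snapshots;
# the projected steering vector has (numerically) zero norm.
B = blocking_matrix(5.0)
residual = np.linalg.norm(B @ steering_vector(5.0))
```

After this pretreatment, IAA operates on the blocked snapshots to rebuild the sidelobe interference-plus-noise covariance for the final beamformer.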
Funding: Supported by the Gas Hydrate R&D Organization and the Korea Institute of Geoscience and Mineral Resources (KIGAM) (GP2021-010), the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. 2021R1C1C1004460), and the Korea Institute of Energy Technology Evaluation and Planning (KETEP) grant funded by the Korean government (MOTIE) (20214000000500, Training Program of CCUS for Green Growth).
Abstract: Gas hydrate (GH) is an unconventional resource estimated at 1000-120,000 trillion m³ worldwide. Research on GH is ongoing to determine its geological and flow characteristics for commercial production. After two large-scale drilling expeditions to study the GH-bearing zone in the Ulleung Basin, the mineral composition of 488 sediment samples was analyzed using X-ray diffraction (XRD). Because the analysis is costly and dependent on experts, a machine learning model was developed to predict the mineral composition using XRD intensity profiles as input data. However, the model's performance was limited because of improper preprocessing of the intensity profile. Because preprocessing was applied to each feature, the intensity trend was not preserved, even though this factor is the most important when analyzing mineral composition. In this study, the profile was preprocessed for each sample using min-max scaling because relative intensity is critical for mineral analysis. For 49 test data among the 488 data, the convolutional neural network (CNN) model improved the average absolute error and coefficient of determination by 41% and 46%, respectively, compared with the CNN model using feature-based preprocessing. This study confirms that combining per-sample preprocessing with a CNN is the most efficient approach for analyzing XRD data. The developed model can be used for the compositional analysis of sediment samples from the Ulleung Basin and the Korea Plateau. In addition, the overall procedure can be applied to any XRD data of sediments worldwide.
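The per-sample versus per-feature distinction above is easy to see in code. With two mock profiles that differ only by a constant intensity factor (invented data, not from the study), per-sample min-max scaling maps both to the same curve, preserving the relative peak heights that mineral analysis depends on:

```python
import numpy as np

def minmax_per_feature(profiles):
    """Scale each 2-theta position independently -- distorts relative peak heights."""
    mn, mx = profiles.min(axis=0), profiles.max(axis=0)
    return (profiles - mn) / (mx - mn)

def minmax_per_sample(profiles):
    """Scale each diffraction profile to [0, 1] -- preserves relative intensities."""
    mn = profiles.min(axis=1, keepdims=True)
    mx = profiles.max(axis=1, keepdims=True)
    return (profiles - mn) / (mx - mn)

# Two mock XRD profiles differing only by an overall intensity factor of 3.
base = np.array([1.0, 5.0, 2.0, 8.0, 1.0])
profiles = np.vstack([base, 3.0 * base])

per_sample = minmax_per_sample(profiles)  # both rows collapse onto one curve
```

Per-feature scaling, by contrast, mixes intensities across samples at each angle, which is why the feature-based CNN in the study underperformed.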
Funding: 2024 Anqing Normal University University-Level Key Project (ZK2024062D).
Abstract: This study examines the Big Data Collection and Preprocessing course at Anhui Institute of Information Engineering, implementing a hybrid teaching reform using the Bosi Smart Learning Platform. The proposed hybrid model follows a "three-stage" and "two-subject" framework, incorporating a structured design for teaching content and assessment methods before, during, and after class. Practical results indicate that this approach significantly enhances teaching effectiveness and improves students' learning autonomy.
Funding: Supported by the Tianjin University of Technology Graduate Research Innovation Project (YJ2281).
Abstract: An improved version of the sparse A* algorithm is proposed to address the common issues of excessive node expansion and failure to consider the current ship status and parameters in traditional path planning algorithms. The algorithm considers factors such as the initial position and orientation of the ship, the safety range, and the ship draft to determine the optimal obstacle-avoiding route from the current point to the destination. A coordinate transformation algorithm is also applied to convert the latitude and longitude coordinates commonly used for ship travel paths into Cartesian coordinates that are easier to use and analyze. The algorithm incorporates a hierarchical chart processing algorithm to handle multilayered chart data. Furthermore, when gridding the chart, the algorithm adjusts the grid size and density according to the ship length. Simulation results show that, compared to traditional path planning algorithms, the sparse A* algorithm reduces the average number of path points by 25%, decreases the average maximum number of stored nodes by 17%, and raises the average path turning angle by approximately 10°, effectively improving the safety of planned ship paths.
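For readers unfamiliar with the baseline, plain A* on a gridded chart looks like the sketch below (the grid, start, and goal are invented; the sparse variant in the paper additionally prunes expansions using ship heading and safety constraints, which is not shown here):

```python
import heapq

def astar(grid, start, goal):
    """Plain A* on a 4-connected occupancy grid (1 = obstacle); returns the
    path as a list of (row, col) cells, or None if the goal is unreachable."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    open_heap = [(h(start), 0, start, None)]
    came_from, g_best = {}, {start: 0}
    while open_heap:
        _, g, node, parent = heapq.heappop(open_heap)
        if node in came_from:          # already expanded with a better cost
            continue
        came_from[node] = parent
        if node == goal:               # reconstruct the route back to start
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nxt = (nr, nc)
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < g_best.get(nxt, float("inf")):
                    g_best[nxt] = ng
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt, node))
    return None

# A 4x4 chart grid with a wall of obstacle cells; the route detours around it.
chart = [[0, 0, 0, 0],
         [1, 1, 1, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
route = astar(chart, (0, 0), (3, 0))
```

The paper's improvements amount to expanding fewer of these nodes (sparseness) and sizing `chart` cells according to ship length.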
Funding: The National Natural Science Foundation of China (No. 12064027).
Abstract: Zn vapour is easily generated on the surface during fusion welding of galvanized steel sheet, resulting in the formation of defects. Rapidly developing computer vision sensing technology collects weld images during the welding process, obtains laser fringe information through digital image processing, identifies welding defects, and finally realizes online control of weld defects. The performance of a convolutional neural network is related to its structure and the quality of the input image. The acquired original images are labeled with LabelMe, and repeated experiments determine appropriate filtering and edge detection image preprocessing methods. Two-stage convolutional neural networks with different structures are built on the TensorFlow deep learning framework, different intersection-over-union thresholds are set, and deep learning methods are used to evaluate the collected original images and the preprocessed images separately. According to the test results, the comprehensive performance of the improved feature pyramid network algorithm with the VGG16 backbone is lower than that with the ResNet101 backbone. Edge detection of the image significantly improves the accuracy of the model. Adding blur reduces the accuracy slightly; however, the overall performance of the improved algorithm is still relatively good, which demonstrates the stability of the algorithm. The self-developed software inspection system can be used for image preprocessing and defect recognition, and can record the number and location of typical defects in continuous welds.
Funding: The support provided by the Natural Science Foundation of Hubei Province (Grant No. 2021CFA081), the National Natural Science Foundation of China (Grant No. 42277160), and the fellowship of the China Postdoctoral Science Foundation (Grant No. 2022TQ0241) is gratefully acknowledged.
Abstract: The big data generated by tunnel boring machines (TBMs) are widely used to reveal complex rock-machine interactions via machine learning (ML) algorithms. Data preprocessing plays a crucial role in improving ML accuracy. For this, a TBM big data preprocessing method for ML was proposed in the present study. It emphasizes the accurate division of the TBM tunneling cycle and an optimized feature extraction method. Based on data collected from a TBM water conveyance tunnel in China, its effectiveness was demonstrated by application to predicting TBM performance. Firstly, the Score-Kneedle (S-K) method was proposed to divide a TBM tunneling cycle into five phases. Applied to 500 TBM tunneling cycles, the S-K method accurately divided all five phases in 458 cycles (accuracy of 91.6%), which is superior to the conventional duration division method (accuracy of 74.2%). Additionally, the S-K method accurately divided the stable phase in 493 cycles (accuracy of 98.6%), which is superior to two state-of-the-art division methods, namely the histogram discriminant method (accuracy of 94.6%) and the cumulative sum change point detection method (accuracy of 92.8%). Secondly, features were extracted from the divided phases. Specifically, TBM tunneling resistances were extracted from the free rotating phase and the free advancing phase. The resistances were subtracted from the total forces to represent the true rock-fragmentation forces. The secant slope and the mean value were extracted as features of the increasing phase and the stable phase, respectively. Finally, an ML model integrating a deep neural network and a genetic algorithm (GA-DNN) was established to learn the preprocessed data. The GA-DNN used six secant slope features extracted from the increasing phase to predict the mean field penetration index (FPI) and torque penetration index (TPI) in the stable phase, guiding TBM drivers to make better decisions in advance. The results indicate that the proposed TBM big data preprocessing method can improve prediction accuracy significantly (improving the R² of TPI and FPI on the test dataset from 0.7716 to 0.9178 and from 0.7479 to 0.8842, respectively).
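The two phase features named above are simple to compute once a cycle has been divided; the mock thrust and torque readings below are invented for illustration:

```python
import numpy as np

def secant_slope(series):
    """Secant slope of a rising TBM parameter over its increasing phase:
    (last value - first value) / (number of steps)."""
    return (series[-1] - series[0]) / (len(series) - 1)

def stable_mean(series):
    """Mean-value feature of the stable phase."""
    return float(np.mean(series))

# Mock increasing-phase thrust readings and stable-phase torque readings.
thrust_rise = np.array([100.0, 130.0, 155.0, 185.0, 220.0])
torque_stable = np.array([48.0, 52.0, 50.0, 49.0, 51.0])

slope = secant_slope(thrust_rise)      # (220 - 100) / 4 = 30.0
mean_val = stable_mean(torque_stable)  # 50.0
```

Six such secant slopes (one per monitored parameter) form the GA-DNN input vector, and the stable-phase means provide the FPI/TPI targets.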
Abstract: With the rapid development of computer vision technology, artificial intelligence algorithms, and high-performance computing platforms, machine vision technology has gradually shown its great potential in automated production lines, especially in defect detection. Machine vision technology can be applied in many industries, such as semiconductors, automobile manufacturing, aerospace, food, and drugs, where it can significantly improve detection efficiency and accuracy, reduce labor costs, improve product quality, enhance market competitiveness, and provide strong support for the arrival of the Industry 4.0 era. In this article, the concept, advantages, and disadvantages of machine vision and the algorithm framework of machine vision in defect detection systems are briefly described, aiming to promote the rapid development of industry and strengthen China's industrial base.
Abstract: AIM: To find an effective contrast enhancement method for retinal images that enables effective segmentation of retinal features. METHODS: A novel image preprocessing method that used neighbourhood-based improved contrast limited adaptive histogram equalization (NICLAHE) to improve retinal image contrast was proposed to aid the accurate identification of retinal disorders and improve the visibility of fine retinal structures. Additionally, a minimal-order filter was applied to effectively denoise the images without compromising important retinal structures. The NICLAHE algorithm was inspired by the classical CLAHE algorithm but enhances it by selecting the clip limits and tile sizes dynamically according to the pixel values of an image, as opposed to using fixed values. It was evaluated on the DRIVE and high-resolution fundus (HRF) datasets using conventional quality measures. RESULTS: The proposed preprocessing technique was applied to two retinal image databases, DRIVE and HRF, with four quality metrics: root mean square error (RMSE), peak signal-to-noise ratio (PSNR), root mean square contrast (RMSC), and overall contrast. The technique performed better on both datasets than traditional enhancement methods. To assess the compatibility of the method with automated diagnosis, a deep learning framework, ResNet, was applied to the segmentation of retinal blood vessels. Sensitivity, specificity, precision, and accuracy were used to analyse the performance. NICLAHE-enhanced images outperformed the traditional techniques on both datasets with improved accuracy. CONCLUSION: NICLAHE provides better results than traditional methods, with lower error and improved contrast-related values. The enhanced images yield better sensitivity, specificity, precision, and accuracy in both datasets.
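The three image-based quality metrics named in the abstract have standard definitions that can be sketched directly; the toy 4x4 image is our own example, not data from DRIVE or HRF:

```python
import numpy as np

def rmse(ref, img):
    """Root mean square error between a reference and an enhanced image."""
    return float(np.sqrt(np.mean((ref.astype(float) - img.astype(float)) ** 2)))

def psnr(ref, img, peak=255.0):
    """Peak signal-to-noise ratio in dB for 8-bit images."""
    e = rmse(ref, img)
    return float("inf") if e == 0 else 20.0 * np.log10(peak / e)

def rms_contrast(img):
    """Root mean square contrast: standard deviation of pixel intensities."""
    return float(np.std(img.astype(float)))

ref = np.full((4, 4), 100, dtype=np.uint8)
enhanced = ref.copy()
enhanced[0, 0] = 110          # a single altered pixel

err = rmse(ref, enhanced)     # sqrt(100 / 16) = 2.5
quality = psnr(ref, enhanced) # 20 * log10(255 / 2.5), about 40.2 dB
```

Higher PSNR and RMSC with lower RMSE is the pattern the study reports for NICLAHE over traditional enhancement.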
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 52231014 and 52271361) and the Natural Science Foundation of Guangdong Province of China (Grant No. 2023A1515010684).
Abstract: Enhancing the accuracy of real-time ship roll prediction is crucial for maritime safety and operational efficiency. To address the challenge of accurately predicting ship roll with nonlinear time-varying dynamic characteristics, a real-time ship roll prediction scheme is proposed on the basis of a data preprocessing strategy and a novel stochastic trainer-based feedforward neural network. The sliding data window serves as an observer of the ship's time-varying dynamics to enhance model prediction stability. The variational mode decomposition method extracts effective information on ship roll motion and reduces the non-stationary characteristics of the series. The energy entropy method reconstructs the mode components into high-frequency, medium-frequency, and low-frequency series to reduce model complexity. An improved black widow optimization algorithm trainer-based feedforward neural network with enhanced local-optimum avoidance predicts the high-frequency component, enabling accurate tracking of abrupt signals. Additionally, a deterministic algorithm trainer-based neural network, characterized by rapid processing speed, predicts the remaining two mode components. Thus, real-time ship roll forecasting is achieved through the reconstruction of the mode component prediction results. The feasibility and effectiveness of the proposed hybrid prediction scheme are demonstrated using measured data from a full-scale ship trial. The proposed scheme achieves real-time ship roll prediction with superior accuracy.
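The sliding-data-window observer above is the standard windowing step that turns a roll time series into supervised training pairs; the mock roll record and window length below are our own illustration:

```python
import numpy as np

def sliding_windows(series, window, horizon=1):
    """Turn a roll-angle series into (input window, target) training pairs,
    acting as a simple observer over the latest `window` samples."""
    X, y = [], []
    for i in range(len(series) - window - horizon + 1):
        X.append(series[i:i + window])            # most recent dynamics
        y.append(series[i + window + horizon - 1])  # value to predict
    return np.array(X), np.array(y)

# Mock roll-angle record (degrees); predict one step ahead from 4 samples.
roll = np.array([1.0, 2.0, 3.0, 2.0, 1.0, 0.0, -1.0, 0.0])
X, y = sliding_windows(roll, window=4)
```

In the paper's scheme, each window would first pass through variational mode decomposition, with the resulting component series windowed in the same way for their respective predictors.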
Funding: Financially supported by the Natural Science Basic Foundation of China (Program No. 52174325), the Key Research and Development Program of Shaanxi (Grant No. 2020GY-166 and Program No. 2020GY-247), and the Shaanxi Provincial Innovation Capacity Support Plan (Grant No. 2023-CX-TD-53).
Abstract: Predicting NOx in the sintering process of iron ore powder in advance helps adjust the denitrification process in time. Taking NOx in the sintering process of iron ore powder as the object, the boxplot, empirical mode decomposition algorithm, Pearson correlation coefficient, maximum information coefficient, and other methods were used to preprocess the sintering data, and a naive Bayes classification algorithm was used to identify the sintering conditions. Regression prediction models with high accuracy and good stability were selected as sub-models for the different sintering conditions, and the sub-models were combined into an integrated prediction model. Based on actual operational data, the approach proved the superiority and effectiveness of the developed model in predicting NOx, yielding an accuracy of 96.17% and an absolute error of 5.56, thereby providing valuable foresight for on-site sintering operations.
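Two of the preprocessing steps named above, boxplot-based outlier removal and Pearson correlation screening, can be sketched as follows (the mock temperature series and the synthetic NOx relation are invented for illustration):

```python
import numpy as np

def iqr_filter(values, k=1.5):
    """Boxplot rule: drop points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    mask = (values >= q1 - k * iqr) & (values <= q3 + k * iqr)
    return values[mask], mask

def pearson(x, y):
    """Pearson correlation coefficient between a process variable and NOx."""
    return float(np.corrcoef(x, y)[0, 1])

# Mock sintering temperature series with one obvious sensor spike.
temps = np.array([700.0, 705.0, 698.0, 702.0, 9999.0, 701.0, 699.0, 703.0])
cleaned, kept = iqr_filter(temps)

# A variable perfectly linearly related to NOx has correlation 1.
nox = 2.0 * cleaned + 10.0
r = pearson(cleaned, nox)
```

Variables whose |r| (or maximum information coefficient) falls below a threshold would be dropped before fitting the condition-specific sub-models.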
Funding: Supported by the National Natural Science Foundation of China (No. 62066041).
Abstract: Convolutional neural networks (CNNs) exhibit superior performance in image feature extraction, making them extensively used in traffic sign recognition. However, the design of existing traffic sign recognition algorithms often relies on expert knowledge to enhance the image feature extraction networks, necessitating image preprocessing and model parameter tuning, which increases the complexity of the model design process. This study introduces an evolutionary neural architecture search (ENAS) algorithm for the automatic design of neural network models tailored for traffic sign recognition. By integrating the construction parameters of the residual network (ResNet) into evolutionary algorithms (EAs), we automatically generate lightweight networks for traffic sign recognition, utilizing blocks as the fundamental building units. Experimental evaluations on the German traffic sign recognition benchmark (GTSRB) dataset reveal that the algorithm attains a recognition accuracy of 99.32% with a mere 2.8×10⁶ parameters. Comparisons with other traffic sign recognition algorithms demonstrate that the method can discover neural network architectures more efficiently, significantly reducing the number of network parameters while maintaining recognition accuracy.
Abstract: The intrinsic heterogeneity of metabolic dysfunction-associated steatotic liver disease (MASLD) and its intricate pathogenesis have impeded the advancement and clinical implementation of therapeutic interventions, underscoring the critical demand for novel treatments. A recent publication by Li et al proposes mesenchymal stem cells as promising effectors for the treatment of MASLD. This editorial is a continuation of the article published by Jiang et al, which focuses on the significance of strategies to enhance the functionality of mesenchymal stem cells to improve efficacy in treating MASLD, including physical pretreatment, drug or chemical pretreatment, pretreatment with bioactive substances, and genetic engineering.
Abstract: The proliferation of Internet of Things (IoT) technology has exponentially increased the number of devices interconnected over networks, thereby escalating the potential vectors for cybersecurity threats. In response, this study rigorously applies and evaluates deep learning models, namely Convolutional Neural Networks (CNN), Autoencoders, and Long Short-Term Memory (LSTM) networks, to engineer an advanced Intrusion Detection System (IDS) specifically designed for IoT environments. Utilizing the comprehensive UNSW-NB15 dataset, which encompasses 49 distinct features representing varied network traffic characteristics, our methodology focused on meticulous data preprocessing including cleaning, normalization, and strategic feature selection to enhance model performance. A robust comparative analysis highlights the CNN model's outstanding performance, achieving an accuracy of 99.89%, precision of 99.90%, recall of 99.88%, and an F1 score of 99.89% in binary classification tasks, significantly outperforming the other evaluated models. These results not only confirm the superior detection capabilities of CNNs in distinguishing between benign and malicious network activities but also illustrate the model's effectiveness in multiclass classification tasks, addressing various attack vectors prevalent in IoT setups. The empirical findings from this research demonstrate deep learning's transformative potential in fortifying network security infrastructures against sophisticated cyber threats, providing a scalable, high-performance solution that enhances security measures across increasingly complex IoT ecosystems. This study's outcomes are critical for security practitioners and researchers focusing on the next generation of cyber defense mechanisms, offering a data-driven foundation for future advancements in IoT security strategies.
基金sponsored by the Institute of Information Technology(Vietnam Academy of Science and Technology)with Project Code“CS24.01”.
Abstract: Cancer is one of the most dangerous diseases, with high mortality. One of the principal treatments is radiotherapy, which uses radiation beams to destroy cancer cells, and this workflow requires a lot of experience and skill from doctors and technicians. In our study, we focused on the 3D dose prediction problem in radiotherapy by applying a deep learning approach to computed tomography (CT) images of cancer patients. Medical image data has more complex characteristics than ordinary image data, and this research aims to explore the effectiveness of data preprocessing and augmentation in the context of the 3D dose prediction problem. We proposed four strategies to test our hypothesis in different aspects of applying data preprocessing and augmentation. In each strategy, we trained our custom convolutional neural network model, which has a structure inspired by the U-Net, with residual blocks also applied to the architecture. The output of the network passes through a rectified linear unit (ReLU) function for each pixel to ensure there are no negative values, which would be physically meaningless for radiation doses. Our experiments were conducted on the dataset of the Open Knowledge-Based Planning Challenge, which was collected from head and neck cancer patients treated with radiation therapy. The results of the four strategies show that our hypothesis is rational when evaluated in terms of the Dose-score and the Dose-volume histogram score (DVH-score). In the best training cases, the Dose-score is 3.08 and the DVH-score is 1.78. In addition, we also conducted a comparison with the results of another study in the same context of using the loss function.
Abstract: Analyzing colon cancer data is essential for improving early detection, treatment outcomes, public health initiatives, research efforts, and overall patient care, ultimately leading to better outcomes and a reduced burden associated with this disease. The prediction of any disease depends on the quality of the available dataset. Before applying a prediction algorithm, it is important to analyze the dataset's characteristics. This research presents a comprehensive framework for addressing data imbalance in colon cancer datasets, which, together with high dimensionality, has been a significant challenge in previous studies on colon cancer prediction. Both are important aspects of preprocessing: balancing refers to adjusting the data points to a proper proportion across class labels, and feature selection is the process of selecting the strongest features from the available data space. This study aims to improve the performance of the popular tree, rule, lazy (K-nearest neighbor (KNN)) classifiers and the support vector machine (SVM) algorithm after addressing the imbalance issue and applying various feature selection methods such as chi-square, symmetrical uncertainty, correlation-based feature selection (CFS) subset, and classifier subset evaluators. The proposed research framework shows that after balancing the dataset, all the algorithms performed better with all applied feature selection methods. Out of all methods, JRip records 85.71% accuracy with classifier subset evaluators, Ridor achieves 84.52% accuracy with CFS, J48 produces 83.33% accuracy with both CFS and classifier subset evaluators, Simple CART achieves 84.52% with classifier subset evaluators, KNN records 91.66% accuracy with chi-square and CFS, and SVM produces 92.85% with symmetrical uncertainty.
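Chi-square feature selection, the first of the methods listed above, can be sketched with scikit-learn; the gene-expression-style data below is synthetic (two deliberately informative columns plus noise), not the study's colon cancer dataset:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

rng = np.random.default_rng(42)
y = rng.integers(0, 2, size=200)  # binary class labels

# Hypothetical data: 2 class-dependent features followed by 8 noise features.
# chi2 requires non-negative inputs, which these constructions guarantee.
informative = y[:, None] * 2.0 + rng.random((200, 2))
noise = rng.random((200, 8))
X = np.hstack([informative, noise])

# Keep the 2 features with the highest chi-square statistic w.r.t. the label.
selector = SelectKBest(chi2, k=2).fit(X, y)
chosen = np.where(selector.get_support())[0]
```

The study applies this and the other evaluators only after rebalancing, which is why the reported accuracies (e.g., KNN at 91.66% with chi-square) improve across the board.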
Abstract: One of the most dangerous forms of cancer, skin cancer has been on the rise over the past ten years. Melanoma detection uses deep learning algorithms to analyze images and accurately diagnose melanoma, and early diagnosis may lead to better treatment outcomes. Deep learning then makes it possible to identify skin cancer in a matter of seconds. In this research, a deep learning-based automatic skin cancer detection method is proposed. Data were taken from the ISIC dataset, which contains 2357 images. To obtain average color information and normalize all color channel information, we used a few preprocessing approaches. The images were then categorized and reshaped. To avoid overfitting, we additionally employed data augmentation. In the end, a convolutional neural network was used to achieve our goal, which improved prediction accuracy. Using the ResNet50 architecture, the accuracy rose to 98%.
Abstract: Over the past ten years, there has been an increase in cardiovascular disease, one of the most dangerous classes of disease. Cardiovascular detection analyzes data and precisely diagnoses cardiovascular disease using machine learning algorithms, and early diagnosis may lead to better outcomes for heart treatment. Machine learning then makes it possible to detect cardiac disease in a couple of seconds. This study proposes an automatic machine learning approach for detecting cardiovascular diseases such as heart disease. A physician's accurate and thorough evaluation of a patient's cardiovascular risk plays a critical role in lowering the incidence and severity of heart attacks and strokes as well as improving cardiovascular protection. To develop technology for the early detection of cardiovascular disease, a Kaggle dataset was gathered. Certain preprocessing techniques were used to improve accuracy and outcomes. Ultimately, we employed decision trees, logistic regression, and random forests to reach our objective. Of these, random forest yielded the highest accuracy of 96%, making it useful for obtaining high-quality results with greater precision.
Abstract: Network intrusion detection systems need to be updated due to the rise in cyber threats. To improve detection accuracy, this research presents a robust strategy that makes use of a stacked ensemble method, which combines the advantages of several machine learning models. The ensemble is made up of various base models, such as Decision Trees, K-Nearest Neighbors (KNN), Multi-Layer Perceptrons (MLP), and Naive Bayes, each of which offers a distinct perspective on the properties of the data. The research adheres to a methodical workflow that begins with thorough data preprocessing to guarantee the accuracy and applicability of the data. Feature engineering is used to extract useful attributes from network traffic data, which are essential for efficient model training. The ensemble approach combines these models by training a Logistic Regression meta-learner on the base models' predictions. In addition to increasing prediction accuracy, this tiered approach helps overcome the drawbacks of using individual models. The model's evaluation on a network intrusion dataset shows high accuracy, precision, and recall, indicating its efficacy in identifying malicious activity. Cross-validation is used to make sure the models are reliable and generalize well to new, untested data. In addition to advancing cybersecurity, the research establishes a foundation for the implementation of flexible and scalable intrusion detection systems. This hybrid, stacked ensemble model has a lot of potential for improving cyberattack prevention, lowering the likelihood of cyberattacks, and offering a scalable solution that can be adjusted to meet new threats and technological advancements.
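The stacking arrangement described above (the four base learners feeding a Logistic Regression meta-learner) maps directly onto scikit-learn's `StackingClassifier`. The synthetic dataset below stands in for the network intrusion data, which is not reproduced here:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for preprocessed network traffic features.
X, y = make_classification(n_samples=600, n_features=20,
                           n_informative=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# The four base models named in the abstract.
base = [
    ("dt", DecisionTreeClassifier(random_state=0)),
    ("knn", KNeighborsClassifier()),
    ("nb", GaussianNB()),
    ("mlp", MLPClassifier(max_iter=500, random_state=0)),
]

# Logistic Regression meta-learner stacks the base models' cross-validated
# predictions (StackingClassifier handles the internal CV folds itself).
stack = StackingClassifier(estimators=base, final_estimator=LogisticRegression())
stack.fit(X_tr, y_tr)
acc = stack.score(X_te, y_te)
```

`StackingClassifier`'s built-in cross-validated prediction of the base-model outputs is what prevents the meta-learner from overfitting to any single base model, the drawback the abstract alludes to.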