Automated classification of gas flow states in blast furnaces using top-camera imagery typically demands a large volume of labeled data, whose manual annotation is both labor-intensive and cost-prohibitive. To mitigate this challenge, we present an enhanced semi-supervised learning approach based on the Mean Teacher framework, incorporating a novel feature loss module to maximize classification performance with limited labeled samples. Experimental results show that the proposed model surpasses both the baseline Mean Teacher model and the fully supervised method in accuracy. Specifically, for datasets with 20%, 30%, and 40% label ratios, a single training iteration yields accuracies of 78.61%, 82.21%, and 85.2%, respectively, while multiple-cycle training achieves 82.09%, 81.97%, and 81.59%, respectively. Furthermore, scenario-specific training schemes are introduced to support diverse deployment needs. These findings highlight the potential of the proposed technique in minimizing labeling requirements and advancing intelligent blast furnace diagnostics.
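The abstract above builds on the Mean Teacher framework. As a rough illustration of that baseline only, the sketch below shows its two standard ingredients: the teacher weights follow an exponential moving average (EMA) of the student weights, and a consistency loss penalizes disagreement between student and teacher predictions on the same input. The function names, the flat weight lists, and the `alpha` values are assumptions made for illustration; the paper's feature loss module is not described in the abstract and is not reproduced here.

```python
def ema_update(teacher, student, alpha=0.99):
    """Return new teacher weights as an EMA of the student weights.

    teacher, student: flat lists of floats (a stand-in for real model tensors).
    alpha: smoothing factor; values near 1 make the teacher change slowly.
    """
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def consistency_loss(student_probs, teacher_probs):
    """Mean squared error between student and teacher class probabilities."""
    n = len(student_probs)
    return sum((s - t) ** 2 for s, t in zip(student_probs, teacher_probs)) / n


# Toy usage: one EMA step moves the teacher a small way toward the student.
teacher_weights = ema_update([0.0, 1.0], [1.0, 0.0], alpha=0.9)
loss = consistency_loss([0.7, 0.3], [0.6, 0.4])
```

In a real training loop the supervised loss on labeled samples would be added to a weighted consistency term computed on all samples, and `ema_update` would run after every optimizer step.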
Quantitative analysis of aluminum-silicon (Al-Si) alloy microstructure is crucial for evaluating and controlling alloy performance. Conventional analysis methods rely on manual segmentation, which is inefficient and subjective, while fully supervised deep learning approaches require extensive and expensive pixel-level annotated data. Furthermore, existing semi-supervised methods still face challenges in handling the adhesion of adjacent primary silicon particles and in effectively utilizing consistency in unlabeled data. To address these issues, this paper proposes a novel semi-supervised framework for Al-Si alloy microstructure image segmentation. First, we introduce a Rotational Uncertainty Correction Strategy (RUCS). This strategy employs multi-angle rotational perturbations and Monte Carlo sampling to assess prediction consistency, generating a pixel-wise confidence weight map. By integrating this map into the loss function, the model dynamically focuses on high-confidence regions, thereby improving generalization ability while reducing manual annotation pressure. Second, we design a Boundary Enhancement Module (BEM) to strengthen boundary feature extraction through erosion difference and multi-scale dilated convolutions. This module guides the model to focus on the boundary regions of adjacent particles, effectively resolving particle adhesion and improving segmentation accuracy. Systematic experiments were conducted on the Aluminum-Silicon Alloy Microstructure Dataset (ASAD). Results indicate that the proposed method performs exceptionally well with scarce labeled data. Specifically, using only 5% labeled data, our method improves the Jaccard index and Adjusted Rand Index (ARI) by 2.84 and 1.57 percentage points, respectively, and reduces the Variation of Information (VI) by 8.65 compared to state-of-the-art semi-supervised models, approaching the performance levels of 10% labeled data. These results demonstrate that the proposed method significantly enhances the accuracy and robustness of quantitative microstructure analysis while reducing annotation costs.
To address the scarcity of labeled samples and the operating-condition variations that degrade the accuracy of fault diagnosis models, this paper proposes a semi-supervised masked contrastive learning and domain adaptation (SSMCL-DA) method for gearbox fault diagnosis under variable conditions. Initially, during the unsupervised pre-training phase, a dual signal augmentation strategy is devised, which simultaneously applies random masking in the time domain and random scaling in the frequency domain to unlabeled samples, thereby constructing more challenging positive sample pairs to guide the encoder in learning intrinsic features robust to condition variations. Subsequently, a ConvNeXt-Transformer hybrid architecture is employed, integrating the superior local detail modeling capacity of ConvNeXt with the robust global perception capability of the Transformer to enhance feature extraction in complex scenarios. Thereafter, a contrastive learning model is constructed with the optimization objective of maximizing feature similarity across different masked instances of the same sample, enabling the extraction of consistent features from multiple masked perspectives and reducing reliance on labeled data. In the final supervised fine-tuning phase, a multi-scale attention mechanism is incorporated for feature rectification, and a domain adaptation module combining Local Maximum Mean Discrepancy (LMMD) with adversarial learning is proposed. This module embodies a dual mechanism: LMMD facilitates fine-grained class-conditional alignment, compelling features of identical fault classes to converge across varying conditions, while the domain discriminator utilizes adversarial training to guide the feature extractor toward learning domain-invariant features. Working in concert, they markedly diminish feature distribution discrepancies induced by changes in load, rotational speed, and other factors, thereby boosting the model’s adaptability to cross-condition scenarios. Experimental evaluations on the WT planetary gearbox dataset and the Case Western Reserve University (CWRU) bearing dataset demonstrate that the SSMCL-DA model effectively identifies multiple fault classes in gearboxes, with diagnostic performance substantially surpassing that of conventional methods. Under cross-condition scenarios, the model attains fault diagnosis accuracies of 99.21% for the WT planetary gearbox and 99.86% for the bearings. Furthermore, the model exhibits stable generalization capability in cross-device settings.
Asparagus stem blight is a devastating crop disease, and the early detection of its pathogenic spores is essential for effective disease control and prevention. However, spore detection is still hindered by complex backgrounds, small target sizes, and high annotation costs, which limit its practical application and widespread adoption. To address these issues, a semi-supervised spore detection framework is proposed for use under complex background conditions. Firstly, a difficulty perception scoring function is designed to quantify the detection difficulty of each image region. A masking strategy is applied to regions with higher difficulty scores, while adversarial augmentation is applied to the remaining regions, encouraging the model to learn from challenging areas more effectively. Secondly, a Gaussian Mixture Model is employed to dynamically adjust the allocation threshold for pseudo-labels, thereby reducing the influence of unreliable supervision signals and enhancing the stability of semi-supervised learning. Finally, the Wasserstein distance is introduced for object localization refinement, offering a more robust positioning approach. Experimental results demonstrate that the proposed framework achieves 88.9% mAP50 and 60.7% mAP50-95, surpassing the baseline method by 4.2% and 4.6%, respectively, using only 10% of labeled data. In comparison with other state-of-the-art semi-supervised detection models, the proposed method exhibits superior detection accuracy and robustness. In conclusion, the framework not only offers an efficient and reliable solution for plant pathogen spore detection but also provides strong algorithmic support for real-time spore detection and early disease warning systems, with significant engineering application potential.
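The abstract above does not say how the Wasserstein distance enters the localization step. One common construction for small objects, assumed here purely for illustration, models each box (cx, cy, w, h) as a 2-D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4); the 2-Wasserstein distance between two such Gaussians then has a closed form. The function names and the `scale` constant are hypothetical.

```python
import math


def wasserstein2_boxes(box_a, box_b):
    """2-Wasserstein distance between two boxes modelled as axis-aligned Gaussians.

    Each box is (cx, cy, w, h). With covariance diag(w^2/4, h^2/4), the squared
    distance reduces to the squared Euclidean distance between the vectors
    (cx, cy, w/2, h/2).
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2.0) ** 2
        + ((ha - hb) / 2.0) ** 2
    )
    return math.sqrt(squared)


def normalized_similarity(box_a, box_b, scale=10.0):
    """Map the distance to (0, 1]; `scale` is an assumed dataset-dependent constant."""
    return math.exp(-wasserstein2_boxes(box_a, box_b) / scale)


distance = wasserstein2_boxes((0.0, 0.0, 2.0, 2.0), (3.0, 4.0, 2.0, 2.0))
similarity = normalized_similarity((1.0, 1.0, 4.0, 4.0), (1.0, 1.0, 4.0, 4.0))
```

Unlike IoU, this measure stays informative when two tiny boxes do not overlap at all, which is one plausible reason to prefer it for spore-sized targets.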
Satellite image segmentation plays a crucial role in remote sensing, supporting applications such as environmental monitoring, land use analysis, and disaster management. However, traditional segmentation methods often rely on large amounts of labeled data, which are costly and time-consuming to obtain, especially in large-scale or dynamic environments. To address this challenge, we propose the Semi-Supervised Multi-View Picture Fuzzy Clustering (SS-MPFC) algorithm, which improves segmentation accuracy and robustness, particularly in complex and uncertain remote sensing scenarios. SS-MPFC unifies three paradigms: semi-supervised learning, multi-view clustering, and picture fuzzy set theory. This integration allows the model to effectively utilize a small number of labeled samples, fuse complementary information from multiple data views, and handle the ambiguity and uncertainty inherent in satellite imagery. We design a novel objective function that jointly incorporates picture fuzzy membership functions across multiple views of the data, and embeds pairwise semi-supervised constraints (must-link and cannot-link) directly into the clustering process to enhance segmentation accuracy. Experiments conducted on several benchmark satellite datasets demonstrate that SS-MPFC significantly outperforms existing state-of-the-art methods in segmentation accuracy, noise robustness, and semantic interpretability. On the Augsburg dataset, SS-MPFC achieves a Purity of 0.8158 and an Accuracy of 0.6860, highlighting its robustness and efficiency. These results demonstrate that SS-MPFC offers a scalable and effective solution for real-world satellite-based monitoring systems, particularly in scenarios where rapid annotation is infeasible, such as wildfire tracking, agricultural monitoring, and dynamic urban mapping.
High-dimensional data causes difficulties in machine learning due to high time consumption and large memory requirements. In particular, in a multi-label environment, the complexity grows with the number of labels. Moreover, an optimization problem that fully considers all dependencies between features and labels is difficult to solve. In this study, we propose a novel regression-based multi-label feature selection method that integrates mutual information to better exploit the underlying data structure. By incorporating mutual information into the regression formulation, the model captures not only linear relationships but also complex non-linear dependencies. The proposed objective function simultaneously considers three types of relationships: (1) feature redundancy, (2) feature-label relevance, and (3) inter-label dependency. These three quantities are computed using mutual information, allowing the proposed formulation to capture nonlinear dependencies among variables. These three types of relationships are key factors in multi-label feature selection, and our method expresses them within a unified formulation, enabling efficient optimization while simultaneously accounting for all of them. To efficiently solve the proposed optimization problem under non-negativity constraints, we develop a gradient-based optimization algorithm with fast convergence. The experimental results on seven multi-label datasets show that the proposed method outperforms existing multi-label feature selection techniques.
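The three quantities named in the abstract above (feature redundancy, feature-label relevance, and inter-label dependency) are all mutual information values between pairs of variables. A minimal sketch of that building block for discrete variables follows; the function name is hypothetical, and how the paper discretizes features or combines the terms in its regression objective is not stated in the abstract.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    total = 0.0
    for (x, y), c in count_xy.items():
        p_xy = c / n
        p_x = count_x[x] / n
        p_y = count_y[y] / n
        total += p_xy * math.log(p_xy / (p_x * p_y))
    return total


feature = [0, 0, 1, 1]
label_a = [0, 0, 1, 1]   # fully determined by the feature
label_b = [0, 1, 0, 1]   # independent of the feature

relevance = mutual_information(feature, label_a)
no_relevance = mutual_information(feature, label_b)
label_dependency = mutual_information(label_a, label_b)
```

Calling the same function on feature-feature, feature-label, and label-label pairs yields the redundancy, relevance, and dependency terms respectively.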
The thermal and electrical conductivities of magnesium alloys are highly sensitive to composition and microstructure, with thermal conductivity varying by up to 20-fold across different as-cast alloy systems, making rapid and accurate prediction crucial for high-throughput screening and development of high-performance alloys. This study introduces a physics-informed symbolic regression approach that addresses the limitations of traditional methods, including the high computational cost of first-principles calculations and the poor interpretability of machine learning models. Comprehensive datasets comprising 1512 data points from 60 literature sources were analyzed, including thermal conductivity measurements from 52 alloy systems and electrical conductivity measurements from 36 systems. The derived symbolic regression model achieved Mean Absolute Percentage Errors (MAPEs) of 11.2% and 11.4% for thermal conductivity in low- and high-component systems, respectively. When integrated with the Smith-Palmer equation, electrical conductivity predictions reached MAPEs of 15.6% and 16.4%. Independent validation on an entirely separate dataset of 554 data points from 53 additional literature sources, including 37 previously unseen alloy systems, confirmed model generalizability with MAPEs of 10.7%-15.2%. Shapley Additive Explanations (SHAP) analysis was employed to evaluate the relative importance of different features affecting conductivity, while equation decomposition quantified the contribution of individual functional terms. This methodology bridges data-driven prediction with mechanistic understanding, establishing a foundation for knowledge-based design of magnesium alloys with tailored transport properties.
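For readers unfamiliar with the two formulas the abstract above relies on, they can be written out as follows. The Smith-Palmer equation is given in its usual general form; the fitted constants $A$ and $B$ depend on the alloy family and are not stated in the abstract, so they are left symbolic here.

```latex
% Smith-Palmer relation between thermal conductivity \lambda and
% electrical conductivity \sigma at absolute temperature T
\lambda = A \, L_0 \, T \, \sigma + B,
\qquad
L_0 = \frac{\pi^2 k_B^2}{3 e^2} \approx 2.44 \times 10^{-8}\ \mathrm{W\,\Omega\,K^{-2}}

% Solving for \sigma gives the electrical conductivity from a predicted \lambda
\sigma = \frac{\lambda - B}{A \, L_0 \, T}

% Mean Absolute Percentage Error over n samples with
% measured values y_i and predicted values \hat{y}_i
\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|
```

Because $\sigma$ is obtained by inverting the first relation, any error in the predicted $\lambda$ carries over to $\sigma$, which is consistent with the electrical conductivity MAPEs being somewhat higher than the thermal ones.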
Traditional oilfields face increasing extraction challenges, primarily due to reservoir quality degradation and production decline, which are further exacerbated by volatile international crude oil prices, as illustrated by Brent Crude’s trajectory from pandemic-induced negative pricing to geopolitically driven surges exceeding USD 100 per barrel. This study addresses these complexities through an integrated methodological framework applied to medium-permeability sandstone reservoirs in the Xinjiang oilfield by combining advanced numerical simulations with multivariate regression analysis. The methodology employs Latin Hypercube Sampling (LHS) to stratify geological parameter distributions and constructs heterogeneous reservoir models using Petrel software, rigorously validated through historical production data matching. Production forecasting integrates numerical simulation and Decline Curve Analysis (DCA), while investment estimation utilizes Ordinary Least Squares (OLS) regression to correlate engineering parameters with drilling and completion costs. Economic evaluation incorporates Discounted Cash Flow (DCF) modeling and breakeven analysis, establishing techno-economic boundaries via oil price sensitivity analysis ranging from USD 40 to 90 per barrel. Visualization tools, including 3D heatmaps, delineate nonlinear interactions among engineering, geological, and investment datasets under economic constraints. Key findings demonstrate that for the target reservoirs, as oil prices increase from USD 40 to USD 90 per barrel, the minimum economic thickness threshold decreases from approximately 5.7 m to about 2.5 m, with model prediction errors consistently below 25% across validation datasets. This framework provides scientifically grounded decision support for optimizing capital allocation and offers actionable insights to enhance undeveloped hydrocarbon development planning amid market uncertainty. Ultimately, it supports national energy security through technically robust and economically viable resource exploitation strategies.
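The Discounted Cash Flow and breakeven steps mentioned in the abstract above can be sketched in a few lines. This is a simplified illustration, not the study's economic model: it assumes a flat oil price, a constant per-barrel operating cost, a single up-front investment, and no taxes or royalties. All function and parameter names are hypothetical.

```python
def npv(cash_flows, rate):
    """Net present value; cash_flows[0] occurs at year 0 and is not discounted."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def breakeven_price(capex, yearly_volumes, opex_per_bbl, rate):
    """Flat oil price (per barrel) at which the project NPV is exactly zero.

    Setting -capex + sum((p - opex) * v_t / (1 + r)^t) = 0 and solving for p
    gives p = opex + capex / (discounted production volume).
    """
    discounted_volume = sum(
        v / (1.0 + rate) ** t for t, v in enumerate(yearly_volumes, start=1)
    )
    return opex_per_bbl + capex / discounted_volume


# Toy numbers: 1000 invested now, 10 barrels per year for two years,
# operating cost of 20 per barrel, no discounting.
price = breakeven_price(capex=1000.0, yearly_volumes=[10.0, 10.0],
                        opex_per_bbl=20.0, rate=0.0)
check = npv([-1000.0, (price - 20.0) * 10.0, (price - 20.0) * 10.0], rate=0.0)
```

Repeating the breakeven calculation for reservoirs of different thickness, and therefore different producible volumes, is one way a price-dependent minimum economic thickness could be derived.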
The accessibility of urban public transit directly influences residents’ quality of life, travel behavior, and social equity. Its correlation with housing prices has garnered significant attention across disciplines such as geography, economics, and urban planning. Although much existing research focuses on the impact of individual transportation facilities on housing prices, there is a notable gap in comprehensive analyses that assess the influence of overall urban transit accessibility on housing market dynamics. This study selected the main urban area of Hefei, China, as a case to investigate the spatial distribution of housing prices and evaluate public transit accessibility in 2022. Employing techniques such as the optimized parameter geographical detector and local spatial regression models, the study aimed to elucidate the effects and underlying mechanisms of urban transit accessibility on housing prices. The findings revealed that: 1) housing prices in Hefei exhibited a clustered spatial pattern, with high prices concentrated in the city center and lower prices in peripheral areas, forming three distinct high-price hotspots with a ‘belt-like’ distribution; 2) public transit accessibility showed a ‘core-periphery’ structure, with accessibility declining in a ‘circumferential’ pattern around the city center. Based on the ‘housing price-accessibility’ dimension, four categories were identified: high price-high accessibility (37.25%), high price-low accessibility (19.07%), low price-high accessibility (21.95%), and low price-low accessibility (21.73%); 3) the impact of transit accessibility on housing prices was spatially heterogeneous, with bus travel showing the strongest explanatory power (0.692), followed by automobile, subway, and bicycle travel. The interaction of these transportation modes generated a synergistic effect on housing price differentiation, with most influencing factors contributing more than 25%. These findings offer valuable insights for optimizing the spatial distribution of public transit infrastructure and improving both urban housing quality and residents’ living standards.
BACKGROUND Paternal perinatal depression (PPD) is closely associated with maternal mental health challenges, marital strain, and adverse child developmental outcomes. Despite its significant impact, PPD remains under-recognized in family-centered clinical practice. Concurrently, against the backdrop of rising rates of delayed marriage and China’s Maternity Incentive Policy, the proportion of women giving birth at an advanced maternal age is increasing. Nevertheless, research specifically examining PPD among spouses of older mothers remains critically scarce, both in China and globally. AIM To investigate PPD and its influencing factors in Chinese advanced maternal age families. METHODS This cross-sectional study included 358 participants. It was conducted among expectant fathers whose partners were pregnant at an advanced maternal age, at five hospitals in the Pearl River Delta region of China from September 2023 to June 2024. Data were collected via a general information questionnaire, the Social Support Rating Scale, and the Edinburgh Postnatal Depression Scale. Latent profile analysis and regression mixture models (RMMs) were adopted to analyze the latent PPD types and the factors that influenced PPD. RESULTS The incidence of PPD was 16.48%, and three profiles were identified: low-symptomatic (175 cases, 48.89%), monophasic (140 cases, 39.10%), and high-symptomatic (43 cases, 12.01%). The RMM analysis revealed that first pregnancy, low income (<¥3000/month), part-time work, and a history of abnormal pregnancy were positively associated with the high-symptomatic type (P<0.05). Conversely, high subjective support and support utilization were negatively associated with the high-symptomatic type compared with the low-symptomatic type (P<0.05). Good couple relationships, high objective and subjective support, and high support utilization were negatively associated with the monophasic type (P<0.05). CONCLUSION PPD incidence is high among Chinese fathers with advanced maternal age partners, and the characteristics of depression are varied. Healthcare practitioners should prioritize individuals with low levels of social support.
This paper proposed a semi-supervised regression model with a co-training algorithm based on support vector machines, which was used for retrieving water quality variables from SPOT 5 remote sensing data. The model consisted of two support vector regressors (SVRs). The nonlinear relationship between water quality variables and the SPOT 5 spectrum was described by the two SVRs, and a semi-supervised co-training algorithm for the SVRs was established. The model was used for retrieving concentrations of four representative pollution indicators of the Weihe River in Shaanxi Province, China: permanganate index (CODmn), ammonia nitrogen (NH3-N), chemical oxygen demand (COD), and dissolved oxygen (DO). A spatial distribution map of those variables over a part of the Weihe River was also produced. SVR can readily implement any nonlinear mapping, and semi-supervised learning can make use of both labeled and unlabeled samples. By integrating the two SVRs and using semi-supervised learning, we provide an operational method for cases where paired samples are limited. The results show that it performs much better than the multiple statistical regression method, can quickly provide a complete picture of water pollution conditions for management, and can be extended to hyperspectral remote sensing applications.
The accuracy of the laser-induced breakdown spectroscopy (LIBS) quantitative method is greatly dependent on the amount of certified standard samples used for training. However, in practical applications, only limited standard samples with labeled certified concentrations are available. A novel semi-supervised LIBS quantitative analysis method is proposed, based on a co-training regression model with selection of effective unlabeled samples. The main idea of the proposed method is to obtain better regression performance by adding effective unlabeled samples in semi-supervised learning. First, effective unlabeled samples are selected according to the testing samples by the Euclidean metric. Two original regression models based on least squares support vector machines with different parameters are trained by the labeled samples separately, and then the effective unlabeled samples predicted by the two models are used to enlarge the training dataset based on labeling confidence estimation. The final predictions of the proposed method on the testing samples are determined by weighted combinations of the predictions of the two updated regression models. Chromium concentration analysis experiments of 23 certified standard high-alloy steel samples were carried out, in which 5 samples with labeled concentrations and 11 unlabeled samples were used to train the regression models and the remaining 7 samples were used for testing. As the number of effective unlabeled samples increased, the root mean square error of the proposed method went down from 1.80% to 0.84% and the relative prediction error was reduced from 9.15% to 4.04%.
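The first step described above, picking "effective" unlabeled samples by their Euclidean distance to the testing samples, can be sketched as follows. The abstract does not give the exact selection rule, so this sketch assumes one plausible reading: rank each unlabeled spectrum by its distance to the nearest test spectrum and keep the `k` closest. The function names and the toy two-channel spectra are hypothetical, and the least squares support vector machines themselves are not reproduced.

```python
import math


def select_effective_unlabeled(unlabeled, test, k):
    """Return the k unlabeled samples closest (Euclidean) to any test sample."""
    def nearest_test_distance(sample):
        return min(math.dist(sample, t) for t in test)

    return sorted(unlabeled, key=nearest_test_distance)[:k]


def rmse(y_true, y_pred):
    """Root mean square error, the headline metric quoted in the abstract."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)


# Toy "spectra" with two channels each.
unlabeled_spectra = [(0.0, 0.0), (5.0, 5.0), (1.0, 1.0)]
test_spectra = [(1.0, 0.0)]
chosen = select_effective_unlabeled(unlabeled_spectra, test_spectra, k=2)
error = rmse([1.0, 2.0], [1.0, 4.0])
```

In the full method, the chosen samples would be pseudo-labeled by one regressor and added to the other's training set only when the labeling confidence estimate indicates an improvement.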
Accurate prediction of the remaining useful life (RUL) is crucial for the design and management of lithium-ion batteries. Although various machine learning models offer promising predictions, one critical but often overlooked challenge is their demand for considerable run-to-failure data for training. Collection of such training data leads to prohibitive testing efforts as the run-to-failure tests can last for years. Here, we propose a semi-supervised representation learning method to enhance prediction accuracy by learning from data without RUL labels. Our approach builds on a deep neural network that comprises an encoder and three decoder heads to extract time-dependent representation features from short-term battery operating data regardless of the existence of RUL labels. The approach is validated using three datasets collected from 34 batteries operating under various conditions, encompassing over 19,900 charge and discharge cycles. Our method achieves a root mean squared error (RMSE) within 25 cycles, even when only 1/50 of the training dataset is labelled, representing a reduction of 48% compared to the conventional approach. We also demonstrate the method's robustness with varying numbers of labelled data and different weights assigned to the three decoder heads. The projection of extracted features into a low-dimensional space reveals that our method effectively learns degradation features from unlabelled data. Our approach highlights the promise of utilising semi-supervised learning to reduce the data demand for reliability monitoring of energy devices.
Existing semi-supervised medical image segmentation algorithms use copy-paste data augmentation to correct the labeled-unlabeled data distribution mismatch. However, current copy-paste methods have three limitations: (1) training the model solely with copy-paste mixed images from labeled and unlabeled input loses a lot of labeled information; (2) low-quality pseudo-labels can cause confirmation bias in pseudo-supervised learning on unlabeled data; (3) the segmentation performance in low-contrast and local regions is suboptimal. We design a Stochastic Augmentation-Based Dual-Teaching Auxiliary Training Strategy (SADT), which enhances feature diversity and learns high-quality features to overcome these problems. More precisely, SADT trains the Student Network by using pseudo-label-based training from Teacher Network 1 and supervised learning with labeled data, which prevents the loss of rare labeled data. We introduce a bi-directional copy-paste mask with progressive high-entropy filtering to reduce data distribution disparities and mitigate confirmation bias in pseudo-supervision. For the mixed images, Deep-Shallow Spatial Contrastive Learning (DSSCL) is proposed in the feature spaces of Teacher Network 2 and the Student Network to improve the segmentation capabilities in low-contrast and local areas. In this procedure, the features extracted by the Student Network are subjected to a random feature perturbation technique. Extensive experiments on two openly available datasets show that our proposed SADT performs much better than state-of-the-art semi-supervised medical segmentation techniques. Using only 10% of the labeled data for training, SADT achieved a Dice score of 90.10% on the ACDC (Automatic Cardiac Diagnosis Challenge) dataset.
Background: The impact of COVID-19 on influenza activity is of interest to inform future flu prevention and control strategies. Our study aims to examine COVID-19’s effects on influenza in Fujian Province, China, using a regression discontinuity (RD) design. Methods: We utilized influenza-like illness (ILI) percentage as an indicator of influenza activity, with data from all sentinel hospitals between Week 4, 2020, and Week 51, 2023. The data were divided into two groups: the COVID-19 epidemic period and the post-epidemic period. Statistical analysis was performed with R software using robust RD design methods to account for potential confounders including seasonality, temperature, and influenza vaccination rates. Results: There was a discernible increase in the ILI percentage during the post-epidemic period. The robustness of the findings was confirmed with various RD design bandwidth selection methods and placebo tests, with the certwo bandwidth providing the largest estimated effect size: a 14.6-percentage-point increase in the ILI percentage (β=0.146; 95% CI: 0.096–0.196). Sensitivity analyses and adjustments for confounders consistently pointed to an increased ILI percentage during the post-epidemic period compared to the epidemic period. Conclusion: The 14.6-percentage-point increase in the ILI percentage in Fujian Province, China, after the end of the COVID-19 pandemic suggests that there may be a need to re-evaluate and possibly enhance public health measures to control influenza transmission. Further research is needed to fully understand the factors contributing to this rise and to assess the ongoing impacts of post-pandemic behavioral changes.
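The core idea of the regression discontinuity design used above can be shown with a small sketch: fit a separate straight line on each side of the cutoff (here, the week the epidemic period ended), using only observations within a bandwidth, and read the effect as the gap between the two fitted values at the cutoff. This is a plain local linear fit with equal weights, written for illustration; it does not reproduce the data-driven bandwidth selection, kernel weighting, or robust bias-corrected confidence intervals of the study's R analysis, and the function names and toy data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rd_estimate(x, y, cutoff, bandwidth):
    """Jump in y at the cutoff, from separate linear fits on each side."""
    left = [(xi - cutoff, yi) for xi, yi in zip(x, y)
            if cutoff - bandwidth <= xi < cutoff]
    right = [(xi - cutoff, yi) for xi, yi in zip(x, y)
             if cutoff <= xi <= cutoff + bandwidth]
    # x is centered on the cutoff, so each intercept is the fitted value there.
    left_at_cutoff, _ = fit_line(*zip(*left))
    right_at_cutoff, _ = fit_line(*zip(*right))
    return right_at_cutoff - left_at_cutoff


# Toy series: a steady trend of 0.1 per week plus a jump of 2.0 from week 5 on.
weeks = list(range(10))
ili = [0.1 * w + (2.0 if w >= 5 else 0.0) for w in weeks]
jump = rd_estimate(weeks, ili, cutoff=5, bandwidth=5)
```

A narrower bandwidth uses fewer points but relies less on the linear trend holding far from the cutoff, which is why the study reports results under several bandwidth selectors.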
This study numerically examines the heat and mass transfer characteristics of two ternary nanofluids in converging and diverging channels. Furthermore, the study aims to assess two ternary nanofluid combinations to determine which configuration can provide better heat and mass transfer and lower entropy production, while ensuring cost efficiency. This work bridges the gap between academic research and industrial feasibility by incorporating cost analysis, entropy generation, and thermal efficiency. To compare the velocity, temperature, and concentration profiles, we examine two ternary nanofluids, i.e., TiO2+SiO2+Al2O3/H2O and TiO2+SiO2+Cu/H2O, while considering the shape of nanoparticles. The velocity slip and Soret/Dufour effects are taken into consideration. Furthermore, regression analysis for the Nusselt and Sherwood numbers of the model is carried out. The Runge-Kutta fourth-order method with a shooting technique is employed to obtain the numerical solution of the governing system of ordinary differential equations. The flow pattern attributes of ternary nanofluids are meticulously examined and simulated under variation of the flow-dominating parameters. Additionally, the influence of these parameters is demonstrated in the flow, temperature, and concentration fields. For variation in Eckert and Dufour numbers, TiO2+SiO2+Al2O3/H2O has a higher temperature than TiO2+SiO2+Cu/H2O. The results obtained indicate that the ternary nanofluid TiO2+SiO2+Al2O3/H2O has a higher heat transfer rate, lower entropy generation, greater mass transfer rate, and lower cost than the TiO2+SiO2+Cu/H2O ternary nanofluid.
Understanding how dataset size affects regression models can help improve the accuracy of solar power forecasts and make the most of renewable energy systems. This research explores the influence of dataset size on the accuracy and reliability of regression models for solar power prediction, contributing to better forecasting methods. The study analyzes data from two solar panels, aSiMicro03036 and aSiTandem72-46, over 7, 14, 17, 21, 28, and 38 days, with each dataset comprising five independent parameters and one dependent parameter, split 80–20 for training and testing. Results indicate that Random Forest consistently outperforms other models, achieving the highest correlation coefficient of 0.9822 and the lowest Mean Absolute Error (MAE) of 2.0544 on the aSiTandem72-46 panel with 21 days of data. For the aSiMicro03036 panel, the best MAE of 4.2978 was reached using the k-Nearest Neighbor (k-NN) algorithm, implemented as the instance-based k-nearest neighbors (IBk) learner in Weka and trained on 17 days of data. Regression performance for most models (excluding IBk) stabilizes at 14 days or more. Compared to the 7-day dataset, increasing to 21 days reduced the MAE by around 20% and improved correlation coefficients by around 2.1%, highlighting the value of moderate dataset expansion. These findings suggest that datasets spanning 17 to 21 days, with 80% used for training, can significantly enhance the predictive accuracy of solar power generation models.
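The two evaluation metrics and the 80–20 split quoted in the abstract above are simple to state in code. The sketch below is illustrative only: the abstract does not say whether the split was chronological or random, so a simple ordered split is assumed, and all function names are hypothetical. The study itself used Weka rather than Python.

```python
import math


def split_80_20(rows):
    """Ordered split: the first 80% of rows for training, the rest for testing."""
    cut = len(rows) * 4 // 5
    return rows[:cut], rows[cut:]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def correlation_coefficient(a, b):
    """Pearson correlation between two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


train_rows, test_rows = split_80_20(list(range(10)))
mae = mean_absolute_error([1.0, 2.0, 3.0], [2.0, 2.0, 5.0])
r = correlation_coefficient([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Note that MAE is expressed in the units of the predicted power output, so the values 2.0544 and 4.2978 reported for the two panels are not directly comparable unless both panels share the same output scale.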
As the core component of inertial navigation systems, fiber optic gyroscopes (FOGs), with technical advantages such as low power consumption, long lifespan, fast startup, and flexible structural design, are widely used in aerospace, unmanned driving, and other fields. However, because optical devices are temperature-sensitive, changes in environmental temperature cause errors in FOGs, greatly limiting their output accuracy. This work investigates machine-learning-based temperature error compensation techniques for FOGs. Specifically, it focuses on compensating for the bias errors generated in the fiber ring by the Shupe effect. This work proposes a composite model based on k-means clustering, support vector regression, and particle swarm optimization algorithms. It also significantly reduces redundancy within the samples by adopting interval sequence samples. Moreover, metrics such as root mean square error (RMSE), mean absolute error (MAE), bias stability, and Allan variance are selected to evaluate the model's performance and compensation effectiveness. This work effectively enhances the consistency between data and models across different temperature ranges and temperature gradients, improving the bias stability of the FOG from 0.022 °/h to 0.006 °/h. Compared to existing methods that use a single machine learning model, the proposed method raises the bias-stability improvement of the compensated FOG from 57.11% to 71.98% and the suppression of the rate ramp noise coefficient from 2.29% to 14.83%. This work improves the accuracy of the FOG after compensation, providing theoretical guidance and technical references for sensor error compensation in other fields.
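Allan variance, one of the metrics named in the abstract above, can be illustrated with a minimal sketch. The non-overlapping estimator below is a standard textbook form, not necessarily the variant the authors used; the function name and the toy signals are hypothetical.

```python
def allan_variance(rates, m):
    """Non-overlapping Allan variance for cluster size m (tau = m * sample period)."""
    # Average the rate signal over consecutive, non-overlapping clusters of m samples.
    clusters = [sum(rates[i:i + m]) / m for i in range(0, len(rates) - m + 1, m)]
    # Half the mean squared difference between adjacent cluster averages.
    diffs = [b - a for a, b in zip(clusters, clusters[1:])]
    return sum(d * d for d in diffs) / (2 * len(diffs))


# Toy signals (hypothetical): a constant output and a pure rate ramp of slope 1.
constant = [1.0] * 8
ramp = [float(i) for i in range(8)]
print(allan_variance(constant, 2))
print(allan_variance(ramp, 1), allan_variance(ramp, 2))
```

For the ramp, the value grows with the square of the cluster size, which is the signature that lets a rate ramp coefficient be read off an Allan variance plot.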
The classification of respiratory sounds is crucial in diagnosing and monitoring respiratory diseases. However, auscultation is highly subjective, making it challenging to analyze respiratory sounds accurately. Although deep learning has been increasingly applied to this task, most existing approaches rely on supervised learning. Since supervised learning requires large amounts of labeled data, recent studies have explored self-supervised and semi-supervised methods to overcome this limitation. However, these approaches have largely assumed a closed-set setting, where the classes present in the unlabeled data are considered identical to those in the labeled data. In contrast, this study explores an open-set semi-supervised learning setting, where the unlabeled data may contain additional, unknown classes. To address this challenge, a distance-based prototype network is employed to classify respiratory sounds in an open-set setting. In the first stage, the prototype network is trained using labeled and unlabeled data to derive prototype representations of known classes. In the second stage, distances between unlabeled data and known class prototypes are computed, and samples exceeding an adaptive threshold are identified as unknown. A new prototype is then calculated for this unknown class. In the final stage, semi-supervised learning is employed to classify labeled and unlabeled data into known and unknown classes. Compared to conventional closed-set semi-supervised learning approaches, the proposed method achieved an average classification accuracy improvement of 2%–5%. Additionally, in cases of data scarcity, utilizing unlabeled data further improved classification performance by 6%–8%. The findings of this study are expected to significantly enhance respiratory sound classification performance in practical clinical settings.
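The second stage described above (distance to known-class prototypes, then a threshold) can be sketched in a few lines. This is only an illustration under stated assumptions: prototypes are taken as class means, the distance is Euclidean, and the threshold is a fixed hypothetical value rather than the adaptive one used in the study; class names and embeddings are invented.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_prototypes(features, labels):
    """Map each known class to the mean of its labeled feature vectors."""
    by_class = {}
    for vec, lab in zip(features, labels):
        by_class.setdefault(lab, []).append(vec)
    return {lab: mean_vector(vecs) for lab, vecs in by_class.items()}


def assign_open_set(sample, prototypes, threshold):
    """Return the nearest known class, or "unknown" if it is farther than threshold."""
    label, dist = min(
        ((lab, math.dist(sample, proto)) for lab, proto in prototypes.items()),
        key=lambda pair: pair[1],
    )
    return label if dist <= threshold else "unknown"


# Toy 2-D embeddings; class names and the threshold value are hypothetical.
features = [[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.0, 4.2]]
labels = ["normal", "normal", "wheeze", "wheeze"]
prototypes = build_prototypes(features, labels)
print(assign_open_set([0.1, 0.1], prototypes, threshold=1.0))
print(assign_open_set([10.0, -3.0], prototypes, threshold=1.0))
```

Samples flagged as unknown would then be pooled to form the additional prototype mentioned in the abstract.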
In this study, we examine the problem of sliced inverse regression (SIR), a widely used method for sufficient dimension reduction (SDR). It was designed to find reduced-dimensional versions of multivariate predictors by replacing them with a minimally adequate collection of their linear combinations without loss of information. Recently, regularization methods have been proposed in SIR to incorporate a sparse structure of predictors for better interpretability. However, existing methods rely on convex relaxation to bypass the sparsity constraint, which may not lead to the best subset and, in particular, tends to include irrelevant variables when predictors are correlated. In this study, we approach sparse SIR as a nonconvex optimization problem and directly tackle the sparsity constraint by establishing the optimality conditions and solving them iteratively by means of the splicing technique. Without employing convex relaxation on the sparsity constraint and the orthogonal constraint, our algorithm exhibits superior empirical merits, as evidenced by extensive numerical studies. Computationally, our algorithm is much faster than the relaxed approach for the natural sparse SIR estimator. Statistically, our algorithm surpasses existing methods in terms of accuracy for central subspace estimation and best subset selection, and sustains high performance even with correlated predictors.
Funding: Financial support was provided by the Natural Science Foundation of Hebei Province, China (No. E2024105036); the Tangshan Talent Funding Project, China (Nos. B202302007 and A2021110015); the National Natural Science Foundation of China (No. 52264042); and the Australian Research Council (No. IH230100010).
Abstract: Automated classification of gas flow states in blast furnaces using top-camera imagery typically demands a large volume of labeled data, whose manual annotation is both labor-intensive and cost-prohibitive. To mitigate this challenge, we present an enhanced semi-supervised learning approach based on the Mean Teacher framework, incorporating a novel feature loss module to maximize classification performance with limited labeled samples. Model studies show that the proposed model surpasses both the baseline Mean Teacher model and the fully supervised method in accuracy. Specifically, for datasets with 20%, 30%, and 40% label ratios, using a single training iteration, the model yields accuracies of 78.61%, 82.21%, and 85.2%, respectively, while multiple-cycle training iterations achieve 82.09%, 81.97%, and 81.59%, respectively. Furthermore, scenario-specific training schemes are introduced to support diverse deployment needs. These findings highlight the potential of the proposed technique in minimizing labeling requirements and advancing intelligent blast furnace diagnostics.
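For readers unfamiliar with the Mean Teacher framework that this abstract builds on, the sketch below shows its two standard ingredients: an exponential moving average (EMA) update of the teacher weights and a consistency loss between student and teacher predictions. It does not reproduce the paper's feature loss module, whose form the abstract does not give; the weight names, the value of `alpha`, and the toy numbers are hypothetical.

```python
def ema_update(teacher, student, alpha=0.99):
    """Move each teacher weight a small step toward the matching student weight."""
    return {name: alpha * teacher[name] + (1.0 - alpha) * student[name]
            for name in teacher}


def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between two class-probability vectors."""
    squared = [(s - t) ** 2 for s, t in zip(student_probs, teacher_probs)]
    return sum(squared) / len(squared)


# Toy scalar "weights" standing in for network parameters (hypothetical values).
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 0.0, "b": 1.0}
teacher = ema_update(teacher, student, alpha=0.9)
print(teacher)
print(consistency_loss([0.7, 0.3], [0.5, 0.5]))
```

In training, the consistency term is evaluated on unlabeled images and added to the supervised loss on the labeled ones, so the unlabeled data still shape the student.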
Funding: Funded by the National Natural Science Foundation of China (52061020).
Abstract: Quantitative analysis of aluminum-silicon (Al-Si) alloy microstructure is crucial for evaluating and controlling alloy performance. Conventional analysis methods rely on manual segmentation, which is inefficient and subjective, while fully supervised deep learning approaches require extensive and expensive pixel-level annotated data. Furthermore, existing semi-supervised methods still face challenges in handling the adhesion of adjacent primary silicon particles and effectively utilizing consistency in unlabeled data. To address these issues, this paper proposes a novel semi-supervised framework for Al-Si alloy microstructure image segmentation. First, we introduce a Rotational Uncertainty Correction Strategy (RUCS). This strategy employs multi-angle rotational perturbations and Monte Carlo sampling to assess prediction consistency, generating a pixel-wise confidence weight map. By integrating this map into the loss function, the model dynamically focuses on high-confidence regions, thereby improving generalization ability while reducing manual annotation pressure. Second, we design a Boundary Enhancement Module (BEM) to strengthen boundary feature extraction through erosion difference and multi-scale dilated convolutions. This module guides the model to focus on the boundary regions of adjacent particles, effectively resolving particle adhesion and improving segmentation accuracy. Systematic experiments were conducted on the Aluminum-Silicon Alloy Microstructure Dataset (ASAD). Results indicate that the proposed method performs exceptionally well with scarce labeled data. Specifically, using only 5% labeled data, our method improves the Jaccard index and Adjusted Rand Index (ARI) by 2.84 and 1.57 percentage points, respectively, and reduces the Variation of Information (VI) by 8.65 compared to state-of-the-art semi-supervised models, approaching the performance levels of 10% labeled data. These results demonstrate that the proposed method significantly enhances the accuracy and robustness of quantitative microstructure analysis while reducing annotation costs.
Funding: Supported by the National Natural Science Foundation of China (project name: Research on Robust Adaptive Allocation Mechanism of Human Machine Co-Driving System Based on NMS Features; project approval number: 52172381).
Abstract: To address the issue of scarce labeled samples and operational condition variations that degrade the accuracy of fault diagnosis models in variable-condition gearbox fault diagnosis, this paper proposes a semi-supervised masked contrastive learning and domain adaptation (SSMCL-DA) method for gearbox fault diagnosis under variable conditions. Initially, during the unsupervised pre-training phase, a dual signal augmentation strategy is devised, which simultaneously applies random masking in the time domain and random scaling in the frequency domain to unlabeled samples, thereby constructing more challenging positive sample pairs to guide the encoder in learning intrinsic features robust to condition variations. Subsequently, a ConvNeXt-Transformer hybrid architecture is employed, integrating the superior local detail modeling capacity of ConvNeXt with the robust global perception capability of the Transformer to enhance feature extraction in complex scenarios. Thereafter, a contrastive learning model is constructed with the optimization objective of maximizing feature similarity across different masked instances of the same sample, enabling the extraction of consistent features from multiple masked perspectives and reducing reliance on labeled data. In the final supervised fine-tuning phase, a multi-scale attention mechanism is incorporated for feature rectification, and a domain adaptation module combining Local Maximum Mean Discrepancy (LMMD) with adversarial learning is proposed. This module embodies a dual mechanism: LMMD facilitates fine-grained class-conditional alignment, compelling features of identical fault classes to converge across varying conditions, while the domain discriminator utilizes adversarial training to guide the feature extractor toward learning domain-invariant features. Working in concert, they markedly diminish feature distribution discrepancies induced by changes in load, rotational speed, and other factors, thereby boosting the model's adaptability to cross-condition scenarios. Experimental evaluations on the WT planetary gearbox dataset and the Case Western Reserve University (CWRU) bearing dataset demonstrate that the SSMCL-DA model effectively identifies multiple fault classes in gearboxes, with diagnostic performance substantially surpassing that of conventional methods. Under cross-condition scenarios, the model attains fault diagnosis accuracies of 99.21% for the WT planetary gearbox and 99.86% for the bearings. Furthermore, the model exhibits stable generalization capability in cross-device settings.
Funding: Supported by the project Development of asparagus price database based on agricultural big data (381724).
Abstract: Asparagus stem blight is a devastating crop disease, and the early detection of its pathogenic spores is essential for effective disease control and prevention. However, spore detection is still hindered by complex backgrounds, small target sizes, and high annotation costs, which limit its practical application and widespread adoption. To address these issues, a semi-supervised spore detection framework is proposed for use under complex background conditions. Firstly, a difficulty perception scoring function is designed to quantify the detection difficulty of each image region. For regions with higher difficulty scores, a masking strategy is applied, while adversarial augmentation is applied to the remaining regions to encourage the model to learn from challenging areas more effectively. Secondly, a Gaussian Mixture Model is employed to dynamically adjust the allocation threshold for pseudo-labels, thereby reducing the influence of unreliable supervision signals and enhancing the stability of semi-supervised learning. Finally, the Wasserstein distance is introduced for object localization refinement, offering a more robust positioning approach. Experimental results demonstrate that the proposed framework achieves 88.9% mAP50 and 60.7% mAP50-95, surpassing the baseline method by 4.2% and 4.6%, respectively, using only 10% of labeled data. In comparison with other state-of-the-art semi-supervised detection models, the proposed method exhibits superior detection accuracy and robustness. In conclusion, the framework not only offers an efficient and reliable solution for plant pathogen spore detection but also provides strong algorithmic support for real-time spore detection and early disease warning systems, with significant engineering application potential.
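The abstract does not say how the Wasserstein distance enters the localization step. One common formulation for small objects, assumed here purely for illustration, models each box (cx, cy, w, h) as a 2-D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4); the 2-Wasserstein distance between two such Gaussians then has the closed form below. The `scale` constant and the example boxes are hypothetical.

```python
import math


def gaussian_wasserstein(box_a, box_b):
    """2-Wasserstein distance between two boxes (cx, cy, w, h) modeled as Gaussians."""
    (cxa, cya, wa, ha), (cxb, cyb, wb, hb) = box_a, box_b
    return math.sqrt((cxa - cxb) ** 2 + (cya - cyb) ** 2
                     + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2)


def normalized_similarity(box_a, box_b, scale=10.0):
    """Map the distance into (0, 1]; scale is a hypothetical dataset-dependent constant."""
    return math.exp(-gaussian_wasserstein(box_a, box_b) / scale)


# Two small, non-overlapping boxes: IoU would be 0, but the distance stays informative.
predicted = (0.0, 0.0, 2.0, 2.0)
target = (3.0, 4.0, 2.0, 2.0)
print(gaussian_wasserstein(predicted, target))
print(normalized_similarity(predicted, predicted))
```

Unlike IoU, this measure changes smoothly when tiny boxes are a few pixels apart, which is one plausible reason for preferring it on spore-sized targets.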
Funding: Funded by the research project THTETN.05/24-25, Vietnam Academy of Science and Technology.
Abstract: Satellite image segmentation plays a crucial role in remote sensing, supporting applications such as environmental monitoring, land use analysis, and disaster management. However, traditional segmentation methods often rely on large amounts of labeled data, which are costly and time-consuming to obtain, especially in large-scale or dynamic environments. To address this challenge, we propose the Semi-Supervised Multi-View Picture Fuzzy Clustering (SS-MPFC) algorithm, which improves segmentation accuracy and robustness, particularly in complex and uncertain remote sensing scenarios. SS-MPFC unifies three paradigms: semi-supervised learning, multi-view clustering, and picture fuzzy set theory. This integration allows the model to effectively utilize a small number of labeled samples, fuse complementary information from multiple data views, and handle the ambiguity and uncertainty inherent in satellite imagery. We design a novel objective function that jointly incorporates picture fuzzy membership functions across multiple views of the data and embeds pairwise semi-supervised constraints (must-link and cannot-link) directly into the clustering process to enhance segmentation accuracy. Experiments conducted on several benchmark satellite datasets demonstrate that SS-MPFC significantly outperforms existing state-of-the-art methods in segmentation accuracy, noise robustness, and semantic interpretability. On the Augsburg dataset, SS-MPFC achieves a Purity of 0.8158 and an Accuracy of 0.6860, highlighting its outstanding robustness and efficiency. These results demonstrate that SS-MPFC offers a scalable and effective solution for real-world satellite-based monitoring systems, particularly in scenarios where rapid annotation is infeasible, such as wildfire tracking, agricultural monitoring, and dynamic urban mapping.
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (RS-2020-NR049579).
Abstract: High-dimensional data causes difficulties in machine learning due to high time consumption and large memory requirements. In particular, in a multi-label environment, complexity grows with the number of labels. Moreover, an optimization problem that fully considers all dependencies between features and labels is difficult to solve. In this study, we propose a novel regression-based multi-label feature selection method that integrates mutual information to better exploit the underlying data structure. By incorporating mutual information into the regression formulation, the model captures not only linear relationships but also complex non-linear dependencies. The proposed objective function simultaneously considers three types of relationships: (1) feature redundancy, (2) feature-label relevance, and (3) inter-label dependency. These three quantities are computed using mutual information, allowing the proposed formulation to capture nonlinear dependencies among variables. These three types of relationships are key factors in multi-label feature selection, and our method expresses them within a unified formulation, enabling efficient optimization while simultaneously accounting for all of them. To efficiently solve the proposed optimization problem under non-negativity constraints, we develop a gradient-based optimization algorithm with fast convergence. The experimental results on seven multi-label datasets show that the proposed method outperforms existing multi-label feature selection techniques.
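Since all three relationships in this abstract are measured with mutual information, a small worked example may help. The sketch below computes the plug-in (empirical) mutual information between two discrete sequences; it is a generic estimator, not the authors' implementation, and the toy feature and label vectors are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    # Sum p(x, y) * log(p(x, y) / (p(x) * p(y))) over the observed pairs.
    return sum(
        (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


# Hypothetical binary data: one label and two candidate features.
label = [0, 0, 1, 1]
informative_feature = [0, 0, 1, 1]   # copies the label
irrelevant_feature = [0, 1, 0, 1]    # independent of the label
print(mutual_information(informative_feature, label))
print(mutual_information(irrelevant_feature, label))
```

The same function applied to two features gives a redundancy score, and applied to two labels gives an inter-label dependency score.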
Funding: Supported by the National Key Research and Development Program of China (No. 2023YFB3712401); the National Natural Science Foundation of China (No. 52274301); the Aeronautical Science Foundation of China (No. 2023Z0530S6005); the Academician Workstation of Kunming University of Science and Technology (2024); the Ningbo Yongjiang Talent-Introduction Program (No. 2022A-023C); and Zhejiang Phenomenological Materials Technology Co., Ltd., China.
Abstract: The thermal and electrical conductivities of magnesium alloys are highly sensitive to composition and microstructure, with thermal conductivity varying by up to 20-fold across different as-cast alloy systems, making rapid and accurate prediction crucial for high-throughput screening and development of high-performance alloys. This study introduces a physics-informed symbolic regression approach that addresses the limitations of traditional methods, including the high computational cost of first-principles calculations and the poor interpretability of machine learning models. Comprehensive datasets comprising 1512 data points from 60 literature sources were analyzed, including thermal conductivity measurements from 52 alloy systems and electrical conductivity measurements from 36 systems. The derived symbolic regression model achieved Mean Absolute Percentage Errors (MAPEs) of 11.2% and 11.4% for thermal conductivity in low- and high-component systems, respectively. When integrated with the Smith-Palmer equation, electrical conductivity predictions reached MAPEs of 15.6% and 16.4%. Independent validation on an entirely separate dataset of 554 data points from 53 additional literature sources, including 37 previously unseen alloy systems, confirmed model generalizability with MAPEs of 10.7%-15.2%. Shapley Additive Explanations (SHAP) analysis was employed to evaluate the relative importance of different features affecting conductivity, while equation decomposition quantified the contribution of individual functional terms. This methodology bridges data-driven prediction with mechanistic understanding, establishing a foundation for knowledge-based design of magnesium alloys with tailored transport properties.
Abstract: Traditional oilfields face increasing extraction challenges, primarily due to reservoir quality degradation and production decline, which are further exacerbated by volatile international crude oil prices, as illustrated by Brent Crude's trajectory from pandemic-induced negative pricing to geopolitically driven surges exceeding USD 100 per barrel. This study addresses these complexities through an integrated methodological framework applied to medium-permeability sandstone reservoirs in the Xinjiang oilfield by combining advanced numerical simulations with multivariate regression analysis. The methodology employs Latin Hypercube Sampling (LHS) to stratify geological parameter distributions and constructs heterogeneous reservoir models using Petrel software, rigorously validated through historical production data matching. Production forecasting integrates numerical simulation and Decline Curve Analysis (DCA), while investment estimation utilizes Ordinary Least Squares (OLS) regression to correlate engineering parameters with drilling and completion costs. Economic evaluation incorporates Discounted Cash Flow (DCF) modeling and breakeven analysis, establishing techno-economic boundaries via oil price sensitivity analysis ranging from USD 40 to 90 per barrel. Visualization tools, including 3D heatmaps, delineate nonlinear interactions among engineering, geological, and investment datasets under economic constraints. Key findings demonstrate that for the target reservoirs, as oil prices increase from USD 40 to USD 90 per barrel, the minimum economic thickness threshold decreases from approximately 5.7 m to about 2.5 m, with model prediction errors consistently below 25% across validation datasets. This framework provides scientifically grounded decision support for optimizing capital allocation and offers actionable insights to enhance undeveloped hydrocarbon development planning amid market uncertainty. Ultimately, it supports national energy security through technically robust and economically viable resource exploitation strategies.
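The DCF and breakeven step in this abstract reduces to finding the oil price at which net present value (NPV) is zero. The sketch below is a minimal illustration under simplifying assumptions: a single up-front investment, yearly production, a constant operating cost per barrel, and no taxes. All figures and parameter names are hypothetical and are not taken from the study.

```python
def npv(cash_flows, rate):
    """Net present value; cash_flows[0] occurs today, later entries one year apart."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def project_npv(price, investment, yearly_barrels, opex_per_barrel, rate):
    """NPV of a well: up-front investment, then yearly margin on produced barrels."""
    flows = [-investment] + [(price - opex_per_barrel) * q for q in yearly_barrels]
    return npv(flows, rate)


def breakeven_price(investment, yearly_barrels, opex_per_barrel, rate,
                    lo=0.0, hi=500.0, tol=1e-6):
    """Bisection on price; assumes NPV < 0 at lo and NPV > 0 at hi."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if project_npv(mid, investment, yearly_barrels, opex_per_barrel, rate) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical well: USD 1 million investment, three years of declining output.
barrels = [20000, 15000, 10000]
price = breakeven_price(1_000_000, barrels, opex_per_barrel=20.0, rate=0.10)
print(round(price, 2))
```

Repeating the calculation for reservoir models of different thickness would trace out the kind of price-versus-minimum-thickness boundary the abstract reports.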
Funding: Under the auspices of the National Natural Science Foundation of China (Nos. 42271224, 41901193); the Ministry of Education Humanities and Social Sciences Research Planning Fund Project of China (No. 24YJAZH190); the Anhui Province Excellent Youth Research Project in Universities (No. 2022AH030019); and the Anhui Social Sciences Innovation Development Research Project (No. 2024CXQ503).
Abstract: The accessibility of urban public transit directly influences residents' quality of life, travel behavior, and social equity. Its correlation with housing prices has garnered significant attention across disciplines such as geography, economics, and urban planning. Although much existing research focuses on the impact of individual transportation facilities on housing prices, there is a notable gap in comprehensive analyses that assess the influence of overall urban transit accessibility on housing market dynamics. This study selected the main urban area of Hefei, China, as a case to investigate the spatial distribution of housing prices and evaluate public transit accessibility in 2022. Employing techniques such as the optimized parameter geographical detector and local spatial regression models, the study aimed to elucidate the effects and underlying mechanisms of urban transit accessibility on housing prices. The findings revealed that: 1) housing prices in Hefei exhibited a clustered spatial pattern, with high prices concentrated in the city center and lower prices in peripheral areas, forming three distinct high-price hotspots with a 'belt-like' distribution; 2) public transit accessibility showed a 'core-periphery' structure, with accessibility declining in a 'circumferential' pattern around the city center. Based on the 'housing price-accessibility' dimension, four categories were identified: high price-high accessibility (37.25%), high price-low accessibility (19.07%), low price-high accessibility (21.95%), and low price-low accessibility (21.73%); 3) the impact of transit accessibility on housing prices was spatially heterogeneous, with bus travel showing the strongest explanatory power (0.692), followed by automobile, subway, and bicycle travel. The interaction of these transportation modes generated a synergistic effect on housing price differentiation, with most influencing factors contributing more than 25%. These findings offer valuable insights for optimizing the spatial distribution of public transit infrastructure and improving both urban housing quality and residents' living standards.
Funding: Supported by High-level Professional Groups in Guangdong Province, No. GSPZYQ2020101; and the Guangdong Province Educational Research Planning Project, No. 2024GXJK742.
Abstract: BACKGROUND: Paternal perinatal depression (PPD) is closely associated with maternal mental health challenges, marital strain, and adverse child developmental outcomes. Despite its significant impact, PPD remains under-recognized in family-centered clinical practice. Concurrently, against the backdrop of rising rates of delayed marriage and China's Maternity Incentive Policy, the proportion of women giving birth at an advanced maternal age is increasing. Nevertheless, research specifically examining PPD among spouses of older mothers remains critically scarce, both in China and globally. AIM: To investigate PPD and its influencing factors in Chinese advanced maternal age families. METHODS: This cross-sectional study included 358 participants; it was conducted among fathers whose pregnant partners were of advanced maternal age at five hospitals in the Pearl River Delta region of China from September 2023 to June 2024. Data were collected via a general information questionnaire, the Social Support Rating Scale, and the Edinburgh Postnatal Depression Scale. Latent profile analysis and regression mixture models (RMMs) were adopted to analyze the latent PPD types and the factors that influenced PPD. RESULTS: The incidence of PPD was 16.48%, and three profiles were identified: low-symptomatic (175 cases, 48.89%), monophasic (140 cases, 39.10%), and high-symptomatic (43 cases, 12.01%). The RMM analysis revealed that first pregnancy, low income (<¥3000/month), part-time work, and a history of abnormal pregnancy were positively associated with the high-symptomatic type (P<0.05). Conversely, high subjective support and support utilization were negatively associated with the high-symptomatic type compared with the low-symptomatic type (P<0.05). Good couple relationships, high objective and subjective support, and high support utilization were negatively associated with monophasic disorder (P<0.05). CONCLUSION: PPD incidence is high among Chinese fathers whose partners are of advanced maternal age, and the characteristics of depression are varied. Healthcare practitioners should prioritize individuals with low levels of social support.
Funding: Under the auspices of the National Natural Science Foundation of China (No. 40671133) and the Fundamental Research Funds for the Central Universities (No. GK200902015).
Abstract: This paper proposes a semi-supervised regression model with a co-training algorithm based on support vector machines, which is used for retrieving water quality variables from SPOT 5 remote sensing data. The model consists of two support vector regressors (SVRs). The nonlinear relationship between water quality variables and the SPOT 5 spectrum is described by the two SVRs, and a semi-supervised co-training algorithm for the SVRs is established. The model was used for retrieving concentrations of four representative pollution indicators of the Weihe River in Shaanxi Province, China: permanganate index (CODmn), ammonia nitrogen (NH3-N), chemical oxygen demand (COD), and dissolved oxygen (DO). The spatial distribution map for those variables over a part of the Weihe River was also produced. SVR can readily implement any nonlinear mapping, and semi-supervised learning can make use of both labeled and unlabeled samples. By integrating the two SVRs and using semi-supervised learning, we provide an operational method for cases where paired samples are limited. The results show that it performs much better than the multiple statistical regression method, can quickly provide a complete picture of water pollution conditions for management, and can be extended to hyperspectral remote sensing applications.
Funding: Supported by the National Natural Science Foundation of China (No. 51674032).
Abstract: The accuracy of the laser-induced breakdown spectroscopy (LIBS) quantitative method depends greatly on the number of certified standard samples used for training. However, in practical applications, only limited standard samples with labeled certified concentrations are available. A novel semi-supervised LIBS quantitative analysis method is proposed, based on a co-training regression model with selection of effective unlabeled samples. The main idea of the proposed method is to obtain better regression performance by adding effective unlabeled samples in semi-supervised learning. First, effective unlabeled samples are selected according to the testing samples by the Euclidean metric. Two original regression models based on least squares support vector machines with different parameters are trained separately on the labeled samples, and then the effective unlabeled samples predicted by the two models are used to enlarge the training dataset based on labeling confidence estimation. The final predictions of the proposed method on the testing samples are determined by weighted combinations of the predictions of the two updated regression models. Chromium concentration analysis experiments on 23 certified standard high-alloy steel samples were carried out, in which 5 samples with labeled concentrations and 11 unlabeled samples were used to train the regression models and the remaining 7 samples were used for testing. As the number of effective unlabeled samples increased, the root mean square error of the proposed method went down from 1.80% to 0.84% and the relative prediction error was reduced from 9.15% to 4.04%.
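The first step described above, choosing effective unlabeled samples by their Euclidean distance to the testing samples, might look like the following sketch. The exact selection rule is not given in the abstract, so keeping the `n_keep` unlabeled vectors nearest to any test vector is an assumption, and the two-element "spectra" are hypothetical stand-ins for real LIBS spectra.

```python
import math


def select_effective_unlabeled(unlabeled, test_samples, n_keep):
    """Keep the n_keep unlabeled vectors closest (Euclidean) to any test vector."""
    def distance_to_test(vec):
        return min(math.dist(vec, t) for t in test_samples)
    return sorted(unlabeled, key=distance_to_test)[:n_keep]


# Hypothetical two-channel "spectra".
unlabeled = [[0.0, 0.0], [5.0, 5.0], [2.0, 2.0], [9.0, 9.0]]
test_samples = [[0.5, 0.0], [5.0, 4.0]]
print(select_effective_unlabeled(unlabeled, test_samples, n_keep=2))
```

The selected samples would then be pseudo-labeled by the two regressors and added to the training set only when the labeling confidence is high enough.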
Funding: Supported by the National Natural Science Foundation of China (No. 52207229); the Key Research and Development Program of Ningxia Hui Autonomous Region of China (No. 2024BEE02003); the AEGiS Research Grant 2024, University of Wollongong (No. R6254); and the China Scholarship Council (No. 202207550010).
Abstract: Accurate prediction of the remaining useful life (RUL) is crucial for the design and management of lithium-ion batteries. Although various machine learning models offer promising predictions, one critical but often overlooked challenge is their demand for considerable run-to-failure data for training. Collecting such training data requires prohibitive testing effort, as run-to-failure tests can last for years. Here, we propose a semi-supervised representation learning method to enhance prediction accuracy by learning from data without RUL labels. Our approach builds on a deep neural network that comprises an encoder and three decoder heads to extract time-dependent representation features from short-term battery operating data regardless of the existence of RUL labels. The approach is validated using three datasets collected from 34 batteries operating under various conditions, encompassing over 19,900 charge and discharge cycles. Our method achieves a root mean squared error (RMSE) within 25 cycles, even when only 1/50 of the training dataset is labelled, representing a reduction of 48% compared to the conventional approach. We also demonstrate the method's robustness with varying numbers of labelled data and different weights assigned to the three decoder heads. The projection of extracted features into a low-dimensional space reveals that our method effectively learns degradation features from unlabelled data. Our approach highlights the promise of utilising semi-supervised learning to reduce the data demand for reliability monitoring of energy devices.
Funding: Supported by the Natural Science Foundation of China (No. 41804112, author: Chengyun Song).
Abstract: Existing semi-supervised medical image segmentation algorithms use copy-paste data augmentation to correct the labeled-unlabeled data distribution mismatch. However, current copy-paste methods have three limitations: (1) training the model solely with copy-paste mixed pictures from labeled and unlabeled input loses a lot of labeled information; (2) low-quality pseudo-labels can cause confirmation bias in pseudo-supervised learning on unlabeled data; (3) the segmentation performance in low-contrast and local regions is less than optimal. We design a Stochastic Augmentation-Based Dual-Teaching Auxiliary Training Strategy (SADT), which enhances feature diversity and learns high-quality features to overcome these problems. More precisely, SADT trains the Student Network by using pseudo-label-based training from Teacher Network 1 and supervised learning with labeled data, which prevents the loss of rare labeled data. We introduce a bi-directional copy-paste mask with progressive high-entropy filtering to reduce data distribution disparities and mitigate confirmation bias in pseudo-supervision. For the mixed images, Deep-Shallow Spatial Contrastive Learning (DSSCL) is proposed in the feature spaces of Teacher Network 2 and the Student Network to improve the segmentation capabilities in low-contrast and local areas. In this procedure, the features retrieved by the Student Network are subjected to a random feature perturbation technique. On two openly available datasets, extensive trials show that our proposed SADT performs much better than the state-of-the-art semi-supervised medical segmentation techniques. Using only 10% of the labeled data for training, SADT achieved a Dice score of 90.10% on the ACDC (Automatic Cardiac Diagnosis Challenge) dataset.
Funding: Supported by the Youth Scientific Research Project of Fujian Provincial Center for Disease Control and Prevention (2022QN02) and the Fujian Provincial Health Youth Scientific Research Project (2023QNA040).
Abstract: Background: The impact of COVID-19 on influenza activity is of interest for informing future influenza prevention and control strategies. Our study aims to examine the effects of COVID-19 on influenza in Fujian Province, China, using a regression discontinuity (RD) design. Methods: We used the influenza-like illness (ILI) percentage as an indicator of influenza activity, with data from all sentinel hospitals between Week 4, 2020, and Week 51, 2023. The data were divided into two groups: the COVID-19 epidemic period and the post-epidemic period. Statistical analysis was performed in R using robust RD design methods to account for potential confounders, including seasonality, temperature, and influenza vaccination rates. Results: There was a discernible increase in the ILI percentage during the post-epidemic period. The robustness of the findings was confirmed with various RD bandwidth selection methods and placebo tests, with the certwo bandwidth providing the largest estimated effect size: a 14.6-percentage-point increase in the ILI percentage (β=0.146; 95% CI: 0.096–0.196). Sensitivity analyses and adjustments for confounders consistently pointed to an increased ILI percentage during the post-epidemic period compared to the epidemic period. Conclusion: The 14.6-percentage-point increase in the ILI percentage in Fujian Province, China, after the end of the COVID-19 pandemic suggests that public health measures to control influenza transmission may need to be re-evaluated and possibly enhanced. Further research is needed to fully understand the factors contributing to this rise and to assess the ongoing impacts of post-pandemic behavioral changes.
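To make the regression discontinuity idea concrete, here is a minimal sketch in plain Python. It is not the study's analysis: it fits an ordinary least-squares line on each side of a cutoff within a fixed bandwidth (uniform kernel, no bias correction, no covariates) and reports the gap between the two fitted values at the cutoff. The function names `fit_line` and `rd_jump`, the synthetic weekly series, and the fixed bandwidth are all assumptions made for illustration; the study itself reports using robust RD methods in R with data-driven bandwidth selection.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one regressor)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope


def rd_jump(running, outcome, cutoff, bandwidth):
    """Sharp RD estimate: difference of the two local linear fits at the cutoff.

    The running variable is centred at the cutoff, so each intercept is the
    fitted outcome exactly at the cutoff on that side.
    """
    left = [(r - cutoff, y) for r, y in zip(running, outcome)
            if cutoff - bandwidth <= r < cutoff]
    right = [(r - cutoff, y) for r, y in zip(running, outcome)
             if cutoff <= r <= cutoff + bandwidth]
    left_at_cutoff, _ = fit_line(*zip(*left))
    right_at_cutoff, _ = fit_line(*zip(*right))
    return right_at_cutoff - left_at_cutoff


# Synthetic example (hypothetical data): a linear trend plus a 0.146 jump at week 10.
weeks = list(range(20))
ili_share = [0.02 + 0.001 * w + (0.146 if w >= 10 else 0.0) for w in weeks]
estimated_jump = rd_jump(weeks, ili_share, cutoff=10, bandwidth=10)
```

With noise-free synthetic data the estimate recovers the built-in jump. On real surveillance data the bandwidth choice and seasonality adjustment matter a great deal, which is why the abstract reports several bandwidth selectors and placebo tests.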
Funding: Supported by DST-FIST (Government of India) (Grant No. SR/FIST/MS-1/2017/13) and the Seed Money Project (Grant No. DoRDC/733).
Abstract: This study numerically examines the heat and mass transfer characteristics of two ternary nanofluids in converging and diverging channels. It assesses the two ternary nanofluid combinations to determine which configuration provides better heat and mass transfer and lower entropy production while remaining cost-efficient. This work bridges the gap between academic research and industrial feasibility by incorporating cost analysis, entropy generation, and thermal efficiency. To compare the velocity, temperature, and concentration profiles, we examine two ternary nanofluids, i.e., TiO₂+SiO₂+Al₂O₃/H₂O and TiO₂+SiO₂+Cu/H₂O, while considering the shape of the nanoparticles. Velocity slip and Soret/Dufour effects are taken into consideration. Furthermore, a regression analysis for the Nusselt and Sherwood numbers of the model is carried out. The fourth-order Runge-Kutta method with a shooting technique is employed to obtain the numerical solution of the governing system of ordinary differential equations. The flow characteristics of the ternary nanofluids are examined and simulated under variation of the flow-dominating parameters, and the influence of these parameters on the flow, temperature, and concentration fields is demonstrated. For variation in the Eckert and Dufour numbers, TiO₂+SiO₂+Al₂O₃/H₂O has a higher temperature than TiO₂+SiO₂+Cu/H₂O. The results indicate that the ternary nanofluid TiO₂+SiO₂+Al₂O₃/H₂O has a higher heat transfer rate, lower entropy generation, a greater mass transfer rate, and lower cost than the TiO₂+SiO₂+Cu/H₂O ternary nanofluid.
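The combination of a fourth-order Runge-Kutta integrator with a shooting technique can be sketched on a toy boundary value problem. The example below does not use the nanofluid equations, which the abstract does not give; it solves u'' = -u with u(0) = 0 and u(π/2) = 1, whose exact initial slope is u'(0) = 1. The names `rk4_integrate`, `toy_ode`, `shoot`, and `solve_by_shooting`, the secant update, and the step counts are illustrative assumptions.

```python
import math


def rk4_integrate(f, y0, t0, t1, steps):
    """Integrate the first-order system y' = f(t, y) from t0 to t1 with classical RK4."""
    h = (t1 - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y


def toy_ode(t, y):
    # State is [u, u']; the toy equation u'' = -u becomes a first-order system.
    return [y[1], -y[0]]


def shoot(slope, t_end=math.pi / 2, target=1.0, steps=200):
    """Residual at the far boundary for a guessed initial slope u'(0)."""
    u_end, _ = rk4_integrate(toy_ode, [0.0, slope], 0.0, t_end, steps)
    return u_end - target


def solve_by_shooting(s0=0.2, s1=2.0, tol=1e-10, max_iter=50):
    """Secant iteration on the unknown initial slope until the boundary residual vanishes."""
    r0, r1 = shoot(s0), shoot(s1)
    for _ in range(max_iter):
        if abs(r1) < tol:
            break
        s0, s1, r0 = s1, s1 - r1 * (s1 - s0) / (r1 - r0), r1
        r1 = shoot(s1)
    return s1


initial_slope = solve_by_shooting()
```

The same pattern extends to coupled momentum, energy, and concentration equations: the unknown wall gradients become the shooting parameters, and a multidimensional root finder replaces the scalar secant step.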
Abstract: Understanding how dataset size affects regression models can help improve the accuracy of solar power forecasts and make the most of renewable energy systems. This research explores the influence of dataset size on the accuracy and reliability of regression models for solar power prediction, contributing to better forecasting methods. The study analyzes data from two solar panels, aSiMicro03036 and aSiTandem72-46, over 7, 14, 17, 21, 28, and 38 days, with each dataset comprising five independent parameters and one dependent parameter, split 80–20 for training and testing. Results indicate that Random Forest consistently outperforms other models, achieving the highest correlation coefficient of 0.9822 and the lowest Mean Absolute Error (MAE) of 2.0544 on the aSiTandem72-46 panel with 21 days of data. For the aSiMicro03036 panel, the best MAE of 4.2978 was reached using the k-Nearest Neighbor (k-NN) algorithm, configured as the instance-based k-nearest neighbors (IBk) learner in Weka and trained on 17 days of data. Regression performance for most models (excluding IBk) stabilizes at 14 days or more. Compared to the 7-day dataset, increasing to 21 days reduced the MAE by around 20% and improved correlation coefficients by around 2.1%, highlighting the value of moderate dataset expansion. These findings suggest that datasets spanning 17 to 21 days, with 80% used for training, can significantly enhance the predictive accuracy of solar power generation models.
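The two evaluation metrics named in the abstract, MAE and the correlation coefficient, along with an 80–20 split, can be written out in a few lines of plain Python. This is a sketch for orientation only. The study used Weka, and the abstract does not say whether the split was random or ordered; the ordered split in `split_80_20` is an assumption, as are all function names and the sample numbers.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def split_80_20(rows):
    """Ordered split: first 80% of rows for training, the rest for testing (assumed scheme)."""
    cut = (len(rows) * 4) // 5
    return rows[:cut], rows[cut:]


# Hypothetical measured vs. predicted power values, for illustration only.
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
mae = mean_absolute_error(measured, predicted)
corr = pearson_correlation(measured, predicted)
train_rows, test_rows = split_80_20(list(range(10)))
```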
Funding: Supported by the National Natural Science Foundation of China (62375013).
Abstract: As the core component of inertial navigation systems, fiber optic gyroscopes (FOGs), with technical advantages such as low power consumption, long lifespan, fast startup, and flexible structural design, are widely used in aerospace, unmanned driving, and other fields. However, because optical devices are temperature-sensitive, the environmental temperature introduces errors in FOGs and greatly limits their output accuracy. This work investigates machine-learning-based temperature error compensation techniques for FOGs. Specifically, it focuses on compensating for the bias errors generated in the fiber ring by the Shupe effect. This work proposes a composite model based on k-means clustering, support vector regression, and particle swarm optimization algorithms, and it significantly reduces redundancy within the samples by adopting interval sequence sampling. Metrics such as root mean square error (RMSE), mean absolute error (MAE), bias stability, and Allan variance are selected to evaluate the model's performance and compensation effectiveness. This work effectively enhances the consistency between data and models across different temperature ranges and temperature gradients, improving the bias stability of the FOG from 0.022 °/h to 0.006 °/h. Compared to existing methods that use a single machine learning model, the proposed method increases the improvement in bias stability of the compensated FOG from 57.11% to 71.98% and enhances the suppression of the rate ramp noise coefficient from 2.29% to 14.83%. This work improves the accuracy of the FOG after compensation, providing theoretical guidance and a technical reference for sensor error compensation in other fields.
Funding: Supported by the Innovative Human Resource Development for Local Intellectualization Program through the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (IITP-2025-RS-2022-00156360).
Abstract: The classification of respiratory sounds is crucial in diagnosing and monitoring respiratory diseases. However, auscultation is highly subjective, making it challenging to analyze respiratory sounds accurately. Although deep learning has been increasingly applied to this task, most existing approaches have relied primarily on supervised learning. Since supervised learning requires large amounts of labeled data, recent studies have explored self-supervised and semi-supervised methods to overcome this limitation. However, these approaches have largely assumed a closed-set setting, where the classes present in the unlabeled data are considered identical to those in the labeled data. In contrast, this study explores an open-set semi-supervised learning setting, where the unlabeled data may contain additional, unknown classes. To address this challenge, a distance-based prototype network is employed to classify respiratory sounds in an open-set setting. In the first stage, the prototype network is trained using labeled and unlabeled data to derive prototype representations of known classes. In the second stage, distances between unlabeled data and known class prototypes are computed, and samples exceeding an adaptive threshold are identified as unknown. A new prototype is then calculated for this unknown class. In the final stage, semi-supervised learning is employed to classify labeled and unlabeled data into known and unknown classes. Compared to conventional closed-set semi-supervised learning approaches, the proposed method achieved an average classification accuracy improvement of 2%–5%. Additionally, in cases of data scarcity, utilizing unlabeled data further improved classification performance by 6%–8%. The findings of this study are expected to significantly enhance respiratory sound classification performance in practical clinical settings.
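The second stage described above (distances to class prototypes, a threshold, and a new prototype for the unknown class) can be illustrated with a small sketch in plain Python. It works on hand-made 2-D vectors rather than learned audio embeddings, and the threshold rule (mean plus `k` standard deviations of labeled-sample distances to their own prototype) is an assumption, since the abstract does not define its adaptive threshold. All names and class labels here are hypothetical.

```python
import math
from statistics import mean, pstdev


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [mean(column) for column in zip(*vectors)]


def build_prototypes(labeled):
    """labeled maps class name -> list of embedding vectors; returns one prototype per class."""
    return {name: centroid(vectors) for name, vectors in labeled.items()}


def distance_threshold(labeled, prototypes, k=2.0):
    """Assumed rule: mean + k * std of distances from labeled samples to their own prototype."""
    distances = [math.dist(x, prototypes[name])
                 for name, vectors in labeled.items() for x in vectors]
    return mean(distances) + k * pstdev(distances)


def assign(x, prototypes, threshold):
    """Nearest-prototype label, or 'unknown' when even the nearest prototype is too far."""
    name, dist = min(((n, math.dist(x, p)) for n, p in prototypes.items()),
                     key=lambda pair: pair[1])
    return name if dist <= threshold else "unknown"


# Hypothetical embeddings for two known classes.
labeled = {
    "normal": [[0, 0], [0, 1], [1, 0], [1, 1]],
    "wheeze": [[10, 10], [10, 11], [11, 10], [11, 11]],
}
prototypes = build_prototypes(labeled)
threshold = distance_threshold(labeled, prototypes)

unlabeled = [[0.4, 0.6], [5, 5], [6, 6], [10.5, 10.4]]
assigned = [assign(x, prototypes, threshold) for x in unlabeled]

# Samples flagged as unknown get their own prototype for the later training stage.
unknown_prototype = centroid([x for x, a in zip(unlabeled, assigned) if a == "unknown"])
```

In the method described by the abstract, the embeddings come from the trained prototype network and the resulting known and unknown prototypes feed a final semi-supervised stage; this sketch covers only the distance-and-threshold step.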
Abstract: In this study, we examine sliced inverse regression (SIR), a widely used method for sufficient dimension reduction (SDR). SIR was designed to find reduced-dimensional versions of multivariate predictors by replacing them with a minimally adequate collection of their linear combinations without loss of information. Recently, regularization methods have been proposed for SIR to incorporate a sparse structure of predictors for better interpretability. However, existing methods rely on convex relaxation to bypass the sparsity constraint, which may not lead to the best subset and in particular tends to include irrelevant variables when predictors are correlated. In this study, we approach sparse SIR as a nonconvex optimization problem and tackle the sparsity constraint directly by establishing the optimality conditions and solving them iteratively by means of the splicing technique. Without employing convex relaxation on the sparsity constraint and the orthogonal constraint, our algorithm exhibits superior empirical merits, as evidenced by extensive numerical studies. Computationally, our algorithm is much faster than the relaxed approach for the natural sparse SIR estimator. Statistically, our algorithm surpasses existing methods in accuracy for central subspace estimation and best subset selection, and it sustains high performance even with correlated predictors.