Biometric recognition refers to the process of recognizing a person's identity using physiological or behavioral modalities, such as face, voice, fingerprint, and gait. Such biometric modalities are mostly used in recognition tasks separately, as in unimodal systems, or jointly with two or more, as in multimodal systems. Multimodal systems can usually enhance recognition performance over unimodal systems by integrating the biometric data of multiple modalities at different fusion levels. Despite this enhancement, in real-life applications some factors degrade multimodal systems' performance, such as occlusion, face poses, and noise in voice data. In this paper, we propose two algorithms that effectively apply dynamic fusion at the feature level based on the data quality of multimodal biometrics. The proposed algorithms attempt to minimize the negative influence of confusing and low-quality features, by either exclusion or weight reduction, to achieve better recognition performance. The proposed dynamic fusion was achieved using face and voice biometrics, where face features were extracted using principal component analysis (PCA) and Gabor filters separately, whilst voice features were extracted using Mel-frequency cepstral coefficients (MFCCs). Here, the quality assessment of face images is mainly based on the existence of occlusion, whereas the assessment of voice data quality is substantially based on the calculation of the signal-to-noise ratio (SNR) in the presence of noise. To evaluate the performance of the proposed algorithms, several experiments were conducted using two combinations of three different databases: the AR database and the extended Yale Face Database B for face images, and the VOiCES database for voice data. The obtained results show that both proposed dynamic fusion algorithms attain improved performance and offer more advantages in identification and verification over not only the standard unimodal algorithms but also multimodal algorithms using standard fusion methods.
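As a minimal sketch of the quality-driven feature-level fusion idea described above (not the authors' exact algorithm; the function names and the simple normalized weighting are illustrative assumptions):

```python
import numpy as np

def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels from signal and noise power."""
    return 10.0 * np.log10(np.sum(np.asarray(signal) ** 2)
                           / np.sum(np.asarray(noise) ** 2))

def quality_weighted_fusion(face_feat, voice_feat, q_face, q_voice):
    """Scale each modality's feature vector by its normalized quality
    score before concatenation; a zero score excludes that modality."""
    total = q_face + q_voice
    if total <= 0:
        raise ValueError("at least one modality needs a positive quality score")
    return np.concatenate([(q_face / total) * np.asarray(face_feat),
                           (q_voice / total) * np.asarray(voice_feat)])
```

In practice the face quality score would come from an occlusion detector and the voice score from an SNR estimate, as the abstract describes.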
A new framework of region-based dynamic image fusion is proposed. First, target detection is applied to dynamic images (image sequences) to segment them into target and background regions. Then, different fusion rules are employed in different regions so that target information is preserved as much as possible. In addition, a steerable non-separable wavelet frame transform is used in the multi-resolution analysis, giving the system favorable orientation and shift-invariance properties. Experimental results showed that, compared with other image fusion methods, the proposed method has better target recognition capability and preserves clearer background information.
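A hypothetical sketch of region-dependent fusion rules: the max-magnitude and averaging rules below are common choices assumed for illustration, not necessarily the paper's exact rules.

```python
import numpy as np

def region_fusion(img_a, img_b, target_mask):
    """Region-based rule: inside detected target regions keep the
    larger-magnitude value (preserves salient target detail); in the
    background, average the two registered sources."""
    img_a = np.asarray(img_a, dtype=float)
    img_b = np.asarray(img_b, dtype=float)
    max_rule = np.where(np.abs(img_a) >= np.abs(img_b), img_a, img_b)
    avg_rule = 0.5 * (img_a + img_b)
    return np.where(target_mask, max_rule, avg_rule)
```

In the paper these rules would be applied to wavelet-frame coefficients rather than raw pixels, but the per-region switching logic is the same.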
Medical image analysis based on deep learning has become an important technical requirement in smart healthcare. In view of the difficulty of jointly modeling local details and global features in multimodal ophthalmic image analysis, as well as the information redundancy in cross-modal data fusion, this paper proposes a multimodal fusion framework based on cross-modal collaboration and a weighted attention mechanism. For feature extraction, the framework collaboratively extracts local fine-grained features and global structural dependencies through a parallel dual-branch architecture, overcoming the limitation of traditional single-modality models that capture either local or global information. For the fusion strategy, the framework designs a cross-modal dynamic fusion strategy that combines overlapping multi-head self-attention modules with a bidirectional feature alignment mechanism, addressing the bottlenecks of low feature-interaction efficiency and excessive attention-fusion computation in traditional parallel fusion. It further introduces a cross-domain local integration technique that enhances the representation of lesion areas through pixel-level feature recalibration and improves diagnostic robustness on complex cases. Experiments show that the framework exhibits excellent feature expression and generalization in cross-domain scenarios spanning ophthalmic medical images and natural images, providing a high-precision, low-redundancy fusion paradigm for multimodal medical image analysis and promoting the upgrade of intelligent diagnosis and treatment from single-modal static analysis to dynamic decision-making.
Micro-expressions, fleeting involuntary facial cues lasting under half a second, reveal genuine emotions and are valuable in clinical diagnosis and psychotherapy. Real-time recognition on resource-constrained embedded devices remains challenging, as current methods struggle to balance performance and efficiency. This study introduces a semi-lightweight multifunctional network that enhances real-time deployment and accuracy. Unlike prior simplistic feature fusion techniques, our novel multi-feature fusion strategy leverages temporal, spatial, and differential features to better capture dynamic changes. Enhanced by a Residual Network (ResNet) architecture with channel and spatial attention mechanisms, the model improves feature representation while maintaining a lightweight design. Evaluations on SMIC, CASME II, SAMM, and their composite dataset show superior performance in Unweighted F1 Score (UF1) and Unweighted Average Recall (UAR), alongside faster detection speeds compared to existing algorithms.
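A toy illustration of combining temporal, spatial, and differential cues from a frame sequence; the specific channels below are assumptions for illustration, not the paper's actual feature extractors.

```python
import numpy as np

def multi_feature_stack(frames):
    """Build a simple multi-feature stack from a gray frame sequence of
    shape (T, H, W): a spatial channel (mean frame), a differential
    channel (mean frame-to-frame difference), and a temporal channel
    (onset-to-apex change)."""
    frames = np.asarray(frames, dtype=float)
    spatial = frames.mean(axis=0)
    differential = np.diff(frames, axis=0).mean(axis=0)
    temporal = frames[-1] - frames[0]
    return np.stack([spatial, differential, temporal])
```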
A polarization-based integrated navigation system (PINS) that combines a polarization sensor (PS) with an inertial navigation system (INS) has been widely recognized as an effective solution for acquiring attitude information of unmanned aerial vehicles (UAVs). However, with the PINS hardware configuration alone, accurate acquisition of UAV position information remains a challenge. In this article, we propose an improved PS/INS integrated navigation scheme that incorporates an embedded UAV dynamic model (UDM). Compared with existing PS/INS fusion methods, the presented PINS enables optimal estimation of the UDM thrust-coefficient error along with the other system state elements, significantly improving UDM accuracy. On this basis, the UDM and PS are both fused with the INS, which improves the estimation accuracy of both UAV attitude and position. Furthermore, we employ an adaptive fusion strategy to assess the reliability of PS data: once the UDM has been corrected using reliable PS data, it can further fuse with the INS, improving the environmental adaptability of the PINS. Simulation and flight experiment results verify the effectiveness of the proposed PS/INS/UDM integrated navigation scheme.
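One standard way to screen unreliable sensor data in a Kalman-type fusion filter is innovation (chi-square) gating; the sketch below is an assumed illustration of that general idea, not the paper's specific adaptive strategy.

```python
import numpy as np

def ps_measurement_reliable(innovation, innovation_cov, gate=9.21):
    """Accept a polarization-sensor measurement only if its normalized
    innovation squared falls below the gate (9.21 is the 0.99
    chi-square quantile for 2 degrees of freedom)."""
    innovation = np.asarray(innovation, dtype=float)
    nis = float(innovation @ np.linalg.solve(innovation_cov, innovation))
    return nis < gate
```

Rejected measurements would simply be skipped, letting the filter coast on the INS/UDM prediction until the PS data are trustworthy again.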
Network intrusion detection systems (NIDS) based on deep learning have continued to make significant advances. However, the following challenges remain: on the one hand, applying only Temporal Convolutional Networks (TCNs) can yield models that ignore the impact of network-traffic features at different scales on detection performance; on the other hand, some intrusion detection methods consider multi-scale information of traffic data but use only forward traffic information, leaving deficiencies in capturing multi-scale temporal features. To address both issues, we propose a hybrid convolutional neural network with a multi-output strategy (BONUS) for industrial internet intrusion detection. First, we create a multi-scale Temporal Convolutional Network by stacking TCNs of different scales to capture the multi-scale information of network traffic. Meanwhile, we propose a bi-directional structure and dynamically set weights to fuse the forward and backward contextual information of network traffic at each scale, enhancing the model's ability to capture multi-scale temporal features. In addition, we introduce a gated network for each of the two branches to help the model learn each branch's feature representation. Extensive experiments demonstrate the effectiveness of the proposed approach on two publicly available intrusion detection datasets, UNSW-NB15 and NSL-KDD, with F1 scores of 85.03% and 99.31%, respectively, which also validates the benefit of capturing multi-scale temporal features of traffic data for detection performance.
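The bi-directional weighted fusion at each scale can be pictured as a sigmoid-gated blend of forward and backward features; this is an illustrative sketch in which the gate is a plain scalar rather than a learned module.

```python
import numpy as np

def gated_bidirectional_fusion(fwd, bwd, gate_logit):
    """Blend forward and backward temporal features with a sigmoid gate,
    letting the model weight past vs. future traffic context."""
    w = 1.0 / (1.0 + np.exp(-gate_logit))   # sigmoid keeps the weight in (0, 1)
    return w * np.asarray(fwd) + (1.0 - w) * np.asarray(bwd)
```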
As a critical material in construction engineering, concrete requires accurate prediction of its outlet temperature to ensure structural quality and enhance construction efficiency. This study proposes a novel hybrid prediction method that integrates a heat-conduction physical model with a multilayer perceptron (MLP) neural network, dynamically fused via a weighted strategy to achieve high-precision temperature estimation. Experimental results on an independent test set demonstrated the superior performance of the fused model, with a root mean square error (RMSE) of 1.59 ℃ and a mean absolute error (MAE) of 1.23 ℃, representing a 25.3% RMSE reduction compared to conventional physical models. Ambient temperature and coarse-aggregate temperature were identified as the most influential variables. Furthermore, the model-based temperature control strategy reduced costs by 0.81 CNY/m³, showing significant potential for improving resource efficiency and supporting sustainable construction practices.
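A hedged sketch of the weighted physics/MLP fusion idea: pick the scalar fusion weight on held-out validation data by grid search over RMSE. The paper's weighting scheme may be more elaborate; the function names here are illustrative.

```python
import numpy as np

def rmse(pred, truth):
    """Root mean square error between predictions and ground truth."""
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(truth)) ** 2)))

def best_fusion_weight(phys_pred, mlp_pred, truth, grid=None):
    """Grid-search the scalar weight w in T = w*physical + (1-w)*MLP
    that minimizes RMSE on a validation set."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 101)
    errs = [rmse(w * np.asarray(phys_pred) + (1 - w) * np.asarray(mlp_pred),
                 truth) for w in grid]
    return float(grid[int(np.argmin(errs))])
```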
Contrastive graph clustering (CGC) has become a prominent method for self-supervised representation learning by contrasting augmented graph data pairs. However, the performance of CGC methods critically depends on the choice of data augmentation, which usually limits network generalization. Besides, most existing methods characterize positive and negative samples based on the nodes themselves, ignoring the influence of neighbors at different hop distances. In this study, a novel self-cumulative contrastive graph clustering (SC-CGC) method is devised, which dynamically adjusts the influence of neighbors with different hops. Our intuition is that better neighbors are closer, and distant ones are further away, in feature space, so neighbor contrasting can be performed without data augmentation. Specifically, SC-CGC relies on two neural networks, an autoencoder (AE) and a graph autoencoder (GAE), to encode node information and graph structure, respectively. To make these two networks interact and learn from each other, a dynamic fusion mechanism transfers the knowledge learned by the AE to the corresponding GAE layer by layer. Then, a self-cumulative contrastive loss function characterizes the structural information by dynamically accumulating the influence of nodes at different hops. Finally, our approach simultaneously refines representation learning and clustering assignments in a self-supervised manner. Extensive experiments on eight real-world datasets demonstrate that SC-CGC consistently outperforms state-of-the-art techniques. The code is available at https://github.com/Xiaoqiang-Yan/JAS-SCCGC.
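The idea of dynamically accumulating the influence of neighbors at different hops can be sketched with geometrically decaying adjacency powers; this is an assumed simplification of SC-CGC's self-cumulative scheme, not the actual loss.

```python
import numpy as np

def cumulative_neighbor_influence(adj, max_hop=3, decay=0.5):
    """Accumulate k-hop adjacency powers with geometrically decaying
    weights, so closer neighbors contribute more to each node."""
    adj = np.asarray(adj, dtype=float)
    power = np.eye(adj.shape[0])
    total = np.zeros_like(adj)
    for k in range(1, max_hop + 1):
        power = power @ adj          # k-hop reachability counts
        total += (decay ** k) * power
    return total
```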
Solar forecasting using ground-based sky images offers a promising approach to reducing uncertainty in photovoltaic (PV) power generation. However, existing methods often rely on deterministic predictions that lack diversity, making it difficult to capture the inherently stochastic nature of cloud movement. To address this limitation, we propose a new two-stage probabilistic forecasting framework. In the first stage, we introduce I-GPT, a multi-scale physics-constrained generative model for stochastic sky-image prediction. Given a sequence of past sky images, I-GPT uses a Transformer-based VQ-VAE; it also incorporates multi-scale physics-informed recurrent units (multi-scale PhyCell) and dynamically fuses physical and appearance features with learned weights. This enables the generation of multiple plausible future sky images with realistic, coherent cloud motion. In the second stage, the predicted sky images are fed into an Image-to-Power U-Net (IP-U-Net) to produce 15-min-ahead probabilistic PV power forecasts. In experiments on our dataset, the proposed approach significantly outperforms deterministic, other stochastic, multimodal, and smart-persistence baselines, achieving a superior reliability–sharpness trade-off. It attains a Continuous Ranked Probability Score (CRPS) of 2.912 kW and a Winkler Score (WS) of 33.103 kW on the test set, and a CRPS of 2.073 kW and WS of 22.202 kW on the validation set, corresponding to 35.9% and 42.78% improvements in predictive skill over the smart persistence model. Notably, our method excels during rapidly changing cloud-cover conditions. By enhancing both the accuracy and robustness of short-term PV forecasting, the framework provides tangible benefits for Virtual Power Plant (VPP) operation, supporting more reliable scheduling, grid stability, and risk-aware energy management.
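The Continuous Ranked Probability Score reported above can be computed for an ensemble of sampled forecasts via the standard energy-score identity CRPS = E|X − y| − 0.5·E|X − X′| (lower is better):

```python
import numpy as np

def crps_ensemble(samples, observed):
    """Sample-based CRPS for a scalar observation: mean absolute error
    of the ensemble minus half the mean pairwise ensemble spread."""
    x = np.asarray(samples, dtype=float)
    term1 = np.mean(np.abs(x - observed))
    term2 = 0.5 * np.mean(np.abs(x[:, None] - x[None, :]))
    return float(term1 - term2)
```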
Most machine learning-based remaining useful life (RUL) prediction methods yield only point predictions, and their "black-box" nature results in low interpretability. Stochastic process-based modeling can predict the RUL probability density function (PDF), yet it often suffers from inaccurate modeling and fails to fully utilize historical degradation data of the same equipment type. To overcome these limitations, this paper integrates the two approaches and proposes an Attention-Gaussian-LSTM-Wiener (AG-LSTM-Wiener)-based RUL prediction method enabling dynamic weighted fusion of predicted PDFs. An AG-LSTM-Wiener model with a two-branch structure is constructed. A health indicator (HI) is fed into the corresponding branch models to generate two different PDF curves. Decision blocks are employed to estimate the RUL, from which weights are derived to achieve dynamic weighted fusion of the PDFs. Experiments on the C-MAPSS turbofan engine degradation dataset validate the proposed method's effectiveness. Results demonstrate that the proposed method not only prevents PDF-curve distortion but also improves prediction accuracy compared with other methods: the root mean squared error (RMSE) and Score are reduced by 32.8% and 46.1% on average, and the mean squared error of the PDF (MSEPDF) is improved by 99.3% compared with AG-LSTM, the best-performing of the compared methods.
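The dynamic weighted fusion of the two predicted PDFs amounts to a renormalized mixture; this is a minimal sketch, and how the weights are derived from the decision blocks is not reproduced here.

```python
import numpy as np

def fuse_pdfs(pdf_a, pdf_b, w_a):
    """Mixture of two discretized RUL PDFs with dynamic weight w_a,
    renormalized so the fused curve still sums to one."""
    fused = (w_a * np.asarray(pdf_a, dtype=float)
             + (1.0 - w_a) * np.asarray(pdf_b, dtype=float))
    return fused / fused.sum()
```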
Turbulence, a complex multi-scale phenomenon inherent in fluid flow systems, presents critical challenges and opportunities for understanding physical mechanisms across scientific and engineering domains. Although high-resolution (HR) turbulence data remain indispensable for advancing both theoretical insight and engineering solutions, their acquisition is severely limited by prohibitively high computational costs. While deep learning architectures show transformative potential in reconstructing high-fidelity flow representations from sparse measurements, current methodologies suffer from two inherent constraints: strict reliance on perfectly paired training data, and inability to perform multi-scale reconstruction within a unified framework. To address these challenges, we propose HADF, a hash-adaptive dynamic fusion implicit network for turbulence reconstruction. Specifically, we develop a low-resolution (LR) consistency loss that enables effective model training when paired data are missing, eliminating the conventional requirement for fully matched LR and HR datasets. We further employ hash-adaptive spatial encoding and dynamic feature fusion to extract turbulence features, mapping them with implicit neural representations for reconstruction at arbitrary resolutions. Experimental results demonstrate that HADF achieves superior global reconstruction accuracy and local physical properties compared with state-of-the-art models. It precisely recovers fine turbulence details under partially unpaired data conditions and at diverse resolutions with a single training run, while maintaining robustness to noise.
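A minimal sketch of an LR consistency loss, assuming average pooling as the degradation operator (the paper's operator may differ): the HR prediction is degraded and compared against the observed LR field, so no paired HR ground truth is needed.

```python
import numpy as np

def average_pool(field, factor):
    """Average-pool a 2D field by an integer factor, a simple stand-in
    for the true LR degradation operator."""
    h, w = field.shape
    return field.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def lr_consistency_loss(hr_pred, lr_obs, factor):
    """Mean squared mismatch between the degraded HR prediction and the
    observed LR field."""
    return float(np.mean((average_pool(np.asarray(hr_pred, dtype=float), factor)
                          - np.asarray(lr_obs, dtype=float)) ** 2))
```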
Detection and counting of abalones is one of the key technologies for estimating abalone breeding density. Abalones in the breeding stage are small, densely distributed, and occluded by one another, so existing object detection algorithms detect them with low precision. To solve this problem, a detection and counting method for juvenile abalones based on an improved SSD network is proposed. The innovations of this method are: first, a multi-layer feature dynamic fusion method is proposed to obtain more color and texture information and improve detection precision for small juvenile abalones; second, a multi-scale attention feature extraction method is proposed to highlight the shape and edge features of juvenile abalones and increase detection precision under dense distribution and individual occlusion; finally, a loss-feedback training method is used to increase the diversity of the data and the pixel coverage of juvenile abalones in the images, further raising detection precision for small individuals. The experimental results show that the AP@0.5, AP@0.7, and AP@0.75 values of the proposed method are 91.14%, 89.90%, and 80.14%, respectively. The precision and recall of the counting results are 99.59% and 97.74%, respectively, superior to the counting results of the SSD, FSSD, MutualGuide, EfficientDet, and VarifocalNet models. The proposed method can support real-time monitoring of aquaculture density for juvenile abalones.
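The multi-layer feature dynamic fusion can be sketched as a softmax-weighted combination of per-layer feature maps; this is an assumed simplification, and in practice the scores would come from a learned module.

```python
import numpy as np

def dynamic_layer_fusion(feature_maps, scores):
    """Fuse same-shaped feature maps from several backbone layers with
    softmax weights computed from per-layer scores, so informative
    layers dominate the fused map."""
    s = np.asarray(scores, dtype=float)
    w = np.exp(s - s.max())   # shift for numerical stability
    w /= w.sum()
    return sum(wi * np.asarray(f, dtype=float)
               for wi, f in zip(w, feature_maps))
```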
Funding (region-based dynamic image fusion): Project (No. 2004CB719401) supported by the National Basic Research Program (973) of China.
Funding (multimodal ophthalmic image fusion framework): funded by the Ongoing Research Funding Program (ORF-2025-102), King Saud University, Riyadh, Saudi Arabia; the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJQN202400813); and the Graduate Research Innovation Project (Grant Nos. yjscxx2025-269-193 and CYS25618).
Funding (PS/INS/UDM integrated navigation): supported by the National Natural Science Foundation of China (Grant Nos. 62388101, 62425302, 62227813, 62373033, 62403024) and the National Key R&D Program of China (Grant No. 2020YFA0711200).
Funding (industrial internet intrusion detection): sponsored by the Autonomous Region Key R&D Task Special (2022B01008) and the National Key R&D Program of China (SQ2022AAA010308-5).
Funding (concrete outlet temperature prediction): funded by the National Key Research and Development Plan (2018YFC0406703); supported by the National Natural Science Foundation of China (51779277); the Chinese Academy of Water Sciences (SD0145B072021); the State Key Laboratory of Flow Water Cycle Simulation and Regulation (SKL2022ZD05); the State Key Laboratory of Water Cycle and Water Security, IWHR (Grant No. SKL2024YJZD05); Power China (DJ-ZDXM-2020-50); and the Research and Application of Intelligent Simulation and Intelligent Control Technology for Structural States of Gravity Dams in Jingling Reservoir Project, Zhejiang Province (JLSKFW-2024113).
Funding (self-cumulative contrastive graph clustering): supported by the National Natural Science Foundation of China (62371423, 62450002, 62425107) and the China Postdoctoral Science Foundation (2020M682357).
Funding: Supported by the "Regional Innovation Strategy (RIS)" through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (MOE) (2021RIS002), and by the Technology Development Program (RS-2025-02312851), funded by the Ministry of SMEs and Startups (MSS, Republic of Korea).
Abstract: Solar forecasting using ground-based sky images offers a promising approach to reducing uncertainty in photovoltaic (PV) power generation. However, existing methods often rely on deterministic predictions that lack diversity, making it difficult to capture the inherently stochastic nature of cloud movement. To address this limitation, we propose a new two-stage probabilistic forecasting framework. In the first stage, we introduce I-GPT, a multi-scale physics-constrained generative model for stochastic sky image prediction. Given a sequence of past sky images, I-GPT uses a Transformer-based VQ-VAE, incorporates multi-scale physics-informed recurrent units (Multi-scale PhyCell), and dynamically fuses physical and appearance features via learned weights. This approach enables the generation of multiple plausible future sky images with realistic and coherent cloud motion. In the second stage, the predicted sky images are fed into an Image-to-Power U-Net (IP-U-Net) to produce 15-min-ahead probabilistic PV power forecasts. In experiments on our dataset, the proposed approach significantly outperforms deterministic, stochastic, multimodal, and smart persistence baseline models, achieving a superior reliability–sharpness trade-off. It attains a Continuous Ranked Probability Score (CRPS) of 2.912 kW and a Winkler Score (WS) of 33.103 kW on the test set, and a CRPS of 2.073 kW and a WS of 22.202 kW on the validation set, translating to 35.9% and 42.78% improvements in predictive skill over the smart persistence model. Notably, our method excels during rapidly changing cloud-cover conditions. By enhancing both the accuracy and robustness of short-term PV forecasting, the framework provides tangible benefits for Virtual Power Plant (VPP) operation, supporting more reliable scheduling, grid stability, and risk-aware energy management.
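The CRPS used to score the probabilistic forecasts above has a simple closed form for a finite ensemble: CRPS = E|X − y| − ½ E|X − X′|, where X, X′ are independent draws from the forecast and y is the observation. A minimal sketch (not the paper's evaluation code):

```python
import numpy as np

def crps_ensemble(samples, obs):
    """Continuous Ranked Probability Score for an ensemble forecast:
    E|X - y| - 0.5 * E|X - X'|. Lower is better; rewards forecasts that
    are both sharp (narrow) and well calibrated (centered on the truth)."""
    samples = np.asarray(samples, dtype=float)
    term1 = np.mean(np.abs(samples - obs))
    term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :]))
    return float(term1 - term2)
```

For example, with an observation of 3 kW, the sharp ensemble [2, 4] scores 0.5 while the wide ensemble [0, 6] scores 1.5, showing how CRPS penalizes unnecessarily spread forecasts.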
Funding: Funded by the National Natural Science Foundation of China (62273030, 62303353), the Interdisciplinary Research Project for Young Teachers of USTB (Fundamental Research Funds for the Central Universities) (FRF-IDRY-GD21-004), and the Open Fund of the Intelligent Control Laboratory.
Abstract: Most machine learning-based remaining useful life (RUL) prediction methods yield only point predictions, and their "black-box" nature results in low interpretability. Stochastic process-based modeling can predict the RUL probability density function (PDF), yet it often suffers from inaccurate modeling and fails to fully utilize historical degradation data from equipment of the same type. To overcome these limitations, this paper integrates the two approaches and proposes an Attention-Gaussian-LSTM-Wiener (AG-LSTM-Wiener)-based RUL prediction method that enables dynamic weighted fusion of predicted PDFs. An AG-LSTM-Wiener model with a two-branch structure is constructed. The health indicator (HI) is fed into the corresponding branch models to generate two different PDF curves. Decision blocks estimate the RUL, from which weights are derived to achieve dynamic weighted fusion of the PDFs. Experiments on the C-MAPSS turbofan engine degradation dataset validate the effectiveness of the proposed method. Results demonstrate that it not only prevents PDF curve distortion but also improves prediction accuracy compared with other methods: relative to AG-LSTM, the best-performing baseline, the root mean squared error (RMSE) and the Score are reduced by 32.8% and 46.1% on average, and the mean squared error of the PDF (MSEPDF) is improved by 99.3%.
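The core fusion step — combining two branch PDFs into one renormalized mixture — can be sketched as follows. This is a minimal illustration assuming Gaussian branch outputs on a uniform RUL grid; the weight, grid, and parameters below are hypothetical, not the paper's decision-block weights.

```python
import numpy as np

def gaussian_pdf(t, mu, sigma):
    """Gaussian density evaluated on grid t."""
    return np.exp(-0.5 * ((t - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def fuse_pdfs(pdf_a, pdf_b, w_a, dt):
    """Dynamic weighted fusion of two RUL PDFs, renormalized to unit area."""
    fused = w_a * pdf_a + (1.0 - w_a) * pdf_b
    return fused / (fused.sum() * dt)

def expected_rul(pdf, grid, dt):
    """Mean of the discretized RUL distribution."""
    return float((grid * pdf).sum() * dt)

grid = np.arange(0.0, 120.0, 0.1)
dt = 0.1
# Hypothetical branch outputs: the two branches disagree on the RUL mode.
pdf_branch1 = gaussian_pdf(grid, mu=50.0, sigma=5.0)
pdf_branch2 = gaussian_pdf(grid, mu=70.0, sigma=5.0)
fused = fuse_pdfs(pdf_branch1, pdf_branch2, w_a=0.5, dt=dt)
```

With equal weights the fused expectation lands midway between the branch modes; shifting `w_a` toward the more trusted branch moves it accordingly.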
Funding: Project supported by the National Natural Science Foundation of China (No. 12402349), the Natural Science Foundation of Hunan Province (No. 2024JJ6468), the Youth Foundation of the National University of Defense Technology (No. ZK2023-11), and the National Key Research and Development Program of China (No. 2021YFB0300101).
Abstract: Turbulence, a complex multi-scale phenomenon inherent in fluid flow systems, presents critical challenges and opportunities for understanding physical mechanisms across scientific and engineering domains. Although high-resolution (HR) turbulence data remain indispensable for advancing both theoretical insight and engineering solutions, their acquisition is severely limited by prohibitively high computational costs. While deep learning architectures show transformative potential in reconstructing high-fidelity flow representations from sparse measurements, current methodologies suffer from two inherent constraints: strict reliance on perfectly paired training data and the inability to perform multi-scale reconstruction within a unified framework. To address these challenges, we propose HADF, a hash-adaptive dynamic fusion implicit network for turbulence reconstruction. Specifically, we develop a low-resolution (LR) consistency loss that enables effective model training when paired data are missing, eliminating the conventional requirement for fully matched LR and HR datasets. We further employ hash-adaptive spatial encoding and dynamic feature fusion to extract turbulence features, mapping them with implicit neural representations for reconstruction at arbitrary resolutions. Experimental results demonstrate that HADF achieves superior performance in global reconstruction accuracy and local physical properties compared with state-of-the-art models. It precisely recovers fine turbulence details under partially unpaired data conditions and at diverse resolutions after training only once, while maintaining robustness against noise.
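The LR consistency idea — supervise the HR reconstruction only through its downsampled version — can be sketched with average pooling as an assumed degradation operator. This is a simplified stand-in; the paper's actual loss and operator may differ.

```python
import numpy as np

def downsample(field, factor):
    """Average-pool a 2-D field by `factor` (a simple degradation operator)."""
    h, w = field.shape
    return field.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def lr_consistency_loss(hr_pred, lr_obs, factor):
    """MSE between the downsampled HR reconstruction and the LR measurement.
    Only LR data are needed as supervision -- no paired HR ground truth."""
    return float(np.mean((downsample(hr_pred, factor) - lr_obs) ** 2))
```

Any HR field whose coarse average matches the LR observation achieves zero loss, which is exactly why additional regularization (here, the implicit network and physical priors) is needed to pick out a physically plausible reconstruction.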
Funding: Jointly supported by the National Key R&D Project (2020YFD0900204) and the Yantai Key R&D Project (2019XDHZ084).
Abstract: Detection and counting of abalones is a key technology for estimating abalone breeding density. Abalones in the breeding stage are small, densely distributed, and occlude one another, so existing object detection algorithms achieve low precision on them. To solve this problem, a detection and counting method for juvenile abalones based on an improved SSD network is proposed. The innovations of this method are: first, a multi-layer feature dynamic fusion method is proposed to obtain more color and texture information and improve detection precision for small juvenile abalones; second, a multi-scale attention feature extraction method is proposed to highlight shape and edge features and increase detection precision for densely distributed, mutually occluding juvenile abalones; finally, a loss feedback training method is used to increase the diversity of the data and the pixel coverage of juvenile abalones in the images, further improving detection precision for small targets. Experimental results show that the AP@0.5, AP@0.7, and AP@0.75 values of the proposed method are 91.14%, 89.90%, and 80.14%, respectively. The precision and recall of the counting results are 99.59% and 97.74%, respectively, outperforming the SSD, FSSD, MutualGuide, EfficientDet, and VarifocalNet models. The proposed method can support real-time monitoring of aquaculture density for juvenile abalones.
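The multi-layer feature dynamic fusion above amounts to combining feature maps from several layers with data-dependent weights. A minimal sketch, assuming the maps have already been resized to a common shape and using fixed softmax logits in place of the learned weighting:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def dynamic_fuse(feature_maps, logits):
    """Fuse same-shape feature maps with softmax weights (learned in practice,
    supplied as fixed logits here for illustration); weights sum to 1."""
    w = softmax(np.asarray(logits, dtype=float))
    fused = np.zeros_like(feature_maps[0])
    for wi, fm in zip(w, feature_maps):
        fused += wi * fm
    return fused
```

Because the weights are normalized, a layer with a dominant logit contributes almost the entire fused map, while equal logits reduce to simple averaging.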