Single-pixel imaging (SPI) is a prominent scattering media imaging technique that allows image transmission via one-dimensional detection under structured illumination, with applications spanning from long-range imaging to microscopy. Recent advancements leveraging deep learning (DL) have significantly improved SPI performance, especially at low compression ratios. However, most DL-based SPI methods proposed so far rely heavily on extensive labeled datasets for supervised training, which are often impractical in real-world scenarios. Here, we propose an unsupervised-learning-enabled, label-free SPI method for resilient information transmission through unknown dynamic scattering media. Additionally, we introduce a physics-informed autoencoder framework to optimize encoding schemes, further enhancing image quality at low compression ratios. Simulation and experimental results demonstrate that high-efficiency data transmission with structural similarity exceeding 0.9 is achieved through challenging turbulent channels. Moreover, experiments demonstrate that in a 5 m underwater dynamic turbulent channel, USAF target imaging quality surpasses traditional methods by over 13 dB. The compressive encoded transmission of 720×720 resolution video exceeding 30 seconds with great fidelity is also successfully demonstrated. These preliminary results suggest that our proposed method opens up a new paradigm for resilient information transmission through unknown dynamic scattering media and holds potential for broader applications within many other scattering media imaging technologies.
To solve the problem of automatic defect detection and process control in the welding and arc additive process, the paper monitors the current, voltage, audio, and other data during the welding process and extracts the minimum value, standard deviation, and deviation from the voltage and current data. It extracts features such as root mean square, spectral centroid, and zero-crossing rate from the audio data, fuses the features extracted from the multiple sensor signals, and establishes multiple supervised and unsupervised machine learning models, which are used to detect abnormalities in the welding process. The experimental results show that the established models have high accuracy: among the supervised models, AdaBoost reaches a balanced accuracy of 0.957, and among the unsupervised models, Isolation Forest reaches a balanced accuracy of 0.909.
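The three audio features named in this abstract have standard textbook definitions, sketched below in plain Python. This is an illustration under stated assumptions, not the paper's pipeline: the function names, the naive DFT, and the test tone are all hypothetical, and the abstract does not say how the authors frame or window the audio.

```python
import cmath
import math


def rms(x):
    """Root mean square of a signal frame."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(x) - 1)


def spectral_centroid(x, sample_rate):
    """Magnitude-weighted mean frequency, using a naive O(n^2) DFT."""
    n = len(x)
    half = n // 2 + 1  # keep non-negative frequencies only
    mags = []
    for k in range(half):
        s = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(s))
    total = sum(mags)
    if total == 0:
        return 0.0
    freqs = [k * sample_rate / n for k in range(half)]
    return sum(f * m for f, m in zip(freqs, mags)) / total


# Hypothetical test tone: 100 Hz sine sampled at 800 Hz, 64 samples (8 full periods).
tone = [math.sin(2 * math.pi * 100 * t / 800) for t in range(64)]
centroid = spectral_centroid(tone, 800)
```

In practice a library FFT would replace the naive DFT; the loop is kept here only so the sketch has no dependencies.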
In this paper, we propose a structural developmental neural network to address the plasticity-stability dilemma, computational inefficiency, and lack of prior knowledge in continual unsupervised learning. This model uses competitive learning rules and dynamic neurons with information saturation to achieve parameter adjustment and adaptive structure development. Dynamic neurons adjust the information saturation after winning the competition and use this parameter to modulate the neuron parameter adjustment and the division timing. By dividing to generate new neurons, the network not only stays sensitive to novel features but can also subdivide classes learnt repeatedly. The dynamic neurons with information saturation and the division mechanism can simulate the long short-term memory of the human brain, which enables the network to continually learn new samples while maintaining the previous learning results. The parent-child relationship between neurons arising from neuronal division enables the network to simulate the human cognitive process that gradually refines the perception of objects. By setting the clustering layer parameter, users can choose the desired degree of class subdivision. Experimental results on artificial and real-world datasets demonstrate that the proposed model is feasible for unsupervised learning tasks in instance-increment and class-increment scenarios and outperforms prior structural developmental neural networks.
This paper reports distinct spatio-spectral properties of Zen-meditation EEG (electroencephalograph), compared with resting EEG, by implementing an unsupervised machine learning scheme to cluster the brain mappings of centroid frequency (BMFc). Zen practitioners simultaneously concentrate on the third ventricle, hypothalamus and corpora quadrigemina to universalize all brain neurons to construct a detached brain and gradually change the normal brain traits, leading to the process of brain neuroplasticity. During such tri-aperture concentration, EEG exhibits prominent diffuse high-frequency oscillations. An unsupervised self-organizing map (SOM) clusters the dataset of quantitative EEG by matching the input feature vector Fc and the output cluster center through the SOM network weights. The input dataset contains brain mappings of 30 centroid frequencies extracted from CWT (continuous wavelet transform) coefficients. According to the SOM clustering results, resting EEG is dominated by global low-frequency (<14 Hz) activities, except channels T7, F7 and TP7 (>14.4 Hz), whereas Zen-meditation EEG exhibits globally high-frequency (>16 Hz) activities throughout the entire record. Beta waves with a wide range of frequencies are often associated with active concentration. Nonetheless, clinical reports disclose that benzodiazepines, medications for anxiety, insomnia and panic attacks that relieve mind/body stress, often induce beta buzz. We may hypothesize that Zen-meditation practitioners attain the unique state of mindfulness concentration under optimal body-mind relaxation.
Hybrid beamforming is a promising technique for massive multiple-input multiple-output (MIMO) systems because it reduces hardware cost and power consumption while improving system performance, and it is considered a key enabler for fifth-generation and beyond communications. However, the acquisition of perfect channel state information is a challenging task since the hybrid precoder design utilizes the phase shifters in the analog domain. In this paper, we investigate the spectral efficiency (SE) of a millimeter wave hybrid massive MIMO system in which the precoding is optimized by unsupervised learning. To maximize the system's SE, this paper proposes an efficient DNN-LSTM-Res network (named DNN-LSTM-ResNet), which incorporates a deep neural network (DNN), a long short-term memory (LSTM) method and a residual network (ResNet), where the neural network is treated as the basic network and the residual network is used to address the degradation problem. Finally, numerical simulations are provided to validate the effectiveness of the proposed DNN-LSTM-ResNet for hybrid massive MIMO systems. The results show that the proposed network greatly decreases the number of training parameters and effectively improves the system's SE, with a gain of approximately 20% compared with conventional algorithms. Moreover, the proposed DNN-LSTM-ResNet has the characteristics of fast convergence speed and strong search capability.
Defect detection based on computer vision is a critical component in ensuring the quality of industrial products. However, existing detection methods encounter several challenges in practical applications, including the scarcity of labeled samples, limited adaptability of pre-trained models, and data heterogeneity in distributed environments. To address these issues, this research proposes an unsupervised defect detection method, FLAME (Federated Learning with Adaptive Multi-Model Embeddings). The method comprises three stages: (1) Feature learning stage: this work proposes FADE (Feature-Adaptive Domain-Specific Embeddings), a framework that employs Gaussian noise injection to simulate defective patterns and implements a feature discriminator for defect detection, thereby enhancing the pre-trained model's industrial imagery representation capabilities. (2) Knowledge distillation co-training stage: a multi-model feature knowledge distillation mechanism is introduced. Through feature-level knowledge transfer between the global model and historical local models, the current local model is guided to learn better feature representations from the global model. The approach prevents local models from converging to local optima and mitigates performance degradation caused by data heterogeneity. (3) Model parameter aggregation stage: participating clients utilize weighted averaging aggregation to synthesize an updated global model, facilitating efficient knowledge consolidation. Experimental results demonstrate that FADE improves the average image-level Area under the Receiver Operating Characteristic Curve (AUROC) by 7.34% compared to methods directly utilizing pre-trained models. In federated learning environments, FLAME's multi-model feature knowledge distillation mechanism outperforms the classic FedAvg algorithm by 2.34% in average image-level AUROC, while exhibiting superior convergence properties.
Highly stretchable and robust strain sensors are rapidly emerging as promising candidates for a diverse range of wearable electronics. The main challenges for the practical application of wearable electronics are energy consumption and device aging. Energy consumption mainly depends on the conductivity of the sensor, and it is a key factor in determining device aging. Here, we design a liquid metal (LM)-embedded hydrogel as a sensing material to overcome the barrier of energy consumption and device aging in wearable electronics. The sensing material simultaneously exhibits high conductivity (up to 22 S m⁻¹), low elastic modulus (23 kPa), and ultrahigh stretchability (1500%) with excellent robustness (consistent performance over 12000 mechanical cycles). A motion monitoring system is composed of the intrinsically soft LM-embedded hydrogel as sensing material, a microcontroller, signal-processing circuits, a Bluetooth transceiver, and software based on a self-organizing map for the visualization of multi-dimensional data. This system, which integrates multiple functions including signal conditioning, processing, and wireless transmission, achieves hand gesture monitoring as well as sign-to-verbal translation. This approach provides an ideal strategy for deaf-mute individuals to communicate with others and broadens the application of wearable electronics.
Anomaly detection in high-dimensional data is a critical research issue with serious implications for real-world problems. Many issues in this field remain unsolved, and several modern anomaly detection methods struggle to maintain adequate accuracy due to the highly descriptive nature of big data. This phenomenon is referred to as the "curse of dimensionality", which affects traditional techniques in terms of both accuracy and performance. Thus, this research proposes a hybrid model based on a Deep Autoencoder Neural Network (DANN) with five layers to reduce the difference between the input and output. The proposed model was applied to a real-world gas turbine (GT) dataset that contains 87620 columns and 56 rows. During the experiment, two issues were investigated and solved to enhance the results. The first is the dataset class imbalance, which was solved using the SMOTE technique. The second is poor performance, which can be solved using an optimization algorithm. Several optimization algorithms were investigated and tested, including stochastic gradient descent (SGD), RMSprop, Adam and Adamax. The Adamax optimization algorithm showed the best results when employed to train the DANN model. The experimental results show that the proposed model can detect anomalies by efficiently reducing the high dimensionality of the dataset, with an accuracy of 99.40%, an F1-score of 0.9649, an Area Under the Curve (AUC) of 0.9649, and a minimal loss during the hybrid model training.
Underwater image enhancement aims to restore a clean appearance and thus improve the quality of degraded underwater images. Current methods feed the whole image directly into the model for enhancement. However, they ignore that the R, G and B channels of degraded underwater images present varied degrees of degradation due to the selective absorption of light. To address this issue, we propose an unsupervised multi-expert learning model that considers the enhancement of each color channel. Specifically, an unsupervised architecture based on a generative adversarial network is employed to alleviate the need for paired underwater images. Based on this, we design a generator, including a multi-expert encoder, a feature fusion module and a feature fusion-guided decoder, to generate the clear underwater image. Accordingly, a multi-expert discriminator is proposed to verify the authenticity of the R, G and B channels, respectively. In addition, content perceptual loss and edge loss are introduced into the loss function to further improve the content and details of the enhanced images. Extensive experiments on public datasets demonstrate that our method achieves more pleasing results in visual quality. Various metrics (PSNR, SSIM, UIQM and UCIQE) evaluated on our enhanced images are clearly improved.
Classifying topological phases of matter with strong interactions is a notoriously challenging task and has attracted considerable attention in recent years. In this paper, we propose an unsupervised machine learning approach that can classify a wide range of symmetry-protected interacting topological phases directly from experimental observables and without a priori knowledge. We analytically show that Green's functions, which can be derived from spectral functions that can be measured directly in an experiment, are suitable input data for our learning proposal based on the diffusion map. As a concrete example, we consider a one-dimensional interacting topological insulator model and show, through extensive numerical simulations, that our diffusion map approach works as desired. In addition, we put forward a generic scheme to measure the spectral functions in ultracold atomic systems through momentum-resolved Raman spectroscopy. Our work circumvents the costly diagonalization of the system Hamiltonian, and provides a versatile protocol for the straightforward and autonomous identification of interacting topological phases from experimental observables in an unsupervised manner.
The development of deep learning has inspired some new methods to solve the 3D reconstruction problem for Tomographic Particle Image Velocimetry (Tomo-PIV). However, the supervised learning method requires a large amount of data with ground truth as training information, which is very difficult to gather from experiments. Although synthetic datasets can be used as alternatives, they are still not exactly the same as real-world experimental data. In this paper, an Unsupervised Reconstruction Technique based on U-net (UnRTU) is proposed to reconstruct the volume particle distribution explicitly. Instead of using ground truth data, a projection function is used as an unsupervised loss function for network training to reconstruct the particle distribution. UnRTU was compared with some traditional algebraic reconstruction algorithms and a supervised learning method using synthetic data under different particle densities and noise levels. The results indicate that UnRTU outperforms the traditional approaches in both reconstruction quality and noise robustness, and is comparable to the supervised learning method AI-PR. For experimental tests, particles dispersed in cured epoxy resin are moved by an electric rail at a set speed to obtain the ground truth data of particle velocity. Compared with other algorithms, the particle distribution reconstructed by UnRTU has the best reconstruction fidelity, and the accuracy of the 3D velocity field estimated by UnRTU is 12.9% higher than that from the traditional MLOS-MART algorithm. This demonstrates significant potential and advantages for UnRTU in 3D reconstruction of particle distribution. Finally, UnRTU was successfully applied to a high-speed planar cascade airflow field, demonstrating its applicability for measuring complex fluid flow fields at higher particle density.
Machine learning (ML) is a rapidly growing tool in the lithium-ion battery (LIB) research field. To utilize this tool, more and more datasets have been published. However, the applicability of an ML model to different information sources or various LIB cell types has not been well studied. In this paper, an unsupervised learning model called the variational autoencoder (VAE) is evaluated with three datasets of charge-discharge cycles under different conditions. The model was first trained with a publicly available dataset of commercial cylindrical cells, and then evaluated with our private datasets of commercial pouch cells and hand-made coin cells. These cells used different chemistries and were tested with different cycle testers for different purposes, which gives each dataset its own characteristics. We report that researchers can recognise these characteristics with a VAE to plan proper data preprocessing. We also discuss the interpretability of an ML model.
Unsupervised learning algorithms can effectively solve sample imbalance. To address battery consistency anomalies in new energy vehicles, we adopt a variety of unsupervised learning algorithms to evaluate and predict the battery consistency of three vehicles using charging fragment data from actual operating conditions. We extract battery-related features, such as the mean of maximum difference, standard deviation, and entropy of batteries, and then apply principal component analysis to reduce the dimensionality and record the amount of preserved information. We then build models through a collection of unsupervised learning algorithms for the anomaly detection of cell consistency faults. We also determine whether unsupervised and supervised learning algorithms can address the battery consistency problem and document the parameter tuning process. In addition, we compare the prediction effectiveness of charging and discharging features modeled individually and in combination, determine the choice of charging and discharging features to be modeled in combination, and visualize the multidimensional data for fault detection. Experimental results show that the unsupervised learning algorithm is effective in visualizing and predicting vehicle core conformance faults, and can accurately predict faults in real time. The "distance+boxplot" algorithm shows the best performance with a prediction accuracy of 80%, a recall rate of 100%, and an F1 of 0.89. The proposed approach can be applied to monitor battery consistency faults in real time and reduce the possibility of disasters arising from consistency faults.
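The three figures reported for the "distance+boxplot" algorithm are mutually consistent if the 80% "prediction accuracy" is read as precision, which is an assumption here since the abstract does not define the term. A short check using the standard F1 definition, F1 = 2PR / (P + R):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Assumption: the abstract's 80% "prediction accuracy" is precision.
reported_f1 = round(f1_score(0.80, 1.00), 2)  # 2 * 0.8 * 1.0 / 1.8 = 0.888...
```

With precision 0.80 and recall 1.00, the formula gives 0.888..., which rounds to the 0.89 stated in the abstract.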
Particle image velocimetry (PIV) is an essential method in experimental fluid dynamics. In recent years, the development of deep learning-based methods has inspired new approaches to tackle the PIV problem, considerably improving the accuracy of PIV. However, the supervised learning of PIV is driven by large volumes of data with ground truth information. Therefore, the authors consider unsupervised PIV methods. There has been some work on unsupervised PIV, but these methods are not nearly as effective as supervised learning PIV. The authors try to improve the effectiveness and accuracy of unsupervised PIV by adding classical PIV methods and physical constraints. In this paper, the authors propose an unsupervised PIV method combined with the cross-correlation method and a divergence-free constraint, which obtains better performance than other unsupervised PIV methods. The authors compare some classical PIV methods and some deep learning methods, such as LiteFlowNet, LiteFlowNet-en, and UnLiteFlowNet, with the authors' model on the synthetic dataset. In addition, the authors contrast the results of LiteFlowNet, UnLiteFlowNet and the authors' model on experimental particle images. As a result, the authors' model shows comparable performance with classical PIV methods as well as supervised PIV methods and outperforms the previous unsupervised PIV method in most flow cases.
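A divergence-free constraint penalizes any nonzero value of ∂u/∂x + ∂v/∂y in the estimated 2D velocity field. The abstract does not give the exact form of the authors' loss term, so the following is only a sketch of one common choice: central finite differences and a mean squared divergence over interior grid points. The function names and the convention that rows index y and columns index x are assumptions.

```python
def divergence(u, v, dx=1.0, dy=1.0):
    """Central-difference divergence du/dx + dv/dy at interior grid points.

    u and v are lists of rows; rows index y, columns index x (assumed layout).
    Border entries are left at 0.0 because central differences need neighbours.
    """
    rows, cols = len(u), len(u[0])
    div = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            du_dx = (u[r][c + 1] - u[r][c - 1]) / (2 * dx)
            dv_dy = (v[r + 1][c] - v[r - 1][c]) / (2 * dy)
            div[r][c] = du_dx + dv_dy
    return div


def divergence_penalty(u, v):
    """Mean squared divergence over interior points (illustrative loss term)."""
    d = divergence(u, v)
    interior = [d[r][c] for r in range(1, len(d) - 1) for c in range(1, len(d[0]) - 1)]
    return sum(x * x for x in interior) / len(interior)


# Hypothetical fields on a 4x4 grid: (u, v) = (x, -y) is divergence-free,
# while (u, v) = (x, y) has constant divergence 2.
size = 4
u_field = [[float(c) for c in range(size)] for r in range(size)]
v_free = [[-float(r) for c in range(size)] for r in range(size)]
v_source = [[float(r) for c in range(size)] for r in range(size)]
```

A training loss would add this penalty, scaled by a weight, to the photometric term; the weight and the photometric term are outside what the abstract specifies.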
Pedestrian crashes at high-speed locations are a persistent road safety concern. Driving at high speeds means that the driver has less time to react and make evasive maneuvers to avoid a pedestrian crash. On top of this, other crash-contributing factors such as humans (pedestrians or drivers), vehicles, roadways, and surrounding environmental factors actively interact to cause a crash at high-speed locations. The pattern of pedestrian crashes also differs significantly between high-speed intersection and segment locations, which requires further investigation. This study applied association rules mining (ARM), an unsupervised learning algorithm, to reveal the hidden associations of pedestrian crash risk factors at high-speed intersections and segments separately. The study used Louisiana pedestrian fatal and injury crash data (2010 to 2019). Any crash location with a posted speed limit of 45 mph or above is classified as a high-speed location. Based on the generated association rules, the results show that pedestrian crashes at high-speed intersections are associated with the intersection geometry (3-leg) and control (1 stop, no traffic control device), driver characteristics (careless operation, failure to yield, inattentive-distracted, older, and younger driver), pedestrian-related factors (violations, alcohol/drug involvement), settings (open country, residential, business, industrial), dark lighting conditions and so on. Most pedestrian crashes at high-speed segments are associated with roadways with no physical separation, dark-no-streetlight conditions, open country locations, interstates and so on. The findings of the study may help to select appropriate countermeasures to reduce pedestrian crashes at high-speed locations.
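Association rules are usually ranked by support, confidence, and lift. The sketch below computes these three standard measures for one rule on a toy set of crash records. The records and attribute labels are invented for illustration and are not taken from the Louisiana data, and the abstract does not state which thresholds or mining algorithm the study used.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    s_a = support(transactions, antecedent)
    s_c = support(transactions, consequent)
    s_ac = support(transactions, set(antecedent) | set(consequent))
    if s_a == 0 or s_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    confidence = s_ac / s_a
    lift = confidence / s_c
    return s_ac, confidence, lift


# Hypothetical crash records, each a set of categorical attributes.
crashes = [
    {"dark_no_streetlight", "open_country", "segment"},
    {"dark_no_streetlight", "open_country", "segment"},
    {"daylight", "business", "intersection"},
    {"dark_no_streetlight", "residential", "intersection"},
]

sup, conf, lift = rule_metrics(crashes, {"dark_no_streetlight"}, {"open_country"})
```

A lift above 1 means the consequent appears more often alongside the antecedent than it does overall, which is the usual criterion for calling a rule interesting.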
Reliable electricity infrastructure is critical for modern society, highlighting the importance of securing the stability of fundamental power electronic systems. However, as such systems frequently involve high-current and high-voltage conditions, there is a greater likelihood of failures. Consequently, anomaly detection for power electronic systems holds great significance, and it is a task that properly designed neural networks can handle well, as proven in various scenarios. Transformer-like networks are promising for such applications, yet because their structure was initially designed for different tasks, features extracted by the early layers are often lost, decreasing detection performance. Also, such data-driven methods typically require sufficient anomalous data for training, which can be difficult to obtain in practice. Therefore, to improve feature utilization while achieving efficient unsupervised learning, a novel model, the Densely-connected Decoder Transformer (DDformer), is proposed in this paper for unsupervised anomaly detection of power electronic systems. First, efficient label-free training is achieved based on the concept of an autoencoder with recursive-free output. An encoder-decoder structure with a densely-connected decoder is then adopted, merging features from all encoder layers to avoid possible loss of mined features while reducing training difficulty. Both simulation and real-world experiments are conducted to validate the capabilities of DDformer, and its average FDR surpasses the baseline models, reaching 89.39%, 93.91%, and 95.98% in the different experiment setups, respectively.
Side-channel analysis (SCA) has emerged as a research hotspot in the field of cryptanalysis. Among various approaches, unsupervised deep learning-based methods demonstrate powerful information extraction capabilities without requiring labeled data. However, existing unsupervised methods, particularly those represented by differential deep learning analysis (DDLA) and its improved variants, overcome the dependency on labeled data inherent in template analysis but still suffer from high time complexity and training costs when handling key byte difference comparisons. To address this issue, this paper introduces invariant information clustering (IIC) into SCA for the first time, and thus proposes a novel unsupervised learning-based SCA method, named IIC-SCA. By leveraging mutual information maximization techniques for automatic feature extraction from power leakage data, our approach achieves key recovery through a single training session, eliminating the prohibitive computational overhead of traditional methods that require separate training for all possible key bytes. Experimental results on the ASCAD dataset demonstrate successful key extraction using only 50000 training traces and 2000 attack traces. Furthermore, compared with DDLA, the proposed method reduces training time by approximately 93.40% and memory consumption by about 6.15%, significantly decreasing the temporal and resource costs of unsupervised SCA. This breakthrough provides new insights for developing low-cost, high-efficiency cryptographic attack methodologies.
Performing high-resolution stratigraphic analysis may be challenging and time-consuming if one has to work with large datasets. Moreover, sedimentary records have signals of different frequencies and intrinsic noise, resulting in a complex signature that is difficult to identify through visual analysis alone. This work proposes identifying transgressive-regressive (T-R) sequences from carbonate facies successions of three South American basins: (i) São Francisco Basin, Brazil, (ii) Santos Basin, Brazil, and (iii) Salta Basin, Argentina. We applied a hidden Markov model in an unsupervised approach followed by a Score-Based Recommender System that automatically finds medium- or low-frequency sedimentary cycles from high-frequency ones. Our method is applied to facies identified using Fullbore Formation Microimager (FMI) logs, outcrop description, and composite logs from carbonate intervals. The automatic recommendation results showed better long-distance correlations between medium- to low-frequency sedimentary cycles, whereas the hidden Markov model method successfully identified high-resolution (high-frequency) transgressive and regressive systems tracts from the given facies successions. Our workflow offers advances in the automated analysis and construction of lower- to higher-rank stratigraphic frameworks and in short- to long-distance stratigraphic correlation, allowing for large-scale automated processing of the basin dataset. Our approach in this work fits the unsupervised learning framework, as we require no previous input of stratigraphical analysis in the basin. The results provide solutions for prospecting any sediment-hosted mineral resource, especially for the oil and gas industry, offering support for subsurface geological characterization, whether at the exploration scale or for reservoir zoning during production development.
In the dynamic scenes encountered by autonomous vehicles, monocular depth estimation often suffers from inaccurate depth estimates at object edges. To solve this problem, we propose an unsupervised monocular depth estimation model based on edge enhancement, which specifically targets the depth perception challenge in dynamic scenes. The model consists of two core networks: a depth prediction network and a motion estimation network, both of which adopt an encoder-decoder architecture. The depth prediction network is based on a U-Net structure with ResNet18 and is responsible for generating the depth map of the scene. The motion estimation network is based on a U-Net structure with Flow-Net and focuses on the motion estimation of dynamic targets. In the decoding stage of the motion estimation network, we introduce an edge-enhanced decoder, which integrates a convolutional block attention module (CBAM) in the decoding process to enhance the recognition of the edge features of moving objects. In addition, we design a strip convolution module to improve the model's capture efficiency for discrete moving targets. To further improve the performance of the model, we propose a novel edge regularization method based on the Laplace operator, which effectively accelerates the convergence of the model. Experimental results on the KITTI and Cityscapes datasets show that, compared with current advanced dynamic unsupervised monocular models, the proposed model achieves a significant improvement in depth estimation accuracy and convergence speed. Specifically, the root mean square error (RMSE) is reduced by 4.8% compared with the DepthMotion algorithm, while the training convergence speed is increased by 36%, which shows the superior performance of the model in the depth estimation task in dynamic scenes.
The performance of traditional vibration based fault diagnosis methods greatly depends on those hand- crafted features extracted using signal processing algo- rithms, which require significant amounts of domain knowle...The performance of traditional vibration based fault diagnosis methods greatly depends on those hand- crafted features extracted using signal processing algo- rithms, which require significant amounts of domain knowledge and human labor, and do not generalize well to new diagnosis domains. Recently, unsupervised represen- tation learning provides an alternative promising solution to feature extraction in traditional fault diagnosis due to its superior learning ability from unlabeled data. Given that vibration signals usually contain multiple temporal struc- tures, this paper proposes a multiscale representation learning (MSRL) framework to learn useful features directly from raw vibration signals, with the aim to capture rich and complementary fault pattern information at dif- ferent scales. In our proposed approach, a coarse-grained procedure is first employed to obtain multiple scale signals from an original vibration signal. Then, sparse filtering, a newly developed unsupervised learning algorithm, is applied to automatically learn useful features from each scale signal, respectively, and then the learned features at each scale to be concatenated one by one to obtain multi- scale representations. Finally, the multiscale representa- tions are fed into a supervised classifier to achieve diagnosis results. Our proposed approach is evaluated using two different case studies: motor bearing and wind turbine gearbox fault diagnosis. Experimental results show that the proposed MSRL approach can take full advantages of the availability of unlabeled data to learn discriminative features and achieved better performance with higher accuracy and stability compared to the traditional approaches.展开更多
Funding: Supported by the Natural Science Foundation of China Project (No. 62525102).
Abstract: Single-pixel imaging (SPI) is a prominent scattering media imaging technique that allows image transmission via one-dimensional detection under structured illumination, with applications spanning from long-range imaging to microscopy. Recent advancements leveraging deep learning (DL) have significantly improved SPI performance, especially at low compression ratios. However, most DL-based SPI methods proposed so far rely heavily on extensive labeled datasets for supervised training, which are often impractical in real-world scenarios. Here, we propose an unsupervised learning-enabled, label-free SPI method for resilient information transmission through unknown dynamic scattering media. Additionally, we introduce a physics-informed autoencoder framework to optimize encoding schemes, further enhancing image quality at low compression ratios. Simulation and experimental results demonstrate that high-efficiency data transmission with structural similarity exceeding 0.9 is achieved through challenging turbulent channels. Moreover, experiments demonstrate that in a 5 m underwater dynamic turbulent channel, USAF target imaging quality surpasses traditional methods by over 13 dB. The compressive encoded transmission of 720×720 resolution video exceeding 30 seconds with great fidelity is also successfully demonstrated. These preliminary results suggest that our proposed method opens up a new paradigm for resilient information transmission through unknown dynamic scattering media and holds potential for broader applications within many other scattering media imaging technologies.
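The "one-dimensional detection under structured illumination" and "compression ratio" in the abstract above can be illustrated with a minimal sketch. It assumes the standard linear SPI measurement model (one bucket value per illumination pattern); the random binary patterns, the 8×8 scene, and the names `spi_measure`, `patterns`, and `signal` are illustrative assumptions, not the learned encoding scheme the authors describe.

```python
import numpy as np

def spi_measure(image, patterns):
    """Return one bucket-detector value per pattern (inner product with the scene)."""
    return patterns @ image.ravel()

rng = np.random.default_rng(0)
image = rng.random((8, 8))          # toy 8x8 scene (made-up data)
n_pixels = image.size
n_measurements = 16                 # fewer measurements than pixels

# Assumption: random binary patterns stand in for the optimized encoding.
patterns = rng.integers(0, 2, size=(n_measurements, n_pixels)).astype(float)

signal = spi_measure(image, patterns)            # the 1-D signal that is transmitted
compression_ratio = n_measurements / n_pixels    # 16 / 64 = 0.25
```

Reconstruction, which is where the unsupervised network would come in, is deliberately left out of this sketch.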
Abstract: To address automatic defect detection and process control in welding and arc additive manufacturing, the paper monitors the current, voltage, audio, and other data during the welding process and extracts the minimum value, standard deviation, and deviation from the voltage and current data. It extracts spectral features such as root mean square, spectral centroid, and zero-crossing rate from the audio data, fuses the features extracted from the multiple sensor signals, and establishes multiple supervised and unsupervised machine learning models, which are used to detect abnormalities in the welding process. The experimental results show that the established models have high accuracy: among the supervised models, AdaBoost reaches a balanced accuracy of 0.957, and the unsupervised Isolation Forest model reaches a balanced accuracy of 0.909.
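The three audio features named in the abstract above have standard textbook definitions, sketched below with NumPy. The function name `audio_features`, the 8 kHz sample rate, and the 440 Hz test tone are assumptions for illustration; the paper's framing, windowing, and feature-fusion steps are not specified in the abstract and are not reproduced here.

```python
import numpy as np

def audio_features(x, sample_rate):
    """Return (root mean square, zero-crossing rate, spectral centroid in Hz)."""
    x = np.asarray(x, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    # Zero-crossing rate: fraction of adjacent sample pairs whose sign differs.
    signs = np.signbit(x)
    zcr = np.mean(signs[1:] != signs[:-1])
    # Spectral centroid: magnitude-weighted mean frequency of the spectrum.
    magnitude = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / sample_rate)
    centroid = np.sum(freqs * magnitude) / np.sum(magnitude)
    return rms, zcr, centroid

sample_rate = 8000                               # assumed sample rate
t = np.arange(sample_rate) / sample_rate         # one second of samples
tone = np.sin(2 * np.pi * 440 * t)               # synthetic 440 Hz test tone
rms, zcr, centroid = audio_features(tone, sample_rate)
```

For a pure tone the centroid sits at the tone frequency and the RMS is close to 1/√2, which makes the sketch easy to sanity-check before applying it to real recordings.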
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61825305 and U21A20518).
Abstract: In this paper, we propose a structural developmental neural network to address the plasticity-stability dilemma, computational inefficiency, and lack of prior knowledge in continual unsupervised learning. The model uses competitive learning rules and dynamic neurons with information saturation to achieve parameter adjustment and adaptive structure development. Dynamic neurons adjust their information saturation after winning the competition and use this parameter to modulate both the neuron parameter adjustment and the division timing. By dividing to generate new neurons, the network not only stays sensitive to novel features but can also subdivide classes that are learnt repeatedly. The dynamic neurons with information saturation and the division mechanism can simulate the long short-term memory of the human brain, which enables the network to continually learn new samples while maintaining previous learning results. The parent-child relationship between neurons arising from neuronal division enables the network to simulate the human cognitive process that gradually refines the perception of objects. By setting the clustering layer parameter, users can choose the desired degree of class subdivision. Experimental results on artificial and real-world datasets demonstrate that the proposed model is feasible for unsupervised learning tasks in instance-increment and class-increment scenarios and outperforms prior structural developmental neural networks.
Abstract: This paper reports distinct spatio-spectral properties of Zen-meditation EEG (electroencephalograph), compared with resting EEG, by implementing an unsupervised machine learning scheme that clusters the brain mappings of centroid frequency (BMFc). Zen practitioners simultaneously concentrate on the third ventricle, hypothalamus, and corpora quadrigemina to universalize all brain neurons to construct a detached brain and gradually change the normal brain traits, leading to the process of brain neuroplasticity. During such tri-aperture concentration, EEG exhibits prominent diffuse high-frequency oscillations. An unsupervised self-organizing map (SOM) clusters the quantitative EEG dataset by matching the input feature vector Fc and the output cluster center through the SOM network weights. The input dataset contains brain mappings of 30 centroid frequencies extracted from CWT (continuous wavelet transform) coefficients. According to the SOM clustering results, resting EEG is dominated by global low-frequency (<14 Hz) activities, except at channels T7, F7, and TP7 (>14.4 Hz), whereas Zen-meditation EEG exhibits globally high-frequency (>16 Hz) activities throughout the entire record. Beta waves with a wide range of frequencies are often associated with active concentration. Nonetheless, clinical reports disclose that benzodiazepines, medications used to treat anxiety, insomnia, and panic attacks by relieving mind/body stress, often induce beta buzz. We may hypothesize that Zen-meditation practitioners attain a unique state of mindful concentration under optimal body-mind relaxation.
Funding: Supported by the National Natural Science Foundation of China (No. 62471152).
Abstract: Hybrid beamforming is a promising technique for massive multiple-input multiple-output (MIMO) systems because it reduces hardware cost and power consumption while improving system performance, and it is considered a key enabler for fifth-generation and beyond communications. However, acquiring perfect channel state information is challenging because the hybrid precoder design uses phase shifters in the analog domain. In this paper, we investigate the spectral efficiency (SE) of a millimeter-wave hybrid massive MIMO system in which the precoding is optimized with unsupervised learning. To maximize the system's SE, this paper proposes an efficient DNN-LSTM-Res network (named DNN-LSTM-ResNet), which incorporates a deep neural network (DNN), a long short-term memory (LSTM) method, and a residual network (ResNet), where the neural network is treated as the basic network and the residual neural network is used to address the degradation problem. Finally, numerical simulations are provided to validate the effectiveness of the proposed DNN-LSTM-ResNet for hybrid massive MIMO systems. The results show that the proposed network greatly decreases the number of training parameters and improves the system's SE by approximately 20% compared with conventional algorithms. Moreover, the proposed DNN-LSTM-ResNet converges quickly and has strong search capability.
Funding: Supported in part by the National Natural Science Foundation of China under Grants 32171909, 52205254, and 32301704; the Guangdong Basic and Applied Basic Research Foundation under Grants 2023A1515011255 and 2024A1515010199; the Scientific Research Projects of Universities in Guangdong Province under Grants 2024ZDZX1042 and 2024ZDZX3057; and the Ji-Hua Laboratory Open Project under Grant X220931UZ230.
Abstract: Defect detection based on computer vision is a critical component in ensuring the quality of industrial products. However, existing detection methods encounter several challenges in practical applications, including the scarcity of labeled samples, the limited adaptability of pre-trained models, and data heterogeneity in distributed environments. To address these issues, this research proposes an unsupervised defect detection method, FLAME (Federated Learning with Adaptive Multi-Model Embeddings). The method comprises three stages: (1) Feature learning stage: this work proposes FADE (Feature-Adaptive Domain-Specific Embeddings), a framework that employs Gaussian noise injection to simulate defective patterns and implements a feature discriminator for defect detection, thereby enhancing the pre-trained model's ability to represent industrial imagery. (2) Knowledge distillation co-training stage: a multi-model feature knowledge distillation mechanism is introduced. Through feature-level knowledge transfer between the global model and historical local models, the current local model is guided to learn better feature representations from the global model. The approach prevents local models from converging to local optima and mitigates performance degradation caused by data heterogeneity. (3) Model parameter aggregation stage: participating clients use weighted averaging aggregation to synthesize an updated global model, facilitating efficient knowledge consolidation. Experimental results demonstrate that FADE improves the average image-level Area Under the Receiver Operating Characteristic Curve (AUROC) by 7.34% compared with methods that directly use pre-trained models. In federated learning environments, FLAME's multi-model feature knowledge distillation mechanism outperforms the classic FedAvg algorithm by 2.34% in average image-level AUROC, while exhibiting superior convergence properties.
Funding: National Natural Science Foundation of China, Grant/Award Numbers: 22176221, 51763010, 51963011; Central Public-interest Scientific Institution Basal Research Fund (CAFS), Grant/Award Number: 2020TD75; Jiangxi Provincial Double Thousand Talents Plan-Youth Program, Grant/Award Number: JXSQ2019201108; Jiangxi Key Laboratory of Flexible Electronics, Grant/Award Number: 20212BCD42004; National.
Abstract: Highly stretchable and robust strain sensors are rapidly emerging as promising candidates for a diverse range of wearable electronics. The main challenges for the practical application of wearable electronics are energy consumption and device aging. Energy consumption mainly depends on the conductivity of the sensor, and it is a key factor in determining device aging. Here, we design a liquid metal (LM)-embedded hydrogel as a sensing material to overcome the barrier of energy consumption and device aging in wearable electronics. The sensing material simultaneously exhibits high conductivity (up to 22 S m⁻¹), low elastic modulus (23 kPa), and ultrahigh stretchability (1500%) with excellent robustness (consistent performance over 12000 mechanical cycles). A motion monitoring system is composed of the intrinsically soft LM-embedded hydrogel as the sensing material, a microcontroller, signal-processing circuits, a Bluetooth transceiver, and software based on a self-organizing map for the visualization of multi-dimensional data. This system, which integrates multiple functions including signal conditioning, processing, and wireless transmission, achieves hand-gesture monitoring as well as sign-to-verbal translation. This approach provides an ideal strategy for deaf-mute individuals to communicate with others and broadens the application of wearable electronics.
Funding: This research was fully supported by Universiti Teknologi PETRONAS under the Yayasan Universiti Teknologi PETRONAS (YUTP) Fundamental Research Grant Scheme (YUTP-015LC0-123).
Abstract: Anomaly detection in high-dimensional data is a critical research issue with serious implications for real-world problems. Many issues in this field remain unsolved, and several modern anomaly detection methods struggle to maintain adequate accuracy due to the highly descriptive nature of big data. This phenomenon, referred to as the "curse of dimensionality", affects traditional techniques in terms of both accuracy and performance. This research therefore proposes a hybrid model based on a Deep Autoencoder Neural Network (DANN) with five layers to reduce the difference between the input and output. The proposed model was applied to a real-world gas turbine (GT) dataset that contains 87620 columns and 56 rows. During the experiment, two issues were investigated and solved to enhance the results. The first is the dataset class imbalance, which was solved using the SMOTE technique. The second is poor performance, which can be solved using an optimization algorithm. Several optimization algorithms were investigated and tested, including stochastic gradient descent (SGD), RMSprop, Adam, and Adamax. The Adamax optimization algorithm showed the best results when employed to train the DANN model. The experimental results show that the proposed model can detect anomalies by efficiently reducing the high dimensionality of the dataset, with an accuracy of 99.40%, an F1-score of 0.9649, an Area Under the Curve (AUC) of 0.9649, and a minimal loss during the hybrid model training.
Funding: Supported in part by the National Key Research and Development Program of China (2020YFB1313002), the National Natural Science Foundation of China (62276023, U22B2055, 62222302, U2013202), the Fundamental Research Funds for the Central Universities (FRF-TP-22-003C1), and the Postgraduate Education Reform Project of Henan Province (2021SJGLX260Y).
Abstract: Underwater image enhancement aims to restore a clean appearance and thus improve the quality of degraded underwater images. Current methods feed the whole image directly into the model for enhancement. However, they ignore that the R, G, and B channels of degraded underwater images present varied degrees of degradation due to the selective absorption of light. To address this issue, we propose an unsupervised multi-expert learning model that considers the enhancement of each color channel. Specifically, an unsupervised architecture based on a generative adversarial network is employed to alleviate the need for paired underwater images. Based on this, we design a generator, including a multi-expert encoder, a feature fusion module, and a feature fusion-guided decoder, to generate the clear underwater image. Accordingly, a multi-expert discriminator is proposed to verify the authenticity of the R, G, and B channels, respectively. In addition, a content perceptual loss and an edge loss are introduced into the loss function to further improve the content and details of the enhanced images. Extensive experiments on public datasets demonstrate that our method achieves more pleasing results in visual quality. Various metrics (PSNR, SSIM, UIQM, and UCIQE) evaluated on our enhanced images show clear improvement.
Funding: Supported by the National Natural Science Foundation of China (T2225008, 12075128, 11905108), with support from the Shanghai Qi Zhi Institute.
Abstract: Classifying topological phases of matter with strong interactions is a notoriously challenging task and has attracted considerable attention in recent years. In this paper, we propose an unsupervised machine learning approach that can classify a wide range of symmetry-protected interacting topological phases directly from experimental observables and without a priori knowledge. We analytically show that Green's functions, which can be derived from spectral functions that can be measured directly in an experiment, are suitable as input data for our learning proposal based on the diffusion map. As a concrete example, we consider a one-dimensional interacting topological insulator model and show, through extensive numerical simulations, that our diffusion map approach works as desired. In addition, we put forward a generic scheme to measure the spectral functions in ultracold atomic systems through momentum-resolved Raman spectroscopy. Our work circumvents the costly diagonalization of the system Hamiltonian and provides a versatile protocol for the straightforward and autonomous identification of interacting topological phases from experimental observables in an unsupervised manner.
Funding: Supported by the National Natural Science Foundation of China (No. 52376163) and the National Key Laboratory of Science and Technology on Aerodynamic Design and Research (No. 614220121050327).
Abstract: The development of deep learning has inspired new methods to solve the 3D reconstruction problem for Tomographic Particle Image Velocimetry (Tomo-PIV). However, supervised learning requires a large amount of data with ground truth as training information, which is very difficult to gather from experiments. Although synthetic datasets can be used as alternatives, they are still not exactly the same as real-world experimental data. In this paper, an Unsupervised Reconstruction Technique based on U-net (UnRTU) is proposed to reconstruct the volume particle distribution explicitly. Instead of using ground truth data, a projection function is used as an unsupervised loss function for network training to reconstruct the particle distribution. UnRTU was compared with several traditional algebraic reconstruction algorithms and a supervised learning method using synthetic data under different particle densities and noise levels. The results indicate that UnRTU outperforms the traditional approaches in both reconstruction quality and noise robustness, and is comparable to the supervised learning method AI-PR. For experimental tests, particles dispersed in cured epoxy resin are moved by an electric rail at a set speed to obtain ground truth data of particle velocity. Compared with other algorithms, the particle distribution reconstructed by UnRTU has the best reconstruction fidelity, and the accuracy of the 3D velocity field estimated by UnRTU is 12.9% higher than that from the traditional MLOS-MART algorithm. This demonstrates significant potential and advantages for UnRTU in 3D reconstruction of particle distributions. Finally, UnRTU was successfully applied to a high-speed planar cascade airflow field, demonstrating its applicability for measuring complex fluid flow fields at higher particle density.
Funding: Supported by the project "ZeDaBase-Batteriezelldatenbank" of the Initiative and Networking Fund of the Helmholtz Association (KW-BASF-6).
Abstract: Machine learning (ML) is a rapidly growing tool in the lithium-ion battery (LIB) research field, and more and more datasets have been published to make use of it. However, the applicability of an ML model to different information sources or various LIB cell types has not been well studied. In this paper, an unsupervised learning model called the variational autoencoder (VAE) is evaluated with three datasets of charge-discharge cycles recorded under different conditions. The model was first trained with a publicly available dataset of commercial cylindrical cells, and then evaluated with our private datasets of commercial pouch cells and hand-made coin cells. These cells used different chemistries and were tested with different cycle testers for different purposes, which gives each dataset distinct characteristics. We report that researchers can recognise these characteristics with a VAE in order to plan proper data preprocessing. We also discuss the interpretability of an ML model.
Abstract: Unsupervised learning algorithms can effectively handle sample imbalance. To address battery consistency anomalies in new energy vehicles, we adopt a variety of unsupervised learning algorithms to evaluate and predict the battery consistency of three vehicles using charging fragment data from actual operating conditions. We extract battery-related features, such as the mean of the maximum difference, the standard deviation, and the entropy of the batteries, and then apply principal component analysis to reduce the dimensionality and record the amount of preserved information. We then build models with a collection of unsupervised learning algorithms for the anomaly detection of cell consistency faults. We also determine whether unsupervised and supervised learning algorithms can address the battery consistency problem and document the parameter tuning process. In addition, we compare the prediction effectiveness of charging and discharging features modeled individually and in combination, determine which charging and discharging features to model in combination, and visualize the multidimensional data for fault detection. Experimental results show that the unsupervised learning algorithms are effective in visualizing and predicting vehicle cell consistency faults, and can accurately predict faults in real time. The "distance+boxplot" algorithm shows the best performance, with a prediction accuracy of 80%, a recall rate of 100%, and an F1 of 0.89. The proposed approach can be applied to monitor battery consistency faults in real time and reduce the possibility of disasters arising from consistency faults.
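The abstract does not spell out how the "distance+boxplot" algorithm works. The sketch below shows one plausible reading, offered only as an assumption: measure each cell's Euclidean distance from the per-feature median, then flag cells beyond the standard boxplot upper fence (Q3 + 1.5·IQR). The names `boxplot_outliers` and `f1_score` and the six toy feature rows are hypothetical. The second function checks the reported metrics under the further assumption that "prediction accuracy" means precision.

```python
import numpy as np

def boxplot_outliers(features, whisker=1.5):
    """Flag rows whose distance from the feature-wise median exceeds the boxplot upper fence."""
    features = np.asarray(features, dtype=float)
    center = np.median(features, axis=0)
    distance = np.linalg.norm(features - center, axis=1)
    q1, q3 = np.percentile(distance, [25, 75])
    upper_fence = q3 + whisker * (q3 - q1)
    return distance > upper_fence

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Made-up per-cell features (e.g. mean voltage, voltage spread); the last row is the odd one out.
cells = np.array([
    [3.70, 0.01], [3.71, 0.01], [3.69, 0.02],
    [3.70, 0.02], [3.72, 0.01], [3.20, 0.30],
])
flags = boxplot_outliers(cells)
reported_f1 = f1_score(0.80, 1.00)   # about 0.889, consistent with the reported 0.89
```

With precision 0.80 and recall 1.00 the harmonic mean is 1.6/1.8 ≈ 0.889, which matches the stated F1 of 0.89 and supports reading "prediction accuracy" as precision.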
Funding: Natural Science Foundation of Zhejiang Province, Grant/Award Number: LY21F030003; National Key Research and Development Program of China, Grant/Award Number: 2019YFB1705800; National Natural Science Foundation of China, Grant/Award Number: 61973270.
Abstract: Particle image velocimetry (PIV) is an essential method in experimental fluid dynamics. In recent years, the development of deep learning-based methods has inspired new approaches to the PIV problem, which considerably improve the accuracy of PIV. However, supervised learning of PIV is driven by large volumes of data with ground truth information. Therefore, the authors consider unsupervised PIV methods. There has been some work on unsupervised PIV, but these methods are not nearly as effective as supervised learning PIV. The authors try to improve the effectiveness and accuracy of unsupervised PIV by adding classical PIV methods and physical constraints. In this paper, the authors propose an unsupervised PIV method combined with the cross-correlation method and a divergence-free constraint, which obtains better performance than other unsupervised PIV methods. The authors compare some classical PIV methods and some deep learning methods, such as LiteFlowNet, LiteFlowNet-en, and UnLiteFlowNet, with the authors' model on a synthetic dataset. In addition, the authors contrast the results of LiteFlowNet, UnLiteFlowNet, and the authors' model on experimental particle images. As a result, the authors' model shows performance comparable to classical PIV methods as well as supervised PIV methods and outperforms the previous unsupervised PIV method in most flow cases.
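A divergence-free constraint, as mentioned in the abstract above, is commonly expressed as a penalty on ∂u/∂x + ∂v/∂y of the estimated 2D velocity field. The following is a minimal sketch using finite differences from `np.gradient`; the function name `divergence_penalty` and the two toy flow fields are assumptions, and the abstract does not state how the authors weight or discretize this term.

```python
import numpy as np

def divergence_penalty(u, v, dx=1.0, dy=1.0):
    """Mean squared divergence of a 2D velocity field (u along x, v along y)."""
    du_dx = np.gradient(u, dx, axis=1)   # axis 1 is the x (column) direction
    dv_dy = np.gradient(v, dy, axis=0)   # axis 0 is the y (row) direction
    return np.mean((du_dx + dv_dy) ** 2)

y, x = np.mgrid[0:16, 0:16].astype(float)

# Rigid rotation is incompressible, so its divergence is zero.
rotation_penalty = divergence_penalty(-y, x)

# A radial source (u = x, v = y) has divergence 2 everywhere, so the penalty is 4.
source_penalty = divergence_penalty(x, y)
```

In an unsupervised training loop such a term would be added to the photometric loss so that physically implausible, compressible flow estimates are discouraged.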
Funding: Support from the Louisiana Department of Transportation and Development (LADOTD), which supplied the database used in this study.
Abstract: Pedestrian crashes at high-speed locations are a persistent road safety concern. Driving at high speeds means that the driver has less time to react and make evasive maneuvers to avoid a pedestrian crash. In addition, other crash-contributing factors, such as humans (pedestrians or drivers), vehicles, roadways, and surrounding environmental factors, actively interact to cause a crash at high-speed locations. The pattern of pedestrian crashes also differs significantly between high-speed intersection and segment locations, which requires further investigation. This study applied association rules mining (ARM), an unsupervised learning algorithm, to reveal the hidden associations of pedestrian crash risk factors at high-speed intersections and segments separately. The study used Louisiana pedestrian fatal and injury crash data (2010 to 2019). Any crash location with a posted speed limit of 45 mph or above is classified as a high-speed location. Based on the generated association rules, the results show that pedestrian crashes at high-speed intersections are associated with the intersection geometry (3-leg) and control (1 stop, no traffic control device), driver characteristics (careless operation, failure to yield, inattentive-distracted, older, and younger drivers), pedestrian-related factors (violations, alcohol/drug involvement), settings (open country, residential, business, industrial), dark lighting conditions, and so on. Most pedestrian crashes at high-speed segments are associated with roadways with no physical separation, dark-no-streetlight conditions, open country locations, interstates, and so on. The findings of the study may help to select appropriate countermeasures to reduce pedestrian crashes at high-speed locations.
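Association rules are usually ranked by support, confidence, and lift. The sketch below computes the three standard measures for a single rule in plain Python. The four crash records are invented for illustration and are not drawn from the Louisiana data; `rule_metrics` is a hypothetical helper, not the study's tooling.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(antecedent <= t for t in transactions)                  # records containing the antecedent
    n_c = sum(consequent <= t for t in transactions)                  # records containing the consequent
    n_ac = sum((antecedent | consequent) <= t for t in transactions)  # records containing both
    support = n_ac / n
    confidence = n_ac / n_a          # assumes the antecedent occurs at least once
    lift = confidence / (n_c / n)    # lift > 1 means the items co-occur more than chance predicts
    return support, confidence, lift

# Toy, made-up crash records; each record is a set of categorical attributes.
crashes = [
    {"dark_no_streetlight", "open_country", "segment"},
    {"dark_no_streetlight", "open_country", "segment"},
    {"daylight", "business", "intersection"},
    {"dark_no_streetlight", "residential", "segment"},
]
support, confidence, lift = rule_metrics(crashes, {"dark_no_streetlight"}, {"segment"})
```

On this toy data the rule holds in 3 of 4 records (support 0.75) and in every record with the antecedent (confidence 1.0), giving a lift of about 1.33.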
Funding: Supported in part by the National Natural Science Foundation of China under Grants 62303090 and U2330206, in part by the Postdoctoral Science Foundation of China under Grant 2023M740516, in part by the Natural Science Foundation of Sichuan Province under Grant 2024NSFSC1480, and in part by the New Cornerstone Science Foundation through the XPLORER PRIZE.
Abstract: Reliable electricity infrastructure is critical for modern society, which highlights the importance of securing the stability of fundamental power electronic systems. However, because such systems frequently involve high-current and high-voltage conditions, failures are more likely. Anomaly detection in power electronic systems is therefore of great significance, and it is a task that properly designed neural networks can handle well, as proven in various scenarios. Transformer-like networks are promising for this application, yet because their structure was initially designed for different tasks, features extracted by the early layers are often lost, which decreases detection performance. Such data-driven methods also typically require sufficient anomalous data for training, which can be difficult to obtain in practice. Therefore, to improve feature utilization while achieving efficient unsupervised learning, this paper proposes a novel model, the Densely-connected Decoder Transformer (DDformer), for unsupervised anomaly detection in power electronic systems. First, efficient label-free training is achieved based on the concept of an autoencoder with recursive-free output. An encoder-decoder structure with a densely connected decoder is then adopted, merging features from all encoder layers to avoid possible loss of mined features while reducing training difficulty. Both simulation and real-world experiments are conducted to validate the capabilities of DDformer; the average FDR surpasses that of the baseline models, reaching 89.39%, 93.91%, and 95.98% in three different experimental setups, respectively.
Abstract: Side-channel analysis (SCA) has emerged as a research hotspot in the field of cryptanalysis. Among various approaches, unsupervised deep learning-based methods demonstrate powerful information extraction capabilities without requiring labeled data. However, existing unsupervised methods, particularly differential deep learning analysis (DDLA) and its improved variants, overcome the dependency on labeled data inherent in template analysis but still suffer from high time complexity and training costs when handling key byte difference comparisons. To address this issue, this paper introduces invariant information clustering (IIC) into SCA for the first time and proposes a novel unsupervised learning-based SCA method, named IIC-SCA. By leveraging mutual information maximization techniques for automatic feature extraction from power leakage data, the approach achieves key recovery through a single training session, eliminating the prohibitive computational overhead of traditional methods that require separate training for all possible key bytes. Experimental results on the ASCAD dataset demonstrate successful key extraction using only 50000 training traces and 2000 attack traces. Furthermore, compared with DDLA, the proposed method reduces training time by approximately 93.40% and memory consumption by about 6.15%, significantly decreasing the time and resource costs of unsupervised SCA. This breakthrough provides new insights for developing low-cost, high-efficiency cryptographic attack methodologies.
Abstract: Performing high-resolution stratigraphic analysis may be challenging and time-consuming when working with large datasets. Moreover, sedimentary records contain signals of different frequencies and intrinsic noise, resulting in a complex signature that is difficult to identify through visual analysis alone. This work proposes identifying transgressive-regressive (T-R) sequences from carbonate facies successions of three South American basins: (i) São Francisco Basin, Brazil; (ii) Santos Basin, Brazil; and (iii) Salta Basin, Argentina. We applied a hidden Markov model in an unsupervised approach, followed by a Score-Based Recommender System that automatically finds medium- or low-frequency sedimentary cycles from high-frequency ones. Our method is applied to facies identified using Fullbore Formation Microimager (FMI) logs, outcrop descriptions, and composite logs from carbonate intervals. The automatic recommendation results showed better long-distance correlations between medium- to low-frequency sedimentary cycles, whereas the hidden Markov model method successfully identified high-resolution (high-frequency) transgressive and regressive systems tracts from the given facies successions. Our workflow offers advances in the automated analysis and construction of lower- to higher-rank stratigraphic frameworks and in short- to long-distance stratigraphic correlation, allowing for large-scale automated processing of basin datasets. Our approach fits the unsupervised learning framework, as it requires no previous input of stratigraphic analysis in the basin. The results provide solutions for prospecting any sediment-hosted mineral resource, especially for the oil and gas industry, offering support for subsurface geological characterization, whether at the exploration scale or for reservoir zoning during production development.
Funding: Funded by the Yangtze River Delta Science and Technology Innovation Community Joint Research Project (2023CSJGG1600), the Natural Science Foundation of Anhui Province (2208085MF173), and the Wuhu "ChiZhu Light" Major Science and Technology Project (2023ZD01, 2023ZD03).
Abstract: In dynamic scenes for autonomous vehicles, monocular depth estimation often suffers from inaccurate depth at object edges. To solve this problem, we propose an unsupervised monocular depth estimation model based on edge enhancement, aimed specifically at the depth perception challenge in dynamic scenes. The model consists of two core networks, a depth prediction network and a motion estimation network, both of which adopt an encoder-decoder architecture. The depth prediction network is based on a U-Net structure with ResNet18 and is responsible for generating the depth map of the scene. The motion estimation network is based on a U-Net structure with Flow-Net and focuses on the motion estimation of dynamic targets. In the decoding stage of the motion estimation network, we introduce an edge-enhanced decoder, which integrates a convolutional block attention module (CBAM) into the decoding process to enhance recognition of the edge features of moving objects. In addition, we design a strip convolution module to improve the model's efficiency in capturing discrete moving targets. To further improve performance, we propose a novel edge regularization method based on the Laplace operator, which effectively accelerates the convergence of the model. Experimental results on the KITTI and Cityscapes datasets show that, compared with the current advanced dynamic unsupervised monocular model, the proposed model significantly improves depth estimation accuracy and convergence speed. Specifically, the root mean square error (RMSE) is reduced by 4.8% compared with the DepthMotion algorithm, while the training convergence speed is increased by 36%, which shows the superior performance of the model in the depth estimation task in dynamic scenes.
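The Laplace operator mentioned above responds strongly where a depth map changes abruptly, which is why it is a natural building block for an edge-aware term. The sketch below applies the common 5-point discrete Laplacian to a toy depth map with NumPy. It only shows the operator itself; how the authors turn it into a regularization loss is not described in the abstract, and the function name `laplacian` and the step-edge example are assumptions.

```python
import numpy as np

def laplacian(depth):
    """5-point discrete Laplacian with replicated borders."""
    padded = np.pad(depth, 1, mode="edge")
    return (padded[:-2, 1:-1] + padded[2:, 1:-1]      # neighbors above and below
            + padded[1:-1, :-2] + padded[1:-1, 2:]    # neighbors left and right
            - 4.0 * depth)

# Toy depth map: a flat region at depth 1 next to a flat region at depth 2.
depth = np.ones((5, 5))
depth[:, 3:] = 2.0

edges = np.abs(laplacian(depth))   # nonzero only in the two columns adjacent to the step
```

The response is zero inside each flat region and one on both sides of the depth step, so summing or weighting by it concentrates attention on object boundaries.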
Funding: Supported by the Hebei Provincial Natural Science Foundation of China (Grant No. F2016203421).
Abstract: The performance of traditional vibration-based fault diagnosis methods depends greatly on hand-crafted features extracted using signal processing algorithms, which require significant amounts of domain knowledge and human labor and do not generalize well to new diagnosis domains. Recently, unsupervised representation learning has provided a promising alternative for feature extraction in traditional fault diagnosis, owing to its superior ability to learn from unlabeled data. Given that vibration signals usually contain multiple temporal structures, this paper proposes a multiscale representation learning (MSRL) framework to learn useful features directly from raw vibration signals, with the aim of capturing rich and complementary fault pattern information at different scales. In the proposed approach, a coarse-grained procedure is first employed to obtain multiple scale signals from an original vibration signal. Then, sparse filtering, a newly developed unsupervised learning algorithm, is applied to automatically learn useful features from each scale signal, and the learned features at each scale are concatenated one by one to obtain multiscale representations. Finally, the multiscale representations are fed into a supervised classifier to obtain diagnosis results. The proposed approach is evaluated using two case studies: motor bearing and wind turbine gearbox fault diagnosis. Experimental results show that the proposed MSRL approach can take full advantage of the availability of unlabeled data to learn discriminative features and achieves better performance, with higher accuracy and stability, than traditional approaches.
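The coarse-grained procedure in the abstract above is not defined there. A common definition, borrowed from multiscale entropy analysis and assumed here, averages the signal over consecutive non-overlapping windows of length equal to the scale factor. The names `coarse_grain` and `multiscale_signals` and the 12-sample ramp are illustrative only; the sparse filtering and classifier stages are not sketched.

```python
import numpy as np

def coarse_grain(signal, scale):
    """Average consecutive non-overlapping windows of length `scale` (trailing remainder dropped)."""
    signal = np.asarray(signal, dtype=float)
    n_windows = signal.size // scale
    return signal[: n_windows * scale].reshape(n_windows, scale).mean(axis=1)

def multiscale_signals(signal, max_scale):
    """Return the coarse-grained signal at every scale from 1 to max_scale."""
    return [coarse_grain(signal, s) for s in range(1, max_scale + 1)]

signal = np.arange(12.0)                  # toy stand-in for a raw vibration signal
scales = multiscale_signals(signal, 3)    # lengths 12, 6, and 4
```

Each scale would then be passed to its own unsupervised feature learner, and the per-scale features concatenated as the abstract describes.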