3D sparse convolution has emerged as a pivotal technique for efficient voxel-based perception in autonomous systems, enabling selective feature extraction from non-empty voxels while suppressing computational waste. Despite its theoretical efficiency advantages, practical implementations face under-explored limitations: the fixed geometric patterns of conventional sparse convolutional kernels inevitably process non-contributory positions during sliding-window operations, particularly in regions with uneven point cloud density. To address this, we propose Hierarchical Shape Pruning for 3D Sparse Convolution (HSP-S), which dynamically eliminates redundant kernel stripes through layer-adaptive thresholding. Unlike static soft pruning methods, HSP-S maintains trainable sparsity patterns by progressively adjusting pruning thresholds during optimization, enlarging the original parameter search space while removing redundant operations. Extensive experiments validate the effectiveness of HSP-S across major autonomous driving benchmarks. On KITTI's 3D object detection task, our method removes 93.47% of redundant kernel computations while maintaining comparable accuracy (1.56% mAP drop). Remarkably, on the more complex nuScenes benchmark, HSP-S achieves simultaneous computation reduction (21.94% sparsity) and accuracy gains (1.02% mAP (mean Average Precision) and 0.47% NDS (nuScenes detection score) improvement), demonstrating its scalability to diverse perception scenarios. This work establishes the first learnable shape pruning framework that simultaneously enhances computational efficiency and preserves detection accuracy in 3D perception systems.
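As a rough illustration of why skipping kernel positions saves work, the following minimal sketch runs a sparse convolution over a dictionary of non-empty voxels and counts multiply-accumulates with and without a kernel mask. It is not the HSP-S implementation: the function `sparse_conv3d`, the `active_offsets` mask, the scalar per-offset weights, and the choice of an x-axis "stripe" are all hypothetical simplifications.

```python
from itertools import product


def sparse_conv3d(voxels, weights, active_offsets=None):
    """Sparse convolution evaluated only at non-empty voxels.

    voxels:  {(x, y, z): float}, features of non-empty voxels only.
    weights: {(dx, dy, dz): float}, one scalar weight per kernel offset.
    active_offsets: kernel offsets kept after pruning (hypothetical mask);
                    None keeps the full kernel.
    Returns (output dict, number of multiply-accumulates performed).
    """
    if active_offsets is None:
        offsets = list(weights)
    else:
        offsets = [o for o in weights if o in active_offsets]
    out, macs = {}, 0
    for (x, y, z) in voxels:                  # outputs only at non-empty sites
        acc = 0.0
        for (dx, dy, dz) in offsets:
            neighbor = (x + dx, y + dy, z + dz)
            if neighbor in voxels:            # empty neighbors cost nothing
                acc += weights[(dx, dy, dz)] * voxels[neighbor]
                macs += 1
        out[(x, y, z)] = acc
    return out, macs


# Full 3x3x3 kernel with every weight set to 1.0 (illustrative values).
kernel = {o: 1.0 for o in product((-1, 0, 1), repeat=3)}
voxels = {(0, 0, 0): 1.0, (1, 0, 0): 2.0, (0, 1, 0): 4.0, (5, 5, 5): 3.0}

full_out, full_macs = sparse_conv3d(voxels, kernel)

# Assumed pruning result: keep only the three offsets along the x axis.
stripe = {(-1, 0, 0), (0, 0, 0), (1, 0, 0)}
pruned_out, pruned_macs = sparse_conv3d(voxels, kernel, stripe)

print(full_macs, pruned_macs)
```

In this toy scene the masked kernel performs fewer multiply-accumulates than the full kernel, at the cost of ignoring the neighbor in the y direction. A learnable scheme would have to decide which offsets are safe to drop.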
Deep neural networks play an important role in the recognition of waste electrical appliances. However, deep neural network components still lack reliability in decision-making features. To address this problem, a sparse convolutional model with semantic expression (SCMSE) is proposed. First, a low-rank sparse semantic expression component, combining the benefits of residual networks and sparse representation, is adapted to enhance sparse feature extraction and semantic expression. Second, a reliable network architecture is obtained by iterating the optimal sparse solution, enhancing semantic expression. Finally, the results of visualization experiments on the waste electrical appliances dataset demonstrate that the proposed SCMSE can obtain excellent semantic performance.
Multi-source information can be obtained by fusing infrared and visible-light images, which carry complementary information. However, existing fusion methods suffer from blurred edges, low contrast, and loss of detail. Based on convolutional sparse representation and an improved pulse-coupled neural network, this paper proposes an image fusion algorithm that decomposes the source images into high-frequency and low-frequency subbands using the non-subsampled shearlet transform (NSST). The low-frequency subbands are fused by convolutional sparse representation (CSR), and the high-frequency subbands are fused by an improved pulse-coupled neural network (IPCNN) algorithm, which effectively addresses the difficulty of setting parameters in the traditional PCNN algorithm and improves the performance of sparse representation with detail injection. The results show that the proposed method outperforms existing mainstream fusion algorithms in terms of visual effects and objective indicators.
Because of strong background noise and acquisition-system noise, useful signal characteristics are often difficult to detect. To address this problem, sparse coding captures a concise representation of the high-level features in a signal by exploiting its underlying structure. Recently, an Online Convolutional Sparse Coding (OCSC) denoising algorithm has been proposed. However, it does not consider the structural characteristics of the signal, and the sparsity achieved in each iteration is insufficient. Therefore, a threshold shrinkage algorithm that considers neighborhood sparsity is proposed, and a loose-to-tight training strategy is developed to further improve denoising performance; the resulting method is called Variable Threshold Neighborhood Online Convolution Sparse Coding (VTNOCSC). By embedding the structural sparse threshold shrinkage operator into the sparse-coefficient solver and gradually approaching the optimal noise separation point during training, the signal denoising performance of the algorithm is greatly improved. When VTNOCSC is applied to real bearing fault signals, noise interference is successfully reduced and the features of interest become more evident. Compared with other existing methods, VTNOCSC has better denoising performance.
Image fusion based on sparse representation (SR) has become the primary research direction among transform-domain methods. However, SR-based image fusion algorithms have high computational complexity and neglect the local features of an image, resulting in limited detail retention and high sensitivity to registration misalignment. To overcome these shortcomings and the noise present in the images during fusion, this paper proposes a new signal decomposition model, namely a multi-source image fusion algorithm based on gradient-regularized convolutional SR (CSR). The main innovation of this work is using a sparse optimization function to perform two-scale decomposition of the source image into high-frequency and low-frequency components. The sparse coefficients are obtained by the gradient-regularized CSR model, and the maximum sparse coefficient is taken to obtain the optimal high-frequency component of the fused image. The best low-frequency component is obtained using a fusion strategy based on the extreme or the average value. The final fused image is obtained by adding the two optimal components. Experimental results demonstrate that this method greatly improves the ability to preserve image details and reduces sensitivity to image registration.
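The two-scale fusion rule described above (average the low-frequency parts, keep the larger-magnitude high-frequency part, then add the two) can be sketched on 1-D signals as follows. This is a simplified analog under stated assumptions: a clamped moving average stands in for the paper's sparse-optimization decomposition, and the max rule is applied to detail samples directly rather than to CSR coefficients. The names `moving_average` and `fuse_two_scale` are hypothetical.

```python
def moving_average(signal, radius=1):
    """Low-pass filter with edge clamping (assumed stand-in for the
    sparse-optimization decomposition used in the paper)."""
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def fuse_two_scale(a, b, radius=1):
    """Fuse two registered signals of equal length.

    Base (low-frequency) layers are averaged; detail (high-frequency)
    layers are fused by keeping the sample with the larger magnitude.
    """
    base_a, base_b = moving_average(a, radius), moving_average(b, radius)
    detail_a = [x - m for x, m in zip(a, base_a)]
    detail_b = [x - m for x, m in zip(b, base_b)]
    base = [(p + q) / 2 for p, q in zip(base_a, base_b)]
    detail = [p if abs(p) >= abs(q) else q
              for p, q in zip(detail_a, detail_b)]
    return [m + d for m, d in zip(base, detail)]


flat = [1.0, 1.0, 1.0, 1.0, 1.0]   # source with no detail
edge = [1.0, 1.0, 4.0, 1.0, 1.0]   # source with a sharp feature
fused = fuse_two_scale(flat, edge)
print(fused)
```

The sharp feature from the second source survives in the fused signal because its detail layer dominates the max rule, while the averaged base layer lowers the surrounding level.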
Multipath signal recognition is crucial to the ability of the BeiDou Navigation Satellite System (BDS) to provide high-precision absolute-position services. However, most existing approaches to this issue involve supervised machine learning (ML) methods, and it is difficult to move to unsupervised multipath signal recognition because of the limitations in signal labeling. Inspired by the powerful unsupervised feature extraction of autoencoders, we propose a new deep learning (DL) model for BDS signal recognition that places a long short-term memory (LSTM) module in series with a convolutional sparse autoencoder to create a new autoencoder structure. First, we capture the temporal correlations in long-duration BeiDou satellite time-series signals by using the LSTM module to mine the temporal change patterns in the time series. Second, we develop a convolutional sparse autoencoder method that learns a compressed representation of the input data, which then enables downscaled and unsupervised feature extraction from long-duration BeiDou satellite series signals. Finally, we add an l_{1/2} regularizer to the objective function of our DL model to remove redundant neurons from the neural network while ensuring recognition accuracy. We tested our proposed approach on a real urban canyon dataset, and the results demonstrated that our algorithm could achieve better classification performance than two ML-based methods (e.g., 11% better than a support vector machine) and two existing DL-based methods (e.g., 7.26% better than convolutional neural networks).
Enhancing seismic resolution is a key component in seismic data processing, which plays a valuable role in raising the prospecting accuracy of oil reservoirs. However, in noisy situations, existing resolution enhancement methods struggle to yield satisfactory processing outcomes for reservoir characterization. To solve this problem, we develop a new approach for simultaneous denoising and resolution enhancement of seismic data based on convolution dictionary learning. First, an elastic convolution dictionary learning algorithm is presented to efficiently learn a convolution dictionary with stronger representation capability from the noisy data to be processed. Specifically, the algorithm introduces the elastic L1/2 norm as a sparsity constraint and employs a steepest gradient descent strategy to efficiently solve the frequency-domain linear system with substantial computational cost in a half-quadratic splitting framework. Then, based on the learned convolution dictionary, a weighted convolutional sparse representation paradigm is designed to encode the noisy data to acquire an optimal sparse approximation of the effective signal. Subsequently, a high-resolution dictionary with a broadband spectrum is constructed by the proposed parameter scaling strategy and matched filtering technique on the basis of atomic spectrum modeling. Finally, the optimal sparse approximation of the effective signal and the constructed high-resolution dictionary are used for data reconstruction to obtain the seismic signal with high resolution and high signal-to-noise ratio. Synthetic and field dataset examples are used to verify the effectiveness and reliability of the developed method. The results indicate that this method performs more competitively in seismic applications than the conventional deconvolution and spectral whitening methods.
Biological slices are an effective tool for studying the physiological structure and evolution mechanism of biological systems. However, the complexity of the preparation technology and the many uncontrollable factors during preparation lead to problems such as difficulty in preparing slice images and breakage of slice images. Therefore, we propose an interpretable inpainting algorithm for small-scale corruption in biological slice images based on multi-layer deep sparse representation, achieving high-fidelity reconstruction of slice images. We first discuss the relationship between deep convolutional neural networks and sparse representation, which ensures the high-fidelity characteristic of the algorithm. A novel deep wavelet dictionary is proposed that better captures the image prior and has learnable features. Multi-layer deep sparse representation is used to implement dictionary learning, yielding better signal expression. Compared with methods such as NLABH, Shearlet, Partial Differential Equation (PDE), K-Singular Value Decomposition (K-SVD), Convolutional Sparse Coding, and Deep Image Prior, the proposed algorithm achieves better subjective reconstruction and objective evaluation results and realizes high-fidelity inpainting with small-scale image data. The O(n^2)-level time complexity makes the proposed algorithm practical. The proposed algorithm can be effectively extended to other cross-sectional image inpainting problems, such as magnetic resonance images and computed tomography images.
Image fusion technology is the basis of computer vision tasks, but information is easily affected by noise during transmission. In this paper, an Improved Pigeon-Inspired Optimization (IPIO) is proposed and used for multi-focus noisy image fusion in combination with the boundary handling of convolutional sparse representation. Through two-scale image decomposition, the input image is decomposed into a base layer and a detail layer. For the base layer, the IPIO algorithm is used to obtain the optimized fusion weights, whose value range is obtained by fusing edge information. In addition, global information entropy is used as the fitness index of IPIO, which is highly efficient, especially for discrete optimization problems. For the detail layer, the coefficients are fused by performing boundary processing when solving the convolutional sparse representation in the frequency domain. The sum of the fused base and detail layers is the final fused image. Experimental results show that the proposed algorithm achieves a better fusion effect than recent algorithms.
A learning-based adaptive loop filter is developed for the geometry-based point-cloud compression (G-PCC) standard to reduce attribute compression artifacts. The proposed method first generates multiple most probable sample offsets (MPSOs) as potential compression distortion approximations, and then linearly weights them for artifact mitigation. In this way, we drive the filtered reconstruction as close to the uncompressed PCA as possible. To this end, we devise an attribute artifact reduction network (ARNet) consisting of two consecutive processing phases: MPSOs derivation and MPSOs combination. The MPSOs derivation uses a two-stream network to model local neighborhood variations from direct spatial embedding and frequency-dependent embedding, where sparse convolutions are utilized to best aggregate information from sparsely and irregularly distributed points. The MPSOs combination is guided by the least-squares error metric to derive weighting coefficients on the fly to further capture the content dynamics of the input PCAs. ARNet is implemented as an in-loop filtering tool for G-PCC, where the linear weighting coefficients are encapsulated into the bitstream with negligible bitrate overhead. The experimental results demonstrate significant improvements over the latest G-PCC both subjectively and objectively. For example, our method offers a 22.12% YUV Bjøntegaard delta rate (BD-Rate) reduction compared to G-PCC across various commonly used test point clouds. Compared with a recent study showing state-of-the-art performance, our work not only gains 13.23% YUV BD-Rate but also provides a 30× processing speedup.
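The least-squares combination step can be illustrated with a small sketch: given K candidate offset vectors and the true distortion (original minus reconstructed attribute values), solve for the weights that minimize the squared error of the weighted sum. This is a generic normal-equations solver under assumed toy data, not ARNet itself; `least_squares_weights`, the candidate vectors, and the attribute values are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(u, v))


def least_squares_weights(offsets, target):
    """Solve min_w || sum_k w[k] * offsets[k] - target ||^2.

    offsets: list of K candidate offset vectors (stand-ins for MPSOs).
    target:  true distortion, original minus reconstructed values.
    Uses the normal equations G w = b with Gauss-Jordan elimination;
    assumes the candidates are linearly independent (G non-singular).
    """
    k = len(offsets)
    # Augmented matrix [G | b]: G[i][j] = <o_i, o_j>, b[i] = <o_i, target>.
    m = [[dot(offsets[i], offsets[j]) for j in range(k)]
         + [dot(offsets[i], target)] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]      # partial pivoting
        for r in range(k):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]


original = [10.0, 12.0, 9.0, 11.0]          # uncompressed attribute values
reconstructed = [9.5, 12.5, 9.0, 10.0]      # decoded values before filtering
candidates = [[1.0, 0.0, 0.0, 1.0],         # two assumed candidate offsets
              [1.0, 1.0, 0.0, 0.0]]

target = [o - r for o, r in zip(original, reconstructed)]
weights = least_squares_weights(candidates, target)

# Apply the weighted offsets to the reconstruction.
filtered = [r + sum(w * c[i] for w, c in zip(weights, candidates))
            for i, r in enumerate(reconstructed)]
print(weights, filtered)
```

Because only K weights need to be transmitted per filtered unit, a scheme of this shape is consistent with the small bitrate overhead the abstract reports, although the actual signaling is not described here.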
In power generation industries, boilers are required to operate under a range of different conditions to accommodate demands for fuel randomness and energy fluctuation. Reliable prediction of the combustion operation condition is crucial for an in-depth understanding of boiler performance and for maintaining high combustion efficiency. However, it is difficult to establish an accurate prediction model based on traditional data-driven methods, which require prior expert knowledge and a large amount of labeled data. To overcome these limitations, a novel prediction method for the combustion operation condition based on flame imaging and a hybrid deep neural network is proposed. The proposed hybrid model combines a convolutional sparse autoencoder (CSAE) and a least squares support vector machine (LSSVM), i.e., CSAE-LSSVM, where the convolutional sparse autoencoder with deep architectures is utilized to extract the essential features of the flame image, and these features are then input into the least squares support vector machine for operation condition prediction. A comprehensive investigation of optimal hyper-parameters and the dropout technique is carried out to improve the performance of the CSAE-LSSVM. The effectiveness of the proposed model is evaluated on flame images from a 300 MW tangential coal-fired boiler. The prediction accuracy of the proposed hybrid model reaches 98.06%, and its prediction time is 3.06 ms/image. The proposed model shows superior performance in comparison with other existing neural network models.
Funding (SCMSE, waste electrical appliance recognition): Supported by the National Key Research and Development Project (Grant No. 2022YFB3305800-5), the National Natural Science Foundation of China (Grant Nos. 61903010, 62125301, 62021003, and 61890930-5), the Beijing Outstanding Young Scientist Program (Grant No. BJJWZYJH01201910005020), the Beijing Natural Science Foundation (Grant No. KZ202110005009), and the Beijing Youth Scholar program (Grant No. 037).
Funding (NSST-based infrared and visible image fusion with CSR and IPCNN): Supported in part by the National Natural Science Foundation of China under Grant 41505017.
Funding (VTNOCSC bearing fault signal denoising): Supported by the National Key Research and Development Program of China (No. 2018YFB2003300), the National Science and Technology Major Project, China (No. 2017-IV-0008-0045), and the National Natural Science Foundation of China (No. 51675262).
Funding (gradient-regularized CSR multi-source image fusion): Supported by the National Natural Science Foundation of China (61671383) and the Shaanxi Key Industry Innovation Chain Project (2018ZDCXL-G-12-2, 2019ZDLGY14-02-02, 2019ZDLGY14-02-03).
Funding (LSTM and convolutional sparse autoencoder for BDS multipath signal recognition): Supported in part by the National Natural Science Foundation of China (Nos. 62273106, 62203122, 62320106008, 62373114, 62203123, and 62073086), in part by the Guangdong Basic and Applied Basic Research Foundation (Nos. 2023A1515011480 and 2023A1515011159), and in part by a China Postdoctoral Science Foundation funded project (No. 2022M720840).
Funding (seismic denoising and resolution enhancement by convolution dictionary learning): Supported by the Laoshan National Laboratory of Science and Technology Foundation (No. LSKj202203400) and the National Natural Science Foundation of China (No. 41874146).
Funding (biological slice image inpainting by multi-layer deep sparse representation): Supported by the National Natural Science Foundation of China (Grant No. 61871380), the Shandong Provincial Natural Science Foundation (Grant No. ZR2020MF019), and the Beijing Natural Science Foundation (Grant No. 4172034).
Funding (IPIO multi-focus noisy image fusion): Supported in part by the National Key Research and Development Program of China (2018YFB0804202, 2018YFB0804203), the Regional Joint Fund of NSFC (U19A2057), the National Natural Science Foundation of China (61876070), and the Jilin Province Science and Technology Development Plan Project (20190303134SF).
Funding (ARNet adaptive loop filter for G-PCC): Supported by the National Natural Science Foundation of China under Grant No. 62171174, the National Key R&D Program of China under Grant No. 2023YFC3706605, and the Open Project of Zhejiang Provincial Key Laboratory of Information Processing, Communication, and Networking.
Funding (CSAE-LSSVM combustion operation condition prediction): Supported by the National Natural Science Foundation of China (Grant No. 51976038), the Natural Science Foundation of Jiangsu Province, China for Young Scholars (Grant No. BK20190366), and the China Scholarship Council (Grant No. 202006090164).