Funding: Science and Technology Project of Guangdong Province (2006A10201003); 2005 Startup Project of Jinan University (51205067); Soft Science Project of Guangdong Province (2006B70103011)
Abstract: It is well known that an image compressed with the block discrete cosine transform exhibits visually annoying blocking artifacts at low bit rates. A new post-processing deblocking algorithm in the wavelet domain is proposed. The algorithm exploits the way blocking artifacts appear in the wavelet domain: after the wavelet transform, their energy is concentrated along a few lines, which produces the annoying visual effect. Reducing blocking artifacts therefore amounts to capturing the excess energy on the block boundaries and attenuating it below the threshold of visibility. Adaptive operators for the different subbands are computed from the wavelet coefficients, so the operators adapt to different images and to the characteristics of their blocking artifacts. Experimental results show that the proposed method significantly improves visual quality and also increases the peak signal-to-noise ratio (PSNR) of the output image.
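As a rough illustration of the idea only (not the paper's method: the adaptive per-subband operators are replaced by a single hypothetical attenuation factor), the sketch below performs a one-level Haar transform and attenuates the detail coefficients that line up with the 8×8 block grid, where the artifact energy concentrates:

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2-D Haar transform (even-sized input assumed)."""
    a = (img[0::2] + img[1::2]) / 2.0      # vertical average
    d = (img[0::2] - img[1::2]) / 2.0      # vertical detail
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    hl = (a[:, 0::2] - a[:, 1::2]) / 2.0
    lh = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, hl, lh, hh

def haar_idwt2(ll, hl, lh, hh):
    """Inverse of haar_dwt2."""
    a = np.zeros((ll.shape[0], ll.shape[1] * 2))
    d = np.zeros_like(a)
    a[:, 0::2], a[:, 1::2] = ll + hl, ll - hl
    d[:, 0::2], d[:, 1::2] = lh + hh, lh - hh
    img = np.zeros((a.shape[0] * 2, a.shape[1]))
    img[0::2], img[1::2] = a + d, a - d
    return img

def deblock_wavelet(img, block=8, atten=0.5):
    """Attenuate detail coefficients adjacent to the block grid (atten is a made-up constant)."""
    ll, hl, lh, hh = haar_dwt2(img.astype(float))
    half = block // 2                      # block period seen in the half-resolution subbands
    for sub in (hl, lh, hh):
        sub[:, half - 1::half] *= atten    # columns adjacent to vertical block boundaries
        sub[half - 1::half, :] *= atten    # rows adjacent to horizontal block boundaries
    return haar_idwt2(ll, hl, lh, hh)
```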
Abstract: To address the shortcomings of existing methods in segmenting small and medium-sized abdominal organs, a network model based on parallel local and global encoding is proposed for abdominal multi-organ image segmentation. First, a local encoding branch that extracts multi-scale feature information is designed. Second, the global feature encoding branch uses a block-wise Transformer: the combination of intra-block and inter-block Transformers captures global long-range dependencies while reducing the computational cost. Third, a feature fusion module is designed to fuse the contextual information from the two encoding branches. Finally, a decoding module is designed to realize the interaction between global information and local contextual information, better compensating for the information loss in the decoding stage. In experiments on the Synapse multi-organ CT dataset, the model achieves the best performance among nine current state-of-the-art methods on both mean Dice similarity coefficient (DSC) and Hausdorff distance (HD), reaching 83.10% and 17.80 mm respectively.
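A minimal PyTorch sketch of the parallel-encoding idea is given below. The class names, window size, pooled-token inter-block attention, 1×1-convolution fusion head, and class count are all hypothetical stand-ins; the paper's actual multi-scale branch, feature fusion module, and decoder are not reproduced:

```python
import torch
import torch.nn as nn

class LocalBranch(nn.Module):
    """Hypothetical multi-scale convolutional branch: parallel 3x3/5x5/7x7 convolutions."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.paths = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, k, padding=k // 2) for k in (3, 5, 7)
        ])
        self.fuse = nn.Conv2d(3 * out_ch, out_ch, 1)
    def forward(self, x):
        return self.fuse(torch.cat([p(x) for p in self.paths], dim=1))

class BlockTransformerBranch(nn.Module):
    """Intra-block attention inside each window, then inter-block attention over one
    pooled token per window (a rough stand-in for the paper's block-wise design)."""
    def __init__(self, dim, win=8, heads=4):
        super().__init__()
        self.win = win
        self.intra = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.inter = nn.MultiheadAttention(dim, heads, batch_first=True)
    def forward(self, x):                        # x: (B, C, H, W), H and W divisible by win
        B, C, H, W = x.shape
        w = self.win
        t = x.reshape(B, C, H // w, w, W // w, w).permute(0, 2, 4, 3, 5, 1)
        t = t.reshape(B * (H // w) * (W // w), w * w, C)
        t, _ = self.intra(t, t, t)               # intra-block (within-window) attention
        g = t.mean(dim=1).reshape(B, -1, C)      # one pooled token per window
        g, _ = self.inter(g, g, g)               # inter-block (between-window) attention
        t = t + g.reshape(-1, 1, C)              # broadcast global context back to pixels
        t = t.reshape(B, H // w, W // w, w, w, C).permute(0, 5, 1, 3, 2, 4)
        return t.reshape(B, C, H, W)

class ParallelEncoderSeg(nn.Module):
    """Toy parallel local/global encoder with a 1x1-conv fusion head (9 classes assumed)."""
    def __init__(self, in_ch=1, dim=32, n_classes=9):
        super().__init__()
        self.local = LocalBranch(in_ch, dim)
        self.proj = nn.Conv2d(in_ch, dim, 1)
        self.globl = BlockTransformerBranch(dim)
        self.head = nn.Conv2d(2 * dim, n_classes, 1)
    def forward(self, x):
        return self.head(torch.cat([self.local(x), self.globl(self.proj(x))], dim=1))

x = torch.randn(1, 1, 64, 64)        # e.g. one CT slice
logits = ParallelEncoderSeg()(x)     # (1, 9, 64, 64) per-pixel class logits
```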
Funding: supported by the National Natural Science Foundation of China (61572063, 61401308); the Fundamental Research Funds for the Central Universities (2016YJS039); the Natural Science Foundation of Hebei Province (F2016201142, F2016201187); the Natural Social Foundation of Hebei Province (HB15TQ015); the Science Research Project of Hebei Province (QN2016085, ZC2016040); the Natural Science Foundation of Hebei University (2014-303)
Abstract: Fusion methods based on multi-scale transforms have become the mainstream of pixel-level image fusion. However, most of these methods cannot fully exploit the spatial-domain information of the source images, which degrades the fused image. This paper presents a fusion framework based on block-matching and 3D (BM3D) multi-scale transforms. The algorithm first divides the image into blocks and groups similar 2D image blocks into 3D arrays. It then applies a 3D transform, consisting of a 2D multi-scale transform and a 1D transform, to map the arrays into transform coefficients, and the resulting low- and high-frequency coefficients are fused by different fusion rules. The final fused image is obtained from the series of fused 3D block groups after the inverse transform, using an aggregation process. In the experimental part, we comparatively analyze several existing algorithms and the use of different transforms, e.g. the non-subsampled contourlet transform (NSCT) and the non-subsampled shearlet transform (NSST), in the 3D transform step. Experimental results show that the proposed fusion framework not only improves the subjective visual effect but also achieves better objective evaluation criteria than state-of-the-art methods.
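The following sketch illustrates the grouping and transform-domain fusion steps for a single reference block, under simplifying assumptions: a plain 3-D DCT (via `scipy.fft.dctn`) stands in for the paper's 2-D multi-scale plus 1-D transform, a max-absolute rule fuses the detail coefficients, the DC term is averaged, and the final aggregation of overlapping fused blocks back into the image is omitted:

```python
import numpy as np
from scipy.fft import dctn, idctn

def group_similar(img, ref_yx, bsize=8, stride=4, n_match=8):
    """Collect the coordinates of the n_match blocks most similar (SSD) to the reference block."""
    y0, x0 = ref_yx
    ref = img[y0:y0 + bsize, x0:x0 + bsize]
    cands = []
    for y in range(0, img.shape[0] - bsize + 1, stride):
        for x in range(0, img.shape[1] - bsize + 1, stride):
            blk = img[y:y + bsize, x:x + bsize]
            cands.append((np.sum((blk - ref) ** 2), y, x))
    cands.sort(key=lambda c: c[0])
    return [(y, x) for _, y, x in cands[:n_match]]

def fuse_group(img_a, img_b, coords, bsize=8):
    """Fuse one matched block group from two sources in the 3-D transform domain."""
    stack_a = np.stack([img_a[y:y + bsize, x:x + bsize] for y, x in coords])
    stack_b = np.stack([img_b[y:y + bsize, x:x + bsize] for y, x in coords])
    ca = dctn(stack_a, norm='ortho')       # 3-D DCT: 1-D along the group + 2-D per block
    cb = dctn(stack_b, norm='ortho')
    fused = np.where(np.abs(ca) >= np.abs(cb), ca, cb)   # max-abs rule for detail coefficients
    fused[0, 0, 0] = 0.5 * (ca[0, 0, 0] + cb[0, 0, 0])   # average rule for the DC term
    return idctn(fused, norm='ortho')

# Usage sketch: fuse the group anchored at the top-left block of two registered sources.
img_a, img_b = np.random.rand(64, 64), np.random.rand(64, 64)
coords = group_similar(img_a, (0, 0))
fused_blocks = fuse_group(img_a, img_b, coords)   # to be aggregated into the fused image
```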
Funding: Project (No. 05R214207) supported by the Postdoctoral Sustentation Fund Plan of Shanghai, China
Abstract: A new all-zero block determination rule is used to reduce the complexity of the AVS-M encoder. It reuses the sum of absolute differences (SAD) of each 4×4 block, obtained during motion estimation or intra prediction, as its parameter, so that the determination threshold needs to be computed only once while the quantization parameter (QP) remains unchanged for a given video sequence. This method avoids a large amount of computation for the transform, quantization, inverse transform, inverse quantization, and block reconstruction. Simulation results show that it saves about 20%-50% of the computation without any video quality degradation.
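A hedged sketch of the reuse idea is shown below: the SAD already computed during motion estimation is compared against a threshold derived once per QP, and blocks below the threshold skip the transform/quantization/reconstruction chain. The step-size table and scaling constant are illustrative (an H.264-style quantization step is used for the demo), not the rule derived in the paper:

```python
import numpy as np

# Illustrative quantization-step table indexed by QP (H.264-style relation, for the demo only).
QSTEP = {qp: 0.625 * 2 ** (qp / 6.0) for qp in range(52)}

def azb_threshold(qp, scale=16.0):
    """Compute the SAD threshold once per QP (scale is a hypothetical tuning constant)."""
    return scale * QSTEP[qp]

def is_all_zero_block(sad, threshold):
    """True if the 4x4 block is predicted to quantize to all zeros."""
    return sad < threshold

# Usage: the SAD already computed during motion estimation is reused directly.
residual = np.random.randint(-3, 4, size=(4, 4))
sad = np.abs(residual).sum()
if is_all_zero_block(sad, azb_threshold(qp=32)):
    pass  # mark the coded-block flag as zero; skip DCT, quantization, and inverse steps
```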
Funding: supported by the Postgraduate Research & Practice Innovation Program of Jiangsu Province (SJCX21_1427) and the General Program of Natural Science Research in Jiangsu Universities (21KJB520019)
Abstract: Convolutional neural networks (CNNs) based on U-shaped structures and skip connections play a pivotal role in various image segmentation tasks. Recently, Transformers have started to set new trends in image segmentation. A Transformer layer can model relationships among all pixels, so the two approaches can complement each other well. Based on these characteristics, we combine a Transformer pipeline and a convolutional neural network pipeline to gain the advantages of both. The image is fed into a U-shaped encoder-decoder architecture based on an empirical combination of self-attention and convolution, in which skip connections are used for local-global semantic feature learning. At the same time, the image is also fed into a convolutional neural network architecture. The final segmentation result is formed by a Mix block that combines both. The mixture model of the convolutional neural network and the Transformer network for road segmentation (MCTNet) achieves effective segmentation results on the KITTI dataset and on an Unstructured Road Scene (URS) dataset built by ourselves. Code, the self-built dataset, and trained models will be available at https://github.com/xflxfl1992/MCTNet.
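As an illustration of how the two branches might be combined, the toy Mix block below learns a per-pixel gate that blends CNN-branch and Transformer-branch features before a two-class road/non-road head. This is only a plausible sketch under assumed feature shapes, not the MCTNet implementation:

```python
import torch
import torch.nn as nn

class MixBlock(nn.Module):
    """Hypothetical fusion head: a learned per-pixel gate blends the two branch features."""
    def __init__(self, ch):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(2 * ch, ch, 3, padding=1),
                                  nn.ReLU(inplace=True),
                                  nn.Conv2d(ch, 1, 1),
                                  nn.Sigmoid())
        self.head = nn.Conv2d(ch, 2, 1)      # 2 classes: road / not road

    def forward(self, f_cnn, f_trans):
        w = self.gate(torch.cat([f_cnn, f_trans], dim=1))   # (B, 1, H, W) blend weight
        mixed = w * f_cnn + (1.0 - w) * f_trans
        return self.head(mixed)

# Usage with dummy branch outputs of matching shape.
f_cnn, f_trans = torch.randn(1, 64, 96, 320), torch.randn(1, 64, 96, 320)
logits = MixBlock(64)(f_cnn, f_trans)        # (1, 2, 96, 320) road-segmentation logits
```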
Funding: Supported by the China Aviation Fund (No. 02153071)
Abstract: In H.263 video codec systems, motion estimation and the Discrete Cosine Transform (DCT) account for most of the computational requirements. To reduce the complexity of the encoder and dedicate more resources to other functions, an Improved All-Zero Block Finding (IAZBF) method based on the statistical characteristics of DCT coefficients is proposed, building on a study of existing methods. Compared with existing methods, IAZBF improves detection efficiency by about 50% without introducing much extra computation. Because it is computed with additions and shifts instead of complicated multiplications, IAZBF has low computational complexity, which is especially attractive for low-end processors. In addition, IAZBF preserves picture fidelity and remains compatible with the H.263 bitstream standard.
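The snippet below is a minimal, hypothetical illustration of a multiplication-free test of this kind: the block SAD is accumulated with additions only and compared against a bound assembled from shifts and additions. The constants and the form of the bound are made up for the example and are not the IAZBF thresholds:

```python
def shift_add_threshold(k):
    """Return 24 * 2**k built from shifts and additions only (illustrative constant)."""
    return (16 << k) + (8 << k)

def block_sad(block_a, block_b):
    """SAD of two 8x8 blocks using only additions and subtractions."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
                          for a, b in zip(row_a, row_b))

# Demo: declare the block all-zero if its SAD falls below the shift/add bound.
cur  = [[10] * 8 for _ in range(8)]
pred = [[9] * 8 for _ in range(8)]
if block_sad(cur, pred) < shift_add_threshold(k=3):
    print("block declared all-zero: skip DCT and quantization")
```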
Funding: Science and Technology Project of Guangdong Province (2006A10201003); 2005 Startup Project of Jinan University (51205067); Soft Science Project of Guangdong Province (2006B70103011)
Abstract: Due to coarse quantization, block-based discrete cosine transform (BDCT) compression methods usually suffer from visible blocking artifacts at the block boundaries. A novel, efficient deblocking method in the DCT domain is proposed. A specific criterion for edge detection is given; in smooth regions, a one-dimensional DCT is applied to each row of the adjacent blocks and of the block shifted onto the boundary, and the transform coefficients of the shifted block are then modified by weighting the average of three coefficients of the block. A mean-square-difference-of-slope criterion is used to judge the efficiency of the proposed algorithm. Simulation results show that the new method not only achieves satisfactory image quality but also preserves high-frequency information.
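One plausible reading of the coefficient-modification step is sketched below for a single row pair: the 8-pixel block shifted onto the boundary is transformed with a 1-D DCT, and each of its coefficients is replaced by a 3-tap weighted average of the corresponding coefficients from the left, shifted, and right blocks. The smoothness test and the weights are illustrative assumptions, not the paper's criterion:

```python
import numpy as np
from scipy.fft import dct, idct

def smooth_region(row_left, row_right, thresh=4.0):
    """Crude smoothness test: low variance on both sides of the boundary (hypothetical criterion)."""
    return np.std(row_left) < thresh and np.std(row_right) < thresh

def deblock_row(row_left, row_right, w=(0.25, 0.5, 0.25)):
    """Deblock one 8-pixel row pair across a block boundary in the 1-D DCT domain."""
    shifted = np.concatenate([row_left[4:], row_right[:4]]).astype(float)  # block straddling the boundary
    c_left  = dct(row_left.astype(float),  norm='ortho')
    c_shift = dct(shifted,                 norm='ortho')
    c_right = dct(row_right.astype(float), norm='ortho')
    c_new = w[0] * c_left + w[1] * c_shift + w[2] * c_right   # weighted average of coefficients
    return idct(c_new, norm='ortho')   # deblocked pixels straddling the boundary
```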