Generative image steganography is a technique that generates stego images directly from secret information. Unlike traditional methods, it theoretically resists steganalysis because there is no cover image. Existing generative image steganography methods generally perform well, but there is still room to improve both the quality of stego images and the accuracy of secret information extraction. Therefore, this paper proposes a generative image steganography algorithm based on attribute feature transformation and an invertible mapping rule. Firstly, the reference image is disentangled by a content encoder and an attribute encoder to obtain content features and attribute features, respectively. Then, a mean mapping rule is introduced to map the binary secret information into a noise vector that conforms to the distribution of attribute features. This noise vector is input into the generator to produce an attribute-transformed stego image with the content features of the reference image. Additionally, we design an adversarial loss, a reconstruction loss, and an image diversity loss to train the proposed model. Experimental results demonstrate that the stego images generated by the proposed method are of high quality, with an average extraction accuracy of 99.4% for the hidden information. Furthermore, since the stego image has a uniform distribution similar to that of an attribute-transformed image without secret information, it effectively resists both subjective and objective steganalysis.
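The abstract does not spell out the mean mapping rule, so the following is only a minimal sketch of the general idea of an invertible bits-to-noise mapping. It assumes a simple interval-based scheme on [-1, 1] rather than the paper's rule, and it does not model the distribution of attribute features. The names `bits_to_noise`, `noise_to_bits`, `k`, and `margin` are hypothetical.

```python
import random

def bits_to_noise(bits, k=2, margin=0.1, seed=None):
    """Map a list of bits to noise values in (-1, 1), k bits per value.

    Assumption: each k-bit group selects one of 2**k equal sub-intervals
    of [-1, 1]; this is an illustrative rule, not the paper's mapping.
    """
    if len(bits) % k != 0:
        raise ValueError("len(bits) must be a multiple of k")
    rng = random.Random(seed)
    width = 2.0 / (1 << k)  # width of each sub-interval
    noise = []
    for i in range(0, len(bits), k):
        index = int("".join(str(b) for b in bits[i:i + k]), 2)
        low = -1.0 + index * width
        # Sample strictly inside the sub-interval; the margin keeps values
        # away from the boundaries so small extraction errors are tolerated.
        noise.append(rng.uniform(low + margin * width,
                                 low + (1.0 - margin) * width))
    return noise

def noise_to_bits(noise, k=2):
    """Invert bits_to_noise by locating the sub-interval of each value."""
    width = 2.0 / (1 << k)
    bits = []
    for z in noise:
        index = min(max(int((z + 1.0) / width), 0), (1 << k) - 1)
        bits.extend(int(c) for c in format(index, "0{}b".format(k)))
    return bits

secret = [1, 0, 0, 1, 1, 1, 0, 0]
z = bits_to_noise(secret, k=2, seed=0)
recovered = noise_to_bits(z, k=2)
```

In this sketch a larger `k` packs more bits into each noise value but narrows the sub-intervals, which would make extraction less tolerant of errors in the recovered noise.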
Most existing action recognition methods rely mainly on spatio-temporal descriptors of single interest points while ignoring their potential integral information, such as spatial distribution information. By combining local spatio-temporal features with the global positional distribution information (PDI) of interest points, a novel motion descriptor is proposed in this paper. The proposed method detects interest points using an improved interest point detection method. Then, 3-dimensional scale-invariant feature transform (3D SIFT) descriptors are extracted for every interest point. To obtain a compact description and efficient computation, principal component analysis (PCA) is applied twice to the 3D SIFT descriptors, first on a single frame and then on multiple frames. Simultaneously, the PDI of the interest points is computed and combined with the above features. The combined features are quantified and selected, and finally tested using the support vector machine (SVM) recognition algorithm on the public KTH dataset. The test results show that the recognition rate is significantly improved and that the proposed features describe human motion more accurately, with high adaptability to different scenarios.
Expression, occlusion, and pose variations are three main challenges for 3D face recognition. A novel method is presented to address 3D face recognition using scale-invariant feature transform (SIFT) features on 3D meshes. After preprocessing, shape index extrema on the 3D facial surface are selected as keypoints in the difference scale space, and unstable keypoints are removed in two screening steps. Then, a local coordinate system for each keypoint is established by principal component analysis (PCA). Next, two local geometric features are extracted around each keypoint through the local coordinate system. Additionally, the features are augmented by symmetrization, exploiting the approximate left-right symmetry of the human face. The proposed method is evaluated on the Bosphorus, BU-3DFE, and Gavab databases, and good results are achieved on all three datasets. The proposed method thus proves robust to facial expression variations, partial external occlusions, and large pose changes.
Self-attention networks and the Transformer have dominated machine translation and natural language processing, and have shown great potential in vision tasks such as image classification and object detection. Inspired by the progress of the Transformer, we propose a novel, general, and robust voxel feature encoder for 3D object detection based on the traditional Transformer. We first investigate the permutation invariance of self-attention with respect to sequence data and apply it to point cloud processing. Then we construct a voxel feature layer based on self-attention to adaptively learn a local and robust context of a voxel according to the spatial relationships and the context information exchanged between all points within the voxel. Lastly, we construct a general voxel feature learning framework for 3D object detection with the voxel feature layer as its core. The voxel feature with Transformer (VFT) can be easily plugged into any other voxel-based 3D object detection framework and serves as the backbone for voxel feature extraction. Experimental results on the KITTI dataset demonstrate that our method achieves state-of-the-art performance on 3D object detection.
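The permutation property mentioned above can be checked on a toy example. This is a minimal sketch, not the VFT layer itself: it assumes identity query/key/value projections and max pooling over the points of one voxel, and the names `self_attention` and `voxel_feature` are hypothetical.

```python
import math

def self_attention(points):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    Assumption: real layers use learned projections; identity is used here
    only to keep the sketch self-contained.
    """
    d = len(points[0])
    out = []
    for q in points:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in points]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[j] for w, v in zip(weights, points)) / total
                    for j in range(d)])
    return out

def voxel_feature(points):
    """Max-pool the attended point features into a single voxel feature."""
    attended = self_attention(points)
    return [max(column) for column in zip(*attended)]

voxel = [[0.1, 0.2, 0.3], [1.0, -0.5, 0.2], [0.4, 0.4, -0.1]]
shuffled = [voxel[2], voxel[0], voxel[1]]  # same points, different order
feature_a = voxel_feature(voxel)
feature_b = voxel_feature(shuffled)
```

Self-attention on its own is permutation-equivariant (reordering the points reorders the outputs in the same way); the symmetric pooling step is what makes the final voxel feature independent of point order, so `feature_a` and `feature_b` agree up to floating-point rounding.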
Photoacoustic imaging (PAI) is an emerging noninvasive imaging method based on the photoacoustic effect that provides valuable assistance for medical diagnosis. It offers large imaging depth and high contrast. However, limited by equipment cost and reconstruction time requirements, existing PAI systems with annular array transducers struggle to balance image quality and imaging speed. In this paper, a triple-path feature transform network (TFT-Net) for ring-array photoacoustic tomography is proposed to enhance the imaging quality from limited-view and sparse measurement data. Specifically, the network takes both the raw photoacoustic pressure signals and conventional linear reconstruction images as input, and uses the photoacoustic physical model as prior information to guide the reconstruction process. In addition, to strengthen signal feature extraction, residual blocks and squeeze-and-excitation blocks are introduced into the TFT-Net. For more efficient reconstruction, the final output of photoacoustic signals uses a 'filter-then-upsample' operation with a pixel-shuffle multiplexer and a max out module. Experimental results on simulated and in-vivo data demonstrate that the constructed TFT-Net can restore the target boundary clearly, reduce background noise, and achieve fast, high-quality photoacoustic image reconstruction from limited-view, sparsely sampled data.
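The pixel-shuffle step named above can be illustrated independently of the network. This is a minimal sketch of the standard pixel-shuffle rearrangement on nested lists, assuming the common channel ordering in which channel `c*r*r + i*r + j` feeds output offset `(i, j)`; how TFT-Net wires it to the max out module is not described in the abstract and is not modelled here.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    The filtering is done at low resolution; this step only moves channel
    values into spatial positions, which is why it is cheap.
    """
    channels_in, h, w = len(x), len(x[0]), len(x[0][0])
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out

# Four 1x1 channels become one 2x2 channel.
low_res = [[[0]], [[1]], [[2]], [[3]]]
high_res = pixel_shuffle(low_res, 2)
```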
Low-dose computed tomography (LDCT) has gained increasing attention owing to its crucial role in reducing radiation exposure in patients. However, LDCT-reconstructed images often suffer from significant noise and artifacts, negatively impacting radiologists' ability to diagnose accurately. To address this issue, many studies have focused on denoising LDCT images using deep learning (DL) methods. However, these DL-based denoising methods have been hindered by the highly variable feature distribution of LDCT data from different imaging sources, which adversely affects the performance of current denoising models. In this study, we propose a parallel processing model, the multi-encoder deep feature transformation network (MDFTN), which is designed to enhance the performance of LDCT imaging for multisource data. Unlike traditional network structures, which rely on continual learning to process multitask data, the approach can simultaneously handle LDCT images from various imaging sources within a unified framework. The proposed MDFTN consists of multiple encoders and decoders along with a deep feature transformation module (DFTM). During forward propagation in network training, each encoder extracts diverse features from its respective data source in parallel, and the DFTM compresses these features into a shared feature space. Subsequently, each decoder performs an inverse operation for multisource loss estimation. Through collaborative training, the proposed MDFTN leverages the complementary advantages of multisource data distributions to enhance its adaptability and generalization. Numerous experiments were conducted on two public datasets and one local dataset, which demonstrated that the proposed network model can simultaneously process multisource data while effectively suppressing noise and preserving fine structures. The source code is available at https://github.com/123456789ey/MDFTN.
On the basis of scale-invariant feature transform (SIFT) descriptors, a novel kind of local invariant based on the SIFT sequence scale (SIFT-SS) is proposed and applied to target classification. First, the merits of using the SIFT algorithm for target classification are discussed. Second, the scales of the SIFT descriptors are sorted in descending order to form the SIFT-SS, which is fed to a support vector machine (SVM) with a radial basis function (RBF) kernel to train the classifier used for target classification. Experimental results indicate that the SIFT-SS algorithm is efficient for target classification and can obtain a higher recognition rate than affine moment invariants (AMI) and multi-scale auto-convolution (MSA) in some complex situations, such as in the presence of noise and occlusions. Moreover, the computational time of SIFT-SS is shorter than that of MSA and longer than that of AMI.
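The SIFT-SS construction described above can be sketched in a few lines. The abstract does not say how sequences of different lengths are aligned, so the fixed length with zero padding and truncation below is an assumption, as are the names `sift_ss` and `rbf_kernel` and the `gamma` value. The kernel is the standard RBF form that an SVM would evaluate between two feature vectors.

```python
import math

def sift_ss(scales, length):
    """Sort keypoint scales in descending order into a fixed-length vector.

    Assumption: shorter sequences are zero-padded and longer ones truncated.
    """
    ordered = sorted(scales, reverse=True)[:length]
    return ordered + [0.0] * (length - len(ordered))

def rbf_kernel(x, y, gamma=0.5):
    """Standard RBF kernel: exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

feature_1 = sift_ss([1.2, 3.4, 2.0], 4)
feature_2 = sift_ss([3.3, 1.0, 2.1, 0.5, 0.4], 4)
similarity = rbf_kernel(feature_1, feature_2)
```

Because the vector depends only on the sorted scales and not on keypoint positions, it is unaffected by where the keypoints fall in the image, which is one plausible reading of why the authors treat it as a local invariant.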
This paper focuses mainly on the design of a semi-strapdown image homing guided (SSIHG) system based on optical flow for a six-degree-of-freedom (6-DOF) axial-symmetric skid-to-turn missile. Three optical flow algorithms suitable for large displacements are introduced and compared. The influence of different displacements on the computational accuracy of the three algorithms is analyzed statistically. The total optical flow of the SSIHG missile is obtained using the scale-invariant feature transform (SIFT) algorithm, which is the best of the three for large displacements. After removing the rotational optical flow caused by rotation of the gimbal and missile body from the total optical flow, the remaining translational optical flow is smoothed via Kalman filtering. The circular navigation guidance (CNG) law with an impact angle constraint is then obtained using the smoothed translational optical flow and the position of the target image. Simulations are carried out under both disturbed and undisturbed conditions, and the results indicate that the proposed guidance strategy for SSIHG missiles can achieve a precise target hit with a desired impact angle without needing the time-to-go parameter.
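The smoothing step can be illustrated with a scalar Kalman filter applied to one component of the translational optical flow. This is a minimal sketch assuming a constant-state (random-walk) model; the paper's state model and noise settings are not given in the abstract, so `process_var`, `meas_var`, and the sample values are hypothetical.

```python
def kalman_smooth(measurements, process_var=1e-3, meas_var=0.25):
    """Scalar Kalman filter with a constant-state (random-walk) model."""
    estimate, error = measurements[0], 1.0  # initial state and its variance
    smoothed = []
    for z in measurements:
        error += process_var                # predict: uncertainty grows
        gain = error / (error + meas_var)   # Kalman gain
        estimate += gain * (z - estimate)   # update with the measurement
        error *= (1.0 - gain)               # posterior variance shrinks
        smoothed.append(estimate)
    return smoothed

# Hypothetical noisy flow component (pixels per frame).
flow = [2.0, 2.6, 1.5, 2.2, 1.8, 2.4, 1.9, 2.1]
smooth = kalman_smooth(flow)
```

A smaller `process_var` relative to `meas_var` trusts the model more and smooths harder, at the cost of a slower response to genuine changes in the flow.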
A new spectral matching algorithm is proposed using the nonsubsampled contourlet transform and the scale-invariant feature transform. The nonsubsampled contourlet transform is used to decompose an image into a low-frequency image and several high-frequency images, and the scale-invariant feature transform is employed to extract feature points from the low-frequency image. A proximity matrix is constructed for the feature points of two related images. By singular value decomposition of the proximity matrix, a matching matrix (or matching result) reflecting the matching degree among feature points is obtained. Experimental results indicate that the proposed algorithm reduces time complexity and achieves higher accuracy.
This paper presents a purely vision-based technique for 3D reconstruction of planetary terrain. The reconstruction accuracy depends ultimately on an optimization technique known as 'bundle adjustment'. In vision techniques, the translation is known only up to a scale factor, and a single scale factor is assumed for the whole image sequence if only one camera is used. If an extra camera is available, stereo-vision-based reconstruction can be obtained from binocular views. If the baseline of the stereo setup is known, the scale factor problem is solved. We found that direct application of classical bundle adjustment to the constraints inherent between the binocular views has not been tested. Our method incorporates this constraint into the conventional bundle adjustment method. This special binocular bundle adjustment has been performed on image sequences resembling planetary terrain conditions. Experimental results show that our method enhances not only the localization accuracy but also the terrain mapping quality.
With the advancement of computer vision techniques in surveillance systems, there is a growing need for more proficient, intelligent, and sustainable recognition of facial expressions and age. The main purpose of this study is to develop an accurate facial expression and age recognition system capable of error-free recognition of human expression and age in both indoor and outdoor environments. The proposed system first takes an input image, pre-processes it, and then detects faces in the entire image. Landmark localization then supports the prediction of a synthetic face mask. A novel set of features is extracted and passed to a classifier for accurate classification of expression and age group. The proposed system is tested on two benchmark datasets, namely the Gallagher collection person dataset and the Images of Groups dataset. The system achieved remarkable results on these benchmark datasets in terms of recognition accuracy and computational time. The proposed system would also be applicable in consumer application domains such as online business negotiations, consumer behavior analysis, e-learning environments, and emotion robotics.
Systems using numerous cameras are emerging in many fields due to their ease of production and reduced cost, and one of the fields where they are expected to be used more actively in the near future is image-based rendering (IBR). Color correction between views is necessary when using multi-view systems in IBR, so that audiences feel comfortable when views are switched or when a free-viewpoint video is displayed. Color correction usually involves two steps: the first is to adjust camera parameters such as gain, brightness, and aperture before capture, and the second is to modify captured videos through image processing. This paper deals with the latter, which does not need a color pattern board. The proposed method uses the scale-invariant feature transform (SIFT) to detect correspondences, treats the RGB channels independently, calculates lookup tables with an energy-minimization approach, and corrects the captured video with these tables. The experimental results reveal that this approach works well.
This paper focuses on improving the detection performance of spectrum sensing in cognitive radio (CR) networks under complicated electromagnetic environments. Some existing fast spectrum sensing algorithms cannot obtain specific features of the licensed users' (LUs') signal, so they cannot be applied in this situation without knowing the noise power. On the other hand, some algorithms that yield specific features are too complicated. In this paper, an algorithm based on cyclostationary feature detection and the theory of the Hilbert transformation is proposed. Compared with the conventional cyclostationary feature detection algorithm, this approach is more flexible: it can adjust its computational complexity to the current electromagnetic environment by changing its sampling times and the step size of the cyclic frequency. Simulation results indicate that this approach can flexibly detect the features of the received signal and provide satisfactory detection performance compared with existing approaches in low signal-to-noise ratio (SNR) situations.
The global context (GC) descriptor for describing interest regions is improved by using gradient orientation for binning, which provides more robust invariance to geometric and photometric transformations. The performance of the improved GC (IGC) in image matching is studied through extensive experiments on the Oxford Affine dataset. Empirical results indicate that the proposed IGC yields quite stable and robust results, significantly outperforms the original GC, and can also outperform the classical scale-invariant feature transform (SIFT) in most of the test cases. When the IGC is integrated with SIFT, the resulting hybrid SIFT+IGC descriptor outperforms every single descriptor in these experimental evaluations under various geometric transformations.
To improve the performance of the scale-invariant feature transform (SIFT), a modified SIFT (M-SIFT) descriptor is proposed to realize fast and robust key-point extraction and matching. In descriptor generation, 3 rotation-invariant concentric-ring grids around the key-point location are used instead of the 16 square grids used in the original SIFT. Then, 10 orientations are accumulated for each grid, which results in a 30-dimensional descriptor. In descriptor matching, a rough mismatch rejection step is proposed based on the difference in grey-level information between matching points. The performance of the proposed method is tested for image mosaicking on simulated and real-world images. Experimental results show that the M-SIFT descriptor inherits SIFT's invariance to image scale and rotation, illumination change, and affine distortion. In addition, the time cost of feature extraction is reduced by 50% compared with the original SIFT, and the rough rejection step rejects at least 70% of mismatches. The results also demonstrate that the proposed M-SIFT method is superior to other improved SIFT methods in speed and robustness.
Image matching based on the scale-invariant feature transform (SIFT) is one of the most popular image matching approaches and exhibits high robustness and accuracy. Grayscale images rather than color images are generally used to compute SIFT descriptors in order to reduce complexity. In this case, regions with a similar grayscale level but different hues tend to produce wrong matching results, so the loss of color information may decrease the matching ratio. An image matching algorithm based on SIFT is proposed that adds a color offset and an exposure offset when converting color images to grayscale images in order to enhance the matching ratio. Experimental results show that the proposed algorithm can effectively differentiate regions with different colors but similar grayscale levels, and increases the matching ratio of SIFT-based image matching. Furthermore, it adds little complexity compared with the traditional SIFT.
Small or smooth cloned regions are difficult to detect in image copy-move forgery (CMF) detection. To address this problem, an effective method based on image segmentation and a swarm intelligence (SI) algorithm is proposed. The method segments the image into small non-overlapping blocks and calculates a smoothness degree for each block. The test image is then segmented into independent layers according to the smoothness degree. The SI algorithm is applied to find the optimal detection parameters for each layer. These parameters are used to detect forgery in each layer with a scale-invariant feature transform (SIFT)-based scheme, which can locate a large number of keypoints. The experimental results demonstrate the good performance of the proposed method, which is effective in identifying CMF images with small or smooth cloned regions.
A super-resolution enhancement algorithm is proposed for unmanned aerial vehicle (UAV) images based on the combination of fractional calculus and projection onto convex sets (POCS). Representative problems of UAV images, including motion blur, fisheye distortion, and overexposure, can be mitigated by the proposed algorithm. The fractional calculus operator is used to enhance the high-resolution and low-resolution reference frames for POCS. The affine transformation parameters between the low-resolution images and the reference frame are calculated by scale-invariant feature transform (SIFT) matching. The point spread function of POCS is simulated by a fractional integral filter instead of a Gaussian filter for clearer texture and detail. The proposed method is compared with other methods in terms of objective indices and subjective effect. The experimental results indicate that the proposed method outperforms other algorithms in most cases, especially in the structure and detail clarity of the reconstructed images.
Funding (generative image steganography paper): Supported in part by the National Natural Science Foundation of China (Nos. 62202234, 62401270), the China Postdoctoral Science Foundation (No. 2023M741778), and the Natural Science Foundation of Jiangsu Province (Nos. BK20240706, BK20240694).
Funding (action recognition paper): Supported by the National Natural Science Foundation of China (No. 61103123) and the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry.
Funding (3D face recognition paper): Project (XDA06020300) supported by the "Strategic Priority Research Program" of the Chinese Academy of Sciences; Project (12511501700) supported by the Research on the Key Technology of Internet of Things for Urban Community Safety Based on Video Sensor Networks.
Funding (voxel feature with Transformer paper): National Natural Science Foundation of China (No. 61806006); Innovation Program for Graduate of Jiangsu Province (No. KYLX160-781); University Superior Discipline Construction Project of Jiangsu Province.
Funding (photoacoustic tomography paper): Supported by the National Key R&D Program of China [2022YFC2402400], the National Natural Science Foundation of China [Grant No. 62275062], and the Guangdong Provincial Key Laboratory of Biomedical Optical Imaging Technology [Grant No. 2020B121201010-4].
Funding (low-dose CT denoising paper): Supported in part by the National Key Research and Development Program of China, No. 2022YFC2404103; in part by the Jiangsu Provincial Key Research and Development Program Social Development Project, No. BE2022720; in part by the Natural Science Foundation of China, No. 62001471; and in part by the Suzhou Science and Technology Plan Project, No. SYG202345.
Abstract: Low-dose computed tomography (LDCT) has gained increasing attention owing to its crucial role in reducing patients' radiation exposure. However, LDCT-reconstructed images often suffer from significant noise and artifacts, which impair radiologists' ability to diagnose accurately. To address this issue, many studies have focused on denoising LDCT images using deep learning (DL) methods. However, these DL-based denoising methods are hindered by the highly variable feature distribution of LDCT data from different imaging sources, which degrades the performance of current denoising models. In this study, we propose a parallel processing model, the multi-encoder deep feature transformation network (MDFTN), designed to enhance LDCT imaging for multisource data. Unlike traditional network structures, which rely on continual learning to process multitask data, the approach handles LDCT images from various imaging sources simultaneously within a unified framework. The proposed MDFTN consists of multiple encoders and decoders along with a deep feature transformation module (DFTM). During forward propagation in network training, each encoder extracts diverse features from its respective data source in parallel, and the DFTM compresses these features into a shared feature space. Subsequently, each decoder performs an inverse operation for multisource loss estimation. Through collaborative training, the proposed MDFTN leverages the complementary advantages of multisource data distributions to enhance its adaptability and generalization. Experiments on two public datasets and one local dataset demonstrate that the proposed network can process multisource data simultaneously while effectively suppressing noise and preserving fine structures. The source code is available at https://github.com/123456789ey/MDFTN.
Funding: Supported by the National High Technology Research and Development Program (863 Program) (2010AA7080302).
Abstract: On the basis of scale invariant feature transform (SIFT) descriptors, a novel kind of local invariant based on the SIFT sequence scale (SIFT-SS) is proposed and applied to target classification. First, the merits of using the SIFT algorithm for target classification are discussed. Second, the scales of the SIFT descriptors are sorted in descending order to form the SIFT-SS, which is fed to a support vector machine (SVM) with a radial basis function (RBF) kernel to train the classifier used for target classification. Experimental results indicate that the SIFT-SS algorithm is efficient for target classification and achieves a higher recognition rate than affine moment invariants (AMI) and multi-scale auto-convolution (MSA) in some complex situations, such as in the presence of noise and occlusion. Moreover, the computational time of SIFT-SS is shorter than that of MSA and longer than that of AMI.
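The two ingredients named in the abstract, a descending-sorted scale sequence and an RBF kernel, can be sketched as follows. This is only an illustration under assumptions: the abstract does not say how sequences of different length are handled, so the fixed `length` with zero padding and truncation is hypothetical, as are the function names and the default `gamma`.

```python
import math


def sift_ss(scales, length):
    """Sort key-point scales in descending order and fix the vector length.

    Assumption: zero padding and truncation to `length` are illustrative;
    the abstract does not specify this step.
    """
    ordered = sorted(scales, reverse=True)[:length]
    return ordered + [0.0] * (length - len(ordered))


def rbf_kernel(u, v, gamma=0.5):
    """Standard RBF kernel exp(-gamma * ||u - v||^2) between two vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)
```

For example, `sift_ss([1.2, 3.4, 2.0], 4)` yields `[3.4, 2.0, 1.2, 0.0]`, and two such vectors can be compared with `rbf_kernel` inside any SVM implementation that accepts a custom kernel.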
Funding: Supported by the Armament Research Fund of China (No. 9020A02010313BQ01).
Abstract: This paper focuses mainly on the design of a semi-strapdown image homing guided (SSIHG) system based on optical flow for a six-degree-of-freedom (6-DOF) axial-symmetric skid-to-turn missile. Three optical flow algorithms suitable for large displacements are introduced and compared. The influence of different displacements on the computational accuracy of the three algorithms is analyzed statistically. The total optical flow of the SSIHG missile is obtained using the scale invariant feature transform (SIFT) algorithm, which is the best of the three for large displacements. After the rotational optical flow caused by rotation of the gimbal and missile body is removed from the total optical flow, the remaining translational optical flow is smoothed via Kalman filtering. The circular navigation guidance (CNG) law with impact angle constraint is then obtained from the smoothed translational optical flow and the position of the target image. Simulations are carried out under both disturbed and undisturbed conditions, and the results indicate that the proposed guidance strategy for SSIHG missiles can achieve a precise target hit with a desired impact angle without needing the time-to-go parameter.
Funding: Supported by the National Natural Science Foundation of China (61172127, 11071002), the Specialized Research Fund for the Doctoral Program of Higher Education (20113401110006), and the Innovative Research Team of 211 Project in Anhui University (KJTD007A).
Abstract: A new spectral matching algorithm is proposed using the nonsubsampled contourlet transform and the scale-invariant feature transform. The nonsubsampled contourlet transform decomposes an image into one low-frequency image and several high-frequency images, and the scale-invariant feature transform extracts feature points from the low-frequency image. A proximity matrix is constructed for the feature points of two related images. Singular value decomposition of the proximity matrix yields a matching matrix (the matching result) that reflects the degree of matching among feature points. Experimental results indicate that the proposed algorithm reduces time complexity and achieves higher accuracy.
Funding: Supported by the National Natural Science Foundation of China (Nos. 60505017 and 60534070) and the Science Planning Project of Zhejiang Province, China (No. 2005C14008).
Abstract: This paper presents a purely vision-based technique for 3D reconstruction of planetary terrain. The reconstruction accuracy ultimately depends on an optimization technique known as 'bundle adjustment'. In vision techniques, the translation is known only up to a scale factor, and a single scale factor is assumed for the whole image sequence if only one camera is used. If an extra camera is available, stereo-vision-based reconstruction can be obtained from binocular views. If the baseline of the stereo setup is known, the scale factor problem is solved. We found that direct application of classical bundle adjustment to the constraints inherent between the binocular views has not been tested. Our method incorporates this constraint into the conventional bundle adjustment method. This special binocular bundle adjustment has been performed on image sequences resembling planetary terrain conditions. Experimental results show that the method improves not only the localization accuracy but also the terrain mapping quality.
Funding: This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. 2018R1D1A1A02085645). This work was also supported by the Korea Medical Device Development Fund grant funded by the Korean government (the Ministry of Science and ICT, the Ministry of Trade, Industry and Energy, the Ministry of Health & Welfare, the Ministry of Food and Drug Safety) (Project Number: 202012D05-02).
Abstract: With the advancement of computer vision techniques in surveillance systems, there is a growing need for more proficient, intelligent, and sustainable facial expression and age recognition. The main purpose of this study is to develop an accurate facial expression and age recognition system capable of error-free recognition of human expression and age in both indoor and outdoor environments. The proposed system first pre-processes an input image and then detects faces in the entire image. Landmark localization then supports the formation of a synthetic face mask prediction. A novel set of features is extracted and passed to a classifier for accurate classification of expressions and age groups. The proposed system is tested on two benchmark datasets, namely the Gallagher collection person dataset and the Images of Groups dataset. The system achieved remarkable results on these benchmark datasets in terms of recognition accuracy and computational time. The proposed system would also be applicable in consumer application domains such as online business negotiations, consumer behavior analysis, e-learning environments, and emotion robotics.
Abstract: Systems using numerous cameras are emerging in many fields owing to their ease of production and reduced cost, and one field where they are expected to be used more actively in the near future is image-based rendering (IBR). Color correction between views is necessary when multi-view systems are used in IBR, so that audiences feel comfortable when views are switched or when a free-viewpoint video is displayed. Color correction usually involves two steps: the first is to adjust camera parameters such as gain, brightness, and aperture before capture, and the second is to modify captured videos through image processing. This paper deals with the latter, which does not need a color pattern board. The proposed method uses the scale invariant feature transform (SIFT) to detect correspondences, treats the RGB channels independently, calculates lookup tables with an energy-minimization approach, and corrects the captured video with these tables. The experimental results reveal that this approach works well.
Funding: Sponsored by the National Basic Research Program of China (973 Program, No. 2013CB329003), the National Natural Science Foundation of China (No. 91438205), the China Postdoctoral Science Foundation (No. 2011M500664), and the Open Research Fund Program of the Key Lab. for Spacecraft TT&C and Communication, Ministry of Education, China (No. CTTC-FX201305).
Abstract: This paper focuses on improving the detection performance of spectrum sensing in cognitive radio (CR) networks in complicated electromagnetic environments. Some existing fast spectrum sensing algorithms cannot obtain specific features of the licensed users' (LUs') signal, so they cannot be applied in this situation without knowing the noise power. On the other hand, some algorithms that do yield specific features are too complicated. In this paper, an algorithm based on cyclostationary feature detection and the theory of the Hilbert transform is proposed. Compared with the conventional cyclostationary feature detection algorithm, this approach is more flexible: it can adjust its computational complexity to the current electromagnetic environment by changing its number of samples and the step size of the cyclic frequency. Simulation results indicate that this approach can flexibly detect the features of the received signal and provides satisfactory detection performance compared with existing approaches in low signal-to-noise ratio (SNR) situations.
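The trade-off between cyclic-frequency step size and computation can be seen in a plain cyclic autocorrelation scan. The sketch below covers only the conventional estimator, not the Hilbert-transform-based algorithm of the paper; the names `cyclic_autocorr` and `scan_cyclic_frequencies`, the lag `tau`, and the normalized frequency range are all assumptions made for illustration.

```python
import cmath
import math


def cyclic_autocorr(x, alpha, tau=0):
    """Estimate R_x^alpha(tau) = mean of x[n] * conj(x[n+tau]) * exp(-j*2*pi*alpha*n)."""
    n_terms = len(x) - tau
    total = 0j
    for n in range(n_terms):
        total += x[n] * x[n + tau].conjugate() * cmath.exp(-2j * math.pi * alpha * n)
    return total / n_terms


def scan_cyclic_frequencies(x, step, alpha_max=0.5):
    """Return (alpha, |R_x^alpha(0)|) pairs on a grid with the given step.

    A coarser step means fewer evaluations and lower cost, at the price
    of possibly missing a narrow cyclic feature.
    """
    count = int(round(alpha_max / step))
    return [(k * step, abs(cyclic_autocorr(x, k * step))) for k in range(1, count + 1)]
```

For a real sinusoid of normalized frequency 0.125, the scan peaks at cyclic frequency 0.25, twice the carrier frequency, which is the kind of signal-specific feature that stationary noise does not exhibit.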
Funding: Supported by the National Natural Science Foundation of China (Nos. 60970109 and 61170228).
Abstract: The global context (GC) descriptor for describing interest regions is improved by using gradient orientation for binning, which provides more robust invariance to geometric and photometric transformations. The performance of the improved GC (IGC) in image matching is studied through extensive experiments on the Oxford Affine dataset. Empirical results indicate that the proposed IGC yields stable and robust results, significantly outperforms the original GC, and also outperforms the classical scale-invariant feature transform (SIFT) in most test cases. When the IGC is integrated with SIFT, the resulting hybrid SIFT+IGC performs better than any single descriptor in these experimental evaluations with various geometric transformations.
Funding: Supported by the National Natural Science Foundation of China (60905012).
Abstract: To improve the performance of the scale invariant feature transform (SIFT), a modified SIFT (M-SIFT) descriptor is proposed for fast and robust key-point extraction and matching. In descriptor generation, 3 rotation-invariant concentric-ring grids around the key-point location are used instead of the 16 square grids of the original SIFT. Then, 10 orientations are accumulated for each grid, which results in a 30-dimension descriptor. In descriptor matching, a rough mismatch rejection step is proposed based on the difference in grey information between matching points. The performance of the proposed method is tested for image mosaicking on simulated and real-world images. Experimental results show that the M-SIFT descriptor inherits SIFT's invariance to image scale and rotation, illumination change, and affine distortion. In addition, the time cost of feature extraction is reduced by 50% compared with the original SIFT, and the rough mismatch rejection rejects at least 70% of mismatches. The results also demonstrate that the proposed M-SIFT method is superior to other improved SIFT methods in speed and robustness.
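The 3-ring by 10-orientation layout gives the 30 dimensions quoted above. A minimal sketch of such a layout follows; it is not the authors' implementation. It assumes equal-width rings, magnitude-weighted votes, L2 normalization, and gradient angles already expressed relative to the key-point's dominant orientation. The function name `ring_descriptor` and the sample format are hypothetical.

```python
import math


def ring_descriptor(samples, radius, n_rings=3, n_bins=10):
    """Build an (n_rings * n_bins)-dimensional descriptor from gradient samples.

    samples: iterable of (dx, dy, magnitude, angle), with dx, dy the offset
    from the key-point and angle in radians (assumed to be measured relative
    to the dominant orientation, which is what makes rings rotation-invariant).
    """
    desc = [0.0] * (n_rings * n_bins)
    for dx, dy, magnitude, angle in samples:
        r = math.hypot(dx, dy)
        if r >= radius:
            continue  # sample lies outside the outermost ring
        ring = int(r / radius * n_rings)  # equal-width rings (assumption)
        fraction = (angle % (2 * math.pi)) / (2 * math.pi)
        orientation_bin = int(fraction * n_bins) % n_bins
        desc[ring * n_bins + orientation_bin] += magnitude
    norm = math.sqrt(sum(v * v for v in desc))
    return [v / norm for v in desc] if norm > 0 else desc
```

Ring membership depends only on distance from the key-point, so rotating the patch does not move a sample to another ring, unlike the 4-by-4 square grid of the original SIFT.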
Funding: Supported by the National Natural Science Foundation of China (61271315) and the State Scholarship Fund of China.
Abstract: Image matching based on the scale invariant feature transform (SIFT) is one of the most popular image matching approaches and exhibits high robustness and accuracy. To reduce complexity, SIFT descriptors are generally computed from grayscale images rather than color images. In this case, regions that have a similar grayscale level but different hues tend to produce wrong matches, so the loss of color information may lower the matching ratio. An image matching algorithm based on SIFT is proposed that adds a color offset and an exposure offset when converting color images to grayscale images in order to raise the matching ratio. Experimental results show that the proposed algorithm can effectively differentiate regions with different colors but similar grayscale levels and increases the matching ratio of SIFT-based image matching. Furthermore, it adds little complexity compared with the traditional SIFT.
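The motivating problem, different hues collapsing to the same gray level, is easy to reproduce. The sketch below uses the common ITU-R BT.601 luma weights as an assumed conversion; it does not implement the paper's color or exposure offsets, whose formulas the abstract does not give.

```python
def to_gray(r, g, b):
    """Convert an RGB triple to a gray level with BT.601 luma weights (assumed conversion)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


# Two clearly different hues: pure red and a mid-intensity green.
red = (255, 0, 0)
green = (0, 130, 0)

# After rounding to 8-bit gray levels the two pixels become indistinguishable.
gray_red = round(to_gray(*red))
gray_green = round(to_gray(*green))
```

Both values round to gray level 76, so a grayscale-only descriptor sees no edge between a red and a green region of these intensities.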
Funding: Supported by the National Natural Science Foundation of China (61472429, 61070192, 91018008, 61303074, 61170240), the National High Technology Research and Development Program of China (863 Program) (2007AA01Z414), the National Science and Technology Major Project of China (2012ZX01039-004), and the Beijing Natural Science Foundation (4122041).
Abstract: Small or smooth cloned regions are difficult to detect in image copy-move forgery (CMF) detection. To address this problem, an effective method based on image segmentation and a swarm intelligence (SI) algorithm is proposed. The method segments the image into small non-overlapping blocks and computes a smoothness degree for each block. The test image is then segmented into independent layers according to the smoothness degree. The SI algorithm is applied to find the optimal detection parameters for each layer. These parameters are used to examine each layer with a scale invariant feature transform (SIFT)-based scheme, which can locate a large number of keypoints. The experimental results demonstrate the good performance of the proposed method, which is effective in identifying CMF images with small or smooth cloned regions.
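The block-then-layer segmentation can be sketched as follows. The abstract does not define the smoothness measure, so intensity variance is used here purely as a stand-in assumption, and a single threshold producing two layers is likewise hypothetical. The names `block_variances` and `split_layers` are illustrative.

```python
def block_variances(image, block):
    """Map each non-overlapping block's (top, left) corner to its intensity variance.

    Assumption: variance stands in for the paper's unspecified smoothness
    degree; a low variance means a smooth block.
    """
    height, width = len(image), len(image[0])
    result = {}
    for top in range(0, height - block + 1, block):
        for left in range(0, width - block + 1, block):
            pixels = [image[y][x]
                      for y in range(top, top + block)
                      for x in range(left, left + block)]
            mean = sum(pixels) / len(pixels)
            result[(top, left)] = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return result


def split_layers(variances, threshold):
    """Split blocks into a smooth layer and a textured layer by one threshold."""
    smooth = [key for key, value in variances.items() if value < threshold]
    textured = [key for key, value in variances.items() if value >= threshold]
    return smooth, textured
```

Each layer could then be handed to a keypoint detector with its own parameters, for example a more permissive contrast threshold for the smooth layer, where a default SIFT setting tends to find few keypoints.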
Funding: This work is supported by the National Key Research and Development Program of China [grant number 2016YFB0502602], the National Natural Science Foundation of China [grant number 61471272], and the Natural Science Foundation of Hubei Province, China [grant number 2016CFB499].
Abstract: A super-resolution enhancement algorithm based on the combination of fractional calculus and Projection onto Convex Sets (POCS) is proposed for unmanned aerial vehicle (UAV) images. The proposed algorithm mitigates representative problems of UAV images, including motion blur, fisheye distortion, and overexposure. The fractional calculus operator is used to enhance the high-resolution and low-resolution reference frames for POCS. The affine transformation parameters between the low-resolution images and the reference frame are calculated by matching with the Scale Invariant Feature Transform (SIFT). The point spread function of POCS is simulated by a fractional integral filter instead of a Gaussian filter to obtain clearer texture and detail. Objective indices and subjective effects are compared between the proposed method and other methods. The experimental results indicate that the proposed method outperforms the other algorithms in most cases, especially in the structure and detail clarity of the reconstructed images.