Funding: Supported in part by the National Key Research and Development Program of China under Grant No. 2020YFB1806403.
Abstract: An effective nonrigid image registration method is developed based on the optical flow field (OFF) framework for the complex registration of structure images. In our method, a new force is modeled and integrated into the original optical flow equation to jointly drive the motion direction of pixels. At any point in the offset field, the force generated by the OFF model, derived from local gradient information, drives the pixels in the floating image to infiltrate into the reference pixel set. In addition, a new "guiding force," derived from the global grayscale trend in a given neighborhood system, helps the pixels spread more properly into the corresponding reference pixel set, particularly when the gradient field of the reference image is unstable. In the experiment, a data set containing several images with complex structures was employed to validate the performance of our registration model. The test results show that our method can quickly and efficiently register complex images and is robust to noise in images.
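As a rough illustration of how two forces could jointly drive a pixel, the sketch below works on 1-D signals. `local_force` is the classical demons-style optical flow displacement computed from the local gradient of the reference. `guiding_force` is a hypothetical stand-in for the guiding force: it applies the same formula to the coarse intensity trend and mean intensity difference over a neighborhood. The function names, the blending weight `alpha`, the `radius` parameter, and the form of the guiding term are all assumptions made for illustration; the abstract does not give the actual formulation.

```python
def local_force(ref, flo, i):
    """Demons-style displacement at index i from the local reference gradient."""
    lo, hi = max(i - 1, 0), min(i + 1, len(ref) - 1)
    grad = (ref[hi] - ref[lo]) / (hi - lo) if hi > lo else 0.0
    diff = flo[i] - ref[i]
    denom = grad * grad + diff * diff
    return diff * grad / denom if denom > 0 else 0.0


def guiding_force(ref, flo, i, radius=2):
    """Hypothetical guiding term: same formula on neighborhood-level statistics."""
    lo, hi = max(i - radius, 0), min(i + radius, len(ref) - 1)
    left, right = ref[lo:i], ref[i + 1:hi + 1]
    if not left or not right:
        return 0.0  # no two-sided neighborhood at the signal border
    # Coarse trend: difference of half-window means over the distance
    # between the centres of the two half-windows.
    left_centre = (lo + i - 1) / 2.0
    right_centre = (i + 1 + hi) / 2.0
    trend = (sum(right) / len(right) - sum(left) / len(left)) / (right_centre - left_centre)
    # Mean intensity difference between floating and reference over the window.
    diff = sum(flo[k] - ref[k] for k in range(lo, hi + 1)) / (hi - lo + 1)
    denom = trend * trend + diff * diff
    return diff * trend / denom if denom > 0 else 0.0


def combined_displacement(ref, flo, i, alpha=0.5, radius=2):
    """Blend the local force with the (assumed) guiding term."""
    return local_force(ref, flo, i) + alpha * guiding_force(ref, flo, i, radius)


# The reference is locally flat at index 2, so the local gradient force vanishes
# there, while the neighborhood trend still yields a nonzero guiding term.
ref = [0.0, 1.0, 1.0, 1.0, 2.0]
flo = [1.0, 2.0, 2.0, 2.0, 3.0]
print(local_force(ref, flo, 2), guiding_force(ref, flo, 2))
```

The flat-gradient case in the usage lines mirrors the situation the abstract describes, where the gradient field of the reference image gives no usable direction on its own.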
Abstract: Objective: Most Western music consists of a melody and an accompaniment. The melody is referred to as the foreground and the accompaniment as the background. In visual processing, the lateral occipital complex (LOC) is known to participate in foreground and background segregation. We investigated the role of the LOC in music processing using positron emission tomography (PET). Method: Musically naïve subjects listened to unfamiliar novel melodies with accompaniment (accompaniment condition) and without accompaniment (melodic condition). Using a PET subtraction technique, we studied changes in regional cerebral blood flow (rCBF) during the accompaniment condition compared to the melodic condition. Results: The accompaniment condition was associated with a bilateral increase of rCBF at the lateral and medial surfaces of both occipital lobes, medial parts of the fusiform gyri, cingulate gyri, precentral gyri, insular cortices, and cerebellum. During the melodic condition, activation was observed at the anterior and posterior portions of the temporal lobes, the medial surface of the frontal lobes, inferior frontal gyri, orbitofrontal cortices, inferior parietal lobules, and cerebellum. Conclusions: The LOC participates in the recognition of melody with accompaniment, a phenomenon that can be regarded as foreground and background segregation in auditory processing. The fusiform cortex, which is known to participate in color recognition, might be activated by the recognition of flourish sounds in the accompaniment, compared to the melodic condition. The LOC and fusiform cortex are thus presumed to play similar roles across sensory modalities.
Abstract: We often encounter documents with text printed on a complex color background. The readability of textual content in such documents is very poor because of the complexity of the background and the mixing of foreground text colors with background colors. Automatic segmentation of foreground text in such document images is essential for smooth reading of the document contents by either a human or a machine. In this paper we propose a novel approach to extract the foreground text in color document images with complex backgrounds. The proposed approach is a hybrid that combines connected component analysis and texture feature analysis of potential text regions. It uses the Canny edge detector to detect all possible text edge pixels. Connected component analysis is performed on these edge pixels to identify candidate text regions. Because of background complexity, a non-text region may also be identified as a text region. This problem is overcome by analyzing the texture features of the potential text region corresponding to each connected component. An unsupervised local thresholding method is devised to perform foreground segmentation in the detected text regions. Finally, noisy text regions are identified and reprocessed to further enhance the quality of the retrieved foreground. The proposed approach can handle document images with varying backgrounds of multiple colors and textures, and foreground text in any color, font, size, and orientation. Experimental results show that the proposed algorithm detects on average 97.12% of the text regions in the source document. The readability of the extracted foreground text is illustrated through optical character recognition (OCR) when the text is in English. The proposed approach is compared with some existing methods of foreground separation in document images, and experimental results show that our approach performs better.
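The candidate-region step described above can be sketched as connected component labeling on a binary edge map. This is a minimal sketch, not the paper's implementation: the hand-made grid stands in for a Canny edge output, 8-connectivity is an assumption, and `label_components` and `bounding_box` are hypothetical helper names. The texture analysis and local thresholding stages are not shown.

```python
from collections import deque


def label_components(edges):
    """Group nonzero pixels of a binary grid into 8-connected components.

    Returns a list of components, each a list of (row, col) pixels.
    """
    rows = len(edges)
    cols = len(edges[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not edges[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited edge pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and edges[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_box(pixels):
    """Return (top, left, bottom, right) of a component, a candidate text region."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    return (min(ys), min(xs), max(ys), max(xs))


# Toy edge map with two separate strokes.
edge_map = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
]
boxes = [bounding_box(comp) for comp in label_components(edge_map)]
print(boxes)
```

Each bounding box would then be passed to the texture check to reject non-text regions before thresholding.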
Funding: Supported by the National Natural Science Foundation of China (Nos. 61906168, 62202429 and 62272267), the Zhejiang Provincial Natural Science Foundation of China (No. LY23F020023), and the Construction of Hubei Provincial Key Laboratory for Intelligent Visual Monitoring of Hydropower Projects (No. 2022SDSJ01).
Abstract: Accurately identifying building distribution from remote sensing images with complex background information is challenging. The emergence of diffusion models has prompted the idea of employing the reverse denoising process to distill building distribution from these complex backgrounds. Building on this concept, we propose a novel framework, the building extraction diffusion model (BEDiff), which refines the extraction of building footprints from remote sensing images in a stepwise fashion. Our approach begins with the design of booster guidance, a mechanism that extracts structural and semantic features from remote sensing images to serve as priors, thereby providing targeted guidance for the diffusion process. Additionally, we introduce a cross-feature fusion module (CFM) that bridges the semantic gap between different types of features, so that the attributes extracted by booster guidance are integrated into the diffusion process more effectively. BEDiff marks the first application of diffusion models to the task of building extraction. Extensive experiments on the Beijing building dataset demonstrate the superior performance of BEDiff, affirming its effectiveness and potential for enhancing the accuracy of building extraction in complex urban landscapes.
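To make the reverse denoising idea concrete, the sketch below shows the standard DDPM-style forward noising of a clean signal and the matching clean-signal estimate that a reverse step relies on. It assumes a linear beta schedule and uses the true noise in place of a trained noise-prediction network. The abstract does not specify BEDiff's schedule, network, or guidance mechanism, so every name here (`linear_betas`, `alpha_bar`, `add_noise`, `estimate_clean`) is hypothetical.

```python
import math


def linear_betas(n, start=1e-4, end=0.02):
    """Linear noise schedule with n steps (assumed, not taken from the paper)."""
    if n == 1:
        return [start]
    return [start + (end - start) * k / (n - 1) for k in range(n)]


def alpha_bar(t, betas):
    """Cumulative product of (1 - beta) up to and including step t."""
    prod = 1.0
    for b in betas[:t + 1]:
        prod *= 1.0 - b
    return prod


def add_noise(x0, eps, a_bar):
    """Forward process: x_t = sqrt(a_bar) * x0 + sqrt(1 - a_bar) * eps."""
    return [math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * e for x, e in zip(x0, eps)]


def estimate_clean(xt, eps_pred, a_bar):
    """Invert the forward process given a noise prediction."""
    return [(x - math.sqrt(1.0 - a_bar) * e) / math.sqrt(a_bar) for x, e in zip(xt, eps_pred)]


# A tiny binary "building mask" and a fixed noise sample.
mask = [0.0, 1.0, 1.0, 0.0]
noise = [0.3, -0.5, 0.1, 0.8]
a_bar = alpha_bar(9, linear_betas(10))
noisy = add_noise(mask, noise, a_bar)
recovered = estimate_clean(noisy, noise, a_bar)  # oracle noise stands in for a network
print(recovered)
```

With a perfect noise prediction the mask is recovered exactly; in practice the quality of the estimate depends on the learned predictor and on whatever conditioning guides it.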
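Abstract: Disease images collected in real environments contain substantial noise and complex background interference, which lowers the accuracy and generalization of crop disease identification. To address this problem, this study proposes a crop disease identification method based on adaptive BayesShrink and frequency-spatial domain feature fusion (AFSF-DCT). First, an adaptive BayesShrink algorithm (Ad-BayesShrink) is designed to reduce noise interference while retaining more detail, making it easier for the identification model to extract disease features. Next, a crop leaf disease identification model based on frequency-spatial feature fusion and dynamic cross-self-attention (FSF-DCT) is proposed. To achieve comprehensive frequency-spatial feature mapping, a frequency-spatial feature mapping branch (DWT-Bneck), based on the discrete wavelet transform (DWT) and the inverted residual structure (bneck), is designed to capture multi-scale disease features. In the frequency-domain branch, a 2D DWT-based frequency-feature decomposition module (DWFD) captures disease details and textures to compensate for the limited ability of spatial-domain features to express global information. In the spatial-domain branch, the convolutional block attention module (CBAM) and the Dynamic Shift Max activation function are introduced into bneck to achieve comprehensive spatial feature mapping. Finally, a multi-scale feature fusion module based on dynamic cross-self-attention (MDCS-DF) fuses the frequency and spatial features and strengthens the model's attention to disease features. The results show that Ad-BayesShrink achieved the highest peak signal-to-noise ratio (PSNR), 35.78, outperforming VisuShrink and SUREShrink. FSF-DCT achieved identification accuracies of 99.20%, 99.90%, and 90.75% on a self-built dataset and two open-source datasets (PlantVillage and AI Challenger 2018), respectively, with a small number of parameters (7.48 M) and floating-point operations (4.62 G), outperforming most current mainstream identification models. AFSF-DCT can serve as a model reference for fast and accurate detection of crop leaf diseases against complex backgrounds.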
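For orientation, the sketch below shows the standard (non-adaptive) BayesShrink rule that Ad-BayesShrink builds on, together with soft thresholding and the PSNR metric used to compare denoisers. The adaptive modification itself is not described in the abstract and is not reproduced here. The function names are hypothetical, the inputs are flat lists of wavelet detail coefficients, and the 8-bit peak value of 255 is an assumption.

```python
import math
from statistics import median


def estimate_noise_sigma(detail):
    """Robust noise estimate: median absolute detail coefficient / 0.6745."""
    return median(abs(d) for d in detail) / 0.6745


def bayes_shrink_threshold(coeffs, sigma_n):
    """Standard BayesShrink threshold T = sigma_n^2 / sigma_x for one subband."""
    var_y = sum(c * c for c in coeffs) / len(coeffs)
    sigma_x = math.sqrt(max(var_y - sigma_n ** 2, 0.0))
    if sigma_x == 0.0:
        # Subband looks like pure noise: threshold everything away.
        return max(abs(c) for c in coeffs)
    return sigma_n ** 2 / sigma_x


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t, keeping its sign."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length signals."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)


subband = [3.0, -3.0, 1.0, -1.0]
t = bayes_shrink_threshold(subband, sigma_n=1.0)
print(t, soft_threshold(subband, t))
```

In standard BayesShrink the noise level is usually estimated from the finest diagonal detail subband and then reused for every subband, while the threshold is computed per subband.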
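Abstract: During power line inspection, edge-side intelligent detection devices such as unmanned aerial vehicles often face difficulties such as small insulator defect targets on transmission lines and complex backgrounds. In addition, the hardware constraints of edge devices limit model size, resulting in limited computing power and low model accuracy. To address these problems, this paper proposes a transmission line insulator defect detection method based on the YOLOv8-RFL (You Only Look Once version 8-RFL) model. First, the C2f (CSPDarknet53 to 2-Stage FPN) module of the original backbone network is improved to strengthen the model's ability to extract insulator defect features. Second, a focusing generalized feature pyramid network (FGFPN) is constructed, which adopts a "feature focusing-diffusion" idea to refine the feature representation of small defect targets. Third, a feature semantic fusion module (FSFM) based on a cross-attention mechanism is designed to optimize the capture and use of key feature information. Finally, a lightweight weight-sharing detection head (LWSD) is proposed to improve computational efficiency and real-time performance while maintaining detection accuracy. Experiments show that the improved YOLOv8-RFL model reaches a mean average precision (mAP) of 93.2%, an improvement of 5.9% over the baseline model. While reducing the number of model parameters and the required computation, it achieves better detection of small insulator defects and is of practical value for transmission line insulator defect detection against complex backgrounds.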
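Since the result is reported as mAP, a minimal sketch of how that metric is commonly computed may help. It assumes detections have already been matched to ground truth by an IoU criterion and uses all-point interpolation of the precision-recall curve. The abstract does not state the evaluation protocol or IoU threshold, so these choices and the function names are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(scored_hits, num_ground_truth):
    """AP for one class from (confidence, is_true_positive) pairs."""
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, hit in ranked:
        if hit:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Precision envelope: make precision non-increasing as recall grows.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap):
    """mAP is the mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


detections = [(0.9, True), (0.8, False), (0.7, True)]
print(average_precision(detections, num_ground_truth=2))
```

A detection would typically count as a true positive when its `iou` with an unmatched ground-truth box exceeds a chosen threshold.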