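Abstract: Because of the memory limits of a single machine, radar remote sensing images are usually processed by using I/O functions to randomly access the image file on disk. Processing a whole image therefore takes a long time and rarely meets the needs of practical applications. Using JIAJIA, a software distributed shared memory system, the physical memory of several PCs is linked into one larger shared memory space, which lets multiple PCs process remote sensing images synchronously, conveniently, and quickly. By analyzing the serial algorithms for SAR image geometric correction, image filtering, and supervised classification, corresponding parallel algorithms were developed and tested on eight Pentium II PCs running Linux, each with a 400 MHz clock and 256 MB of memory. All three experiments achieved superlinear speedup.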
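For readers unfamiliar with the term, speedup is the serial run time divided by the parallel run time, and it is called superlinear when it exceeds the number of processors. The sketch below only illustrates that definition; the function names and the timings in the usage lines are hypothetical and are not measurements from the paper.

```python
def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Classic speedup: serial run time divided by parallel run time."""
    return serial_seconds / parallel_seconds


def is_superlinear(serial_seconds: float, parallel_seconds: float,
                   processors: int) -> bool:
    """True when the speedup is larger than the processor count."""
    return speedup(serial_seconds, parallel_seconds) > processors


# Hypothetical timings, for illustration only.
print(speedup(100.0, 10.0))            # 10.0
print(is_superlinear(100.0, 10.0, 8))  # True: 10.0 > 8
```

A plausible reading of the abstract is that the aggregated memory of the eight machines removes the disk I/O that dominates the serial run, which is one common way for speedup to exceed the processor count.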
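Abstract: Objective: Infrared small target detection aims to accurately identify and locate dim, small infrared targets against complex backgrounds, and has important application prospects in tasks such as maritime reconnaissance and military rescue. However, because targets in infrared images are small and low in contrast, current detection methods still struggle to balance detection accuracy against the false alarm rate. To address this problem, a selective attention-based network for infrared small target detection (SANet) is proposed. Method: A dual-path semantic perception module is designed to strengthen the network's perception of dim, small targets. The module combines a standard convolution path with a pinwheel-shaped convolution path, balancing local spatial consistency against global context awareness, and adds spatial/channel attention mechanisms to refine the feature representation, which makes targets easier to distinguish from the background. In addition, to overcome the limitations of the static skip connections in U-Net during feature fusion, a selective attention fusion module is proposed. Based on a spatial dynamic weighting mechanism, this module fuses cross-scale features adaptively and selectively enhances key regions according to their spatial saliency, improving the ability to tell real targets from false alarms. Results: Experiments on three public benchmark datasets show that SANet outperforms existing state-of-the-art methods on four metrics: intersection over union (IoU), nIoU, P_d, and F_a. On the three datasets, its IoU exceeds that of the second-best method by 1.93%, 4.32%, and 2.21%, respectively. Conclusion: SANet effectively strengthens the network's perception of small targets, its representation of key features, and its suppression of background interference. The source code is available at https://gitcode.com/m0_61988291/SANet.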
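As a point of reference for the IoU figures quoted above, the following minimal sketch computes intersection over union for two binary masks. The nested-list mask format, the function name `iou`, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; this is not SANet's evaluation code.

```python
from typing import Sequence


def iou(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """Intersection over union of two equally sized binary masks."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy 2x2 masks: one overlapping pixel, two pixels in the union.
print(iou([[1, 1], [0, 0]], [[1, 0], [0, 0]]))  # 0.5
```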
Abstract: This study investigates the impacts of mixing time, execution procedure, cement dosage (α), and total water-to-cement ratio (W_Total/C) on the mixing energy (E) of deep soil mixing (DSM) columns and how E influences the strength of treated sand. Columns with a diameter of 7.5 cm were constructed using three mixing times (130, 190, and 250 s), two execution procedures (normal and zigzag), three α values (300, 400, and 500 kg/m³), and three W_Total/C ratios (2.5, 3.0, and 3.5). For comparison, equivalent laboratory samples were also examined. Results revealed that increasing the mixing time and α, adopting the zigzag execution procedure, and reducing the W_Total/C ratio increase E. Outcomes indicated that an increase in E from 0.49-0.70 kJ to 0.70-0.90 kJ, 0.90-1.10 kJ, and 1.10-1.40 kJ improves the unconfined compressive strength (UCS) of columns on average by 66%, 124%, and 179%, respectively, and the secant modulus by 61%, 110%, and 152%. Average strain at maximum stress also rises from 0.68% to 0.75%, 0.81%, and 0.84%, respectively. The study identified a threshold in the direct relationship between E and the strength ratio (λ), beyond which λ did not increase significantly with further increases in E. Additionally, at low and high E levels, DSM samples mainly failed by crushing and cracking modes, respectively. In DSM columns with α = 500 kg/m³ and W_Total/C = 2.5, increasing average E from 0.77 kJ to 0.95 kJ, 1.08 kJ, and 1.28 kJ resulted in a reduction of coefficients of variation of UCS from 30.4% to 27.8%, 24.5%, and 21.1%, respectively.
Funding: Supported by the National Natural Science Foundation of China (No. 60073018).
Abstract: A major overhead in software DSM (Distributed Shared Memory) is the cost of remote memory accesses, both those required by the protocol and those induced by false sharing. This paper introduces a dynamic prefetching method implemented in the JIAJIA software DSM to reduce the system overhead caused by remote accesses. For each cached page, the method records the interleaved string of INV (invalidation) and GETP (getting a remote page) operations and analyzes the periodicity of that string whenever the page is invalidated on a lock or barrier. A prefetching request is issued after the lock or barrier if the periodicity analysis indicates that GETP will be the next operation in the string. Multiple prefetching requests bound for the same host are merged into a single message. Performance evaluation with eight well-accepted benchmarks on a cluster of sixteen PowerPC workstations shows that the prefetching scheme significantly reduces page fault overhead, yielding a performance increase of 15%-20% in three benchmarks and around 8%-10% in another three. The average extra traffic caused by useless prefetches is only 7%-13% in the evaluation.
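The abstract does not spell out how the periodicity analysis works, so the following is only a sketch under stated assumptions: each page's history is a string of "I" (INV) and "G" (GETP), the string is treated as periodic if some period repeats at least twice, and the next operation is predicted by extending that period. The names `smallest_period`, `predict_next`, `should_prefetch`, and `merge_by_host` are hypothetical and are not JIAJIA functions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def smallest_period(history: str) -> Optional[int]:
    """Smallest p such that history[i] == history[i - p] for all i >= p.

    Only periods up to half the history length are accepted, so the
    pattern must have been seen at least twice (an assumption).
    """
    n = len(history)
    for p in range(1, n // 2 + 1):
        if all(history[i] == history[i - p] for i in range(p, n)):
            return p
    return None


def predict_next(history: str) -> Optional[str]:
    """Predict the next operation by extending the detected period."""
    p = smallest_period(history)
    return None if p is None else history[len(history) - p]


def should_prefetch(history: str) -> bool:
    """Prefetch only when the predicted next operation is GETP ('G')."""
    return predict_next(history) == "G"


def merge_by_host(requests: Iterable[Tuple[int, int]]) -> Dict[int, List[int]]:
    """Group (host, page) prefetch requests so each host gets one message."""
    merged: Dict[int, List[int]] = defaultdict(list)
    for host, page in requests:
        merged[host].append(page)
    return dict(merged)


print(should_prefetch("IGIGI"))  # True: the pattern "IG" predicts G next
print(merge_by_host([(1, 10), (2, 20), (1, 11)]))  # {1: [10, 11], 2: [20]}
```

Requiring two full repetitions before trusting a period is a conservative choice that trades some missed prefetches for fewer useless ones; the paper may use a different criterion.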
Funding: The work of this paper is supported by the National '863' High-Tech Programme of China under grant No. 863-306-ZD01-02- 5 and N
Abstract: Page-based software DSM systems suffer from false sharing caused by their large sharing granularity, and they support only one-dimensional Block or Cyclic-block data distribution schemes. Applications running on them therefore suffer from poor data locality and can exploit parallelism only when using a large number of processors. This paper presents an approach to supporting flexible data distribution (FDD) on software DSM systems. Small blocks of tunable granularity, whose size can be set by the compiler or the programmer, are used to overlap the working data sets distributed among processors. FDD was implemented on a software DSM system called JIAJIA. Experiments show that the proposed flexible data distribution is more effective than the Block and Cyclic-block schemes used by most current DSM systems, and that the performance of the applications used in the experiments improves significantly.
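To make the two baseline schemes named above concrete, here is a minimal sketch of how a one-dimensional index maps to an owning processor under Block and Cyclic-block distribution. It illustrates only those baselines, not the paper's FDD mechanism, and the function names and parameters are assumptions.

```python
def block_owner(index: int, n_elements: int, procs: int) -> int:
    """Block distribution: each processor owns one contiguous chunk."""
    chunk = -(-n_elements // procs)  # ceiling division
    return index // chunk


def cyclic_block_owner(index: int, block_size: int, procs: int) -> int:
    """Cyclic-block distribution: fixed-size blocks dealt out round-robin."""
    return (index // block_size) % procs


# Eight elements over four processors.
print([block_owner(i, 8, 4) for i in range(8)])         # [0, 0, 1, 1, 2, 2, 3, 3]
print([cyclic_block_owner(i, 1, 4) for i in range(8)])  # [0, 1, 2, 3, 0, 1, 2, 3]
```

In a page-based DSM the block size in such schemes is typically tied to the page size, which is the granularity limitation the abstract says FDD relaxes.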
Abstract: The performance gap between software DSM systems and message passing platforms greatly hinders the adoption of software DSM systems, despite considerable effort in this area over the past decade. In this paper, we try to identify where future design efforts should be focused. We first analyze in detail the components of the total system overhead of software DSM systems. Using the state-of-the-art software DSM system JIAJIA, we then measure these components on the Dawning parallel system and draw five important conclusions that differ from some traditional viewpoints. (1) The performance of the JIAJIA software DSM system is acceptable: for four of eight applications, the parallel efficiency achieved by JIAJIA is about 80%, while for two others, 70% efficiency can be obtained. (2) 40.94% of interrupt service time overlaps with waiting time. (3) Encoding and decoding diffs do not cost much time (<1%), so using hardware support to encode/decode diffs and send/receive messages is not worthwhile. (4) Great effort should be put into reducing the data miss penalty and optimizing synchronization operations, which occupy 11.75% and 13.65% of total execution time, respectively. (5) Communication hardware overhead accounts for 66.76% of the whole communication time in the experimental environment, and communication software overhead does not take as much time as expected. Moreover, by studying the effect of CPU speed on system overhead, we find that the common speedup formula for distributed memory systems does not hold for software DSM systems. We therefore design a new speedup formula specific to software DSM systems, and point out that when the CPU speed increases, the speedup can increase too even if the network speed is fixed, which is impossible in message passing systems. Finally, we argue that the JIAJIA system has the desired scalability.