Abstract: Subsampling plays a crucial role in enhancing the efficiency of Markov chain Monte Carlo (MCMC) algorithms. This paper presents a subsampling-based MCMC algorithm aimed at addressing the computational complexity challenges of traditional MCMC methods on large-scale datasets. The proposed approach significantly reduces computational costs by approximating the full-data likelihood function using only a subset of the full data in each iteration. The subsampling process is guided by fidelity to the full data, which is measured by the energy distance. The resulting algorithm, termed energy distance-based subsampling MCMC (EDSS-MCMC), offers a flexible approach while maintaining the simplicity of the standard MCMC algorithm. Additionally, we provide an analysis of the invariant distribution generated by the EDSS-MCMC algorithm and quantify the total variation norm between this distribution and the target distribution. Numerical experiments demonstrate the outstanding performance of the proposed algorithm on large-scale datasets. Compared with the standard MCMC algorithm and other subsampling MCMC algorithms, the EDSS-MCMC algorithm exhibits advantages in terms of accuracy and computational speed. Therefore, the proposed algorithm holds practical significance in tasks involving large-scale dataset analysis and machine learning.
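The energy distance mentioned in this abstract can be illustrated with a small sketch. This is not the EDSS-MCMC algorithm: the abstract does not say how the distance guides the subsampling, so the selection rule below (draw a few candidate subsamples and keep the closest one) is a hypothetical stand-in, and the data set and function names are assumptions made for illustration.

```python
import numpy as np

def mean_pairwise_distance(a, b):
    """Mean Euclidean distance over all pairs (one row from a, one row from b)."""
    diff = a[:, None, :] - b[None, :, :]
    return float(np.mean(np.linalg.norm(diff, axis=2)))

def energy_distance(a, b):
    """Sample energy distance: 2 E|A-B| - E|A-A'| - E|B-B'| (V-statistic form)."""
    return (2.0 * mean_pairwise_distance(a, b)
            - mean_pairwise_distance(a, a)
            - mean_pairwise_distance(b, b))

rng = np.random.default_rng(0)
full = rng.normal(size=(1000, 2))  # hypothetical full data set

# Hypothetical selection rule (an assumption, not the paper's procedure):
# draw ten candidate subsamples and keep the one closest to the full data.
candidates = [full[rng.choice(len(full), size=100, replace=False)]
              for _ in range(10)]
scores = [energy_distance(full, c) for c in candidates]
best = candidates[int(np.argmin(scores))]

# A shifted copy of the data is far from the original in energy distance.
shifted = full + 3.0
```

A subsample with a small energy distance to the full data has a similar empirical distribution, which is the sense of "fidelity" the abstract refers to.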
Abstract: A microwave photonic subsampling digital receiver (MPSDR) is proposed and experimentally demonstrated for target detection with a sampling rate of 10 MSa/s. Stepped and pseudo-random frequency-hopping signals with frequencies across the K band are both used for target detection and can be captured by the MPSDR. The range profiles of the targets are then derived using a compressed sensing algorithm, and precise target position estimation is achieved by changing the measurement position of the antenna pair. The results demonstrate that the estimation accuracy remains comparable even when the pseudo-random frequency-hopping signal utilizes only 12.5% of the frequency points required by the stepped frequency-hopping signal. This highlights the efficiency and potential of the proposed MPSDR in processing complex signals while maintaining high accuracy.
Abstract: The IPCC has drawn attention to an apparent leveling-off of globally averaged temperatures over the past 15 years or so. Measuring the duration of the hiatus has implications for determining whether the underlying trend has changed, and for evaluating climate models. Here, I propose a method for estimating the duration of the hiatus that is robust to unknown forms of heteroskedasticity and autocorrelation (HAC) in the temperature series and to cherry-picking of endpoints. For the specific case of global average temperatures I also add the requirement of spatial consistency between hemispheres. The method makes use of the Vogelsang-Franses (2005) HAC-robust trend variance estimator, which is valid as long as the underlying series is trend stationary, which is the case for the data used herein. Application of the method shows that there is now a trendless interval of 19 years' duration at the end of the HadCRUT4 surface temperature series, and of 16-26 years in the lower troposphere. Use of a simple AR1 trend model suggests a shorter hiatus of 14-20 years but is likely unreliable.
Abstract: This paper presents a new class of test procedures for the two-sample location problem based on subsample quantiles. The class includes the Mann-Whitney test as a special case. The asymptotic normality of the proposed class of tests is established. The asymptotic relative performance of the proposed class of tests with respect to the optimal member of Xie and Priebe (2000) is studied in terms of Pitman efficiency for various underlying distributions.
Abstract: When the observed price process is the true underlying price process plus microstructure noise, it is known that realized volatility (RV) estimates will be overwhelmed by the noise when the sampling frequency approaches infinity. Therefore, it may be optimal to sample less frequently, and averaging the less frequently sampled subsamples can improve estimation of quadratic variation. In this paper, we extend this idea to forecasting daily realized volatility. While subsample averaging has been proposed and used in estimating RV, this paper is the first to use subsample averaging for forecasting RV. The subsample averaging method we examine incorporates the high-frequency data at different levels of systematic sampling. It first pools the high-frequency data into several subsamples, then generates forecasts from each subsample, and then combines these forecasts. We find that in daily S&P 500 return realized volatility forecasts, subsample averaging generates better forecasts than those using only one subsample.
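The estimation idea that this abstract builds on (sample less frequently, then average over the offset subsamples) can be sketched as follows. The sketch covers only the estimation step, not the forecasting procedure of the paper. The simulated price path, the noise level, and the choice of 300 subsamples are assumptions made for illustration.

```python
import numpy as np

def realized_variance(prices):
    """Sum of squared log returns along one price path."""
    returns = np.diff(np.log(prices))
    return float(np.sum(returns ** 2))

def subsample_averaged_rv(prices, k):
    """Average the RV of k offset grids, each keeping every k-th price."""
    prices = np.asarray(prices, dtype=float)
    rvs = [realized_variance(prices[offset::k]) for offset in range(k)]
    return float(np.mean(rvs))

# Hypothetical one-day path: 23,400 one-second observations, a true daily
# variance of 1e-4, and additive microstructure noise on the log price.
rng = np.random.default_rng(0)
n = 23_400
true_variance = 1e-4
true_log_price = np.cumsum(rng.normal(0.0, np.sqrt(true_variance / n), n))
noisy_log_price = true_log_price + rng.normal(0.0, 0.0005, n)
prices = 100.0 * np.exp(noisy_log_price)

rv_all = realized_variance(prices)            # every tick: dominated by noise
rv_avg = subsample_averaged_rv(prices, 300)   # 300 five-minute grids, averaged
```

In this simulation the all-tick estimate is dominated by the noise term, while the averaged estimate stays much closer to the true daily variance.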
Funding: Supported by the US National Science Foundation under Grant No. 1812013.
Abstract: Markov chain Monte Carlo (MCMC) requires evaluating the full-data likelihood at different parameter values iteratively and is often computationally infeasible for large data sets. This paper proposes to approximate the log-likelihood with subsamples taken according to nonuniform subsampling probabilities, and derives the most likely optimal (MLO) subsampling probabilities for better approximation. Compared with the existing subsampled MCMC algorithm with equal subsampling probabilities, the MLO subsampled MCMC has higher estimation efficiency at the same subsampling ratio. The authors also derive a formula, using the asymptotic distribution of the subsampled log-likelihood, to determine the required subsample size in each MCMC iteration for a given level of precision. This formula is used to develop an adaptive version of the MLO subsampled MCMC algorithm. Numerical experiments demonstrate that the proposed method outperforms the uniform subsampled MCMC.
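A minimal sketch of approximating a full-data log-likelihood from a nonuniform subsample follows. It uses the standard inverse-probability-weighted estimator for sampling with replacement. The subsampling probabilities here are a hypothetical choice made for illustration; they are not the MLO probabilities derived in the paper, and the Gaussian model and all names are assumptions.

```python
import numpy as np

def loglik_terms(theta, x):
    """Per-observation log-likelihood of N(theta, 1)."""
    return -0.5 * np.log(2.0 * np.pi) - 0.5 * (x - theta) ** 2

def subsampled_loglik(theta, x, probs, r, rng):
    """Estimate the full-data log-likelihood from r weighted draws.

    Indices are drawn with replacement according to probs, and each
    drawn term is divided by its probability, so the estimate is
    unbiased for the full sum.
    """
    idx = rng.choice(len(x), size=r, replace=True, p=probs)
    return float(np.mean(loglik_terms(theta, x[idx]) / probs[idx]))

rng = np.random.default_rng(1)
x = rng.normal(2.0, 1.0, 100_000)  # hypothetical data set

# Hypothetical nonuniform probabilities (NOT the paper's MLO choice):
# proportional to the size of each term at a pilot parameter value.
pilot = float(np.median(x))
weights = np.abs(loglik_terms(pilot, x))
probs = weights / weights.sum()
uniform_probs = np.full(len(x), 1.0 / len(x))

theta = 2.1
full = float(np.sum(loglik_terms(theta, x)))
approx_nonuniform = subsampled_loglik(theta, x, probs, 5_000, rng)
approx_uniform = subsampled_loglik(theta, x, uniform_probs, 5_000, rng)
```

Both estimates use 5% of the data. Probabilities that track the magnitude of each term reduce the variance of the weighted average, which is the motivation for nonuniform subsampling.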
Funding: The National Natural Science Foundation of China (No. 79970041).
Abstract: This paper investigates methods of value-at-risk (VaR) estimation using extreme value theory (EVT). It compares two estimation methods, the 'two-step subsample bootstrap' based on moment estimation and maximum likelihood estimation (MLE), in terms of their theoretical bases and computation procedures. The estimation results are then analyzed together with those of the normal method and the empirical method. An empirical study of foreign exchange data shows that the EVT methods perform well in estimating VaR under extreme conditions and that the 'two-step subsample bootstrap' method is preferable to MLE.
Funding: Supported by the Pioneer Hundred Talents Program, Chinese Academy of Sciences.
Abstract: CMOS analog and mixed-signal (AMS) phase-locked loops (PLLs) are widely used in a variety of systems-on-chip (SoCs) as clock generators or frequency synthesizers. This paper presents an overview of the AMS-PLL, including: 1) a brief introduction to the basics of the charge-pump based PLL (CPPLL), which is the most widely used AMS-PLL architecture due to its simplicity and robustness; 2) a summary of the design issues of the basic CPPLL architecture; 3) a systematic introduction to techniques for enhancing the performance of the CPPLL; 4) a brief overview of ultra-low-jitter AMS-PLL architectures that can achieve lower jitter (<100 fs) with lower power consumption than the CPPLL, including the injection-locked PLL (ILPLL), subsampling PLL (SSPLL), and sampling PLL (SPLL); 5) a discussion of the considerations in AMS-PLL architecture selection, which could help designers meet their performance requirements.
Funding: Financially supported by the National Natural Science Foundation of China (No. 52004029) and the Fundamental Research Funds for the Central Universities, China (No. FRF-TT-20-06).
Abstract: The transformation and upgrade of the iron and steel sector toward intelligent production places higher demands on the accuracy of the relevant models. Conventional prediction models for the mechanical properties of hot-rolled strip have struggled to meet the needs of the field. Insufficient data and difficult parameter adjustment limit deep learning models based on multi-layer networks in practical applications; in addition, the limited set of discrete process parameters used cannot effectively describe the actual strip processing. To solve these problems, this research proposes a new sampling approach for the input data of hot-rolled strip mechanical property models, based on the multi-grained cascade forest (gcForest) framework. Given the complex process flow of hot-rolled strip production and the high sensitivity of product quality to the process path and parameters, a three-dimensional continuous time-series process data sampling method based on time, temperature, and deformation was designed. The basic information of the strip steel (chemical composition and typical process parameters) is fused with the local process information collected by multi-grained scanning, so that the input to the next stage has both local and global features. Furthermore, a subsampling scheme with a variable window was designed within the multi-grained scanning structure, so that input data of different dimensions yield output features of the same dimension after passing through the structure, allowing the cascade forest to be trained normally. Finally, actual production data for three steel grades were used for experimental evaluation. The results show that the gcForest-based mechanical property prediction model outperforms competing models in terms of comprehensive performance, ease of parameter adjustment, and ability to sustain high prediction accuracy with fewer samples.
Funding: Supported by the National Natural Science Foundation of China (Nos. 31400469, 41571495, 31770460), the National Science and Technology Basic Research Program (No. 2015FY110400-4), the China Three Gorges Corporation Research Project (No. JGJ/0272015), the Key Program of the Chinese Academy of Sciences (Comprehensive Assessment Technology of River Ecology and Environment for the Water Source Region of "South-to-North Water Diversion Central Route"), and the Program for Biodiversity Protection (No. 2017HB2096001006).
Abstract: As a less time-consuming procedure, subsampling has been widely used in biological monitoring and assessment programs. It is clear that subsample counts affect the value of traditional biodiversity indices, but their effect on taxonomic distinctness (TD) indices is less well studied. Here, we examined the responses of traditional indices (species richness, Shannon-Wiener diversity) and TD indices (average taxonomic distinctness, Δ+, and variation in taxonomic distinctness, Λ+) to subsample counts using a random subsampling procedure from 50 to 400 individuals, based on macroinvertebrate datasets from three different river systems in China. At the regional scale, taxa richness increased asymptotically with fixed-count size; ≥250–300 individuals were needed to express 95% of the information in the raw data. In contrast, TD indices were less sensitive to the subsampling procedure. At the local scale, TD indices were more stable and had less deviation than species richness and the Shannon-Wiener index, even at low subsample counts, with ≥100 individuals needed to estimate 95% of the information of the actual Δ+ and Λ+ in the three river basins. We also found that abundance had some effect on diversity indices during the subsampling procedure, with the subsample counts required for species richness and TD indices varying by region. Therefore, we suggest that TD indices are suitable for biodiversity assessment and environmental monitoring. Pilot analyses are nevertheless necessary when determining the appropriate subsample counts for bioassessment in a new region or habitat type.
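The random fixed-count subsampling procedure described in this abstract can be sketched as follows for the two traditional indices. The taxon counts are invented for illustration and are not the study's data. The TD indices (Δ+ and Λ+) are omitted because they require a taxonomic hierarchy that the abstract does not give.

```python
import numpy as np

def subsample_counts(counts, n, rng):
    """Draw n individuals without replacement from a taxon-count vector."""
    counts = np.asarray(counts)
    individuals = np.repeat(np.arange(len(counts)), counts)
    picked = rng.choice(individuals, size=n, replace=False)
    return np.bincount(picked, minlength=len(counts))

def richness(counts):
    """Number of taxa with at least one individual."""
    return int(np.count_nonzero(counts))

def shannon(counts):
    """Shannon-Wiener index H' = -sum(p * ln p) over the taxa present."""
    counts = np.asarray(counts, dtype=float)
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log(p)))

# Hypothetical sample: 24 taxa, 1081 individuals, a few dominant taxa.
community = np.array([400, 250, 150, 90, 60, 40, 25, 15, 10, 8, 6, 5,
                      4, 3, 3, 2, 2, 2, 1, 1, 1, 1, 1, 1])

rng = np.random.default_rng(0)
mean_richness = {}
mean_shannon = {}
for n in range(50, 401, 50):            # fixed counts of 50, 100, ..., 400
    draws = [subsample_counts(community, n, rng) for _ in range(200)]
    mean_richness[n] = float(np.mean([richness(d) for d in draws]))
    mean_shannon[n] = float(np.mean([shannon(d) for d in draws]))
```

Comparing `mean_richness[n]` with `richness(community)` across the fixed counts gives the kind of curve used to judge how many individuals are needed to recover a target share of the full-sample value.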
Funding: Financially supported by the Fundamental Research Funds for the Central Universities (No. 201822011), the National Natural Science Foundation of China (No. 41674118), and the National Science and Technology Major Project (No. 2016ZX05027002).
Abstract: Conventional full-waveform inversion is computationally intensive because it considers all shots in each iteration. To tackle this, we establish the number of shots needed and propose multiscale inversion in the frequency domain using a number of shots that increases with frequency. When using low-frequency data, the method considers only a small number of shots and raw data. More shots are used with increasing frequency. The random-in-group subsampling method is used to rotate the shots between iterations and avoid the loss of shot information. By reducing the number of shots in the inversion, we decrease the computational cost. There is no crosstalk between shots, no noise addition, and no observational limits. Numerical modeling suggests that the proposed method reduces the computing time, is more robust to noise, and produces better velocity models when using data with noise.
Abstract: We propose a subsampling method for robust estimation of regression models which is built on classical methods such as the least squares method. It makes use of the non-robust nature of the underlying classical method to find a good sample from regression data contaminated with outliers, and then applies the classical method to the good sample to produce robust estimates of the regression model parameters. The subsampling method is a computational method rooted in the bootstrap methodology which trades analytical treatment for intensive computation; it finds the good sample through repeated fitting of the regression model to many random subsamples of the contaminated data instead of through an analytical treatment of the outliers. The subsampling method can be applied to all regression models for which non-robust classical methods are available. In the present paper, we focus on the basic formulation and robustness property of the subsampling method that are valid for all regression models. We also discuss variations of the method and apply it to three examples involving three different regression models.
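The idea of finding a good sample through repeated least squares fits to random subsamples can be sketched as follows. This is a simplified variant written for illustration and is not the authors' formulation: the rule used here (keep the union of the best-fitting subsamples, then refit) and every parameter value are assumptions.

```python
import numpy as np

def ols(x, y):
    """Least squares fit of y = b0 + b1 * x; returns [b0, b1]."""
    design = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def mse(beta, x, y):
    """Mean squared residual of the fitted line on (x, y)."""
    return float(np.mean((y - (beta[0] + beta[1] * x)) ** 2))

def subsampling_fit(x, y, size, n_draws, n_keep, rng):
    """Fit OLS to many random subsamples and refit on the best-fitting ones.

    Because OLS is not robust, a subsample containing an outlier fits
    badly, so the subsamples with the smallest error are likely clean.
    """
    scored = []
    for _ in range(n_draws):
        idx = rng.choice(len(x), size=size, replace=False)
        scored.append((mse(ols(x[idx], y[idx]), x[idx], y[idx]), idx))
    scored.sort(key=lambda pair: pair[0])
    good = np.unique(np.concatenate([idx for _, idx in scored[:n_keep]]))
    return ols(x[good], y[good]), good

# Hypothetical contaminated data: true line y = 1 + 2x, with the ten
# largest-x points shifted down by 40 to act as outliers.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, 100)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, 100)
outliers = np.argsort(x)[-10:]
y[outliers] -= 40.0

full_beta = ols(x, y)   # pulled away from the true line by the outliers
robust_beta, good = subsampling_fit(x, y, size=10, n_draws=500,
                                    n_keep=5, rng=rng)
```

The final estimate is an ordinary least squares fit, so no robust loss function is needed. The computational cost lies in the repeated subsample fits.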
Funding: Supported by the National Natural Science Foundation of China (61672482, 11626253) and the One Hundred Talent Project of the Chinese Academy of Sciences.
Abstract: Color transfer between images makes effective use of image statistics. We present a novel approach to local color transfer between images based on simple statistics and locally linear embedding. A sketching interface is proposed for quickly and easily specifying the color correspondences between the target and source images. The user can specify the correspondences of local regions using scribbles, which transfers the target color to the source image more accurately while smoothly preserving the boundaries, and yields more natural output. Our algorithm is not restricted to one-to-one image color transfer and can use more than one target image to transfer color to different regions of the source image. Moreover, our algorithm does not require the source and target images to share the same color style or image size. We propose subsampling to reduce the computational load. Compared with other approaches, our algorithm is much better at color blending in the input data. Our approach preserves the other color details in the source image. Various experimental results show that our approach captures the correspondences of local color regions in the source and target images, expresses the intention of the user, and generates more realistic and natural visual results.
Funding: This project was supported by the National Natural Science Foundation of China (60272099).
Abstract: A new, faster block-matching algorithm (BMA) that uses both search-candidate and pixel subsampling is proposed. First, the pixel-subsampling approach used in adjustable partial distortion search (APDS) is adjusted to visit about half of all search candidates by subsampling them along a spiral-scanning path with one skip. The two candidates with the minimal and second-minimal block distortion measures are selected. Then a fine-tuning step is taken around them to find the best one. Some analyses are given to justify the approach. Experimental results show that, compared with APDS, the proposed algorithm increases block-matching speed by about 30% while keeping its MSE performance very close to that of APDS. It also performs much better than many other BMAs such as TSS, NTSS, UCDBS and NPDS.
Abstract: In an imbalanced dataset, at least one class is typically far outnumbered by the others. A machine learning algorithm (classifier) trained with an imbalanced dataset predicts the majority (frequently occurring) class more often than the minority (rarely occurring) classes. Training with an imbalanced dataset poses challenges for classifiers; however, applying suitable techniques for reducing class imbalance can enhance classifiers' performance. In this study, we consider an imbalanced dataset from an educational context. Initially, we examine the shortcomings of classification with an imbalanced dataset. Then, we apply data-level algorithms for class balancing and compare the performance of classifiers. The performance of the classifiers is measured using the information in their confusion matrices, such as accuracy, precision, recall, and F measure. The results show that classification with an imbalanced dataset may produce high accuracy but low precision and recall for the minority class. The analysis confirms that undersampling and oversampling are both effective for balancing datasets, but the latter dominates.
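The point that high accuracy can coexist with low minority-class precision and recall can be made concrete with a small worked example. The confusion-matrix counts below are hypothetical and chosen only for illustration; they are not results from the study, and the helper name `binary_metrics` is likewise an assumption.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall and F1 for the positive (minority) class."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts: 20 minority cases among 1000 samples.
accuracy, precision, recall, f1 = binary_metrics(tp=5, fn=15, fp=5, tn=975)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

With these counts the classifier is right on 980 of 1000 samples, yet it finds only 5 of the 20 minority cases, which is why accuracy alone is a poor summary for imbalanced data.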
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 12401324, 12131001, 12371259 and 12371260) and the National Key Research and Development Program of China (Grant No. 2020YFA0714102).
Abstract: Subsampling plays a crucial role in enhancing the efficiency of Markov chain Monte Carlo (MCMC) algorithms. This paper presents a subsampling-based MCMC algorithm aimed at addressing the computational complexity challenges of traditional MCMC methods on large-scale datasets. The proposed approach significantly reduces computational costs by approximating the full data likelihood function using only a subset of the full data in each iteration. The subsampling process is guided by the fidelity to the full data, which is measured by the energy distance. The resulting algorithm, termed the energy distance-based subsampling MCMC (EDSS-MCMC), offers a flexible approach while maintaining the simplicity of the standard MCMC algorithm. Additionally, we provide an analysis of the invariant distribution generated by the EDSS-MCMC algorithm and quantify the total variation norm between this distribution and the target distribution. Numerical experiments demonstrate the outstanding performance of the proposed algorithm on large-scale datasets. Compared with the standard MCMC algorithm and other subsampling MCMC algorithms, the EDSS-MCMC algorithm exhibits advantages in terms of accuracy and computational speed. Therefore, the proposed algorithm holds practical significance in tasks involving large-scale dataset analysis and machine learning.
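As a rough illustration of the fidelity measure mentioned above, the sketch below computes the sample energy distance, 2E|X-Y| - E|X-X'| - E|Y-Y'|, between candidate subsamples and the full data and keeps the closest candidate. It is not the EDSS-MCMC algorithm: the function name `energy_distance`, the way candidates are drawn, and the synthetic data are assumptions made for this example only.

```python
import numpy as np

def energy_distance(a, b):
    """Sample (V-statistic) energy distance between two point sets."""
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    b = np.asarray(b, dtype=float).reshape(len(b), -1)

    def mean_dist(u, v):
        # Mean Euclidean distance over all pairs (one point from u, one from v).
        return np.linalg.norm(u[:, None, :] - v[None, :, :], axis=2).mean()

    return 2.0 * mean_dist(a, b) - mean_dist(a, a) - mean_dist(b, b)

rng = np.random.default_rng(0)
full = rng.normal(size=(500, 2))  # stand-in for the full dataset (hypothetical)

# Draw a few random candidate subsamples and score each against the full data.
candidates = [rng.choice(len(full), size=50, replace=False) for _ in range(20)]
scores = [energy_distance(full[idx], full) for idx in candidates]
best = candidates[int(np.argmin(scores))]

print("smallest energy distance among candidates:", min(scores))
print("distance of a shifted copy of the data:", energy_distance(full + 3.0, full))
```

A subsample that resembles the full data gives a value near zero, while a sample from a visibly different distribution (here, the shifted copy) gives a much larger one.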
Funding: Supported by the Innovation Capacity Building Plan (Science and Technology Facilities) of Jiangsu (No. BM2022017) and the Fundamental Research Funds for the Central Universities (No. NI023003).
Abstract: A microwave photonic subsampling digital receiver (MPSDR) is proposed and experimentally demonstrated for target detection with a sampling rate of 10 MSa/s. Stepped and pseudo-random frequency-hopping signals with frequencies across the K band are both used for target detection and can be captured by the MPSDR. The range profiles of the targets are then derived using a compressed sensing algorithm, and precise target position estimation is achieved by changing the measurement position of the antenna pair. The results demonstrate that the estimation accuracy remains comparable even when the pseudo-random frequency-hopping signal utilizes only 12.5% of the frequency points required by the stepped frequency-hopping signal. This highlights the efficiency and potential of the proposed MPSDR in processing complex signals while maintaining high accuracy.