This paper studies the problem of tensor principal component analysis (PCA). Tensor PCA is usually treated as a low-rank matrix completion problem solved by matrix factorization, with the nuclear norm used as a convex approximation of the rank operator under mild conditions. However, most nuclear norm minimization approaches rely on SVD operations. For an m × n matrix, the time complexity of an SVD is O(mn^2), which is prohibitive for large-scale problems. This paper proposes an efficient and scalable algorithm for tensor PCA, called the Linearized Alternating Direction Method with Vectorized technique for Tensor Principal Component Analysis (LADMVTPCA). Unlike traditional matrix factorization methods, LADMVTPCA uses a vectorized technique to formulate the tensor as an outer product of vectors, which greatly improves computational efficiency over matrix factorization. In the experiments, synthetic tensor data of different orders are used to evaluate LADMVTPCA empirically. The results show that LADMVTPCA outperforms the matrix factorization based method.
In practical process industries, a variety of online and offline sensors and measuring instruments are used for process control and monitoring, so measurements from different sources are collected at different sampling rates. To build a complete process monitoring strategy, all of these multi-rate measurements should be considered in data-based modeling and monitoring. In this paper, a novel kernel multi-rate probabilistic principal component analysis (K-MPPCA) model is proposed to extract the nonlinear correlations among measurements with different sampling rates. The model parameters are calibrated using the kernel trick and the expectation-maximization (EM) algorithm. Corresponding fault detection methods based on the nonlinear features are also developed. Finally, a simulated nonlinear case and an actual pre-decarburization unit in an ammonia synthesis process are used to demonstrate the efficiency of the proposed method.
Principal Component Analysis (PCA) is one of the most important feature extraction methods, and Kernel Principal Component Analysis (KPCA) is a nonlinear extension of PCA based on kernel methods. In the real world, an input sample may not be fully assigned to one class and may partially belong to other classes. Based on the theory of fuzzy sets, this paper presents Fuzzy Principal Component Analysis (FPCA) and its nonlinear extension, Kernel-based Fuzzy Principal Component Analysis (KFPCA). The experimental results indicate that the proposed algorithms perform well.
Robust principal component analysis (PCA) is widely used in many applications, such as image processing, data mining and bioinformatics. The existing methods for solving robust PCA are mostly based on nuclear norm minimization. Those methods minimize all the singular values simultaneously, and thus the rank cannot be well approximated in practice. We extend the idea of truncated nuclear norm regularization (TNNR) to robust PCA and consider truncated nuclear norm minimization (TNNM) instead of nuclear norm minimization (NNM). This method minimizes only the smallest N-r singular values to preserve the low-rank components, where N is the number of singular values and r is the matrix rank. Moreover, we propose an effective way to determine r via the shrinkage operator. We then develop an effective iterative algorithm based on the alternating direction method to solve this optimization problem. Experimental results demonstrate the efficiency and accuracy of the TNNM method. Moreover, this method is much more robust in terms of the rank of the reconstructed matrix and the sparsity of the error.
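To make the distinction in the abstract above concrete, the following is a minimal NumPy sketch that contrasts the nuclear norm (sum of all singular values) with the truncated nuclear norm (sum of the smallest N-r singular values). It only evaluates the two quantities; it is not the paper's TNNM solver, and the function names and the test matrix are assumptions introduced here for illustration.

```python
import numpy as np


def nuclear_norm(X):
    """Sum of all singular values of X."""
    return float(np.linalg.svd(X, compute_uv=False).sum())


def truncated_nuclear_norm(X, r):
    """Sum of the smallest N - r singular values, with N = min(X.shape)."""
    s = np.linalg.svd(X, compute_uv=False)  # returned in descending order
    return float(s[r:].sum())


# Hypothetical data: a rank-2 matrix plus small dense noise.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))
X = low_rank + 0.01 * rng.standard_normal((20, 15))

nn = nuclear_norm(X)
tnn = truncated_nuclear_norm(X, r=2)
print(f"nuclear norm: {nn:.3f}, truncated nuclear norm (r=2): {tnn:.3f}")
```

With r equal to the true rank, the truncated norm leaves the two dominant singular values unpenalized, so it is driven almost entirely by the noise, whereas the nuclear norm also penalizes the low-rank signal.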
This paper proposes a design optimization method for the multi-objective orbit design of earth observation satellites, in which orbit performance indices with different units, such as total coverage time, frequency of coverage, average time per coverage and maximum coverage gap, must be optimized simultaneously. An index normalization method is introduced to convert the performance indices into dimensionless variables within the range [0, 1]. On this basis, a design optimization method based on principal component analysis and cluster analysis is proposed, consisting of index normalization, principal component analysis, multiple-level cluster analysis and a weighted evaluation method. The results of orbit optimization for earth observation satellites show that the optimal orbit can be obtained with the proposed method. Principal component analysis reduces the total number of non-independent indices and thereby saves computing time. Similarly, multiple-level cluster analysis with parallel computing can save computing time.
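The first two steps named in the abstract above, index normalization to [0, 1] followed by principal component analysis, can be sketched as follows. This is an illustrative sketch only: the abstract does not give the normalization formula, so min-max scaling is assumed, and the function names and random "orbit index" table are hypothetical.

```python
import numpy as np


def min_max_normalize(X):
    """Scale each column (index) to the dimensionless range [0, 1]."""
    lo = X.min(axis=0)
    hi = X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero for constant columns
    return (X - lo) / span


def pca(X, k):
    """Project centered data onto its first k principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # fraction of variance per component
    return Xc @ Vt[:k].T, explained[:k]


# Hypothetical table: 50 candidate orbits x 4 indices with very different units.
rng = np.random.default_rng(0)
indices = rng.uniform(size=(50, 4)) * np.array([3600.0, 20.0, 300.0, 5000.0])

normalized = min_max_normalize(indices)
scores, explained = pca(normalized, k=2)
print("variance explained by the first two components:", explained)
```

Normalizing first keeps an index measured in large units from dominating the principal components purely because of its scale.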
After many years of research, seismologists in China have proposed about 80 earthquake prediction factors that reflect precursor information of earthquakes. How to concentrate the information carried by these 80 factors, and how to choose the main factors for predicting earthquakes precisely, has become a topic in seismology. The principal component-discrimination model consists of principal component analysis, correlation analysis, a weighting method based on principal factor coefficients, and Mahalanobis distance discrimination analysis. The model combines maximization of the information in the earthquake prediction factors with the weighting method of principal factor coefficients and correlation analysis to choose earthquake prediction variables, and applies Mahalanobis distance discrimination to establish an earthquake prediction discrimination model. The model was applied to earthquake data from Northern China and obtained good prediction results.
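The last stage named in the abstract above, Mahalanobis distance discrimination, assigns a sample to the class whose mean is nearest under the Mahalanobis metric. The sketch below shows that rule in its generic form with a pooled covariance estimate; the pooled covariance, the class labels and the synthetic two-feature data are assumptions made here, not details taken from the paper.

```python
import numpy as np


def fit(groups):
    """Estimate class means and the inverse pooled covariance.

    groups maps a label to an (n_samples, n_features) array.
    """
    means = {label: g.mean(axis=0) for label, g in groups.items()}
    n_total = sum(len(g) for g in groups.values())
    pooled = sum((len(g) - 1) * np.cov(g, rowvar=False) for g in groups.values())
    pooled = pooled / (n_total - len(groups))
    return means, np.linalg.inv(pooled)


def classify(x, means, cov_inv):
    """Return the label with the smallest squared Mahalanobis distance to x."""
    x = np.asarray(x, dtype=float)
    dists = {label: float((x - m) @ cov_inv @ (x - m)) for label, m in means.items()}
    return min(dists, key=dists.get)


# Hypothetical training data: two classes described by two selected variables.
rng = np.random.default_rng(1)
groups = {
    "quake": rng.normal(loc=[3.0, 3.0], scale=0.5, size=(40, 2)),
    "no_quake": rng.normal(loc=[0.0, 0.0], scale=0.5, size=(40, 2)),
}
means, cov_inv = fit(groups)
print(classify([2.9, 3.1], means, cov_inv))
```

In the setting described by the abstract, the input variables would be the ones selected by the principal component and correlation analysis rather than raw factors.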
This study aimed to explore the application of surface-enhanced Raman scattering (SERS) in the rapid diagnosis of gastric cancer. The SERS spectra of 68 serum samples from gastric cancer patients and healthy volunteers were acquired. The characteristic ratio method (CRM) and principal component analysis (PCA) were used to differentiate gastric cancer serum from normal serum. Compared with healthy volunteers, the serum SERS intensity of gastric cancer patients was relatively high at 722 cm^(-1), while it was relatively low at 588, 644, 861, 1008, 1235, 1397, 1445 and 1586 cm^(-1). These results indicated that the relative content of nucleic acids in the serum of gastric cancer patients rises while the relative content of amino acids and carbohydrates decreases. With PCA, the sensitivity and specificity of discriminating gastric cancer were 94.1% and 94.1%, respectively, with an accuracy of 94.1%. Based on the intensity ratios of four characteristic peaks at 722, 861, 1008 and 1397 cm^(-1), CRM gave a diagnostic sensitivity and specificity of 100% and 97.4%, respectively, and an accuracy of 98.5%. Therefore, the three peak intensity ratios I_(722)/I_(861), I_(722)/I_(1008) and I_(722)/I_(1397) can be considered biological fingerprint information for gastric cancer diagnosis and can rapidly and directly reflect the physiological and pathological changes associated with gastric cancer development. This study provides an important basis and standards for the early diagnosis of gastric cancer.
With the rapid growth of the international banking industry, bank failures can lead to severe economic losses and social impacts. Although existing measures to address such failures are well developed, timely prediction can significantly mitigate these effects. This study analyzes key indicators influencing bank failure through data analysis and correlation analysis, then develops a neural network-based risk prediction model to estimate failure probabilities. First, we extracted 64 indicators from the dataset, identified the most relevant indicators using the entropy weight method, and established a bank efficiency evaluation formula to determine the failure threshold. Next, we applied principal component analysis (PCA) to reduce dimensionality and derive a comprehensive scoring formula. Based on these findings, we constructed a machine learning model in MATLAB to predict bank failures. Finally, the model was used to predict the failure probabilities of all banks and to identify 20 representative existing and failed banks. The developed models effectively predict bank failure risks and demonstrate strong applicability across different scenarios.
Recently, tensor robust principal component analysis (TRPCA), which aims to recover the true low-rank tensor from noisy data, has attracted considerable attention. In this paper, we solve the TRPCA problem under the framework of the tensor singular value decomposition (t-SVD). Since convex relaxation approaches have some limitations, we establish a new non-convex TRPCA model by introducing a non-convex tensor rank approximation based on the Laplace function via weighted l_p-norm regularization. An efficient algorithm based on the alternating direction method of multipliers (ADMM) is developed to solve the proposed model. We further prove that the constructed sequence converges to the desired Karush-Kuhn-Tucker point. Experimental results show that the proposed approach outperforms various recent approaches in the literature.
To study the water quality of the Shichuan River basin in Fuping, Shaanxi Province, eight water quality indexes (pH, dissolved oxygen (DO), total dissolved solids (TDS), COD, total hardness, total phosphorus, total nitrogen and Zn) in three monitoring sections of the Fuping section of the Shichuan River were measured and analyzed using the improved Nemerow index method, the comprehensive pollution index method and principal component analysis. The results show that the surface water in the Shichuan River basin is grade III or IV water, that is, slightly or moderately polluted. It is necessary to monitor the water quality after regulation and to clarify the main factors causing the water pollution.
This paper evaluates the coordinated development of the subsystems within China's internet financial ecosystem from 2011 to 2016. Focusing on the main business modes, technological innovation, and the external environment, we select 29 indicators to construct an index system and adopt a coupling coordination degree model for evaluation. Furthermore, we use two weight calculation methods, entropy weighting and principal component analysis, to ensure the robustness of the results. The empirical results show that China's internet financial ecosystem passed through five development stages from 2011 to 2016: moderate disorder, near disorder, weak coordination, intermediate coordination, and good coordination. The choice of weighting method has little effect on the empirical results. These findings suggest that, at the beginning, the coordinated development of China's internet financial ecosystem was hindered by factors including the scarcity of main business modes and deficiencies in technological innovation; then, with the rapid development of China's internet industry, the external environment became another obstacle to coordinated development. Finally, based on the findings, we give some policy recommendations from a global perspective for achieving a sustainable internet financial ecosystem.
Discriminating internal layers by radio echo sounding is important for analyzing the thickness and ice deposits of the Antarctic ice sheet. Synthetic aperture radar (SAR) signal processing has been widely used to improve the signal-to-noise ratio (SNR) and to discriminate internal layers in radio echo sounding data of ice sheets. This method is not efficient when edge detection operators are used to obtain accurate information about the layers, especially the ice-bed interface. This paper presents a new image processing method that combines robust principal component analysis and total variation (RPCA-TV) to discriminate internal layers of ice sheets in radio echo sounding data. The RPCA-based step projects the high-dimensional observations onto a low-dimensional subspace structure to accelerate the TV-based step, which is used to discriminate the internal layers. The efficiency of the presented method has been tested on simulation data and on the dataset of the Institute of Electronics, Chinese Academy of Sciences, collected during CHINARE 28. The results show that the new method is more efficient than the previous method in discriminating internal layers of ice sheets in radio echo sounding data.
We use functional principal component analysis (FPCA) to model and predict weight growth in children. In particular, we examine how the approach can help discern the growth patterns of underweight children relative to their normal counterparts, and whether a commonly used transformation to normality plays any constructive role in a predictive model based on FPCA. Our work supplements the conditional growth charts developed by Wei and He (2006) by constructing a predictive growth model based on a small number of principal component scores computed from an individual's past measurements.
In this paper, sediment from 25 sampling points in the Tonglüshan mining area, Daye City, Hubei Province, China was tested for heavy metal content to explore the pollution characteristics, pollution sources and ecological risks of heavy metals in the sediment. The geo-accumulation index method was used to evaluate the degree of heavy metal pollution in the sediment. The mean sediment quality guideline quotient was used to evaluate the ecological risk level of heavy metals in the sediment. Correlation analysis, cluster analysis and principal component analysis were used for a preliminary analysis of the sources of heavy metals in the sediment. The results indicated extremely heavy metal pollution in the sediment: Cd was extremely polluted, Cu strongly contaminated, Zn, As and Hg moderately contaminated, and Pb, Cr and Ni slightly contaminated. The mean sediment quality guideline quotient also indicated a high ecological risk of heavy metals in the sediment, and 64% of the sample sites had extremely high potential biotoxic effects. In terms of distribution, contamination of the branches was worse than that of the main channel of Daye Dagang, and the deposition of each heavy metal was mainly influenced by the distance from the sample site to the sewage outlet of a tailings pond. The source analysis showed that the heavy metals in the sediment come from pollution discharged by mining and beneficiation companies, tailings ponds, smelting companies and transport vehicles. In the study area, because of heavy metal discharge from these sources, the ecotoxicity of heavy metals in the sediment was extremely high, and Cd was the most toxic pollutant. The research identified the key areas and elements for ecological restoration of the sediment in the Tonglüshan mining area, which can serve as a reference for monitoring and governance of heavy metal pollution in the sediment of polymetallic mining areas.
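For readers unfamiliar with the geo-accumulation index mentioned in the abstract above, the sketch below uses its commonly cited form, Igeo = log2(C / (1.5 B)), where C is the measured concentration and B the geochemical background value. The abstract does not state the exact formula, background values or class labels used in the study, so the factor 1.5, the seven-class numbering and the example concentrations are all assumptions for illustration.

```python
import math


def geoaccumulation_index(concentration, background, k=1.5):
    """Igeo = log2(C / (k * B)); k = 1.5 is the usual background-variation factor."""
    return math.log2(concentration / (k * background))


def igeo_class(igeo):
    """Map Igeo to a class from 0 (unpolluted) to 6 (extremely polluted).

    Boundary handling varies between references; this sketch assigns the
    interval (n - 1, n] to class n and caps the class at 6.
    """
    if igeo <= 0:
        return 0
    return min(6, math.ceil(igeo))


# Hypothetical values in mg/kg: measured concentration vs. background.
igeo = geoaccumulation_index(concentration=3.0, background=1.0)
print(igeo, igeo_class(igeo))
```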
This paper presents a study of the biotic and abiotic conditions of the São Giácomo sanitary landfill, located near the city of Caxias do Sul, Brazil, through statistical analysis of fourteen physicochemical data sets for the leachate produced at the landfill site over a period of many years. Different chemometric methods are used in the statistical analysis. For example, the correlations between the variables related to degraded organic matter and biological activity are determined by means of multivariate methods. The results highlight that BOD, COD, VTS, FTS and TS give information on the anaerobic degradation of the organic matter contained in the cells, and suggest that the greater the contribution of the variables with positive weights in PC1, the greater the level of organic matter degradation. The variables TN, ammoniacal nitrogen (Amon Nit.) and alkalinity are related to biological activity and determine the potency of the variables in relation to time. The greater the contribution of the variables related to organic degradation, the greater the values in PC2 and the lesser the potency of these variables, whose influence is greater in the second stage of anaerobic degradation. The variables in PC2 are important indicators of contamination of water bodies by the leachate.
Joint analysis of multiple phenotypes can give a better interpretation of complex diseases and increase the statistical power to detect significant single nucleotide polymorphisms (SNPs) compared with traditional single-phenotype analysis in genome-wide association analysis. Principal component analysis (PCA), as a popular dimension reduction method, has been broadly used in the analysis of multiple phenotypes. Since PCA transforms the original phenotypes into principal components (PCs), it is natural to think that information across phenotypes can be combined by analyzing these PCs. Existing PCA-based methods fall into two categories: selecting one particular PC manually, or combining information from all PCs. In this paper, we propose an adaptive principal component test (APCT), which selects and combines the PCs adaptively using the Cauchy combination method. Our proposed method can be seen as a generalization of traditional PCA-based methods, since it contains two existing methods as special cases. Extensive simulations show that our method is robust and powerful in various situations. The real data analysis of stock mice data also demonstrates that the proposed APCT can identify significant SNPs that are missed by traditional methods.
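The Cauchy combination method referred to in the abstract above merges several p-values into one through the statistic T = sum_i w_i tan((0.5 - p_i) pi), whose combined p-value is 0.5 - arctan(T) / pi when the weights sum to one. The sketch below implements only this generic combination rule, not the adaptive PC selection of APCT; the function name, equal default weights and example p-values are assumptions.

```python
import math


def cauchy_combination(pvalues, weights=None):
    """Combine p-values with the Cauchy combination rule.

    weights must be non-negative and sum to 1; equal weights are used by default.
    """
    n = len(pvalues)
    if weights is None:
        weights = [1.0 / n] * n
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvalues))
    return 0.5 - math.atan(t) / math.pi


# Hypothetical per-PC p-values for one SNP.
combined = cauchy_combination([0.001, 0.20, 0.65, 0.90])
print(f"combined p-value: {combined:.4g}")
```

A single very small p-value dominates the statistic because the tangent transform has heavy tails, which is why the rule stays sensitive when only one PC carries signal.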
The restaurant industry is a traditional part of the tertiary sector in China. Since May 1 of this year, China's restaurant industry has implemented the policy of "replacing the business tax with value-added tax" and switched to paying VAT. Based on an understanding of the tax theory behind this policy and of tax compliance in China, this paper analyzes the policy's possible impact on the restaurant industry. The paper also statistically analyzes survey data from 100 samples on awareness of the policy, using principal component analysis to analyze and evaluate the factors affecting restaurant owners' awareness of the "replace the business tax with value-added tax" policy. After multiple comparisons of the sample data, the paper summarizes and analyzes countermeasures for improving the policy's implementation in the restaurant industry.
Increasing contamination of water resources worldwide and in our country, declining water quality over time, and the failure to meet water resource utilization objectives have increased the importance of water management. Monitoring of water resources and evaluation of the monitoring results guide studies aimed at controlling the factors that pollute water resources and reduce water quality. Nilüfer Creek is very important to the city of Bursa both as a source of drinking and potable water and as a discharge area for wastewater. In this study, analysis results for the period 2002-2010, from samples taken at 15 points by the General Directorate of Bursa Water and Sewerage Administration (BUWSA), were evaluated in relation to the water quality of Nilüfer Creek. Non-parametric methods were used to evaluate the water quality data because the data were not normally distributed. Principal Component Analysis was applied to identify the parameters that best represent water quality. According to the results, the 9 most representative of the 19 water quality parameters were BOD5, COD, TSS, T.Fe, Zn, conductivity, NO2-N, Ni and NO3-N, which load on the first two components.
Funding: Supported by the Zhejiang Provincial Natural Science Foundation of China (LY19F030003), the Key Research and Development Project of Zhejiang Province (2021C04030), the National Natural Science Foundation of China (62003306), and the Educational Commission Research Program of Zhejiang Province (Y202044842).
Funding: The Doctoral Program of Higher Education of China (No. 20120032110034).
Funding: Funded by the 973 Program of the Ministry of National Defense of China (Grant No. 613237).
Funding: This work was supported by the Natural Science Foundation of Guangdong Province, China (2018A0303131000), the project of the Academician Workstation of Guangdong Province, China (2014B090905001), the Fundamental Research Funds for the Central Universities, China (21617406), and the key project of the Scientific and Technological Projects of Guangzhou, China (201604040007, 201604020168).
文摘This study aimed to explore the application of surface-enhanced Raman scattering(SERS)in the rapid diagnosis of gastric cancer.The SERS spectra of 68 serum samples from gastric cancer patients and healthy volunteers were acquired.The characteristic ratio method(CRM)and principal component analysis(PCA)were used to differentiate gastric cancer serum from normal serum.Compared with healthy volunteers,the serum SERS intensity of gastric cancer patients was relatively high at 722 cm^(-1),while it was relatively low at 588,644,861,1008,1235,1397,1445 and 1586 cm^(-1).These results indicated that the relative content of nucleic acids in the serum of gastric cancer patients rises while the relative content of amino acids and carbohydrates decreases.In PCA,the sensitivity and specificity of discriminating gastric cancer were 94.1%and 94.1%,respectively,with the accuracy of 94.1%.Based on the intensity ratios of four characteristic peaks at 722,861,1008 and 1397 cm^(-1),CRM presented the diagnostic sensitivity and specificity of 100%and 97.4%,respectively,and the accuracy of 98.5%.Therefore,the three peak intensity ratios of I_(722)/I_(861),I_(722)/I_(1008)and I_(722)/I_(1397)can be considered as biologicalfingerprint information for gastric cancer diagnosis and can rapidly and directly reflect the physiological and pathological changes associated with gastric cancer development.This study provides an important basis and standards for the early diagnosis of gastric cancer.
Abstract: With the rapid growth of the international banking industry, bank failures can lead to severe economic losses and social impacts. Although existing measures to address such failures are well-developed, timely prediction can significantly mitigate these effects. This study analyzes key indicators influencing bank failure through data analysis and correlation analysis, then develops a neural network-based risk prediction model to estimate failure probabilities. First, we extracted 64 indicators from the dataset, identified the most relevant indicators using the entropy weight method, and established a bank efficiency evaluation formula to determine the failure threshold. Next, we applied principal component analysis (PCA) to reduce dimensionality and derive a comprehensive scoring formula. Based on these findings, we constructed a machine learning model in MATLAB to predict bank failures. Finally, the model was used to predict the failure probabilities of all banks and identify 20 representative existing and failed banks. The developed models effectively predict bank failure risks and demonstrate strong applicability across different scenarios.
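The entropy weight step mentioned in this abstract can be sketched as follows. This is the textbook form of the method, assuming non-negative indicator values that have already been scaled; the function name `entropy_weights` and the toy table are hypothetical, and the study's own preprocessing (carried out in MATLAB) is not described in the abstract.

```python
import math


def entropy_weights(rows):
    """Entropy-weight method: rows is a list of samples, each a list of
    non-negative indicator values. Returns one weight per indicator."""
    n = len(rows)
    if n < 2:
        raise ValueError("need at least two samples")
    m = len(rows[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in rows]
        total = sum(column)
        if total <= 0:
            raise ValueError("each indicator needs a positive column sum")
        entropy = 0.0
        for x in column:
            p = x / total            # share of sample i in indicator j
            if p > 0:                # convention: 0 * log(0) = 0
                entropy -= p * math.log(p)
        entropy /= math.log(n)       # normalise entropy to [0, 1]
        # Clamp tiny negative values caused by floating-point rounding.
        divergences.append(max(0.0, 1.0 - entropy))
    d_sum = sum(divergences)
    if d_sum <= 0:
        raise ValueError("all indicators are constant; weights undefined")
    return [d / d_sum for d in divergences]


# Toy table: the first indicator is constant, so it carries no information.
print(entropy_weights([[1, 1], [1, 2], [1, 3]]))
```

An indicator that varies more across samples has lower entropy and therefore receives a larger weight, which is why the constant first column in the toy table ends up with a weight of essentially zero.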
Acknowledgment: The authors thank the editor and the anonymous referees for their constructive comments and suggestions, which greatly improved the paper.
Abstract: Recently, tensor robust principal component analysis (TRPCA), which aims to recover the true low-rank tensor from noisy data, has attracted considerable attention. In this paper, we solve the TRPCA problem under the framework of the tensor singular value decomposition (t-SVD). Since convex relaxation approaches have some limitations, we establish a new non-convex TRPCA model by introducing a non-convex tensor rank approximation based on the Laplace function via weighted l_(p)-norm regularization. An efficient algorithm based on the alternating direction method of multipliers (ADMM) is developed to solve the proposed model. We further prove that the constructed sequence converges to the desired Karush-Kuhn-Tucker point. Experimental results show that the proposed approach outperforms various recent approaches in the literature.
Funding: Supported by the National Natural Science Foundation of China (41901012); Project of Shaanxi Provincial Education Department (21JP040); Talent Fund Project of Weinan Normal University (2021RC04); National Innovation and Entrepreneurship Training Program for College Students (22XK019).
Abstract: To study the water quality of the Shichuan River basin in Fuping, Shaanxi Province, eight water quality indexes (pH, dissolved oxygen (DO), total dissolved solids (TDS), COD, total hardness, total phosphorus, total nitrogen and Zn) were measured at three monitoring sections of the Fuping section of the Shichuan River and analyzed using the improved Nemerow index method, the comprehensive pollution index method and principal component analysis. The results show that the surface water in the Shichuan River basin is Grade III or IV water, that is, the water is slightly to moderately polluted. It is necessary to monitor the water quality after regulation and to clarify the main factors causing the water pollution.
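As an illustration of the index calculation, here is a minimal sketch of the classic Nemerow pollution index, which combines the maximum and the mean of the single-factor ratios C_i/S_i. It is not the improved variant used in the study, whose details the abstract does not give. The function name, the concentrations and the standard limits below are hypothetical, and indexes such as pH and DO need their own single-factor formulas rather than a plain ratio.

```python
import math


def nemerow_index(concentrations, standards):
    """Classic Nemerow index: sqrt((P_max^2 + P_avg^2) / 2),
    where P_i = measured concentration / standard limit."""
    ratios = [concentrations[name] / standards[name] for name in concentrations]
    p_max = max(ratios)
    p_avg = sum(ratios) / len(ratios)
    return math.sqrt((p_max ** 2 + p_avg ** 2) / 2)


# Hypothetical values in mg/L; not measurements or limits from the study.
measured = {"COD": 10.0, "TP": 0.2, "TN": 1.5}
limits = {"COD": 20.0, "TP": 0.2, "TN": 1.0}
print(round(nemerow_index(measured, limits), 4))
```

Because the maximum ratio enters the formula directly, a single badly exceeded parameter raises the index even when the average ratio is low.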
Funding: Supported by the National Natural Science Foundation of China (71631005, 71871062); the Humanities and Social Science Foundation of the Ministry of Education of China (16YJA630078).
Abstract: This paper attempts to evaluate the coordinated development state of the subsystems within the internet financial ecosystem in China from 2011 to 2016. Focusing on the main business modes, technological innovation, and the external environment, we select 29 indicators to construct an index system and adopt a coupling coordination degree model for evaluation. Furthermore, we use two weight calculation methods, entropy weight and principal component analysis, to ensure the robustness of the results. The empirical results show that China's internet financial ecosystem experienced five development stages from 2011 to 2016: moderate disorder, near disorder, weak coordination, intermediate coordination, and good coordination. Different methods of obtaining weights have little effect on the empirical results. These findings suggest that, at the beginning, the coordinated development of China's internet financial ecosystem was hindered by factors including the scarcity of main business modes and deficiencies in technological innovation; then, with the rapid development of China's internet industry, the external environment became another drawback in coordinated development. Finally, based on the findings, we give some policy recommendations from a global perspective to achieve a sustainable internet financial ecosystem.
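The coupling coordination degree model named in this abstract is commonly written as C = n(U_1...U_n)^(1/n) / (U_1 + ... + U_n), T = sum(a_i U_i) and D = sqrt(C T). The sketch below assumes that common form with subsystem scores in [0, 1]; the abstract does not state the exact variant or subsystem weights used, and the function name and scores are hypothetical.

```python
import math


def coupling_coordination(scores, weights=None):
    """Return (C, T, D) for subsystem scores in [0, 1].
    C: coupling degree, T: composite development index,
    D: coupling coordination degree."""
    n = len(scores)
    if n < 2 or sum(scores) <= 0 or any(s < 0 for s in scores):
        raise ValueError("need >= 2 non-negative scores with a positive sum")
    if weights is None:
        weights = [1.0 / n] * n      # assumption: equal subsystem weights
    coupling = n * math.prod(scores) ** (1.0 / n) / sum(scores)
    composite = sum(w * s for w, s in zip(weights, scores))
    return coupling, composite, math.sqrt(coupling * composite)


# Hypothetical scores for business modes, technological innovation, environment.
c, t, d = coupling_coordination([0.6, 0.4, 0.5])
print(f"C={c:.3f} T={t:.3f} D={d:.3f}")
```

Equal subsystem scores give C = 1, so D then depends only on the overall development level; a subsystem that lags far behind the others pulls C, and therefore D, down.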
Funding: Supported by the National Hi-Tech Research and Development Program of China ("863" Project) (Grant No. 2011AA040202); the National Natural Science Foundation of China (Grant No. 40976114).
Abstract: Discriminating internal layers by radio echo sounding is important in analyzing the thickness and ice deposits of the Antarctic ice sheet. The synthetic aperture radar (SAR) signal processing method has been widely used to improve the signal-to-noise ratio (SNR) and to discriminate internal layers in radio echo sounding data of ice sheets. This method is not efficient when edge detection operators are used to obtain accurate information about the layers, especially the ice-bed interface. This paper presents a new image processing method based on a combined robust principal component analysis-total variation (RPCA-TV) approach for discriminating internal layers of ice sheets in radio echo sounding data. The RPCA-based method is adopted to project the high-dimensional observations onto a low-dimensional subspace structure, which accelerates the TV-based method used to discriminate the internal layers. The efficiency of the presented method has been tested on simulation data and on the dataset of the Institute of Electronics, Chinese Academy of Sciences, collected during CHINARE 28. The results show that the new method is more efficient than the previous method in discriminating internal layers of ice sheets in radio echo sounding data.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 10828102); a Changjiang Visiting Professorship; the Training Fund of Northeast Normal University's Scientific Innovation Project (Grant No. NENU-STC07002); the National Institutes of Health Grant of USA (Grant No. R01GM080503-01A1).
Abstract: We use functional principal component analysis (FPCA) to model and predict weight growth in children. In particular, we examine how the approach can help discern the growth patterns of underweight children relative to their normal counterparts, and whether a commonly used transformation to normality plays any constructive role in a predictive model based on FPCA. Our work supplements the conditional growth charts developed by Wei and He (2006) by constructing a predictive growth model based on a small number of principal component scores from an individual's past.
Funding: Jointly supported by the Gansu Provincial Natural Resources Science and Technology Project of the Key Laboratory of Strategic Mineral Resources of the Upper Yellow River, Ministry of Natural Resources (YSJD2022-16); the survey project initiated by the China Geological Survey (DD20211347).
Abstract: In this paper, overlying deposits at 25 sampling points in the Tonglüshan mining area, Daye City, Hubei Province, China were tested for heavy metal content to explore the pollution characteristics, pollution sources and ecological risks of heavy metals in the sediments. The geo-accumulation index method was used to evaluate the degree of heavy metal pollution in the sediment. The mean sediment quality guideline quotient was used to evaluate the ecological risk level of heavy metals in the sediment. Correlation analysis, clustering analysis, and principal component analysis were used for a preliminary analysis of the sources of heavy metals in the sediment. The results indicate extremely heavy metal pollution in the sediment: Cd was extremely polluted, Cu strongly contaminated, Zn, As, and Hg moderately contaminated, and Pb, Cr, and Ni slightly contaminated. The mean sediment quality guideline quotient also indicates a high ecological risk of heavy metals in the sediment, and 64% of the sample sites had extremely high hidden biotoxic effects. In terms of distribution, the contamination of the branches was worse than that of the main channel of Daye Dagang, and the deposition of each heavy metal was mainly influenced by the distance from the sample site to the sewage outlet of a tailings pond. The source analysis showed that the heavy metals in the sediment come from pollution discharged by mining and beneficiation companies, tailings ponds, smelting companies, and transport vehicles. In the study area, owing to heavy metal discharge from these sources, the ecotoxicity of heavy metals in the sediment was extremely high, and Cd was the most toxic pollutant. The research identified the key areas and elements for ecological restoration in the sediment of the Tonglüshan mining area, which can serve as a reference for the monitoring and governance of heavy metal pollution in the sediment of polymetallic mining areas.
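The geo-accumulation index used above is conventionally defined as I_geo = log2(C_n / (1.5 B_n)), where C_n is the measured concentration and B_n the geochemical background value, and is graded on Müller's seven classes (0 to 6). A minimal sketch follows; the function names and the concentrations are hypothetical, and the background values adopted by the study are not given in the abstract.

```python
import math


def geo_accumulation_index(concentration, background):
    """I_geo = log2(C_n / (1.5 * B_n)); the factor 1.5 allows for
    natural variation of the background value."""
    return math.log2(concentration / (1.5 * background))


def igeo_class(igeo):
    """Mueller class: 0 for I_geo <= 0, then one class per unit, capped at 6."""
    if igeo <= 0:
        return 0
    return min(6, math.ceil(igeo))


# Hypothetical Cd values in mg/kg; not data from the study.
i = geo_accumulation_index(concentration=3.0, background=1.0)
print(i, igeo_class(i))
```

Class 0 corresponds to unpolluted sediment and class 6 (I_geo above 5) to extremely polluted sediment.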
Abstract: This paper presents a study of the biotic/abiotic conditions of the São Giácomo sanitary landfill, located near the city of Caxias do Sul, Brazil, through statistical analysis of fourteen physicochemical data sets for the leachate produced at the landfill site over a period of many years. Different chemometric methods are used in the statistical analysis. For example, the correlations between the variables related to the degraded organic matter and biological activity are determined by means of multivariate methods. The results highlight that BOD, COD, VTS, FTS and TS give information on the anaerobic degradation of the organic matter contained in the cells, and suggest that the greater the contribution of the variables with positive weights in PC1, the greater the level of organic matter degradation. The variables TN, Amon Nit. and alkalinity are related to the biological activity and determine the potency of the variables in relation to time. The greater the contribution of the variables related to organic degradation, the greater the values in PC2 and the lower the potency of these variables, whose influence is greater in the second stage of anaerobic degradation. The PC2 variables are important indicators of the contamination of water bodies by the leachate.
Funding: Supported by the Key Program of Joint Funds of the National Natural Science Foundation of China (Grant No. U19B2040); Fundamental Research Funds for Central Universities and the University of Chinese Academy of Sciences (Grant No. Y95401TXX2); Beijing Natural Science Foundation (Grant No. Z190004).
Abstract: Joint analysis of multiple phenotypes can give a better interpretation of complex diseases and increase the statistical power to detect more significant single nucleotide polymorphisms (SNPs) compared with traditional single-phenotype analysis in genome-wide association analysis. Principal component analysis (PCA), a popular dimension reduction method, has been broadly used in the analysis of multiple phenotypes. Since PCA transforms the original phenotypes into principal components (PCs), it is natural to think that by analyzing these PCs we can combine information across phenotypes. Existing PCA-based methods can be divided into two categories: either selecting one particular PC manually or combining information from all PCs. In this paper, we propose an adaptive principal component test (APCT), which selects and combines the PCs adaptively by using the Cauchy combination method. Our proposed method can be seen as a generalization of traditional PCA-based methods, since it contains two existing methods as special cases. Extensive simulations show that our method is robust and powerful in various situations. A real data analysis of stock mice data also demonstrates that the proposed APCT can identify significant SNPs that are missed by traditional methods.
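The Cauchy combination step on which APCT builds can be sketched as follows: each p-value is mapped to a standard Cauchy variate, the variates are averaged with weights, and the average is mapped back to a p-value. This sketch shows only that generic combination rule under the assumption of normalised non-negative weights; how APCT selects PCs and chooses weights is not described in the abstract, and the function name and inputs are hypothetical.

```python
import math


def cauchy_combine(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule:
    T = sum_i w_i * tan((0.5 - p_i) * pi),  p = 0.5 - atan(T) / pi."""
    k = len(p_values)
    if k == 0 or any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p-values must be non-empty and strictly in (0, 1)")
    if weights is None:
        weights = [1.0] * k          # assumption: equal weights
    total = sum(weights)
    stat = sum(w / total * math.tan((0.5 - p) * math.pi)
               for w, p in zip(weights, p_values))
    return 0.5 - math.atan(stat) / math.pi


# Hypothetical per-PC p-values for one SNP.
print(cauchy_combine([0.01, 0.5, 0.8]))
```

A single very small p-value dominates the statistic because the tangent grows rapidly near zero, which is what lets the combined test stay sensitive when only one component carries signal.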
Abstract: The restaurant industry is a traditional part of the tertiary sector in China. Since May 1 of this year, China's restaurant industry has implemented the policy of "replacing the business tax with value-added tax" and switched to paying VAT. Based on the tax theory behind this policy and on tax compliance in China, this paper analyzes the possible impact of the reform on the restaurant industry. In addition, it statistically analyzes survey data from 100 samples on awareness of the reform, using principal component analysis to analyze and evaluate the factors affecting restaurant owners' awareness of the tax policy. After multiple comparisons on the sample data, the paper summarizes and analyzes countermeasures for improving the effect of promoting the "replacing the business tax with VAT" reform in the restaurant industry.
Abstract: The increasing contamination of water resources worldwide and in our country, the decline in water quality over time, and the failure to meet the objectives for the use of water resources have increased the importance of water management. Monitoring water resources and evaluating the monitoring results guide studies aimed at controlling the factors that pollute water resources and reduce water quality. Nilüfer Creek is very important to the city of Bursa both as a source of drinking and potable water and as a discharge area for wastewater. In this study, analysis results for the period 2002-2010, taken at 15 points by the General Directorate of Bursa Water and Sewerage Administration (BUWSA), were evaluated with respect to the water quality of Nilüfer Creek. Non-parametric methods were used to evaluate the water quality data because the data were not normally distributed. Principal Component Analysis was applied to identify the parameters that best represent water quality. According to the results of the analysis, the 9 most representative of the 19 water quality parameters were BOD5, COD, TSS, T.Fe, Zn, conductivity, NO2-N, Ni and NO3-N, which belong to the first two components.