Population growth and expanding urbanization have caused persistent shortages and contamination of groundwater resources in Mali, Africa. Because increasing groundwater salinity makes it harder for residents to obtain drinking water, it is necessary to clarify the causes and controlling factors of groundwater mineralization in the Gao region, northern Mali. Based on the hydrochemical composition of groundwater from 24 boreholes, Piper and Schoeller diagrams, principal component analysis (PCA), and hierarchical cluster analysis (HCA) were used to carry out a multivariate statistical analysis of the major ions. The results show that the groundwater samples are weakly alkaline on average, with pH values ranging from 5.83 to 8.40 and a mean of 7.50. The average electrical conductivity (EC) is 354.4 μS/cm, with extremes of 124.0 and 1247 μS/cm. The water is generally mineralized and presents nine water types. Three principal components explain 84.42% of the total variance of the 13 parameters: F1 (58.85%), F2 (16.88%), and F3 (8.69%). In addition, multivariate statistical analysis confirmed the genetic relationship among the aquifers and identified three main clusters: one related to groundwater mineralization (F1), one related to oxidation-reduction processes and iron enrichment (F2), and one reflecting groundwater pollution by nitrate and magnesium (F3). Agriculture, weathering, and the dissolution of geological materials were found to promote groundwater mineralization. Groundwater in the Gao region is becoming less and less potable because of increasing salinity.
Data-driven process monitoring is an effective approach to assuring the safe operation of modern manufacturing and energy systems, such as the thermal power plants studied in this work. Industrial processes are inherently dynamic and need to be monitored using dynamic algorithms. Mainstream dynamic algorithms rely on concatenating the current measurement with past data. This work proposes a new, alternative dynamic process monitoring algorithm based on dot product feature analysis (DPFA). DPFA computes the dot product of consecutive samples, thus naturally capturing the process dynamics through temporal correlation. At the same time, the online computational complexity of DPFA is lower than that of not only existing dynamic algorithms but also classical static algorithms (e.g., principal component analysis and slow feature analysis). The detectability of the new algorithm is analyzed for three types of faults typically seen in process systems: sensor bias, process fault, and gain change fault. Through experiments with a numerical example and real data from a thermal power plant, the DPFA algorithm is shown to outperform state-of-the-art methods in terms of both monitoring performance (fault detection rate and false alarm rate) and computational complexity.
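The core idea described in the abstract above, taking the dot product of consecutive samples, can be sketched as follows. This is a minimal illustration and not the authors' algorithm: the standardization step, the empirical-quantile control limits, and the function names `dot_product_features`, `fit_limits`, and `detect` are all assumptions made for this example.

```python
import numpy as np

def dot_product_features(X):
    """Return d[t] = x[t] . x[t-1] for t = 1..n-1 (rows of X are samples)."""
    X = np.asarray(X, dtype=float)
    # Row-wise dot product between each sample and its predecessor.
    return np.einsum("ij,ij->i", X[1:], X[:-1])

def fit_limits(X_train, alpha=0.01):
    """Estimate scaling and two-sided limits from normal operating data.

    Assumption: limits are empirical quantiles of the training feature,
    chosen here only for simplicity.
    """
    X_train = np.asarray(X_train, dtype=float)
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    d = dot_product_features((X_train - mean) / std)
    lo, hi = np.quantile(d, [alpha / 2, 1 - alpha / 2])
    return mean, std, lo, hi

def detect(X_new, mean, std, lo, hi):
    """Flag samples whose dot product feature leaves the [lo, hi] band."""
    d = dot_product_features((np.asarray(X_new, dtype=float) - mean) / std)
    return (d < lo) | (d > hi)
```

Each online update needs a single dot product of length equal to the number of variables, which is consistent with the low online cost the abstract emphasizes; how the paper actually sets its thresholds is not stated in the abstract.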
Dongguan (东莞) City, located in the Pearl River Delta, South China, is known for its rapid industrialization over the past 30 years. A total of 90 topsoil samples were collected from agricultural fields in the city, including vegetable and orchard soils, and eight heavy metals (As, Cu, Cd, Cr, Hg, Ni, Pb, and Zn) together with pH and organic matter were analyzed to evaluate the influence of anthropogenic activities on the environmental quality of agricultural soils and to identify the spatial distribution and possible sources of trace elements. Hg, Pb, and Cd have accumulated markedly in comparison with the soil background contents for Guangdong (广东) Province. Pollution is more serious in the western plain and the central region, where industries and rivers are densely distributed. Multivariate and geostatistical methods were applied to differentiate the influences of natural processes and human activities on heavy metal pollution in topsoils of the study area. The results of cluster analysis (CA) and factor analysis (FA) show that Ni, Cr, Cu, Zn, and As are grouped in factor F1, Pb in F2, and Cd and Hg in F3. The spatial pattern of the three factors is well demonstrated by geostatistical analysis. The first factor can be considered a natural source controlled by parent rocks. The second factor can be referred to as "industrial and traffic pollution sources". The third factor is mainly controlled by long-term anthropogenic activities, as a consequence of agricultural activities, fossil fuel consumption, and atmospheric deposition.
This study investigates the hydrochemical characteristics of a shallow aquifer in a semi-arid region of northwest Algeria in order to understand the major factors governing groundwater quality. The study area suffers from recurring droughts, over-exploitation of groundwater resources, and degradation of groundwater quality. The approach combines traditional hydrochemical analysis, multivariate statistical techniques such as principal component analysis (PCA), and ratios of major ions, based on data from 33 groundwater samples collected in February 2014. Results show that groundwater in the study area is highly mineralized and generally has a high chloride (Cl⁻) concentration. The dominant water types are Na-Cl (27%), Mg-HCO₃ (24%), and Mg-Cl (24%). According to the PCA, salinization is the main process controlling hydrochemical variability. The PCA also reveals the impact of anthropogenic factors, especially agricultural activities, on groundwater quality. It highlights two types of recharge: superficial recharge from effective rainfall and excess irrigation water, distinguished by the presence of nitrate, and lateral recharge or vertical leakage from carbonate formations, marked by the omnipresence of HCO₃⁻. Additionally, three categories of samples were identified: (1) samples characterized by good water quality and receiving notable recharge from carbonate formations; (2) samples impacted by natural salinization; and (3) samples contaminated by anthropogenic activities. The major natural processes influencing water chemistry are the weathering of carbonate and silicate rocks, the dissolution of evaporites such as halite, evaporation, and cation exchange. The results can provide a basis for local decision makers to ensure the sustainable management of groundwater and the safety of drinking water.
Geochemical surveys are essential for understanding the spatial distribution of ore-forming elements. However, these surveys often involve compositional data (weight concentrations), which do not meet the requirements of standard statistical methods because of the closure effect. In this study, we applied an integrated approach combining compositional data, multifractal, and multivariate statistical analyses to identify the nonlinear complexity of the spatial distributions of elemental concentrations in the Er'renshan ore field. First, the raw concentrations were transformed into log-ratios following the principles of compositional data theory to alleviate the impact of the closure effect. Multifractal analysis was then conducted to characterise the nonlinear complexity of the concentration distributions. Furthermore, principal component analysis (PCA) and factor analysis (FA) were applied to identify spurious correlations and the potential factors controlling the distribution patterns. The results demonstrate that: a) the raw data are biased, while the log-ratio data are unbiased and more reliable; b) the spatial distributions of elemental concentrations exhibit nonlinear complexity; and c) the elemental distribution in the study area is largely controlled by structural factors.
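The log-ratio step mentioned in the abstract above can be illustrated with the centred log-ratio (clr) transform, one common choice in compositional data analysis. The abstract does not say which log-ratio transform the authors used, so the choice of clr and the function name `clr` are assumptions made for this sketch.

```python
import numpy as np

def clr(X):
    """Centred log-ratio transform of compositions stored as rows of X.

    Each part is divided by the geometric mean of its row before taking
    logs, which is equivalent to subtracting the row mean of log(X).
    """
    X = np.asarray(X, dtype=float)
    if np.any(X <= 0):
        # Zeros must be replaced or imputed beforehand; not handled here.
        raise ValueError("clr requires strictly positive parts")
    log_x = np.log(X)
    return log_x - log_x.mean(axis=1, keepdims=True)
```

Because only ratios between parts enter the result, rescaling a sample (for example from weight fractions to percent) leaves its clr coordinates unchanged, which is why such transforms reduce the spurious correlations induced by closure.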
The Kumaun Himalaya is well known as a geologically and tectonically complex region that amplifies mass wasting processes, particularly landslides. This study investigates the interplay between landslide distribution and the lithotectonic regime of Darma Valley, Kumaun Himalaya. A landslide inventory comprising 295 landslides was prepared, and several morphotectonic proxies, namely valley floor width to height ratio (Vf), stream length gradient index (SL), and hypsometric integral (HI), were used to infer the tectonic regime. Morphometric analysis of 59 fourth-order sub-basins, covering basic, linear, areal, and relief aspects, was carried out to estimate erosion potential in the study area. The results show that 46.77% of the landslides lie in very high, 20.32% in high, 21.29% in medium, and 11.61% in low erosion potential zones. To determine the key parameters controlling erosion potential, two multivariate statistical methods, Principal Component Analysis (PCA) and Agglomerative Hierarchical Clustering (AHC), were used. PCA reveals that the Higher Himalayan Zone (HHZ) has the highest erosion potential because of elongated sub-basins characterized by steep slopes and high relief. The clusters created through AHC exhibit positive PCA values, indicating a robust correlation between PCA and AHC. Furthermore, the landslide density map shows two major landslide hotspots. One lies in the vicinity of the highly active Munsiyari Thrust (MT), while the other is in the Pandukeshwar formation within the MT's hanging wall, characterized by a high exhumation rate. High SL and low Vf values along these hotspots further corroborate that landslide occurrence in the study area is influenced by tectonic activity. By identifying erosion-prone areas and elucidating the implications of tectonic activity for landslide distribution, this study can help policymakers and government agencies develop strategies for hazard assessment and effective landslide risk mitigation.
Water quality is a pressing issue affecting the sustainable development of lakes. To elucidate the spatial and temporal characteristics of water quality in Bosten Lake, China, this study constructed a comprehensive water quality index (CWQI) based on key water quality indicators, using water quality data collected from 17 sampling sites from 2011 to 2019. Key water quality indicators were determined using factor analysis, and the spatial and temporal characteristics of the key indicators and the CWQI were examined using multivariate statistical analysis. The key water quality indicators included pH, chemical oxygen demand (COD), water transparency (SD), NO₃⁻, total dissolved solids (TDS), Cl⁻, SO₄²⁻, and electrical conductivity (EC). Furthermore, the contribution of each water quality indicator to water quality was quantified using SHapley Additive exPlanations (SHAP) values, thereby validating the factor analysis outcomes. Among the eight key indicators, COD had the most significant influence on the water quality of Bosten Lake. The water quality of Bosten Lake remained at Class Ⅲ from 2011 to 2019 (CWQI ranging from 3.19 to 3.90). It was characterized by distinct regional differences that arose from hydrodynamic processes within the lake and upstream water quality. The southwestern region exhibited the best water quality (mean CWQI of 3.47), whereas the northwestern region exhibited the worst (mean CWQI of 3.58). Alongside increased monitoring of industrial and agricultural effluent discharge, a series of ecological restoration projects for the lake basin have been initiated. Over time, the water quality of Bosten Lake showed gradual improvement (CWQI improvement rate of 0.05/a). Through a comprehensive analysis of spatial and temporal evolution and driving mechanisms, this study provides a scientific basis for understanding and effectively managing water quality in the Bosten Lake Basin.
Multivariate statistical techniques (cluster analysis, non-parametric tests, and factor analysis) were applied to a water quality dataset comprising 13 parameters at 37 sites in the Three Gorges area, China, from 2003 to 2008 to investigate spatio-temporal variations and identify potential pollution sources. Using cluster analysis, the twelve months of the year were classified into three periods, low-flow (LF), normal-flow (NF), and high-flow (HF), and the 37 monitoring sites were divided into low pollution (LP), moderate pollution (MP), and high pollution (HP) groups. Dissolved oxygen (DO), potassium permanganate index (COD_Mn), and ammonia-nitrogen (NH₄⁺-N) were identified by non-parametric tests as significant variables affecting temporal and spatial variations. Factor analysis identified the major pollutants in the HP region as organic matter and nutrients during NF, heavy metals during LF, and petroleum during HF. In the MP region, the identified pollutants were primarily organic matter and heavy metals year-round, while in the LP region, organic pollution was significant during both NF and HF, and nutrient and heavy metal levels were high during both LF and HF. The main sources of pollution were domestic wastewater and agricultural activities and runoff; however, their contributions to pollution levels differed by region. For the HP region, inputs from wastewater treatment plants were significant, whereas for the MP and LP regions, water pollution more likely resulted from the combined effects of agriculture, domestic wastewater, and the chemical industry. These results provide fundamental information for developing better water pollution control strategies for the Three Gorges area.
Multivariate statistical process monitoring and control (MSPM&C) methods for chemical process monitoring with statistical projection techniques such as principal component analysis (PCA) and partial least squares (PLS) are surveyed in this paper. The four-step procedure of MSPM&C for chemical processes is analyzed: modeling the process, detecting abnormal events or faults, identifying the variable(s) responsible for the faults, and diagnosing the root cause of the abnormal behavior. Several main research directions of MSPM&C reported in the literature are discussed, such as multi-way principal component analysis (MPCA) for batch processes, statistical monitoring and control of nonlinear processes, dynamic PCA and dynamic PLS, and on-line quality control by inferential models. Industrial applications of MSPM&C to several typical chemical processes, such as chemical reactors, distillation columns, polymerization processes, and petroleum refinery units, are summarized. Finally, some concluding remarks and future considerations are given.
Data-driven tools such as principal component analysis (PCA) and independent component analysis (ICA) have been applied to different benchmarks as process monitoring methods. The difference between the two methods is that the components of PCA may still be statistically dependent, while ICA has no orthogonality constraint and its latent variables are independent. Process monitoring with PCA often assumes that the process data or principal components follow a Gaussian distribution. However, this assumption is not satisfied by many practical processes. To extend the use of PCA, a nonparametric method is added to overcome this difficulty, and kernel density estimation (KDE) is a good choice. Although ICA is based on non-Gaussian distribution information, KDE can also help it monitor the data closely. PCA, ICA, PCA with KDE (KPCA), and ICA with KDE (KICA) are demonstrated and compared by applying them to a practical industrial Spheripol process polypropylene catalyzer reactor instead of a laboratory emulator.
Understanding the controlling factors of groundwater quality can help promote the sustainable development of groundwater resources. To this end, multivariate statistical analysis (MA) and hydrochemical analysis were applied in this work. A canonical discriminant function with 7 parameters was established using discriminant analysis (DA), which affords 100% correct assignation to the 3 clusters (good water (GW), poor water (PW), and very poor water (VPW)) obtained from cluster analysis (CA). According to factor analysis (FA), 8 factors were extracted from 25 hydrochemical elements, accounting for 80.897% of the total data variance. The results suggest that groundwater with higher concentrations of sodium, calcium, magnesium, chloride, and sulfate in the southeastern study area is mainly affected by natural processes; the higher levels of arsenic and chromium in groundwater from the northwestern part of the study area derive from industrial activities; and domestic and agricultural sewage contribute substantially to copper, iron, iodine, and phosphate in the northern study area. This work can therefore help identify the main controlling factors of groundwater quality in the North China Plain and support better-informed decisions on the sustainable development of groundwater resources.
Meretricis concha is a marine traditional Chinese medicine (TCM) commonly used for the treatment of asthma and scald burns. To investigate the relationship between the inorganic elemental fingerprint and the geographical origin of Meretricis concha, the elemental contents of M. concha from five sampling points in Rushan Bay were determined by inductively coupled plasma optical emission spectrometry (ICP-OES). Based on the contents of 14 inorganic elements (Al, As, Cd, Co, Cr, Cu, Fe, Hg, Mn, Mo, Ni, Pb, Se, and Zn), an inorganic elemental fingerprint that reflects the elemental characteristics well was constructed. All the data from the five sampling points were discriminated accurately through hierarchical cluster analysis (HCA) and principal component analysis (PCA). A four-factor model explaining approximately 80% of the detection data was established, and the elements Al, As, Cd, Cu, Ni, and Pb could be viewed as the characteristic elements. This investigation suggests that the inorganic elemental fingerprint combined with multivariate statistical analysis is a promising method for verifying the geographical origin of M. concha, and that this strategy should be valuable for authenticating other marine TCMs.
Multivariate statistical techniques, such as cluster analysis (CA), discriminant analysis (DA), principal component analysis (PCA), and factor analysis (FA), were applied to evaluate and interpret the surface water quality data sets of the Second Songhua River (SSHR) basin in China, obtained during two years (2012-2013) of monitoring of 10 physicochemical parameters at 15 sites. The results showed that most physicochemical parameters varied significantly among the sampling sites. Three significant groups of sampling sites, highly polluted (HP), moderately polluted (MP), and less polluted (LP), were obtained through hierarchical agglomerative CA on the basis of similarity of water quality characteristics. DA identified pH, F, DO, NH3-N, COD, and VPhs as the most important parameters contributing to spatial variations of surface water quality. However, DA did not give a considerable data reduction (40% reduction). PCA/FA resulted in three, three, and four latent factors explaining 70%, 62%, and 71% of the total variance in the water quality data sets of the HP, MP, and LP regions, respectively. FA revealed that the SSHR water chemistry was strongly affected by anthropogenic activities (point sources: industrial effluents and wastewater treatment plants; non-point sources: domestic sewage, livestock operations, and agricultural activities) and natural processes (seasonal effects and natural inputs). PCA/FA for the whole basin gave the best data reduction because it used only two parameters (about 80% reduction) as the most important parameters to explain 72% of the data variation. Thus, this work illustrates the utility of multivariate statistical techniques for analyzing and interpreting datasets and, in water quality assessment, for identifying pollution sources/factors and understanding spatial variations in water quality for effective stream water quality management.
Chemical process variables are always driven by random noise and disturbances. Closed-loop control yields process measurements that are auto- and cross-correlated. The influence of auto- and cross-correlations on statistical process control (SPC) is investigated in detail by Monte Carlo experiments. It is revealed that, in the sense of average performance, the false alarm rates (FAR) of principal component analysis (PCA) and dynamic PCA are not affected by the time-series structure of the process variables. Nevertheless, data that are not independent and identically distributed cause the actual FAR to deviate markedly from its theoretical value and result in unexpected consecutive false alarms for a normally operating process. Dynamic PCA and ARMA-PCA are shown to be inefficient at removing the influences of auto- and cross-correlations. Subspace identification-based PCA (SI-PCA) is proposed to improve the monitoring of dynamic processes. Through state space modeling, SI-PCA can remove the auto- and cross-correlations efficiently and avoid consecutive false alarms. Synthetic Monte Carlo experiments and an application to the Tennessee Eastman challenge process illustrate the advantages of the proposed approach.
A technique for estimating tropical cyclone (TC) intensity over the Western North Pacific using FY-3 Microwave Imager (MWRI) data is developed. As a first step, we investigated the relationship between the FY-3 MWRI brightness temperature (TB) parameters, computed in concentric circles or annuli of different radii at different MWRI frequencies, and the TC maximum wind speed (Vmax) from the TC best track data. We found that, for the lower frequency channels, the minimum TB, the mean TB, and the ratio of pixels over the threshold TB within a radius of 1.0 or 1.5 degrees from the center give higher correlations. Then, by applying principal component analysis (PCA) and multiple regression, we established an estimation model and evaluated it using independent verification data, obtaining an RMSE of 13 kt. The estimated Vmax is consistently too strong in the early stages of development but slightly too weak toward the mature stage, with the bias changing sign at around 70 kt. TCs with larger errors often have less organized and asymmetric cloud patterns, so classifying TC cloud patterns should help improve the accuracy of the estimated TC intensity, as should a larger statistical sample.
Groundwater is one of the most important sources of water supply in Iran. The Fasa Plain in Fars Province, southern Iran, is a major wheat production area that uses groundwater for irrigation. A large population also uses local groundwater for drinking. Therefore, this plain was selected to assess the spatial variability of groundwater quality and to identify the main parameters affecting water quality using multivariate statistical techniques such as Cluster Analysis (CA), Discriminant Analysis (DA), and Principal Component Analysis (PCA). Water quality was monitored at 22 wells for five years (2009-2014) with 10 water quality parameters. Using cluster analysis, the sampling wells were grouped into two clusters with distinct water qualities at different locations. The Lasso Discriminant Analysis (LDA) technique was used to assess the spatial variability of water quality. Based on the results, all of the variables except the sodium adsorption ratio (SAR) are effective in the LDA model, and using all of the primary 10 variables affords 92.80% correct assignation between the clusters. Principal component (PC) analysis and factor analysis reduced the complex data matrix to two main components, accounting for more than 95.93% of the total variance. The first PC contained TH, Ca²⁺, and Mg²⁺; therefore, the first dominant factor was hardness. In the second PC, Cl⁻, SAR, and Na⁺ were the dominant parameters, which may indicate salinity. The extracted factors reflect natural (geological formations) and anthropogenic (improper disposal of domestic and agricultural wastes) influences on groundwater quality.
In the aircraft manufacturing industry, analyzing and predicting part machining error during the machining process is very important for controlling and improving part machining quality. To control machining error effectively, a method integrating multivariate statistical process control (MSPC) and stream of variations (SoV) is proposed. First, machining error is modeled across multiple operations of the part machining process. SoV is adopted to establish a mathematical model of the relationship between the errors of upstream operations and the errors of downstream operations. Here, the error sources include not only the influence of upstream operations but also many other error sources. A standard SoV model and a predictive SoV model are built, depending on whether or not the operation has been carried out, to satisfy different requirements during part machining. Second, the one-step-ahead forecast error (OSFE) method is used to eliminate autocorrelation in the sample data from the SoV model, and a T² control chart from MSPC is built according to the data characteristics of the error model to detect machining error and judge whether an operation is out of control. If it is, feedback is sent to the operations. The error model is modified by adjusting the out-of-control operation and continues to be used to monitor operations. Finally, a machining instance containing two operations demonstrates the effectiveness of the proposed machining error control method.
A new method using discriminant analysis and control charts is proposed for monitoring multivariate process operations more reliably. Fisher discriminant analysis (FDA) is used to derive a feature discriminant direction (FDD) between normal operation and each fault operation, and each FDD thus obtained constructs the feature space of the corresponding fault operation. Individuals control charts (XmR charts) are used to monitor multivariate processes using the process data projected onto the feature spaces. The upper control limit (UCL) and lower control limit (LCL) on each feature space are calculated from normal process operation for the XmR charts and are used to distinguish faults from normal operation. A variation trend on an XmR chart reveals the type of the relevant fault operation. Applications to Tennessee Eastman simulation processes show that the proposed method can give better monitoring performance than principal component analysis (PCA)-based methods and can better identify step-type faults on XmR charts.
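The control-limit step in the abstract above can be sketched with the standard individuals (X) chart formulas applied to scores projected onto one direction. This is a hedged illustration only: the direction `w` is treated as given (the paper derives it with FDA, which is not reproduced here), and the function names `project`, `xmr_limits`, and `out_of_control` are assumptions made for this example.

```python
import numpy as np

def project(X, w):
    """Project samples (rows of X) onto the unit vector along w."""
    w = np.asarray(w, dtype=float)
    return np.asarray(X, dtype=float) @ (w / np.linalg.norm(w))

def xmr_limits(scores):
    """Return (LCL, center line, UCL) of an individuals chart.

    Uses the conventional constant 2.66 (3 / d2 with d2 = 1.128)
    times the average moving range of consecutive observations.
    """
    scores = np.asarray(scores, dtype=float)
    center = scores.mean()
    mr_bar = np.abs(np.diff(scores)).mean()
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar

def out_of_control(scores, lcl, ucl):
    """Flag scores that fall outside the control limits."""
    scores = np.asarray(scores, dtype=float)
    return (scores < lcl) | (scores > ucl)
```

In this sketch the limits would be estimated once from normal operating data and then applied to new projected samples; a sustained shift of the scores beyond one limit is the kind of pattern a step-type fault would produce.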
A new multivariate statistical strategy for analyzing the large datasets produced by imaging mass spectrometry (IMS) techniques is reported. The strategy divides the whole datacube of the sample into several subsets and analyzes them one by one to obtain the results. Compared with analyzing the whole datacube at once, the strategy makes the analysis easier and greatly decreases the computation time. In this report, the IMS data were produced by air flow-assisted ionization IMS (AFAI-IMS). The strategy can be used in combination with most multivariate statistical analysis methods; here, it was combined with principal component analysis (PCA) and partial least squares (PLS) analysis. It was shown to be effective by analyzing a handwriting sample. Using the strategy, the m/z values corresponding to specific lipids in rat brain tissue were distinguished successfully. Moreover, the analysis time grew linearly instead of exponentially as the sample size increased. The strategy has considerable potential for searching for the m/z values of potential biomarkers quickly and effectively.
Funding: Funded by the National Natural Science Foundation of China (No. 41440027).
Abstract: Population growth and expanding urbanization have caused persistent shortages and contamination of groundwater resources in Mali, Africa. Because rising groundwater salinity makes it increasingly difficult for residents to obtain drinking water, it is necessary to clarify the causes and controlling factors of groundwater mineralization in the Gao region, northern Mali. Based on the hydrochemical composition of groundwater from 24 boreholes, Piper and Schoeller diagrams, principal component analysis (PCA), and hierarchical cluster analysis (HCA) were used to carry out multivariate statistical analysis of the main ions. The results show that the groundwater samples are weakly alkaline, with pH values ranging from 5.83 to 8.40 and a borehole average of 7.50. The average electrical conductivity (EC) is 354.4 μS/cm, with extreme values of 124.0 and 1247 μS/cm. The water is generally mineralized and presents nine water types. Three principal components explain 84.42% of the total variance for 13 parameters: factors F1 (58.85%), F2 (16.88%), and F3 (8.69%) account for most of the variation in the data set. In addition, multivariate statistical analysis confirmed the genetic relationship among aquifers and identified three main clusters: one related to groundwater mineralization (F1), one related to oxidation-reduction and iron enrichment (F2), and one related to groundwater pollution by nitrate and magnesium (F3). We found that agriculture, weathering, and dissolution of geological materials promote the mineralization of groundwater. Groundwater in the Gao region is becoming less and less potable because of increasing salinity.
Funding: Supported in part by the National Science Fund for Distinguished Young Scholars of China (62225303), the National Natural Science Foundation of China (62303039, 62433004), the China Postdoctoral Science Foundation (BX20230034, 2023M730190), the Fundamental Research Funds for the Central Universities (buctrc202201, QNTD2023-01), and the High Performance Computing Platform, College of Information Science and Technology, Beijing University of Chemical Technology.
Abstract: Data-driven process monitoring is an effective approach to assure the safe operation of modern manufacturing and energy systems, such as the thermal power plants studied in this work. Industrial processes are inherently dynamic and need to be monitored using dynamic algorithms. Mainstream dynamic algorithms rely on concatenating the current measurement with past data. This work proposes a new, alternative dynamic process monitoring algorithm using dot product feature analysis (DPFA). DPFA computes the dot product of consecutive samples, thus naturally capturing the process dynamics through temporal correlation. At the same time, DPFA's online computational complexity is lower than that of not just existing dynamic algorithms but also classical static algorithms (e.g., principal component analysis and slow feature analysis). The detectability of the new algorithm is analyzed for three types of faults typically seen in process systems: sensor bias, process fault, and gain change fault. Through experiments with a numerical example and real data from a thermal power plant, the DPFA algorithm is shown to be superior to state-of-the-art methods in terms of monitoring performance (fault detection rate and false alarm rate) and computational complexity.
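As a rough illustration of the core operation named in the abstract above (the dot product of consecutive samples), the following Python sketch computes one scalar per pair of adjacent measurement vectors. It is not the DPFA algorithm itself: the function name `consecutive_dot_products` and the toy data are assumptions introduced here, and the abstract does not say how such features are scaled or turned into monitoring statistics.

```python
def consecutive_dot_products(samples):
    """Return x[t-1] . x[t] for every pair of adjacent samples.

    `samples` is a list of equal-length numeric vectors ordered in time.
    Hypothetical helper for illustration only; not taken from the paper.
    """
    if any(len(s) != len(samples[0]) for s in samples):
        raise ValueError("all samples must have the same number of variables")
    return [
        sum(a * b for a, b in zip(prev, cur))
        for prev, cur in zip(samples, samples[1:])
    ]


# Toy data (assumed): three samples of a two-variable process.
features = consecutive_dot_products([[1, 0], [1, 1], [0, 2]])
print(features)
```

Each feature needs only one pass over a pair of vectors, which is consistent with the low online cost the abstract claims, although the actual complexity analysis is in the paper.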
Funding: Supported by the Ministry of Land and Resources of China (No. [2005]011-16), the State Environment Protection Administration of China (No. 2001-1-2), the State Key Laboratory of Geological Processes and Mineral Resources, China University of Geosciences, and the Guangdong Provincial Office of Science and Technology via the NSF Team Project and Key Project (Nos. 06202438, 2004A3030800).
Abstract: Dongguan (东莞) City, located in the Pearl River Delta, South China, is known for its rapid industrialization over the past 30 years. A total of 90 topsoil samples were collected from agricultural fields in the city, including vegetable and orchard soils, and eight heavy metals (As, Cu, Cd, Cr, Hg, Ni, Pb, and Zn) and other properties (pH and organic matter) were analyzed to evaluate the influence of anthropogenic activities on the environmental quality of agricultural soils and to identify the spatial distribution and possible sources of trace elements. Hg, Pb, and Cd have accumulated markedly here in comparison with the soil background contents for Guangdong (广东) Province. Pollution is more serious in the western plain and the central region, where industries and rivers are densely distributed. Multivariate and geostatistical methods were applied to differentiate the influences of natural processes and human activities on heavy metal pollution in topsoils in the study area. The results of cluster analysis (CA) and factor analysis (FA) show that Ni, Cr, Cu, Zn, and As are grouped in factor F1, Pb in F2, and Cd and Hg in F3. The spatial pattern of the three factors is well demonstrated by geostatistical analysis. The first factor can be considered a natural source controlled by parent rocks. The second factor can be attributed to industrial and traffic pollution sources. The third factor is mainly controlled by long-term anthropogenic activities, as a consequence of agricultural activities, fossil fuel consumption, and atmospheric deposition.
Abstract: This study investigates the hydrochemical characteristics of a shallow aquifer in a semi-arid region of northwest Algeria and seeks to understand the major factors governing groundwater quality. The study area suffers from recurring droughts, groundwater over-exploitation, and groundwater quality degradation. The approach combines traditional hydrochemical analysis methods, multivariate statistical techniques such as principal component analysis (PCA), and ratios of major ions, based on data from 33 groundwater samples collected in February 2014. Results show that groundwater in the study area is highly mineralized and generally has a high concentration of chloride (Cl⁻). The dominant water types are Na-Cl (27%), Mg-HCO₃ (24%), and Mg-Cl (24%). According to the PCA, salinization is the main process controlling hydrochemical variability. The PCA also reveals the impact of anthropogenic factors, especially agricultural activities, on groundwater quality. It highlights two types of recharge: superficial recharge from effective rainfall and excess irrigation water, distinguished by the presence of nitrate, and lateral recharge or vertical leakage from carbonate formations, marked by the omnipresence of HCO₃⁻. Additionally, three categories of samples were identified: (1) samples characterized by good water quality and receiving notable recharge from carbonate formations; (2) samples impacted by the natural salinization process; and (3) samples contaminated by anthropogenic activities. The major natural processes influencing water chemistry are the weathering of carbonate and silicate rocks, dissolution of evaporites such as halite, evaporation, and cation exchange. The results can provide a basis for local decision makers to ensure the sustainable management of groundwater and the safety of drinking water.
Funding: Supported by the Doctoral Research Start-up Fund, East China University of Technology (DHBK2019313), the Open Research Fund Program of the Key Laboratory of Metallogenic Prediction of Nonferrous Metals and Geological Environment Monitoring (Central South University), Ministry of Education (2020YSJS10), the Open Research Fund Program of the Shandong Provincial Engineering Laboratory of Application and Development of Big Data for Deep Gold Exploration (SDK202224), and the Basic Scientific Research Fund of the Institute of Geophysical and Geochemical Exploration, Chinese Academy of Geological Sciences (AS2022P03).
Abstract: Geochemical surveys are essential for understanding the spatial distribution of ore-forming elements. However, these surveys often involve compositional data (weight concentrations), which do not meet the requirements of standard statistical methods because of the closure effect. In this study, we applied an integrated approach combining compositional data, multifractal, and multivariate statistical analyses to identify the nonlinear complexity of the spatial distributions of elemental concentrations in the Er’renshan ore field. Initially, the raw concentrations were transformed into log-ratios following the principles of compositional data theory to alleviate the impact of the closure effect. Multifractal analysis was then conducted to characterise the nonlinear complexity of the concentration distributions. Furthermore, principal component analysis (PCA) and factor analysis (FA) were applied to identify spurious correlations and the potential factors controlling the distribution patterns. The results demonstrate that: a) the raw data are biased, while the log-ratio data are unbiased and more reliable; b) the spatial distributions of elemental concentrations exhibit nonlinear complexity; and c) the elemental distribution in the study area is largely controlled by structural factors.
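The log-ratio step mentioned in the abstract above can be illustrated with the centered log-ratio (clr) transform, one common choice in compositional data analysis. The abstract does not say which log-ratio transform the authors used, so clr, the function name `clr`, and the sample values below are assumptions for illustration only.

```python
import math


def clr(composition):
    """Centered log-ratio transform of one strictly positive composition.

    Each part is replaced by log(part / geometric mean of all parts),
    computed here as log(part) minus the mean of the logs.
    """
    if any(x <= 0 for x in composition):
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Assumed example: a three-part composition expressed as fractions.
fractions = clr([0.2, 0.3, 0.5])
# The same composition in other units gives the same clr coordinates,
# which is why the transform removes the dependence on closure.
rescaled = clr([2, 3, 5])
print(fractions, rescaled)
```

The clr coordinates always sum to zero, so methods such as PCA can be applied to them without the constant-sum constraint of the raw concentrations.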
Funding: Financial assistance provided by CSIR (09/0420(11800)/2021EMR-I).
Abstract: The Kumaun Himalaya is well known as a geologically and tectonically complex region that amplifies mass wasting processes, particularly landslides. This study investigates the interplay between landslide distribution and the lithotectonic regime of Darma Valley, Kumaun Himalaya. A landslide inventory comprising 295 landslides in the area was prepared, and several morphotectonic proxies, such as valley floor width to height ratio (Vf), stream length gradient index (SL), and hypsometric integral (HI), were used to infer the tectonic regime. Morphometric analysis, including basic, linear, areal, and relief aspects, of 59 fourth-order sub-basins was carried out to estimate erosion potential in the study area. The results demonstrate that 46.77% of the landslides lie in very high, 20.32% in high, 21.29% in medium, and 11.61% in low erosion potential zones. To determine the key parameters controlling erosion potential, two multivariate statistical methods, Principal Component Analysis (PCA) and Agglomerative Hierarchical Clustering (AHC), were used. PCA reveals that the Higher Himalayan Zone (HHZ) has the highest erosion potential owing to the presence of elongated sub-basins characterized by steep slopes and high relief. The clusters created through AHC exhibit positive PCA values, indicating a robust correlation between PCA and AHC. Furthermore, the landslide density map shows two major landslide hotspots. One lies in the vicinity of the highly active Munsiyari Thrust (MT), while the other is in the Pandukeshwar Formation within the MT's hanging wall, characterized by a high exhumation rate. High SL and low Vf values along these hotspots further corroborate that the occurrence of landslides in the study area is influenced by tectonic activity. By identifying erosion-prone areas and elucidating the implications of tectonic activity for landslide distribution, this study can help policymakers and government agencies develop strategies for hazard assessment and effective landslide risk mitigation, thereby safeguarding lives and communities.
Funding: Supported by the National Natural Science Foundation of China (42377072, 52409105).
Abstract: Water quality is a pressing issue affecting the sustainable development of lakes. To elucidate the spatial and temporal characteristics of water quality in Bosten Lake, China, this study constructed a comprehensive water quality index (CWQI) based on key water quality indicators, using water quality data collected from 17 sampling sites from 2011 to 2019. Key water quality indicators were determined using factor analysis, and the spatial and temporal characteristics of these indicators and the CWQI were examined using multivariate statistical analysis. The key water quality indicators included pH, chemical oxygen demand (COD), water transparency (SD), NO₃⁻, total dissolved solids (TDS), Cl⁻, SO₄²⁻, and electrical conductivity (EC). Furthermore, the contribution rates of all water quality indicators to water quality were quantified using SHapley Additive exPlanations (SHAP) values, thereby validating the factor analysis outcomes. Among the eight key water quality indicators, COD had the most significant influence on the water quality of Bosten Lake. The water quality of Bosten Lake remained at Class Ⅲ from 2011 to 2019 (CWQI ranging from 3.19 to 3.90). It was characterized by distinct regional differences that arose from hydrodynamic processes within the lake and upstream water quality. The southwestern region exhibited the best water quality (mean CWQI of 3.47), whereas the northwestern region exhibited the worst (mean CWQI of 3.58). Alongside increased monitoring of industrial and agricultural effluent discharge, a series of ecological restoration projects for the lake basin have been initiated. Over time, the water quality of Bosten Lake showed gradual improvement (CWQI improvement rate of 0.05/a). This study provides a scientific basis for enhancing the understanding and effective management of water quality in the Bosten Lake Basin through a comprehensive analysis of its spatial and temporal evolution and driving mechanisms.
Funding: Supported by the National Water Special Project (No. 2009ZX07526-005) and the Strategic Environmental Assessment Project (No. HP1080901).
Abstract: Multivariate statistical techniques (cluster analysis, non-parametric tests, and factor analysis) were applied to a water quality dataset comprising 13 parameters at 37 sites in the Three Gorges area, China, from 2003–2008 to investigate spatio-temporal variations and identify potential pollution sources. Using cluster analysis, the twelve months of the year were classified into three periods, low-flow (LF), normal-flow (NF), and high-flow (HF), and the 37 monitoring sites were divided into low pollution (LP), moderate pollution (MP), and high pollution (HP) groups. Dissolved oxygen (DO), potassium permanganate index (COD_Mn), and ammonia-nitrogen (NH₄⁺-N) were identified by non-parametric tests as significant variables affecting temporal and spatial variations. Factor analysis identified that the major pollutants in the HP region were organic matter and nutrients during NF, heavy metals during LF, and petroleum during HF. In the MP region, the identified pollutants were primarily organic matter and heavy metals year-round, while in the LP region, organic pollution was significant during both NF and HF, and nutrient and heavy metal levels were high during both LF and HF. The main sources of pollution were domestic wastewater and agricultural activities and runoff; however, they contributed differently to pollution levels in each region. For the HP region, inputs from wastewater treatment plants were significant, but for the MP and LP regions, water pollution more likely resulted from the combined effects of agriculture, domestic wastewater, and the chemical industry. These results provide fundamental information for developing better water pollution control strategies for the Three Gorges area.
Funding: Supported by the National High-Tech Development Program of China (No. 863-511-920-011, 2001AA411230).
Abstract: Multivariate statistical process monitoring and control (MSPM&C) methods for chemical process monitoring with statistical projection techniques such as principal component analysis (PCA) and partial least squares (PLS) are surveyed in this paper. The four-step procedure for performing MSPM&C on a chemical process is analyzed: modeling the process, detecting abnormal events or faults, identifying the variable(s) responsible for the faults, and diagnosing the source of the abnormal behavior. Several main research directions of MSPM&C reported in the literature are discussed, such as multi-way principal component analysis (MPCA) for batch processes, statistical monitoring and control of nonlinear processes, dynamic PCA and dynamic PLS, and on-line quality control by inferential models. Industrial applications of MSPM&C to several typical chemical processes, such as chemical reactors, distillation columns, polymerization processes, and petroleum refinery units, are summarized. Finally, some concluding remarks and future considerations are given.
Funding: Supported by the National Natural Science Foundation of China (No. 60574047) and the Doctorate Foundation of the State Education Ministry of China (No. 20050335018).
Abstract: Data-driven tools such as principal component analysis (PCA) and independent component analysis (ICA) have been applied to different benchmarks as process monitoring methods. The difference between the two methods is that the components of PCA may still be dependent, whereas ICA has no orthogonality constraint and its latent variables are independent. Process monitoring with PCA often assumes that the process data or principal components follow a Gaussian distribution. However, this assumption is not satisfied by several practical processes. To extend the use of PCA, a nonparametric method is added to PCA to overcome this difficulty, and kernel density estimation (KDE) is a good choice. Although ICA is based on non-Gaussian distribution information, KDE can still help in the close monitoring of the data. PCA, ICA, PCA with KDE (KPCA), and ICA with KDE (KICA) are demonstrated and compared by applying them to a practical industrial Spheripol process polypropylene catalyzer reactor instead of a laboratory emulator.
Funding: Supported by the Major State Basic Research Development Program (No. 2010CB428800) and the Geological Survey Projects Foundation of the Institute of Hydrogeology and Environmental Geology (No. SK201308).
Abstract: Understanding the factors controlling groundwater quality can promote the sustainable development of groundwater resources. To this end, multivariate statistical analysis (MA) and hydrochemical analysis were applied in this work. A canonical discriminant function with 7 parameters was established using discriminant analysis (DA); it affords 100% correct assignation to the 3 clusters (good water (GW), poor water (PW), and very poor water (VPW)) obtained from cluster analysis (CA). Using factor analysis (FA), 8 factors were extracted from 25 hydrochemical elements, accounting for 80.897% of the total data variance. The factors suggest that groundwater with higher concentrations of sodium, calcium, magnesium, chloride, and sulfate in the southeastern study area is mainly affected by natural processes; that the higher levels of arsenic and chromium in groundwater from the northwestern part of the study area derive from industrial activities; and that domestic and agricultural sewage contribute substantially to copper, iron, iodine, and phosphate in the northern study area. This work can therefore help identify the main factors controlling groundwater quality in the North China Plain and support better-informed decisions on the sustainable development of groundwater resources.
Funding: Supported by the Program for Science and Technology of Shandong Province (2011GHY11521), the Department of Education of Shandong Province (No. J11LB07), and the Natural Science Foundation of Qingdao City (Nos. 12-1-3-52-(1)-nsh and 12-1-4-16-(7)-jch).
Abstract: Meretricis concha is a marine traditional Chinese medicine (TCM) commonly used for the treatment of asthma and scald burns. To investigate the relationship between the inorganic elemental fingerprint and the geographical origin of Meretricis concha, the elemental contents of M. concha from five sampling points in Rushan Bay were determined by inductively coupled plasma optical emission spectrometry (ICP-OES). Based on the contents of 14 inorganic elements (Al, As, Cd, Co, Cr, Cu, Fe, Hg, Mn, Mo, Ni, Pb, Se, and Zn), an inorganic elemental fingerprint that reflects the elemental characteristics well was constructed. All the data from the five sampling points were accurately discriminated through hierarchical cluster analysis (HCA) and principal component analysis (PCA). A four-factor model explaining approximately 80% of the detection data was established, and the elements Al, As, Cd, Cu, Ni, and Pb could be viewed as the characteristic elements. This investigation suggests that the inorganic elemental fingerprint combined with multivariate statistical analysis is a promising method for verifying the geographical origin of M. concha, and that this strategy should be valuable for the authenticity discrimination of some marine TCMs.
Funding: Project (2012ZX07501002-001) supported by the Ministry of Science and Technology of China.
Abstract: Multivariate statistical techniques, such as cluster analysis (CA), discriminant analysis (DA), principal component analysis (PCA), and factor analysis (FA), were applied to evaluate and interpret surface water quality data sets of the Second Songhua River (SSHR) basin in China, obtained during two years (2012-2013) of monitoring of 10 physicochemical parameters at 15 different sites. The results showed that most physicochemical parameters varied significantly among the sampling sites. Three significant groups of sampling sites, highly polluted (HP), moderately polluted (MP), and less polluted (LP), were obtained through hierarchical agglomerative CA on the basis of similarity of water quality characteristics. DA identified pH, F, DO, NH3-N, COD, and VPhs as the most important parameters contributing to spatial variations of surface water quality. However, DA did not give a considerable data reduction (40% reduction). PCA/FA resulted in three, three, and four latent factors explaining 70%, 62%, and 71% of the total variance in the water quality data sets of the HP, MP, and LP regions, respectively. FA revealed that the SSHR water chemistry was strongly affected by anthropogenic activities (point sources: industrial effluents and wastewater treatment plants; non-point sources: domestic sewage, livestock operations, and agricultural activities) and natural processes (seasonal effects and natural inputs). PCA/FA for the whole basin showed the best results for data reduction because it used only two parameters (about 80% reduction) as the most important parameters to explain 72% of the data variation. Thus, this work illustrates the utility of multivariate statistical techniques for the analysis and interpretation of datasets and, in water quality assessment, for the identification of pollution sources/factors and the understanding of spatial variations in water quality for effective stream water quality management.
Funding: National Natural Science Foundation of China (No. 60421002, No. 70471052).
Abstract: Chemical process variables are always driven by random noise and disturbances. Closed-loop control yields process measurements that are auto- and cross-correlated. The influence of auto and cross correlations on statistical process control (SPC) is investigated in detail by Monte Carlo experiments. It is revealed that, in the sense of average performance, the false alarm rates (FAR) of principal component analysis (PCA) and dynamic PCA are not affected by the time-series structures of process variables. Nevertheless, non-independent, identically distributed data cause the actual FAR to deviate noticeably from its theoretical value and result in unexpected consecutive false alarms for a normally operating process. Dynamic PCA and ARMA-PCA are shown to be ineffective at removing the influences of auto and cross correlations. Subspace identification-based PCA (SI-PCA) is proposed to improve the monitoring of dynamic processes. Through state space modeling, SI-PCA can remove the auto and cross correlations efficiently and avoid consecutive false alarms. Synthetic Monte Carlo experiments and an application to the Tennessee Eastman challenge process illustrate the advantages of the proposed approach.
Funding: National Key Research and Development Program of China (2016YFA0600101), National Basic Research Program of China (973 Program, 2010CB950802), and National Natural Science Fund (41605028).
Abstract: A technique for estimating tropical cyclone (TC) intensity over the Western North Pacific using FY-3 Microwave Imager (MWRI) data is developed. As a first step, we investigated the relationship between the FY-3 MWRI brightness temperature (TB) parameters, which are computed in concentric circles or annuli of different radii at different MWRI frequencies, and the TC maximum wind speed (Vmax) from the TC best track data. We found that the parameters of the lower-frequency channels (minimum TB, mean TB, and the ratio of pixels over the threshold TB within a radius of 1.0 or 1.5 degrees from the center) give higher correlations. Then, by applying principal component analysis (PCA) and multiple regression, we established an estimation model and evaluated it using independent verification data, obtaining an RMSE of 13 kt. The estimated Vmax is always stronger in the early stages of development but slightly weaker toward the mature stage, and the bias reverses sign at around 70 kt. TCs with larger errors often have less organized and asymmetric cloud patterns, so classifying TC cloud patterns should help improve the accuracy of the estimated TC intensity, which should also improve as the number of statistical samples increases.
Funding: The authors would like to thank the Laboratory of Water Engineering, Fasa University, for providing the facilities to perform this research.
Abstract: Groundwater is one of the most important sources of water supply in Iran. The Fasa Plain in Fars Province, southern Iran, is one of the major wheat-producing areas that use groundwater for irrigation. A large population also uses local groundwater for drinking. Therefore, this plain was selected to assess the spatial variability of groundwater quality and to identify the main parameters affecting water quality using multivariate statistical techniques such as Cluster Analysis (CA), Discriminant Analysis (DA), and Principal Component Analysis (PCA). Water quality data were monitored at 22 different wells for five years (2009-2014) with 10 water quality parameters. Using cluster analysis, the sampling wells were grouped into two clusters with distinct water qualities at different locations. The Lasso Discriminant Analysis (LDA) technique was used to assess the spatial variability of water quality. Based on the results, all of the variables except the sodium adsorption ratio (SAR) are effective in the LDA model, which affords 92.80% correct assignation in discriminating between the clusters from the original 10 variables. Principal component (PC) analysis and factor analysis reduced the complex data matrix to two main components, accounting for more than 95.93% of the total variance. The first PC contained the parameters TH, Ca²⁺, and Mg²⁺; therefore, the first dominant factor was hardness. In the second PC, Cl⁻, SAR, and Na⁺ were the dominant parameters, which may indicate salinity. The extracted factors reflect natural (geological formations) and anthropogenic (improper disposal of domestic and agricultural wastes) influences on groundwater quality.
Funding: National Natural Science Foundation of China (70931004).
Abstract: For aircraft manufacturing industries, the analysis and prediction of part machining error during the machining process are very important for controlling and improving part machining quality. To control machining error effectively, a method integrating multivariate statistical process control (MSPC) and stream of variations (SoV) is proposed. First, machining error is modeled by multi-operation approaches for the part machining process. SoV is adopted to establish the mathematical model of the relationship between the error of upstream operations and the error of downstream operations. Here, error sources include not only the influence of upstream operations but also many other error sources. A standard model and a predicted model based on SoV are built, depending on whether the operation has been carried out, to satisfy different requirements during the part machining process. Second, the one-step-ahead forecast error (OSFE) method is used to eliminate the autocorrelation of the sample data from the SoV model, and the T² control chart in MSPC is built to detect machining error according to the data characteristics of the above error model, which makes it possible to judge whether an operation is out of control. If it is, feedback is sent to the operations. The error model is modified by adjusting the out-of-control operation and then continues to be used to monitor operations. Finally, a machining example containing two operations demonstrates the effectiveness of the machining error control method presented in this paper.
Funding: Sponsored by the Scientific Research Foundation for Returned Overseas Chinese Scholars of the Ministry of Education of China.
Abstract: A new method using discriminant analysis and control charts is proposed for monitoring multivariate process operations more reliably. Fisher discriminant analysis (FDA) is used to derive a feature discriminant direction (FDD) between normal operation and each fault operation, and each FDD thus obtained constructs the feature space of the corresponding fault operation. Individuals control charts (XmR charts) are used to monitor multivariate processes using the process data projected onto the feature spaces. The upper control limit (UCL) and lower control limit (LCL) on each feature space are calculated from normal process operation for the XmR charts and are used to distinguish faults from normal operation. The variation trend on an XmR chart reveals the type of the relevant fault operation. Applications to the Tennessee Eastman simulation process show that the proposed method can achieve better monitoring performance than principal component analysis (PCA)-based methods and can better identify step-type faults on XmR charts.
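The control limits mentioned in the abstract above can be sketched with the standard individuals (X) chart rule, in which the limits sit 2.66 average moving ranges either side of the baseline mean. This is a generic textbook calculation, not the authors' code: the function names, the assumption that the inputs are already scores projected onto one feature direction, and the sample numbers are all hypothetical.

```python
from statistics import mean


def xmr_limits(baseline):
    """Return (center, lcl, ucl) of an individuals chart from normal-operation data."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline observations")
    center = mean(baseline)
    # Moving range of span 2: absolute difference between adjacent points.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    # 2.66 is 3 / d2, with d2 = 1.128 for moving ranges of two observations.
    half_width = 2.66 * mean(moving_ranges)
    return center, center - half_width, center + half_width


def out_of_control(values, lcl, ucl):
    """Return the indices of points that fall outside the control limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Assumed projected scores from normal operation, then three new scores.
center, lcl, ucl = xmr_limits([10, 12, 11, 13, 12])
flagged = out_of_control([11, 16, 7], lcl, ucl)
print(center, lcl, ucl, flagged)
```

In the setting the abstract describes, one such chart would presumably be kept per fault-specific feature space, with the limits always estimated from normal operating data.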
Funding: Supported by the National Instrumentation Programme (Nos. 2011YQ17006702 and 2011YQ14015010), the National Natural Science Foundation of China (Nos. 81102413 and 21175121), and the Fundamental Research Program of Shenzhen (No. JC201005280634A).
Abstract: A new multivariate statistical strategy for analyzing the large datasets produced by imaging mass spectrometry (IMS) techniques is reported. The strategy divides the whole datacube of the sample into several subsets and analyzes them one by one to obtain the results. Compared with analyzing the whole datacube at one time, the strategy makes the analysis easier and greatly decreases the computation time. In this report, the IMS data were produced by air flow-assisted ionization IMS (AFAI-IMS). The strategy can be used in combination with most multivariate statistical analysis methods. In this paper, it was combined with principal component analysis (PCA) and partial least squares (PLS) analysis. It was shown to be effective by analyzing a handwriting sample. Using the strategy, the m/z values corresponding to specific lipids in rat brain tissue were distinguished successfully. Moreover, the analysis time grew linearly instead of exponentially as the sample size increased. The strategy developed in this study has enormous potential for searching for the m/z of potential biomarkers quickly and effectively.