Geomechanical data are never sufficient in quantity, nor adequately precise and accurate, for design purposes in mining and civil engineering. The objective of this paper is to show the variability of rock properties at the sampled point in the roadway's roof and then how the statistical processing of the available geomechanical data can affect the results of numerical modelling of the roadway's stability. Four cases were applied in the numerical analysis: average values (the most common in geomechanical data analysis), average minus standard deviation, median, and average value minus statistical error. The study shows that different approaches to the same geomechanical data set can change the modelling results considerably. The average-minus-standard-deviation case is the most conservative and least risky. It gives displacements and a yielded-elements zone over a range four times broader than in the average-values scenario, which is the least conservative option. The two other cases need to be studied further; their results fall between the most favorable and most adverse values. Taking the average values corrected by statistical error for the numerical analysis seems to be the best solution. Moreover, the confidence level can be adjusted depending on the importance of the object and the assumed risk level.
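The four statistical cases named in this abstract can be illustrated with a short sketch. The sample values are hypothetical, and the abstract does not define "statistical error", so the code assumes it is a confidence half-width of the mean, z * s / sqrt(n), with z = 1.96 as an assumed confidence factor.

```python
import math
import statistics


def design_values(samples, z=1.96):
    """Return four candidate design values for one rock property.

    z is an assumed confidence factor (1.96 is roughly a 95% level);
    raising or lowering it mimics adjusting the confidence level.
    """
    n = len(samples)
    mean = statistics.mean(samples)
    sd = statistics.stdev(samples)           # sample standard deviation (n - 1)
    stat_error = z * sd / math.sqrt(n)       # assumed meaning of "statistical error"
    return {
        "average": mean,
        "average_minus_sd": mean - sd,
        "median": statistics.median(samples),
        "average_minus_error": mean - stat_error,
    }


# Hypothetical compressive-strength readings in MPa, not data from the paper.
ucs_mpa = [42.0, 55.0, 38.0, 61.0, 47.0, 50.0]
values = design_values(ucs_mpa)
for name, value in values.items():
    print(f"{name}: {value:.2f}")
```

For this hypothetical sample the average minus standard deviation is the lowest strength value and the plain average the highest, which matches the ordering from most to least conservative described in the abstract.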
Owing to advances in intelligent transportation systems (ITSs), traffic forecasting has gained significant interest, because robust traffic prediction is an important part of different ITS applications such as traffic signal control, navigation, and route mapping. A traffic prediction model aims to predict traffic conditions based on past traffic data. For more accurate traffic prediction, this study proposes an optimal deep learning-enabled statistical analysis model. It presents the design of an optimal convolutional neural network with attention long short-term memory (OCNN-ALSTM) model for traffic prediction. The proposed OCNN-ALSTM technique first preprocesses the traffic data using min-max normalization. The OCNN-ALSTM technique is then executed to classify and predict the traffic data in real-time cases. To enhance the predictive outcomes of the OCNN-ALSTM technique, the bird swarm algorithm (BSA) is applied to it, which improves the overall efficacy of the network. The design of BSA for optimal hyperparameter tuning of the CNN-ALSTM model constitutes the novelty of the work. The OCNN-ALSTM technique is validated experimentally on benchmark datasets, and the results are examined under several aspects. The simulation results show improved outcomes for the OCNN-ALSTM model over recent methods along several dimensions.
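The min-max normalization step mentioned in this abstract can be sketched as follows. This is a generic illustration under the usual definition, (x - min) / (max - min); the function name and the traffic counts are hypothetical and are not taken from the paper.

```python
def min_max_normalize(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical vehicle counts per interval.
flows = [120, 340, 560, 230]
scaled = min_max_normalize(flows)
print(scaled)  # [0.0, 0.5, 1.0, 0.25]
```

In practice the minimum and maximum would be taken from the training data only and reused on new observations, so that the test data do not leak into the scaling.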
The establishment of effective null models can provide reference networks to accurately describe the statistical properties of real-life signed networks. At present, the two classical null models of signed networks (i.e., the sign and full-edge randomized models) shuffle both positive and negative topologies at the same time, so it is difficult to distinguish the effects on network topology of positive edges, negative edges, and the correlation between them. In this study, we construct three refined edge-randomized null models by randomizing only link relationships without changing the positive and negative degree distributions. The results for nontrivial statistical indicators of signed networks, such as average degree connectivity and clustering coefficient, show that the position of positive edges has a stronger effect on positive-edge topology, while the signs of negative edges have a greater influence on negative-edge topology. For some specific statistics (e.g., embeddedness), the results indicate that the proposed null models describe real-life networks more accurately than the two existing ones, and they can be selected to facilitate a better understanding of complex structures, functions, and dynamical behaviors on signed networks.
Since its first flight in 2007, NASA's UAVSAR instrument has acquired a large volume of fully polarimetric SAR (PolSAR) data at very high spatial resolution. Small spatial features can be observed in this type of data, offering the opportunity to explore structures in the images. In general, structured scenes present multimodal or spiky histograms. The finite mixture model has great advantages in modeling data with irregular histograms. In this paper, an important type of statistics called log-cumulants, which can be used to design parameter estimators or goodness-of-fit tests, is derived for the finite mixture model. They are compared with the log-cumulants of the texture models. The results are applied to UAVSAR data analysis to determine which model is better for different land types.
Qasab basin is one of the most promising areas for sustainable development in the Eastern Desert fringes of the Nile Valley, Egypt. The integration of statistical analysis, stable isotopes, and geochemical modeling tools delineated the geochemical processes affecting groundwater quality and detected the main recharge source in Qasab basin. Most of the groundwater samples are brackish (88%), while the minority (12%) are fresh. The electrical conductivity of groundwater ranged from 1135 to 10,030 μS/cm. The statistical analysis and hydrochemical diagrams suggest that the groundwater quality is mainly controlled by several intermixed processes (rock weathering and agricultural activities). The mineralization of the Pleistocene groundwater is regulated by rock weathering, evaporation processes, and reverse cation exchange. The isotopic signatures (δ²H and δ¹⁸O) represent two groundwater groups. The first group is enriched in δ¹⁸O, which ranges from 0.9‰ to 5.5‰. This group is mostly affected by recent meteoric recharge from surface water leakage. The second group is relatively depleted in δ¹⁸O, reflecting a palaeo-recharge source from a colder climate. Its δ¹⁸O varies from -10.1‰ to -6.4‰, indicating upward leakage from the Nubian sandstone aquifer through deep-seated faults. The inverse geochemical model indicates that the salinity of the groundwater samples is due to the leaching and dissolution of carbonate, sulphate, and chloride minerals from the aquifer matrix. This study can serve as a hydrochemical assessment guide to support sustainable development in Qasab basin, so that adequate groundwater management can help reduce poverty and support socioeconomic development.
Background: The Cox Proportional Hazard (Cox-PH) model has been a widely used method for survival analysis of cancer data, giving the survival times as a function of covariates or risk factors. However, the assumptions for applying the Cox-PH model are seldom satisfied in research studies, raising questions about the effectiveness, robustness, and accuracy of the model in predicting the proportion of survival times. This is because the necessary assumptions are difficult to satisfy in most cases, as is the assessment of interaction among covariates. Methods: To further improve the therapeutic/treatment strategy for cancer diseases, we proposed a new approach to survival analysis using multiple myeloma (MM) cancer data. We first developed a data-driven nonlinear statistical model that predicts the survival times with 93% accuracy. We then performed a parametric analysis on the predicted survival times to obtain the survival function, which is used in estimating the proportion of survival times. Results: The proposed approach to survival analysis proved to be more robust and gives better estimates of the proportion of survival than the Cox-PH model. In addition, satisfying the proposed model's assumptions and finding interactions among risk factors is less difficult than with the Cox-PH model. The proposed model can predict the real values of the survival times, and the identified risk factors are ranked according to their percent contribution to the survival time. Conclusion: The proposed nonlinear statistical model approach to survival analysis of cancer diseases is very efficient and provides an improved and innovative strategy for cancer therapy/treatment.
This paper describes experiments with Korean-to-Vietnamese statistical machine translation (SMT). Korean is a morphologically complex language that does not have clear optimal word boundaries, which causes a major problem when translating into or from Korean. To solve this problem, we present a method for Korean morphological analysis that uses a pre-analyzed partial word-phrase dictionary (PWD). In addition, we build a Korean-Vietnamese parallel corpus for training SMT models by collecting text from multilingual magazines. We then apply this morphological analysis to the Korean sentences in the collected parallel corpus as a preprocessing step. The experimental results demonstrate a remarkable improvement in Korean-to-Vietnamese translation quality in terms of the bilingual evaluation understudy (BLEU) score.
In the past few years, genome-wide association studies (GWAS) have had great success in identifying genetic susceptibility loci underlying many complex diseases and traits. The findings provide important genetic insights into the pathogenesis of diseases. In this paper, we present an overview of widely used approaches and strategies for the analysis of GWAS and offer general considerations for dealing with GWAS data. Issues regarding data quality control, population structure, association analysis, multiple comparison, and visual presentation of GWAS results are discussed; other advanced topics, including missing heritability, meta-analysis, set-based association analysis, copy number variation analysis, and GWAS cohort analysis, are also briefly introduced.
To improve the prediction accuracy and test the generalization ability of the dam deformation analysis model, the back-propagation (BP) neural network model for dam deformation analysis is studied, and a merging model is built based on the BP neural network algorithm and the traditional statistical model. The three models are calculated and analyzed using long-term deformation observation data from Chencun Dam. The analytical results show that the average prediction accuracies of the statistical model and the BP neural network model are ±0.477 mm and ±0.390 mm, respectively, while the prediction accuracy of the merging model is ±0.318 mm, an improvement of 33% and 18% over the other two models, respectively. The merging model also has better generalization ability and broad applicability.
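The improvement figures quoted in this abstract can be reproduced from the three reported accuracies. The sketch below assumes the improvement is the relative reduction in error, (baseline - merged) / baseline, which the abstract does not state explicitly; the function name is illustrative.

```python
def relative_improvement(baseline_mm, merged_mm):
    """Percentage reduction of the prediction error relative to a baseline."""
    return (baseline_mm - merged_mm) / baseline_mm * 100.0


# Accuracies reported in the abstract, in millimetres.
statistical_model = 0.477
bp_network = 0.390
merging_model = 0.318

vs_statistical = relative_improvement(statistical_model, merging_model)
vs_bp = relative_improvement(bp_network, merging_model)
print(f"vs statistical model: {vs_statistical:.1f}%")  # about 33.3%
print(f"vs BP network:        {vs_bp:.1f}%")           # about 18.5%
```

Under that assumption the values are about 33.3% and 18.5%, consistent with the 33% and 18% stated in the abstract.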
China has huge differences among its regions in terms of socio-economic development, industrial structure, natural resource endowments, and technological advancement. These differences have created complicated linkages between regions in China. In this study, building upon gravity model and location quotient techniques, we develop a sector-specific model to estimate inter-provincial trade flows, which are the basis for constructing a multi-regional input-output table. In the model, we distinguish sectors with less intra-sector input from those with larger intra-sector input, and assume that the former tend to compete among regions while the latter tend to cooperate among regions. We then apply this new method of inter-regional trade estimation to three sectors: food and tobacco, metal smelting and processing, and electrical equipment. The results show that the selection of bandwidth has a significant impact on the assessment of inter-regional trade. Trade flows become more scattered as the bandwidth increases. Bandwidth therefore reflects the spatial concentration of geographical activities, which should be distinguishable for different industries. We conclude that the sector-specific spatial model can increase the credibility of estimates of inter-regional trade flows.
The goal of this study is to analyze the statistics of the backscatter signal from bovine cancellous bone using a Nakagami model and to evaluate the feasibility of Nakagami-model parameters for cancellous bone characterization. Ultrasonic backscatter measurements were performed on 24 bovine cancellous bone specimens in vitro, and the backscatter signals were compensated for frequency-dependent attenuation prior to envelope detection. The statistics of the backscatter envelope were modeled using the Nakagami distribution. Our results reveal that the backscatter envelope mainly followed pre-Rayleigh distributions, and the deviations of the backscatter envelope from the Rayleigh distribution decreased with increasing bone density. The Nakagami shape parameter (i.e., m) was significantly correlated with bone densities (R = 0.78–0.81, p < 0.001) and trabecular microstructures (|R| = 0.46–0.78, p < 0.05). The scale parameter (i.e., Ω) and signal-to-noise ratio (SNR) also yielded significant correlations with bone density and structural features. Multiple linear regressions showed that bone volume fraction (BV/TV) was the main predictor of the Nakagami parameters, and microstructure made a significant independent contribution to the prediction of the Nakagami distribution parameters, explaining an additional 10.2% of the variance at most. The in vitro study showed that statistical parameters derived with the Nakagami model might be useful for cancellous bone characterization, and that statistical analysis has potential for ultrasonic backscatter bone evaluation.
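A common way to obtain the two Nakagami parameters from an envelope signal is the moment-based estimator, Ω = E[R²] and m = (E[R²])² / Var(R²). The sketch below uses that estimator as an assumption, since the abstract does not say which estimator the authors used, and it runs on a synthetic Rayleigh envelope rather than on bone data. Values of m below 1 correspond to pre-Rayleigh statistics, and m equal to 1 to the Rayleigh case.

```python
import math
import random
import statistics


def nakagami_moments(envelope):
    """Moment-based estimates of the Nakagami shape (m) and scale (omega)."""
    power = [r * r for r in envelope]
    omega = statistics.fmean(power)                # scale: mean envelope power
    m = omega ** 2 / statistics.pvariance(power)   # shape: inverse normalized variance
    return m, omega


# Synthetic Rayleigh envelope: magnitude of two independent unit Gaussians.
random.seed(0)
envelope = [
    math.hypot(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0))
    for _ in range(20000)
]
m_hat, omega_hat = nakagami_moments(envelope)
print(f"m = {m_hat:.3f}, omega = {omega_hat:.3f}")
```

For this synthetic Rayleigh input the estimates should land close to m = 1 and Ω = 2, which gives a quick sanity check before applying the same function to measured envelopes.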
This paper systematically studies statistical diagnosis and hypothesis testing for the semiparametric linear regression model, following the theories and methods of statistical diagnosis and hypothesis testing for the parametric regression model. Several diagnostic measures and methods for gross error testing are derived. In particular, the global and local influence of the gross error on the parameter X and the nonparameter s are discussed in detail; the paper also proves that the data point deletion model is equivalent to the mean shift model for the semiparametric regression model. Finally, some helpful conclusions are drawn from a simulated numerical example.
BACKGROUND Meta-analysis is a critical tool in evidence-based medicine, particularly in cardiology, where it synthesizes data from multiple studies to inform clinical decisions. This study explored the potential of using ChatGPT to streamline and enhance the meta-analysis process. AIM To investigate the potential of ChatGPT to conduct meta-analyses in interventional cardiology by comparing the results of ChatGPT-generated analyses with those of randomly selected, human-conducted meta-analyses on the same topic. METHODS We systematically searched PubMed for meta-analyses on interventional cardiology published in 2024. Five meta-analyses were randomly chosen. ChatGPT 4.0 was used to perform meta-analyses on the extracted data. We compared the results from ChatGPT with the original meta-analyses, focusing on key effect sizes, such as risk ratios (RR), hazard ratios, and odds ratios, along with their confidence intervals (CI) and P values. RESULTS The ChatGPT results showed high concordance with those of the original meta-analyses. For most outcomes, the effect measures and P values generated by ChatGPT closely matched those of the original studies, except for the RR of stent thrombosis in the Sreenivasan et al study, where ChatGPT reported a non-significant effect size, while the original study found it to be statistically significant. While minor discrepancies were observed in specific CI and P values, these differences did not alter the overall conclusions drawn from the analyses. CONCLUSION Our findings suggest the potential of ChatGPT in conducting meta-analyses in interventional cardiology. However, further research is needed to address the limitations of transparency and potential data quality issues, ensuring that AI-generated analyses are robust and trustworthy for clinical decision-making.
Considering the problems that currently need to be solved in synthetic earthquake prediction, a new model is proposed in this paper. It is a joint multivariate statistical model that combines principal component analysis with discriminant analysis. Principal component analysis and discriminant analysis are important theories in multivariate statistical analysis, which has developed quickly over the last thirty years. By means of an information-maximization method, we choose from numerous earthquake prediction factors several whose cumulative proportion of the total sample variance exceeds 90%. The paper applies regression analysis and Mahalanobis discrimination to extrapolate the synthetic prediction. Furthermore, we use this model to characterize and predict earthquakes in North China (30° to 42°N, 108° to 125°E), and better prediction results are obtained.
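The factor-selection criterion in this abstract, keeping components until their cumulative share of the total sample variance passes 90%, can be sketched as below. The eigenvalues are hypothetical and stand in for the variances of the principal components; the function name and the use of "at least 90%" as the stopping rule are assumptions for illustration.

```python
def components_for_variance(eigenvalues, threshold=0.90):
    """Return (k, share): the smallest number of leading components whose
    cumulative proportion of total variance reaches the threshold."""
    total = sum(eigenvalues)
    ordered = sorted(eigenvalues, reverse=True)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k, cumulative / total
    return len(ordered), 1.0


# Hypothetical eigenvalues of a covariance matrix of six prediction factors.
eigs = [4.0, 2.5, 2.0, 0.75, 0.5, 0.25]
k, share = components_for_variance(eigs)
print(k, share)  # 4 0.925
```

With these hypothetical eigenvalues, four components are needed, carrying 92.5% of the total variance.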
Symbolic circuit simulators are traditionally applied to the small-signal analysis of analog circuits. This paper establishes a symbolic behavioral macromodeling method applicable to both small-signal and large-signal analysis of general two-stage operational amplifiers (op-amps). The proposed method creates a two-pole parametric macromodel whose parameters are analytical functions of the circuit element parameters generated by a symbolic circuit simulator. A moment matching technique is used to derive the analytical model parameters. The resulting parametric behavioral model can be used for op-amp performance simulation in both the frequency and time domains. In particular, the parametric models are highly suited to fast statistical simulation of op-amps in the time domain. Experimental results show that the statistical distributions of the op-amp slew and settling time characterized by the proposed model agree well with the transistor-level results, while achieving a significant speedup.
Ionospheric variability is influenced by many factors, such as solar radiation, neutral atmosphere composition, and geomagnetic disturbances. Mainly characterized by the total electron content (TEC) and electron density, the climatology of the ionosphere features temporal and spatial changes. Establishing a multivariate regression model helps substantially in better understanding the characteristics of the ionosphere and their long-term variability. In this paper, an improvement to the existing ionosphere multivariate linear fitting regression model is proposed and investigated using data from both the ionosonde and the global ionosphere map (GIM) derived from ground-based Global Navigation Satellite System (GNSS) observations. The proposed method gives more consideration to the impact of solar activity and adds modeling of the annual and half-year periodic fluctuations for the F10.7 index. The improved model is verified to correlate better with the real observations and can help reduce the calculation uncertainty. Moreover, the proposed model is used to evaluate the fitting accuracy of the GIMs produced by five authorized data analysis centers of the International GNSS Service (IGS). The results show that there is a fixing hole in the North America region for the GIM model, where the correlation between the GIM and the proposed model always returns lower values than in other places.
Statistical two-group comparisons are widely used in microarray data analysis to identify significant differentially expressed (DE) signatures associated with a therapy response. We applied rank order statistics based on an Autoregressive Conditional Heteroskedasticity (ARCH) residual empirical process to DE analysis. This approach was evaluated on simulation data and publicly available datasets, and was compared with two-group comparisons based on the original data and on Autoregressive (AR) residuals. The significant DE genes identified from the ARCH and AR residuals were about 20% to 30% fewer than those identified from the original data. Almost 100% of the genes identified by ARCH are covered by the genes identified from the original data, unlike the genes identified from AR residuals. GO enrichment and pathway analyses indicate consistent biological characteristics between the genes identified from ARCH residuals and from the original data. ARCH residual array data might help refine the number of significant DE genes for detecting biological features, as well as ordinal microarray data.
This paper presents a template matching method that uses a statistical model and a parametric template for multi-template matching. The algorithm consists of two phases: training and matching. In the training phase, the statistical model created by principal component analysis (PCA) can be used to synthesize the multi-template. The advantage of PCA is that it reduces the variances of the multi-template. In the matching phase, normalized cross correlation (NCC) is employed to find candidates in the inspection images. The relationship between an image block and the multi-template is built using the parametric template method. Results show that the proposed method is more efficient than conventional template matching and the parametric template. Furthermore, the proposed method is more robust than the conventional template method.
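The normalized cross correlation used in the matching phase can be sketched as below. This is the zero-mean form of NCC applied to flattened pixel lists, chosen here as an assumption because several NCC variants exist and the abstract does not specify one; the function name and pixel values are illustrative.

```python
import math


def ncc(block, template):
    """Zero-mean normalized cross correlation of two equal-length pixel lists.

    Returns a score in [-1, 1]; 1 means the block matches the template up to
    a positive linear change in brightness and contrast.
    """
    if len(block) != len(template):
        raise ValueError("block and template must have the same size")
    n = len(block)
    mean_b = sum(block) / n
    mean_t = sum(template) / n
    num = sum((b - mean_b) * (t - mean_t) for b, t in zip(block, template))
    den = math.sqrt(
        sum((b - mean_b) ** 2 for b in block)
        * sum((t - mean_t) ** 2 for t in template)
    )
    return num / den if den else 0.0  # flat patches carry no structure


# Hypothetical 2x2 patches, flattened row by row.
template = [10, 20, 30, 40]
brighter = [2 * v + 5 for v in template]   # same pattern, different lighting
inverted = [50 - v for v in template]      # opposite pattern

print(ncc(brighter, template))  # close to 1.0
print(ncc(inverted, template))  # close to -1.0
```

Because the score is invariant to gain and offset, candidate positions in an inspection image can be ranked by NCC even when illumination differs from the training images.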
Funding: This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2021R1A6A1A03039493).
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 61773091 and 61603073), the LiaoNing Revitalization Talents Program (Grant No. XLYC1807106), and the Natural Science Foundation of Liaoning Province, China (Grant No. 2020-MZLH-22).
Funding: This work has been supported in part by the Shenzhen Science & Technology Program [grant number JSGG20150512145714247], the State Key Program of National Natural Science of China [grant number 61331016], and the National Key Research Plan of China [grant number 2016YFC0500201-07].
文摘Since its first flight in 2007,the UAVSAR instrument of NASA has acquired a large number of fully Polarimetric SAR(PolSAR)data in very high spatial resolution.It is possible to observe small spatial features in this type of data,offering the opportunity to explore structures in the images.In general,the structured scenes would present multimodal or spiky histograms.The finite mixture model has great advantages in modeling data with irregular histograms.In this paper,a type of important statistics called log-cumulants,which could be used to design parameter estimator or goodness-of-fit tests,are derived for the finite mixture model.They are compared with logcumulants of the texture models.The results are adopted to UAVSAR data analysis to determine which model is better for different land types.
Abstract: Qasab basin is one of the most promising areas for sustainable development in the Eastern Desert fringes of the Nile Valley, Egypt. The integration of statistical analysis, stable isotopes, and geochemical modeling tools delineated the geochemical processes affecting groundwater quality and identified the main recharge source in Qasab basin. Most of the groundwater samples are brackish (88%), while the minority (12%) are fresh. The electrical conductivity of the groundwater ranged from 1135 to 10,030 μS/cm. The statistical analysis and hydrochemical diagrams suggest that groundwater quality is mainly controlled by several intermixed processes (rock weathering and agricultural activities). The mineralization of the Pleistocene groundwater is regulated by rock weathering, evaporation processes, and reverse cation exchange. The isotopic signatures (δ²H and δ¹⁸O) indicate two groundwater groups. The first group is enriched in δ¹⁸O, which ranges from 0.9‰ to 5.5‰. This group is mostly affected by recent meteoric recharge from surface water leakage. The second group is relatively depleted in δ¹⁸O, reflecting a palaeo-recharge source from a colder climate. Its δ¹⁸O varies from -10.1‰ to -6.4‰, indicating upward leakage from the Nubian sandstone aquifer through deep-seated faults. The inverse geochemical model indicates that the salinity of the groundwater samples results from the leaching and dissolution of carbonate, sulphate, and chloride minerals from the aquifer matrix. This study can serve as a hydrochemical assessment guide to support sustainable development in Qasab basin and to ensure adequate groundwater management, which can help reduce poverty and support socioeconomic development.
Abstract: Background: The Cox Proportional Hazard (Cox-PH) model is a popular method for survival analysis of cancer data, in which survival times are modeled as a function of covariates or risk factors. However, the assumptions required to apply the Cox-PH model are seldom satisfied in most research studies, which raises questions about the effectiveness, robustness, and accuracy of the model in predicting the proportion of survival times. This is because the necessary assumptions are difficult to satisfy in most cases, as is the assessment of interactions among covariates. Methods: To further improve the therapeutic/treatment strategy for cancer diseases, we proposed a new approach to survival analysis using multiple myeloma (MM) cancer data. We first developed a data-driven nonlinear statistical model that predicts the survival times with 93% accuracy. We then performed a parametric analysis on the predicted survival times to obtain the survival function, which is used to estimate the proportion of survival times. Results: The proposed approach to survival analysis proved to be more robust and gives better estimates of the proportion of survival than the Cox-PH model. In addition, satisfying the assumptions of the proposed model and finding interactions among risk factors is less difficult than with the Cox-PH model. The proposed model can predict the real values of the survival times, and the identified risk factors are ranked according to their percentage contribution to the survival time. Conclusion: The proposed nonlinear statistical modeling approach to survival analysis of cancer diseases is very efficient and provides an improved and innovative strategy for cancer therapy and treatment.
Funding: Supported by the Institute for Information & Communications Technology Promotion under Grant No. R0101-16-0176, the Project of Core Technology Development for Human-Like Self-Taught Learning Based on Symbolic Approach
Abstract: This paper describes experiments with Korean-to-Vietnamese statistical machine translation (SMT). Korean is a morphologically complex language without clear optimal word boundaries, which causes a major problem when translating into or from Korean. To solve this problem, we present a method for Korean morphological analysis that uses a pre-analyzed partial word-phrase dictionary (PWD). In addition, we build a Korean-Vietnamese parallel corpus for training SMT models by collecting text from multilingual magazines. We then apply this morphological analysis to the Korean sentences in the collected parallel corpus as a preprocessing step. The experimental results demonstrate a remarkable improvement in Korean-to-Vietnamese translation quality in terms of the bilingual evaluation understudy (BLEU) score.
Funding: Supported by the National Natural Science Foundation of China (Nos. 81072389, 81373102, 81473070 and 81402765); the Research Fund for the Doctoral Program of Higher Education of China (No. 20113234110002); the Key Grant of the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (No. 10KJA330034); the College Philosophy and Social Science Foundation from the Education Department of Jiangsu Province of China (Nos. 2013SJB790059, 2013SJD790032); the Research Foundation from Xuzhou Medical College (No. 2012KJ02); the Research and Innovation Project for College Graduates of Jiangsu Province of China (No. CXLX13_574); and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD)
Abstract: In the past few years, genome-wide association studies (GWAS) have been highly successful in identifying genetic susceptibility loci underlying many complex diseases and traits. The findings provide important genetic insights into the pathogenesis of diseases. In this paper, we present an overview of widely used approaches and strategies for the analysis of GWAS and offer general guidance on handling GWAS data. Issues regarding data quality control, population structure, association analysis, multiple comparison, and visual presentation of GWAS results are discussed; other advanced topics, including missing heritability, meta-analysis, set-based association analysis, copy number variation analysis, and GWAS cohort analysis, are also briefly introduced.
Funding: The Scientific Innovation Research of College Graduates in Jiangsu Province (No. CXLX11_0143)
Abstract: To improve the prediction accuracy and test the generalization ability of the dam deformation analysis model, the back-propagation (BP) neural network model for dam deformation analysis is studied, and a merging model is built from the BP neural network algorithm and the traditional statistical model. The three models are calculated and analyzed using the long-term deformation observation data from Chencun Dam. The analytical results show that the average prediction accuracies of the statistical model and the BP neural network model are ±0.477 mm and ±0.390 mm, respectively, while the prediction accuracy of the merging model is ±0.318 mm, an improvement of 33% and 18% over the other two models, respectively. The merging model also has better generalization ability and broad applicability.
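The percentage improvements quoted for the dam deformation models can be checked from the reported accuracy magnitudes (0.477, 0.390, and 0.318 mm). The sketch below assumes that "improved by" means the relative reduction in error magnitude with respect to each baseline model; the function and variable names are illustrative and do not come from the paper.

```python
def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Fractional reduction in error magnitude relative to a baseline."""
    return (baseline_error - new_error) / baseline_error


# Accuracy magnitudes in millimetres, as reported in the abstract.
statistical_mm = 0.477
bp_mm = 0.390
merged_mm = 0.318

gain_vs_statistical = relative_improvement(statistical_mm, merged_mm)
gain_vs_bp = relative_improvement(bp_mm, merged_mm)

# Rounded to whole percentages, these match the 33% and 18% in the abstract.
print(round(gain_vs_statistical * 100), round(gain_vs_bp * 100))
```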
Funding: National Science Foundation for Distinguished Young Scholars of China (No. 41125005)
Abstract: China's regions differ greatly in socio-economic development, industrial structure, natural resource endowments, and technological advancement. These differences have created complicated linkages between regions in China. In this study, building upon the gravity model and location quotient techniques, we develop a sector-specific model to estimate inter-provincial trade flows, which form the basis for constructing a multi-regional input-output table. In the model, we distinguish sectors with smaller intra-sector input from those with larger intra-sector input, and assume that the former tend to compete among regions while the latter tend to cooperate among regions. We then apply this new method of inter-regional trade estimation to three sectors: food and tobacco, metal smelting and processing, and electrical equipment. The results show that the selection of bandwidth has a significant impact on the assessment of inter-regional trade. Trade flows become more scattered as the bandwidth increases. Bandwidth therefore reflects the spatial concentration of geographical activities, which should be distinguishable for different industries. We conclude that the sector-specific spatial model can increase the credibility of estimates of inter-regional trade flows.
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 11874289, 11827808, 11504057, 11525416, and 81601504) and the Fundamental Research Funds for the Central Universities
Abstract: The goal of this study is to analyze the statistics of the backscatter signal from bovine cancellous bone using a Nakagami model and to evaluate the feasibility of Nakagami-model parameters for cancellous bone characterization. Ultrasonic backscatter measurements were performed on 24 bovine cancellous bone specimens in vitro, and the backscatter signals were compensated for the frequency-dependent attenuation prior to envelope detection. The statistics of the backscatter envelope were modeled using the Nakagami distribution. Our results reveal that the backscatter envelope mainly followed pre-Rayleigh distributions, and the deviations of the backscatter envelope from the Rayleigh distribution decreased with increasing bone density. The Nakagami shape parameter (i.e., m) was significantly correlated with bone densities (R = 0.78–0.81, p < 0.001) and trabecular microstructures (|R| = 0.46–0.78, p < 0.05). The scale parameter (i.e., Ω) and signal-to-noise ratio (SNR) also yielded significant correlations with bone density and structural features. Multiple linear regressions showed that bone volume fraction (BV/TV) was the main predictor of the Nakagami parameters, and that microstructure made a significant independent contribution to the prediction of the Nakagami distribution parameters, explaining an additional 10.2% of the variance at most. The in vitro study showed that statistical parameters derived with the Nakagami model might be useful for cancellous bone characterization, and that statistical analysis has potential for ultrasonic backscatter bone evaluation.
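As a rough illustration of how Nakagami parameters can be obtained from a backscatter envelope, the sketch below uses the standard moment-based estimators: the scale parameter is the mean envelope power, and the shape parameter is the squared mean power divided by the variance of the power. The abstract does not say which estimator the authors used, so this choice is an assumption, and the synthetic envelope and all function names are hypothetical. A shape parameter below 1 corresponds to the pre-Rayleigh regime, and a value of 1 to the Rayleigh case.

```python
import random
from statistics import fmean


def nakagami_moment_estimates(envelope):
    """Return moment-based estimates (m, omega) from envelope samples."""
    power = [r * r for r in envelope]          # instantaneous power R^2
    omega = fmean(power)                        # scale: E[R^2]
    variance = fmean([(p - omega) ** 2 for p in power])
    m = omega * omega / variance                # shape: E[R^2]^2 / Var(R^2)
    return m, omega


def sample_nakagami(m, omega, n, seed=0):
    """Draw n Nakagami samples using R^2 ~ Gamma(shape=m, scale=omega/m)."""
    rng = random.Random(seed)
    return [rng.gammavariate(m, omega / m) ** 0.5 for _ in range(n)]


# Synthetic pre-Rayleigh envelope (m < 1); these values are illustrative only.
envelope = sample_nakagami(m=0.7, omega=2.0, n=100_000)
m_hat, omega_hat = nakagami_moment_estimates(envelope)
print(f"m = {m_hat:.3f}, omega = {omega_hat:.3f}")
```

The moment estimator is simple but can be biased for short records; maximum-likelihood estimators are a common alternative when sample sizes are small.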
Funding: Supported by the National Natural Science Foundation of China (No. 40604001) and the National High Technology Research and Development Program of China (No. 2007AA12Z312). Acknowledgement: The authors thank Prof. Tao Benzao and Prof. Wang Xingzhou for several helpful suggestions during the preparation of this manuscript.
Abstract: This paper systematically studies statistical diagnosis and hypothesis testing for the semiparametric linear regression model, following the theories and methods of statistical diagnosis and hypothesis testing for the parametric regression model. Several diagnostic measures and methods for gross error testing are derived. In particular, the global and local influence of the gross error on the parameter X and the nonparameter s is discussed in detail; the paper also proves that the data point deletion model is equivalent to the mean shift model for the semiparametric regression model. Finally, some helpful conclusions are drawn from a simulated computational example.
Abstract: BACKGROUND: Meta-analysis is a critical tool in evidence-based medicine, particularly in cardiology, where it synthesizes data from multiple studies to inform clinical decisions. This study explored the potential of using ChatGPT to streamline and enhance the meta-analysis process. AIM: To investigate the potential of ChatGPT to conduct meta-analyses in interventional cardiology by comparing the results of ChatGPT-generated analyses with those of randomly selected, human-conducted meta-analyses on the same topic. METHODS: We systematically searched PubMed for meta-analyses on interventional cardiology published in 2024. Five meta-analyses were randomly chosen. ChatGPT 4.0 was used to perform meta-analyses on the extracted data. We compared the results from ChatGPT with the original meta-analyses, focusing on key effect sizes, such as risk ratios (RR), hazard ratios, and odds ratios, along with their confidence intervals (CI) and P values. RESULTS: The ChatGPT results showed high concordance with those of the original meta-analyses. For most outcomes, the effect measures and P values generated by ChatGPT closely matched those of the original studies, except for the RR of stent thrombosis in the Sreenivasan et al study, where ChatGPT reported a non-significant effect size, while the original study found it to be statistically significant. While minor discrepancies were observed in specific CI and P values, these differences did not alter the overall conclusions drawn from the analyses. CONCLUSION: Our findings suggest the potential of ChatGPT in conducting meta-analyses in interventional cardiology. However, further research is needed to address the limitations of transparency and potential data quality issues, ensuring that AI-generated analyses are robust and trustworthy for clinical decision-making.
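For readers unfamiliar with the effect sizes being compared, the sketch below shows how a single-study risk ratio and its 95% confidence interval are conventionally computed on the log scale. It is a textbook calculation, not the procedure used by ChatGPT or by the original meta-analyses, and the event counts are hypothetical values chosen only for illustration.

```python
from math import exp, log, sqrt


def risk_ratio_ci(events_a, total_a, events_b, total_b, z=1.96):
    """Risk ratio of group A versus group B with a Wald CI on the log scale."""
    rr = (events_a / total_a) / (events_b / total_b)
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 15 events in 100 treated, 30 events in 100 controls.
rr, lower, upper = risk_ratio_ci(15, 100, 30, 100)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that excludes 1 corresponds to a statistically significant effect at the chosen level, which is the kind of discrepancy the abstract notes for the stent thrombosis outcome.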
Abstract: Considering the problems that currently need to be solved in synthetic earthquake prediction, this paper proposes a new model: a joint multivariate statistical model that combines principal component analysis with discriminant analysis. Principal component analysis and discriminant analysis are important theories in multivariate statistical analysis, which has developed quickly over the last thirty years. Using the information maximization method, we choose from numerous earthquake prediction factors several whose cumulative proportion of the total sample variance exceeds 90%. The paper applies regression analysis and Mahalanobis discrimination to extrapolate the synthetic prediction. Furthermore, we use this model to characterize and predict earthquakes in North China (30°–42°N, 108°–125°E), and better prediction results are obtained.
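The 90% criterion can be read as the usual rule for deciding how many principal components to keep: retain the smallest number of leading components whose cumulative share of the total sample variance reaches the threshold. The sketch below assumes that reading; the eigenvalues are made-up numbers, not values from the paper, and the function name is hypothetical.

```python
from itertools import accumulate


def components_for_variance(eigenvalues, threshold=0.90):
    """Smallest number of leading components whose cumulative
    proportion of total variance is at least `threshold`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    for count, running in enumerate(accumulate(ordered), start=1):
        if running / total >= threshold:
            return count
    return len(ordered)


# Hypothetical eigenvalues of a sample covariance matrix (total = 8.0).
eigenvalues = [4.2, 2.1, 1.0, 0.4, 0.3]
kept = components_for_variance(eigenvalues)
print(kept)  # cumulative shares: 0.525, 0.7875, 0.9125 -> 3 components
```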
Abstract: Symbolic circuit simulators are traditionally applied to the small-signal analysis of analog circuits. This paper establishes a symbolic behavioral macromodeling method applicable to both small-signal and large-signal analysis of general two-stage operational amplifiers (op-amps). The proposed method creates a two-pole parametric macromodel whose parameters are analytical functions of the circuit element parameters generated by a symbolic circuit simulator. A moment matching technique is used to derive the analytical model parameters. The resulting parametric behavioral model can be used for op-amp performance simulation in both the frequency and time domains. In particular, the parametric models are well suited to fast statistical simulation of op-amps in the time domain. Experimental results show that the statistical distributions of the op-amp slew and settling time characterized by the proposed model agree well with the transistor-level results, while achieving a significant speedup.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61521091, 61771030, 61301087) and by the 2011 Collaborative Innovation Center of Geospatial Technology.
Abstract: Ionospheric variability is influenced by many factors, such as solar radiation, neutral atmosphere composition, and geomagnetic disturbances. The climatology of the ionosphere, mainly characterized by the total electron content (TEC) and electron density, features temporal and spatial changes. Establishing a multivariate regression model helps substantially in better understanding the characteristics of the ionosphere and their long-term variability. In this paper, an improvement of the existing ionosphere multivariate linear fitting regression model is proposed and investigated using data from both the ionosonde and the global ionosphere map (GIM) derived from ground-based Global Navigation Satellite System (GNSS) observations. The proposed method gives more consideration to the impact of solar activity and adds modeling of the annual and half-year periodic fluctuations for the F10.7 index. The improved model is verified to correlate better with the real observations and can help reduce the calculation uncertainty. Moreover, the proposed model is used to evaluate the fitting accuracy of the GIMs produced by five authorized data analysis centers of the International GNSS Service (IGS). The results show that there is a fixing hole in the North America region for the GIM model, where the correlation between the GIM and the proposed model is always lower than in other places.
Abstract: Statistical two-group comparisons are widely used in microarray data analysis to identify significant differentially expressed (DE) signatures against a therapy response. We applied a rank order statistic based on an Autoregressive Conditional Heteroskedasticity (ARCH) residual empirical process to DE analysis. This approach was evaluated on simulation data and publicly available datasets, and was compared with two-group comparisons based on the original data and on Auto-regressive (AR) residuals. The numbers of significant DE genes identified from the ARCH and AR residuals were about 20%–30% lower than the number identified from the original data. Almost 100% of the genes identified from the ARCH residuals are covered by the genes identified from the original data, unlike the genes identified from the AR residuals. GO enrichment and pathway analyses indicate consistent biological characteristics between the genes identified from the ARCH residuals and those from the original data. ARCH residuals of array data might contribute to refining the number of significant DE genes while detecting the biological features as well as ordinal microarray data.
Abstract: This paper presents a template matching method for multiple templates that uses a statistical model and a parametric template. The algorithm consists of two phases: training and matching. In the training phase, a statistical model created by principal component analysis (PCA) is used to synthesize the multi-template. The advantage of PCA is that it reduces the variances of the multi-template. In the matching phase, normalized cross correlation (NCC) is employed to find candidates in the inspection images. The relationship between an image block and the multi-template is built using the parametric template method. Results show that the proposed method is more efficient than conventional template matching and the parametric template method. Furthermore, the proposed method is more robust than the conventional template method.
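The matching phase relies on normalized cross correlation to score candidate image blocks. The sketch below shows only that scoring step for a single template on a tiny grayscale image; it assumes the zero-mean form of NCC, which the abstract does not specify, and it does not implement the PCA training or the parametric template. The image, template, and function names are hypothetical.

```python
from math import sqrt


def ncc(block, template):
    """Zero-mean normalized cross correlation of two equally sized 2D lists."""
    b = [v for row in block for v in row]
    t = [v for row in template for v in row]
    mean_b, mean_t = sum(b) / len(b), sum(t) / len(t)
    num = sum((x - mean_b) * (y - mean_t) for x, y in zip(b, t))
    den = sqrt(sum((x - mean_b) ** 2 for x in b) * sum((y - mean_t) ** 2 for y in t))
    return num / den if den else 0.0  # flat blocks carry no correlation signal


def best_match(image, template):
    """Slide the template over the image; return (score, (row, col)) of the best block."""
    th, tw = len(template), len(template[0])
    best = (-2.0, (0, 0))  # NCC lies in [-1, 1], so -2 is a safe floor
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            block = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(block, template)
            if score > best[0]:
                best = (score, (r, c))
    return best


image = [
    [0, 0, 0, 0, 0],
    [0, 9, 1, 0, 0],
    [0, 1, 9, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 1], [1, 9]]
score, position = best_match(image, template)
print(score, position)
```

Because the score is normalized by the local mean and variance, it is insensitive to uniform brightness and contrast changes, which is the usual reason for preferring NCC over raw correlation in inspection tasks.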