Abstract: In the past decade, financial institutions have invested significant effort in developing accurate analytical credit scoring models. The evidence suggests that even small improvements in the accuracy of existing credit scoring models may optimize profits while effectively managing risk exposure. Despite continuing efforts, the majority of existing credit scoring models still include some judgment-based assumptions that are sometimes supported by the significant findings of previous studies but are not validated using the institution’s internal data. We argue that current studies on the development of credit scoring models have largely ignored recent developments in statistical methods for sufficient dimension reduction. To contribute to the field of financial innovation, this study proposes a Dimension Reduction Assisted Credit Scoring (DRA-CS) method via distance covariance-based sufficient dimension reduction (DCOV-SDR) in a Majorization-Minimization (MM) algorithm. First, in the presence of a large number of variables, the DRA-CS method results in greater dimension reduction and better prediction accuracy than the other methods used for dimension reduction. Second, when the DRA-CS method is employed with logistic regression, it outperforms existing methods based on different variable selection techniques. This study argues that the DRA-CS method should be used by financial institutions as a financial innovation tool to analyze high-dimensional customer datasets and improve the accuracy of existing credit scoring methods.
Funding: Supported by the Natural Science Foundation of Jiangxi Province (20212BAB202018), the Provincial Virtual Simulation Experiment Education Project of Jiangxi Education Department (2020-2-0048), and the Science and Technology Research Project of Jiangxi Province Educational Department (GJJ210333).
Abstract: An improved Hybrid Collaborative Filtering algorithm (H-CF) is proposed to address the data sparsity, low recommendation accuracy, and poor scalability of traditional collaborative filtering algorithms. The core of H-CF is a linear weighted hybrid of the Latent Factor Model (LFM) and the Improved Item Clustering and Similarity Calculation Collaborative Filtering Algorithm (ITCSCF). First, the items are clustered based on their attribute dimension, which accelerates the computation of the nearest neighbor set. Next, H-CF refines the scoring similarity formula by penalizing popular items and optimizing the handling of unpopular items. This makes the scoring similarity more reasonable and reduces the impact of data sparsity. Furthermore, a weighting function is employed to combine the improved algorithms, and its balance factor is dynamically adjusted to obtain the optimal recommendation list. To address real-time and scalability concerns, the algorithm runs on the Spark distributed cluster computing framework. Experiments were conducted on the public MovieLens dataset, comparing the improved algorithm against the algorithm before enhancement and against the algorithm running on a single machine. The experimental results show that the improved algorithm performs better in terms of handling data sparsity, recommendation personalization, accuracy, recall, and efficiency.
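The linear weighted hybrid mentioned in this abstract can be illustrated with a minimal sketch. It assumes that each component model (LFM and ITCSCF) has already produced a predicted score per item for one user. The function names `blend_scores` and `top_n`, the balance factor name `alpha`, and the treatment of missing items are hypothetical; the abstract does not give the actual weighting function or how the balance factor is adjusted.

```python
def blend_scores(lfm_scores, itcscf_scores, alpha):
    """Linearly combine two per-item score dicts using balance factor alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    items = set(lfm_scores) | set(itcscf_scores)
    # Assumption: an item missing from one model contributes 0.0 from that model.
    return {
        item: alpha * lfm_scores.get(item, 0.0)
        + (1.0 - alpha) * itcscf_scores.get(item, 0.0)
        for item in items
    }


def top_n(scores, n):
    """Return the n highest-scoring item ids (ties broken by item id)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]


# Toy scores for one user (made up for illustration only).
lfm = {"a": 4.0, "b": 2.0}
itcscf = {"a": 2.0, "b": 5.0}
recommendations = top_n(blend_scores(lfm, itcscf, alpha=0.5), n=2)
```

One simple way to tune `alpha` would be to evaluate several values on held-out ratings and keep the best one; whether H-CF does this is not stated in the abstract.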
Abstract: The standalone Global Positioning System (GPS) does not meet the higher accuracy requirements of the approach and landing phases of an aircraft. To meet the Category-I Precision Approach (CAT-I PA) requirements of civil aviation, satellite-based augmentation systems (SBAS) have been planned by various countries including the USA, Europe, Japan and India. The Indian SBAS is named GPS Aided Geo Augmented Navigation (GAGAN). The GAGAN network consists of several dual-frequency GPS receivers located at various airports around the Indian subcontinent. The ionospheric delay, which is a function of the total electron content (TEC), is one of the main sources of error affecting GPS/SBAS accuracy. A dual-frequency GPS receiver can be used to estimate the TEC. However, line-of-sight TEC derived from dual-frequency GPS data is corrupted by the instrumental biases of the GPS receiver and satellites. The estimation of receiver instrumental bias is particularly important for obtaining accurate estimates of ionospheric delay. In this paper, two prominent techniques, based on the Kalman filter and the Self-Calibration Of pseudo Range Error (SCORE) algorithm, are used for estimation of instrumental biases. The estimated instrumental bias and TEC results for the GAGAN station at Hyderabad (78.47°E, 17.45°N), India are presented.
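As a rough illustration of why instrumental biases matter, the sketch below computes line-of-sight (slant) TEC from dual-frequency code pseudoranges using the standard first-order ionospheric relation. It is not the Kalman filter or SCORE procedure from the paper. The function name `slant_tec_tecu` and the convention that both biases are given in metres as a contribution to the P2 minus P1 difference are assumptions made here.

```python
GPS_L1_HZ = 1575.42e6   # GPS L1 carrier frequency
GPS_L2_HZ = 1227.60e6   # GPS L2 carrier frequency
IONO_CONST = 40.3       # first-order ionospheric constant, m^3/s^2
TECU = 1.0e16           # electrons per m^2 in one TEC unit


def slant_tec_tecu(p1_m, p2_m, receiver_bias_m=0.0, satellite_bias_m=0.0):
    """Slant TEC in TECU from L1/L2 code pseudoranges given in metres.

    Assumption: the biases are differential (P2 - P1) instrumental delays
    in metres and are simply subtracted from the measured difference.
    """
    factor = (GPS_L1_HZ**2 * GPS_L2_HZ**2) / (
        IONO_CONST * (GPS_L1_HZ**2 - GPS_L2_HZ**2)
    )
    corrected_diff_m = (p2_m - p1_m) - receiver_bias_m - satellite_bias_m
    return factor * corrected_diff_m / TECU


# Each metre of uncorrected P2 - P1 difference maps to roughly 9.5 TECU.
tec_per_metre = slant_tec_tecu(20_000_000.0, 20_000_001.0)
```

Because the conversion factor is large, an uncalibrated bias of even one metre shifts the TEC estimate by several TECU, which is why the bias estimation step carries so much weight.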
Funding: Supported by the China National S&T Major Project (2013ZX03003002-003), the Beijing Natural Science Foundation (4152047), and the National High Technology Research and Development Program of China (863 Program) (2014AA01A701).
Abstract: A quality of experience (QoE) based scheduling algorithm for long term evolution (LTE) networks with various traffic types is studied. Utility functions are adopted to estimate the mean opinion score (MOS) for different traffic types, and a new MOS metric called normalized MOS is defined. A scheduling algorithm based on normalized MOS and a greedy algorithm is proposed, aiming at maximizing the overall MOS level of all users in the cell. We compare the performance of the proposed algorithm with other typical scheduling algorithms, and the simulation results show that the proposed algorithm outperforms the others in terms of QoE and fairness.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through General Research Project under Grant No. R.G.P.2/48/43.
Abstract: The Floyd-Warshall algorithm is frequently used to determine the shortest path between any pair of nodes. It works well for crisp weights, but problems arise when weights are vague and uncertain. Consider computer networks, where the chosen path might no longer be appropriate due to rapid changes in network conditions. In computer networks, the optimal path among all possible routes is chosen based on a variety of parameters. In this paper, we design a new variant of the Floyd-Warshall algorithm that identifies the All-Pair Shortest Path (APSP) in an uncertain network situation. In the proposed methodology, multiple criteria and their mutual association may be involved in the selection of a suitable path between any two nodes, and the values of these criteria may change due to an uncertain environment. We use trapezoidal picture fuzzy addition, score, and accuracy functions to find the APSP. We compute the time complexity of this algorithm and contrast it with the traditional Floyd-Warshall algorithm and the fuzzy Floyd-Warshall algorithm.
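For reference, the traditional crisp-weight Floyd-Warshall algorithm that the paper uses as its baseline can be sketched as follows. This is the classical algorithm only, not the proposed fuzzy variant: in that variant the `+` and `<` operations below would presumably be replaced by trapezoidal picture fuzzy addition and a ranking based on the score and accuracy functions, which the abstract does not define. The function name and the edge-list input format are choices made for this sketch.

```python
import math


def floyd_warshall(n, edges):
    """All-pairs shortest path lengths for a directed graph with crisp weights.

    n     -- number of nodes, labelled 0 .. n-1
    edges -- iterable of (source, target, weight) tuples
    """
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)  # keep the cheapest parallel edge

    # Allow each node k in turn as an intermediate stop: O(n^3) time overall.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


dist = floyd_warshall(4, [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 5)])
```

In this toy graph the direct edge from node 0 to node 1 costs 4, but the route through node 2 costs 3, so the algorithm keeps the shorter one.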
Funding: Supported by the Natural Science Foundation of China (62072388), the Public Technology Service Platform Project of Xiamen City (3502Z20231043), and the Fujian Sunshine Charity Foundation.
Abstract: With continuous advancements in artificial intelligence (AI), automatic piano-playing robots have become subjects of cross-disciplinary interest. However, in most studies, these robots served merely as objects of observation with limited user engagement or interaction. To address this issue, we propose a user-friendly and innovative interaction system based on the principles of greedy algorithms. This system features three modules: score management, performance control, and keyboard interactions. Upon importing a custom score or playing a note via an external device, the system performs on a virtual piano in line with user inputs. This system has been successfully integrated into our dexterous manipulator-based piano-playing device, which significantly enhances user interactions.
Abstract: Building on previous research on predicting β-hairpin motifs in proteins, we apply the Random Forest and Support Vector Machine algorithms to predict β-hairpin motifs in the ArchDB40 dataset. Motifs with a loop length of 2 to 8 amino acid residues are extracted as the research object, and a fixed-length pattern of 12 amino acids is selected. When using the same characteristic parameters and the same test method, the Random Forest algorithm is more effective than the Support Vector Machine. In addition, because the Random Forest algorithm does not overfit when the dimension of the characteristic parameters is higher, we use Random Forest with higher-dimensional characteristic parameters to predict β-hairpin motifs. Better prediction results are obtained: the overall accuracy and Matthews correlation coefficient under 5-fold cross-validation reach 83.3% and 0.59, respectively.
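The two reported metrics can be computed from a binary confusion matrix as in the sketch below. The function names and the toy counts are illustrative and are not taken from the paper; returning 0.0 when the denominator is zero is a common convention that is assumed here.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # assumed convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


# Toy counts: 6 true positives, 3 true negatives, 1 false positive, 2 false negatives.
acc = accuracy(6, 3, 1, 2)
mcc = matthews_corrcoef(6, 3, 1, 2)
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, so it stays informative when one class is much rarer than the other.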
Abstract: The structure and function of proteins are closely related, and protein structure determines function, so protein structure prediction is important. β-turns are important components of protein secondary structure, so an accurate method for predicting β-turn types is needed. In this paper, we use a composite vector of a position conservation scoring function, increment of diversity, and predicted secondary structure information as the input to a support vector machine for predicting β-turn types in a database of 426 protein chains. We obtain overall prediction accuracies of 95.6%, 97.8%, 97.0%, 98.9%, 99.2%, 91.8%, 99.4% and 83.9%, with Matthews correlation coefficient values of 0.74, 0.68, 0.20, 0.49, 0.23, 0.47, 0.49 and 0.53, for types I, II, VIII, I’, II’, IV, VI and nonturn, respectively, which is better than other prediction methods.
Abstract: The growing number of sequences stored in genomic databases has made sequential analysis infeasible. Parallel computing has therefore been brought to bioinformatics through parallel algorithms that align and analyze sequences, improving mainly the running time of these algorithms. In many situations, the parallel strategy contributes to reducing the computational complexity of large problems. This work presents results obtained from an implementation of a parallel score estimating technique for the score matrix calculation stage, which is the first stage of progressive multiple sequence alignment. The performance and quality of the parallel score estimation are compared with the results of a dynamic programming approach, also implemented in parallel. This comparison shows a significant reduction in running time. Moreover, the quality of the final alignment obtained with the new strategy is analyzed and compared with that of the dynamic programming approach.
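The score matrix stage with a dynamic programming baseline can be sketched sequentially as below. This uses plain Needleman-Wunsch global alignment scoring with a simple match/mismatch/gap scheme, which is an assumption; the abstract specifies neither the scoring scheme nor the estimating technique, and the parallel implementation is not reproduced here.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score, keeping only two DP rows."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                cur[j - 1] + gap,   # gap in a
            )
        prev = cur
    return prev[len(b)]


def pairwise_score_matrix(seqs):
    """Symmetric matrix of pairwise alignment scores for a list of sequences."""
    n = len(seqs)
    scores = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = global_alignment_score(seqs[i], seqs[j])
            scores[i][j] = scores[j][i] = s
    return scores


matrix = pairwise_score_matrix(["AC", "AGC", "AC"])
```

Each pair `(i, j)` is scored independently of the others, which is what makes this stage a natural candidate for parallel execution.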
Funding: Supported by the National Natural Science Foundation of China (No. 12271097), the Key Program of National Science Foundation of Fujian Province of China (No. 2023J02007), the Central Guidance on Local Science and Technology Development Fund of Fujian Province (No. 2023L3003), and the Fujian Alliance of Mathematics (No. 2023SXLMMS01).
Abstract: We consider the task of binary classification in the high-dimensional setting where the number of features of the given data is larger than the number of observations. To accomplish this task, we propose an adherently penalized optimal scoring (APOS) model for simultaneously performing discriminant analysis and feature selection. An efficient algorithm based on the block coordinate descent (BCD) method and the SSNAL algorithm is developed to solve the APOS model approximately. The convergence results of our method are also established. Numerical experiments conducted on simulated and real datasets demonstrate that the proposed model is more efficient than several sparse discriminant analysis methods.