Funding: Project supported by the National Natural Science Foundation of China (Nos. 60505017 and 60534070) and the Natural Science Foundation of Zhejiang Province, China (No. 2005C14008).
Abstract: Existing water hazard detection methods usually fail when the features of water surfaces are greatly changed by the surroundings, e.g., by a change in illumination. This paper proposes a novel algorithm to robustly detect different kinds of water hazards for autonomous navigation. Our algorithm combines traditional machine learning with image segmentation and uses only digital cameras, which are generally affordable, as the visual sensors. Active learning automatically handles the problems raised by selecting, labeling, and classifying large training sets. Mean-shift-based image segmentation refines the final classification. Our experimental results show that the new algorithm accurately detects not only ‘common’ water hazards, which typically combine high brightness with low texture, but also ‘special’ water hazards that may have many ripples or low brightness.
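The mean-shift refinement step can be illustrated with a minimal sketch (not the authors' implementation): each pixel is described by hypothetical brightness and local-texture features, and scikit-learn's MeanShift groups them into candidate water/non-water regions. The feature distributions and bandwidth below are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import MeanShift

rng = np.random.default_rng(0)

# Toy per-pixel features: (brightness, local texture std).
# Water-like pixels: high brightness, low texture; ground-like: the opposite.
water = np.column_stack([rng.normal(0.9, 0.02, 50), rng.normal(0.01, 0.005, 50)])
ground = np.column_stack([rng.normal(0.3, 0.05, 50), rng.normal(0.2, 0.03, 50)])
features = np.vstack([water, ground])

# Mean shift finds density modes; each mode becomes one segment.
labels = MeanShift(bandwidth=0.25).fit_predict(features)
n_clusters = len(set(labels))
```

In the paper's pipeline, segments like these would then be used to smooth the per-pixel classifier output so that an entire coherent region is labeled as water or not.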
Abstract: Machine learning is becoming increasingly important in scientific and technological progress, due to its ability to create models that describe complex data and generalize well. The wealth of publicly available seismic data nowadays requires automated, fast, and reliable tools to carry out a multitude of tasks, such as the detection of small, local earthquakes in areas characterized by sparsity of receivers. Such an application of machine learning, however, should be built on a large amount of labeled seismograms, which is neither immediate to obtain nor to compile. In this study we present a large dataset of seismograms recorded along the vertical, north, and east components of 1487 broad-band or very-broad-band receivers distributed worldwide; it includes 629,095 3-component seismograms generated by 304,878 local earthquakes and labeled as EQ, and 615,847 labeled as noise (AN). Application of machine learning to this dataset shows that a simple convolutional neural network of 67,939 parameters can discriminate between earthquake and noise single-station recordings, even when applied in regions not represented in the training set. Achieving accuracies of 96.7%, 95.3%, and 93.2% on the training, validation, and test sets, respectively, we show that the large variety of geological and tectonic settings covered by our data supports the generalization capability of the algorithm and makes it applicable to real-time detection of local events. We make the database publicly available, intending to provide the seismological and broader scientific community with a benchmark time-series dataset to be used as a testing ground in signal processing.
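To make the single-station discrimination concrete, here is a minimal numpy sketch of the forward pass of a small 1-D convolutional classifier over a 3-component window. The window length, filter count, kernel size, and weights are illustrative assumptions; this is not the paper's 67,939-parameter architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

def conv1d(x, w, b):
    """Valid 1-D convolution with ReLU: x (C_in, T), w (C_out, C_in, K), b (C_out,)."""
    c_out, c_in, k = w.shape
    t_out = x.shape[1] - k + 1
    out = np.empty((c_out, t_out))
    for i in range(t_out):
        out[:, i] = np.tensordot(w, x[:, i:i + k], axes=([1, 2], [0, 1])) + b
    return np.maximum(out, 0.0)

# A 3-component seismogram window of 400 samples (hypothetical length).
x = rng.normal(size=(3, 400))

# One conv layer (8 filters, kernel 7), global average pooling, linear head.
w1, b1 = rng.normal(scale=0.1, size=(8, 3, 7)), np.zeros(8)
h = conv1d(x, w1, b1)
pooled = h.mean(axis=1)
w2, b2 = rng.normal(scale=0.1, size=8), 0.0
logit = pooled @ w2 + b2
p_eq = 1.0 / (1.0 + np.exp(-logit))  # sigmoid: P(window contains an earthquake)
```

A trained network of this shape outputs one probability per station window, which is what makes per-station, real-time screening of continuous data feasible.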
Funding: Supported by the National Natural Science Foundation of China (NSFC) (Grant No. 71173154), the National Social Science Fund of China (NSSFC) (Grant No. 08BZX076), and the Fundamental Research Funds for the Central Universities.
Abstract: Purpose: The authors aim to test the performance of a set of machine learning algorithms that could improve the process of data cleaning when building datasets. Design/methodology/approach: The paper centers on cleaning datasets gathered from publishers and online resources by the use of specific keywords; in this case, we analyzed data from the Web of Science. The accuracy of various forms of automatic classification was tested against manual coding to determine their usefulness for data collection and cleaning. We assessed the performance of seven supervised classification algorithms (Support Vector Machine (SVM), Scaled Linear Discriminant Analysis, Lasso and elastic-net regularized generalized linear models, Maximum Entropy, Regression Tree, Boosting, and Random Forest) and analyzed two properties: accuracy and recall. We assessed not only each algorithm individually but also their combinations through a voting scheme, and we tested the performance of these algorithms with different sizes of training data. When assessing the performance of different combinations, we used an indicator of coverage to account for the agreement and disagreement on classification between algorithms. Findings: We found that the performance of the algorithms varies with the size of the training sample. For the classification exercise in this paper, however, the best-performing algorithms were SVM and Boosting. The combination of these two algorithms achieved high agreement on coverage and was highly accurate, and it performs well even with a small training dataset (10%), which may reduce the manual work needed for classification tasks. Research limitations: The dataset gathered has significantly more records related to the topic of interest than to unrelated topics. This may affect the performance of some algorithms, especially in their identification of unrelated papers.
Practical implications: Although the classification achieved by this means is not completely accurate, the amount of manual coding needed can be greatly reduced by using classification algorithms, which is of great help when the dataset is large. With the accuracy, recall, and coverage measures it is possible to estimate the error involved in the classification, which opens the possibility of incorporating these algorithms into software specifically designed for data cleaning and classification.
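The SVM + Boosting pairing with a coverage indicator can be sketched with scikit-learn on synthetic data. The dataset, hyperparameters, and thresholds below are stand-ins, not the paper's Web of Science records; only the mechanics (small training split, agreement as coverage, accuracy on agreed records) follow the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for "relevant vs. unrelated" bibliographic records.
X, y = make_classification(n_samples=600, n_features=20, random_state=0)

# Small training split (~10%), mirroring the paper's best-case finding.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.1, random_state=0)

svm = SVC().fit(X_tr, y_tr)
boost = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

p_svm, p_boost = svm.predict(X_te), boost.predict(X_te)
agree = p_svm == p_boost
coverage = agree.mean()                          # fraction both classifiers agree on
acc_on_agreed = (p_svm[agree] == y_te[agree]).mean()
```

Records where the two classifiers disagree (the `~agree` set) are the natural candidates to route back to manual coding, which is how such a voting scheme reduces, rather than eliminates, hand labeling.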
Funding: Project (61171133) supported by the National Natural Science Foundation of China; Project (11JJ1010) supported by the Natural Science Fund for Distinguished Young Scholars of Hunan Province, China; Project (61101182) supported by the National Natural Science Foundation for Young Scientists of China.
Abstract: Inverse synthetic aperture radar (ISAR) imaging can be regarded as a narrow-band version of computer-aided tomography (CT). The traditional CT imaging algorithms for ISAR, including the polar format algorithm (PFA) and the convolution back-projection algorithm (CBP), usually suffer from high sidelobes and low resolution. This paper concerns ISAR tomographic image reconstruction within a sparse Bayesian framework. Firstly, the sparse ISAR tomographic imaging model is established in light of CT imaging theory. Then, by using the compressed sensing (CS) principle, a high-resolution ISAR image can be achieved with a limited number of pulses. Since the performance of existing CS-based ISAR imaging algorithms is sensitive to a user parameter, these algorithms are inconvenient to use in practice. It is well known that the Bayesian recovery formalism named sparse Bayesian learning (SBL) is an effective tool in regression and classification; it uses an efficient expectation-maximization procedure to estimate the necessary parameters and retains a preferable property of the l0-norm diversity measure. Motivated by this, a fully automated ISAR tomographic imaging algorithm based on SBL is proposed. Experimental results based on simulated and electromagnetic (EM) data illustrate the effectiveness and superiority of the proposed algorithm over existing algorithms.
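As an illustration of the parameter-free sparse recovery that SBL provides, the sketch below uses scikit-learn's ARDRegression (a standard SBL implementation for real-valued regression) to recover a toy sparse scene from fewer measurements than unknowns. The scene, measurement matrix, and sizes are illustrative assumptions; real ISAR data are complex-valued, so this is only a structural analogy, not the proposed algorithm.

```python
import numpy as np
from sklearn.linear_model import ARDRegression

rng = np.random.default_rng(2)

# Sparse "scene": a few dominant scatterers among n cells (toy stand-in
# for a reflectivity profile).
n, m = 60, 30                                # cells, measurements (m < n: limited pulses)
scene = np.zeros(n)
scene[[5, 20, 47]] = [3.0, -2.0, 1.5]

A = rng.normal(size=(m, n)) / np.sqrt(m)     # random projection (CS-style model)
y = A @ scene + 0.01 * rng.normal(size=m)

# SBL estimates a per-coefficient precision by evidence maximization (an
# EM-like iteration), so no user-tuned sparsity parameter is required.
sbl = ARDRegression(fit_intercept=False).fit(A, y)
recovered = sbl.coef_
```

The pruning of coefficients whose precisions diverge is what gives SBL its l0-like sparsity without a hand-chosen regularization weight, which is the "fully automated" property the abstract emphasizes.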
Abstract: This paper describes research carried out in partial fulfilment of the degree of Doctor of Education. The study was qualitative in nature, with a phenomenological interpretive paradigm dominating the philosophical approach, and the research methods adopted combined life story and grounded theory. As far as the author has been able to determine, there are very few studies, if any, that have applied this approach specifically to this area of research, which investigated the influence life history has on attitude to lifelong learning. Twenty-five respondents were interviewed in face-to-face informal interviews. The main aim was to elicit each respondent's subjective interpretation of the interaction between school, family, work, and learning within their lives; the researcher was then able to identify when these interactions occurred and what or who made them particularly meaningful. This paper describes how initial decisions were made regarding the substantive area for the research. The sampling technique and data-collection method are discussed, and a worked example is given of how the data were analysed. It is intended that this paper will give an insight into the challenge of combining these two much-debated methods of research. The empirical data led to some interesting findings which educators and policy makers will find helpful in strengthening the school, college, and workplace interface.
Abstract: The goal of reinforcement learning is to learn the values of state-action pairs in order to maximize the total reward. For the continuous states and actions of the real world, the representation of value functions is critical. Furthermore, the samples of value functions are obtained sequentially. Therefore, an online support vector regression (OSVR) is set up as a function approximator to estimate value functions in reinforcement learning. OSVR updates the regression function by analyzing the possible variation of the support vector sets after new samples are inserted into the training set. To evaluate the learning ability of OSVR, it is applied to the mountain-car task. The simulation results indicate that OSVR has a preferable convergence speed and can solve continuous problems that are infeasible using a lookup table.
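scikit-learn has no exact incremental SVR, but the online mechanics (samples arriving one at a time, an epsilon-insensitive SVR-style loss) can be approximated with SGDRegressor over random RBF features. The toy value function and every hyperparameter below are illustrative assumptions, not the paper's OSVR update rule or the mountain-car setup.

```python
import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(3)

# Toy continuous-state value function V(s) = -|s| on [-1, 1], used only to
# demonstrate sequential (online) function approximation.
rbf = RBFSampler(gamma=2.0, n_components=100, random_state=0)
rbf.fit(rng.uniform(-1, 1, size=(10, 1)))    # fixes the random feature map

model = SGDRegressor(loss="epsilon_insensitive", epsilon=0.01,
                     learning_rate="constant", eta0=0.05)

# Samples arrive one by one, as value targets do in reinforcement learning.
for _ in range(3000):
    s = rng.uniform(-1, 1, size=(1, 1))
    v = -np.abs(s[0, 0])
    model.partial_fit(rbf.transform(s), [v])

v_hat = model.predict(rbf.transform(np.array([[0.5]])))[0]   # true value: -0.5
```

The paper's OSVR instead maintains the exact support vector sets incrementally, which is more expensive per update but keeps the sparse kernel expansion that makes SVR attractive as a value-function representation.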
Abstract: Some students make negative comments on their learning outcomes after two years of college English learning. This article investigates the factors influencing students' evaluations of their learning, including students' self-efficacy, learning strategies, and different categories of achievement goals.