Water quality is a critical global issue, especially in urban and semi-urban regions where natural and anthropogenic factors significantly influence surface water systems. This study evaluates the hydrochemical characteristics of surface water in the North of Tehran Rivers (NTRs), an essential water resource in a rapidly urbanizing region, using advanced clustering techniques, including Hierarchical Clustering Analysis (HCA), Fuzzy C-Means (FCM), Genetic Algorithm Fuzzy C-Means (GAFCM), and Self-Organizing Map (SOM). The research aims to address the scientific challenge of understanding spatial and temporal variability in water quality, focusing on physicochemical parameters, hydrochemical facies, and contamination sources. Water samples from six rivers collected over four seasons in 2020 were analyzed and classified into distinct clusters based on their chemical composition, revealing significant seasonal and spatial differences. Results showed that FCM and GAFCM consistently categorized the NTRs into two clusters during winter and spring and three in summer and autumn. These findings were supported by HCA and SOM, which identified clusters corresponding to specific river segments and contamination levels. The primary hydrochemical processes identified were mineral dissolution and weathering, with calcite, dolomite, and aragonite significantly influencing water chemistry. Additionally, human activities, such as wastewater discharge, were shown to contribute to elevated sulfate, nitrate, and phosphate concentrations, further corroborated by microbial analyses. By integrating HCA, FCM, and GAFCM with an artificial neural network (ANN)-based clustering method (SOM), this study provides a robust framework for evaluating surface water quality. The findings, supported by Gibbs diagrams, Hounslow ion ratios, and saturation indices, highlight the dominance of rock weathering and human impacts in shaping the hydrochemical dynamics of the NTRs. These insights contribute to the scientific understanding of water quality dynamics and offer practical guidance for sustainable water resource management and environmental protection in developing urban areas.
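As an illustration of the fuzzy clustering step named in this abstract, the following is a minimal sketch of plain Fuzzy C-Means in standard-library Python. It is not the authors' implementation: the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the iteration count, and the random initialization are all assumptions made for the example, and the genetic-algorithm variant (GAFCM) is not shown.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy c-means on a list of equal-length numeric tuples.

    Returns (centers, memberships). memberships[i][k] is the degree to
    which sample i belongs to cluster k; each row sums to 1.
    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    dim = len(points[0])

    # Random initial memberships, normalised so that each row sums to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Update centers as membership-weighted means (weights are u**m).
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(len(points))]
            w_sum = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(len(points))) / w_sum
                for d in range(dim)
            ))
        # Update memberships: u_ik is proportional to d_ik ** (-2 / (m - 1)).
        # d2 holds squared distances, hence the exponent -1 / (m - 1).
        for i, p in enumerate(points):
            d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) + 1e-12
                  for c in centers]
            inv = [x ** (-1.0 / (m - 1.0)) for x in d2]
            total = sum(inv)
            u[i] = [v / total for v in inv]
    return centers, u
```

A hard cluster label for each sample can be read off as the index of its largest membership; samples whose memberships are nearly equal sit between clusters, which is the information a crisp method such as K-means discards.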
The title of the online version of the original article has been revised to: Hydrochemical characterization of surface waters in Northern Tehran: Integrating cluster-based techniques with Self-Organizing Maps.
Water resources are scarce in arid or semiarid areas, which not only limits economic development but also threatens the survival of mankind. The local communities around the Hangjinqi gasfield depend on groundwater sources for water supply. A clear understanding of the groundwater hydrogeochemical characteristics and the groundwater quality and its seasonal cycle is invaluable and indispensable for groundwater protection and management. In this study, self-organizing maps were used in combination with the quantization and topographic errors and the K-means clustering method to investigate groundwater chemistry datasets. The Piper and Gibbs diagrams and saturation index were systematically applied to investigate the hydrogeochemical characteristics of groundwater from both rainy and dry seasons. Further, the entropy-weighted theory was used to characterize groundwater quality and assess its seasonal variability and suitability for drinking purposes. Our hydrochemical groundwater dataset, consisting of 10 parameters measured during both dry and rainy seasons, was classified into 6 clusters, and the Piper diagram revealed three hydrochemical facies: Cl-Na type (clusters 1, 2 and 3), mixed type (clusters 4 and 5), and HCO3-Ca type (cluster 6). The Gibbs diagram and saturation index suggested that weathering of rock-forming minerals was the primary process controlling groundwater chemical composition and validated the credibility and practicality of the clustering results. Two-thirds of 45 groundwater samples were categorized as excellent- or good-quality and were suitable as drinking water. Cluster changes within the same and different clusters from the dry season to the rainy season were detected in approximately 78% of the collected samples. The main factors affecting the groundwater quality were hydrogeochemical characteristics, and dry season groundwater quality was better than rainy season groundwater quality. The results of this work can be used to investigate the seasonal variation of hydrogeochemical characteristics and to assess water quality accurately in other similar areas.
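The abstract does not spell out its entropy-weighting formulas, so the following is only a sketch of the commonly used Shannon-entropy weighting step, under stated assumptions: each column of a samples-by-parameters matrix is converted to shares, the normalised entropy of each column is computed, and parameters whose values vary more across samples receive larger weights. The function name `entropy_weights` and the share-based normalisation are assumptions; the study may have normalised its data differently.

```python
import math


def entropy_weights(matrix):
    """Entropy weights for the columns of a samples-by-parameters matrix.

    Assumes at least two rows, non-negative values, and a positive sum in
    every column. Returns one weight per column; the weights sum to 1.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    diversity = []
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [v / total for v in column]
        # Normalised Shannon entropy in [0, 1]; zero shares contribute 0.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n_rows)
        # Clamp guards against tiny negative values from rounding.
        diversity.append(max(0.0, 1.0 - entropy))
    total_diversity = sum(diversity)
    if total_diversity == 0:
        # Every column is constant: fall back to equal weights.
        return [1.0 / n_cols] * n_cols
    return [d / total_diversity for d in diversity]
```

A parameter that takes the same value in every sample has maximal entropy and therefore a weight of zero, which matches the intuition that it cannot help discriminate water quality between samples.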
We investigated the intraseasonal variability of equatorial Pacific subsurface temperature and its relationship with El Niño-Southern Oscillation (ENSO) using Self-Organizing Maps (SOM) analysis. Variation in intraseasonal subsurface temperature is mainly found along the thermocline. The SOM patterns concentrate in basin-wide seesaw or sandwich structures along an east-west axis. Both the seesaw and sandwich SOM patterns oscillate with periods of 55 to 90 days, their sequence shows features of the equatorial intraseasonal Kelvin wave, and their occurrence frequencies have marked interannual variations. Further examination shows that the interannual variability of the SOM patterns is closely related to ENSO, and that maxima in composite interannual variability of the SOM patterns are located in the central Pacific during CP El Niño and in the eastern Pacific during EP El Niño. These results imply that some of the ENSO forcing is manifested through changes in the occurrence frequency of intraseasonal patterns, in which the change of the intraseasonal Kelvin wave plays an important role.
Unsupervised neural networks such as the Kohonen Self-Organizing Maps (SOM) have been widely used for searching natural clusters in multidimensional and massive data. One example where the data available for analysis can be extremely large is seismic interpretation for hydrocarbon exploration. In order to assist the interpreter in identifying characteristics of interest confined in the seismic data, the authors present a set of data attributes that can be used to train a SOM in such a way that zones of interest can be automatically identified or segmented, reducing time in the interpretation process. The authors show how to associate a SOM with 2D color maps to visually identify the clustering structure of the input seismic data, and apply the proposed technique to a 2D synthetic seismic dataset of salt structures.
To solve the fault diagnosis problem of a liquid-propellant rocket engine ground testing bed, a fault diagnosis approach based on the self-organizing map (SOM) is proposed. The SOM projects the multidimensional ground testing bed data onto a two-dimensional map. Visualization of the SOM is used to cluster the ground testing bed data. The output map of the SOM is divided into several regions, each representing one fault mode. The fault mode of a test sample is determined according to the region to which its label belongs. The method is evaluated using the testing data of a liquid-propellant rocket engine ground testing bed with sixteen fault states. The results show that it is a reliable and effective method for fault diagnosis with good visualization properties.
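The projection of multidimensional data onto a two-dimensional map, which this and several neighbouring abstracts rely on, can be sketched as follows. This is a generic online Kohonen SOM in standard-library Python, not the method of any paper listed here: the function names, the grid size, the linearly decaying learning rate and neighbourhood radius, and the initialization from random samples are all assumptions chosen for brevity.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) grid node whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((a - b) ** 2 for a, b in zip(weights[node], x)),
    )


def train_som(data, rows, cols, n_iter=2000, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM with online updates.

    Returns a dict mapping (row, col) grid nodes to weight vectors.
    The decay schedules below are illustrative assumptions.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # Initialise each node from a randomly chosen training sample.
    weights = {
        (r, c): list(rng.choice(data))
        for r in range(rows)
        for c in range(cols)
    }
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # learning rate decays to 0
        sigma = sigma0 * (1.0 - frac) + 0.01     # neighbourhood radius shrinks
        x = rng.choice(data)
        b_row, b_col = best_matching_unit(weights, x)
        for (r, c), w in weights.items():
            grid_d2 = (r - b_row) ** 2 + (c - b_col) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma * sigma))  # Gaussian neighbourhood
            for d in range(dim):
                w[d] += lr * h * (x[d] - w[d])
    return weights
```

After training, calling `best_matching_unit` on a new sample gives its position on the map; assigning regions of the grid to known classes, as the abstract describes for fault modes, is a labelling step layered on top of this and is not shown.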
Characterization of unknown groundwater contaminant sources in terms of location, magnitude and duration of source activity is a complex problem. In this study, to increase the efficiency and accuracy of source characterization, an alternative to previously proposed methodologies is developed. This methodology, Adaptive Surrogate Modeling Based Optimization (ASMBO), uses the capabilities of the Self Organizing Map (SOM) algorithm to design the surrogate models and adaptive surrogate models for source characterization. The most important advantage of this methodology is its direct utilization for groundwater contaminant characterization without the necessity of utilizing a linked simulation optimization model. The validation of the SOM based surrogate models and SOM based adaptive surrogate models demonstrates that the quantity and quality of the initial samples have a crucial role in the accuracy of solutions, as do the designed monitoring locations. The performance evaluation results of the proposed methodology are obtained using error free and erroneous concentration measurement data. These results demonstrate that the developed methodology could approximate groundwater flow and transport simulation models, and substitute the optimization model for characterization of unknown groundwater contaminant sources in terms of location, magnitude and duration of source activity.
With the traditional K-means clustering algorithm it is difficult to determine the number of clusters, and the algorithm is sensitive to the initialization of the cluster centers and easily falls into local optima. This paper proposes a clustering algorithm based on a self-organizing mapping network and weight particle swarm optimization, SOM&WPSO (Self-Organization Map and Weight Particle Swarm Optimization). First, the algorithm uses the competitive learning mechanism of a self-organizing mapping network to divide the data samples into coarse clusters and obtain the cluster centers. Then, the obtained cluster centers are used as the initialization parameters of the weight particle swarm optimization algorithm. In the WPSO algorithm, the particle position, traditionally determined by the cluster center, is instead defined by the sample weight, and the cluster center is the "food" of the particle swarm. Each particle moves toward the nearest cluster center. Each iteration optimizes the particle position and velocity, and K-means and K-medoids are used to recalculate the cluster centers and cluster partitions until the algorithm converges. Extensive experiments on commonly used UCI data sets show that the proposed algorithm not only addresses the shortcomings of K-means clustering, namely its dependence on the initial cluster centers, and improves clustering accuracy, but also avoids falling into local optima. The algorithm has good global convergence.
An extended self-organizing map for supervised classification is proposed in this paper. Unlike traditional SOMs, the model has an input layer, a Kohonen layer, and an output layer. The number of neurons in the input layer depends on the dimensionality of the input patterns. The number of neurons in the output layer equals the number of desired classes. The number of neurons in the Kohonen layer may range from a few to several thousand, depending on the complexity of the classification problem and the required classification precision. Each training sample is expressed by a pair of vectors: an input vector and a class codebook vector. When a training sample is input into the model, Kohonen's competitive learning rule is applied to select the winning neuron from the Kohonen layer. The weight coefficients connecting all the neurons in the input layer with both the winning neuron and its neighbors in the Kohonen layer are modified to be closer to the input vector, and those connecting all the neurons around the winning neuron within a certain diameter in the Kohonen layer with all the neurons in the output layer are adjusted to be closer to the class codebook vector. If the number of training samples is sufficiently large and the learning epochs iterate enough times, the model will be able to serve as a supervised classifier. The model has been tentatively applied to the supervised classification of multispectral remotely sensed data. The author compared the performances of the extended SOM and BPN in remotely sensed data classification. The investigation shows that the extended SOM is feasible for supervised classification.
The detailed analysis of individual rain event characteristics is an essential step for improving our understanding of variation in precipitation over different topographies. In this study, the homogeneity among rain gauges was investigated using the concept of “rain event properties,” linking them to the main atmospheric system that affects the rainfall in the region. For this, eight properties of more than 23,000 rain events recorded at 47 meteorological stations in Mumbai, India, were analyzed utilizing seasonal (June-September) rainfall records over 2006-2016. The high similarities among the properties indicated the similarities among the rain gauges. Furthermore, similar rain gauges were distinguished, investigated and characterized by cluster analysis using self-organizing maps (SOM). The cluster analysis results show six clusters of similarly behaving rain gauges, where each cluster addresses one isolated class of variables for the rain gauge. Additionally, the clusters confirm the spatial variation of rainfall caused by the complex topography of Mumbai, comprising the flatland near the Arabian Sea, high-rise buildings (urban area) and mountain and hills areas (Sanjay Gandhi National Park located in the northern part of Mumbai).
Several studies have investigated the effects of meteorological factors on the occurrence of stroke. Regression models have mostly been used to assess the correlation between weather and stroke incidence. However, these methods could not describe the process underlying stroke incidence. The purpose of this study was to provide a new approach based on Hidden Markov Models (HMMs) and self-organizing maps (SOM), interpreting the background from the viewpoint of weather variability. Based on meteorological data, SOM was used to classify weather patterns. Using these SOM classes as randomly changing “states”, our Hidden Markov Models were constructed with “observation data” extracted from the daily emergency transport data of Nagoya City in Japan. We showed that SOM was an effective method to obtain weather patterns that would serve as “states” of Hidden Markov Models. Our Hidden Markov Models provided effective models to clarify the background process of stroke incidence. The effectiveness of these Hidden Markov Models was estimated by a stochastic test for root mean square errors (RMSE). “HMMs with states by SOM” would serve as a description of the background process of stroke incidence and were useful to show the influence of weather on stroke onset. This finding will contribute to an improved understanding of the links between weather variability and stroke incidence.
Business cluster identification is an essential topic for helping understand regional and global supply chains and establishing economic policies and logistics. This work aims to leverage the benefits of self-organizing maps (SOM), combined with traditional clustering algorithms and image processing techniques, to identify business clusters that are described by high-dimensionality feature vectors. It is advantageous over previous work because the algorithm is unsupervised and makes no assumptions about the number of clusters for a given feature set. The proposed algorithm was evaluated using recent datasets for US metropolitan cities from the Indiana Business Research Center (Innovation 2.0) and the Occupational Employment Statistics Survey. Data involving innovation metrics, education levels, economic well-being, connectivity, local GDP, and STEM are aggregated to demonstrate the effectiveness of the proposed neural network. The clustering results are compared to traditional approaches, including K-means clustering, both quantitatively and qualitatively. The unsupervised nature of the proposed SOM approach, and the acceptable computational complexity of the overall algorithm, suggest that self-organizing maps offer several advantages over traditional methods. In this work, we present a novel architecture coupling a SOM model with processing techniques for automatically identifying business clusters derived from high-dimensionality feature vectors, the first use case of SOMs in business cases affecting supply chains and other economic decisions. Preliminary results confirm the viability of the architecture as an unsupervised approach for identifying business clusters.
Intrusion attempts against Internet of Things (IoT) devices have significantly increased in the last few years. These devices are now easy targets for hackers because of their built-in security flaws. Combining a Self-Organizing Map (SOM) hybrid anomaly detection system for dimensionality reduction with the inherited nature of clustering and Extreme Gradient Boosting (XGBoost) for multi-class classification can improve network traffic intrusion detection. The proposed model is evaluated on the NSL-KDD dataset. The hybrid approach outperforms the baseline models, a multilayer perceptron model and a SOM-KNN (k-nearest neighbors) model, in precision, recall, and F1-score, highlighting the proposed approach's scalability, potential, adaptability, and real-world applicability. Therefore, this paper proposes a highly efficient deployment strategy for resource-constrained network edges. The results reveal that precision, recall, and F1-scores rise 10%-30% for the benign, probing, and Denial of Service (DoS) classes. In particular, the DoS, probe, and benign classes improved their F1-scores by 7.91%, 32.62%, and 12.45%, respectively.
Recently, machine learning (ML) has been considered a powerful technological element in different areas of society. To transform the computer into a decision maker, several sophisticated methods and algorithms are constantly created and analyzed. In geophysics, both supervised and unsupervised ML methods have dramatically contributed to the development of seismic and well-log data interpretation. In well-logging, ML algorithms are well-suited for lithologic reconstruction problems, since there are no analytical expressions for computing the well-log data produced by a particular rock unit. Additionally, supervised ML methods are strongly dependent on an accurately labeled training data-set, which is not simple to obtain because of missing or corrupted data. Once adequate supervision is performed, the classification outputs tend to be more accurate than those of unsupervised methods. This work presents a supervised version of a Self-Organizing Map, named SSOM, to solve a lithologic reconstruction problem from well-log data. First, we consider a more controlled problem and simulate well-log data directly from an interpreted geologic cross-section. We then define two specific training data-sets composed of density (RHOB), sonic (DT), spontaneous potential (SP) and gamma-ray (GR) logs, all simulated through a Gaussian distribution function per lithology. Once the training data-set is created, we simulate a particular pseudo-well, referred to as the classification well, for defining controlled tests. The first test comprises a training data-set with no labeled log data of the simulated fault zone. In the second test, we intentionally improve the training data-set with the fault. To assess the results obtained for each test, we analyze confusion matrices, log plots, accuracy and precision. Apart from very thin layer misclassifications, the SSOM provides reasonable lithologic reconstructions, especially when the improved training data-set is considered for supervision. The set of numerical experiments shows that our SSOM is extremely well-suited for supervised lithologic reconstruction, especially to recover lithotypes that are weakly sampled in the training log data. On the other hand, some misclassifications are also observed when the cortex could not group the slightly different lithologies.
The two important features of self-organizing maps (SOM), topological preservation and easy visualization, give it great potential for analyzing multi-dimensional time series, specifically traffic flow time series in an urban traffic network. This paper investigates the application of SOM in the representation and prediction of multi-dimensional traffic time series. First, SOMs are applied to cluster the time series and to project each multi-dimensional vector onto a two-dimensional SOM plane while preserving the topological relationships of the original data. Then, the easy visualization of the SOMs is utilized and several exploratory methods are used to investigate the physical meaning of the clusters as well as how the traffic flow vectors evolve with time. Finally, the k-nearest neighbor (kNN) algorithm is applied to the clustering result to perform short-term predictions of the traffic flow vectors. Analysis of real world traffic data shows the effectiveness of these methods for traffic flow predictions, for they can capture the nonlinear information of traffic flow data and predict traffic flows on multiple links simultaneously.
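The kNN prediction step can be illustrated with a minimal sketch. The abstract applies kNN to the SOM clustering result and does not give details, so this example makes a simplifying assumption: it searches the raw history of flow vectors directly, finds the `k` past vectors most similar to the current one, and averages the vectors that followed them. The function name `knn_forecast` and the unweighted average are hypothetical choices.

```python
def knn_forecast(history, current, k=3):
    """Predict the next multi-link flow vector by k-nearest-neighbour lookup.

    history: chronologically ordered list of equal-length flow vectors.
    current: the latest observed flow vector.
    Returns the element-wise mean of the successors of the k past vectors
    closest to `current` (squared Euclidean distance).
    """
    # Only vectors that have a successor in the history can be used.
    candidates = range(len(history) - 1)
    nearest = sorted(
        candidates,
        key=lambda i: sum((a - b) ** 2 for a, b in zip(history[i], current)),
    )[:k]
    dim = len(current)
    return [
        sum(history[i + 1][d] for i in nearest) / len(nearest)
        for d in range(dim)
    ]
```

Because each vector holds the flows on several links at once, a single lookup yields a forecast for all links simultaneously, which is the property the abstract highlights.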
The traveling salesman problem (TSP) is a classic non-deterministic polynomial-hard optimization problem. Based on the characteristics of the self-organizing mapping (SOM) network, this paper proposes an improved SOM network from the perspectives of network update strategy, initialization method, and parameter selection. This paper compares the performance of the proposed algorithms with the performance of existing SOM network algorithms on the TSP and compares them with several heuristic algorithms. Simulations show that compared with existing SOM networks, the improved SOM network proposed in this paper improves the convergence rate and algorithm accuracy. Compared with iterated local search and heuristic algorithms, the improved SOM network algorithms proposed in this paper have the advantage of fast calculation speed on medium-scale TSP.
We previously proposed a method for creating product maps with SOM (Self-Organizing Maps) to be used during purchase decision making. In that study, we first established two class boundaries, which divide the area between the minimum and maximum range of an input feature value into three equal parts. Then, we produced self-organizing product maps using classification data inputs. Finally, we applied our method to five product types and confirmed its effectiveness. In this paper, we propose a method for selecting alternatives from a product map, in which we have located a favorite several examples of selecting alternatives and making decisions using cluster, and/or from a favorite component map. We then show the AHP (Analytic Hierarchy Process).
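The three-way split described in this abstract, two class boundaries that divide the range between a feature's minimum and maximum into three equal parts, can be sketched as follows. The function name `three_level_classes`, the class codes 0, 1 and 2, and the choice to place a value that falls exactly on a boundary in the higher class are assumptions; the abstract does not specify them.

```python
def three_level_classes(values):
    """Map each feature value to class 0, 1 or 2 using equal-width thirds.

    The two boundaries sit at one third and two thirds of the way from
    the minimum to the maximum value. A value equal to a boundary is
    assigned to the higher class (an assumption made for this sketch).
    """
    low, high = min(values), max(values)
    if high == low:
        # A constant feature carries no information; use a single class.
        return [0] * len(values)
    boundary_1 = low + (high - low) / 3
    boundary_2 = low + 2 * (high - low) / 3
    return [0 if v < boundary_1 else 1 if v < boundary_2 else 2 for v in values]
```

Applying this to every feature of a product turns continuous specifications into coarse classification data of the kind the abstract says was fed to the SOM.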
Image-maps, a hybrid design with satellite images as background and map symbols overlaid, aim to combine the high interpretation efficiency of maps with the realism of satellite images. The usability of image-maps is influenced by the representations of background images and map symbols. Many researchers have explored optimizations for background images and symbolization techniques for symbols to reduce the complexity of image-maps and improve their usability. However, little literature was found on the optimum amount of symbol loading. This study focuses on the effects of background image complexity and map symbol load on the usability (i.e., effectiveness and efficiency) of image-maps. Experiments were conducted as user studies with eye-tracking equipment and an online questionnaire survey. Experimental data sets included image-maps with ten levels of map symbol load in ten areas. Forty volunteers took part in the target searching experiments. It was found that usability, i.e., the average time viewed (efficiency) and average revisits (effectiveness) of targets recorded, is influenced by the complexity of background images, and that a peak exists for the optimum symbol load of an image-map. The optimum levels of symbol load for different image-maps also have a peak when the complexity of the background image increases. The complexity of background images serves as a guideline for the optimum map symbol load in image-map design. This study enhanced user experience by optimizing visual clarity and managing cognitive load. Understanding how these factors interact can help create adaptive maps that maintain clarity and usability, guiding AI algorithms to adjust symbol density based on user context. This research establishes practices for map design, making cartographic tools more innovative and more user-centric.
Topographic maps, as essential tools and sources of information for geographic research, contain precise spatial locations and rich map features, and they illustrate spatio-temporal information on the distribution and differences of various surface features. Currently, topographic maps are mainly stored in raster and vector formats. Extraction of the spatio-temporal knowledge in the maps (such as spatial distribution patterns, feature relationships, and dynamic evolution) still primarily relies on manual interpretation. However, manual interpretation is time-consuming and laborious, especially for large-scale, long-term map knowledge extraction and application. With the development of artificial intelligence technology, it is possible to improve the automation level of map knowledge interpretation. Therefore, the present study proposes an automatic interpretation method for raster topographic map knowledge based on deep learning. To address the limitations of current data-driven intelligent technology in learning map spatial relations and cognitive logic, we establish a formal description of map knowledge by mapping the relationship between map knowledge and features, thereby ensuring interpretation accuracy. Subsequently, deep learning techniques are employed to extract map features automatically, and the spatio-temporal knowledge is constructed by combining formal descriptions of geographic feature knowledge. Validation experiments demonstrate that the proposed method effectively achieves automatic interpretation of spatio-temporal knowledge of geographic features in maps, with an accuracy exceeding 80%. The findings of the present study contribute to machine understanding of spatio-temporal differences in map knowledge and advance the intelligent interpretation and utilization of cartographic information.
Abstract: Water quality is a critical global issue, especially in urban and semi-urban regions where natural and anthropogenic factors significantly influence surface water systems. This study evaluates the hydrochemical characteristics of surface water in the North of Tehran Rivers (NTRs), an essential water resource in a rapidly urbanizing region, using advanced clustering techniques, including Hierarchical Clustering Analysis (HCA), Fuzzy C-Means (FCM), Genetic Algorithm Fuzzy C-Means (GAFCM), and Self-Organizing Map (SOM). The research aims to address the scientific challenge of understanding spatial and temporal variability in water quality, focusing on physicochemical parameters, hydrochemical facies, and contamination sources. Water samples from six rivers collected over four seasons in 2020 were analyzed and classified into distinct clusters based on their chemical composition, revealing significant seasonal and spatial differences. Results showed that FCM and GAFCM consistently categorized the NTRs into two clusters during winter and spring and three in summer and autumn. These findings were supported by HCA and SOM, which identified clusters corresponding to specific river segments and contamination levels. The primary hydrochemical processes identified were mineral dissolution and weathering, with calcite, dolomite, and aragonite significantly influencing water chemistry. Additionally, human activities, such as wastewater discharge, were shown to contribute to elevated sulfate, nitrate, and phosphate concentrations, further corroborated by microbial analyses. By integrating HCA, FCM, and GAFCM with an artificial neural network (ANN)-based clustering method (SOM), this study provides a robust framework for evaluating surface water quality. The findings, supported by Gibbs diagrams, Hounslow ion ratio, and saturation indices, highlight the dominance of rock weathering and human impacts in shaping the hydrochemical dynamics of the NTRs. These insights contribute to the scientific understanding of water quality dynamics and offer practical guidance for sustainable water resource management and environmental protection in developing urban areas.
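As a rough illustration of the Fuzzy C-Means step named in the abstract above, the following minimal sketch implements the textbook FCM updates in plain Python. It is not the authors' code: the two-feature sample values, the initial centers, the fuzzifier `m = 2` and the fixed iteration count are all assumptions chosen for demonstration, and the genetic-algorithm variant (GAFCM) is not shown.

```python
import math

def memberships(points, centers, m):
    """u[k][i] = 1 / sum_j (d_ki / d_kj)^(2/(m-1)) for point k and cluster i."""
    power = 2.0 / (m - 1.0)
    u = []
    for p in points:
        # Clamp distances so a point lying exactly on a center cannot divide by zero.
        d = [max(math.dist(p, c), 1e-12) for c in centers]
        u.append([1.0 / sum((di / dj) ** power for dj in d) for di in d])
    return u

def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for i in range(len(centers)):
            # Each center is the mean of all points weighted by u_ki^m.
            w = [row[i] ** m for row in u]
            total = sum(w)
            new_centers.append(tuple(
                sum(wk * p[j] for wk, p in zip(w, points)) / total
                for j in range(dim)))
        centers = new_centers
    return centers, memberships(points, centers, m)

# Hypothetical, already-scaled measurements (two features per sample).
samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
           (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers, u = fuzzy_c_means(samples, [(1.0, 1.0), (4.0, 4.0)])
```

Each row of `u` sums to one, so a sample that sits between two groups receives a split membership instead of a hard label, which is the property that distinguishes FCM from K-means.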
Abstract: The title of the online version of the original article has been revised to: Hydrochemical characterization of surface waters in Northern Tehran: Integrating cluster-based techniques with Self-Organizing Maps.
Funding: The National Natural Science Foundation of China (Nos. 41972259 and 41572227); the National Key Research and Development Program of China (No. 2018YFC0406404).
Abstract: Water resources are scarce in arid and semiarid areas, which not only limits economic development but also threatens human survival. The local communities around the Hangjinqi gasfield depend on groundwater for their water supply. A clear understanding of groundwater hydrogeochemical characteristics, groundwater quality, and its seasonal cycle is indispensable for groundwater protection and management. In this study, self-organizing maps were used in combination with the quantization and topographic errors and the K-means clustering method to investigate groundwater chemistry datasets. Piper and Gibbs diagrams and the saturation index were systematically applied to investigate the hydrogeochemical characteristics of groundwater in both the rainy and dry seasons. Further, entropy-weighted theory was used to characterize groundwater quality and assess its seasonal variability and suitability for drinking purposes. Our hydrochemical groundwater dataset, consisting of 10 parameters measured during both dry and rainy seasons, was classified into 6 clusters, and the Piper diagram revealed three hydrochemical facies: Cl-Na type (clusters 1, 2 and 3), mixed type (clusters 4 and 5), and HCO3-Ca type (cluster 6). The Gibbs diagram and saturation index suggested that weathering of rock-forming minerals was the primary process controlling groundwater chemical composition and validated the credibility and practicality of the clustering results. Two-thirds of the 45 groundwater samples were categorized as excellent or good quality and were suitable as drinking water. Cluster changes within the same and different clusters from the dry season to the rainy season were detected in approximately 78% of the collected samples. The main factors affecting groundwater quality were hydrogeochemical characteristics, and dry-season groundwater quality was better than rainy-season groundwater quality. These results can be used to investigate the seasonal variation of hydrogeochemical characteristics and to assess water quality accurately in other similar areas.
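The quantization and topographic errors mentioned in the abstract above are the usual quality measures for a trained SOM. The sketch below computes both from a codebook in plain Python; the toy one-dimensional codebook, the grid coordinates and the 8-neighbourhood adjacency rule on a rectangular grid are assumptions for illustration, not details taken from the study.

```python
import math

def bmu_ranking(sample, codebook):
    """Node indices sorted from the closest weight vector to the farthest."""
    return sorted(range(len(codebook)),
                  key=lambda i: math.dist(sample, codebook[i]))

def quantization_error(samples, codebook):
    """Mean distance between each sample and its best-matching unit (BMU)."""
    return sum(math.dist(s, codebook[bmu_ranking(s, codebook)[0]])
               for s in samples) / len(samples)

def topographic_error(samples, codebook, grid_pos):
    """Share of samples whose first and second BMUs are not grid neighbours."""
    bad = 0
    for s in samples:
        first, second = bmu_ranking(s, codebook)[:2]
        (r1, c1), (r2, c2) = grid_pos[first], grid_pos[second]
        # Assumed adjacency: 8-neighbourhood on a rectangular grid.
        if max(abs(r1 - r2), abs(c1 - c2)) > 1:
            bad += 1
    return bad / len(samples)

# Toy 1 x 3 map with one input feature (hypothetical values).
grid_pos = [(0, 0), (0, 1), (0, 2)]
ordered = [(0.0,), (1.0,), (2.0,)]   # weights follow the grid order
folded = [(0.0,), (2.0,), (1.0,)]    # same weights, but the map is twisted
samples = [(0.1,), (1.9,)]

qe = quantization_error(samples, ordered)
te_ordered = topographic_error(samples, ordered, grid_pos)
te_folded = topographic_error(samples, folded, grid_pos)
```

Both codebooks give the same quantization error, but only the twisted one has a non-zero topographic error, which is why the two measures are normally reported together when choosing a map size.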
Funding: The National Natural Science Foundation of China (NSFC) (Nos. 41976027, 41976011, 41730534, 41476017, 41576014); the Bureau of International Cooperation, Chinese Academy of Sciences (No. 132B61KYSB20170005).
Abstract: We investigated the intraseasonal variability of equatorial Pacific subsurface temperature and its relationship with El Niño-Southern Oscillation (ENSO) using Self-Organizing Maps (SOM) analysis. Variation in intraseasonal subsurface temperature is mainly found along the thermocline. The SOM patterns concentrate in basin-wide seesaw or sandwich structures along an east-west axis. Both the seesaw and sandwich SOM patterns oscillate with periods of 55 to 90 days, with their sequence showing features of the equatorial intraseasonal Kelvin wave, and have marked interannual variations in their occurrence frequencies. Further examination shows that the interannual variability of the SOM patterns is closely related to ENSO, and that maxima in composite interannual variability of the SOM patterns are located in the central Pacific during CP El Niño and in the eastern Pacific during EP El Niño. These results imply that some of the ENSO forcing is manifested through changes in the occurrence frequency of intraseasonal patterns, in which the change of the intraseasonal Kelvin wave plays an important role.
Abstract: Unsupervised neural networks such as the Kohonen Self-Organizing Maps (SOM) have been widely used for searching natural clusters in multidimensional and massive data. One example where the data available for analysis can be extremely large is seismic interpretation for hydrocarbon exploration. In order to assist the interpreter in identifying characteristics of interest confined in the seismic data, the authors present a set of data attributes that can be used to train a SOM in such a way that zones of interest can be automatically identified or segmented, reducing time in the interpretation process. The authors show how to associate a SOM with 2D color maps to visually identify the clustering structure of the input seismic data, and apply the proposed technique to a 2D synthetic seismic dataset of salt structures.
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. NSFC-60572010).
Abstract: To solve the fault diagnosis problem of a liquid-propellant rocket engine ground testing bed, a fault diagnosis approach based on the self-organizing map (SOM) is proposed. The SOM projects the multidimensional ground testing bed data onto a two-dimensional map. Visualization of the SOM is used to cluster the ground testing bed data. The output map of the SOM is divided into several regions, and each region represents one fault mode. The fault mode of a test sample is determined by the region to which its label belongs. The method is evaluated using the testing data of a liquid-propellant rocket engine ground testing bed with sixteen fault states. The results show that it is a reliable and effective method for fault diagnosis with good visualization properties.
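The region-based lookup described in the abstract above can be sketched as a best-matching-unit search followed by a label lookup. This is only an illustration under stated assumptions: the codebook, the region labels and the fault-mode names below are hypothetical, and the abstract does not say how the regions are drawn on the trained map.

```python
import math

def classify_by_bmu(sample, codebook, node_labels):
    """Return the label of the map node whose weight vector is closest to `sample`."""
    bmu = min(range(len(codebook)), key=lambda i: math.dist(sample, codebook[i]))
    return node_labels[bmu]

# Hypothetical trained 2 x 2 map, flattened row by row; each node carries
# the fault-mode label of the region it falls in.
codebook = [(0.1, 0.2), (0.2, 0.9), (0.8, 0.1), (0.9, 0.9)]
node_labels = ["normal", "valve_fault", "sensor_drift", "pump_fault"]

mode = classify_by_bmu((0.85, 0.15), codebook, node_labels)
```

Because the labels live on map nodes instead of on the raw data, the same trained map can be inspected visually and used for classification.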
Abstract: Characterization of unknown groundwater contaminant sources in terms of location, magnitude and duration of source activity is a complex problem. In this study, to increase the efficiency and accuracy of source characterization, an alternative to the methodologies proposed earlier is developed. This methodology, Adaptive Surrogate Modeling Based Optimization (ASMBO), uses the capabilities of the Self-Organizing Map (SOM) algorithm to design the surrogate models and adaptive surrogate models for source characterization. The most important advantage of this methodology is that it can be used directly for groundwater contaminant characterization without the need for a linked simulation-optimization model. The validation of the SOM-based surrogate models and SOM-based adaptive surrogate models demonstrates that the quantity and quality of initial sample sizes have a crucial role in the accuracy of solutions as the designed monitoring locations. The performance evaluation results of the proposed methodology are obtained using error-free and erroneous concentration measurement data. These results demonstrate that the developed methodology could approximate groundwater flow and transport simulation models, and substitute for the optimization model in characterizing unknown groundwater contaminant sources in terms of location, magnitude and duration of source activity.
Abstract: The traditional K-means clustering algorithm has difficulty determining the number of clusters, is sensitive to the initialization of the cluster centers, and easily falls into local optima. This paper proposes a clustering algorithm based on a self-organizing mapping network and weight particle swarm optimization, SOM&WPSO (Self-Organization Map and Weight Particle Swarm Optimization). First, the algorithm uses the competitive learning mechanism of a self-organizing mapping network to divide the data samples into coarse clusters and obtain the cluster centers. Then, the obtained cluster centers are used as the initialization parameters of the weight particle swarm optimization algorithm. The particle position of the WPSO algorithm is changed from being determined by the traditional cluster center to being determined by the sample weight, and the cluster center is the “food” of the particle group. Each particle moves toward the nearest cluster center. Each iteration optimizes the particle position and velocity and uses K-means and K-medoids to recalculate cluster centers and cluster partitions until the algorithm converges. Extensive experiments on commonly used UCI data sets show that the algorithm not only addresses the shortcomings of the K-means clustering algorithm, namely its dependence on the initial cluster centers, and improves clustering accuracy, but also avoids falling into local optima. The algorithm has good global convergence.
Funding: Supported by the National Natural Science Foundation of China (No. 40872193).
Abstract: An extended self-organizing map for supervised classification is proposed in this paper. Unlike other traditional SOMs, the model has an input layer, a Kohonen layer, and an output layer. The number of neurons in the input layer depends on the dimensionality of the input patterns. The number of neurons in the output layer equals the number of desired classes. The number of neurons in the Kohonen layer may range from a few to several thousand, depending on the complexity of the classification problem and the required classification precision. Each training sample is expressed by a pair of vectors: an input vector and a class codebook vector. When a training sample is input into the model, Kohonen's competitive learning rule is applied to select the winning neuron from the Kohonen layer. The weight coefficients connecting all the neurons in the input layer with both the winning neuron and its neighbors in the Kohonen layer are modified to be closer to the input vector, and those connecting all the neurons around the winning neuron within a certain diameter in the Kohonen layer with all the neurons in the output layer are adjusted to be closer to the class codebook vector. If the number of training samples is sufficiently large and the learning epochs iterate enough times, the model will be able to serve as a supervised classifier. The model has been tentatively applied to the supervised classification of multispectral remotely sensed data. The author compared the performance of the extended SOM and BPN in remotely sensed data classification. The investigation shows that the extended SOM is feasible for supervised classification.
Abstract: The detailed analysis of individual rain event characteristics is an essential step for improving our understanding of variation in precipitation over different topographies. In this study, the homogeneity among rain gauges was investigated using the concept of “rain event properties,” linking them to the main atmospheric system that affects the rainfall in the region. For this, eight properties of more than 23,000 rain events recorded at 47 meteorological stations in Mumbai, India, were analyzed utilizing seasonal (June-September) rainfall records over 2006-2016. The high similarities among the properties indicated the similarities among the rain gauges. Furthermore, similar rain gauges were distinguished, investigated and characterized by cluster analysis using self-organizing maps (SOM). The cluster analysis results show six clusters of similarly behaving rain gauges, where each cluster addresses one isolated class of variables for the rain gauge. Additionally, the clusters confirm the spatial variation of rainfall caused by the complex topography of Mumbai, comprising the flatland near the Arabian Sea, high-rise buildings (urban area) and mountain and hill areas (Sanjay Gandhi National Park, located in the northern part of Mumbai).
Abstract: Several studies have investigated the effects of meteorological factors on the occurrence of stroke. Regression models have mostly been used to assess the correlation between weather and stroke incidence. However, these methods could not describe the process operating in the background of stroke incidence. The purpose of this study was to provide a new approach based on Hidden Markov Models (HMMs) and self-organizing maps (SOM), interpreting the background from the viewpoint of weather variability. Based on meteorological data, SOM was performed to classify weather patterns. Using these SOM classes as randomly changing “states”, our Hidden Markov Models were constructed with “observation data” extracted from the daily emergency transport data of Nagoya City, Japan. We showed that SOM was an effective method for obtaining weather patterns that serve as “states” of Hidden Markov Models. Our Hidden Markov Models provided effective models to clarify the background process for stroke incidence. The effectiveness of these Hidden Markov Models was estimated by a stochastic test for root mean square errors (RMSE). “HMMs with states by SOM” would serve as a description of the background process of stroke incidence and were useful for showing the influence of weather on stroke onset. This finding will contribute to a better understanding of the links between weather variability and stroke incidence.
Abstract: Business cluster identification is an essential topic for helping understand regional and global supply chains and establishing economic policies and logistics. This work aims to leverage the benefits of self-organizing maps (SOM), combined with traditional clustering algorithms and image processing techniques, to identify business clusters that are described by high-dimensionality feature vectors. It is advantageous over previous work because the algorithm is unsupervised and makes no assumptions about the number of clusters for a given feature set. The proposed algorithm was evaluated using recent datasets for US metropolitan cities from the Indiana Business Research Center (Innovation 2.0) and the Occupational Employment Statistics Survey. Data involving innovation metrics, education levels, economic well-being, connectivity, local GDP, and STEM are aggregated to demonstrate the effectiveness of the proposed neural network. The clustering results are compared to traditional approaches, including K-means clustering, both quantitatively and qualitatively. The unsupervised nature of the proposed SOM approach, and the acceptable computational complexity of the overall algorithm, suggest that self-organizing maps offer several advantages over traditional methods. In this work, we present a novel architecture coupling a SOM model with processing techniques for automatically identifying business clusters derived from high-dimensionality feature vectors, the first use case of SOMs in business cases affecting supply chains and other economic decisions. Preliminary results confirm the viability of the architecture as an unsupervised approach for identifying business clusters.
Funding: Researcher Supporting Project number (RSPD2025R582), King Saud University, Riyadh, Saudi Arabia.
Abstract: Intrusion attempts against Internet of Things (IoT) devices have significantly increased in the last few years. These devices are now easy targets for hackers because of their built-in security flaws. Combining a Self-Organizing Map (SOM) hybrid anomaly detection system for dimensionality reduction with the inherited nature of clustering and Extreme Gradient Boosting (XGBoost) for multi-class classification can improve network traffic intrusion detection. The proposed model is evaluated on the NSL-KDD dataset. The hybrid approach outperforms the baseline models, a multilayer perceptron model and a SOM-KNN (k-nearest neighbors) model, in precision, recall, and F1-score, highlighting the proposed approach's scalability, potential, adaptability, and real-world applicability. Therefore, this paper proposes a highly efficient deployment strategy for resource-constrained network edges. The results reveal that precision, recall, and F1-scores rise 10%-30% for the benign, probing, and Denial of Service (DoS) classes. In particular, the DoS, probe, and benign classes improved their F1-scores by 7.91%, 32.62%, and 12.45%, respectively.
Abstract: Recently, machine learning (ML) has been considered a powerful technological element in different areas of society. To transform the computer into a decision maker, several sophisticated methods and algorithms are constantly created and analyzed. In geophysics, both supervised and unsupervised ML methods have dramatically contributed to the development of seismic and well-log data interpretation. In well-logging, ML algorithms are well-suited for lithologic reconstruction problems, since there are no analytical expressions for computing well-log data produced by a particular rock unit. Additionally, supervised ML methods are strongly dependent on an accurately labeled training data-set, which is not simple to obtain, due to data absences or corruption. Once adequate supervision is performed, the classification outputs tend to be more accurate than those of unsupervised methods. This work presents a supervised version of a Self-Organizing Map, named SSOM, to solve a lithologic reconstruction problem from well-log data. Firstly, we go for a more controlled problem and simulate well-log data directly from an interpreted geologic cross-section. We then define two specific training data-sets composed of density (RHOB), sonic (DT), spontaneous potential (SP) and gamma-ray (GR) logs, all simulated through a Gaussian distribution function per lithology. Once the training data-set is created, we simulate a particular pseudo-well, referred to as the classification well, for defining controlled tests. The first test comprises a training data-set with no labeled log data of the simulated fault zone. In the second test, we intentionally improve the training data-set with the fault. To assess the results obtained for each test, we analyze confusion matrices, log plots, accuracy and precision. Apart from very thin layer misclassifications, the SSOM provides reasonable lithologic reconstructions, especially when the improved training data-set is considered for supervision. The set of numerical experiments shows that our SSOM is extremely well-suited for a supervised lithologic reconstruction, especially to recover lithotypes that are weakly sampled in the training log-data. On the other hand, some misclassifications are also observed when the cortex could not group the slightly different lithologies.
Funding: The National Key Basic Research and Development (973) Program of China (No. 2006CB705506); the National High-Tech Research and Development (863) Program of China (No. 2007AA11Z222); the National Natural Science Foundation of China (Nos. 60774034, 60721003, and 50708054).
Abstract: The two important features of self-organizing maps (SOM), topological preservation and easy visualization, give them great potential for analyzing multi-dimensional time series, specifically traffic flow time series in an urban traffic network. This paper investigates the application of SOM in the representation and prediction of multi-dimensional traffic time series. First, SOMs are applied to cluster the time series and to project each multi-dimensional vector onto a two-dimensional SOM plane while preserving the topological relationships of the original data. Then, the easy visualization of the SOMs is utilized and several exploratory methods are used to investigate the physical meaning of the clusters as well as how the traffic flow vectors evolve with time. Finally, the k-nearest neighbor (kNN) algorithm is applied to the clustering result to perform short-term predictions of the traffic flow vectors. Analysis of real-world traffic data shows the effectiveness of these methods for traffic flow prediction, for they can capture the nonlinear information of traffic flow data and predict traffic flows on multiple links simultaneously.
Funding: The National Natural Science Foundation of China (No. 61627810); the National Science and Technology Major Program of China (No. 2018YFB1305003); the National Defense Science and Technology Outstanding Youth Science Foundation (No. 2017-JCJQ-ZQ-031).
Abstract: The traveling salesman problem (TSP) is a classic non-deterministic polynomial-hard optimization problem. Based on the characteristics of the self-organizing mapping (SOM) network, this paper proposes an improved SOM network from the perspectives of network update strategy, initialization method, and parameter selection. This paper compares the performance of the proposed algorithms with that of existing SOM network algorithms on the TSP and with several heuristic algorithms. Simulations show that, compared with existing SOM networks, the improved SOM network proposed in this paper improves the convergence rate and algorithm accuracy. Compared with iterated local search and heuristic algorithms, the improved SOM network algorithms proposed in this paper have the advantage of fast calculation speed on medium-scale TSP.
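For readers unfamiliar with how a SOM can be applied to the TSP at all, the sketch below shows the basic ring-shaped SOM that such work typically starts from. It is not the improved network of the abstract above: the neuron count, learning rate, neighbourhood radius, decay factors and city coordinates are all assumed values picked for a small demonstration.

```python
import math
import random

def som_tsp(cities, n_iters=2000, seed=0):
    """Order cities with a basic ring SOM; returns a list of city indices."""
    rng = random.Random(seed)
    n = 3 * len(cities)                        # assumed: three neurons per city
    cx = sum(x for x, _ in cities) / len(cities)
    cy = sum(y for _, y in cities) / len(cities)
    # Start the ring as a small circle around the centroid of the cities.
    ring = [[cx + 0.1 * math.cos(2 * math.pi * k / n),
             cy + 0.1 * math.sin(2 * math.pi * k / n)] for k in range(n)]

    def winner(point):
        return min(range(n), key=lambda k: (ring[k][0] - point[0]) ** 2
                                           + (ring[k][1] - point[1]) ** 2)

    lr, radius = 0.8, n / 8
    for _ in range(n_iters):
        x, y = rng.choice(cities)
        win = winner((x, y))
        for k in range(n):
            d = min(abs(k - win), n - abs(k - win))   # distance along the ring
            h = math.exp(-(d * d) / (2 * radius * radius))
            ring[k][0] += lr * h * (x - ring[k][0])
            ring[k][1] += lr * h * (y - ring[k][1])
        lr *= 0.9995                           # slowly reduce the step size
        radius = max(radius * 0.997, 0.5)      # shrink the neighbourhood
    # Reading the ring in order gives the visiting sequence.
    return sorted(range(len(cities)), key=lambda i: winner(cities[i]))

def tour_length(cities, tour):
    """Length of the closed tour that visits `cities` in the order `tour`."""
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

cities = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 1.5)]
tour = som_tsp(cities)
length = tour_length(cities, tour)
```

The result is always a valid permutation of the cities, but nothing in this basic version guarantees an optimal tour; update strategy, initialization and parameter selection are exactly the aspects the abstract says the improved network revisits.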
Abstract: We previously proposed a method for creating product maps with SOM (Self-Organizing Maps) to be used during purchase decision making. In that study, we first established two class boundaries, which divide the range between the minimum and maximum of an input feature value into three equal parts. Then, we produced self-organizing product maps using classification data inputs. Finally, we applied our method to five product types and confirmed its effectiveness. In this paper, we propose a method for selecting alternatives from a product map, in which we have located a favorite cluster, and/or from a favorite component map. We then show several examples of selecting alternatives and making decisions using the AHP (Analytic Hierarchy Process).
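The two-boundary discretization described in the abstract above can be written in a few lines. This is a minimal sketch under assumptions: the abstract does not say which class a value lying exactly on a boundary belongs to, so the choice below (boundary values go to the higher class) and the sample prices are illustrative only.

```python
def three_level_classes(values):
    """Map each value to class 0, 1 or 2 using two equally spaced boundaries."""
    lo, hi = min(values), max(values)
    b1 = lo + (hi - lo) / 3        # first boundary: one third of the range
    b2 = lo + 2 * (hi - lo) / 3    # second boundary: two thirds of the range
    # Assumed convention: a value equal to a boundary joins the higher class.
    return [0 if v < b1 else 1 if v < b2 else 2 for v in values]

# Hypothetical product prices; the boundaries fall at 300 and 500.
prices = [100, 250, 400, 700]
classes = three_level_classes(prices)   # [0, 0, 1, 2]
```

Applying the same function to every feature turns each product into a short vector of class codes, which is the kind of classification data the abstract says is fed to the SOM.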
Funding: National Natural Science Foundation of China (No. 42301518); Hubei Key Laboratory of Regional Development and Environmental Response (No. 2023(A)002); Key Laboratory of the Evaluation and Monitoring of Southwest Land Resources (Ministry of Education) (No. TDSYS202304).
Abstract: Image-maps, a hybrid design with satellite images as background and map symbols overlaid, aim to combine the advantages of maps' high interpretation efficiency and satellite images' realism. The usability of image-maps is influenced by the representations of background images and map symbols. Many researchers have explored optimizations for background images and symbolization techniques for symbols to reduce the complexity of image-maps and improve their usability. However, little literature was found on the optimum amount of symbol loading. This study focuses on the effects of background image complexity and map symbol load on the usability (i.e., effectiveness and efficiency) of image-maps. Experiments were conducted as user studies using eye-tracking equipment and an online questionnaire survey. Experimental data sets included image-maps with ten levels of map symbol load in ten areas. Forty volunteers took part in the target searching experiments. It was found that the usability, i.e., average time viewed (efficiency) and average revisits (effectiveness) of targets recorded, is influenced by the complexity of background images, and that a peak exists for the optimum symbol load of an image-map. The optimum levels of symbol load for different image-maps also have a peak as the complexity of the background image/image map increases. The complexity of background images serves as a guideline for the optimum map symbol load in image-map design. This study enhanced user experience by optimizing visual clarity and managing cognitive load. Understanding how these factors interact can help create adaptive maps that maintain clarity and usability, guiding AI algorithms to adjust symbol density based on user context. This research establishes practices for map design, making cartographic tools more innovative and more user-centric.
Funding: Deep-time Digital Earth (DDE) Big Science Program (No. GJ-C03-SGF-2025-004); National Natural Science Foundation of China (No. 42394063); Sichuan Science and Technology Program (No. 2025ZNSFSC0325).
Abstract: Topographic maps, as essential tools and sources of information for geographic research, contain precise spatial locations and rich map features, and they illustrate spatio-temporal information on the distribution and differences of various surface features. Currently, topographic maps are mainly stored in raster and vector formats. Extraction of the spatio-temporal knowledge in the maps (such as spatial distribution patterns, feature relationships, and dynamic evolution) still primarily relies on manual interpretation. However, manual interpretation is time-consuming and laborious, especially for large-scale, long-term map knowledge extraction and application. With the development of artificial intelligence technology, it is possible to improve the automation level of map knowledge interpretation. Therefore, the present study proposes an automatic interpretation method for raster topographic map knowledge based on deep learning. To address the limitations of current data-driven intelligent technology in learning map spatial relations and cognitive logic, we establish a formal description of map knowledge by mapping the relationship between map knowledge and features, thereby ensuring interpretation accuracy. Subsequently, deep learning techniques are employed to extract map features automatically, and the spatio-temporal knowledge is constructed by combining formal descriptions of geographic feature knowledge. Validation experiments demonstrate that the proposed method effectively achieves automatic interpretation of spatio-temporal knowledge of geographic features in maps, with an accuracy exceeding 80%. The findings of the present study contribute to machine understanding of spatio-temporal differences in map knowledge and advance the intelligent interpretation and utilization of cartographic information.