Extracting and parameterizing ionospheric waves globally and statistically is a longstanding problem. Building on the multichannel maximum entropy method (MMEM) used in previous work to study ionospheric waves, we calculate the parameters of ionospheric waves by applying the MMEM to a large number of global-positioning-system radio occultation total electron content profile triples that are close in both time and space. These triples were provided by the unique clustered satellite flight in 2006 and 2007, right after the launch of the Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC) mission. The results show that the amplitude of ionospheric waves is larger at low and high latitudes (~0.15 TECU) and smaller at mid-latitudes (~0.05 TECU). The vertical wavelength of the ionospheric waves is longer at mid-latitudes (e.g., ~50 km at altitudes of 200–250 km) and shorter at low and high latitudes (e.g., ~35 km at altitudes of 200–250 km). The horizontal wavelength shows a similar pattern (e.g., ~1400 km at mid-latitudes and ~800 km at low and high latitudes).
The Binary star DataBase (BDB, http://bdb.inasan.ru) combines data from catalogs of binary and multiple stars of all observational types. There are a number of ways for variable stars to form or to be part of binary or multiple systems. We describe how such stars are represented in the database.
We introduced a decision tree method called Random Forests for multiwavelength data classification. The data were adopted from different databases, including the Sloan Digital Sky Survey (SDSS) Data Release 5, USNO, FIRST and ROSAT. We then studied the discrimination of quasars from stars and the classification of quasars, stars and galaxies, first with a sample from the optical and radio bands and then with a sample from the optical and X-ray bands. Moreover, feature selection and feature weighting based on Random Forests were investigated, and the performances obtained with different input patterns were compared. The experimental results show that Random Forests is an effective method for astronomical object classification and can be applied to other classification problems in astronomy. In addition, Random Forests has further merits beyond classification, e.g. feature selection, feature weighting and outlier detection.
Effective extraction of data association rules can provide a reliable basis for the classification of stellar spectra. The concepts of stellar spectrum weighted itemsets and stellar spectrum weighted association rules are introduced, and the weight of a single property in the stellar spectrum is determined by information entropy. On that basis, a method is presented to mine the association rules of a stellar spectrum based on the weighted frequent pattern tree. This method highlights important properties of the spectral lines while also taking the waveform of the whole spectrum into account. The experimental results show that the association rules mined with this method are consistent with the main features of stellar spectral types.
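The abstract above states that attribute weights are determined by information entropy but does not give the formula. The following is a minimal sketch of one common reading, assuming each attribute has already been discretized: compute the Shannon entropy of each attribute and normalize across attributes. The attribute names (`h_alpha_strength`, `ca_k_strength`) and the normalization rule are illustrative assumptions, not the paper's definitions.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_weights(table):
    """Map each attribute name to its share of the summed entropy.

    `table` maps a (hypothetical) attribute name to its discretized values.
    Normalizing by the entropy sum is an illustrative choice only.
    """
    entropies = {name: shannon_entropy(vals) for name, vals in table.items()}
    total = sum(entropies.values())
    if total == 0:
        # Every attribute is constant, so fall back to equal weights.
        return {name: 1.0 / len(table) for name in table}
    return {name: h / total for name, h in entropies.items()}


# Hypothetical discretized line strengths for four spectra.
spectra = {
    "h_alpha_strength": ["strong", "strong", "weak", "weak"],
    "ca_k_strength": ["weak", "weak", "weak", "weak"],
}
weights = entropy_weights(spectra)
```

In this toy input the constant attribute carries no information and receives zero weight, while the attribute that varies across spectra receives all of it.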
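The Insight Hard X-ray Modulation Telescope (Insight-HXMT, or HXMT for short) was launched on 15 June 2017, marking the birth of China's first independently developed observatory-class X-ray telescope. With its combined advantages of large area, broad energy band, high time resolution and high energy resolution, Insight-HXMT has opened a new window for studying the fast hard X-ray variability and broadband energy spectra of black hole and neutron star systems. Having exceeded its design lifetime, Insight-HXMT has operated stably for more than 8 yr; it is currently in good condition and its in-orbit service is expected to be extended further. As of October 2024, Insight-HXMT had issued seven calls for observation proposals to the global scientific community, received 334 valid proposals, and drawn up 2368 observation plans on that basis. In addition, 13 batches of data have been released to the public, with a cumulative volume of 40 TB and a data release ratio of 94%. Insight-HXMT also provides users with different versions of the data analysis software and calibration database; the in-orbit calibration accuracy is about 2%, which meets the requirements of scientific analysis. Researchers from 17 institutions worldwide and 36 institutions in China have used Insight-HXMT data in their research and published more than 300 high-quality papers, with about 7300 citations in total.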
Pulsar candidate identification is an indispensable task in pulsar science. Motivated by the imbalance and diversity of pulsar data sets and the lack of a unified processing framework, we first used dimensionality reduction and visualization to analyze potential deficiencies caused by the incompleteness of current data set extraction methods. We found that the limited use of non-pulsar data may bias the results and thereby limit generalization ability. Based on the dimensionality reduction results, we propose a Grid Group Uniform Sampling (GGUS) method. This data preprocessing method improves the performance of Random Forest, Support Vector Machine, Convolutional Neural Network, and ResNet50 models on Lyon's features, diagnostic plots, and period-dispersion measure (period-DM) plots in the HTRU1 data set. The average recall increased by approximately 0.5%, precision by nearly 2%, and F1 score by around 1.2% for all models and in all data sets. In the period-DM plot tests, the high-performance ResNet50 algorithm achieved an F1 score of over 98% using random sampling. GGUS brought further improvements in this test, raising the average F1 score, precision, and recall by approximately 0.07%, 0.1%, and 0.03%, respectively.
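The gains above are reported as recall, precision and F1 score. As a reminder of how these three metrics relate on an imbalanced binary task, here is a minimal sketch that derives them from confusion-matrix counts. The counts are made-up illustrative numbers, not values from the paper, and the function name is hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive
    and false-negative counts; empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative counts only: 90 pulsars found, 10 false alarms, 30 pulsars missed.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
```

Because none of the three metrics uses the true-negative count, they stay informative when non-pulsar candidates vastly outnumber pulsars.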
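Census data are an important data source for spatial analysis in geography. Because of limits on data availability and computer processing capacity, previous studies paid too little attention to the uncertainty of spatial analysis of census data, and no mature research method has emerged. Using population census data at the building level, and based on the characteristics of the modifiable areal unit problem (MAUP) in polygon-based statistical data, this paper designs a method for studying the uncertainty of spatial analysis of such data, and analyzes the accuracy of spatial analysis of polygon statistical data under different scale and zoning systems. The experiment introduces scale and shape indices and uses visual analysis and data fitting to model how scale and zoning affect spatial analysis results. The results show that: (1) spatial analysis based on census tracts is strongly affected by the spatial shape of the tracts, is highly uncertain, and cannot fully reflect the spatial characteristics of the statistical data themselves; (2) regular grids preserve the spatial distribution of the original statistical data fairly well, but are still affected by scale and zoning; (3) the spatial analysis results for regular grids, and their accuracy, fit well as functions of scale, and the uncertainty of the results at different scales reflects the multi-scale characteristics of the original data; (4) the zoning effect is influenced by the computational scale of the spatial analysis method, and the two jointly affect the results. For a regular grid of fixed scale, the number of adjacent polygons is the main cause of uncertainty in the results. These findings indicate that, for spatial analysis of polygon statistical data, the data should be re-aggregated onto regular grids, and a multi-scale analysis method should be chosen according to the needs of the application.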
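Re-aggregating building-level counts onto regular grids at several scales can be sketched as follows. This is a minimal illustration under assumptions: records are hypothetical `(x, y, population)` tuples in a planar coordinate system, cells are squares indexed by integer division, and the sample coordinates and cell sizes are invented for the example rather than taken from the paper.

```python
from collections import defaultdict


def aggregate_to_grid(records, cell_size):
    """Sum population per square grid cell.

    `records` is an iterable of (x, y, population) tuples; each cell is
    keyed by its (column, row) index at the given cell size.
    """
    grid = defaultdict(int)
    for x, y, pop in records:
        cell = (int(x // cell_size), int(y // cell_size))
        grid[cell] += pop
    return dict(grid)


# Hypothetical building records: (x, y, residents).
buildings = [(10, 10, 5), (60, 20, 3), (120, 130, 8), (180, 190, 2)]

# The same data aggregated at two scales gives different spatial patterns.
fine = aggregate_to_grid(buildings, cell_size=50)
coarse = aggregate_to_grid(buildings, cell_size=100)
```

The total population is the same at both scales, but the number of occupied cells and the per-cell values differ, which is the scale effect of the MAUP in miniature.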
Funding (ionospheric waves study): Supported by the National Natural Science Foundation of China under Grant Nos 41774158, 41474129 and 41704148, the Chinese Meridian Project, and the Youth Innovation Promotion Association of the Chinese Academy of Sciences under Grant No 2011324.
Funding (Binary star DataBase study): Supported by the Russian Foundation for Basic Research, projects 16–07–1162 and 18–02–00890. Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement.
Funding (Random Forests classification study): Supported by the National Natural Science Foundation of China under Grant Nos. 10473013, 90412016 and 10778724, and by the 863 project under Grant No. 2006AA01A120.
Funding (stellar spectra association rules study): Supported by the National Natural Science Foundation of China (Grant Nos. 61073145, 41140027 and 41210104028), the Shanxi Province Natural Science Foundation (No. 2012011011-4), the Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi, China (No. 20121011), and the Shanxi Province Science Foundation for Youths (No. 2012021015-4).
Funding (pulsar candidate identification study): Supported by the National Key Research and Development Program of China under grant No. 2018YFA0404603, and by the Operation, Maintenance and Upgrading Fund for Astronomical Telescopes and Facility Instruments, budgeted from the Ministry of Finance of China (MOF) and administered by the Chinese Academy of Sciences (CAS).