RNA-sequencing (RNA-seq), based on next-generation sequencing technologies, has rapidly become a standard and popular technology for transcriptome analysis. However, serious challenges still exist in analyzing and interpreting RNA-seq data. With the development of high-throughput sequencing technology, the sequencing depth of RNA-seq data has increased explosively. The biological processes underlying the transcriptome are more intricate and diverse than previously appreciated. Moreover, most organisms still have no available reference genome or have only incomplete genome annotations. Therefore, a large number of bioinformatics methods for various transcriptomics studies have been proposed to address these challenges effectively. This review comprehensively summarizes the various studies in RNA-seq data analysis and their corresponding analysis methods, including genome annotation, quality control and pre-processing of reads, read alignment, transcriptome assembly, gene and isoform expression quantification, differential expression analysis, data visualization, and other analyses.
DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells, and the main challenge it currently faces is analyzing the large amount of gene expression data generated. To address this, this paper employs a mixed-effects model to analyze gene expression data. For the data, 1176 genes from the white mouse gene expression dataset were chosen under two experimental conditions, pneumococcal infection and no infection, and a mixed-effects model was constructed. After preprocessing the gene chip information, the data were imported into the model, preliminary results were calculated, permutation tests were performed, and the preliminary results were biologically validated using GSEA. The final dataset consists of 20 groups of gene expression data from pneumococcal infection, which categorizes functionally related genes based on the similarity of their expression profiles, facilitating the study of genes with unknown functions.
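To make the modeling step concrete, here is a minimal sketch of fitting a random-intercept mixed-effects model to a toy long-format expression table with statsmodels; the column names, values, and the choice of grouping by gene are illustrative assumptions, not details taken from the paper.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Toy long-format expression table; all names and values are hypothetical.
df = pd.DataFrame({
    "gene":       ["g1"] * 4 + ["g2"] * 4,
    "condition":  ["control", "control", "infected", "infected"] * 2,
    "expression": [2.1, 2.3, 3.9, 4.2, 1.8, 2.0, 3.5, 3.8],
})

# Fixed effect: infection condition; random intercept: gene.
model = smf.mixedlm("expression ~ condition", df, groups=df["gene"])
result = model.fit()
print(result.summary())
```

The fixed-effect coefficient on condition then estimates the average infection effect, while the per-gene random intercept absorbs baseline expression differences between genes.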
With the advent of the big data era, real-time data analysis and decision-support systems have been recognized as essential tools for enhancing enterprise competitiveness and optimizing the decision-making process. This study aims to explore the development strategies of real-time data analysis and decision-support systems, and to analyze their application status and future development trends in various industries. The article first reviews the basic concepts and importance of real-time data analysis and decision-support systems, and then discusses in detail key technical aspects such as system architecture, data collection and processing, analysis methods, and visualization techniques.
This paper analyzes the advantages of legal digital currencies and explores their impact on bank big data practices. By examining bank big data collection and processing, it clarifies that legal digital currencies can enhance the efficiency of bank data processing, enrich data types, and strengthen data analysis and application capabilities. To meet future development needs, it is necessary to strengthen data collection management, enhance data processing capabilities, and innovate big data application models; the paper thereby provides references for bank big data practices, promoting the transformation and upgrading of the banking industry in the context of legal digital currencies.
With the accelerating intelligent transformation of energy systems, the monitoring of equipment operation status and the optimization of production processes in thermal power plants face the challenge of multi-source heterogeneous data integration. In view of the heterogeneous characteristics of physical sensor data, including the temperature, vibration, and pressure data generated by boilers, steam turbines, and other key equipment, as well as real-time working-condition data from the SCADA system, this paper proposes a multi-source heterogeneous data fusion and analysis platform for thermal power plants based on edge computing and deep learning. By constructing a multi-level fusion architecture, the platform adopts a dynamic weight allocation strategy and a 5D digital twin model to realize the collaborative analysis of physical sensor data, simulation results, and expert knowledge. The data fusion module combines Kalman filtering, wavelet transforms, and Bayesian estimation to solve the problems of time-series alignment and dimension differences in the data. Simulation results show that the data fusion accuracy can be improved to more than 98% and the calculation delay can be kept within 500 ms. The data analysis module integrates a Dymola simulation model and the AERMOD pollutant diffusion model, and supports cascade analysis of boiler combustion efficiency prediction and flue gas emission monitoring; the system response time is less than 2 s, and the data consistency verification accuracy reaches 99.5%.
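As one concrete piece of such a fusion chain, here is a minimal sketch of a scalar Kalman filter fusing two noisy temperature sensors; the sensor variances, the signal, and the random-walk process model are assumptions for illustration, not the platform's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
truth = 500 + 5 * np.sin(np.linspace(0, 6, 200))   # assumed boiler temperature
z1 = truth + rng.normal(0, 2.0, truth.size)        # sensor 1, variance 4
z2 = truth + rng.normal(0, 4.0, truth.size)        # sensor 2, variance 16

x, p = z1[0], 4.0   # state estimate and its variance
q = 0.05            # assumed process-noise variance (random-walk model)
est = []
for k in range(truth.size):
    p += q                                  # predict step
    for z, r in ((z1[k], 4.0), (z2[k], 16.0)):
        g = p / (p + r)                     # Kalman gain for this sensor
        x += g * (z - x)                    # sequential measurement update
        p *= 1 - g
    est.append(x)

rmse = np.sqrt(np.mean((np.array(est) - truth) ** 2))
print(f"fused RMSE: {rmse:.2f} (sensor noise stds were 2.0 and 4.0)")
```

Processing the two measurements sequentially within each step is equivalent to a joint update and weights each sensor by the inverse of its noise variance.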
The issue of strong noise has increasingly become a bottleneck restricting the precision and applicability of electromagnetic exploration methods. Suppressing noise and extracting effective electromagnetic response information against a strong noise background is a crucial scientific task. To solve the noise suppression problem of the controlled-source electromagnetic method in strong-interference areas, we propose a data-processing approach based on 2D k-means clustering in the complex plane. Exploiting the stability of the controlled-source signal response, cluster analysis is applied to classify the spectra of different sources and noises across multiple time segments. Identifying the power spectra with controlled-source characteristics helps improve the quality of the extracted controlled-source response. This paper presents the principle and workflow of the proposed algorithm and demonstrates its feasibility and effectiveness through synthetic and real data examples. The results show that, compared with the conventional robust denoising method, the clustering algorithm suppresses common noise more strongly, can identify high-quality signals, and improves the quality of preprocessed controlled-source electromagnetic data.
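A minimal sketch of the clustering idea follows: each time segment contributes one point in the complex plane, the real and imaginary parts of its spectrum at the source frequency, and 2D k-means separates stable source responses from noise-dominated segments. The signal, noise model, and two-cluster choice are illustrative assumptions, not the authors' exact workflow.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
fs, f0, nseg, seglen = 1024, 8.0, 60, 1024   # sample rate, source frequency
t = np.arange(seglen) / fs

points = []
for i in range(nseg):
    if i % 4 == 0:
        x = rng.normal(0, 2.0, seglen)            # noise-dominated segment
    else:
        x = np.cos(2 * np.pi * f0 * t + 0.3)      # stable source response
    spec = np.fft.rfft(x)
    k0 = int(f0 * seglen / fs)                    # FFT bin of source frequency
    points.append([spec[k0].real, spec[k0].imag])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)
print("segments per cluster:", np.bincount(labels))
```

Because the controlled-source response is phase-stable, clean segments collapse into a tight cluster in the complex plane while noisy segments scatter, which is what makes the classification possible.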
The fracture volume gradually changes with the depletion of fracture pressure during production. However, few flowback models are available so far that can estimate the fracture volume loss using pressure-transient and rate-transient data. The initial flowback involves producing back the fracturing fluid after hydraulic fracturing, while the second flowback involves producing back the preloading fluid injected into the parent wells before fracturing of child wells. The main objective of this research is to compare the initial and second flowback data to capture the changes in fracture volume after the production and preload processes. Such a comparison is useful for evaluating well performance and optimizing fracturing operations. We construct rate-normalized pressure (RNP) versus material balance time (MBT) diagnostic plots using both initial and second flowback data (FBi and FBs, respectively) of six multi-fractured horizontal wells completed in the Niobrara and Codell formations in the DJ Basin. In general, the slope of the RNP plot during the FBs period is higher than that during the FBi period, indicating a potential loss of fracture volume from the FBi to the FBs period. We estimate the changes in effective fracture volume (Vef) by analyzing the changes in the RNP slope and total compressibility between these two flowback periods. Vef during FBs is in general 3%-45% lower than that during FBi. We also compare the drive mechanisms for the two flowback periods by calculating the compaction-drive index (CDI), hydrocarbon-drive index (HDI), and water-drive index (WDI). The dominant drive mechanism during both flowback periods is CDI, but its contribution is reduced by 16% in the FBs period. This drop is generally compensated by a relatively higher HDI during this period. The loss of effective fracture volume might be attributed to pressure depletion in the fractures, which occurs during the production period and can extend over 800 days.
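The two diagnostic quantities are simple to compute: RNP normalizes drawdown by rate, RNP = (p_i - p_wf)/q, and MBT is cumulative volume over rate, MBT = Q/q. The sketch below builds both from a synthetic flowback record; all values and decline shapes are invented for illustration and are not the wells' data.

```python
import numpy as np

t = np.arange(1.0, 31.0)            # flowback day
q = 800 * t ** -0.5                 # assumed declining flowback rate, bbl/d
p_wf = 5000 - 40 * np.sqrt(t)       # assumed flowing bottomhole pressure, psi
p_i = 5200.0                        # assumed initial pressure, psi

Q = np.cumsum(q)                    # cumulative volume, bbl
rnp = (p_i - p_wf) / q              # rate-normalized pressure
mbt = Q / q                         # material balance time, days

# Under fracture-storage-dominated flow, RNP vs MBT is roughly linear and its
# slope scales inversely with effective fracture volume times compressibility.
slope = np.polyfit(mbt, rnp, 1)[0]
print(f"RNP-vs-MBT slope: {slope:.4f}")
```

Comparing this slope between the two flowback periods is what allows the change in effective fracture volume to be estimated.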
With the rapid development of the Internet and e-commerce, e-commerce platforms have accumulated huge amounts of user behavior data. The emergence of big data technology provides a powerful means for in-depth analysis of these data and insight into user behavior patterns and preferences. This paper elaborates on the application of big data technology in the analysis of user behavior on e-commerce platforms, including the technical methods of data collection, storage, processing, and analysis, as well as specific applications in the construction of user profiles, precision marketing, personalized recommendation, and user retention and churn analysis, and discusses the challenges faced in application and the corresponding countermeasures. Through the study of actual cases, it demonstrates the remarkable effectiveness of big data technology in enhancing the competitiveness of e-commerce platforms and the user experience.
Semantic communication (SemCom) aims to achieve high-fidelity information delivery with low communication consumption by guaranteeing only semantic accuracy. Nevertheless, semantic communication still suffers from unexpected channel volatility, so developing a re-transmission mechanism (e.g., hybrid automatic repeat request [HARQ]) becomes indispensable. In that regard, instead of discarding previously transmitted information, incremental knowledge-based HARQ (IK-HARQ) is deemed a more effective mechanism that can sufficiently utilize the information semantics. However, considering the possible existence of semantic ambiguity in image transmission, a simple bit-level cyclic redundancy check (CRC) might compromise the performance of IK-HARQ. Therefore, there emerges a strong incentive to revolutionize the CRC mechanism, thus more effectively reaping the benefits of both SemCom and HARQ. In this paper, built on top of Swin Transformer-based joint source-channel coding (JSCC) and IK-HARQ, we propose a semantic image transmission framework, SC-TDA-HARQ. In particular, different from the conventional CRC, we introduce a topological data analysis (TDA)-based error detection method, which digs out the inner topological and geometric information of images, to capture semantic information and determine the necessity of re-transmission. Extensive numerical results validate the effectiveness and efficiency of the proposed SC-TDA-HARQ framework, especially under limited-bandwidth conditions, and manifest the superiority of the TDA-based error detection method in image transmission.
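As a hedged illustration of the TDA idea (not the paper's implementation), the sketch below compares persistence diagrams of a source image and a channel-corrupted copy using gudhi cubical complexes, and requests a re-transmission when their bottleneck distance exceeds a threshold; the threshold value, the images, and the restriction to H1 features are all assumptions.

```python
import numpy as np
import gudhi

def h1_diagram(img):
    # Sublevel-set persistence of the image via a cubical complex.
    cc = gudhi.CubicalComplex(top_dimensional_cells=img)
    cc.persistence()
    return cc.persistence_intervals_in_dimension(1)

rng = np.random.default_rng(0)
src = rng.random((32, 32))                     # stand-in "transmitted" image
rcv = src + rng.normal(0, 0.3, src.shape)      # channel-corrupted copy

dist = gudhi.bottleneck_distance(h1_diagram(src), h1_diagram(rcv))
RETX_THRESHOLD = 0.1                           # hypothetical tuning knob
print(f"bottleneck distance: {dist:.3f}")
print("request re-transmission" if dist > RETX_THRESHOLD else "accept")
```

Unlike a bit-level CRC, a diagram distance of this kind degrades gradually with the severity of the corruption, which is what lets it ignore perturbations that do not change the image's topological structure.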
In the section 'Track decoding' of this article, one of the paragraphs was inadvertently missed out after the text '…shows the flow diagram of the Tr2-1121 track mode.' The missed paragraph is provided below.
Cervical cancer, a leading malignancy globally, poses a significant threat to women's health, with an estimated 604,000 new cases and 342,000 deaths reported in 2020 [1]. As cervical cancer is closely linked to human papillomavirus (HPV) infection, early detection relies on HPV screening; however, late-stage prognosis remains poor, underscoring the need for novel diagnostic and therapeutic targets [2].
The analysis of ancient genomes provides opportunities to explore human population history across both temporal and geographic dimensions (Haak et al., 2015; Wang et al., 2021, 2024). To enhance the accessibility and utility of these ancient genomic datasets, a range of databases and advanced statistical models have been developed, including the Allen Ancient DNA Resource (AADR) (Mallick et al., 2024) and AdmixTools (Patterson et al., 2012). While upstream processes such as sequencing and raw data processing have been streamlined by resources like the AADR, the downstream analysis of these datasets, encompassing population genetics inference and spatiotemporal interpretation, remains a significant challenge. The AADR provides a unified collection of published ancient DNA (aDNA) data, yet its file-based format and reliance on command-line tools, such as those in AdmixTools (Patterson et al., 2012), require advanced computational expertise for effective exploration and analysis. These requirements can present significant challenges for researchers lacking advanced computational expertise, limiting the accessibility and broader application of these valuable genomic resources.
Long noncoding RNAs (lncRNAs) have been increasingly implicated in a variety of human diseases, including autoimmune diseases (Wu et al., 2015), neurodegenerative diseases (Wapinski and Chang, 2011), and cancer (Huarte, 2015). Due to recent advances in next-generation sequencing technologies, tens of thousands of lncRNAs have been identified and annotated, and a number of them have been proven to be functional in diverse biological processes through various mechanisms.
There are some limitations when conventional methods are applied to analyze the massive amounts of seismic data acquired with high-density spatial sampling, since processors usually obtain the properties of raw data from common shot gathers or other datasets located at certain points or along lines. We propose a novel method in this paper to observe seismic data on time slices from spatial subsets. The composition of a spatial subset and the unique character of orthogonal or oblique subsets are described, and pre-stack subsets are shown by 3D visualization. In seismic data processing, spatial subsets can be used for the following purposes: (1) to check trace distribution uniformity and regularity; (2) to observe the main features of ground roll and linear noise; (3) to find abnormal traces from slices of datasets; and (4) to QC the results of pre-stack noise attenuation. The field data application shows that seismic data analysis in spatial subsets is an effective method that may lead to better discrimination among various wavefields and help us obtain more information.
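For instance, extracting a time slice from a spatial subset is a single indexing operation on the data volume; the sketch below also flags abnormal traces as amplitude outliers on the slice. The axis layout, sample interval, and outlier rule are illustrative assumptions rather than details from the paper.

```python
import numpy as np

n_inline, n_xline, n_t = 40, 50, 1000   # assumed subset geometry
dt = 0.002                              # assumed sample interval, s
volume = np.random.default_rng(0).normal(size=(n_inline, n_xline, n_t))

t_slice = volume[:, :, int(1.2 / dt)]   # amplitudes on the t = 1.2 s slice
bad = np.argwhere(np.abs(t_slice) > 4 * t_slice.std())
print(f"slice shape: {t_slice.shape}, flagged {len(bad)} anomalous cells")
```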
Objective To evaluate the environmental and technical efficiencies of China's industrial sectors and provide appropriate advice for policy makers in the context of rapid economic growth and the concurrent serious environmental damage caused by industrial pollutants. Methods A data envelopment analysis (DEA) framework crediting both the reduction of pollution outputs and the expansion of good outputs was designed as a model to compute the environmental efficiency of China's regional industrial systems. Results As shown by the geometric mean of environmental efficiency, if other inputs were held constant and good outputs were not to be improved, air pollution outputs could potentially be decreased by about 60% across China as a whole. Conclusion Both environmental and technical efficiencies have the potential to be greatly improved in China, which may provide some advice for policy makers.
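For reference, the classical input-oriented CCR DEA efficiency of each decision-making unit (DMU) can be computed with one small linear program per unit, as sketched below with scipy. This is the textbook model with only good outputs, not necessarily the study's exact formulation crediting pollution reduction, and the data are invented.

```python
import numpy as np
from scipy.optimize import linprog

X = np.array([[2.0, 3.0, 4.0, 5.0],    # inputs, shape (n_inputs, n_DMUs)
              [1.0, 2.0, 1.5, 3.0]])
Y = np.array([[3.0, 5.0, 4.5, 6.0]])   # good outputs, shape (n_outputs, n_DMUs)

m, n = X.shape
for j in range(n):
    # Variables: [theta, lambda_1 .. lambda_n]; minimize theta.
    c = np.r_[1.0, np.zeros(n)]
    A_in = np.c_[-X[:, [j]], X]                    # sum(lam*x) <= theta * x_j
    A_out = np.c_[np.zeros((Y.shape[0], 1)), -Y]   # sum(lam*y) >= y_j
    res = linprog(c, A_ub=np.r_[A_in, A_out],
                  b_ub=np.r_[np.zeros(m), -Y[:, j]],
                  bounds=[(None, None)] + [(0, None)] * n)
    print(f"DMU {j}: efficiency = {res.x[0]:.3f}")
```

An efficiency of 1 means the DMU lies on the frontier; a value below 1 is the proportion to which its inputs could be shrunk while keeping its outputs.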
Assimilation configurations have significant impacts on analysis results and subsequent forecasts. A squall line system that occurred on 23 April 2007 over southern China was used to investigate the impacts of the assimilation frequency of radar data on analyses and forecasts. A three-dimensional variational system was used to assimilate radial velocity data, and a cloud analysis system was used for reflectivity assimilation, with a 2-h assimilation window covering the initial stage of the squall line. Two radar reflectivity operators for cloud analysis, corresponding to single- and double-moment schemes, were used. In this study, we examined the sensitivity to assimilation frequency using 10-, 20-, 30-, and 60-min assimilation intervals. The results showed that the analysis fields were generally not consistent with model dynamics and microphysics; thus, model states, including dynamic and microphysical variables, required approximately 20 min to reach a new balance after data assimilation in all experiments. Moreover, a 20-min data assimilation interval generally produced better forecasts for both single- and double-moment schemes in terms of equitable threat and bias scores. We conclude that a higher data assimilation frequency can produce a more intense cold pool and rear inflow jets but does not necessarily lead to a better forecast.
The buildings and structures of mines were monitored automatically using modern surveying technology. Through analysis of the monitoring data, the deformation characteristics were determined from three aspects, namely points, lines, and regions, which play an important role in understanding the stability of buildings and structures. The stability and deformation of the monitoring points were analyzed; the time-series data of the monitoring points were denoised with wavelet analysis and Kalman filtering, and an exponential function combined with a periodic function was used to obtain an ideal deformation trend model for the monitoring points. By processing the monitoring data, analyzing the deformation trend, and recognizing the deformation regularity, the method can better serve mine safety production and decision-making.
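A minimal sketch of the trend-fitting step: a model combining an exponential term and a periodic term is fit to a synthetic settlement series with scipy's curve_fit. The functional form, parameter names, and data are illustrative assumptions rather than the paper's exact model.

```python
import numpy as np
from scipy.optimize import curve_fit

def trend(t, a, b, c, w, phi):
    # exponential (consolidation-type) term plus a periodic (seasonal) term
    return a * (1 - np.exp(-b * t)) + c * np.sin(w * t + phi)

t = np.linspace(0, 24, 120)                            # months of monitoring
y = trend(t, 30, 0.2, 2, 2 * np.pi / 12, 0.5)
y += np.random.default_rng(0).normal(0, 0.4, t.size)   # residual noise

p0 = [25, 0.1, 1, 2 * np.pi / 12, 0]                   # rough initial guesses
popt, _ = curve_fit(trend, t, y, p0=p0)
print("fitted [a, b, c, w, phi]:", np.round(popt, 3))
```

In practice the fit would be applied to the wavelet- and Kalman-denoised series, so the residuals reflect deformation anomalies rather than measurement noise.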
Simultaneous-source acquisition has been recognized as an economic and efficient acquisition method, but direct imaging of simultaneous-source data produces migration artifacts because of the interference of adjacent sources. To overcome this problem, we propose the regularized least-squares reverse time migration method (RLSRTM) using the singular spectrum analysis technique, which imposes sparseness constraints on the inverted model. Additionally, the difference-spectrum theory of singular values is presented so that RLSRTM can be implemented adaptively to eliminate the migration artifacts. With numerical tests on a flat-layer model and the Marmousi model, we validate the superior imaging quality, efficiency, and convergence of RLSRTM compared with LSRTM when dealing with simultaneous-source data, incomplete data, and noisy data.
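To illustrate the singular-spectrum ingredient on a 1D trace: embed the signal in a Hankel matrix, truncate small singular values, and average anti-diagonals back to a time series. The fixed rank cut-off below is a stand-in for the adaptive choice the paper derives from the difference spectrum of singular values; the signal and noise are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
n, L = 256, 64
t = np.arange(n)
clean = np.sin(0.1 * t) + 0.5 * np.sin(0.27 * t)
noisy = clean + rng.normal(0, 0.5, n)

# Hankel embedding: H[i, j] = noisy[i + j]
H = np.lib.stride_tricks.sliding_window_view(noisy, L).T
U, s, Vt = np.linalg.svd(H, full_matrices=False)
r = 4  # hand-picked rank; the paper selects this adaptively
Hr = (U[:, :r] * s[:r]) @ Vt[:r]

# Invert the embedding by averaging each anti-diagonal.
den = np.zeros(n)
cnt = np.zeros(n)
for i in range(Hr.shape[0]):
    for j in range(Hr.shape[1]):
        den[i + j] += Hr[i, j]
        cnt[i + j] += 1
den /= cnt
print(f"residual std before/after: {np.std(noisy - clean):.3f} / {np.std(den - clean):.3f}")
```

The low-rank truncation is what expresses the sparseness constraint: coherent signal concentrates in a few large singular values while crosstalk and noise spread across the small ones.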
Clustering is used to gain an intuition of the structures in the data. Most current clustering algorithms produce a clustering structure even on data that do not possess such structure; in these cases, the algorithms force a structure onto the data instead of discovering one. To avoid false structures in the relations of data, a novel clusterability assessment method called the density-based clusterability measure is proposed in this paper. It measures the prominence of clustering structure in the data to evaluate whether a cluster analysis could produce meaningful insight into the relationships in the data. This is especially useful for time-series data, since visualizing the structure in time-series data is hard. The performance of the clusterability measure is evaluated against several synthetic data sets and time-series data sets, which illustrate that the density-based clusterability measure can successfully indicate the clustering structure of time-series data.
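The paper's density-based measure is not reproduced here; as a standard stand-in for clusterability assessment, the sketch below computes the classical Hopkins statistic, where values near 1 suggest pronounced clustering structure and values near 0.5 suggest structureless, uniform data.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def hopkins(X, m=50, seed=0):
    """Hopkins statistic: ~1 for clustered data, ~0.5 for uniform data."""
    rng = np.random.default_rng(seed)
    nn = NearestNeighbors(n_neighbors=2).fit(X)
    sample = X[rng.choice(len(X), m, replace=False)]
    probes = rng.uniform(X.min(0), X.max(0), (m, X.shape[1]))
    w = nn.kneighbors(sample)[0][:, 1]   # distance to nearest *other* point
    u = nn.kneighbors(probes)[0][:, 0]   # distance from probe to nearest point
    return u.sum() / (u.sum() + w.sum())

rng = np.random.default_rng(1)
blobs = np.r_[rng.normal(0, 0.3, (100, 2)), rng.normal(5, 0.3, (100, 2))]
uniform = rng.uniform(0, 5, (200, 2))
print(f"two blobs: {hopkins(blobs):.2f}, uniform: {hopkins(uniform):.2f}")
```

A check of this kind would precede clustering: if the score is near 0.5, any partition the algorithm returns is likely a forced structure rather than a discovered one.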
In industrial process settings, principal component analysis (PCA) is a general method for data reconciliation. However, PCA is sometimes unsuitable for nonlinear feature analysis and is limited in its application to nonlinear industrial processes. Kernel PCA (KPCA) is an extension of PCA that can be used for nonlinear feature analysis. A nonlinear data reconciliation method based on KPCA is proposed. The basic idea of this method is that the original data are first mapped to a high-dimensional feature space by a nonlinear function and PCA is implemented in that feature space; nonlinear feature analysis is then carried out and the data are reconstructed using the kernel. The KPCA-based data reconciliation method is applied to a ternary distillation column. Simulation results show that this method can filter the noise in measurements of a nonlinear process, and the reconciled data represent the true information of the nonlinear process.
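A minimal sketch of the KPCA step using scikit-learn: map noisy measurements into feature space with an RBF kernel, keep the leading components, and reconstruct via the learned pre-image approximation. The synthetic three-variable process, the kernel parameters, and the use of sklearn's inverse_transform (rather than the original work's reconstruction method) are assumptions.

```python
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 300)
clean = np.c_[np.cos(t), np.sin(t), np.sin(2 * t)]   # nonlinear 3-variable process
noisy = clean + rng.normal(0, 0.1, clean.shape)      # measurement noise

kpca = KernelPCA(n_components=2, kernel="rbf", gamma=1.0,
                 fit_inverse_transform=True)
scores = kpca.fit_transform(noisy)            # nonlinear feature analysis
reconciled = kpca.inverse_transform(scores)   # reconstruction from components

def rmse(a):
    return np.sqrt(((a - clean) ** 2).mean())

print(f"RMSE noisy: {rmse(noisy):.3f}, reconciled: {rmse(reconciled):.3f}")
```

The reconstruction projects each measurement onto the low-dimensional nonlinear manifold the process actually follows, which is what filters the noise that linear PCA would miss.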