The aim of this article is to concisely describe the research projects that a selection of Italian universities is undertaking in the context of big data. Far from being exhaustive, this article aims to offer a sample of distinct applications that address the issue of managing huge amounts of data in Italy, collected in relation to diverse domains.
Fraudulent websites are an important carrier for telecom fraud. At present, criminals can use artificial intelligence generated content technology to quickly generate fraudulent website templates and build fraudulent websites in batches. Accurate identification of fraudulent websites will effectively reduce the risk of public victimization. Therefore, this study developed a fraudulent website template identification method based on DOM structure extraction of website fingerprint features, which addresses the problems of single-dimension identification, low accuracy, and insufficient generalization ability in current fraudulent website template identification. This method uses an improved SimHash algorithm to traverse the DOM tree of a webpage, extract website node features, calculate the weight of each node, and obtain the fingerprint feature vector of the website through dimensionality reduction. Finally, the random forest algorithm is used to optimize the training features for the best combination of parameters. This method automatically extracts fingerprint features from websites and identifies website template ownership based on these features. An experimental analysis showed that this method achieves a classification accuracy of 89.8% and demonstrates superior recognition performance.
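The fingerprinting step described in this abstract can be illustrated with a minimal sketch. It uses the standard SimHash construction, not the authors' improved variant, and every name in it (`DomFeatureCollector`, `dom_fingerprint`, `hamming`) is hypothetical. It assumes that each DOM node is represented by its tag path and weighted by 1/(depth + 1); the abstract does not specify the actual node features or weighting.

```python
import hashlib
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class DomFeatureCollector(HTMLParser):
    """Walks the DOM and records one (tag-path, weight) feature per element."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open tags, root first
        self.features = []   # list of (path, weight)

    def handle_starttag(self, tag, attrs):
        path = "/".join(self.stack + [tag])
        depth = len(self.stack)
        # Assumed weighting: nodes closer to the root count more.
        self.features.append((path, 1.0 / (depth + 1)))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open tag.
            while self.stack and self.stack.pop() != tag:
                pass


def simhash(features, bits=64):
    """Standard weighted SimHash over (token, weight) pairs; bits must be <= 128."""
    totals = [0.0] * bits
    for token, weight in features:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            totals[i] += weight if (h >> i) & 1 else -weight
    return sum(1 << i for i in range(bits) if totals[i] > 0)


def dom_fingerprint(html):
    """Returns a 64-bit structural fingerprint of an HTML document."""
    collector = DomFeatureCollector()
    collector.feed(html)
    collector.close()
    return simhash(collector.features)


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")
```

Because only the tag structure is hashed, two pages built from the same template but with different text and image URLs get the same fingerprint under this sketch, and a small Hamming distance suggests a shared template. A classifier such as a random forest would then be trained on features derived from these fingerprints.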
Developing low-carbon and efficient power systems is critical for energy security in the context of global warming. We address this issue by focusing on the productivity impact of a decarbonization policy in China's thermal power sector, namely the "Constructing Large Units and Restricting Small Ones" (CLRS) initiative. Utilizing a resource misallocation model, we construct a new theoretical framework to distinguish between technical and allocative efficiency and analyze productivity using plant-level data. The results indicate that the CLRS policy has significantly improved the allocative and technical efficiency of China's coal-fired power sector, thereby ensuring power security. The enhanced efficiency is driven primarily by the closure of outdated and highly distorted small coal-fired units, which have been replaced by technologically advanced large units. The policy's effects are most pronounced in large-scale power plants and those with high coal combustion efficiency. Furthermore, a comparison of power plants' productivity distribution before and after policy implementation reveals that the CLRS policy not only enhances capital productivity in the coal-fired power sector but also makes labor allocation more rational. Our findings have important policy implications for developing countries with regard to building efficient and stable power systems amid climate change.
Objective: To predict the potential targets of Qingfu Juanbi Decoction (青附蠲痹汤, QFJBD) in treating rheumatoid arthritis (RA) using an improved Transformer model and to investigate the network pharmacological mechanisms underlying QFJBD's therapeutic effects on RA. Methods: First, a traditional Chinese medicine herb-target interaction (TCMHTI) model, based on an improved Transformer, was constructed to predict herb-target interactions. The performance of the TCMHTI model was evaluated against baseline models using three metrics: area under the receiver operating characteristic curve (AUC), precision-recall curve (PRC), and accuracy. Subsequently, a protein-protein interaction (PPI) network was built based on the predicted targets, with core targets identified as the top nine nodes ranked by degree value. Gene Ontology (GO) functional and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed using both the targets predicted by TCMHTI and the targets identified through the classical network pharmacology method, and the results were compared. Finally, the core targets predicted by TCMHTI were validated through molecular docking and literature review. Results: The TCMHTI model achieved an AUC of 0.883, a PRC of 0.849, and an accuracy of 0.818, predicting 49 potential targets for QFJBD in RA treatment. Nine core targets were identified: tumor necrosis factor (TNF)-α, interleukin (IL)-1β, IL-6, IL-10, IL-17A, cluster of differentiation 40 (CD40), cytotoxic T-lymphocyte-associated protein 4 (CTLA4), IL-4, and signal transducer and activator of transcription 3 (STAT3). The enrichment analysis demonstrated that the 49 targets predicted by the TCMHTI model enriched more pathways directly associated with RA, whereas classical network pharmacology identified 64 targets but enriched pathways showing weaker relevance to RA. Molecular docking demonstrated that the active molecules in QFJBD exhibit favorable binding energy with RA targets, while the literature review further indicated that QFJBD can treat RA through the 9 core targets. Conclusion: The TCMHTI model demonstrated greater accuracy than traditional network pharmacology methods, suggesting that QFJBD exerts therapeutic effects on RA by regulating targets such as TNF-α, IL-1β, and IL-6, as well as multiple signaling pathways. This study provides a novel framework for bridging traditional herbal knowledge with precision medicine, offering actionable insights for developing targeted TCM therapies.
This study identified the relationship between tropical cyclone (TC) activity and extreme Pacific–Japan (PJ) teleconnection patterns in August and September. In the East China Sea (ECS) and Mariana Islands (MI) regions, where the edge of the western North Pacific subtropical high (WNPSH) is located, approximately 60%–75% of TCs migrate to Far East Asian countries. A significant positive correlation existed between the frequency of northward migration of TCs and PJ patterns, since the TC frequency in the ECS and MI regions was significantly higher in the positive phase than in the negative phase. In the positive phase, the main reason for the large number of TCs was the location and strength of the monsoon trough: the strong and northeastward-shifted monsoon trough leads to more TCs in the ECS and MI regions. Other large-scale environments associated with TC formation also favored TC genesis around the ECS and MI regions. The higher power dissipation index (PDI) during the positive PJ phase can potentially lead to significant impacts in the Far East Asian countries. These characteristics were more notable in August than in September.
The fraudulent website image is a vital information carrier for telecom fraud. The efficient and precise recognition of fraudulent website images is critical to combating and dealing with fraudulent websites. Current research on image recognition of fraudulent websites is mainly carried out at the level of image feature extraction and similarity study, which has such disadvantages as difficulty in obtaining image data, insufficient image analysis, and a single identification type. This study develops a model based on the entropy method for image leader decision and Inception-v3 transfer learning to address these disadvantages. The data processing part of the model uses a breadth-first search crawler to capture the image data. Then, the information in the images is evaluated with the entropy method, image weights are assigned, and the image leader is selected. In model training and prediction, the transfer learning of the Inception-v3 model is introduced into image recognition of fraudulent websites. Using selected image leaders to train the model, multiple types of fraudulent websites are identified with high accuracy. The experiment shows that this model has superior accuracy in recognizing images on fraudulent websites compared to other current models.
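The entropy-based selection of an "image leader" can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: it assumes each image is reduced to a flat list of grayscale pixel values, that its information content is scored by the Shannon entropy of the pixel histogram, and that weights are the entropies normalized to sum to 1. The function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy (in bits) of the pixel-value histogram."""
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def image_weights(images):
    """Maps image name -> weight, proportional to its entropy (assumed scheme)."""
    entropies = {name: shannon_entropy(px) for name, px in images.items()}
    total = sum(entropies.values())
    if total == 0:
        # All images carry no information; fall back to equal weights.
        return {name: 1.0 / len(images) for name in images}
    return {name: e / total for name, e in entropies.items()}


def pick_leader(images):
    """Returns the name of the image with the largest weight."""
    weights = image_weights(images)
    return max(weights, key=weights.get)
```

For example, given a blank image, a two-tone logo, and a banner with many distinct pixel values, the banner has the highest entropy and would be chosen as the leader to represent the site when training the classifier.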
Point-of-interest (POI) recommendations in location-based social networks (LBSNs) have developed rapidly by incorporating feature information and deep learning methods. However, most studies have failed to accurately reflect different users' preferences, in particular, the short-term preferences of inactive users. To better learn user preferences, in this study, we propose a long-short-term-preference-based adaptive successive POI recommendation (LSTP-ASR) method by combining trajectory sequence processing, long short-term preference learning, and spatiotemporal context. First, the check-in trajectory sequences are adaptively divided into recent and historical sequences according to a dynamic time window. Subsequently, an adaptive filling strategy is used to expand the recent check-in sequences of users with inactive check-in behavior using those of similar active users. We further propose an adaptive learning model to accurately extract long short-term preferences of users to establish an efficient successive POI recommendation system. A spatiotemporal-context-based recurrent neural network and a temporal-context-based long short-term memory network are used to model the users' recent and historical check-in trajectory sequences, respectively. Extensive experiments on the Foursquare and Gowalla datasets reveal that the proposed method outperforms several other baseline methods in terms of three evaluation metrics. More specifically, LSTP-ASR outperforms the previously best baseline method (RTPM) with a 17.15% and 20.62% average improvement on the Foursquare and Gowalla datasets in terms of the Fβ metric, respectively.
Chinese Medicine (CM) has been widely used as an important avenue for disease prevention and treatment in China, especially in the form of CM prescriptions combining sets of herbs to address patients' symptoms and syndromes. However, the selection and compatibility of herbs are complex and abstract due to intrinsic relationships between herbal properties and their overall functions. Network analysis is applied to demonstrate the complex relationships between individual herbal efficacy and the overall function of CM prescriptions. To illustrate their connections and correlations, prescription function (PF), prescription herb (PH), and herbal efficacy (HE) intra-networks are proposed based on CM theory to identify relationships between herbs and prescriptions. These three networks are then connected by PF-PH and PH-HE interlayer networks, which incorporate herb dosage, to form a multidimensional heterogeneous network, the Prescription-Herb-Function Network (PHFN). To illustrate its application, the PHFN is applied to 112 classic prescriptions from Treatise on Exogenous Febrile and Miscellaneous Diseases. The resulting PHFN includes 146 functions in the PF intra-network, 89 herbs in the PH intra-network, and 163 herbal efficacies in the HE intra-network. The results show that herb pairs with synergistic actions have stronger relevance, such as licorice-cassia twig, licorice-Chinese date, and fresh ginger-Chinese date. The integration of dosage into the network helps to indicate the main herbs for cluster analysis and automatic formulation. PHFN also reveals the internal relationships between the functions of prescriptions and the efficacies of their constituent herbs.
The collection and extraction of tongue images has always been an important part of intelligent tongue diagnosis. At present, the collection of tongue images generally needs to be completed in a sealed environment with stable lighting, which is not conducive to the wider adoption of tongue imaging and intelligent tongue diagnosis. In response to this problem, a new algorithm named GCYTD (GELU-CA-YOLO Tongue Detection) is proposed to quickly detect and locate the tongue in a natural environment, which can greatly reduce the restrictions on the tongue image collection environment. The algorithm is based on the YOLO (You Only Look Once) V4-tiny network model to detect the tongue. First, the GELU (Gaussian Error Linear Units) activation function is integrated into the model to improve the training speed and reduce the number of model parameters; then, the CA (Coordinate Attention) mechanism is integrated into the model to enhance the detection precision and improve the fault tolerance of the model. Experimental results show that, compared with other classical algorithms, the GCYTD algorithm performs better on tongue images of all types in terms of training speed, tongue detection speed, and detection precision. The lighter model facilitates deploying the tongue detection model on small mobile terminals.
Traditional data collection methods such as remote sensing and field surveying often fail to offer timely information during or immediately following disaster events. Social sensing enables all citizens to become part of a large sensor network, which is low cost, more comprehensive, and always broadcasting situational awareness information. However, data collected with social sensing is often massive, heterogeneous, noisy, unreliable in some respects, comes in continuous streams, and often lacks geospatial reference information. Together, these issues represent a grand challenge toward fully leveraging social sensing for emergency management decision making under extreme duress. Meanwhile, big data computing methods and technologies such as high-performance computing, deep learning, and multi-source data fusion become critical components of using social sensing to understand the impact of and response to disaster events in a timely fashion. This special issue captures recent advancements in leveraging social sensing and big data computing for supporting disaster management. Specifically analyzed within these papers are some of the promises and pitfalls of social sensing data for disaster-relevant information extraction, impact area assessment, population mapping, occurrence patterns, geographical disparities in social media use, and inclusion in larger decision support systems.
COVID-19 has crippled the restaurant industry, a crucial socioeconomic sector that contributes immensely to the global economy. However, the current literature has rarely quantified the effect of COVID-19 on restaurant visitation and revenue at different spatial scales, or its relationship with the neighborhood characteristics of customers' origins. Based on Point of Interest (POI) measures derived from SafeGraph data providing mobility records of 45 million cell phone users in the US, our study takes Lower Manhattan, New York City, as a pilot study and aims to examine 1) the change in restaurant visitations and revenue between the periods before and after the COVID-19 outbreak, 2) the areas where restaurant customers live, and 3) the association between the neighborhood characteristics of these areas and lost customers. By doing so, we provide a geographic information system-based analytical framework integrating big data mining, web crawling techniques, and spatial-economic modelling. Our analytical framework can be implemented to estimate the broader effect of COVID-19 on other industries and can be extended for financial monitoring in response to future pandemics or public emergencies.
With the number of connected devices increasing rapidly, access latency increases drastically in the edge cloud environment. Massive numbers of time-constrained, data-intensive mobile applications require efficient replication strategies to decrease retrieval time. However, the determination of replicas is not reasonable in many previous works, which incurs high response delay. To this end, a correlation-aware replica prefetching (CRP) strategy based on the file correlation principle is proposed, which can prefetch the files with high access probability. The key is to determine and obtain the implicit high-value files effectively, which has a significant impact on the performance of CRP. To accelerate the acquisition of implicit high-value files, an access rule management method based on consistent hashing is proposed, and the storage and query mechanisms for access rules based on an adjacency list storage structure are further presented. The theoretical analysis and simulation results corroborate that, compared to other schemes, CRP shortens average response time by over 4.8%, improves average hit ratio by over 4.2%, reduces the amount of transmitted data by over 8.3%, and maintains replication frequency at a reasonable level.
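Consistent hashing, which the abstract names as the basis of its access rule management, can be sketched in a few lines. This is a generic textbook ring with virtual nodes, not the CRP authors' implementation. The class name, the node names, and the choice of 64 virtual nodes per physical node are assumptions made for illustration.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Generic consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), replicas=64):
        self.replicas = replicas  # virtual nodes per physical node (assumed value)
        self._keys = []           # sorted ring positions
        self._owner = {}          # ring position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        # 64-bit position derived from SHA-1; any stable hash would do.
        return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:8], "big")

    def add_node(self, node):
        for i in range(self.replicas):
            pos = self._hash(f"{node}#{i}")
            if pos not in self._owner:
                bisect.insort(self._keys, pos)
            self._owner[pos] = node

    def remove_node(self, node):
        for i in range(self.replicas):
            pos = self._hash(f"{node}#{i}")
            if self._owner.get(pos) == node:
                del self._owner[pos]
                self._keys.remove(pos)

    def lookup(self, key):
        """Returns the node responsible for key: first position clockwise."""
        if not self._keys:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._keys, self._hash(key)) % len(self._keys)
        return self._owner[self._keys[idx]]
```

The property that makes this attractive for locating access rules is that adding or removing an edge node only remaps the keys adjacent to that node's positions on the ring, so most rules stay where they are and lookups remain a single binary search.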
Essential ncRNA is a type of ncRNA that is indispensable for the survival of organisms. Although essential ncRNAs cannot encode proteins, they are as important as essential coding genes in biology. They have a wide variety of applications, such as antimicrobial target discovery, minimal genome construction, and evolution analysis. At present, the number of species for which essential ncRNAs have been determined at the whole-genome scale is still very small because the traditional methods are time-consuming, laborious, and costly. In addition, traditional experimental methods are limited by the organisms, as less than 1% of bacteria can be cultured in the laboratory. Therefore, it is important and necessary to develop theories and methods for the recognition of essential non-coding RNA. In this paper, we present a novel method for predicting essential ncRNA by using both compositional and derivative features of ncRNA sequences calculated by information theory. The method was developed with a Support Vector Machine (SVM). The accuracy of the method was evaluated through cross-species cross-validation and found to be between 0.69 and 0.81. This shows that the features we selected perform well for the prediction of essential ncRNA using SVM. Thus, the method can be applied to discovering essential ncRNAs in bacteria.
The transformation from authoritative to user-generated data landscapes has garnered considerable attention, notably with the proliferation of crowdsourced geospatial data. Facilitated by advancements in digital technology and high-speed communication, this paradigm shift has democratized data collection, obliterating traditional barriers between data producers and users. While previous literature has compartmentalized this subject into distinct platforms and application domains, this review offers a holistic examination of crowdsourced geospatial data. Employing a narrative review approach due to the interdisciplinary nature of the topic, we investigate both human and Earth observations through crowdsourced initiatives. This review categorizes the diverse applications of these data and rigorously examines specific platforms and paradigms pertinent to data collection. Furthermore, it addresses salient challenges, encompassing data quality, inherent biases, and ethical dimensions. We contend that this thorough analysis will serve as an invaluable scholarly resource, encapsulating the current state of the art in crowdsourced geospatial data and offering strategic directions for future interdisciplinary research and applications across various sectors.
The perceived visual quality of fruits and vegetables plays a central role in the choices made by retail customers. Machine learning (ML) approaches based on image analysis have been recently proposed to overcome the poor efficiency and subjectivity of human visual evaluation as well as the cost and destructiveness of physical and chemical methods that measure internal indicators. In this paper, we propose an ML method based on Random Forests for estimating the chlorophyll and ammonia contents (considered, in the literature, reliable indicators of product freshness) from images of fresh-cut rocket leaves. Our approach copes with specific issues raised by (i) the non-uniform distributions of ammonia and chlorophyll values and (ii) the need to provide insights into the features that produce a particular model outcome, aiming to enhance its trustworthiness. Our experiments, performed on real images of fresh-cut rocket leaves, showed that the proposed approach significantly outperforms 7 competitor methods, obtaining an improvement in the RSE results of 6.6% for the prediction of ammonia and of 10.4% for the prediction of chlorophyll over its best competitor. Moreover, a specific analysis of the explainability of the predictions showed that the learned models are based on reasonable features, supporting their acceptance in real-world applications.
Objective: This prospective observational cohort real-world study evaluates and compares the efficacy and prognosis of ultrasound (US) and gene-based microwave ablation (MWA) and surgical treatment in patients with low-risk papillary thyroid carcinoma (PTC), emphasizing the influence of genetic mutations on low-risk patient selection. Background: MWA, a minimally invasive technique, is increasingly recognized in the management of PTC. While traditional criteria for ablation focus on tumor size, number, and location, the impact of genetic mutations on treatment efficacy remains underexplored. Methods: A total of 201 patients with low-risk PTC without metastasis were prospectively enrolled. All patients underwent US and next-generation sequencing to confirm low-risk status. Patients chose either ablation or surgery and were monitored until November 2024. Efficacy and complications were assessed using thyroid US and contrast-enhanced US. Results: The median follow-up of this study was 12 months. There was no significant difference between the ablation group (3.0%) and the surgery group (1.0%) in disease-free survival (P=0.360). However, the surgery group exhibited a significantly higher complication rate, particularly for temporary hypoparathyroidism (P<0.001). Ablation offers notable advantages, including shorter treatment duration, faster recovery, less intraoperative blood loss, and reduced costs (P<0.001), while maintaining favorable safety and comparable efficacy. Conclusions: For patients with low-risk genetic mutations, ablation provides comparable efficacy and disease-free survival to surgery, with significant benefits in safety, recovery, and overall cost. Guided by US and next-generation sequencing, precise patient selection enhances the potential of ablation as a promising, minimally invasive alternative to surgery in the management of low-risk PTC.
Geospatial social media (GSM) data has been increasingly used in public health due to its rich, timely, and accessible spatial information, particularly in infectious disease research. This review synthesized 86 research articles that use GSM data in infectious diseases published between December 2013 and March 2022. These articles cover 12 infectious disease types ranging from respiratory infectious diseases to sexually transmitted diseases, with spatial levels ranging from neighborhood and county to state and country. We categorized these studies into three major infectious disease research domains: surveillance, explanation, and prediction. With the assistance of advanced computing, statistical, and spatial methods, GSM data has been widely and deeply applied to these domains, particularly in the surveillance and explanation domains. We further identified four knowledge gaps in terms of contextual information use, application scopes, spatiotemporal dimension, and data limitations, and proposed innovation opportunities for future research. Our findings will contribute to a better understanding of using GSM data in infectious disease studies and provide insights into strategies for using GSM data more effectively in future research.
The COVID-19 pandemic poses unprecedented challenges around the world. Many studies have applied mobility data to explore spatiotemporal trends over time, investigate associations with other variables, and predict or simulate the spread of COVID-19. Our objective was to provide a comprehensive overview of human mobility open data to guide researchers and policymakers in conducting data-driven evaluations and decision-making for the COVID-19 pandemic and other infectious disease outbreaks. We summarized the mobility data usage in COVID-19 studies by reviewing recent publications on COVID-19 and human mobility from a data-oriented perspective. We identified three major sources of mobility data: public transit systems, mobile operators, and mobile phone applications. Four approaches have been commonly used to estimate human mobility: public transit-based flow, social activity patterns, index-based mobility data, and social media-derived mobility data. We compared mobility datasets' characteristics by assessing data privacy, quality, space–time coverage, high-performance data storage and processing, and accessibility. We also present challenges and future directions of using mobility data. This review makes a pivotal contribution to understanding the use of and access to human mobility data in the COVID-19 pandemic and future disease outbreaks.
Overeating is a risk factor and a management challenge in adiposity-based chronic disease (ABCD). Acupuncture has shown high safety and reliable clinical evidence in addressing overeating, and it is a promising non-pharmacological intervention. However, the mechanism underlying its effects has not been sufficiently summarized. The addiction model offers a framework to elucidate the mechanism of this aberrant eating behavior and provides novel perspectives and breakthrough points for optimizing clinical acupuncture strategies in ABCD management. In this paper, through analysis of relevant domestic and international findings, the characteristics of overeating based on food addiction (FA), the relationship between overeating and ABCD, and the potential mechanisms of acupuncture's effect on FA are reviewed and summarized. These mechanisms include the adaptive balance of transmitters and hormones, functional networks, periphery-central connection, and cross-system interaction. In future studies, well-developed addiction research methods should be adopted to deepen the exploration of the mechanism of acupuncture's effect, addiction medicine should be leveraged to break down the cognitive barriers surrounding acupuncture's role in mind-body regulation for ABCD treatment, and the prevention and treatment of overeating via acupuncture should be organically integrated into multidisciplinary management strategies.
Currently, most existing inductive relation prediction approaches are based on subgraph structures, with subgraph features extracted using graph neural networks to predict relations. However, subgraphs may contain disconnected regions, which usually represent different semantic ranges. Because not all semantic information about the regions is helpful in relation prediction, we propose a relation prediction model based on a disentangled subgraph structure and implement a feature updating approach based on relevant semantic aggregation. To indirectly achieve the disentangled subgraph structure from a semantic perspective, entity features are mapped into different semantic spaces, and the related semantics are aggregated and updated in each semantic space. The disentangled model can focus on features having higher semantic relevance in the prediction, thus addressing a problem with existing approaches, which ignore the semantic differences in different subgraph structures. Furthermore, using a gated recurrent neural network, this model enhances the features of entities by sorting them by distance and extracting the path information in the subgraphs. Experiments show that when there are numerous disconnected regions in the subgraph, our model outperforms existing mainstream models in terms of both Area Under the Curve-Precision-Recall (AUC-PR) and Hits@10. The experiments also show that semantic differences in the knowledge graph can be effectively distinguished, verifying the effectiveness of this method.
Abstract: The aim of this article is to describe concisely the research projects that a selection of Italian universities is undertaking in the context of big data. Far from being exhaustive, this article offers a sample of distinct applications that address the issue of managing huge amounts of data in Italy, collected in relation to diverse domains.
Funding: This research is a phased achievement of the National Social Science Fund of China (23BGL272).
Abstract: Fraudulent websites are an important carrier for telecom fraud. At present, criminals can use AI-generated content technology to quickly generate fraudulent website templates and build fraudulent websites in batches. Accurate identification of fraudulent websites will effectively reduce the risk of public victimization. Therefore, this study developed a fraudulent website template identification method based on DOM structure extraction of website fingerprint features, which addresses the problems of single-dimension identification, low accuracy, and insufficient generalization ability in current identification of fraudulent website templates. This method uses an improved SimHash algorithm to traverse the DOM tree of a webpage, extract website node features, calculate the weight of each node, and obtain the fingerprint feature vector of the website through dimensionality reduction. Finally, the random forest algorithm is used to optimize the training features for the best combination of parameters. This method automatically extracts fingerprint features from websites and identifies website template ownership based on these features. An experimental analysis showed that this method achieves a classification accuracy of 89.8% and demonstrates superior recognition.
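To make the fingerprinting idea concrete, the following is a minimal sketch of a structural fingerprint built with the standard SimHash scheme, not the improved variant the abstract refers to. The token choice (one tag path per DOM element), the frequency weighting, and the names `TagPathCollector`, `simhash`, `dom_fingerprint`, and `hamming` are all assumptions made for illustration; the dimensionality reduction and random forest stages are not shown.

```python
import hashlib
from html.parser import HTMLParser


class TagPathCollector(HTMLParser):
    """Collect one token per element: its tag path from the root, e.g. 'html/body/div'."""

    # Elements that never have a closing tag, so they are not pushed on the stack.
    VOID_TAGS = {"br", "img", "input", "meta", "link", "hr"}

    def __init__(self):
        super().__init__()
        self.stack = []
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        self.tokens.append("/".join(self.stack + [tag]))
        if tag not in self.VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop back to the matching open tag (tolerates unclosed children).
            while self.stack.pop() != tag:
                pass


def simhash(tokens, bits=64):
    """Standard SimHash: each token votes on every bit, weighted by its frequency."""
    counts = {}
    for tok in tokens:
        counts[tok] = counts.get(tok, 0) + 1
    votes = [0] * bits
    for tok, weight in counts.items():
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # 64-bit token hash
        for i in range(bits):
            votes[i] += weight if (h >> i) & 1 else -weight
    return sum(1 << i for i in range(bits) if votes[i] > 0)


def dom_fingerprint(html):
    """Fingerprint a page by its DOM structure only (text content is ignored)."""
    collector = TagPathCollector()
    collector.feed(html)
    return simhash(collector.tokens)


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")
```

Because only tag paths are hashed, two pages generated from the same template but filled with different text receive the same fingerprint in this sketch, and a small Hamming distance between fingerprints suggests structurally similar pages.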
Funding: Supported by the Chengdu Philosophy and Social Science Planning Project [Grant No. 2022C05] and the National Natural Science Foundation of China [Grant No. 71904158].
Abstract: Developing low-carbon and efficient power systems is critical for energy security in the global warming context. We address this issue by focusing on the productivity impact of a decarbonization policy in China’s thermal power sector, namely the “Constructing Large Units and Restricting Small Ones” (CLRS) initiative. Utilizing a resource misallocation model, we construct a new theoretical framework to distinguish between technical and allocative efficiency and analyze productivity using plant-level data. The results indicate that the CLRS policy has significantly improved the allocative and technical efficiency of China’s coal-fired power sector, thereby ensuring power security. The enhanced efficiency is driven primarily by the closure of outdated and highly distorted small coal-fired units, which have been replaced by technologically advanced large units. The policy’s effects are most pronounced in large-scale power plants and those with high coal combustion efficiency. Furthermore, a comparison of power plants’ productivity distribution before and after policy implementation reveals that the CLRS policy not only enhances capital productivity in the coal-fired power sector but also makes labor allocation more rational. Our findings have important policy implications for developing countries building efficient and stable power systems amid climate change.
Funding: General Program of the National Natural Science Foundation of China (82474352); Natural Science Foundation of Hunan Province (2023JJ60124).
Abstract: Objective: To predict the potential targets of Qingfu Juanbi Decoction (青附蠲痹汤, QFJBD) in treating rheumatoid arthritis (RA) using an improved Transformer model and investigate the network pharmacological mechanisms underlying QFJBD’s therapeutic effects on RA. Methods: First, a traditional Chinese medicine herb-target interaction (TCMHTI) model was constructed, based on an improved Transformer, to predict herb-target interactions. The performance of the TCMHTI model was evaluated against baseline models using three metrics: area under the receiver operating characteristic curve (AUC), precision-recall curve (PRC), and accuracy. Subsequently, a protein-protein interaction (PPI) network was built based on the predicted targets, with core targets identified as the top nine nodes ranked by degree values. Gene Ontology (GO) functional and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed on the targets predicted by TCMHTI and on the targets identified through the network pharmacology method, and the results were compared. Finally, the core targets predicted by TCMHTI were validated through molecular docking and literature review. Results: The TCMHTI model achieved an AUC of 0.883, PRC of 0.849, and accuracy of 0.818, predicting 49 potential targets for QFJBD in RA treatment. Nine core targets were identified: tumor necrosis factor (TNF)-α, interleukin (IL)-1β, IL-6, IL-10, IL-17A, cluster of differentiation 40 (CD40), cytotoxic T-lymphocyte-associated protein 4 (CTLA4), IL-4, and signal transducer and activator of transcription 3 (STAT3). The enrichment analysis demonstrated that the TCMHTI model predicted 49 targets and enriched more pathways directly associated with RA, whereas classical network pharmacology identified 64 targets but enriched pathways showing weaker relevance to RA. Molecular docking demonstrated that the active molecules in QFJBD exhibit favorable binding energy with RA targets, while literature research further revealed that QFJBD can treat RA through the nine core targets. Conclusion: The TCMHTI model demonstrated greater accuracy than traditional network pharmacology methods, suggesting that QFJBD exerts therapeutic effects on RA by regulating targets such as TNF-α, IL-1β, and IL-6, as well as multiple signaling pathways. This study provides a novel framework for bridging traditional herbal knowledge with precision medicine, offering actionable insights for developing targeted TCM therapies against diseases.
Funding: Korea Meteorological Administration Research and Development Program under Grant KMI (Grant No. RS-2023-00241809); conducted under the framework of the research and development program of the Korea Institute of Energy Research (C5-2422).
Abstract: This study identified the relationship between tropical cyclone (TC) activity and extreme Pacific–Japan (PJ) teleconnection patterns in August and September. In the East China Sea (ECS) and Mariana Islands (MI) regions, where the edge of the western North Pacific subtropical high (WNPSH) is located, approximately 60%–75% of TCs migrate to Far East Asian countries. A significant positive correlation existed between the frequency of northward migration of TCs and PJ patterns, since the TC frequency in the ECS and MI regions was significantly higher in the positive phase than in the negative phase. In the positive phase, the main reason for the large number of TCs was the monsoon trough’s location and strength: the strong and northeastward-shifted monsoon trough leads to more TCs in the ECS and MI regions. Other large-scale environments associated with TC formation also favored TC genesis around the ECS and MI regions. The higher power dissipation index (PDI) during the positive PJ phase can potentially lead to significant impacts in the Far East Asian countries. These characteristics were more notable in August than in September.
Funding: Supported by the National Social Science Fund of China (23BGL272).
Abstract: The fraudulent website image is a vital information carrier for telecom fraud. The efficient and precise recognition of fraudulent website images is critical to combating and dealing with fraudulent websites. Current research on image recognition of fraudulent websites is mainly carried out at the level of image feature extraction and similarity study, which has such disadvantages as difficulty in obtaining image data, insufficient image analysis, and a single identification type. This study develops a model based on the entropy method for image leader decision and Inception-v3 transfer learning to address these disadvantages. The data processing part of the model uses a breadth-first search crawler to capture the image data. Then, the information in the images is evaluated with the entropy method, image weights are assigned, and the image leader is selected. In model training and prediction, the transfer learning of the Inception-v3 model is introduced into image recognition of fraudulent websites. Using the selected image leaders to train the model, multiple types of fraudulent websites are identified with high accuracy. The experiment shows that this model recognizes images on fraudulent websites more accurately than other current models.
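The entropy-based selection step can be illustrated with the textbook entropy weight method. This is a minimal sketch under stated assumptions: the abstract does not say which image measurements are evaluated, so the indicator columns, the scoring rule, and the names `entropy_weights` and `pick_leader` are hypothetical and are not taken from the paper.

```python
import math


def column_proportions(matrix):
    """Normalize each indicator column so its entries sum to 1."""
    totals = [sum(row[j] for row in matrix) for j in range(len(matrix[0]))]
    return [[row[j] / totals[j] for j in range(len(row))] for row in matrix]


def entropy_weights(matrix):
    """Entropy weight method.

    matrix: rows are images, columns are non-negative indicators
    (at least two rows, every column sum positive, columns not all constant).
    Indicators whose values vary more across images receive larger weights.
    """
    props = column_proportions(matrix)
    k = 1.0 / math.log(len(matrix))
    diversities = []
    for j in range(len(matrix[0])):
        entropy = -k * sum(p[j] * math.log(p[j]) for p in props if p[j] > 0)
        diversities.append(1.0 - entropy)
    total = sum(diversities)
    return [d / total for d in diversities]


def pick_leader(matrix):
    """Return the row index of the image with the highest weighted score."""
    weights = entropy_weights(matrix)
    props = column_proportions(matrix)
    scores = [sum(w * p for w, p in zip(weights, row)) for row in props]
    return max(range(len(scores)), key=scores.__getitem__)
```

An indicator that takes the same value for every image has maximal entropy and therefore carries (almost) zero weight, so the leader is decided by the indicators that actually discriminate between images.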
Funding: National Natural Science Foundation of China (Grant Nos. 62102347, 62376041, 62172352); Guangdong Ocean University Research Fund Project (Grant No. 060302102304).
Abstract: Point-of-interest (POI) recommendations in location-based social networks (LBSNs) have developed rapidly by incorporating feature information and deep learning methods. However, most studies have failed to accurately reflect different users’ preferences, in particular the short-term preferences of inactive users. To better learn user preferences, in this study, we propose a long-short-term-preference-based adaptive successive POI recommendation (LSTP-ASR) method by combining trajectory sequence processing, long short-term preference learning, and spatiotemporal context. First, the check-in trajectory sequences are adaptively divided into recent and historical sequences according to a dynamic time window. Subsequently, an adaptive filling strategy is used to expand the recent check-in sequences of users with inactive check-in behavior using those of similar active users. We further propose an adaptive learning model to accurately extract long short-term preferences of users to establish an efficient successive POI recommendation system. A spatiotemporal-context-based recurrent neural network and a temporal-context-based long short-term memory network are used to model the users’ recent and historical check-in trajectory sequences, respectively. Extensive experiments on the Foursquare and Gowalla datasets reveal that the proposed method outperforms several other baseline methods in terms of three evaluation metrics. More specifically, LSTP-ASR outperforms the previously best baseline method (RTPM) with 17.15% and 20.62% average improvements in the Fβ metric on the Foursquare and Gowalla datasets, respectively.
Abstract: Chinese Medicine (CM) has been widely used as an important avenue for disease prevention and treatment in China, especially in the form of CM prescriptions that combine sets of herbs to address patients’ symptoms and syndromes. However, the selection and compatibility of herbs are complex and abstract due to intrinsic relationships between herbal properties and their overall functions. Network analysis is applied to demonstrate the complex relationships between individual herbal efficacy and the overall function of CM prescriptions. To illustrate their connections and correlations, prescription function (PF), prescription herb (PH), and herbal efficacy (HE) intra-networks are proposed based on CM theory to identify relationships between herbs and prescriptions. These three networks are then connected by PF-PH and PH-HE interlayer networks that adopt herb dosage, forming a multidimensional heterogeneous network, the Prescription-Herb-Function Network (PHFN). The network is applied to 112 classic prescriptions from the Treatise on Exogenous Febrile and Miscellaneous Diseases to illustrate the application of PHFN. The constructed PHFN includes 146 functions in the PF intra-network, 89 herbs in the PH intra-network, and 163 herbal efficacies in the HE intra-network. The results show that herb pairs with synergistic actions have stronger relevance, such as licorice-cassia twig, licorice-Chinese date, and fresh ginger-Chinese date. Integrating dosage into the network helps to indicate the main herbs for cluster analysis and automatic formulation. PHFN also reveals the internal relationships between the functions of prescriptions and the efficacies of their constituent herbs.
Funding: This work was supported by the Key Research and Development Plan of China (No. 2017YFC1703306), the Key Project of Education Department in Hunan Province (No. 18A227), and the Key Project of Traditional Chinese Medicine Scientific Research Plan in Hunan Province (2020002).
Abstract: The collection and extraction of tongue images has always been an important part of intelligent tongue diagnosis. At present, the collection of tongue images generally needs to be completed in a sealed environment with stable lighting, which hinders the wider adoption of tongue imaging and intelligent tongue diagnosis. In response to this problem, a new algorithm named GCYTD (GELU-CA-YOLO Tongue Detection) is proposed to quickly detect and locate the tongue in a natural environment, which can greatly relax the restrictions on the tongue image collection environment. The algorithm is based on the YOLO (You Only Look Once) V4-tiny network model to detect the tongue. First, the GELU (Gaussian Error Linear Units) activation function is integrated into the model to improve the training speed and reduce the number of model parameters; then, the CA (Coordinate Attention) mechanism is integrated into the model to enhance the detection precision and improve the fault tolerance of the model. Experimental results show that, compared with other classical algorithms, the GCYTD algorithm performs better on tongue images of all types in terms of training speed, tongue detection speed, and detection precision. The lighter model makes it easier to deploy tongue detection on small mobile terminals.
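For readers unfamiliar with the GELU activation named above, its standard definition is GELU(x) = x · Φ(x), where Φ is the standard normal cumulative distribution function. The sketch below only evaluates that definition and its common tanh approximation on scalars; it is not the GCYTD network, and the function names `gelu` and `gelu_tanh` are chosen here for illustration.

```python
import math


def gelu(x):
    """Exact GELU: x times the standard normal CDF, written with the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x):
    """Widely used tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))
```

Unlike ReLU, GELU is smooth around zero and lets small negative inputs pass with reduced magnitude, while behaving like the identity for large positive inputs and like zero for large negative inputs.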
Abstract: Traditional data collection methods such as remote sensing and field surveying often fail to offer timely information during or immediately following disaster events. Social sensing enables all citizens to become part of a large sensor network, which is low cost, more comprehensive, and always broadcasting situational awareness information. However, data collected with social sensing is often massive, heterogeneous, noisy, and unreliable in some respects; it arrives in continuous streams and often lacks geospatial reference information. Together, these issues represent a grand challenge toward fully leveraging social sensing for emergency management decision making under extreme duress. Meanwhile, big data computing methods and technologies such as high-performance computing, deep learning, and multi-source data fusion have become critical components of using social sensing to understand the impact of and response to disaster events in a timely fashion. This special issue captures recent advancements in leveraging social sensing and big data computing for supporting disaster management. Specifically, these papers analyze some of the promises and pitfalls of social sensing data for disaster-relevant information extraction, impact area assessment, population mapping, occurrence patterns, geographical disparities in social media use, and inclusion in larger decision support systems.
Funding: This study was funded by the National Science Foundation (Grant #2028791).
Abstract: COVID-19 has crippled the restaurant industry, a crucial socioeconomic sector that contributes immensely to the global economy. However, the current literature has rarely quantified the effect of COVID-19 on restaurant visitation and revenue at different spatial scales, or its relationship with the neighborhood characteristics of customers’ origins. Based on Point of Interest (POI) measures derived from SafeGraph data providing mobility records of 45 million cell phone users in the US, our study takes Lower Manhattan, New York City, as the pilot study area and aims to examine 1) the change in restaurant visitation and revenue in the periods before and after the COVID-19 outbreak, 2) the areas where restaurant customers live, and 3) the association between the neighborhood characteristics of these areas and lost customers. By doing so, we provide a geographic information system-based analytical framework integrating big data mining, web crawling techniques, and spatial-economic modelling. Our analytical framework can be implemented to estimate the broader effect of COVID-19 on other industries and can be extended for financial monitoring in response to future pandemics or public emergencies.
Funding: National Natural Science Foundation of China (No. 61602525, No. 61572525); Research Foundation of Education Bureau of Hunan Province of China (No. 19C1391); Natural Science Foundation of Hunan Province of China (No. 2020JJ5775).
Abstract: With the number of connected devices increasing rapidly, access latency increases drastically in the edge cloud environment. Massive numbers of time-constrained and data-intensive mobile applications require efficient replication strategies to decrease retrieval time. However, the determination of replicas is not reasonable in many previous works, which incurs high response delay. To this end, a correlation-aware replica prefetching (CRP) strategy based on the file correlation principle is proposed, which can prefetch the files with high access probability. The key is to determine and obtain the implicit high-value files effectively, which has a significant impact on the performance of CRP. To accelerate the acquisition of implicit high-value files, an access rule management method based on consistent hashing is proposed, and storage and query mechanisms for access rules based on an adjacency list storage structure are further presented. The theoretical analysis and simulation results corroborate that, compared to other schemes, CRP shortens average response time by over 4.8%, improves average hit ratio by over 4.2%, reduces the amount of transmitted data by over 8.3%, and maintains replication frequency at a reasonable level.
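The consistent hashing technique mentioned above can be sketched as a generic hash ring with virtual nodes. This is an illustration of the general technique only: how CRP actually keys and stores its access rules is not described in the abstract, so the class `HashRing`, the node names, and the rule-key format used in the usage note are assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Generic consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._positions = []  # sorted hash positions on the ring
        self._owner = {}      # hash position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node):
        """Place `vnodes` virtual copies of the node on the ring."""
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            self._owner[pos] = node
            bisect.insort(self._positions, pos)

    def remove(self, node):
        """Drop all virtual copies of the node; other positions stay untouched."""
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        """Return the node owning the first ring position clockwise from the key."""
        idx = bisect.bisect_right(self._positions, self._hash(key))
        return self._owner[self._positions[idx % len(self._positions)]]
```

With a ring such as `HashRing(["edge-a", "edge-b", "edge-c"])`, a hypothetical rule key like `"fileA->fileB"` always resolves to the same node, and removing one node only remaps the keys that node owned, which is the property that makes consistent hashing attractive when nodes join or leave.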
Funding: This study was jointly funded by the National Natural Science Foundation of China (61803112, 32160151) and the Science and Technology Foundation of Guizhou Province (2019-2811).
Abstract: Essential ncRNA is a type of ncRNA that is indispensable for the survival of organisms. Although essential ncRNAs cannot encode proteins, they are as important as essential coding genes in biology. They have a wide variety of applications, such as antimicrobial target discovery, minimal genome construction, and evolution analysis. At present, the number of species for which essential ncRNAs have been determined at the whole-genome scale is still very small because the traditional methods are time-consuming, laborious, and costly. In addition, traditional experimental methods are limited by the organisms, as less than 1% of bacteria can be cultured in the laboratory. Therefore, it is important and necessary to develop theories and methods for the recognition of essential non-coding RNA. In this paper, we present a novel method for predicting essential ncRNA by using both compositional and derivative features of ncRNA sequences calculated by information theory. The method was developed with a Support Vector Machine (SVM). The accuracy of the method was evaluated through cross-species cross-validation and found to be between 0.69 and 0.81. This shows that the features we selected perform well for the prediction of essential ncRNA using SVM. Thus, the method can be applied to discovering essential ncRNAs in bacteria.
Funding: Supported by the Faculty Startup Fund of the College of Arts and Sciences at Emory University.
Abstract: The transformation from authoritative to user-generated data landscapes has garnered considerable attention, notably with the proliferation of crowdsourced geospatial data. Facilitated by advancements in digital technology and high-speed communication, this paradigm shift has democratized data collection, obliterating traditional barriers between data producers and users. While previous literature has compartmentalized this subject into distinct platforms and application domains, this review offers a holistic examination of crowdsourced geospatial data. Employing a narrative review approach due to the interdisciplinary nature of the topic, we investigate both human and Earth observations through crowdsourced initiatives. This review categorizes the diverse applications of these data and rigorously examines specific platforms and paradigms pertinent to data collection. Furthermore, it addresses salient challenges, encompassing data quality, inherent biases, and ethical dimensions. We contend that this thorough analysis will serve as an invaluable scholarly resource, encapsulating the current state of the art in crowdsourced geospatial data and offering strategic directions for future interdisciplinary research and applications across various sectors.
Funding: Supported by the project FAIR - Future AI Research (PE00000013), Spoke 6 - Symbiotic AI, under the NRRP MUR program funded by the NextGenerationEU, and by the project Prin 2017 “SUS&LOW - Sustaining low-impact practices in horticulture through non-destructive approach to provide more information on fresh produce history and quality” (grant number: 201785Z5H9) from the Italian Ministry of University and Research.
Abstract: The perceived visual quality of fruits and vegetables plays a central role in the choices made by retail customers. Machine learning (ML) approaches based on image analysis have recently been proposed to overcome the poor efficiency and subjectivity of human visual evaluation as well as the expense and destructiveness of physical and chemical methods that measure internal indicators. In this paper, we propose an ML method based on Random Forests for estimating the chlorophyll and ammonia contents (considered, in the literature, reliable indicators of product freshness) from images of fresh-cut rocket leaves. Our approach copes with specific issues raised by (i) the non-uniform distributions of ammonia and chlorophyll values and (ii) the need to provide insights into the features that produce a particular model outcome, aiming to enhance its trustworthiness. Our experiments, performed on real images of fresh-cut rocket leaves, showed that the proposed approach significantly outperforms 7 competitor methods, improving the RSE results by 6.6% for the prediction of ammonia and by 10.4% for the prediction of chlorophyll over its best competitor. Moreover, a specific analysis of the explainability of the predictions showed that the learned models are based on reasonable features, supporting their acceptance in real-world applications.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 82402294), the Guangzhou Science and Technology Project (Grant Nos. 2023A03J0722 and 2024A03J1194), the Scientific Research Launch Project of Sun Yat-Sen Memorial Hospital (Grant No. SYSYQH-II-2024–07), the Guangdong Science and Technology Department (Grant No. 2024B1212030002), and the Guangdong Yiyang Healthcare Charity Foundation (Grant No. 2023CSM003).
Abstract: Objective: This prospective observational cohort real-world study evaluates and compares the efficacy and prognosis of ultrasound (US) and gene-based microwave ablation (MWA) and surgical treatment in patients with low-risk papillary thyroid carcinoma (PTC), emphasizing the influence of genetic mutations on low-risk patient selection. Background: MWA, a minimally invasive technique, is increasingly recognized in the management of PTC. While traditional criteria for ablation focus on tumor size, number, and location, the impact of genetic mutations on treatment efficacy remains underexplored. Methods: A total of 201 patients with low-risk PTC without metastasis were prospectively enrolled. All patients underwent US and next-generation sequencing to confirm low-risk status. Patients chose either ablation or surgery and were monitored until November 2024. Efficacy and complications were assessed using thyroid US and contrast-enhanced US. Results: The median follow-up of this study was 12 months. There was no significant difference between the ablation group (3.0%) and the surgery group (1.0%) in disease-free survival (P=0.360). However, the surgery group exhibited a significantly higher complication rate, particularly for temporary hypoparathyroidism (P<0.001). Ablation offers notable advantages, including shorter treatment duration, faster recovery, less intraoperative blood loss, and reduced costs (P<0.001), while maintaining favorable safety and comparable efficiency. Conclusions: For patients with low-risk genetic mutations, ablation provides comparable efficacy and disease-free survival to surgery, with significant benefits in safety, recovery, and overall cost. Guided by US and next-generation sequencing, precise patient selection enhances the potential of ablation as a promising, minimally invasive alternative to surgery in the management of low-risk PTC.
Funding: Supported by the National Institutes of Health [grant number 3R01AI127203-04S1] and NSF [grant number 2028791].
Abstract: Geospatial social media (GSM) data has been increasingly used in public health due to its rich, timely, and accessible spatial information, particularly in infectious disease research. This review synthesized 86 research articles that use GSM data in infectious diseases published between December 2013 and March 2022. These articles cover 12 infectious disease types, ranging from respiratory infectious diseases to sexually transmitted diseases, with spatial levels varying across neighborhood, county, state, and country. We categorized these studies into three major infectious disease research domains: surveillance, explanation, and prediction. With the assistance of advanced computing, statistical, and spatial methods, GSM data has been widely and deeply applied to these domains, particularly surveillance and explanation. We further identified four knowledge gaps in terms of contextual information use, application scopes, spatiotemporal dimension, and data limitations, and proposed innovation opportunities for future research. Our findings will contribute to a better understanding of using GSM data in infectious disease studies and provide insights into strategies for using GSM data more effectively in future research.
Funding: Supported by the NSF [National Science Foundation] under grants 1841403, 2027540, and 2028791.
Abstract: The COVID-19 pandemic poses unprecedented challenges around the world. Many studies have applied mobility data to explore spatiotemporal trends, investigate associations with other variables, and predict or simulate the spread of COVID-19. Our objective was to provide a comprehensive overview of human mobility open data to guide researchers and policymakers in conducting data-driven evaluations and decision-making for the COVID-19 pandemic and other infectious disease outbreaks. We summarized the mobility data usage in COVID-19 studies by reviewing recent publications on COVID-19 and human mobility from a data-oriented perspective. We identified three major sources of mobility data: public transit systems, mobile operators, and mobile phone applications. Four approaches have been commonly used to estimate human mobility: public transit-based flow, social activity patterns, index-based mobility data, and social media-derived mobility data. We compared mobility datasets’ characteristics by assessing data privacy, quality, space–time coverage, high-performance data storage and processing, and accessibility. We also present challenges and future directions of using mobility data. This review contributes to understanding the use of and access to human mobility data in the COVID-19 pandemic and future disease outbreaks.
Funding: Supported by National Natural Science Foundation of China: 82405556, 82174527; the National Traditional Chinese Medicine Advantage Specialized Department Construction Project: Yue TCM [2024] No. 2; the China Postdoctoral Science Foundation General Project: 2024M750464; Guangdong Basic and Applied Basic Research Foundation: 2023A1515110682; Jin-Xiong Lao Guangdong Provincial Famous Chinese Medicine Practitioner Heritage Studio: Yue TCM Office Document [2023] No. 108; Jin-Xiong Lao Foshan City Famous Chinese Medicine Practitioner Heritage Studio: Foshan Health Office Document [2022] No. 106; Foshan City's 14th 5-Year Plan Chinese Medicine Key Specialized Construction Projects: Foshan Health Office Document [2020] No. 15.
Abstract: Overeating is a risk factor and a management challenge in adiposity-based chronic disease (ABCD). Acupuncture has shown high safety and reliable clinical evidence in addressing overeating, and it is a promising non-pharmacological intervention. However, the mechanism underlying its effects has not been sufficiently summarized. The addiction model offers a framework to elucidate the mechanism of this aberrant eating behavior and provides novel perspectives and breakthrough points for optimizing clinical acupuncture strategies in ABCD management. In this paper, through analysis of relevant domestic and international findings, the characteristics of overeating based on food addiction (FA), the relationship between overeating and ABCD, and the potential effect mechanisms of acupuncture for FA are reviewed and summarized; these mechanisms include the adaptive balance of transmitters and hormones, functional networks, the periphery-central connection, and cross-system interaction. In future studies, well-developed addiction research methods should be adopted to deepen the exploration of the mechanism of the acupuncture effect, addiction medicine should be leveraged to break down the cognitive barriers surrounding acupuncture’s role in mind-body regulation for ABCD treatment, and the prevention and treatment of overeating via acupuncture should be organically integrated into multidisciplinary management strategies.
Funding: Supported by the National Natural Science Foundation of China (No. U19A2059) and the 2022 Research Foundation of Chengdu Textile College (No. X22032161).
Abstract: Currently, most existing inductive relation prediction approaches are based on subgraph structures, with subgraph features extracted using graph neural networks to predict relations. However, subgraphs may contain disconnected regions, which usually represent different semantic ranges. Because not all semantic information about the regions is helpful in relation prediction, we propose a relation prediction model based on a disentangled subgraph structure and implement a feature updating approach based on relevant semantic aggregation. To indirectly achieve the disentangled subgraph structure from a semantic perspective, entity features are mapped into different semantic spaces and updated by aggregating related semantics in each semantic space. The disentangled model can focus on features having higher semantic relevance in the prediction, thus addressing a problem with existing approaches, which ignore the semantic differences in different subgraph structures. Furthermore, using a gated recurrent neural network, this model enhances the features of entities by sorting them by distance and extracting the path information in the subgraphs. Experiments show that when there are numerous disconnected regions in the subgraph, our model outperforms existing mainstream models in terms of both Area Under the Curve-Precision-Recall (AUC-PR) and Hits@10. The experiments also show that semantic differences in the knowledge graph can be effectively distinguished, verifying the effectiveness of this method.
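As a reminder of what the Hits@10 metric reported above measures, the following minimal sketch computes Hits@k from the rank of each true candidate among its negatives. The function names `rank_of_true` and `hits_at_k` and the tie-handling rule (ties count in favor of the true candidate) are assumptions for illustration; the paper's exact evaluation protocol is not given in the abstract.

```python
def rank_of_true(true_score, negative_scores):
    """1-based rank of the true candidate; only strictly higher negatives outrank it."""
    return 1 + sum(1 for s in negative_scores if s > true_score)


def hits_at_k(ranks, k=10):
    """Fraction of test cases whose true candidate is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)
```

For example, ranks of 1, 3, 12, and 50 over four test triples give a Hits@10 of 0.5, since two of the four true candidates fall within the top ten.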