52 articles found
Deep Learning Approaches for Battery Capacity and State of Charge Estimation with the NASA B0005 Dataset
1
Authors: Zeyang Zhou, Zachary James Ryan, Utkarsh Sharma, Tran Tien Anh, Shashi Mehrotra, Angelo Greco, Jason West, Mukesh Prasad. Computers, Materials & Continua, 2025, No. 6, pp. 4795-4813 (19 pages)
Accurate capacity and State of Charge (SOC) estimation are crucial for ensuring the safety and longevity of lithium-ion batteries in electric vehicles. This study examines ten machine learning architectures, including Deep Belief Network (DBN), Bidirectional Recurrent Neural Network (BiDirRNN), Gated Recurrent Unit (GRU), and others, using the NASA B0005 dataset of 591,458 instances. Results indicate that DBN excels in capacity estimation, achieving orders-of-magnitude lower error values and explaining over 99.97% of the predicted variable's variance. When computational efficiency is paramount, the Deep Neural Network (DNN) offers a strong alternative, delivering near-competitive accuracy with significantly reduced prediction times. The GRU achieves the best overall performance for SOC estimation, attaining an R^2 of 0.9999, while the BiDirRNN provides a marginally lower error at a slightly higher computational speed. In contrast, Convolutional Neural Networks (CNN) and Radial Basis Function Networks (RBFN) exhibit relatively high error rates, making them less viable for real-world battery management. Analyses of error distributions reveal that the top-performing models cluster most predictions within tight bounds, limiting the risk of overcharging or deep discharging. These findings highlight the trade-off between accuracy and computational overhead, offering valuable guidance for battery management system (BMS) designers seeking optimal performance under constrained resources. Future work may further explore advanced data augmentation and domain adaptation techniques to enhance these models' robustness in diverse operating conditions.
Keywords: Battery capacity estimation; state of charge; deep learning; prediction efficiency; energy storage systems
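For readers comparing the reported figures, the "variance explained" and R^2 values quoted in the abstract above presumably refer to the standard coefficient of determination. The symbols below (y_i for measured values, \hat{y}_i for model predictions, \bar{y} for the mean of the measurements) are notation assumed here, not taken from the paper:

```latex
R^2 \;=\; 1 \;-\; \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2},
\qquad
\bar{y} \;=\; \frac{1}{n} \sum_{i=1}^{n} y_i
```

Under this definition, an R^2 of 0.9999 means the residual sum of squares is about 0.01% of the total sum of squares of the measured values.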
A Conceptual and Computational Framework for Aspect-Based Collaborative Filtering Recommender Systems (Cited by: 1)
2
Authors: Samin Poudel, Marwan Bikdash. Journal of Computer and Communications, 2023, No. 3, pp. 110-130 (21 pages)
Many datasets in E-commerce have rich information about items and the users who purchase or rate them. This information can enable advanced machine learning algorithms to extract and assign user sentiments to various aspects of the items, thus leading to more sophisticated and justifiable recommendations. However, most Collaborative Filtering (CF) techniques rely mainly on the overall preferences of users toward items. There is also a lack of a conceptual and computational framework that enables an understandable aspect-based AI approach to recommending items to users. In this paper, we propose concepts and computational tools that can sharpen the logic of recommendations and that rely on users' sentiments along various aspects of items. These concepts include: the sentiment of a user towards a specific aspect of a specific item, the emphasis that a given user places on a specific aspect in general, the popularity and controversy of an aspect among groups of users, clusters of users emphasizing a given aspect, clusters of items that are popular among a group of users, and so forth. The framework introduced in this study is developed in terms of user emphasis, aspect popularity, aspect controversy, and user and item similarity. Towards this end, we introduce the Aspect-Based Collaborative Filtering Toolbox (ABCFT), where the tools are all developed based on the three-index sentiment tensor, with the indices being the user, item, and aspect. The toolbox computes solutions to the questions alluded to above. We illustrate the methodology using a hotel review dataset having around 6000 users, 400 hotels and 6 aspects.
Keywords: Recommender System; Collaborative Filtering; Aspect based recommendation; Recommendation System Framework; Aspect Sentiments
Cloud detection from visual band of satellite image based on variance of fractal dimension
3
Authors: TIAN Pingfang, GUANG Qiang, LIU Xing. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2019, No. 3, pp. 485-491 (7 pages)
The cloud cover ratio is a very important factor affecting the quality of a satellite image; therefore, cloud detection from satellite images is a necessary step in assessing image quality. This study develops cloud detection from the visual band of a satellite image. Firstly, we consider the differences between cloud and ground, including the high grey level, good continuity of grey level, area of the cloud region, and the variance of local fractal dimension (VLFD) of the cloud region. A single cloud region detection method is proposed. Secondly, by introducing a reference satellite image and comparing the variance in the dimensions corresponding to the reference and the tested images, a method that detects multiple cloud regions and determines whether or not cloud exists in an image is described. The performance of the proposed method is demonstrated on several Ikonos images.
Keywords: cloud detection; visual image; satellite image; variance of local fractal dimension (VLFD)
Conceptual Method of Temperature Sensation in Bionic Hand by Extraordinary Perceptual Phenomenon
4
Authors: Saeed Bahrami Moqadam, Ahamd Saleh Asheghabadi, Farzaneh Norouzi, Hamed Jafarzadeh, Ali Khosroabadi, Afshin Alagheband, Ghazal Bangash, Negar Morovatdar, Jing Xu. Journal of Bionic Engineering (SCIE, EI, CSCD), 2021, No. 6, pp. 1344-1357 (14 pages)
The lack of temperature sensation in myoelectric prosthetic hands limits the daily activities of amputees. To this end, a noninvasive temperature sensation method is proposed to train amputees to sense temperature with psychophysical sensory substitution. In this study, 22 healthy participants took part alongside 5 amputee participants. The study lasted 31 days, with five test steps according to the Leitner technique. An adjustable temperature mug and a Peltier were used to change the temperature of the water/phantom digits to induce temperature in participants. Also, to isolate the surroundings and show colors, a Virtual Reality (VR) glass was employed. The statistical results are based on participants' responses collected by questionnaire. Using Chi-square tests, it is concluded that participants answer the experiment significantly correctly using the Leitner technique (P value < 0.05). Also, by applying the "Repeated Measures ANOVA", it is noticed that the time of numbness felt by participants differed significantly (P value < 0.001). Participants could remember the lowest and highest temperatures significantly better than other temperatures (P value < 0.001); furthermore, the well-trained amputee participant, practically using the prosthesis, could identify an object's temperature at a rate of 72.58% after experiencing the color temperature only once.
Keywords: Temperature perceptions; Leitner learning technique; Classical conditioning; Psychophysical sensory substitution; Extraordinary perceptual phenomenon; Bionic hand prosthesis
Solving the BBMB Equation in Shallow Water Waves via Space-Time MQ-RBF Collocation
5
Authors: Hongwei Ma, Yingqian Tian, Fuzhang Wang, Quanfu Lou, Lijuan Yu. Computer Modeling in Engineering & Sciences, 2025, No. 9, pp. 3419-3432 (14 pages)
This study introduces a novel single-layer meshless method, the space-time collocation method based on multiquadric radial basis functions (MQ-RBF), for solving the Benjamin-Bona-Mahony-Burgers (BBMB) equation. By reconstructing the time variable as a space variable, this method establishes a combined space-time structure that can eliminate the two-step computational process required in traditional grid methods. By introducing shape-parameter-optimized MQ-RBF, high-precision discretization of the nonlinear, dispersive, and dissipative terms in the BBMB equation is achieved. The numerical experiment section validates the effectiveness of the proposed method through three benchmark examples. This method shows significant advantages in computational efficiency, providing a new numerical tool for engineering applications in fields such as shallow water wave dynamics.
Keywords: Numerical method; BBMB equation; meshless method; radial basis function; nonlinear partial differential equation
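As background for the abstract above, the textbook multiquadric basis function and a space-time collocation ansatz can be written as follows. This is a sketch of the standard form only: the shape parameter c, the collocation points (x_j, t_j), the coefficients \lambda_j, and the plain Euclidean space-time distance are notation assumed here, and the paper's exact formulation may differ.

```latex
% Multiquadric radial basis function with shape parameter c > 0
\varphi(r) = \sqrt{r^2 + c^2}

% Treating time as an extra spatial coordinate, the distance between
% a point (x, t) and a collocation point (x_j, t_j) is
r_j(x, t) = \sqrt{(x - x_j)^2 + (t - t_j)^2}

% The unknown solution is then approximated in one step over the
% whole space-time domain by
u(x, t) \approx \sum_{j=1}^{N} \lambda_j \, \varphi\bigl(r_j(x, t)\bigr)
```

Substituting this expansion into the governing equation and its initial and boundary conditions at the N collocation points yields a single algebraic system for the \lambda_j, which is how time stepping is avoided.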
Spatial data intelligence and city metaverse:A review
6
Authors: Xiaofeng Meng, Yong Li, Ke Liu, Yu Liu, Bin Yang, Xuan Song, Guoqiong Liao, Senzhang Wang, Ziqiang Yu, Longbiao Chen, Xiao Pan, Yuming Lin. Fundamental Research, 2025, No. 3, pp. 1169-1193 (25 pages)
Spatial Data Intelligence (SDI) encompasses acquiring, storing, analyzing, mining, and visualizing spatial data to gain insights into the physical world and uncover valuable knowledge. These understandings and knowledge play a crucial role in connecting physical and virtual realms, such as in developing a City Metaverse (CM) aimed at enhancing and optimizing modern urban environments. The advancement of CM holds immense potential to benefit urban dwellers, making research on SDI an increasingly prominent area of focus. This paper contributes by organizing the relevant research and technologies within a coherent framework. Firstly, we identify SDI technologies capable of collecting real-world information to construct a virtual CM. Subsequently, we delve into the technologies that can be integrated with SDI to facilitate interaction with and management of actual cities from the virtual perspective. Additionally, we emphasize the effectiveness and potential of these methods in practical applications. Lastly, we conclude our survey by discussing emerging challenges associated with technological progress, the industrial chain, legal and regulatory aspects, and ethical and moral considerations.
Keywords: Spatial data intelligence; City metaverse; Virtual-real interaction; Application prospect; Smart city
Training of Engineering-oriented Mindset with Application of an IUR Collaboration Project
7
Authors: Liang Dong, Shouqiang Liu, Qingzhen Xu. Computer Education (计算机教育), 2026, No. 3, pp. 61-66 (6 pages)
Software engineering has been embraced by almost all industries to promote work efficiency, improve user experience, or cut costs. In line with this, education in software engineering should be made more adaptable to meet the needs of industry. The industry-university-research (IUR) collaboration project, which was initially designed to reinforce the association between universities and enterprises, brings added value to this end. In this paper, an IUR collaboration project on tele-rehabilitation is presented as an example of education practice, with emphasis on ways of analyzing users' needs, converting users' needs into infrastructure design, decomposing a project into tasks, etc. The project has been used both as student assignments and as case studies in software engineering courses, where students were motivated to deal with real medical problems from an engineering perspective. It was shown that introducing the IUR collaboration project helped students build an engineering-oriented mindset in addition to improving their R&D ability in software engineering.
Keywords: Industry-university-research (IUR) collaboration; Software engineering education; Tele-rehabilitation; Engineering-oriented mindset
Accurate and efficient follower log repair for Raft-replicated database systems (Cited by: 4)
8
Authors: Jinwei GUO, Peng CAI, Weining QIAN, Aoying ZHOU. Frontiers of Computer Science (SCIE, EI, CSCD), 2021, No. 2, pp. 91-103 (13 pages)
State machine replication has been widely used in modern cluster-based database systems. Most commonly deployed configurations adopt a Raft-like consensus protocol, which has a single strong leader that replicates the log to the followers. Since followers can handle read requests and many real workloads are read-intensive, the recovery speed of a crashed follower may significantly impact throughput. Unlike traditional database recovery, the recovering follower needs to repair its local log first. The original Raft protocol takes many network round trips to compare logs between the leader and the crashed follower. To reduce network round trips, one optimization is to truncate the follower's uncertain log entries behind the latest local commit point, and then to directly fetch all committed log entries from the leader in one round trip. However, if the commit point is not persisted, the recovering follower has to get the whole log from the leader. In this paper, we propose an accurate and efficient log repair (AELR) algorithm for follower recovery. AELR is more robust and resilient to follower failure, and it needs only one network round trip to fetch the least number of log entries for follower recovery. This approach is implemented in the open source database system OceanBase. We experimentally show that the system adopting AELR performs well in terms of recovery time.
Keywords: Raft; high availability; log replication; log repair
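The baseline optimization mentioned in the abstract above (truncate the follower's entries past its persisted commit point, then fetch the rest from the leader in one request) can be illustrated with a minimal sketch. This is not the AELR algorithm itself, whose details are not given in the listing; the `Entry` type, the `repair_follower_log` function, and the in-memory lists standing in for the leader and follower logs are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One replicated log entry (hypothetical layout for this sketch)."""
    index: int
    term: int
    payload: str


def repair_follower_log(follower_log, commit_index, leader_committed):
    """Return the follower's repaired log.

    follower_log:     entries found on the follower's disk after a crash
    commit_index:     the follower's persisted local commit point
    leader_committed: committed entries held by the leader (assumed to be
                      obtainable in a single request)
    """
    # Step 1: entries after the commit point are uncertain, so drop them.
    kept = [e for e in follower_log if e.index <= commit_index]
    # Step 2: fetch every committed entry beyond the commit point at once.
    fetched = [e for e in leader_committed if e.index > commit_index]
    return kept + fetched


# Usage: the follower holds an uncommitted entry at index 3 from an old term.
follower = [Entry(1, 1, "a"), Entry(2, 1, "b"), Entry(3, 1, "stale")]
leader = [Entry(1, 1, "a"), Entry(2, 1, "b"), Entry(3, 2, "c"), Entry(4, 2, "d")]
repaired = repair_follower_log(follower, commit_index=2, leader_committed=leader)
```

As the abstract notes, this scheme degrades when the commit point is not persisted: with `commit_index=0` the follower would have to fetch the whole log.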
A Fast Filling Algorithm for Image Restoration Based on Contour Parity (Cited by: 1)
9
Authors: Yan Liu, Wenxin Hu, Longzhe Han, Maksymyuk Taras, Zhiyun Chen. Computers, Materials & Continua (SCIE, EI), 2020, No. 4, pp. 509-519 (11 pages)
Filling techniques are often used in the restoration of images. Yet existing filling approaches either have high computational costs or present problems such as filling holes redundantly. This paper proposes a novel algorithm for filling holes and regions of images. The proposed algorithm combines the advantages of both the parity-check filling approach and the region-growing inpainting technique. Pairing points of the region's boundary are used to search and to fill the region. The scanning range of the filling method is within the target regions. The proposed method does not require additional working memory or assistant colors, and it can correctly fill any complex contours. Experimental results show that, compared to other approaches, the proposed algorithm fills regions faster and with lower computational cost.
Keywords: Region filling; image restoration; parity check; region growing
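The parity-check idea that the abstract above builds on can be sketched with the classic even-odd rule: a point is inside a closed contour if a ray from it crosses the contour an odd number of times. The code below is a generic illustration of that rule on a polygon given by its vertices, not the paper's contour-pairing algorithm; the function names `inside_even_odd` and `fill_polygon` are hypothetical.

```python
def inside_even_odd(x, y, polygon):
    """Even-odd test: count crossings of a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through y.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing flips the parity
    return inside


def fill_polygon(width, height, polygon):
    """Return a height x width grid with 1 where the pixel centre is inside."""
    return [
        [1 if inside_even_odd(col + 0.5, row + 0.5, polygon) else 0
         for col in range(width)]
        for row in range(height)
    ]


# Usage: fill a square with corners (1, 1) and (4, 4) on a 5 x 5 grid.
grid = fill_polygon(5, 5, [(1, 1), (4, 1), (4, 4), (1, 4)])
```

Testing pixel centres rather than integer coordinates sidesteps the vertex and horizontal-edge special cases that raster parity fills usually have to handle.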
Efficient and stable quorum-based log replication and replay for modern cluster-databases
10
Authors: Donghui WANG, Peng CAI, Weining QIAN, Aoying ZHOU. Frontiers of Computer Science (SCIE, EI, CSCD), 2022, No. 5, pp. 143-158 (16 pages)
The modern in-memory database (IMDB) can support highly concurrent on-line transaction processing (OLTP) workloads and generate massive transactional logs per second. Quorum-based replication protocols such as Paxos or Raft have been widely used in distributed databases to offer higher availability and fault tolerance. However, it is non-trivial to replicate an IMDB because the high transaction rate brings new challenges. First, the leader node in quorum replication should be adaptive, taking into account various transaction arrival rates and the processing capability of follower nodes. Second, followers are required to replay logs to catch up with the state of the leader in a highly concurrent setting to reduce the visibility gap. Third, modern databases are often built on a cluster of commodity machines connected by low-configuration networks, in which network anomalies often happen. In this case, performance would be significantly affected because the follower node falls into a long-duration exception handling process (e.g., fetching lost logs from the leader). To this end, we build QuorumX, an efficient and stable quorum-based replication framework for IMDB under heavy OLTP workloads. QuorumX combines critical-path-based batching and pipeline batching to provide an adaptive log propagation scheme that obtains stable and high performance in various settings. Further, we propose a safe and coordination-free log replay scheme to minimize the visibility gap between the leader and follower IMDBs. We also carefully design the process for the follower node in order to alleviate the influence of the unreliable network on replication performance. Our evaluation results with YCSB, TPC-C and a realistic microbenchmark demonstrate that QuorumX achieves performance close to asynchronous primary-backup replication and could always provide a stable service with data consistency and a low-level visibility gap.
Keywords: log replication; log replay; consensus protocol; high performance; high availability; quorum; unreliable network; packet loss
Scalable and quantitative contention generation for performance evaluation on OLTP databases
11
Authors: Chunxi ZHANG, Yuming LI, Rong ZHANG, Weining QIAN, Aoying ZHOU. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, No. 2, pp. 15-31 (17 pages)
Massive-scale transactions with critical requirements have become popular for emerging businesses, especially in E-commerce. One of the most representative applications is the promotional event running on Alibaba's platform on some special dates, widely anticipated by global customers. Although significant progress has been achieved in improving the scalability of transactional database systems (OLTP), the presence of contention operations in workloads is still one of the fundamental obstacles to improving performance. The reason is that the overhead of managing conflicting transactions with concurrency control mechanisms is proportional to the amount of contention. As a consequence, generating contended workloads is urgently needed to evaluate the performance of modern OLTP database systems. Though standard benchmarks provide some ways of simulating contention, e.g., skew distribution control of transactions, they cannot control the generation of contention quantitatively; even worse, the simulation effectiveness of these methods is affected by the scale of the data. So in this paper we design a scalable quantitative contention generation method with fine-grained contention control. We conduct a comprehensive set of experiments on popular open-source DBMSs, compared with the latest contention simulation method, to demonstrate the effectiveness of our generation work.
Keywords: high contention; OLTP database; performance evaluation; database benchmarking
The Space-Time Semi-Analytical Meshless Methods for Coupled Burgers' Equations
12
Authors: ZHANG Zhiqiang, WANG Fuzhang. Wuhan University Journal of Natural Sciences (CSCD), 2024, No. 6, pp. 572-578 (7 pages)
In this paper, a simple direct space-time semi-analytical meshless scheme is proposed for the numerical approximation of the coupled Burgers' equations. During the whole solution procedure, two different schemes are considered in terms of radial and non-radial basis functions. In the first, radial scheme, the time-dependent variable is treated directly as an ordinary space variable to formulate an "isotropic" space-time radial basis function. The second, non-radial scheme considers the relationship between the time-dependent and space-dependent variables. Under such circumstances, we obtain a one-step space-time meshless scheme. The numerical findings demonstrate that the proposed meshless schemes are precise, user-friendly, and effective in solving the coupled Burgers' equations.
Keywords: radial basis functions; coupled Burgers' equations; meshless methods; numerical simulation
A Survey on Quality Evaluation of Instruction Fine-tuning Datasets for Large Language Models (Cited by: 1)
13
Authors: Yitian Luo, Yu Liu, Lu Zhang, Feng Gao, Jinguang Gu. Data Intelligence, 2025, No. 3, pp. 527-566 (40 pages)
Instruction fine-tuning is a key method for adapting large language models (LLMs) to domain-specific tasks, and instruction quality significantly impacts model performance after fine-tuning. Hence, evaluating the quality of instructions and selecting high-quality instructions are essential steps in the process of LLM instruction fine-tuning. Although existing studies provide important theoretical foundations and techniques for this, there is still room for improvement in terms of generality, the relationship between methods, and experimental verification. Current methods for evaluating instruction quality can be classified into four main categories: human evaluation, statistics-based evaluation, model-based evaluation, and LLMs-based evaluation. Among these methods, human evaluation relies on the subjective judgment and domain expertise of the evaluators, which offers interpretability and is suitable for scenarios involving small-scale data and sufficient budgets. Statistics-based evaluation estimates the quality of instructions using indicators such as stopwords and lexical diversity, providing high efficiency and a suitable evaluation for large-scale data. Model-based evaluation employs specific models to quantify indicators such as perplexity (PPL) and instruction following difficulty (IFD), which is flexible and suitable for specific tasks. LLMs-based evaluation rates the quality of instructions through prompt-based interaction with LLMs, focusing on aspects such as accuracy and coherence; it is highly automated and customizable, simplifying the evaluation process. Finally, considering the limitations of current quality evaluation methods, some future research directions are proposed for improvement. These include refining instruction categories, extending evaluation indicators, enhancing human-AI interaction evaluation methods, applying agents in instruction quality evaluation, and developing a comprehensive evaluation framework.
Keywords: Large language models; Instruction fine-tuning datasets; Quality evaluation; Human evaluation; Statistics-based evaluation; Model-based evaluation; LLMs-based evaluation
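Of the model-based indicators named in the abstract above, perplexity (PPL) has a standard definition: the exponential of the average negative log-likelihood per token. The sketch below computes it from a list of per-token natural-log probabilities; how those log probabilities are obtained from a particular model is outside this sketch, and the `perplexity` function name is introduced here for illustration only.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities.

    PPL = exp(-(1/N) * sum_i log p(token_i | preceding tokens))
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_neg_log_likelihood = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_neg_log_likelihood)


# Usage: a model that assigns probability 0.5 to each of four tokens
# is as uncertain as a fair coin flip per token.
ppl = perplexity([math.log(0.5)] * 4)
```

A lower value means the model finds the instruction text more predictable; whether that signals higher or lower instruction quality depends on the evaluation method being applied.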
Assessment of Municipal Solid Waste Management in the Farmgate Area of Dhaka North City Corporation
14
Authors: Seyedali Mirmotalebi, Shoeb Rahman, Mayida Rubya Tithi, Imran Khan Apu. World Journal of Engineering and Technology, 2024, No. 1, pp. 1-23 (23 pages)
This investigation is focused on conducting a thorough analysis of Municipal Solid Waste Management (MSWM). MSWM encompasses a range of interdisciplinary measures that govern the various stages involved in managing unwanted or non-utilizable solid materials, commonly known as rubbish, trash, junk, refuse, and garbage. These stages include generation, storage, collection, recycling, transportation, handling, disposal, and monitoring. The waste materials in this context span a wide range of items, such as organic waste from food and vegetables, paper, plastic, polyethylene, iron, tin cans, deceased animals, byproducts from demolition activities, manure, and various other discarded materials. This study aims to provide insights into the possibilities of enhancing solid waste management in the Farmgate area of Dhaka North City Corporation (DNCC). To accomplish this objective, the research examines the conventional waste management methods employed in this area. It conducts extensive field surveys, collecting data through interviews with local residents and key individuals involved in waste management, such as waste collectors, dealers, intermediate dealers, recyclers, and shopkeepers. The results indicate that significant amounts of distinct waste categories are produced daily: food and vegetable waste, 52.1 tons/day; polythene and plastic, 4.5 tons/day; metal and tin-can waste, 1.4 tons/day; and paper waste, 5.9 tons/day. This study highlights the significance of promoting environmental consciousness to effectively shape the attitudes of urban residents toward waste disposal and management. It emphasizes the need for collaboration between authorities and researchers to improve the current waste management system.
Keywords: Solid Waste Management; Dhaka North City Corporation; Sustainable Growth; Integrated Waste Management Practice; Waste Recycling
Agents with foundation models:advance and vision
15
Authors: Chenghua GONG, Xiang LI. Frontiers of Computer Science, 2025, No. 4, pp. 119-120 (2 pages)
With rapid development in computing power and breakthroughs in deep learning, the concept of "foundation models" has been introduced into the AI community. Generally, foundation models are large models trained on massive data that can be easily adapted to different domains for various tasks. With specific prompts, foundation models can generate texts and images, or even animate scenarios based on the given descriptions. Due to these powerful capabilities, there is a growing trend to build agents based on foundation models. In this paper, we conduct an investigation into agents empowered by foundation models.
Keywords: computing power; foundation models; adapted different domains; animate scenarios; deep learning; massive data; generate texts; images
Efficient and accurate road crack detection technology based on YOLOv8-ES
16
Authors: Kaili Zeng, Rui Fan, Xiaoyu Tang. Autonomous Intelligent Systems, 2025, No. 1, pp. 339-349 (11 pages)
Road damage detection is an important aspect of road maintenance. Traditional manual inspections are laborious and imprecise. With the rise of deep learning technology, pavement detection methods employing deep neural networks offer an efficient and accurate solution. However, due to background diversity, limited resolution, and fracture similarity, it is difficult to detect road cracks with high accuracy. In this study, we offer an efficient and accurate road crack damage detection method, namely YOLOv8-ES. We present a novel dynamic convolutional layer (EDCM) that increases the feature extraction capability for small fractures. We also present a new attention mechanism (SGAM), which can effectively retain crucial information and increase the network's feature extraction capacity. The Wise-IoU technique contains a dynamic, non-monotonic focusing mechanism designed to regress the target bounding box more precisely, especially for low-quality samples. We validate our method on both the RDD2022 and VOC2007 datasets. The experimental results suggest that YOLOv8-ES performs well. This approach supports the development of intelligent road maintenance systems and is expected to enable further advances in future applications.
Keywords: Road crack detection; Object detection; Attention mechanism; Dynamic convolutional layer
Fault-tolerant precise data access on distributed log-structured merge-tree (Cited by: 2)
17
Authors: Tao ZHU, Huiqi HU, Weining QIAN, Huan ZHOU, Aoying ZHOU. Frontiers of Computer Science (SCIE, EI, CSCD), 2019, No. 4, pp. 760-777 (18 pages)
The log-structured merge tree (LSM-tree) has been adopted by many distributed storage systems. It decomposes a large database into multiple parts: an in-writing part and several read-only ones. Records are first written into a memory-optimized structure and then periodically compacted into on-disk structures. This achieves high write throughput. However, it has the side effect that read requests have to go through multiple structures to find the required record. In a distributed database system, different parts of the LSM-tree are stored in a distributed fashion. As a result, a server in the query layer has to issue multiple network requests to pull data items from the underlying storage layer. To address this, this work proposes a precise data access strategy which includes: an efficient structure with low maintenance overhead designed to test whether a record exists in the in-writing part of the LSM-tree; and a lease-based synchronization strategy proposed to maintain consistent copies of the structure on remote query servers. We further prove that the technique is capable of working robustly when the LSM-tree is reorganizing multiple structures in the backend. It is also fault-tolerant, being able to recover the structures used in data access after node failures happen. Experiments using the YCSB benchmark show that the solution has a 6x throughput improvement over existing methods.
Keywords: distributed data storage; log-structured merge tree; linearizability; fault tolerance
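The read-path saving described in the abstract above can be illustrated with a toy model: if the query server holds a local membership structure for the in-writing part, it can skip the remote lookup on that part whenever the key is known to be absent. In this sketch a plain Python set stands in for the paper's structure, lease-based synchronization is omitted entirely, and `StorageNode` and `precise_read` are hypothetical names.

```python
class StorageNode:
    """Toy remote node that counts how many lookups it receives."""

    def __init__(self, data):
        self.data = dict(data)
        self.calls = 0

    def get(self, key):
        self.calls += 1  # each call models one network round trip
        return self.data.get(key)


def precise_read(key, in_writing_keys, in_writing_node, read_only_nodes):
    """Read a key, contacting the in-writing part only when it holds the key.

    in_writing_keys: local copy of the set of keys present in the
                     in-writing part (assumed to be kept consistent).
    read_only_nodes: read-only parts ordered from newest to oldest.
    """
    if key in in_writing_keys:
        return in_writing_node.get(key)
    for node in read_only_nodes:
        value = node.get(key)
        if value is not None:
            return value
    return None


# Usage: key "b" lives only in a read-only part, so the in-writing
# node is never contacted for it.
writer = StorageNode({"a": 1})
readers = [StorageNode({"b": 2})]
value_b = precise_read("b", {"a"}, writer, readers)
calls_after_b = writer.calls
value_a = precise_read("a", {"a"}, writer, readers)
```

The correctness of such a shortcut depends on the local copy never missing a key that the in-writing part holds, which is the consistency problem the abstract's lease-based strategy is said to address.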
EnAli: entity alignment across multiple heterogeneous data sources (Cited by: 2)
18
作者 Chao KONG Ming GAO +3 位作者 Chen XU Yunbin FU Weining QIAN Aoying ZHOU 《Frontiers of Computer Science》 SCIE EI CSCD 2019年第1期157-169,共13页
Entity alignment is the problem of identifying which entities in a data source refer to the same real-world entity in the others.Identifying entities across heterogeneous data sources is paramount to many research fie... Entity alignment is the problem of identifying which entities in a data source refer to the same real-world entity in the others.Identifying entities across heterogeneous data sources is paramount to many research fields,such as data cleaning,data integration,.information retrieval and machine learning.The aligning process is not only overwhelmingly expensive for large data sources since it involves all tuples from two or more data sources,but also need to handle heterogeneous entity attributes.In this paper,we propose an unsupervised approach,called EnAli,to match entities across two or more heterogeneous data sources.EnAli employs a generative probabilistic model to incorporate the heterogeneous entity attributes via employing exponential family,handle missing values,and also utilize the locality sensitive hashing schema to reduce the candidate tuples and speed up the aligning process.EnAli is highly accurate and efficient even without any ground-truth tuples.We illustrate the performance of EnAli on re-identifying entities from the same data source,as well as aligning entities across three real data sources.Our experimental results manifest that our proposed approach outperforms the comparable baseline. 展开更多
Keywords: entity alignment; exponential family; locality sensitive hashing; EM algorithm
Knowledge Representation and Reasoning for Complex Time Expression in Clinical Text (Cited by: 2)
19
Authors: Danyang Hu, Meng Wang, Feng Gao, Fangfang Xu, Jinguang Gu. Data Intelligence (EI), 2022, Issue 3, pp. 573-598 (26 pages)
Temporal information is pervasive and crucial in medical records and other clinical text, as it describes the development of medical conditions and is vital for clinical decision making. However, providing a holistic knowledge representation and reasoning framework for the various time expressions in clinical text is challenging. To capture complex temporal semantics in clinical text, we propose a novel Clinical Time Ontology (CTO) as an extension of the OWL framework. More specifically, we identified eight time-related problems in clinical text and created 11 core temporal classes to conceptualize fuzzy time, cyclic time, irregular time, negations, and other complex aspects of clinical time. We then extended Allen's and TEO's temporal relations and defined the relation concept description between complex and simple time. In addition, we provided a formulaic and graphical presentation of complex time and complex time relationships. We carried out an empirical study on the expressiveness and usability of CTO using real-world healthcare datasets. The experimental results demonstrate that CTO can faithfully represent and reason over 93% of the temporal expressions, and that it covers a wider range of time-related classes in the clinical domain.
Keywords: clinical text; temporal ontology; temporal relations; OWL; negation of temporal relation
Optimal Dependence of Performance and Efficiency of Collaborative Filtering on Random Stratified Subsampling (Cited by: 2)
20
Authors: Samin Poudel, Marwan Bikdash. Big Data Mining and Analytics (EI), 2022, Issue 3, pp. 192-205 (14 pages)
Judiciously dropping fractions of users or items can reduce the computational cost of Collaborative Filtering (CF) algorithms. The effect of this subsampling on the computing time and accuracy of CF is not fully understood, and clear guidelines for selecting optimal or even appropriate subsampling levels are not available. In this paper, we present a Density-based Random Stratified Subsampling using Clustering (DRSC) algorithm in which the desired Fraction of Users Dropped (FUD) and Fraction of Items Dropped (FID) are specified, and the overall density is maintained during subsampling. Subsequently, we develop simple models of the Training Time Improvement (TTI) and the Accuracy Loss (AL) as functions of FUD and FID, based on extensive simulations of seven standard CF algorithms applied to various primary matrices from MovieLens, Yahoo Music Rating, and Amazon Automotive data. Simulations show that both TTI and a scaled AL are bilinear in FID and FUD for all seven methods. The TTI linear regression of a CF method appears to be the same for all datasets. Extensive simulations illustrate that TTI can be estimated reliably from FUD and FID alone, whereas AL requires considering additional dataset characteristics. The derived models are then used to optimize the subsampling levels, addressing the tradeoff between TTI and AL. A simple sub-optimal approximation was found in which the optimal AL is proportional to the optimal Training Time Reduction Factor (TTRF) for higher values of TTRF, and the optimal subsampling levels, such as the optimal FID/(1-FID), are proportional to the square root of TTRF.
Keywords: Collaborative Filtering (CF); subsampling; Training Time Improvement (TTI); performance loss; Recommendation System (RS); optimal solutions; rating matrix