An attempt has been made to develop a distributed software infrastructure model for onboard data fusion system simulation, which is also applied to netted radar systems, onboard distributed detection systems and advan...An attempt has been made to develop a distributed software infrastructure model for onboard data fusion system simulation, which is also applied to netted radar systems, onboard distributed detection systems and advanced C3I systems. Two architectures are provided and verified: one is based on pure TCP/IP protocol and C/S model, and implemented with Winsock, the other is based on CORBA (common object request broker architecture). The performance of data fusion simulation system, i.e. reliability, flexibility and scalability, is improved and enhanced by two models. The study of them makes valuable explore on incorporating the distributed computation concepts into radar system simulation techniques.展开更多
Distributed computing is an important topic in the field of wireless communications and networking,and its high efficiency in handling large amounts of data is particularly noteworthy.Although distributed computing be...Distributed computing is an important topic in the field of wireless communications and networking,and its high efficiency in handling large amounts of data is particularly noteworthy.Although distributed computing benefits from its ability of processing data in parallel,the communication burden between different servers is incurred,thereby the computation process is detained.Recent researches have applied coding in distributed computing to reduce the communication burden,where repetitive computation is utilized to enable multicast opportunities so that the same coded information can be reused across different servers.To handle the computation tasks in practical heterogeneous systems,we propose a novel coding scheme to effectively mitigate the "straggling effect" in distributed computing.We assume that there are two types of servers in the system and the only difference between them is their computational capabilities,the servers with lower computational capabilities are called stragglers.Given any ratio of fast servers to slow servers and any gap of computational capabilities between them,we achieve approximately the same computation time for both fast and slow servers by assigning different amounts of computation tasks to them,thus reducing the overall computation time.Furthermore,we investigate the informationtheoretic lower bound of the inter-communication load and show that the lower bound is within a constant multiplicative gap to the upper bound achieved by our scheme.Various simulations also validate the effectiveness of the proposed scheme.展开更多
In the current noisy intermediate-scale quantum(NISQ)era,a single quantum processing unit(QPU)is insufficient to implement large-scale quantum algorithms;this has driven extensive research into distributed quantum com...In the current noisy intermediate-scale quantum(NISQ)era,a single quantum processing unit(QPU)is insufficient to implement large-scale quantum algorithms;this has driven extensive research into distributed quantum computing(DQC).DQC involves the cooperative operation of multiple QPUs but is concurrently challenged by excessive communication complexity.To address this issue,this paper proposes a quantum circuit partitioning method based on spectral clustering.The approach transforms quantum circuits into weighted graphs and,through computation of the Laplacian matrix and clustering techniques,identifies candidate partition schemes that minimize the total weight of the cut.Additionally,a global gate search tree strategy is introduced to meticulously explore opportunities for merged transfer of global gates,thereby minimizing the transmission cost of distributed quantum circuits and selecting the optimal partition scheme from the candidates.Finally,the proposed method is evaluated through various comparative experiments.The experimental results demonstrate that spectral clustering-based partitioning exhibits robust stability and efficiency in runtime in quantum circuits of different scales.In experiments involving the quantum Fourier transform algorithm and Revlib quantum circuits,the transmission cost achieved by the global gate search tree strategy is significantly optimized.展开更多
In this paper,we study a distributed model to cooperatively compute variational inequalities over time-varying directed graphs.Here,each agent has access to a part of the full mapping and holds a local view of the glo...In this paper,we study a distributed model to cooperatively compute variational inequalities over time-varying directed graphs.Here,each agent has access to a part of the full mapping and holds a local view of the global set constraint.By virtue of an auxiliary vector to compensate the graph imbalance,we propose a consensus-based distributed projection algorithm relying on local computation and communication at each agent.We show the convergence of this algorithm over uniformly jointly strongly connected unbalanced digraphs with nonidentical local constraints.We also provide a numerical example to illustrate the effectiveness of our algorithm.展开更多
Many latest high performance distributed computational environments come with high bandwidth in commu- nication. Such high bandwidth distributed systems provide unprecedented opportunities for analyzing huge datasets,...Many latest high performance distributed computational environments come with high bandwidth in commu- nication. Such high bandwidth distributed systems provide unprecedented opportunities for analyzing huge datasets, but simultaneously posts new technical challenges. For users, progressive query answering is important. For utility of systems, load balancing is critical. How we can achieve progressive and load balancing distributed computation is an interesting and promising research direction. As skyline analysis has been shown very useful in many multi-criteria decision making applications, in this paper, we study the problem of progressive and load balancing distributed skyline analysis. We propose a simple yet scalable approach which comes with several nice properties for progressive and load balancing query answering. We conduct extensive experiments which demonstrate the feasibility and effectiveness of the proposed method.展开更多
A dynamic multi-beam resource allocation algorithm for large low Earth orbit(LEO)constellation based on on-board distributed computing is proposed in this paper.The allocation is a combinatorial optimization process u...A dynamic multi-beam resource allocation algorithm for large low Earth orbit(LEO)constellation based on on-board distributed computing is proposed in this paper.The allocation is a combinatorial optimization process under a series of complex constraints,which is important for enhancing the matching between resources and requirements.A complex algorithm is not available because that the LEO on-board resources is limi-ted.The proposed genetic algorithm(GA)based on two-dimen-sional individual model and uncorrelated single paternal inheri-tance method is designed to support distributed computation to enhance the feasibility of on-board application.A distributed system composed of eight embedded devices is built to verify the algorithm.A typical scenario is built in the system to evalu-ate the resource allocation process,algorithm mathematical model,trigger strategy,and distributed computation architec-ture.According to the simulation and measurement results,the proposed algorithm can provide an allocation result for more than 1500 tasks in 14 s and the success rate is more than 91%in a typical scene.The response time is decreased by 40%com-pared with the conditional GA.展开更多
Brain tumors come in various types,each with distinct characteristics and treatment approaches,making manual detection a time-consuming and potentially ambiguous process.Brain tumor detection is a valuable tool for ga...Brain tumors come in various types,each with distinct characteristics and treatment approaches,making manual detection a time-consuming and potentially ambiguous process.Brain tumor detection is a valuable tool for gaining a deeper understanding of tumors and improving treatment outcomes.Machine learning models have become key players in automating brain tumor detection.Gradient descent methods are the mainstream algorithms for solving machine learning models.In this paper,we propose a novel distributed proximal stochastic gradient descent approach to solve the L_(1)-Smooth Support Vector Machine(SVM)classifier for brain tumor detection.Firstly,the smooth hinge loss is introduced to be used as the loss function of SVM.It avoids the issue of nondifferentiability at the zero point encountered by the traditional hinge loss function during gradient descent optimization.Secondly,the L_(1) regularization method is employed to sparsify features and enhance the robustness of the model.Finally,adaptive proximal stochastic gradient descent(PGD)with momentum,and distributed adaptive PGDwithmomentum(DPGD)are proposed and applied to the L_(1)-Smooth SVM.Distributed computing is crucial in large-scale data analysis,with its value manifested in extending algorithms to distributed clusters,thus enabling more efficient processing ofmassive amounts of data.The DPGD algorithm leverages Spark,enabling full utilization of the computer’s multi-core resources.Due to its sparsity induced by L_(1) regularization on parameters,it exhibits significantly accelerated convergence speed.From the perspective of loss reduction,DPGD converges faster than PGD.The experimental results show that adaptive PGD withmomentumand its variants have achieved cutting-edge accuracy and efficiency in brain tumor detection.Frompre-trained models,both the PGD andDPGD outperform other models,boasting an accuracy of 95.21%.展开更多
In distributed quantum computing(DQC),quantum hardware design mainly focuses on providing as many as possible high-quality inter-chip connections.Meanwhile,quantum software tries its best to reduce the required number...In distributed quantum computing(DQC),quantum hardware design mainly focuses on providing as many as possible high-quality inter-chip connections.Meanwhile,quantum software tries its best to reduce the required number of remote quantum gates between chips.However,this“hardware first,software follows”methodology may not fully exploit the potential of DQC.Inspired by classical software-hardware co-design,this paper explores the design space of application-specific DQC architectures.More specifically,we propose Auto Arch,an automated quantum chip network(QCN)structure design tool.With qubits grouping followed by a customized QCN design,AutoArch can generate a near-optimal DQC architecture suitable for target quantum algorithms.Experimental results show that the DQC architecture generated by Auto Arch can outperform other general QCN architectures when executing target quantum algorithms.展开更多
In this paper,we develop a distributed solver for a group of strict(non-strict)linear matrix inequalities over a multi-agent network,where each agent only knows one inequality,and all agents co-operate to reach a cons...In this paper,we develop a distributed solver for a group of strict(non-strict)linear matrix inequalities over a multi-agent network,where each agent only knows one inequality,and all agents co-operate to reach a consensus solution in the intersection of all the feasible regions.The formulation is transformed into a distributed optimization problem by introducing slack variables and consensus constraints.Then,by the primal–dual methods,a distributed algorithm is proposed with the help of projection operators and derivative feedback.Finally,the convergence of the algorithm is analyzed,followed by illustrative simulations.展开更多
Federated Learning(FL)has become a popular training paradigm in recent years.However,stragglers are critical bottlenecks in an Internet of Things(IoT)network while training.These nodes produce stale updates to the ser...Federated Learning(FL)has become a popular training paradigm in recent years.However,stragglers are critical bottlenecks in an Internet of Things(IoT)network while training.These nodes produce stale updates to the server,which slow down the convergence.In this paper,we studied the impact of the stale updates on the global model,which is observed to be significant.To address this,we propose a weighted averaging scheme,FedStrag,that optimizes the training with stale updates.The work is focused on training a model in an IoT network that has multiple challenges,such as resource constraints,stragglers,network issues,device heterogeneity,etc.To this end,we developed a time-bounded asynchronous FL paradigm that can train a model on the continuous iflow of data in the edge-fog-cloud continuum.To test the FedStrag approach,a model is trained with multiple stragglers scenarios on both Independent and Identically Distributed(IID)and non-IID datasets on Raspberry Pis.The experiment results suggest that the FedStrag outperforms the baseline FedAvg in all possible cases.展开更多
Spark performs excellently in large-scale data-parallel computing and iterative processing.However,with the increase in data size and program complexity,the default scheduling strategy has difficultymeeting the demand...Spark performs excellently in large-scale data-parallel computing and iterative processing.However,with the increase in data size and program complexity,the default scheduling strategy has difficultymeeting the demands of resource utilization and performance optimization.Scheduling strategy optimization,as a key direction for improving Spark’s execution efficiency,has attracted widespread attention.This paper first introduces the basic theories of Spark,compares several default scheduling strategies,and discusses common scheduling performance evaluation indicators and factors affecting scheduling efficiency.Subsequently,existing scheduling optimization schemes are summarized based on three scheduling modes:load characteristics,cluster characteristics,and matching of both,and representative algorithms are analyzed in terms of performance indicators and applicable scenarios,comparing the advantages and disadvantages of different scheduling modes.The article also explores in detail the integration of Spark scheduling strategies with specific application scenarios and the challenges in production environments.Finally,the limitations of the existing schemes are analyzed,and prospects are envisioned.展开更多
[Objective]This study aims to address the inefficiency of AI-for-Science tasks caused by the design and implementation challenges of applying the distributed parallel computing strategies to deep learning models,as we...[Objective]This study aims to address the inefficiency of AI-for-Science tasks caused by the design and implementation challenges of applying the distributed parallel computing strategies to deep learning models,as well as their inefficient execution.[Methods]We propose an automatic distributed parallelization method for AI-for-Science tasks,called FlowAware.Based on the AI-for-Science framework JAX,this approach thoroughly analyzes task characteristics,operator structures,and data flow properties of deep learning models.By incorporating cluster topology information,it constructs a search space for distributed parallel computing strategies.Guided by load balancing and communication optimization objectives,FlowAware automatically identifies optimal distributed parallel computing strategies for AI models.[Results]Comparative experiments conducted on both GPU-like accelerator clusters and GPU clusters demonstrated that FlowAware achieves a throughput improvement of up to 7.8×compared to Alpa.[Conclusions]FlowAware effectively enhances the search efficiency of distributed parallel computing strategies for AI models in scientific computing tasks and significantly improves their computational performance.展开更多
Protein-protein interactions are of great significance for human to understand the functional mechanisms of proteins.With the rapid development of high-throughput genomic technologies,massive protein-protein interacti...Protein-protein interactions are of great significance for human to understand the functional mechanisms of proteins.With the rapid development of high-throughput genomic technologies,massive protein-protein interaction(PPI)data have been generated,making it very difficult to analyze them efficiently.To address this problem,this paper presents a distributed framework by reimplementing one of state-of-the-art algorithms,i.e.,CoFex,using MapReduce.To do so,an in-depth analysis of its limitations is conducted from the perspectives of efficiency and memory consumption when applying it for large-scale PPI data analysis and prediction.Respective solutions are then devised to overcome these limitations.In particular,we adopt a novel tree-based data structure to reduce the heavy memory consumption caused by the huge sequence information of proteins.After that,its procedure is modified by following the MapReduce framework to take the prediction task distributively.A series of extensive experiments have been conducted to evaluate the performance of our framework in terms of both efficiency and accuracy.Experimental results well demonstrate that the proposed framework can considerably improve its computational efficiency by more than two orders of magnitude while retaining the same high accuracy.展开更多
In LEO(Low Earth Orbit)satellite communication systems,the satellite network is made up of a large number of satellites,the dynamically changing network environment affects the results of distributed computing.In orde...In LEO(Low Earth Orbit)satellite communication systems,the satellite network is made up of a large number of satellites,the dynamically changing network environment affects the results of distributed computing.In order to improve the fault tolerance rate,a novel public blockchain consensus mechanism that applies a distributed computing architecture in a public network is proposed.Redundant calculation of blockchain ensures the credibility of the results;and the transactions with calculation results of a task are stored distributed in sequence in Directed Acyclic Graphs(DAG).The transactions issued by nodes are connected to form a net.The net can quickly provide node reputation evaluation that does not rely on third parties.Simulations show that our proposed blockchain has the following advantages:1.The task processing speed of the blockchain can be close to that of the fastest node in the entire blockchain;2.When the tasks’arrival time intervals and demanded working nodes(WNs)meet certain conditions,the network can tolerate more than 50%of malicious devices;3.No matter the number of nodes in the blockchain is increased or reduced,the network can keep robustness by adjusting the task’s arrival time interval and demanded WNs.展开更多
To security support large-scale intelligent applications,distributed machine learning based on blockchain is an intuitive solution scheme.However,the distributed machine learning is difficult to train due to that the ...To security support large-scale intelligent applications,distributed machine learning based on blockchain is an intuitive solution scheme.However,the distributed machine learning is difficult to train due to that the corresponding optimization solver algorithms converge slowly,which highly demand on computing and memory resources.To overcome the challenges,we propose a distributed computing framework for L-BFGS optimization algorithm based on variance reduction method,which is a lightweight,few additional cost and parallelized scheme for the model training process.To validate the claims,we have conducted several experiments on multiple classical datasets.Results show that our proposed computing framework can steadily accelerate the training process of solver in either local mode or distributed mode.展开更多
Mobile agents provide a new method for the distributed computation. This paper presents the advantages of using mobile agents in a distributed virtual environment (DVE) system, and describes the architecture of hetero...Mobile agents provide a new method for the distributed computation. This paper presents the advantages of using mobile agents in a distributed virtual environment (DVE) system, and describes the architecture of heterogeneous computer's distributed virtual environment system (HCWES) designed to populate some mobile agents as well as stationary agents. Finally, the paper introduces how heterogeneous computer network communication is to be realized.展开更多
Recently, wireless distributed computing (WDC) concept has emerged promising manifolds improvements to current wireless technotogies. Despite the various expected benefits of this concept, significant drawbacks were...Recently, wireless distributed computing (WDC) concept has emerged promising manifolds improvements to current wireless technotogies. Despite the various expected benefits of this concept, significant drawbacks were addressed in the open literature. One of WDC key challenges is the impact of wireless channel quality on the load of distributed computations. Therefore, this research investigates the wireless channel impact on WDC performance when the tatter is applied to spectrum sensing in cognitive radio (CR) technology. However, a trade- off is found between accuracy and computational complexity in spectrum sensing approaches. Increasing these approaches accuracy is accompanied by an increase in computational complexity. This results in greater power consumption and processing time. A novel WDC scheme for cyclostationary feature detection spectrum sensing approach is proposed in this paper and thoroughly investigated. The benefits of the proposed scheme are firstly presented. Then, the impact of the wireless channel of the proposed scheme is addressed considering two scenarios. In the first scenario, workload matrices are distributed over the wireless channel展开更多
The title complex is widely used as an efficient key component of Ziegler-Natta catalyst for stereospecific polymerization of dienes to produce synthetic rubbers. However, the quantitative structure-activity relations...The title complex is widely used as an efficient key component of Ziegler-Natta catalyst for stereospecific polymerization of dienes to produce synthetic rubbers. However, the quantitative structure-activity relationship(QSAR) of this kind of complexes is still not clear mainly due to the difficulties to obtain their geometric molecular structures through laboratory experiments. An alternative solution is the quantum chemistry calculation in which the comformational population shall be determined. In this study, ten conformers of the title complex were obtained with the function of molecular dynamics conformational search in Gabedit 2.4.8, and their geometry optimization and thermodynamics calculation were made with a Sparkle/PM7 approach in MOPAC 2012. Their Gibbs free energies at 1 atm. and 298.15 K were calculated. Population of the conformers was further calculated out according to the theory of Boltzmann distribution, indicating that one of the ten conformers has a dominant population of 77.13%.展开更多
In the large-scale Distributed Virtual Environment(DVE)multimedia systems,one of key challenges is to distributedly preserve causal order delivery of messages in real time.Most of the existing causal order control app...In the large-scale Distributed Virtual Environment(DVE)multimedia systems,one of key challenges is to distributedly preserve causal order delivery of messages in real time.Most of the existing causal order control approaches with real-time constraints use vector time as causal control information which is closely coupled with system scales.As the scale expands,each message is attached a large amount of control information that introduces too much network transmission overhead to maintain the real-time causal order delivery.In this article,a novel Lightweight Real-Time Causal Order(LRTCO)algorithm is proposed for large-scale DVE multimedia systems.LRTCO predicts and compares the network transmission times of messages so as to select the proper causal control information of which the amount is dynamically adapted to the network latency variations and unconcerned with system scales.The control information in LRTCO is effective to preserve causal order delivery of messages and lightweight to maintain the real-time property of DVE systems.Experimental results demonstrate that LRTCO costs low transmission overhead and communication bandwidth,reduces causal order violations efficiently,and improves the scalability of DVE systems.展开更多
To meet the challenge of implementing rapidly advanced, time-consuming medical image processing algorithms, it is necessary to develop a medical image processing technology to process a 2D or 3D medical image dynamica...To meet the challenge of implementing rapidly advanced, time-consuming medical image processing algorithms, it is necessary to develop a medical image processing technology to process a 2D or 3D medical image dynamically on the web. But in a premier system, only static image processing can be provided with the limitation of web technology. The development of Java and CORBA (common object request broker architecture) overcomes the shortcoming of the web static application and makes the dynamic processing of medical images on the web available. To develop an open solution of distributed computing, we integrate the Java, and web with the CORBA and present a web-based medical image dynamic processing methed, which adopts Java technology as the language to program application and components of the web and utilies the CORBA architecture to cope with heterogeneous property of a complex distributed system. The method also provides a platform-independent, transparent processing architecture to implement the advanced image routines and enable users to access large dataset and resources according to the requirements of medical applications. The experiment in this paper shows that the medical image dynamic processing method implemented on the web by using Java and the CORBA is feasible.展开更多
文摘An attempt has been made to develop a distributed software infrastructure model for onboard data fusion system simulation, which is also applied to netted radar systems, onboard distributed detection systems and advanced C3I systems. Two architectures are provided and verified: one is based on pure TCP/IP protocol and C/S model, and implemented with Winsock, the other is based on CORBA (common object request broker architecture). The performance of data fusion simulation system, i.e. reliability, flexibility and scalability, is improved and enhanced by two models. The study of them makes valuable explore on incorporating the distributed computation concepts into radar system simulation techniques.
基金supported by NSF China(No.T2421002,62061146002,62020106005)。
文摘Distributed computing is an important topic in the field of wireless communications and networking,and its high efficiency in handling large amounts of data is particularly noteworthy.Although distributed computing benefits from its ability of processing data in parallel,the communication burden between different servers is incurred,thereby the computation process is detained.Recent researches have applied coding in distributed computing to reduce the communication burden,where repetitive computation is utilized to enable multicast opportunities so that the same coded information can be reused across different servers.To handle the computation tasks in practical heterogeneous systems,we propose a novel coding scheme to effectively mitigate the "straggling effect" in distributed computing.We assume that there are two types of servers in the system and the only difference between them is their computational capabilities,the servers with lower computational capabilities are called stragglers.Given any ratio of fast servers to slow servers and any gap of computational capabilities between them,we achieve approximately the same computation time for both fast and slow servers by assigning different amounts of computation tasks to them,thus reducing the overall computation time.Furthermore,we investigate the informationtheoretic lower bound of the inter-communication load and show that the lower bound is within a constant multiplicative gap to the upper bound achieved by our scheme.Various simulations also validate the effectiveness of the proposed scheme.
基金supported by the National Natural Science Foundation of China(Grant No.62072259)in part by the Natural Science Foundation of Jiangsu Province(Grant No.BK20221411)+1 种基金the PhD Start-up Fund of Nantong University(Grant No.23B03)the Postgraduate Research&Practice Innovation Program of School of Information Science and Technology,Nantong University(Grant No.NTUSISTPR2405).
文摘In the current noisy intermediate-scale quantum(NISQ)era,a single quantum processing unit(QPU)is insufficient to implement large-scale quantum algorithms;this has driven extensive research into distributed quantum computing(DQC).DQC involves the cooperative operation of multiple QPUs but is concurrently challenged by excessive communication complexity.To address this issue,this paper proposes a quantum circuit partitioning method based on spectral clustering.The approach transforms quantum circuits into weighted graphs and,through computation of the Laplacian matrix and clustering techniques,identifies candidate partition schemes that minimize the total weight of the cut.Additionally,a global gate search tree strategy is introduced to meticulously explore opportunities for merged transfer of global gates,thereby minimizing the transmission cost of distributed quantum circuits and selecting the optimal partition scheme from the candidates.Finally,the proposed method is evaluated through various comparative experiments.The experimental results demonstrate that spectral clustering-based partitioning exhibits robust stability and efficiency in runtime in quantum circuits of different scales.In experiments involving the quantum Fourier transform algorithm and Revlib quantum circuits,the transmission cost achieved by the global gate search tree strategy is significantly optimized.
基金supported by the National Natural Science Foundation of China(No.61973043)Shanghai Municipal Science and Technology Major Project(No.2021SHZDZX0100).
文摘In this paper,we study a distributed model to cooperatively compute variational inequalities over time-varying directed graphs.Here,each agent has access to a part of the full mapping and holds a local view of the global set constraint.By virtue of an auxiliary vector to compensate the graph imbalance,we propose a consensus-based distributed projection algorithm relying on local computation and communication at each agent.We show the convergence of this algorithm over uniformly jointly strongly connected unbalanced digraphs with nonidentical local constraints.We also provide a numerical example to illustrate the effectiveness of our algorithm.
基金Supported by the Doctoral Research Foundation of the Natural Science Foundation of Guangdong Province under Grant No.8451064101000054the National Natural Science Foundation of China under Grant Nos. 60773198,60703111+3 种基金Natural Science Foundation of Guangdong Province under Grant Nos. 06104916,8151027501000021Research Foundation of Science and Technology PlanProject in Guangdong Province under Grant No. 2008B050100040Program for New Century Excellent Talents in University ofChina under Grant No. NCET-06-0727the Fundamental Research Funds for the Central Universities,SCUT,under Grant No.2009ZM0008
文摘Many latest high performance distributed computational environments come with high bandwidth in commu- nication. Such high bandwidth distributed systems provide unprecedented opportunities for analyzing huge datasets, but simultaneously posts new technical challenges. For users, progressive query answering is important. For utility of systems, load balancing is critical. How we can achieve progressive and load balancing distributed computation is an interesting and promising research direction. As skyline analysis has been shown very useful in many multi-criteria decision making applications, in this paper, we study the problem of progressive and load balancing distributed skyline analysis. We propose a simple yet scalable approach which comes with several nice properties for progressive and load balancing query answering. We conduct extensive experiments which demonstrate the feasibility and effectiveness of the proposed method.
基金This work was supported by the National Key Research and Development Program of China(2021YFB2900603)the National Natural Science Foundation of China(61831008).
文摘A dynamic multi-beam resource allocation algorithm for large low Earth orbit(LEO)constellation based on on-board distributed computing is proposed in this paper.The allocation is a combinatorial optimization process under a series of complex constraints,which is important for enhancing the matching between resources and requirements.A complex algorithm is not available because that the LEO on-board resources is limi-ted.The proposed genetic algorithm(GA)based on two-dimen-sional individual model and uncorrelated single paternal inheri-tance method is designed to support distributed computation to enhance the feasibility of on-board application.A distributed system composed of eight embedded devices is built to verify the algorithm.A typical scenario is built in the system to evalu-ate the resource allocation process,algorithm mathematical model,trigger strategy,and distributed computation architec-ture.According to the simulation and measurement results,the proposed algorithm can provide an allocation result for more than 1500 tasks in 14 s and the success rate is more than 91%in a typical scene.The response time is decreased by 40%com-pared with the conditional GA.
基金the Natural Science Foundation of Ningxia Province(No.2021AAC03230).
文摘Brain tumors come in various types,each with distinct characteristics and treatment approaches,making manual detection a time-consuming and potentially ambiguous process.Brain tumor detection is a valuable tool for gaining a deeper understanding of tumors and improving treatment outcomes.Machine learning models have become key players in automating brain tumor detection.Gradient descent methods are the mainstream algorithms for solving machine learning models.In this paper,we propose a novel distributed proximal stochastic gradient descent approach to solve the L_(1)-Smooth Support Vector Machine(SVM)classifier for brain tumor detection.Firstly,the smooth hinge loss is introduced to be used as the loss function of SVM.It avoids the issue of nondifferentiability at the zero point encountered by the traditional hinge loss function during gradient descent optimization.Secondly,the L_(1) regularization method is employed to sparsify features and enhance the robustness of the model.Finally,adaptive proximal stochastic gradient descent(PGD)with momentum,and distributed adaptive PGDwithmomentum(DPGD)are proposed and applied to the L_(1)-Smooth SVM.Distributed computing is crucial in large-scale data analysis,with its value manifested in extending algorithms to distributed clusters,thus enabling more efficient processing ofmassive amounts of data.The DPGD algorithm leverages Spark,enabling full utilization of the computer’s multi-core resources.Due to its sparsity induced by L_(1) regularization on parameters,it exhibits significantly accelerated convergence speed.From the perspective of loss reduction,DPGD converges faster than PGD.The experimental results show that adaptive PGD withmomentumand its variants have achieved cutting-edge accuracy and efficiency in brain tumor detection.Frompre-trained models,both the PGD andDPGD outperform other models,boasting an accuracy of 95.21%.
基金Project supported by the National Key R&D Program of China(Grant No.2023YFA1009403)the National Natural Science Foundation of China(Grant Nos.62072176 and 62472175)the“Digital Silk Road”Shanghai International Joint Lab of Trustworthy Intelligent Software(Grant No.22510750100)。
文摘In distributed quantum computing(DQC),quantum hardware design mainly focuses on providing as many as possible high-quality inter-chip connections.Meanwhile,quantum software tries its best to reduce the required number of remote quantum gates between chips.However,this“hardware first,software follows”methodology may not fully exploit the potential of DQC.Inspired by classical software-hardware co-design,this paper explores the design space of application-specific DQC architectures.More specifically,we propose Auto Arch,an automated quantum chip network(QCN)structure design tool.With qubits grouping followed by a customized QCN design,AutoArch can generate a near-optimal DQC architecture suitable for target quantum algorithms.Experimental results show that the DQC architecture generated by Auto Arch can outperform other general QCN architectures when executing target quantum algorithms.
基金This work was supported by the Shanghai Municipal Science and Technology Major Project(No.2021SHZDZX0100)the National Natural Science Foundation of China(Nos.61733018,62073035)。
文摘In this paper,we develop a distributed solver for a group of strict(non-strict)linear matrix inequalities over a multi-agent network,where each agent only knows one inequality,and all agents co-operate to reach a consensus solution in the intersection of all the feasible regions.The formulation is transformed into a distributed optimization problem by introducing slack variables and consensus constraints.Then,by the primal–dual methods,a distributed algorithm is proposed with the help of projection operators and derivative feedback.Finally,the convergence of the algorithm is analyzed,followed by illustrative simulations.
基金supported by SERB,India,through grant CRG/2021/003888financial support to UoH-IoE by MHRD,India(F11/9/2019-U3(A)).
文摘Federated Learning(FL)has become a popular training paradigm in recent years.However,stragglers are critical bottlenecks in an Internet of Things(IoT)network while training.These nodes produce stale updates to the server,which slow down the convergence.In this paper,we studied the impact of the stale updates on the global model,which is observed to be significant.To address this,we propose a weighted averaging scheme,FedStrag,that optimizes the training with stale updates.The work is focused on training a model in an IoT network that has multiple challenges,such as resource constraints,stragglers,network issues,device heterogeneity,etc.To this end,we developed a time-bounded asynchronous FL paradigm that can train a model on the continuous iflow of data in the edge-fog-cloud continuum.To test the FedStrag approach,a model is trained with multiple stragglers scenarios on both Independent and Identically Distributed(IID)and non-IID datasets on Raspberry Pis.The experiment results suggest that the FedStrag outperforms the baseline FedAvg in all possible cases.
基金supported in part by the Key Research and Development Program of Shaanxi under Grant 2023-ZDLGY-34.
文摘Spark performs excellently in large-scale data-parallel computing and iterative processing.However,with the increase in data size and program complexity,the default scheduling strategy has difficultymeeting the demands of resource utilization and performance optimization.Scheduling strategy optimization,as a key direction for improving Spark’s execution efficiency,has attracted widespread attention.This paper first introduces the basic theories of Spark,compares several default scheduling strategies,and discusses common scheduling performance evaluation indicators and factors affecting scheduling efficiency.Subsequently,existing scheduling optimization schemes are summarized based on three scheduling modes:load characteristics,cluster characteristics,and matching of both,and representative algorithms are analyzed in terms of performance indicators and applicable scenarios,comparing the advantages and disadvantages of different scheduling modes.The article also explores in detail the integration of Spark scheduling strategies with specific application scenarios and the challenges in production environments.Finally,the limitations of the existing schemes are analyzed,and prospects are envisioned.
基金supported by the National Key Research and Development Program of China(2023YFB3001501)the National Natural Science Foundation of China(NSFC)(62302133)+3 种基金the Key Research and Development Program of Zhejiang Province(2024C01026)the Yangtze River Delta Project(2023ZY1068)Hangzhou Key Research Plan Project(2024SZD1A02)the GHfund A(202302019816).
文摘[Objective]This study aims to address the inefficiency of AI-for-Science tasks caused by the design and implementation challenges of applying the distributed parallel computing strategies to deep learning models,as well as their inefficient execution.[Methods]We propose an automatic distributed parallelization method for AI-for-Science tasks,called FlowAware.Based on the AI-for-Science framework JAX,this approach thoroughly analyzes task characteristics,operator structures,and data flow properties of deep learning models.By incorporating cluster topology information,it constructs a search space for distributed parallel computing strategies.Guided by load balancing and communication optimization objectives,FlowAware automatically identifies optimal distributed parallel computing strategies for AI models.[Results]Comparative experiments conducted on both GPU-like accelerator clusters and GPU clusters demonstrated that FlowAware achieves a throughput improvement of up to 7.8×compared to Alpa.[Conclusions]FlowAware effectively enhances the search efficiency of distributed parallel computing strategies for AI models in scientific computing tasks and significantly improves their computational performance.
基金This work was supported in part by the National Natural Science Foundation of China(61772493)the CAAI-Huawei MindSpore Open Fund(CAAIXSJLJJ-2020-004B)+4 种基金the Natural Science Foundation of Chongqing(China)(cstc2019jcyjjqX0013)Chongqing Research Program of Technology Innovation and Application(cstc2019jscx-fxydX0024,cstc2019jscx-fxydX0027,cstc2018jszx-cyzdX0041)Guangdong Province Universities and College Pearl River Scholar Funded Scheme(2019)the Pioneer Hundred Talents Program of Chinese Academy of Sciencesthe Deanship of Scientific Research(DSR)at King Abdulaziz University(G-21-135-38).
文摘Protein-protein interactions are of great significance for human to understand the functional mechanisms of proteins.With the rapid development of high-throughput genomic technologies,massive protein-protein interaction(PPI)data have been generated,making it very difficult to analyze them efficiently.To address this problem,this paper presents a distributed framework by reimplementing one of state-of-the-art algorithms,i.e.,CoFex,using MapReduce.To do so,an in-depth analysis of its limitations is conducted from the perspectives of efficiency and memory consumption when applying it for large-scale PPI data analysis and prediction.Respective solutions are then devised to overcome these limitations.In particular,we adopt a novel tree-based data structure to reduce the heavy memory consumption caused by the huge sequence information of proteins.After that,its procedure is modified by following the MapReduce framework to take the prediction task distributively.A series of extensive experiments have been conducted to evaluate the performance of our framework in terms of both efficiency and accuracy.Experimental results well demonstrate that the proposed framework can considerably improve its computational efficiency by more than two orders of magnitude while retaining the same high accuracy.
基金funded in part by the National Natural Science Foundation of China (Grant no. 61772352, 62172061, 61871422)National Key Research and Development Project (Grants nos. 2020YFB1711800 and 2020YFB1707900)+2 种基金the Science and Technology Project of Sichuan Province (Grants no. 2021YFG0152, 2021YFG0025, 2020YFG0479, 2020YFG0322, 2020GFW035, 2020GFW033, 2020YFH0071)the R&D Project of Chengdu City (Grant no. 2019-YF05-01790-GX)the Central Universities of Southwest Minzu University (Grants no. ZYN2022032)
文摘In LEO(Low Earth Orbit)satellite communication systems,the satellite network is made up of a large number of satellites,the dynamically changing network environment affects the results of distributed computing.In order to improve the fault tolerance rate,a novel public blockchain consensus mechanism that applies a distributed computing architecture in a public network is proposed.Redundant calculation of blockchain ensures the credibility of the results;and the transactions with calculation results of a task are stored distributed in sequence in Directed Acyclic Graphs(DAG).The transactions issued by nodes are connected to form a net.The net can quickly provide node reputation evaluation that does not rely on third parties.Simulations show that our proposed blockchain has the following advantages:1.The task processing speed of the blockchain can be close to that of the fastest node in the entire blockchain;2.When the tasks’arrival time intervals and demanded working nodes(WNs)meet certain conditions,the network can tolerate more than 50%of malicious devices;3.No matter the number of nodes in the blockchain is increased or reduced,the network can keep robustness by adjusting the task’s arrival time interval and demanded WNs.
基金partly supported by National Key Basic Research Program of China(2016YFB1000100)partly supported by National Natural Science Foundation of China(NO.61402490)。
文摘To security support large-scale intelligent applications,distributed machine learning based on blockchain is an intuitive solution scheme.However,the distributed machine learning is difficult to train due to that the corresponding optimization solver algorithms converge slowly,which highly demand on computing and memory resources.To overcome the challenges,we propose a distributed computing framework for L-BFGS optimization algorithm based on variance reduction method,which is a lightweight,few additional cost and parallelized scheme for the model training process.To validate the claims,we have conducted several experiments on multiple classical datasets.Results show that our proposed computing framework can steadily accelerate the training process of solver in either local mode or distributed mode.
文摘Mobile agents provide a new method for the distributed computation. This paper presents the advantages of using mobile agents in a distributed virtual environment (DVE) system, and describes the architecture of heterogeneous computer's distributed virtual environment system (HCWES) designed to populate some mobile agents as well as stationary agents. Finally, the paper introduces how heterogeneous computer network communication is to be realized.
文摘Recently, wireless distributed computing (WDC) concept has emerged promising manifolds improvements to current wireless technotogies. Despite the various expected benefits of this concept, significant drawbacks were addressed in the open literature. One of WDC key challenges is the impact of wireless channel quality on the load of distributed computations. Therefore, this research investigates the wireless channel impact on WDC performance when the tatter is applied to spectrum sensing in cognitive radio (CR) technology. However, a trade- off is found between accuracy and computational complexity in spectrum sensing approaches. Increasing these approaches accuracy is accompanied by an increase in computational complexity. This results in greater power consumption and processing time. A novel WDC scheme for cyclostationary feature detection spectrum sensing approach is proposed in this paper and thoroughly investigated. The benefits of the proposed scheme are firstly presented. Then, the impact of the wireless channel of the proposed scheme is addressed considering two scenarios. In the first scenario, workload matrices are distributed over the wireless channel
基金supported by the National Natural Science Foundation of China(No.21476119)
文摘The title complex is widely used as an efficient key component of Ziegler-Natta catalyst for stereospecific polymerization of dienes to produce synthetic rubbers. However, the quantitative structure-activity relationship(QSAR) of this kind of complexes is still not clear mainly due to the difficulties to obtain their geometric molecular structures through laboratory experiments. An alternative solution is the quantum chemistry calculation in which the comformational population shall be determined. In this study, ten conformers of the title complex were obtained with the function of molecular dynamics conformational search in Gabedit 2.4.8, and their geometry optimization and thermodynamics calculation were made with a Sparkle/PM7 approach in MOPAC 2012. Their Gibbs free energies at 1 atm. and 298.15 K were calculated. Population of the conformers was further calculated out according to the theory of Boltzmann distribution, indicating that one of the ten conformers has a dominant population of 77.13%.
基金This research work is supported by Hunan Provincial Natural Science Foundation of China(Grant No.2017JJ2016)Hunan Provincial Education Science 13th Five-Year Plan(Grant No.XJK016BXX001)+3 种基金Social Science Foundation of Hunan Province(Grant No.17YBA049)2017 Hunan Provincial Higher Education Teaching Re-form Research Project(Grant No.564)Scientific Research Fund of Hunan Provin-cial Education Department(Grant No.16C0269 and No.17B046)The work is also sup-ported by Open foundation for University Innovation Platform from Hunan Province,China(Grand No.16K013)and the 2011 Collaborative Innovation Center of Big Data for Finan-cial and Economical Asset Development and Utility in Universities of Hunan Province.We also thank the anonymous reviewers for their valuable comments and insightful sug-gestions.
文摘In the large-scale Distributed Virtual Environment(DVE)multimedia systems,one of key challenges is to distributedly preserve causal order delivery of messages in real time.Most of the existing causal order control approaches with real-time constraints use vector time as causal control information which is closely coupled with system scales.As the scale expands,each message is attached a large amount of control information that introduces too much network transmission overhead to maintain the real-time causal order delivery.In this article,a novel Lightweight Real-Time Causal Order(LRTCO)algorithm is proposed for large-scale DVE multimedia systems.LRTCO predicts and compares the network transmission times of messages so as to select the proper causal control information of which the amount is dynamically adapted to the network latency variations and unconcerned with system scales.The control information in LRTCO is effective to preserve causal order delivery of messages and lightweight to maintain the real-time property of DVE systems.Experimental results demonstrate that LRTCO costs low transmission overhead and communication bandwidth,reduces causal order violations efficiently,and improves the scalability of DVE systems.
基金This project was supported by the National Natural Science Foundation of China (69931010).
文摘To meet the challenge of implementing rapidly advanced, time-consuming medical image processing algorithms, it is necessary to develop a medical image processing technology to process a 2D or 3D medical image dynamically on the web. But in a premier system, only static image processing can be provided with the limitation of web technology. The development of Java and CORBA (common object request broker architecture) overcomes the shortcoming of the web static application and makes the dynamic processing of medical images on the web available. To develop an open solution of distributed computing, we integrate the Java, and web with the CORBA and present a web-based medical image dynamic processing methed, which adopts Java technology as the language to program application and components of the web and utilies the CORBA architecture to cope with heterogeneous property of a complex distributed system. The method also provides a platform-independent, transparent processing architecture to implement the advanced image routines and enable users to access large dataset and resources according to the requirements of medical applications. The experiment in this paper shows that the medical image dynamic processing method implemented on the web by using Java and the CORBA is feasible.