The 6th generation mobile network (6G) is a multi-network interconnection and multi-scenario coexistence network, in which multiple network domains break their original fixed boundaries to form connections and convergence. With the optimization objective of maximizing network utility while ensuring performance-centric weighted fairness among flows, this paper designs a reinforcement learning-based cloud-edge autonomous multi-domain data center network architecture that achieves single-domain autonomy and multi-domain collaboration. Because the utilities of different flows conflict, the fair bandwidth allocation problem for various types of flows is formulated with differently defined reward functions. Regarding the tradeoff between fairness and utility, this paper handles the corresponding reward functions for the cases where the flows change abruptly and where they change smoothly. In addition, to accommodate the Quality of Service (QoS) requirements of multiple types of flows, this paper proposes a multi-domain autonomous routing algorithm called LSTM+MADDPG. Introducing a Long Short-Term Memory (LSTM) layer in the actor and critic networks adds more information about temporal continuity, further enhancing the ability to adapt to changes in the dynamic network environment. The LSTM+MADDPG algorithm is compared with the latest reinforcement learning algorithms through experiments on a real network topology and traffic traces, and the experimental results show that LSTM+MADDPG improves the delay convergence speed by 14.6% and delays the start moment of packet loss by 18.2% compared with other algorithms.
With the emergence of diverse applications in data centers, the demands on quality of service in data centers have also become diverse, such as high throughput for elephant flows and low latency for deadline-sensitive flows. However, traditional TCPs are ill-suited to such situations and often make data transfers inefficient (e.g., missed flow deadlines, throughput collapse). This further degrades the user-perceived quality of service (QoS) in data centers. To reduce the flow completion time of mice and deadline-sensitive flows while promoting the throughput of elephant flows, this paper proposes an efficient and deadline-aware priority-driven congestion control (PCC) protocol, which grants mice and deadline-sensitive flows the highest priority. Specifically, PCC computes the priority of each flow according to the size of the transmitted data, the remaining data volume, and the flow's deadline. PCC then adjusts the congestion window according to the flow priority and the degree of network congestion. Furthermore, switches in data centers control the input/output of packets based on the flow priority and the queue length. Unlike existing TCPs, PCC provides an effective method to compute and encode the flow priority explicitly, in order to speed up the data transfers of mice and deadline-sensitive flows. Based on the flow priority, switches can manage packets efficiently and ensure the data transfers of high-priority flows through weighted priority scheduling with minor modification. The experimental results show that PCC can improve the data transfer performance of mice and deadline-sensitive flows while guaranteeing the throughput of elephant flows.
1 Introduction The history of data centers can be traced back to the 1960s. Early data centers were deployed on mainframes that were time-shared by users via remote terminals. The boom in data centers came during the internet era. Many companies started building large internet-connected facilities,
To address the high operating costs and large energy waste of current data center network architectures, we propose a trusted flow preemption scheduling scheme combined with an energy-saving routing mechanism, based on a typical data center network architecture. The mechanism lets each network flow use its own exclusive link bandwidth and transmission path, which improves link utilization and network energy efficiency. Meanwhile, we apply trusted computing to guarantee a routing and forwarding service with high security, high performance, and high fault tolerance, which helps improve the average completion time of network flows.
Data center networks may comprise tens or hundreds of thousands of nodes and, naturally, suffer from frequent software and hardware failures as well as link congestion. Packets are routed along the shortest paths with sufficient resources to facilitate efficient network utilization and minimize delays. In such dynamic networks, links frequently fail or become congested, making the recalculation of the shortest paths a computationally intensive problem. Various routing protocols were proposed to overcome this problem by focusing on network utilization rather than speed. Surprisingly, the design of fast shortest-path algorithms for data centers has been largely neglected, though they are universal components of routing protocols. Moreover, parallelization techniques were mostly deployed for random network topologies, and not for the regular topologies that are often found in data centers. The aim of this paper is to improve scalability and reduce the time required for the shortest-path calculation in data center networks by parallelization on general-purpose hardware. We propose a novel algorithm that parallelizes edge relaxations as a faster and more scalable solution for popular data center topologies.
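The idea of parallelizing edge relaxations can be illustrated with a generic round-based Bellman-Ford sketch. This is not the algorithm proposed in the paper, which the abstract does not specify; it only shows why relaxations within one round are independent of each other and could therefore be distributed across workers. The function names and the 4-node topology are hypothetical.

```python
import math

def relax_round(dist, edges):
    """One synchronous round: every edge is relaxed against the distances
    of the previous round, so no relaxation depends on another one in the
    same round and the loop body could run in parallel."""
    new_dist = list(dist)
    for u, v, w in edges:
        if dist[u] + w < new_dist[v]:
            new_dist[v] = dist[u] + w
    return new_dist

def shortest_paths(num_nodes, edges, source):
    """Repeat rounds until distances stop changing (at most n - 1 rounds)."""
    dist = [math.inf] * num_nodes
    dist[source] = 0
    for _ in range(num_nodes - 1):
        new_dist = relax_round(dist, edges)
        if new_dist == dist:
            break
        dist = new_dist
    return dist

# Hypothetical topology: three unit-cost links in a chain plus one costly shortcut.
links = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (0, 3, 5)]
edges = links + [(v, u, w) for u, v, w in links]  # make links bidirectional
distances = shortest_paths(4, edges, source=0)
```

In this sketch the sequential loop inside `relax_round` stands in for the parallel step; a real implementation would need an atomic minimum or a reduction when several edges update the same node.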
In data centers, transmission control protocol (TCP) incast causes catastrophic goodput degradation for applications with a many-to-one traffic pattern. In this paper, we aim to tame incast at the receiver-side application. Towards this goal, we first develop an analytical model that formulates the incast probability as a function of connection variables and network environment settings. We combine the model with optimization theory and derive insights into minimizing the incast probability by tuning the connection variables related to applications. Then, guided by the analytical results, we propose an adaptive application-layer solution to TCP incast. The solution allocates advertised windows equally to concurrent connections and dynamically adapts the number of concurrent connections to the varying conditions. Simulation results show that our solution consistently avoids incast and achieves high goodput in various scenarios, including those with multiple bottleneck links and background TCP traffic.
As a critical infrastructure of cloud computing, data center networks (DCNs) directly determine the service performance of data centers, which provide computing services for various applications such as big data processing and artificial intelligence. However, current data center network architectures suffer from long routing paths and low fault tolerance between source and destination servers, which makes it hard to satisfy the requirements of high-performance data center networks. Based on dual-port servers and the Clos network structure, this paper proposes a novel architecture, RClos, to construct high-performance data center networks. Logically, the proposed architecture is constructed by inserting a dual-port server between each pair of adjacent switches in the switch fabric, where the switches are connected in the form of a ring Clos structure. We describe the structural properties of RClos in terms of network scale, bisection bandwidth, and network diameter. The RClos architecture inherits the characteristics of its embedded Clos network and can accommodate a large number of servers with a small average path length. The proposed architecture also offers high fault tolerance, which makes it suitable for constructing various data center networks. For example, the average path length between servers is 3.44, and the standardized bisection bandwidth is 0.8 in RClos(32,5). The results of numerical experiments show that RClos has a small average path length and high network fault tolerance, which are essential in the construction of high-performance data center networks.
Many "rich-connected" topologies with multiple parallel paths between servers have been proposed for data center networks recently to provide high bisection bandwidth, but it remains challenging to fully utilize the high network capacity with appropriate multi-path routing algorithms. As flow-level path splitting may lead to traffic imbalance between paths due to differences in flow size, packet-level path splitting has attracted more attention lately; it spreads the packets of a flow over multiple available paths and significantly improves link utilization. However, it may cause packet reordering, which confuses the TCP congestion control algorithm and lowers the throughput of flows. In this paper, we design a novel packet-level multi-path routing scheme called SOPA, which leverages OpenFlow to perform packet-level path splitting in a round-robin fashion, and hence significantly mitigates the packet reordering problem and improves the network throughput. Moreover, SOPA leverages the topological features of data center networks to encode a very small number of switches along the path into the packet header, resulting in very light overhead. Compared with random packet spraying (RPS), Hedera and equal-cost multi-path routing (ECMP), our simulations demonstrate that SOPA achieves 29.87%, 50.41% and 77.74% higher network throughput respectively under permutation workload, and reduces average data transfer completion time by 53.65%, 343.31% and 348.25% respectively under production workload.
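Packet-level splitting in a round-robin fashion can be pictured with a small sketch. This is only an illustration of strict per-packet rotation over equal-cost paths; it does not model SOPA's OpenFlow rules or its header encoding, and the path names and function name are hypothetical.

```python
from collections import Counter
from itertools import cycle

def spray_round_robin(num_packets, paths):
    """Assign consecutive packets of one flow to the paths in strict rotation."""
    chooser = cycle(paths)
    return [next(chooser) for _ in range(num_packets)]

paths = ["path_a", "path_b", "path_c"]  # hypothetical equal-cost paths
assignment = spray_round_robin(7, paths)
load = Counter(assignment)  # packets carried by each path
```

With strict rotation the per-path packet counts differ by at most one, whereas a random choice per packet balances the paths only on average.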
In a data center network (DCN), load balancing is required when servers transfer data on the same path, in order to avoid congestion. Load balancing is challenged by dynamically changing transfer demands and complex routing control. Because of the distributed nature of a traditional network, previous research on load balancing has mostly focused on improving the performance of the local network; thus, the load has not been optimally balanced across the entire network. In this paper, we propose a novel dynamic load-balancing algorithm for fat-tree. This algorithm avoids congestion to the greatest possible extent by searching for non-conflicting paths in a centralized way. We implement the algorithm in the popular software-defined networking architecture and evaluate the algorithm's performance on the Mininet platform. The results show that our algorithm achieves higher bisection bandwidth than the traditional equal-cost multi-path load-balancing algorithm and thus avoids congestion more effectively.
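For context, the equal-cost multi-path baseline mentioned above typically picks a path by hashing a flow's header fields. The sketch below illustrates that baseline only, not the proposed centralized algorithm; the choice of CRC32 as the hash and the 5-tuple values are assumptions made for the example.

```python
import zlib

def ecmp_path(flow, num_paths):
    """Map a flow 5-tuple to a path index with a deterministic hash.
    Every packet of the same flow lands on the same path."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_paths

# Hypothetical flow: (source IP, destination IP, source port, destination port, protocol)
flow = ("10.0.0.1", "10.0.1.2", 40000, 80, "tcp")
first = ecmp_path(flow, 4)
second = ecmp_path(flow, 4)
```

Because the choice depends only on the header hash and not on current link load, two large flows can hash to the same path, which is the kind of conflict a centralized search for non-conflicting paths tries to avoid.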
The heterogeneity of applications and their divergent resource requirements lead to uneven traffic distribution and imbalanced resource utilization across data center networks (DCNs). We propose a fine-grained baseband function reallocation scheme in heterogeneous optical switching-based DCNs. A deep reinforcement learning-based functional split and resource mapping approach (DRL-BFM) is proposed to maximize throughput in high-load server racks by implementing load balancing in DCNs. The results demonstrate that DRL-BFM improves the throughput by 20.8%, 22.8%, and 29.8% on average compared to existing algorithms under different computational capacities, bandwidth constraints, and latency conditions, respectively.
Network updates have become increasingly prevalent since the broad adoption of software-defined networks (SDNs) in data centers. Modern TCP designs, including the cutting-edge variants DCTCP, CUBIC, and BBR, however, are not resilient to network updates that provoke flow rerouting. In this paper, we first demonstrate that popular TCP implementations perform inadequately in the presence of frequent and inconsistent network updates, because such updates cause out-of-order packets and packet drops through transitory congestion and lead to serious performance deterioration. We look into the causes and propose a network update-friendly TCP (NUFTCP), an extension of the DCTCP variant, as a solution. Simulations are used to assess the proposed NUFTCP. Our findings reveal that NUFTCP manages the out-of-order packets and packet drops triggered by network updates more effectively, and that it outperforms DCTCP considerably.
Cloud Datacenter Network (CDN) providers usually have the option to scale their network structures to allow for far more resource capacity, though such scaling options may come with exponential costs that contradict their utility objectives. Besides the cost of the physical assets and network resources, such scaling may also impose more load on the electricity power grids that feed the added nodes with the energy required to run and cool them, which comes with extra costs too. Thus, CDN providers who utilize their resources better can offer their services at lower unit prices than others who simply choose the scaling solutions. Resource utilization is quite challenging; indeed, clients of CDNs usually tend to exaggerate their true resource requirements when they lease their resources. Service providers are committed to their clients through Service Level Agreements (SLAs). Therefore, any amendment to the resource allocations needs to be approved by the clients first. In this work, we propose deploying a Stackelberg leadership framework to formulate a negotiation game between the cloud service providers and their client tenants. Through this, the providers seek to retrieve leased but unused resources from their clients. Cooperation is not expected from the clients, and they may ask for high unit prices to return their extra resources to the provider's premises. Hence, to motivate cooperation in such a non-cooperative game, we developed an incentive-compatible pricing model for the returned resources as an extension to Vickrey auctions. Moreover, we also proposed building a behavior belief function that shapes the way of negotiation and compensation for each client. Compared to other benchmark models, the assessment results show that our proposed models provide for timely negotiation schemes, allowing for better resource utilization rates, higher utilities, and grid-friendly CDNs.
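As background for the pricing model, the classic Vickrey (second-price sealed-bid) auction can be sketched in a few lines. This shows only the textbook mechanism that the abstract says is being extended, not the authors' incentive-compatible model or their belief function; the tenant names and bid values are hypothetical.

```python
def vickrey_auction(bids):
    """Textbook second-price auction: the highest bidder wins
    but pays the second-highest bid."""
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price

# Hypothetical sealed bids from three tenants
winner, price = vickrey_auction({"tenant_a": 10.0, "tenant_b": 7.5, "tenant_c": 4.0})
```

Because the winner's payment does not depend on the winner's own bid, bidding one's true valuation is a dominant strategy, which is the incentive-compatibility property the abstract refers to.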
The capability of the data center network largely determines the performance of cloud computing. However, the number of servers in the data center network keeps growing because of the continuous growth of application requirements. Improving the performance of cloud computing therefore faces the great challenge of how to connect a large number of servers into a data center network with promising performance. Traditional tree-based data center networks have issues such as bandwidth bottlenecks and single points of switch failure. Recently proposed data center networks such as DCell, FiConn, and BCube have larger bandwidth and better fault tolerance than traditional tree-based data center networks. Nonetheless, for DCell and FiConn, the length of the fault-tolerant path between servers increases when switches fail; BCube requires higher-performance switches when its scale is enlarged. Based on the above considerations, we propose a new server-centric data center network with excellent performance, called BCDC, based on the crossed cube. Then, we study the connectivity of BCDC networks. Furthermore, we propose communication algorithms and a fault-tolerant routing algorithm for BCDC networks. Moreover, we analyze the performance and time complexities of the proposed algorithms in BCDC networks. Our research will provide the basis for the design and implementation of a new family of data center networks.
In the rising tide of the Internet of things, more and more things in the world are connected to the Internet. Recently, data have kept growing at a rate more than four times that expected from Moore's law. This explosion of data comes from various sources such as mobile phones, video cameras and sensor networks, which often present multidimensional characteristics. The huge amount of data brings many challenges for the management, transportation, and processing IT infrastructures. To address these challenges, state-of-the-art large-scale data center networks have begun to provide cloud services that are increasingly prevalent. However, how to build a good data center remains an open challenge. Concurrently, the architecture design, which significantly affects the total performance, is of great research interest. This paper surveys advances in data center network design. We first introduce the upcoming trends in the data center industry. Then we review some popular design principles for today's data center network architectures. In the third part, we present some up-to-date data center frameworks and make a comprehensive comparison of them. During the comparison, we observe that there is no so-called optimal data center and that the design should differ with respect to data placement, replication, processing, and query processing. After that, several existing challenges and limitations are discussed. Based on these observations, we point out some possible future research directions.
In modern data centers, the power consumed by the network is a noticeable portion of the total energy budget, and thus improving the energy efficiency of data center networks (DCNs) truly matters. One effective way to achieve this energy efficiency is to make the size of DCNs elastic with respect to traffic demands through flow consolidation and bandwidth scheduling, i.e., turning off unnecessary network components to reduce power consumption. Meanwhile, with its inherent support for data center management, software defined networking (SDN) provides a paradigm to elastically control the resources of DCNs. To achieve such power savings, most prior efforts adopt a simple greedy heuristic to reduce computational complexity. However, due to the inherent limitations of greedy algorithms, a good-enough optimization cannot always be guaranteed. To address this problem, a modified hybrid genetic algorithm (MHGA) is employed to improve the accuracy of the solution, and the fine-grained routing function of SDN is fully leveraged. The simulation results show that more efficient power management can be achieved than in previous studies, increasing network energy savings by about 5%.
Data Center Networks (DCNs) are the fundamental infrastructure for cloud computing. Driven by the massive parallel computing tasks in cloud computing, one-to-many data dissemination has become one of the most important traffic patterns in DCNs. Many architectures and protocols have been proposed to meet this demand. However, these proposals either require complicated configurations on switches and servers, or cannot deliver optimal performance. In this paper, we propose peer-assisted data dissemination for DCNs. This approach utilizes the rich physical connections with high bandwidth and multi-path connections to facilitate efficient one-to-many data dissemination. We prove that an optimal P2P data dissemination schedule exists for FatTree, a specially designed DCN architecture. We then present a theoretical analysis of this algorithm in the general multi-rooted tree topology, a widely used DCN architecture. Additionally, we explore the performance of an intuitive line structure for data dissemination. Our analysis and experimental results show that this simple structure is able to produce performance comparable to that of the optimal algorithm. Since DCN applications rely heavily on virtualization to achieve optimal resource sharing, we present a general implementation method for the proposed algorithms, which aims to mitigate the impact of the potentially high churn rate of the virtual machines.
The data center network (DCN), which is an important component of data centers, consists of a large number of hosted servers and switches connected with high-speed communication links. A DCN enables the centralized deployment of resources and gives users on-demand access to the information and services of data centers. In recent years, the scale of the DCN has constantly increased with the widespread use of cloud-based services and the unprecedented amount of data delivered in and between data centers, whereas the traditional DCN architecture lacks the aggregate bandwidth, scalability, and cost effectiveness to cope with the increasing demands of tenants accessing the services of cloud data centers. Therefore, a novel DCN architecture with the features of scalability, low cost, robustness, and energy conservation is required. This paper reviews recent research findings and technologies of DCN architectures to identify the issues in the existing DCN architectures for cloud computing. We develop a taxonomy for the classification of current DCN architectures, and also qualitatively analyze the traditional and contemporary DCN architectures. Moreover, the DCN architectures are compared on the basis of significant characteristics such as bandwidth, fault tolerance, scalability, overhead, and deployment cost. Finally, we put forward open research issues in the deployment of scalable, low-cost, robust, and energy-efficient DCN architectures for data centers in computational clouds.
Cloud data centers now provide a plethora of rich online applications such as web search, social networking, and cloud computing. A key challenge for such applications, however, is to meet soft real-time constraints. Due to the deadline-agnostic congestion control in the Transmission Control Protocol (TCP), many deadline-sensitive flows cannot finish transmission before their deadlines. In this paper, we propose an SDN-based Explicit-Deadline-aware TCP (SED) for cloud Data Center Networks (DCNs). SED first assigns a base rate to non-deadline flows and then gives as much spare bandwidth as possible to the deadline flows. Subsequently, a Retransmission-enhanced SED (RSED) is introduced to solve the packet-loss timeout problem. Through our experiments, we show that SED can make flows meet deadlines effectively, and that it significantly outperforms previous protocols in the cloud data center environment.
Ethernet link aggregation, which provides an easy and cost-effective way to increase both bandwidth and link availability between a pair of devices, is well suited for data center networks. However, all the traffic splitting algorithms used in existing Ethernet link aggregation are flow-level, and they do not work well owing to the traffic characteristics of data centers. Though frame-level traffic splitting can achieve optimal load balance and the maximum benefit from the aggregated capacity, it is generally deprecated because of frame disordering, which can disrupt the operation of many Internet protocols, most notably the transmission control protocol (TCP). To address this issue, we first investigate the causes of frame disordering in link aggregation and find that all of them either no longer hold or can be prevented in data centers. Then we present a byte-counter frame-level traffic splitting algorithm that achieves optimal performance while causing no frame disordering. The only requirement is that the frames in a flow are the same size, which can be easily met in data centers. Simulation results show that the proposed frame-level traffic splitting method achieves higher throughput and optimal load balance. The average completion time of flows of different sizes is reduced by 24% on average and by up to 46%.
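One plausible reading of byte-counter frame-level splitting is sketched below. The abstract does not give the algorithm's details, so this is an assumption rather than the authors' method: each frame goes to the member link that has carried the fewest bytes so far, with ties broken by the lowest link index. All names are hypothetical.

```python
def split_frames(frame_sizes, num_links):
    """Send each frame on the link with the smallest byte counter
    (assumed policy; ties go to the lowest link index)."""
    byte_counters = [0] * num_links
    assignment = []
    for size in frame_sizes:
        link = min(range(num_links), key=lambda i: byte_counters[i])
        byte_counters[link] += size
        assignment.append(link)
    return assignment, byte_counters

# Six equal-size frames of one flow over three aggregated links
assignment, counters = split_frames([1500] * 6, num_links=3)
```

When all frames in a flow have the same size, as the abstract requires, this policy reduces to a fixed rotation over the links and keeps the per-link byte counts equal.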
Currently, elastic interconnection has realized high-rate data transmission among data centers (DCs), and thus the elastic data center network (EDCN) has emerged. In EDCNs, it is essential to achieve virtual network (VN) embedding, which includes two main components: virtual machine (VM) mapping and virtual link (VL) mapping. In VM mapping, we allocate appropriate servers to hold VMs, while in VL mapping, an optimal substrate path is determined for each virtual lightpath. For VN embedding in EDCNs, power efficiency is a significant concern, and some solutions have been proposed that put lightly loaded servers to sleep. However, the increasing communication traffic between VMs leads to a serious energy dissipation problem, since it also consumes a great amount of energy on switches even when the energy-efficient optical transmission technique is used. In this paper, considering load balancing and power-efficient VN embedding, we formulate the problem and design a novel heuristic for EDCNs, with the objective of achieving power savings on servers and switches. In our solution, VMs are mapped into a single DC or into multiple DCs a short distance from each other, and servers in the same cluster or adjacent clusters are preferred to hold VMs, so that a large number of servers and switches become vacant and can go into sleep mode. Simulation results demonstrate that our method performs well in terms of power savings and load balancing. Compared with benchmarks, the improvement ratio of power efficiency is 5%–13%.
Funding: Supported in part by the National Natural Science Foundation of China (61601252, 61801254), the Public Technology Projects of Zhejiang Province (LG-G18F020007), the Zhejiang Provincial Natural Science Foundation of China (LY20F020008, LY18F020011, LY20F010004), and the K.C. Wong Magna Fund in Ningbo University.
Funding: Supported by the ZTE-BJTU Collaborative Research Program under Grant No. K11L00190 and the Fundamental Research Funds for the Central Universities under Grant No. K12JB00060.
Abstract: 1 Introduction. The history of data centers can be traced back to the 1960s. Early data centers were deployed on mainframes that were time-shared by users via remote terminals. The boom in data centers came during the internet era. Many companies started building large internet-connected facilities,
Funding: Supported by the National Natural Science Foundation of China (The Key Trusted Running Technologies for the Sensing Nodes in Internet of Things: 61501007) and the Outstanding Personnel Training Program of the Beijing Municipal Party Committee Organization Department (The Research of Trusted Computing Environment for Internet of Things in Smart City: 2014000020124G041).
Abstract: In view of the high operating costs and considerable energy waste in current data center network architectures, we propose a trusted flow preemption scheduling scheme combined with an energy-saving routing mechanism, based on a typical data center network architecture. The mechanism gives each network flow its own exclusive link bandwidth and transmission path, which improves link utilization and network energy efficiency. Meanwhile, we apply trusted computing to guarantee a routing and forwarding service with high security, high performance, and high fault tolerance, which helps improve the average completion time of network flows.
Funding: This work was supported by the Serbian Ministry of Science and Education (project TR-32022) and by the companies Telekom Srbija and Informatika.
Abstract: Data center networks may comprise tens or hundreds of thousands of nodes and naturally suffer from frequent software and hardware failures as well as link congestion. Packets are routed along the shortest paths with sufficient resources to facilitate efficient network utilization and minimize delays. In such dynamic networks, links frequently fail or become congested, making the recalculation of the shortest paths a computationally intensive problem. Various routing protocols were proposed to overcome this problem by focusing on network utilization rather than speed. Surprisingly, the design of fast shortest-path algorithms for data centers was largely neglected, though they are universal components of routing protocols. Moreover, parallelization techniques were mostly deployed for random network topologies, and not for the regular topologies that are often found in data centers. The aim of this paper is to improve scalability and reduce the time required for the shortest-path calculation in data center networks by parallelization on general-purpose hardware. We propose a novel algorithm that parallelizes edge relaxations as a faster and more scalable solution for popular data center topologies.
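To illustrate what "edge relaxation" means in the abstract above, the following is a minimal sketch, not the paper's algorithm. It is a plain Bellman-Ford variant in which every relaxation in a round reads from a snapshot of the previous round's distances, so the relaxations within a round are independent of one another and could in principle be distributed across workers. The function name `shortest_paths` and the edge-list representation are assumptions made for this example, and the loop is written sequentially for clarity.

```python
import math


def shortest_paths(num_nodes, edges, source):
    """Single-source shortest paths by rounds of edge relaxation.

    edges is a list of directed (u, v, weight) tuples. Hypothetical helper
    for illustration only; it does not reproduce the paper's method.
    """
    dist = [math.inf] * num_nodes
    dist[source] = 0
    # At most num_nodes - 1 rounds are needed when there are no negative cycles.
    for _ in range(num_nodes - 1):
        snapshot = dist[:]  # read-only view used by every relaxation this round
        changed = False
        for u, v, weight in edges:
            candidate = snapshot[u] + weight
            # Relax edge (u, v): keep the shorter of the known and candidate paths.
            if candidate < dist[v]:
                dist[v] = candidate
                changed = True
        if not changed:
            break  # distances are stable, so further rounds cannot improve them
    return dist


if __name__ == "__main__":
    links = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 1), (2, 3, 5)]
    print(shortest_paths(4, links, source=0))
```

Because each round only reads `snapshot` and takes a minimum per destination, the inner loop is the part a parallel implementation would split up; how the paper partitions the work for regular data center topologies is not stated in the abstract.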
Funding: Supported by the Fundamental Research Funds for the Central Universities under Grant No. ZYGX2015J009 and the Sichuan Province Scientific and Technological Support Project under Grants No. 2014GZ0017 and No. 2016GZ0093.
Abstract: In data centers, transmission control protocol (TCP) incast causes catastrophic goodput degradation for applications with a many-to-one traffic pattern. In this paper, we intend to tame incast at the receiver-side application. Towards this goal, we first develop an analytical model that formulates the incast probability as a function of connection variables and network environment settings. We combine the model with optimization theory and derive some insights into minimizing the incast probability by tuning connection variables related to applications. Then, guided by the analytical results, we propose an adaptive application-layer solution to TCP incast. The solution allocates advertised windows equally to concurrent connections and dynamically adapts the number of concurrent connections to the varying conditions. Simulation results show that our solution consistently avoids incast and achieves high goodput in various scenarios, including those with multiple bottleneck links and background TCP traffic.
Funding: This work was supported by the Hainan Provincial Natural Science Foundation of China (620RC560, 2019RC096, 620RC562); the Scientific Research Setup Fund of Hainan University (KYQD(ZR)1877); the National Natural Science Foundation of China (62162021, 82160345, 61802092); the Key Research and Development Program of Hainan Province (ZDYF2020199, ZDYF2021GXJS017); and the Key Science and Technology Plan Project of Haikou (2011-016).
Abstract: As a critical infrastructure of cloud computing, data center networks (DCNs) directly determine the service performance of data centers, which provide computing services for various applications such as big data processing and artificial intelligence. However, current data center network architectures suffer from long routing paths and low fault tolerance between source and destination servers, which makes it hard to satisfy the requirements of high-performance data center networks. Based on dual-port servers and the Clos network structure, this paper proposes a novel architecture, RClos, for constructing high-performance data center networks. Logically, the proposed architecture is constructed by inserting a dual-port server into each pair of adjacent switches in the switch fabric, where switches are connected in the form of a ring Clos structure. We describe the structural properties of RClos in terms of network scale, bisection bandwidth, and network diameter. The RClos architecture inherits the characteristics of its embedded Clos network, which can accommodate a large number of servers with a small average path length. The proposed architecture offers high fault tolerance, which suits the construction of various data center networks. For example, the average path length between servers is 3.44, and the standardized bisection bandwidth is 0.8 in RClos(32,5). The results of numerical experiments show that RClos enjoys a small average path length and high network fault tolerance, which are essential in the construction of high-performance data center networks.
Funding: Supported by the National Basic Research Program of China (973 Program) under Grants No. 2014CB347800 and No. 2012CB315803; the National High-Tech R&D Program of China (863 Program) under Grant No. 2013AA013303; the Natural Science Foundation of China under Grants No. 61170291, No. 61133006, and No. 61161140454; and the ZTE Industry-Academia-Research Cooperation Funds.
Abstract: Many "rich-connected" topologies with multiple parallel paths between servers have been proposed for data center networks recently to provide high bisection bandwidth, but it remains challenging to fully utilize the high network capacity with appropriate multi-path routing algorithms. As flow-level path splitting may lead to traffic imbalance between paths due to differences in flow size, packet-level path splitting, which spreads packets from flows across multiple available paths and significantly improves link utilization, has attracted more attention lately. However, it may cause packet reordering, confusing the TCP congestion control algorithm and lowering the throughput of flows. In this paper, we design a novel packet-level multi-path routing scheme called SOPA, which leverages OpenFlow to perform packet-level path splitting in a round-robin fashion, and hence significantly mitigates the packet reordering problem and improves the network throughput. Moreover, SOPA leverages the topological features of data center networks to encode a very small number of switches along the path into the packet header, resulting in very light overhead. Compared with random packet spraying (RPS), Hedera, and equal-cost multi-path routing (ECMP), our simulations demonstrate that SOPA achieves 29.87%, 50.41%, and 77.74% higher network throughput respectively under permutation workload, and reduces average data transfer completion time by 53.65%, 343.31%, and 348.25% respectively under production workload.
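The round-robin packet-level path splitting mentioned in the abstract above can be illustrated with a short sketch. This is only the rotation idea, under the assumption that a flow's equal-cost paths are already known; it does not model SOPA's OpenFlow rules or its packet-header encoding, and the name `spray_round_robin` is hypothetical.

```python
def spray_round_robin(packets, paths):
    """Assign each packet of one flow to a path in strict rotation.

    Hypothetical helper for illustration; packets and paths are arbitrary
    labels, not real packet or switch objects.
    """
    if not paths:
        raise ValueError("at least one path is required")
    # Packet i goes to path i mod len(paths), so every path carries an
    # almost equal share of the flow's packets.
    return [(packet, paths[i % len(paths)]) for i, packet in enumerate(packets)]


if __name__ == "__main__":
    flow = ["p0", "p1", "p2", "p3", "p4"]
    for packet, path in spray_round_robin(flow, ["path-A", "path-B"]):
        print(packet, "->", path)
```

Compared with picking a random path per packet, strict rotation keeps the per-path packet counts within one of each other, which is one intuition for why queue lengths, and therefore reordering, stay more even across paths.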
Funding: Supported by the National Basic Research Program of China (973 Program) (2012CB315903); the Key Science and Technology Innovation Team Project of Zhejiang Province (2011R50010-05); the National Science and Technology Support Program (2014BAH24F01); the 863 Program of China (2012AA01A507); and the National Natural Science Foundation of China (61379118 and 61103200); and sponsored by the Research Fund of ZTE Corporation.
Abstract: In a data center network (DCN), load balancing is required when servers transfer data on the same path, in order to avoid congestion. Load balancing is made difficult by dynamically changing transfer demands and complex routing control. Because of the distributed nature of a traditional network, previous research on load balancing has mostly focused on improving the performance of the local network; thus, the load has not been optimally balanced across the entire network. In this paper, we propose a novel dynamic load-balancing algorithm for fat-tree. This algorithm avoids congestion to the greatest possible extent by searching for non-conflicting paths in a centralized way. We implement the algorithm in the popular software-defined networking architecture and evaluate its performance on the Mininet platform. The results show that our algorithm achieves higher bisection bandwidth than the traditional equal-cost multi-path load-balancing algorithm and thus avoids congestion more effectively.
Funding: Supported by the National Key R&D Program of China (Nos. 2023YFB2905500 and 2023YFB2805302), the National Natural Science Foundation of China (No. 62205026), the Beijing Institute of Technology Research Fund Program for Young Scholars, and the Open Fund of IPOC (BUPT).
Abstract: The heterogeneity of applications and their divergent resource requirements lead to uneven traffic distribution and imbalanced resource utilization across data center networks (DCNs). We propose a fine-grained baseband function reallocation scheme in heterogeneous optical switching-based DCNs. A deep reinforcement learning-based functional split and resource mapping approach (DRL-BFM) is proposed to maximize throughput in high-load server racks by implementing load balancing in DCNs. The results demonstrate that DRL-BFM improves the throughput by 20.8%, 22.8%, and 29.8% on average compared to existing algorithms under different computational capacities, bandwidth constraints, and latency conditions, respectively.
Funding: Supported by King Khalid University through the Large Group Project (No. RGP.2/312/44).
Abstract: Network updates have become increasingly prevalent since the broad adoption of software-defined networks (SDNs) in data centers. Modern TCP designs, including the cutting-edge variants DCTCP, CUBIC, and BBR, however, are not resilient to network updates that provoke flow rerouting. In this paper, we first demonstrate that popular TCP implementations perform inadequately in the presence of frequent and inconsistent network updates, because such updates cause out-of-order packets and packet drops induced by transitory congestion, leading to serious performance deterioration. We look into the causes and propose a network update-friendly TCP (NUFTCP), an extension of the DCTCP variant, as a solution. Simulations are used to assess the proposed NUFTCP. Our findings reveal that NUFTCP can more effectively manage the out-of-order packets and packet drops triggered by network updates, and that it outperforms DCTCP considerably.
Funding: The Deanship of Scientific Research at Hashemite University partially funded this work. This research work was also funded by the Deanship of Scientific Research at the Northern Border University, Arar, KSA, through the project number "NBU-FFR-2024-1580-08".
Abstract: Cloud Datacenter Network (CDN) providers usually have the option to scale their network structures to allow for far more resource capacity, though such scaling options may come with exponential costs that contradict their utility objectives. Besides the cost of the physical assets and network resources, such scaling may also impose more load on the electricity power grids to feed the added nodes with the energy required to run and cool them, which comes with extra costs too. Thus, those CDN providers who utilize their resources better can afford to offer their services at lower unit prices than others who simply choose the scaling solutions. Resource utilization is quite a challenging process; indeed, clients of CDNs usually tend to exaggerate their true resource requirements when they lease their resources. Service providers are committed to their clients through Service Level Agreements (SLAs). Therefore, any amendment to the resource allocations needs to be approved by the clients first. In this work, we propose deploying a Stackelberg leadership framework to formulate a negotiation game between the cloud service providers and their client tenants. Through this, the providers seek to retrieve leased but unused resources from their clients. Cooperation is not expected from the clients, and they may ask high unit prices to return their extra resources to the provider's premises. Hence, to motivate cooperation in such a non-cooperative game, we developed an incentive-compatible pricing model for the returned resources as an extension of Vickrey auctions. Moreover, we also proposed building a behavior belief function that shapes the way of negotiation and compensation for each client. Compared to other benchmark models, the assessment results show that our proposed models provide for timely negotiation schemes, allowing for better resource utilization rates, higher utilities, and grid-friendly CDNs.
Funding: This paper was supported by the National Natural Science Foundation of China under Grant Nos. 61572337, 61702351, and 61602333, the Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks Foundation under Grant No. WSNLBKF201701, the China Postdoctoral Science Foundation under Grant No. 172985, the Natural Science Foundation of Jiangsu Higher Education Institutions of China under Grant No. 17KJB520036, the Jiangsu Planned Projects for Postdoctoral Research Funds under Grant No. 1701172B, and the Application Foundation Research of Suzhou of China under Grant No. SYG201653.
Abstract: The capability of the data center network largely determines the performance of cloud computing. However, the number of servers in data center networks keeps growing because of the continuous growth of application requirements. Improving the performance of cloud computing therefore faces the great challenge of how to connect a large number of servers into a data center network with promising performance. Traditional tree-based data center networks have issues such as bandwidth bottlenecks and single-switch failures. Recently proposed data center networks such as DCell, FiConn, and BCube have larger bandwidth and better fault tolerance than traditional tree-based data center networks. Nonetheless, for DCell and FiConn, the length of the fault-tolerant path between servers increases when switches fail, and BCube requires higher-performance switches as its scale grows. Based on the above considerations, we propose a new server-centric data center network with excellent performance, called BCDC, based on the crossed cube. We then study the connectivity of BCDC networks. Furthermore, we propose communication algorithms and a fault-tolerant routing algorithm for BCDC networks. Moreover, we analyze the performance and time complexities of the proposed algorithms in BCDC networks. Our research provides a basis for the design and implementation of a new family of data center networks.
Abstract: In the rising tide of the Internet of things, more and more things in the world are connected to the Internet. Recently, data have kept growing at a rate more than four times that expected from Moore's law. This explosion of data comes from various sources such as mobile phones, video cameras, and sensor networks, which often present multidimensional characteristics. The huge amount of data brings many challenges for the management, transportation, and processing IT infrastructures. To address these challenges, state-of-the-art large-scale data center networks have begun to provide cloud services that are increasingly prevalent. However, how to build a good data center remains an open challenge. Concurrently, the architecture design, which significantly affects the total performance, is of great research interest. This paper surveys advances in data center network design. We first introduce the upcoming trends in the data center industry. Then we review some popular design principles for today's data center network architectures. In the third part, we present some up-to-date data center frameworks and make a comprehensive comparison of them. During the comparison, we observe that there is no so-called optimal data center and that the design should differ according to the data placement, replication, processing, and query processing. After that, several existing challenges and limitations are discussed. Based on these observations, we point out some possible future research directions.
Funding: Supported by the Research Fund of Ministry of Education-China Mobile (MCM20160304).
Abstract: In modern data centers, the power consumed by the network is a noticeable portion of the total energy budget, and thus improving the energy efficiency of data center networks (DCNs) truly matters. One effective way to achieve this energy efficiency is to make the size of DCNs elastic with respect to traffic demands through flow consolidation and bandwidth scheduling, i.e., turning off unnecessary network components to reduce power consumption. Meanwhile, with its inherent support for data center management, software-defined networking (SDN) provides a paradigm for elastically controlling the resources of DCNs. To achieve such power savings, most prior efforts simply adopt a greedy heuristic to reduce computational complexity. However, due to the inherent limitations of greedy algorithms, a good-enough optimization cannot always be guaranteed. To address this problem, a modified hybrid genetic algorithm (MHGA) is employed to improve the solution's accuracy, and the fine-grained routing function of SDN is fully leveraged. The simulation results show that more efficient power management can be achieved than in previous studies, increasing network energy savings by about 5%.
Funding: Supported in part by the U.S. National Science Foundation (Nos. ECCS 1128209, CNS 10655444, CCF 1028167, CNS 0948184, and CCF 0830289).
Abstract: Data Center Networks (DCNs) are the fundamental infrastructure for cloud computing. Driven by the massive parallel computing tasks in cloud computing, one-to-many data dissemination has become one of the most important traffic patterns in DCNs. Many architectures and protocols have been proposed to meet this demand. However, these proposals either require complicated configurations on switches and servers or cannot deliver optimal performance. In this paper, we propose peer-assisted data dissemination for DCNs. This approach utilizes the rich physical connections with high bandwidths and multi-path connections to facilitate efficient one-to-many data dissemination. We prove that an optimal P2P data dissemination schedule exists for FatTree, a specially designed DCN architecture. We then present a theoretical analysis of this algorithm in the general multi-rooted tree topology, a widely used DCN architecture. Additionally, we explore the performance of an intuitive line structure for data dissemination. Our analysis and experimental results show that this simple structure is able to produce performance comparable to the optimal algorithm. Since DCN applications rely heavily on virtualization to achieve optimal resource sharing, we present a general implementation method for the proposed algorithms, which aims to mitigate the impact of the potentially high churn rate of the virtual machines.
Funding: Project supported by the Malaysian Ministry of Higher Education under the University of Malaya High Impact Research Grant (No. UM.C/HIR/MOHE/FCSIT/03).
Abstract: The data center network (DCN), which is an important component of data centers, consists of a large number of hosted servers and switches connected by high-speed communication links. A DCN enables the centralization of resources and gives users on-demand access to the information and services of data centers. In recent years, the scale of the DCN has constantly increased with the widespread use of cloud-based services and the unprecedented amount of data delivered within and between data centers, whereas the traditional DCN architecture lacks the aggregate bandwidth, scalability, and cost effectiveness to cope with the increasing demands of tenants accessing the services of cloud data centers. Therefore, the design of a novel DCN architecture with the features of scalability, low cost, robustness, and energy conservation is required. This paper reviews the recent research findings and technologies of DCN architectures to identify the issues in the existing DCN architectures for cloud computing. We develop a taxonomy for the classification of current DCN architectures and qualitatively analyze the traditional and contemporary DCN architectures. Moreover, the DCN architectures are compared on the basis of significant characteristics such as bandwidth, fault tolerance, scalability, overhead, and deployment cost. Finally, we put forward open research issues in the deployment of scalable, low-cost, robust, and energy-efficient DCN architectures for data centers in computational clouds.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61370209 and 61402230).
Abstract: Cloud data centers now provide a plethora of rich online applications such as web search, social networking, and cloud computing. A key challenge for such applications, however, is to meet soft real-time constraints. Due to the deadline-agnostic congestion control in Transmission Control Protocol (TCP), many deadline-sensitive flows cannot finish transmission before their deadlines. In this paper, we propose an SDN-based Explicit-Deadline-aware TCP (SED) for cloud Data Center Networks (DCNs). SED first assigns a base rate to non-deadline flows and then gives as much spare bandwidth as possible to the deadline flows. Subsequently, a Retransmission-enhanced SED (RSED) is introduced to solve the packet-loss timeout problem. Through our experiments, we show that SED can make flows meet deadlines effectively, and that it significantly outperforms previous protocols in the cloud data center environment.
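The two-step allocation described in the abstract above (a base rate for non-deadline flows, then spare bandwidth for deadline flows) can be sketched as follows. This is an illustration under stated assumptions rather than SED itself: the function name `allocate_rates`, the rule that a deadline flow wants `remaining_bytes / seconds_to_deadline`, and the earliest-deadline-first order for handing out spare bandwidth are all hypothetical choices that the abstract does not specify.

```python
def allocate_rates(capacity, non_deadline_flows, deadline_flows, base_rate):
    """Split link capacity between non-deadline and deadline flows.

    non_deadline_flows: list of flow ids.
    deadline_flows: list of (flow_id, remaining_bytes, seconds_to_deadline).
    Hypothetical sketch; rates are in bytes per second.
    """
    # Step 1: every non-deadline flow first receives the fixed base rate.
    rates = {flow_id: base_rate for flow_id in non_deadline_flows}
    spare = capacity - base_rate * len(non_deadline_flows)
    if spare < 0:
        raise ValueError("base rates alone exceed the link capacity")
    # Step 2 (assumed policy): serve deadline flows earliest deadline first,
    # granting each the rate it needs to finish on time while spare lasts.
    for flow_id, remaining, time_left in sorted(deadline_flows, key=lambda f: f[2]):
        needed = remaining / time_left
        granted = min(needed, spare)
        rates[flow_id] = granted
        spare -= granted
    return rates


if __name__ == "__main__":
    print(allocate_rates(100, ["a", "b"], [("d1", 120, 2.0), ("d2", 90, 3.0)], 10))
```

In the usage example, the two non-deadline flows take 10 each, leaving 80 spare; the flow with the nearer deadline is fully served first, and the second deadline flow receives whatever remains.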
Funding: Supported by the National Natural Science Foundation of China (61002011); the Open Fund of the State Key Laboratory of Software Development Environment (SKLSDE-2009KF-2-08); the National Basic Research Program of China (2009CB320505); and the Hi-Tech Research and Development Program of China (2011AA01A102).
Abstract: Ethernet link aggregation, which provides an easy and cost-effective way to increase both bandwidth and link availability between a pair of devices, is well suited for data center networks. However, all the traffic splitting algorithms used in existing Ethernet link aggregation are flow-level, and they do not work well given the traffic characteristics of data centers. Though frame-level traffic splitting can achieve optimal load balance and the maximum benefit from aggregated capacity, it is generally deprecated because of frame disordering, which can disrupt the operation of many Internet protocols, most notably the transmission control protocol (TCP). To address this issue, we first investigate the causes of frame disordering in link aggregation and find that all of them either no longer hold or can be prevented in data centers. Then we present a byte-counter frame-level traffic splitting algorithm that achieves optimal performance while causing no frame disordering. The only requirement is that frames in a flow are the same size, which can be easily met in data centers. Simulation results show that the proposed frame-level traffic splitting method achieves higher throughput and optimal load balance. The average completion time of different-sized flows is reduced by 24% on average and by up to 46%.
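The abstract above does not spell out how the byte counters are used, so the following is only one plausible reading, sketched for illustration: keep a running byte count per member link and send each frame on the link that has carried the fewest bytes so far. The function name `split_frames` and the tie-breaking rule (lowest link index) are assumptions for this example.

```python
def split_frames(frame_sizes, num_links):
    """Assign frames to aggregated links by least bytes sent.

    Returns (assignment, bytes_sent): the chosen link index per frame and
    the final per-link byte counters. Hypothetical illustration only.
    """
    if num_links < 1:
        raise ValueError("at least one link is required")
    bytes_sent = [0] * num_links
    assignment = []
    for size in frame_sizes:
        # Pick the link with the smallest byte counter; min() returns the
        # lowest index when several links are tied.
        link = min(range(num_links), key=lambda i: bytes_sent[i])
        bytes_sent[link] += size
        assignment.append(link)
    return assignment, bytes_sent


if __name__ == "__main__":
    links, counters = split_frames([1500] * 6, num_links=3)
    print(links, counters)
```

When all frames have the same size, as the abstract requires, this rule reduces to a strict rotation over the links, so consecutive frames of a flow see equal transmission times on equally loaded links; that is one intuition for why equal-sized frames help avoid disordering.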
Funding: Supported in part by the Open Foundation of State Key Laboratory of Information Photonics and Optical Communications (Grant No. IPOC2014B009); the Fundamental Research Funds for the Central Universities (Grant Nos. N130817002, N140405005, N150401002); the Foundation of the Education Department of Liaoning Province (Grant No. L2014089); the National Natural Science Foundation of China (Grant Nos. 61302070, 61401082, 61471109, 61502075); the Liaoning Bai Qian Wan Talents Program; and the National High-Level Personnel Special Support Program for Youth Top-Notch Talent.
Abstract: Elastic interconnection now enables high-rate data transmission among data centers (DCs), giving rise to the elastic data center network (EDCN). In EDCNs, it is essential to achieve virtual network (VN) embedding, which includes two main components: virtual machine (VM) mapping and virtual link (VL) mapping. In VM mapping, we allocate appropriate servers to host VMs, while in VL mapping, an optimal substrate path is determined for each virtual lightpath. Power efficiency is a significant concern for VN embedding in EDCNs, and some solutions have been proposed that put lightly loaded servers to sleep. However, the increasing communication traffic between VMs leads to a serious energy dissipation problem, since switches also consume a great amount of energy even when the energy-efficient optical transmission technique is used. In this paper, considering load balancing and power-efficient VN embedding, we formulate the problem and design a novel heuristic for EDCNs, with the objective of saving power on both servers and switches. In our solution, VMs are mapped into a single DC or into multiple DCs that are a short distance from each other, and servers in the same cluster or adjacent clusters are preferred for hosting VMs. As a result, a large number of servers and switches become vacant and can go into sleep mode. Simulation results demonstrate that our method performs well in terms of power savings and load balancing. Compared with benchmarks, the improvement ratio of power efficiency is 5%–13%.