Funding: the National Natural Science Foundation of China (51565036).
Abstract: To enhance the efficiency of warehouse order management, this study investigates a dual-command operation mode in the Flying-V non-traditional warehouse layout. Three dual-command operation strategies are designed, and a dual-command operation path optimization model is established with the shortest path as the optimization goal. Furthermore, a genetic algorithm based on a dynamic decoding strategy is proposed. Simulation results demonstrate that coordinated storage-and-retrieval (dual-command) operation in the Flying-V layout warehouse reduces operation time by an average of 25%-35% compared with single-command operation paths, and by an average of 13%-23% compared with the 'deposit first, then pick' operation path. These findings provide evidence for the effectiveness of the optimization model and algorithm.
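The abstract does not spell out the genetic algorithm's encoding or its dynamic decoding strategy. As an illustration only, the following is a minimal sketch of a permutation-coded GA for sequencing warehouse visit locations to minimize total travel distance (order crossover, swap mutation, elitism). The distance matrix, depot convention, and all function names are assumptions for the sketch, not the paper's actual design.

```python
# Minimal GA sketch for shortest-path task sequencing (illustrative only).
import random

def path_length(order, dist):
    # Total travel distance for visiting locations in the given order,
    # starting and ending at the depot (index 0, an assumed convention).
    tour = [0] + list(order) + [0]
    return sum(dist[a][b] for a, b in zip(tour, tour[1:]))

def order_crossover(p1, p2):
    # OX: copy a random slice from parent 1, fill the rest in parent 2's order.
    n = len(p1)
    i, j = sorted(random.sample(range(n), 2))
    child = [None] * n
    child[i:j] = p1[i:j]
    fill = [g for g in p2 if g not in child]
    for k in range(n):
        if child[k] is None:
            child[k] = fill.pop(0)
    return child

def evolve(dist, locs, pop_size=30, gens=200, mut=0.2):
    # Population of random permutations; elitism keeps the two best.
    pop = [random.sample(locs, len(locs)) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda ind: path_length(ind, dist))
        nxt = pop[:2]
        while len(nxt) < pop_size:
            p1, p2 = random.sample(pop[:10], 2)  # tournament-style pool
            child = order_crossover(p1, p2)
            if random.random() < mut:
                a, b = random.sample(range(len(child)), 2)
                child[a], child[b] = child[b], child[a]  # swap mutation
            nxt.append(child)
        pop = nxt
    return min(pop, key=lambda ind: path_length(ind, dist))
```

On a toy 4-node instance this converges in a few dozen generations; the paper's dynamic decoding strategy and Flying-V distance model would replace the plain permutation decoding and matrix lookup used here.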
Funding: supported by the National Science and Technology Major Project (No. 2022ZD0116800), the NSFC International Young Scientists Fund (No. 62350410478), the Taishan Scholars Program (Nos. TSQNZ20230621 and TSQN202211214), the Shandong Excellent Young Scientists Fund (Overseas) (No. 2023HWYQ-113), and the Shandong Provincial Natural Science Foundation (No. ZR20221150015).
Abstract: Large Language Models (LLMs) are expanding their applications across various fields, including the Industrial Internet of Things (IIoT), where they analyze sensor data, automate diagnostics, and enhance predictive maintenance. LLM inference is provided by service providers to users, with each inference request undergoing two phases: prefill and decode. Due to the autoregressive nature of generation, only one token can be produced per iteration, so multiple iterations are needed to complete a request. Typically, batch processing groups multiple requests into a single batch for inference, improving throughput and hardware utilization. However, in serving systems, a fixed batch size presents challenges under fluctuating request volumes, particularly in IIoT environments, where data flow can vary significantly: during high-load periods, a fixed batch size may lead to underutilization of resources, while during low-load periods, it may result in resource wastage. In this paper, we introduce the FlexiDecode Scheduler (FDS) to address these challenges by dynamically adjusting the decoding batch size based on system load, improving resource utilization and reducing wait times during high-load periods. FDS prioritizes prefilling new requests to maximize decoding efficiency and employs a request output-length predictor to optimize request scheduling, minimizing End-to-End (E2E) latency. Compared to vLLM and Sarathi, our approach achieves 23% and 16% reductions in E2E latency, improves actual request execution time by 34% and 15%, respectively, and increases computational utilization by 10%.
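The FDS implementation is not shown in the abstract. The sketch below illustrates only the general idea it describes: a load-adaptive decode batch size with a prefill-first admission policy, where each step of the loop is one autoregressive decode iteration producing one token per active request. Every name here (FlexiBatchScheduler, predicted_len, and so on) is an illustrative assumption, and the paper's output-length predictor is stubbed as a per-request field.

```python
# Hedged sketch of load-adaptive decode batching (illustrative, not the
# paper's actual FDS implementation).
from collections import deque

class FlexiBatchScheduler:
    def __init__(self, min_batch=1, max_batch=8):
        self.min_batch = min_batch
        self.max_batch = max_batch
        self.waiting = deque()  # requests queued for prefill
        self.decoding = []      # requests currently in the decode phase

    def submit(self, request):
        self.waiting.append(request)

    def next_batch_size(self):
        # Scale the decode batch with current load, clamped to [min, max].
        load = len(self.waiting) + len(self.decoding)
        return max(self.min_batch, min(self.max_batch, load))

    def step(self):
        # Prefill-first admission: pull waiting requests into the decode
        # batch up to the load-adaptive target before generating tokens.
        target = self.next_batch_size()
        while self.waiting and len(self.decoding) < target:
            self.decoding.append(self.waiting.popleft())
        # One decode iteration: each active request emits one token; a
        # request finishes when it reaches its predicted output length.
        still, finished = [], []
        for req in self.decoding:
            req["generated"] += 1
            target_list = finished if req["generated"] >= req["predicted_len"] else still
            target_list.append(req)
        self.decoding = still
        return finished
```

In a real serving system the batch-size rule would also account for KV-cache memory and hardware limits, and finishing early would free slots for newly admitted requests within the same scheduling cycle.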
Funding: supported by the National Natural Science Foundation of China under Grant No. 61202250.
Abstract: A family of array codes with the maximum distance separable (MDS) property, named L codes, is proposed. The greatest strength of L codes is that the number of rows (columns) in a disk array is not restricted to a prime number, and more disks can be dynamically appended to a running storage system. L codes can tolerate at least two disk erasures and some sector losses simultaneously, and can tolerate multiple disk erasures (three or more) under certain conditions. Because only XOR operations are needed in encoding and decoding, L codes have very high computational efficiency, roughly equivalent to that of X codes. Analysis shows that L codes are particularly suitable for large-scale storage systems.
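The abstract does not give the L-code construction itself. To illustrate why XOR-only coding is computationally cheap, here is a generic single-parity XOR example (simple row parity in the style of RAID-4, not the actual L code, which tolerates at least two erasures): any one lost block equals the XOR of the surviving blocks.

```python
# Generic XOR parity illustration (not the L-code construction).
from functools import reduce

def xor_bytes(blocks):
    # Bitwise XOR of equal-length byte blocks, column by column.
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def encode(data_blocks):
    # Append one parity block: the XOR of all data blocks.
    return data_blocks + [xor_bytes(data_blocks)]

def recover(stripe, lost_index):
    # Because every stripe XORs to zero, any single lost block is the
    # XOR of the survivors -- no finite-field multiplication needed.
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_bytes(survivors)
```

This XOR-only structure is what makes encoding and decoding fast; multi-erasure codes like the L codes described here arrange several such parity relations (e.g. along different diagonals) so that two or more lost disks can be recovered by chaining XOR equations.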