Naturally degradable capsules provide a platform for sustained fragrance release. However, practical challenges such as low encapsulation efficiency and difficulty in achieving sustained release still limit the use of fragrance-loaded capsules. In this work, the natural materials sodium alginate and gelatine are dissolved to serve as the aqueous phase, lavender fragrance is dissolved in caprylic/capric triglyceride (GTCC) as the oil phase, and SiO₂ nanoparticles with neutral wettability act as the solid emulsifier, forming O/W Pickering emulsions. Finally, multi-core capsules are prepared by the drop-injection method with the emulsions as templates. The results show that the capsules are successfully prepared with a spherical morphology and multi-core structure, and the encapsulation rate of the multi-core capsules reaches up to 99.6%. In addition, the multi-core capsules possess desirable sustained-release performance: the cumulative release of fragrance at 25 °C over 49 days is only 32.5%. This is attributed to the combined protection of the multi-core structure, the Pickering-emulsion nanoparticle membranes, and the hydrogel-network shell for the encapsulated fragrance. This study delivers a new strategy for applying fragrance sustained-release technology in food, cosmetics, textiles, and other fields.
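The two headline figures are ratio quantities. A standard way to define them is sketched below; the definitions are our assumption, since the abstract does not give the paper's exact formulas, with m denoting mass of fragrance:

```latex
% Assumed standard definitions (not quoted from the paper); m = mass of fragrance.
\mathrm{EE} = \frac{m_{\mathrm{encapsulated}}}{m_{\mathrm{total}}} \times 100\%
\approx 99.6\%,
\qquad
R(t) = \frac{m_{\mathrm{released}}(t)}{m_{\mathrm{encapsulated}}} \times 100\%,
\quad
R(49\,\mathrm{d},\ 25\,^{\circ}\mathrm{C}) \approx 32.5\%.
```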
Underground engineering projects such as deep tunnel excavation often encounter rockburst disasters accompanied by numerous microseismic events. Rapid interpretation of microseismic signals is crucial for the timely identification of rockbursts. However, conventional processing encompasses multi-step workflows, including classification, denoising, picking, locating, and computational analysis, coupled with manual intervention, which collectively compromise the reliability of early warnings. To address these challenges, this study proposes the "microseismic stethoscope", a multi-task machine learning and deep learning model designed for the automated processing of massive microseismic signals. The model efficiently extracts three key parameters necessary for recognizing rockburst disasters: rupture location, microseismic energy, and moment magnitude. Specifically, the model extracts features from raw waveforms and feeds them into three dedicated sub-networks: a classifier for source-zone classification, and two regressors for microseismic energy and moment magnitude estimation. The model demonstrates superior efficiency compared with traditional and semi-automated processing, reducing per-event processing time from 0.71 s (traditional) and 0.49 s (semi-automated) to merely 0.036 s. It concurrently achieves 98% accuracy in source-zone classification, with microseismic energy and moment magnitude estimation errors of 0.13 and 0.05, respectively. The model has been applied and validated in the Daxiagu Tunnel case in Sichuan, China. The application results indicate that the model is as accurate as traditional methods in determining source parameters, and can thus be used to identify potential geomechanical processes of rockburst disasters. By enhancing the reliability of microseismic signal processing, the proposed model presents a significant advancement in the identification of rockburst disasters.
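The shared-encoder, three-head layout described in the abstract can be sketched as follows. This is a minimal illustration only: the layer sizes, channel counts, and zone count are our assumptions, not the paper's actual architecture.

```python
# Minimal sketch of a shared-encoder multi-task network in the spirit of the
# "microseismic stethoscope": one waveform encoder feeding a source-zone
# classifier plus two regressors (energy, moment magnitude).
import torch
import torch.nn as nn

class MicroseismicMultiTask(nn.Module):
    def __init__(self, n_zones: int = 4):            # n_zones is an assumption
        super().__init__()
        self.encoder = nn.Sequential(                 # shared waveform feature extractor
            nn.Conv1d(3, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.zone_head = nn.Linear(64, n_zones)       # source-zone classification
        self.energy_head = nn.Linear(64, 1)           # microseismic energy regression
        self.magnitude_head = nn.Linear(64, 1)        # moment magnitude regression

    def forward(self, wave):                          # wave: (batch, 3 components, samples)
        h = self.encoder(wave)
        return self.zone_head(h), self.energy_head(h), self.magnitude_head(h)

model = MicroseismicMultiTask()
wave = torch.randn(8, 3, 4096)                        # 8 events, 3-component records
zone_logits, energy, magnitude = model(wave)
```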
Knowledge distillation has become a standard technique for compressing large language models into efficient student models, but existing methods often struggle to balance prediction accuracy with explanation quality. Recent approaches such as Distilling Step-by-Step (DSbS) introduce explanation supervision, yet they apply it in a uniform manner that may not fully exploit the different learning dynamics of prediction and explanation. In this work, we propose a task-structured curriculum learning (TSCL) framework that structures training into three sequential phases: (i) prediction-only, to establish stable feature representations; (ii) joint prediction-explanation, to align task outputs with rationale generation; and (iii) explanation-only, to refine the quality of rationales. This design provides a simple but effective modification to DSbS, requiring no architectural changes and adding negligible training cost. We justify the phase scheduling with ablation studies and convergence analysis, showing that an initial prediction-heavy stage followed by a balanced joint phase improves both stability and explanation alignment. Extensive experiments on five datasets (e-SNLI, ANLI, CommonsenseQA, SVAMP, and MedNLI) demonstrate that TSCL consistently outperforms strong baselines, achieving gains of +1.7-2.6 points in accuracy and 0.8-1.2 in ROUGE-L, corresponding to relative error reductions of up to 21%. Beyond lexical metrics, human evaluation and ERASER-style faithfulness diagnostics confirm that TSCL produces more faithful and informative explanations. Comparative training curves further reveal faster convergence and lower variance across seeds. Efficiency analysis shows less than 3% overhead in wall-clock training time and no additional inference cost, making the approach practical for real-world deployment. This study demonstrates that a simple task-structured curriculum can significantly improve the effectiveness of knowledge distillation. By separating and sequencing objectives, TSCL achieves a better balance between accuracy, stability, and explanation quality. The framework generalizes across domains, including medical NLI, and offers a principled recipe for future applications in multimodal reasoning and reinforcement learning.
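The three-phase schedule amounts to switching loss weights over training. A minimal sketch, assuming a DSbS-style objective L = w_pred·L_pred + w_expl·L_expl, with phase boundaries (30%/80%) chosen purely for illustration (the paper justifies its own split via ablations):

```python
# Illustrative TSCL phase scheduler: returns (w_pred, w_expl) for the current step.
def tscl_weights(step: int, total_steps: int) -> tuple[float, float]:
    progress = step / total_steps
    if progress < 0.3:          # phase (i): prediction-only
        return 1.0, 0.0
    elif progress < 0.8:        # phase (ii): joint prediction-explanation
        return 0.5, 0.5
    else:                       # phase (iii): explanation-only refinement
        return 0.0, 1.0

# usage inside a training loop:
# w_pred, w_expl = tscl_weights(step, total_steps)
# loss = w_pred * prediction_loss + w_expl * explanation_loss
```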
A novel kind of multi-core magnetic composite particles, the surfaces of which were respectively modified with goat-anti-mouse IgG and anti-transferrin receptor (anti-CD71), was prepared. The fetal nucleated red blood cells (FNRBCs) in the peripheral blood of a gravida were rapidly and effectively enriched and separated by the modified multi-core magnetic composite particles in an external magnetic field. The obtained FNRBCs were used for the identification of fetal sex by means of the fluorescence in situ hybridization (FISH) technique. The results demonstrate that the multi-core magnetic composite particles meet the requirements for the enrichment and separation of FNRBCs at low concentration, and the detection accuracy for the diagnosis of fetal sex reached 95%. Moreover, the obtained FNRBCs were applied to the non-invasive diagnosis of Down syndrome, and chromosome 3p21 was detected. These facts indicate that the novel method based on multi-core magnetic composite particles is simple, reliable and cost-effective, and opens up vast vistas for potential application in clinical non-invasive prenatal diagnosis.
A variation-aware task mapping approach is proposed for multi-core networks-on-chip with redundant cores, which includes both design-time mapping and run-time scheduling algorithms. Firstly, a design-time genetic task mapping algorithm is proposed to generate, at the design stage, multiple task mapping solutions that cover a maximum range of chips. Then, at run time, one optimal task mapping solution is selected and logical cores are mapped to physically available cores. Both core asymmetry and topological changes are considered in the proposed approach. Experimental results show that the performance yield of the proposed approach is 96% on average, and that communication cost, power consumption and peak temperature are all optimized without loss of performance yield.
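A toy version of the design-time genetic stage is sketched below. The fitness function here is a stand-in (the slowest mapped core dominates), since the paper's yield model accounts for process variation, core asymmetry, and topology; all names and parameters are ours.

```python
# Toy genetic task-mapping loop: each chromosome injectively assigns logical
# cores to physical core slots; selection favors mappings with higher fitness.
import random

def random_mapping(n_logical, n_physical):
    return random.sample(range(n_physical), n_logical)   # injective mapping

def fitness(mapping, core_speed):                        # assumed yield proxy
    return min(core_speed[c] for c in mapping)           # slowest core dominates

def evolve(core_speed, n_logical, pop=30, gens=50):
    n_physical = len(core_speed)
    population = [random_mapping(n_logical, n_physical) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=lambda m: fitness(m, core_speed), reverse=True)
        survivors = population[: pop // 2]
        children = []
        for parent in survivors:
            child = parent[:]
            i = random.randrange(n_logical)               # point mutation:
            unused = [c for c in range(n_physical) if c not in child]
            child[i] = random.choice(unused)              # remap one logical core
            children.append(child)
        population = survivors + children
    return max(population, key=lambda m: fitness(m, core_speed))

speeds = [random.uniform(0.7, 1.0) for _ in range(16)]   # per-core speed after variation
best = evolve(speeds, n_logical=8)
```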
Developing parallel applications on heterogeneous processors faces the challenge of the 'memory wall', due to the limited capacity of local storage and the limited bandwidth and long latency of memory access. Aiming at this problem, a parallelization approach is proposed with six memory optimization schemes for CG, four of which target all kinds of sparse matrix-vector multiplication (SpMV) operations. Conducted on an IBM QS20, the parallelization approach reaches speedups of up to 21 and 133 times with sizes A and B, respectively, compared with a single Power Processor Element. Finally, it is concluded that the peak memory-access bandwidth of the Cell BE can be obtained in SpMV, that simple computation is more efficient on heterogeneous processors, and that loop unrolling can hide local-storage access latency while executing scalar operations on SIMD cores.
Decreasing the mode coupling coefficient (κ) is an effective approach to suppressing inter-core crosstalk. Therefore, we deploy a low-index rod or a rectangular trench in the middle of two neighboring cores to reduce κ so that the overlap of the electric field distributions is suppressed. We also propose an approximate analytical solution (AAS) for κ of two crosstalk-suppression models: two cores with one low-index rod deployed in the middle, and two cores with one low-index rectangular trench deployed in the middle. We then apply some modification to the results obtained by the AAS, and the modified results are shown to agree well with those obtained by the finite element method (FEM). Therefore, the modified AAS can be used to obtain the inter-core crosstalk for the above two models quickly.
In this paper, the factors that influence few-mode multi-core optical fiber channels are analyzed in a comprehensive way. Theoretical modeling and computer simulation of the channel are carried out, and a modeling scheme for the few-mode multi-core optical fiber channel based on a non-uniform mode field distribution is put forward. The proposed modeling scheme not only exponentially increases the system capacity through the few-mode multi-core optical fiber channel, but also shows better transmission performance than a uniform channel of the same type, as revealed by the simulation results.
The development of multi-core systems (MCS) has considerably improved the existing technologies in the field of computer architecture. An MCS comprises several processors that are heterogeneous in resource capacity, working environment, topology, and so on. The existing multi-core technology unlocks additional research opportunities for energy minimization through effective task scheduling. At the same time, the task scheduling process is yet to be fully explored in multi-core systems. This paper presents a new hybrid genetic algorithm (GA) with krill herd (KH) based energy-efficient scheduling technique for multi-core systems (GAKH-SMCS). The goal of the GAKH-SMCS technique is to derive task schedules that achieve faster completion time and minimum energy dissipation. The GAKH-SMCS model involves a multi-objective fitness function using four parameters, namely makespan, processor utilization, speedup, and energy consumption, to schedule tasks proficiently. The performance of the GAKH-SMCS model has been validated against two datasets, a random dataset and a benchmark dataset. The experimental outcome confirmed the effectiveness of the GAKH-SMCS model in terms of makespan, processor utilization, speedup, and energy consumption. The overall simulation results show that the presented GAKH-SMCS model achieves energy efficiency through an optimal task scheduling process in MCS.
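A weighted combination is the simplest way to express such a four-parameter fitness. The sketch below is an assumed form (the normalization and equal weights are ours, not the paper's definition):

```python
# Sketch of a weighted multi-objective fitness in the spirit of GAKH-SMCS.
# Lower makespan/energy is better, higher utilization/speedup is better,
# so the minimization targets enter as reciprocals.
def fitness(makespan, utilization, speedup, energy,
            w=(0.25, 0.25, 0.25, 0.25)):
    return (w[0] * (1.0 / makespan) + w[1] * utilization
            + w[2] * speedup + w[3] * (1.0 / energy))
```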
This paper focuses on how to optimize the cache performance of sparse matrix-matrix multiplication (SpGEMM). It classifies the cache misses into two categories: one caused by the irregular distribution pattern of the multiplier matrix, and the other caused by the multiplicand. For each of them, the paper puts forward an optimization method. The first, hash-based method removes cache misses of the first category effectively, and improves performance by a factor of 6 on an Intel 8-core CPU in the best cases. For cache misses of the second category, it proposes a new cache replacement algorithm, which achieves a cache hit rate much higher than other algorithms based on historical knowledge, and the algorithm is applicable on Cell and GPU. To further verify the effectiveness of our methods, we implement our algorithm on GPU, and the performance scales perfectly with the size of on-chip storage.
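The idea behind a hash-based SpGEMM method is to keep the partial output row in a small hash table so the irregular accesses stay cache-resident. A pure-Python Gustavson-style sketch of that logic (function and variable names are ours; this shows the algorithmic pattern, not the paper's implementation):

```python
# Gustavson-style SpGEMM for CSR inputs with a per-row hash accumulator.
def spgemm_hash(A_indptr, A_indices, A_data, B_indptr, B_indices, B_data):
    C_rows = []
    for i in range(len(A_indptr) - 1):
        acc = {}                                      # hash accumulator for row i of C
        for jj in range(A_indptr[i], A_indptr[i + 1]):
            j, a_ij = A_indices[jj], A_data[jj]
            for kk in range(B_indptr[j], B_indptr[j + 1]):
                k = B_indices[kk]
                acc[k] = acc.get(k, 0.0) + a_ij * B_data[kk]
        C_rows.append(sorted(acc.items()))            # (column, value) pairs of row i
    return C_rows
```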
Deep neural networks (DNNs) are effective in solving both forward and inverse problems for nonlinear partial differential equations (PDEs). However, conventional DNNs are not effective in handling problems such as delay differential equations (DDEs) and delay integro-differential equations (DIDEs) with constant delays, primarily due to the low regularity of the solution at delay-induced breaking points. In this paper, a DNN method that combines multi-task learning (MTL) is proposed to solve both the forward and inverse problems of DIDEs. The core idea of this approach is to divide the original equation into multiple tasks based on the delay, using auxiliary outputs to represent the integral terms, followed by the use of MTL to seamlessly incorporate the properties at the breaking points into the loss function. Furthermore, given the increased training difficulty associated with multiple tasks and outputs, we employ a sequential training scheme to reduce training complexity and provide reference solutions for subsequent tasks. This approach significantly enhances the approximation accuracy of solving DIDEs with DNNs, as demonstrated by comparisons with traditional DNN methods. We validate the effectiveness of this method through several numerical experiments, test various parameter-sharing structures in MTL and compare their testing results. Finally, this method is applied to solve the inverse problem of a nonlinear DIDE, and the results show that the unknown parameters of the DIDE can be discovered from sparse or noisy data.
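A constant delay τ introduces breaking points at t = kτ, and the delay-based task split cuts the time axis at those points. A conceptual sketch (interval handling is our assumption; the residual losses and auxiliary integral outputs are omitted):

```python
# One task per delay interval; tasks are trained sequentially so earlier
# solutions serve as reference data for later ones.
def delay_tasks(t_end: float, tau: float):
    k, tasks = 0, []
    while k * tau < t_end:
        tasks.append((k * tau, min((k + 1) * tau, t_end)))  # interval = one task
        k += 1
    return tasks

# e.g. delay_tasks(2.5, 1.0) -> [(0.0, 1.0), (1.0, 2.0), (2.0, 2.5)]
# for interval in delay_tasks(T, tau):
#     train_network_on(interval, reference=previous_task_solution)
```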
Modern shared-memory multi-core processors typically have shared Level 2 (L2) or Level 3 (L3) caches. Cache bottlenecks and replacement strategies are the main problems of such architectures, where multiple cores try to access the shared cache simultaneously. The main problems in improving memory performance are the shared cache architecture and cache replacement. This paper documents the implementation of a Dual-Port Content Addressable Memory (DPCAM) and a modified Near-Far Access Replacement Algorithm (NFRA), which were previously proposed as a shared L2 cache layer in a multi-core processor. Standard Performance Evaluation Corporation (SPEC) Central Processing Unit (CPU) 2006 benchmark workloads are used to evaluate the benefit of the shared L2 cache layer. Results show improved performance of the DPCAM and NFRA in the multi-core processor, corresponding to a higher number of concurrent accesses to the shared memory. The new architecture significantly increases system throughput and records performance improvements of up to 8.7% on various SPEC CPU2006 benchmarks. The miss rate is also improved by about 13%, with some exceptions in the sphinx3 and bzip2 benchmarks. These results could open a new window for solving the long-standing problems with shared caches in multi-core processors.
In this paper, a study of the expected performance behaviour of the present 3-level cache system for multi-core systems is presented. For this, a queuing model of the present 3-level cache system for multi-core processors is developed, and its possible performance is analyzed as the number of cores increases. Various important performance parameters, such as the access time and utilization of the individual caches at each level and the overall average access time of the cache system, are determined. Results for up to 1024 cores are reported in this paper.
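The overall average access time of a 3-level hierarchy follows directly from the per-level latencies and hit rates. A back-of-envelope sketch (the latencies and hit rates below are illustrative assumptions, not the paper's modeled values):

```python
# Average access time of a 3-level cache: each miss falls through to the
# next level; m_i = 1 - h_i is the miss rate at level i.
def avg_access_time(t1, t2, t3, t_mem, h1, h2, h3):
    m1, m2, m3 = 1 - h1, 1 - h2, 1 - h3
    return t1 + m1 * (t2 + m2 * (t3 + m3 * t_mem))

print(avg_access_time(t1=4, t2=12, t3=40, t_mem=200, h1=0.9, h2=0.8, h3=0.7))
# = 7.2 cycles with these assumed numbers
```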
In order to improve the concurrent access performance of a web-based spatial computing system in a cluster, a parallel scheduling strategy based on the multi-core environment is proposed, which includes two levels of parallel processing mechanisms: one evenly allocates tasks to each server node in the cluster, and the other implements load balancing inside a server node. Based on this strategy, a new web-based spatial computing model is designed, in which a task response ratio calculation method, a request queue buffer mechanism and a thread scheduling strategy are the focus. Experimental results show that the new model can fully exploit the multi-core computing advantage of each server node in a concurrent access environment and improve the average hits per second, average I/O hits, CPU utilization and throughput. A speed-up ratio analysis of the traditional model and the new one shows that the new model has the best performance. The performance of the multi-core server nodes in the cluster is optimized, and resource utilization and parallel processing capability are enhanced; the more CPU cores available, the higher the parallel processing capability obtained.
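A response-ratio dispatch rule of the kind the "task response ratio calculation method" suggests can be sketched as follows. The abstract does not give the paper's formula, so this assumes the classic highest-response-ratio-next variant:

```python
# Requests that have waited long relative to their expected service time
# are scheduled first (highest response ratio next).
def response_ratio(wait_time: float, service_time: float) -> float:
    return (wait_time + service_time) / service_time

def pick_next(queue, now):
    # queue: list of (arrival_time, expected_service_time, request)
    return max(queue, key=lambda q: response_ratio(now - q[0], q[1]))
```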
The first important problem in the star-forming process is the formation of protostellar cores in star-forming regions of molecular clouds. The multi-core structure in star-forming regions is related to the formation of protostellar cores. The molecular emission of C¹⁸O (J = 1-0) in Cepheus C has been observed. The C¹⁸O (J = 1-0) observations form the basis for a study of the cloud cores and the star formation activity in the cores of Cepheus C. To study the multi-core structure of C¹⁸O (J = 1-0) in Cepheus C, channel maps and position-velocity diagrams of C¹⁸O (J = 1-0) are presented. The maps show that the contour level and distribution size of the three cores in Cepheus C depend strongly on the channel velocity. The C¹⁸O (J = 1-0) emission in core b, which is distributed over all channel velocities, differs markedly from that in cores a and c. The C¹⁸O (J = 1-0) emission in cores a and c is mostly distributed at channel velocities blueshifted relative to the peak velocity, and appears among the redshifted channel velocities only in -10.0 to -9.5 km/s, where both the contour level and the distribution size in the channel map are small. According to the position-velocity diagrams, the asymmetry of the distribution between the blueshifted and redshifted components reflects the asymmetry of the line profile. The diagrams also show that the contour level and the distribution size differ among the three cores. The results from the maps and the diagrams are consistent with each other.
Real-time system timing analysis is crucial for estimating the worst-case execution time (WCET) of a program. To achieve this, static or dynamic analysis methods are used, along with targeted modeling of the actual hardware system. This literature review focuses on calculating the WCET for multi-core processors, providing a survey of traditional methods used for static and dynamic analysis and highlighting the major challenges that arise from different program execution scenarios on multi-core platforms. This paper outlines the strengths and weaknesses of current methodologies and offers insights into prospective areas of research on multi-core analysis. By presenting a comprehensive analysis of the current state of research on multi-core processor analysis for WCET estimation, this review aims to serve as a valuable resource for researchers and practitioners in the field.
State-of-the-art multi-core computer systems are based on very-large-scale three-dimensional (3D) integrated circuits (VLSI). In order to provide high-speed vertical data transmission in such 3D systems, efficient Through-Silicon Via (TSV) technology is critically important. In this paper, various Radio Frequency (RF) TSV designs and models are proposed. Specifically, the Cu-plug TSV with surrounding ground TSVs is used as the baseline structure. For further improvement, dielectric coaxial and novel air-gap coaxial TSVs are introduced. Using the empirical parameters of these coaxial TSVs, simulation results are obtained demonstrating that these coaxial RF-TSVs provide cut-off frequencies two orders of magnitude higher than those of Cu-plug TSVs. Based on these new RF-TSV technologies, we propose a novel 3D multi-core computer system as well as new architectures for managing the interfaces between the RF and baseband circuits. Taking into consideration the scaling down of IC manufacturing technologies, predictions for the performance of future generations of circuits are made. With simulation results indicating that energy per bit and area per bit are reduced by 7% and 11%, respectively, we conclude that the proposed method is a worthwhile guideline for the design of future multi-core computer ICs.
Cooperative multi-agent reinforcement learning (MARL) is a key technology for enabling cooperation in complex multi-agent systems. It has achieved remarkable progress in areas such as gaming, autonomous driving, and multi-robot control. Empowering cooperative MARL with multi-task decision-making capabilities is expected to further broaden its application scope. In multi-task scenarios, cooperative MARL algorithms need to address three types of multi-task problems: reward-related multi-task problems, arising from different reward functions; multi-domain multi-task problems, caused by differences in state and action spaces and in state transition functions; and scalability-related multi-task problems, resulting from dynamic variation in the number of agents. Most existing studies focus on scalability-related multi-task problems. However, with the increasing integration between large language models (LLMs) and multi-agent systems, a growing number of LLM-based multi-agent systems have emerged, enabling more complex multi-task cooperation. This paper provides a comprehensive review of the latest advances in this field. By combining multi-task reinforcement learning with cooperative MARL, we categorize and analyze the three major types of multi-task problems under multi-agent settings, offering fine-grained classifications and summarizing key insights for each. In addition, we summarize commonly used benchmarks and discuss future directions of research in this area, which hold promise for further enhancing the multi-task cooperation capabilities of multi-agent systems and expanding their practical applications in the real world.
The accurate prediction of drug absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties represents a crucial step in early drug development for reducing failure risk. Current deep learning approaches face challenges with data sparsity and information loss due to single-molecule representation limitations and isolated predictive tasks. This research proposes molecular properties prediction with parallel-view and collaborative learning (MolP-PC), a multi-view fusion and multi-task deep learning framework that integrates 1D molecular fingerprints (MFs), 2D molecular graphs, and 3D geometric representations, incorporating an attention-gated fusion mechanism and a multi-task adaptive learning strategy for precise ADMET property predictions. Experimental results demonstrate that MolP-PC achieves optimal performance in 27 of 54 tasks, with its multi-task learning (MTL) mechanism significantly enhancing predictive performance on small-scale datasets and surpassing single-task models in 41 of 54 tasks. Additional ablation studies and interpretability analyses confirm the significance of multi-view fusion in capturing multi-dimensional molecular information and enhancing model generalization. A case study examining the anticancer compound Oroxylin A demonstrates MolP-PC's effective generalization in predicting key pharmacokinetic parameters such as half-life (T0.5) and clearance (CL), indicating its practical utility in drug modeling. However, the model exhibits a tendency to underestimate volume of distribution (VD), indicating potential for improvement in analyzing compounds with high tissue distribution. This study presents an efficient and interpretable approach for ADMET property prediction, establishing a novel framework for molecular optimization and risk assessment in drug development.
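The attention-gated fusion of the three views can be sketched minimally as a learned weighted sum of per-view embeddings. The embedding dimension and the gating form (softmax over per-view scores) are our assumptions, not the paper's exact mechanism:

```python
# Minimal attention-gated fusion of 1D fingerprint, 2D graph, and 3D geometry
# embeddings into one molecular representation.
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, dim: int = 256):
        super().__init__()
        self.score = nn.Linear(dim, 1)                      # one attention score per view

    def forward(self, views):                               # views: (batch, n_views, dim)
        weights = torch.softmax(self.score(views), dim=1)   # (batch, n_views, 1)
        return (weights * views).sum(dim=1)                 # weighted sum -> (batch, dim)

fusion = GatedFusion()
h_fp, h_graph, h_geom = (torch.randn(16, 256) for _ in range(3))
fused = fusion(torch.stack([h_fp, h_graph, h_geom], dim=1))  # (16, 256)
```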
Tropical cyclones (TCs) are one of the most serious types of natural disasters, and accurate TC activity predictions are key to disaster prevention and mitigation. Recently, TC track predictions have made significant progress, but the ability to predict TC intensity lags clearly behind. At present, research on TC intensity prediction takes atmospheric reanalysis data as the research object and mines the relationship between TC-related environmental factors and intensity through deep learning. However, reanalysis data are non-real-time in nature, which does not meet the requirements of operational forecasting applications. Therefore, a TC intensity prediction model named TC-Rolling is proposed, which simultaneously extracts the degree of symmetry of strong TC convective clouds and the convection intensity, and fuses the deviation-angle variance with satellite images to construct the correlation between TC convection structure and intensity. For TCs' complex dynamic processes, a convolutional neural network (CNN) is used to learn their temporal and spatial features. For real-time intensity estimation, multi-task learning acts as an implicit time-series enhancement. The model is designed with a rolling strategy that aims to moderate the long-term dependency decay problem and improve accuracy for short-term intensity predictions. Since the multiple tasks are correlated, the loss functions for the 12 h and 24 h horizons are corrected. After testing on a sample of TCs in the Northwest Pacific, with root-mean-square errors (RMSE) of 4.48 kt for 6 h intensity prediction, 5.78 kt for 12 h, and 13.94 kt for 24 h, TC records from official agencies are used to assess the validity of TC-Rolling.
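The rolling idea can be illustrated as chained horizon predictions with corrected longer-horizon losses. The sketch below is a loose interpretation: the model interface (horizon/prior keyword arguments), the feedback form, and the loss weights are all hypothetical, and TC-Rolling's actual losses differ in detail.

```python
# Sketch: predict intensity at +6 h, roll it forward into the +12 h prediction,
# then +24 h, and down-weight ("correct") the longer-horizon losses.
import torch.nn.functional as F

def rolling_loss(model, features, y6, y12, y24, w=(1.0, 0.7, 0.5)):
    p6 = model(features, horizon=6)
    p12 = model(features, horizon=12, prior=p6.detach())    # roll forward
    p24 = model(features, horizon=24, prior=p12.detach())
    return (w[0] * F.mse_loss(p6, y6)
            + w[1] * F.mse_loss(p12, y12)
            + w[2] * F.mse_loss(p24, y24))
```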