Funding: Supported by the Information, Production and Systems Research Center, Waseda University, and partly supported by the Future Robotics Organization, Waseda University; the Humanoid Robotics Institute, Waseda University, under the Humanoid Project; and the Waseda University Grant for Special Research Projects (grant numbers 2024C-518 and 2025E-027). The work was partly executed under the cooperation of organization between Kioxia Corporation and Waseda University.
Abstract: Planning in lexical-prior-free environments presents a fundamental challenge for evaluating whether large language models (LLMs) possess genuine structural reasoning capabilities beyond lexical memorization. When predicates and action names are replaced with semantically irrelevant random symbols while preserving logical structures, existing direct generation approaches exhibit severe performance degradation. This paper proposes a symbol-agnostic closed-loop planning pipeline that enables models to construct executable plans through systematic validation and iterative refinement. The system implements a complete generate-verify-repair cycle through six core processing components: semantic comprehension extracts structural constraints, a language planner generates text plans, a symbol translator performs structure-preserving mapping, a consistency checker conducts static screening, a Stanford Research Institute Problem Solver (STRIPS) simulator executes step-by-step validation, and VAL (Validator) provides semantic verification. A repair controller orchestrates four targeted strategies addressing typical failure patterns, including first-step precondition errors and mid-segment state maintenance issues. Comprehensive evaluation on PlanBench Mystery Blocksworld demonstrates substantial improvements over baseline approaches across both language models and reasoning models. Ablation studies confirm that each architectural component contributes non-redundantly to overall effectiveness, with targeted repair providing the largest impact, followed by deep constraint extraction and stepwise validation, demonstrating that superior performance emerges from synergistic integration of these mechanisms rather than any single dominant factor. Analysis reveals distinct failure patterns between model types: language models struggle with local precondition satisfaction, while reasoning models face global goal achievement challenges. The validation-driven mechanism nevertheless addresses both weaknesses. A particularly noteworthy finding is the convergence of final success rates across models with varying intrinsic capabilities, suggesting that systematic validation and repair mechanisms play a more decisive role than raw model capacity in lexical-prior-free scenarios. This work establishes a rigorous evaluation framework incorporating statistical significance testing and mechanistic failure analysis, providing methodological contributions for fair assessment and practical insights into building reliable planning systems under extreme constraint conditions.
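The step-by-step validation stage described in the abstract can be illustrated with a minimal sketch of a STRIPS-style simulator. This is not the authors' implementation: the `Action` class, the `simulate` function, and the obfuscated symbols (`xk3`, `wq9`, `f1` ... `f5`) are all hypothetical names chosen for illustration. The sketch assumes ground actions with set-valued preconditions, add effects, and delete effects, and it reports the index of the first step whose preconditions fail, which is the kind of signal a repair controller could act on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A ground STRIPS action over opaque (meaningless) symbols."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    delete_effects: frozenset


def simulate(initial_state, plan, goal):
    """Execute the plan step by step.

    Returns (ok, failing_index, missing_facts). failing_index is the step
    whose preconditions are unmet, or len(plan) if only the goal is unmet.
    """
    state = set(initial_state)
    for index, action in enumerate(plan):
        missing = action.preconditions - state
        if missing:
            return False, index, missing
        # STRIPS update: remove delete effects, then add add effects.
        state = (state - action.delete_effects) | action.add_effects
    unmet = set(goal) - state
    if unmet:
        return False, len(plan), unmet
    return True, None, set()


# Hypothetical obfuscated domain: symbol names carry no lexical meaning.
pick = Action("xk3 a",
              frozenset({"f1 a", "f2"}),
              frozenset({"f3 a"}),
              frozenset({"f1 a", "f2"}))
drop = Action("wq9 a b",
              frozenset({"f3 a", "f4 b"}),
              frozenset({"f5 a b", "f2"}),
              frozenset({"f3 a", "f4 b"}))

initial = {"f1 a", "f2", "f4 b"}
goal = {"f5 a b"}

good = simulate(initial, [pick, drop], goal)  # valid plan
bad = simulate(initial, [drop, pick], goal)   # first-step precondition error
print(good)
print(bad)
```

Because the simulator only compares symbols for equality, it behaves identically whether predicates carry meaningful names or random ones, which is the property a symbol-agnostic pipeline relies on.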
Funding: Supported in part by NSF (IIS1637736, IIS-1651089, IIS-1724157), ONR (N00014-182243), FLI (RFP2-000), Intel, Raytheon, and Lockheed Martin.
Abstract: Robots need task planning algorithms to sequence actions toward accomplishing goals that are impossible through individual actions. Off-the-shelf task planners can be used by intelligent robotics practitioners to solve a variety of planning problems. However, many different planners exist, each with different strengths and weaknesses, and there are no general rules for which planner would be best to apply to a given problem. In this study, we empirically compare the performance of state-of-the-art planners that use either the planning domain description language (PDDL) or answer set programming (ASP) as the underlying action language. PDDL is designed for task planning, and PDDL-based planners are widely used for a variety of planning problems. ASP is designed for knowledge-intensive reasoning, but can also be used to solve task planning problems. Given domain encodings that are as similar as possible, we find that PDDL-based planners perform better on problems with longer solutions, and ASP-based planners are better on tasks with a large number of objects or tasks in which complex reasoning about action preconditions and effects is required. The resulting analysis can inform selection among general-purpose planning systems for particular robot task planning domains.
Abstract: Role-Based Access Control (RBAC) policies are at the core of cybersecurity, as they ease the enforcement of basic security principles, e.g., Least Privilege and Separation of Duties. As ICT systems and business processes evolve, RBAC policies have to be updated to prevent unauthorised access to resources by capturing errors and misalignments between the current policy and reality. However, this update process is a human-intensive activity and is expected to meet specific constraints. This paper proposes a semi-automatic RBAC maintenance process to fix and refine an RBAC state when "exceptions" and "violations" are detected. Exceptions are permissions that users find they lack, that are instrumental to their job, and that should be granted as soon as possible, while violations are permissions that have to be revoked since they are no longer required by their current owners. We propose a formalisation of the maintenance process which fixes single and multiple exceptions and violations by balancing two conflicting objectives, i.e., (i) optimising the current RBAC state, and (ii) reducing the transition cost. Our approach is based on a Max-SAT formalisation of the constraint-based optimisation problem, and on PDDL planning to define the transition strategy with minimum cost. Our implementation relies on incomplete Max-SAT solvers and satisficing PDDL planners, which provide approximations of optimal solutions. Experiments along with a comparative evaluation show good performance on real-world benchmarks.
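To make the notions of exceptions and violations concrete, the following is a minimal sketch of a checker that reports which of them remain unresolved in a candidate RBAC state. It is not the paper's Max-SAT or PDDL formulation: it only checks a given state and does not optimise anything. The function names, role names, users, and permissions are hypothetical, and the sketch assumes an RBAC state is represented as two mappings (user to roles, role to permissions).

```python
def effective_permissions(user_roles, role_permissions):
    """Map each user to the union of permissions granted by their roles."""
    return {
        user: set().union(*(role_permissions[role] for role in roles))
        for user, roles in user_roles.items()
    }


def unresolved(user_roles, role_permissions, exceptions, violations):
    """Return (open_exceptions, open_violations) for a candidate state.

    exceptions: (user, permission) pairs the user needs but may lack.
    violations: (user, permission) pairs that must no longer be held.
    """
    perms = effective_permissions(user_roles, role_permissions)
    open_exceptions = {
        (user, perm) for user, perm in exceptions
        if perm not in perms.get(user, set())
    }
    open_violations = {
        (user, perm) for user, perm in violations
        if perm in perms.get(user, set())
    }
    return open_exceptions, open_violations


# Hypothetical policy and reports, for illustration only.
role_permissions = {
    "clerk": {"read_invoice"},
    "auditor": {"read_invoice", "read_ledger"},
}
exceptions = {("alice", "read_ledger")}  # alice needs this permission
violations = {("bob", "read_ledger")}    # bob must lose this permission

current = {"alice": {"clerk"}, "bob": {"auditor"}}
candidate = {"alice": {"auditor"}, "bob": {"clerk"}}

before = unresolved(current, role_permissions, exceptions, violations)
after = unresolved(candidate, role_permissions, exceptions, violations)
print(before)
print(after)
```

A checker of this kind only says whether a candidate state is acceptable. Choosing among acceptable states while keeping the transition cheap is the optimisation problem the abstract assigns to Max-SAT and PDDL planning.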
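Abstract: Taking the Tracking and Data Relay Satellite (TDRS) as the research object and the Colored Petri Net (CPN) as the mathematical tool, this paper proposes a CPN-based TDRS operation planning model built on a top-down principle and a hierarchical modeling approach. The model comprises a top-level model, a control model, operation planning models for the data reception and transmission tasks of the forward link, and operation planning models for the data reception and transmission tasks of the return link, and it effectively describes the dynamic behavior of the TDRS. Simulation experiments yield a TDRS operation plan and verify the validity of the model. A comparison with a PDDL model shows that the proposed model can effectively incorporate TDRS domain knowledge and improve solving efficiency. The model can serve as a theoretical reference for formulating TDRS operation plans.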
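Abstract: When teleoperation tasks are carried out on extraterrestrial bodies, complex constraints give rise to practical difficulties such as choosing among multi-branch operations and configuring complex event attributes. This paper proposes a general-purpose intelligent task planning method, the Hierarchical Plan Object Model (HPOM). For rover operations on an extraterrestrial body, HPOM decomposes the task into four levels: multi-option operations, constrained behaviors, multi-branch command sequences, and parameterized virtual commands. Plans expressed as constrained behaviors are converted into behavior planning problems and solved, yielding a set of solution methods. A Human-In-The-Loop (HITL) iterative solving workflow generates command sequences, with the aim of verifying consistency across plans of different planning granularity. The method has been successfully applied to the Chang'E-4 (CE-4) mission, providing technical support for the mission's success.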