A general assembly sequence planning algorithm based on hierarchical reinforcement learning
Affiliation:

1.Nankai University;2.Shenzhen Institutes of Advanced Technology,Chinese Academy of Sciences

CLC number:

TP273

Fund Project:

National Natural Science Foundation of China (U1613210); Tianjin Science Fund for Distinguished Young Scholars (19JCJQJC62100); Tianjin Natural Science Foundation (19JCYBJC18500); Fundamental Research Funds for the Central Universities

    Abstract:

    Most existing algorithms for assembly sequence planning focus on a single target configuration. For multiple target configurations and large-scale problems, they often suffer from the curse of dimensionality and poor generalization. To address this, this paper exploits the hierarchical structure of the assembly sequence planning problem and proposes a general assembly sequence planning method based on hierarchical reinforcement learning that is suitable for multi-configuration assembly tasks. First, the assembly sequence planning problem is formulated as a hierarchical Markov decision process in which the upper layer performs sequence planning and the lower layer performs motion planning for the parts. This formulation matches the hierarchical structure of the assembly process and makes the planning method more flexible and more interpretable. Second, for this hierarchical Markov decision process, a general assembly sequence planning algorithm based on hierarchical reinforcement learning is proposed, which improves the adaptability and generalization of the planning method across tasks with multiple target configurations, as well as its utilization of target configuration information. Finally, the method is verified on a purpose-built simulation platform. The results show that the proposed method can extract general information about assembly problems and makes good decisions for different initial part positions and for a variety of other configuration assembly tasks, which verifies its effectiveness and generality. The result is a more general and flexible assembly sequence planning algorithm suitable for multiple target configurations.
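
As an illustration only, the two-level decomposition described in the abstract (an upper layer that chooses which part to assemble next, and a lower layer that moves the chosen part) might be sketched in Python as follows. The toy 2-D grid, the function names `upper_policy`, `lower_policy`, and `plan`, and the hand-written rules are all assumptions made for this sketch. The paper uses hierarchical reinforcement learning rather than fixed rules, and that training is not reproduced here.

```python
from typing import Dict, List, Set, Tuple

Pos = Tuple[int, int]  # (column, height) on a hypothetical 2-D grid


def upper_policy(placed: Set[str], targets: Dict[str, Pos]) -> str:
    """Upper layer: choose the next part to assemble.

    Placeholder rule (an assumption, not learned): pick the first unplaced
    part whose target cell is on the ground or directly above a placed part.
    """
    occupied = {targets[p] for p in placed}
    for part in sorted(targets):
        if part in placed:
            continue
        col, height = targets[part]
        if height == 0 or (col, height - 1) in occupied:
            return part
    raise ValueError("no feasible part: target configuration is unsupported")


def lower_policy(pos: Pos, goal: Pos) -> Pos:
    """Lower layer: one primitive move of the selected part toward its goal.

    Placeholder rule: adjust height first, then column. Collisions with
    other parts are not checked in this sketch.
    """
    (x, y), (gx, gy) = pos, goal
    if y != gy:
        return (x, y + (1 if gy > y else -1))
    if x != gx:
        return (x + (1 if gx > x else -1), y)
    return pos


def plan(starts: Dict[str, Pos], targets: Dict[str, Pos]):
    """Alternate between the two layers until every part is placed."""
    placed: Set[str] = set()
    sequence: List[str] = []
    paths: Dict[str, List[Pos]] = {}
    while len(placed) < len(targets):
        part = upper_policy(placed, targets)   # upper-level decision
        pos = starts[part]
        path = [pos]
        while pos != targets[part]:            # lower-level rollout
            pos = lower_policy(pos, targets[part])
            path.append(pos)
        placed.add(part)
        sequence.append(part)
        paths[part] = path
    return sequence, paths
```

Calling `plan({"a": (3, 0), "b": (4, 0)}, {"a": (0, 1), "b": (0, 0)})` places part `b` before part `a`, because `a` sits on top of `b` in the target configuration. Changing only the `targets` dictionary changes the assembly order without touching the lower layer, which is the kind of reuse across target configurations that the hierarchical formulation is meant to support.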

History
  • Received: 2020-09-15
  • Last revised: 2020-12-05
  • Accepted: 2020-12-25
  • Published online: 2021-02-04