Chinese chess positioning and playing strategy based on integrated depth vision and deep reinforcement learning
Affiliation:

1. Nanjing Normal University; 2. Nanjing Forestry University; 3. Nanjing University of Posts and Telecommunications; 4. Southeast University

CLC number:

TP273

Fund Project:

National Natural Science Foundation of China (General Program, Key Program, Major Program)

Abstract:

A human-machine Chinese chess (xiangqi) system depends on two capabilities: recognizing and locating the board and pieces, and choosing moves autonomously. For recognition and localization, this paper proposes a method that fuses monocular camera vision with depth camera vision. The method designs a piece-grid recognition network that uses the three-dimensional features of the physical pieces to convert the depth image into a board grid, then fuses the piece coordinates with the grid information, which effectively improves the accuracy of piece and board recognition and localization. For move selection, this paper proposes a decision method based on a deep neural network and Monte Carlo Tree Search (MCTS). Upper Confidence bounds applied to Trees (UCT) guides an improved MCTS that includes endgame-feature detection; an improved random move policy, optimized toward specific directions, guides the simulated playouts; and a policy-value network with multi-scale and residual structures is trained. Finally, training data are obtained through self-play, and the network parameters are validated and updated through matches between agents, yielding a system that both recognizes and plays Chinese chess. Experiments show that, compared with monocular visual recognition, the proposed method is more accurate and more stable, with a recognition rate of 97%; against a baseline pruning search algorithm, it wins 82% of games while reducing computation time by 41%.
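The abstract names UCT as the rule that guides tree search but does not give its formula. As a rough illustration only, the sketch below implements the textbook UCT score (mean reward plus an exploration bonus) and uses it to pick a child node. It is not the paper's improved variant with endgame-feature detection. The function names `uct_score` and `select_child`, the `(wins, visits)` tuple layout, and the exploration constant `c = sqrt(2)` are all assumptions made for this example.

```python
import math


def uct_score(child_wins, child_visits, parent_visits, c=math.sqrt(2)):
    """Textbook UCT value of one child: exploitation term plus exploration bonus.

    Unvisited children get +infinity so that each is tried at least once.
    """
    if child_visits == 0:
        return math.inf
    exploit = child_wins / child_visits
    explore = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploit + explore


def select_child(children, parent_visits, c=math.sqrt(2)):
    """Return the move whose child has the highest UCT score.

    `children` is a hypothetical dict mapping a move label to (wins, visits).
    """
    return max(
        children,
        key=lambda move: uct_score(*children[move], parent_visits, c),
    )


if __name__ == "__main__":
    stats = {"a": (9, 10), "b": (1, 2)}
    # With c = 0 the choice is purely greedy; with the default c the
    # rarely visited move "b" receives a larger exploration bonus.
    print(select_child(stats, parent_visits=12, c=0.0))
    print(select_child(stats, parent_visits=12))
```

A larger `c` favors rarely visited moves, while `c = 0` reduces the rule to picking the best observed win rate. In a system like the one described, the win-rate term would typically be informed by the value network rather than by raw playout counts, but that detail is not specified in the abstract.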

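The fusion step described in the abstract combines detected piece coordinates with a board grid. The abstract does not say how this is computed, so the following is only a simplified sketch of the idea: once the grid extent is known, a piece centre in pixels can be snapped to the nearest of the 9 x 10 board intersections. It assumes the board image has already been rectified so that grid lines are axis-aligned and evenly spaced; the function name `snap_to_grid` and all parameters are hypothetical.

```python
def snap_to_grid(px, py, left, top, right, bottom, files=9, ranks=10):
    """Map a piece centre (px, py), in pixels, to the nearest (file, rank) index.

    (left, top) and (right, bottom) are the pixel positions of the two
    opposite corner intersections of the board grid (assumed known, e.g.
    from a grid detector). A xiangqi board has 9 files and 10 ranks.
    """
    step_x = (right - left) / (files - 1)   # pixel spacing between files
    step_y = (bottom - top) / (ranks - 1)   # pixel spacing between ranks
    col = round((px - left) / step_x)
    row = round((py - top) / step_y)
    # Clamp so that small detection errors near the edge stay on the board.
    col = min(max(col, 0), files - 1)
    row = min(max(row, 0), ranks - 1)
    return col, row


if __name__ == "__main__":
    # Grid spanning 800 x 900 pixels gives 100-pixel spacing in both axes.
    print(snap_to_grid(103, 48, 0, 0, 800, 900))
    print(snap_to_grid(810, 905, 0, 0, 800, 900))
```

Snapping to intersections makes the final board state robust to small pixel-level errors in the detected piece centre, which is one plausible reason for fusing grid information with piece coordinates.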
History
  • Received: 2021-04-29
  • Last revised: 2021-08-26
  • Accepted: 2021-08-26