1. College of Energy and Electrical Engineering, Hohai University; 2. Nanjing University of Science & Technology
Operating Fee for Central Universities
Pedestrian detection is one of the main tasks in autonomous driving. Pedestrians in images captured by vehicle-mounted cameras are usually small or medium-sized, or partially occluded, and existing deep neural networks detect such targets poorly. To address this, this paper proposes an improved YOLOv3 model based on ResNet34_D for pedestrian detection. The contributions are as follows. First, a modified residual network, ResNet34_D, is obtained by changing the structure of the convolutional block and is used as the YOLOv3 backbone, which reduces the model size and the training difficulty. Second, an SPP layer and a DropBlock module are added after the convolutional feature maps at each of the three scales of ResNet34_D, which improves the model's generalization and its detection accuracy for pedestrians of different sizes. Third, adaptive multi-scale anchor sizes are determined by k-means clustering, which improves the detection of large, medium, and small pedestrian targets. Finally, the DIoU loss function is introduced to improve the detection of occluded targets. Ablation experiments verify that each of these improvements contributes to detection accuracy. On the BDD100K-Person dataset, the proposed model reaches an AP50 of 69.8% at a detection speed of 130 FPS. Comparison experiments with existing object detection methods show that the proposed method has a lower false detection rate for small and occluded targets and runs faster, which indicates practical value.
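As an illustration of the DIoU loss mentioned above, the following is a minimal sketch of the commonly used definition, loss = 1 - IoU + d²/c², where d is the distance between the two box centers and c is the diagonal of the smallest box enclosing both. The abstract does not give the authors' implementation, so the function name `diou_loss`, the (x1, y1, x2, y2) box format, and the handling of degenerate boxes are assumptions made here for illustration only.

```python
def diou_loss(box_a, box_b):
    """Return 1 - DIoU for two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; not the paper's code.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers.
    dx = (ax1 + ax2) / 2 - (bx1 + bx2) / 2
    dy = (ay1 + ay2) / 2 - (by1 + by2) / 2
    center_dist_sq = dx * dx + dy * dy

    # Squared diagonal of the smallest box enclosing both boxes.
    enc_w = max(ax2, bx2) - min(ax1, bx1)
    enc_h = max(ay2, by2) - min(ay1, by1)
    diag_sq = enc_w * enc_w + enc_h * enc_h

    # Center-distance penalty; assumed to be zero for degenerate (point) boxes.
    penalty = center_dist_sq / diag_sq if diag_sq > 0 else 0.0
    return 1.0 - iou + penalty


if __name__ == "__main__":
    print(diou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
    print(diou_loss((0, 0, 2, 2), (1, 1, 3, 3)))  # partial overlap
    print(diou_loss((0, 0, 1, 1), (2, 0, 3, 1)))  # no overlap
```

Unlike a plain IoU loss, the penalty term stays informative when the boxes do not overlap at all: the loss still decreases as the predicted center moves toward the target center. This property is one plausible reason such a loss helps with crowded or partially occluded pedestrians, although the abstract itself does not spell out the mechanism.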