当前位置: 首页 > news >正文

python训练Faster RCNNC++调用训练好的模型进行物体检测-基于opencv3.4.3(超详细)

介绍

上一篇博文讲到tensorflow Object Detection api 基于SSD模型对数据进行训练,然后通过C++版本的opencv进行调用,但是通过实验发现,SSD虽然快但是准确率实在太低了,所以又重新使用Faster RCNN进行重新训练~废话不多说了,开始主要内容介绍了!

训练阶段

配置:GTX1060、I7-8700k~
关于object detection api的配置使用就不多说了,详细请参考:
https://blog.csdn.net/zong596568821xp/article/details/82015126
https://blog.csdn.net/chuquanchang1051/article/details/79804965
这里是基于faster_rcnn_resnet50_coco模型,下载链接见:faster_rcnn_resnet50_coco_2018_01_28,下载之后解压,解压后的文件如下图所示:
在这里插入图片描述

当解压模型后,这时需要到object_detection/samples/configs/文件夹中找到对应的config文件,这里是faster_rcnn_resnet50_coco.config文件,打开config文件,在这里我修改了它的类别数目,由于这里只有一种garbage类别,所以num_classes=1。

model {
faster_rcnn {
num_classes: 1  //有几种类别就写几种
image_resizer {
keep_aspect_ratio_resizer {
min_dimension: 600
max_dimension: 1024
}

除此之外,还需要配置一些文件目录,其中fine_tune_checkpoint就是下载模型中的model.ckpt,input_path和label_map_path就是训练数据和标签,其中object-detection.pbtxt是自己建立的文件,记录着自己的标签信息,因为这里只有一种类别,格式如下:

item {id: 1name: 'garbage'
}
use_moving_average: false
}
gradient_clipping_by_norm: 10.0
fine_tune_checkpoint: "model_zoo/faster_rcnn_resnet50_coco_2018_01_28/model.ckpt"
from_detection_checkpoint: true
# Note: The below line limits the training process to 200K steps, which we
# empirically found to be sufficient enough to train the pets dataset. This
# effectively bypasses the learning rate schedule (the learning rate will
# never decay). Remove the below line to train indefinitely.
num_steps: 200000
data_augmentation_options {
random_horizontal_flip {
}
}
}train_input_reader: {
tf_record_input_reader {
input_path: "train_data/garbage/train/tf_record/train.record"
}
label_map_path: "model_zoo/faster_rcnn_resnet50_coco_2018_01_28/object-detection.pbtxt"
}eval_config: {
num_examples: 116  #测试集的数据数量,需要修改下
# Note: The below line limits the evaluation process to 10 evaluations.
# Remove the below line to evaluate indefinitely.
max_evals: 10
}eval_input_reader: {
tf_record_input_reader {
input_path: "train_data/garbage/test/tf_record/test.record"
}
label_map_path: "model_zoo/faster_rcnn_resnet50_coco_2018_01_28/object-detection.pbtxt"
shuffle: false
num_readers: 1
}

修改完配置文件之后,回到object detection api的train.py文件中,train.py文件原来在legacy文件夹中,将其复制出来即可。在train.py文件中,修改以下两项内容:

#模型保存路径
flags.DEFINE_string('train_dir', default='training_model/garbage/',help='')
#修改过的配置文件.config路径
flags.DEFINE_string('pipeline_config_path', default='samples/configs/faster_rcnn_resnet50_coco.config',help='')

然后就可以训练了~
在这里插入图片描述
训练一晚后的效果!

模型的转换

当训练结束后,需要把训练好的模型转换为.pb文件,下图为训练生成的文件。
在这里插入图片描述
转换文件为export_inference_graph.py,源代码如下所示,其中转换的模型要为最好的模型:

import tensorflow as tf
from google.protobuf import text_format
from object_detection import exporter
from object_detection.protos import pipeline_pb2slim = tf.contrib.slim
flags = tf.app.flagsflags.DEFINE_string('input_type', 'image_tensor', 'Type of input node. Can be ''one of [`image_tensor`, `encoded_image_string_tensor`, ''`tf_example`]')
flags.DEFINE_string('input_shape', None,'If input_type is `image_tensor`, this can explicitly set ''the shape of this input tensor to a fixed size. The ''dimensions are to be provided as a comma-separated list ''of integers. A value of -1 can be used for unknown ''dimensions. If not specified, for an `image_tensor, the ''default shape will be partially specified as ''`[None, None, None, 3]`.')
flags.DEFINE_string('pipeline_config_path', 'samples/configs/faster_rcnn_resnet50_coco.config','Path to a pipeline_pb2.TrainEvalPipelineConfig config ''file.')
flags.DEFINE_string('trained_checkpoint_prefix', 'training_model/garbage/model.ckpt-103122','Path to trained checkpoint, typically of the form ''path/to/model.ckpt')
flags.DEFINE_string('output_directory', 'pb_model/faster_rcnn_resnet50_coco', 'Path to write outputs.')
flags.DEFINE_string('config_override', '','pipeline_pb2.TrainEvalPipelineConfig ''text proto to override pipeline_config_path.')
flags.DEFINE_boolean('write_inference_graph', False,'If true, writes inference graph to disk.')
tf.app.flags.mark_flag_as_required('pipeline_config_path')
tf.app.flags.mark_flag_as_required('trained_checkpoint_prefix')
tf.app.flags.mark_flag_as_required('output_directory')
FLAGS = flags.FLAGSdef main(_):pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()with tf.gfile.GFile(FLAGS.pipeline_config_path, 'r') as f:text_format.Merge(f.read(), pipeline_config)text_format.Merge(FLAGS.config_override, pipeline_config)if FLAGS.input_shape:input_shape = [int(dim) if dim != '-1' else Nonefor dim in FLAGS.input_shape.split(',')]else:input_shape = Noneexporter.export_inference_graph(FLAGS.input_type, pipeline_config, FLAGS.trained_checkpoint_prefix,FLAGS.output_directory, input_shape=input_shape,write_inference_graph=FLAGS.write_inference_graph)if __name__ == '__main__':tf.app.run()

转换好之后,会在对应的文件夹中生成.pb文件。
在这里插入图片描述

这和最初下载的模型非常相似,但是里面的结构有所不同。opencv在调用faster rcnn等物体检测模型时,还需要一个.pbtxt文件,它可以告诉函数该怎么读取模型,该转换由tf_text_graph_faster_rcnn.py来完成。

详细请参考https://github.com/opencv/opencv/wiki/TensorFlow-Object-Detection-API#generate-a-config-file,该文件在opencv的dnn模块中,下载链接:dnn模块,下载完之后放到object_detection文件夹下即可。

需要修改的地方有input、output、config、num_classes,如果没有num_classes,可以将其加上,也可以不加,.config文件中包含这项。


if __name__ == "__main__":parser = argparse.ArgumentParser(description='Run this script to get a text graph of ''Faster-RCNN model from TensorFlow Object Detection API. ''Then pass it with .pb file to cv::dnn::readNetFromTensorflow function.')parser.add_argument('--input', default='F:/tensorflow_object_detection/object_detection/pb_model/faster_rcnn_resnet50_coco/frozen_inference_graph.pb',help='Path to frozen TensorFlow graph.')parser.add_argument('--output', default='F:/tensorflow_object_detection/object_detection/pb_model/faster_rcnn_resnet50_coco/faster_rcnn.pbtxt',help='Path to output text graph.')parser.add_argument('--config',default='F:/tensorflow_object_detection/object_detection/samples/configs/faster_rcnn_resnet50_coco.config',help='Path to a *.config file is used for training.')# parser.add_argument('--num_classes', required=True, default=1,help='Path to a *.config file is used for training.')args = parser.parse_args()createFasterRCNNGraph(args.input, args.config, args.output)

转换完成之后会生成faster_rcnn.pbtxt文件,然后就交给C++调用吧!除了C++,还有一个opencv-python版本的测试程序:

import cv2 as cvcvNet = cv.dnn.readNetFromTensorflow('C:/Users/18301/Desktop/faster_rcnn_resnet50_coco/frozen_inference_graph.pb','C:/Users/18301/Desktop/faster_rcnn_resnet50_coco/faster_rcnn.pbtxt')img = cv.imread('C:/Users/18301/Desktop/images/image12.jpg')
rows = img.shape[0]
cols = img.shape[1]
cvNet.setInput(cv.dnn.blobFromImage(img, size=(300, 300), swapRB=True, crop=False))
cvOut = cvNet.forward()
print(cvOut)
for detection in cvOut[0,0,:,:]:score = float(detection[2])if score > 0.3:left = detection[3] * colstop = detection[4] * rowsright = detection[5] * colsbottom = detection[6] * rowscv.rectangle(img, (int(left), int(top)), (int(right), int(bottom)), (23, 230, 210), thickness=2)cv.imshow('img', img)
cv.waitKey()

Opencv调用Faster RCNN源代码

#include<opencv2\opencv.hpp>
#include<opencv2\dnn.hpp>
#include <iostream>
#include<map>
#include<string>
#include<time.h>using namespace std;
using namespace cv;//这里的宽度和高度不能太小了,否则识别率会降低,但是大的尺寸会消耗比较多的时间。
const size_t inWidth = 600;
const size_t inHeight = 600;
//const float WHRatio = inWidth / (float)inHeight;
const char* classNames[] = { "garbage" };//只有一类的话,写一类就好//这是coco数据集的类别
//const char* classNames[] = { "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light",
//"fire hydrant", "background", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe", "background", "backpack",
//"umbrella", "background", "background", "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket",
//"bottle", "background", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple", "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut",
//"cake", "chair", "couch", "potted plant", "bed", "background", "dining table", "background", "background", "toilet", "background", "tv", "laptop", "mouse", "remote", "keyboard",
//"cell phone", "microwave", "oven", "toaster", "sink", "refrigerator", "background", "book", "clock", "vase", "scissors", "teddy bear", "hair drier", "toothbrush", "background" };const float WHRatio = inWidth / (float)inHeight;int main() 
{String weights = "C:/Users/18301/Desktop/faster_rcnn_resnet50_coco/frozen_inference_graph.pb";String prototxt = "C:/Users/18301/Desktop/faster_rcnn_resnet50_coco/faster_rcnn.pbtxt";dnn::Net net = cv::dnn::readNetFromTensorflow(weights, prototxt);Mat frame = cv::imread("C:/Users/18301/Desktop/images/image17.jpg");Size frame_size = frame.size();cv::Mat blob = cv::dnn::blobFromImage(frame, 1, Size(inWidth, inHeight), false, true);//这里的格式是个坑,参考其它博客真的是不行,最后在google上找了好久才发现这个问题,这是个坑!net.setInput(blob);Mat output = net.forward();Mat detectionMat(output.size[2], output.size[3], CV_32F, output.ptr<float>());float confidenceThreshold = 0.5;for (int i = 0; i < detectionMat.rows; i++){float confidence = detectionMat.at<float>(i, 2);if (confidence > confidenceThreshold){size_t objectClass = (size_t)(detectionMat.at<float>(i, 1));int xLeftBottom = static_cast<int>(detectionMat.at<float>(i, 3) * frame.cols);int yLeftBottom = static_cast<int>(detectionMat.at<float>(i, 4) * frame.rows);int xRightTop = static_cast<int>(detectionMat.at<float>(i, 5) * frame.cols);int yRightTop = static_cast<int>(detectionMat.at<float>(i, 6) * frame.rows);ostringstream ss;ss << confidence;String conf(ss.str());Rect object((int)xLeftBottom, (int)yLeftBottom,(int)(xRightTop - xLeftBottom),(int)(yRightTop - yLeftBottom));rectangle(frame, object, Scalar(0, 255, 0), 2);String label = String(classNames[objectClass]) + ": " + conf;int baseLine = 0;Size labelSize = getTextSize(label, FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine);rectangle(frame, Rect(Point(xLeftBottom, yLeftBottom - labelSize.height),Size(labelSize.width, labelSize.height + baseLine)),Scalar(0, 255, 0), CV_FILLED);putText(frame, label, Point(xLeftBottom, yLeftBottom),FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 0));}}namedWindow("image", CV_WINDOW_NORMAL);imshow("image", frame);waitKey(0);return 0;
}

注意:cv::Mat blob = cv::dnn::blobFromImage(frame, 1, Size(inWidth, inHeight), false, true);这里会导致faster rcnn检测结果乱码,自己也是查了好久才发现,按这个格式来,亲测有效!

这里的宽度和高度不能太小了,否则识别率会降低,但是大的尺寸会消耗比较多的时间,自己找一个折衷吧!

const size_t inWidth = 600;
const size_t inHeight = 600;

实验结果

在这里插入图片描述
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述


http://www.taodudu.cc/news/show-1782117.html

相关文章:

  • 【physx/wasm】在physx中添加自定义接口并重新编译wasm
  • excel---常用操作
  • Lora训练Windows[笔记]
  • linux基础指令讲解(ls、pwd、cd、touch、mkdir)
  • InnoDB 事务处理机制
  • 启明云端ESP32 C3 模组WT32C3通过 MQTT 连接 AWS
  • mask rcnn数据转换为tfrecord数据
  • opencv3.4.2调用训练好的Openpose模型
  • python训练mask rcnn模型C++调用训练好的模型--基于opencv4.0(干货满满)
  • leetcode之移除链表的元素
  • leetcode之奇偶链表
  • leetcode之回文链表
  • ubuntu16.04下安装配置caffe2和detectron(亲测有效,非常简单)
  • leetcode之字符串中的第一个唯一字符
  • 哈希表中处理冲突的方法
  • SENet算法解析
  • SNIP物体检测算法理解
  • yolov3聚类自己数据的anchor box
  • Ubuntu下安装qt57creator-plugin-ros,在QT中进行ROS开发(亲测有效)
  • ROS下面调用自定义的头文件和.cpp/.so文件(亲测有效)
  • C++调用编译好的darknet来进行物体监测
  • hard-negative mining详细介绍
  • yolov3中如何进行聚类得到anchor box的
  • ubuntu16.04下编译安装Autoware
  • catkin_make和cmake
  • 深度学习三种分割定义
  • 基于激光雷达点云数据的目标检测
  • 递归和循环两种方式求解连续数的相加
  • PCL中把txt文件转换成.pcd文件(很简单)
  • pcl对点云进行直通滤波
  • error: (-205:Formats of input arguments do not match) All the matrices must have the same data type
  • ubuntu16.04下Qt无法输入中文注释
  • ROS下sensor_msgs::ImagePtr到sensor_msgs::Image之间的转换
  • ROS下同时接收多个话题并实现相机和雷达的数据融合
  • 柱状图之最大矩形面积
  • 哈希函数的构造方法以及哈希表解决冲突的方式