当前位置：首页 > news >正文

tensorflow线下训练SSD深度学习物体检测模型，C++线上调用模型进行识别定位（干货满满）

news 2024/5/20 3:51:44

一、介绍

近两年来出现了许多优秀的分类、物体检测和分割模型，包括SSD，yolo，Mask RCNN，这些模型速度快，准确率高，被广泛应用于无人驾驶、人脸定位识别、工业缺陷检测中。工业领域，系统一般基于C/C++，而这些模型基本都都是基于python的，那么如何把python训练的模型集成到C++中呢？Opencv搭了一个很好的桥梁。opencv4.0可以支持调用Mask RCNN、Faster RCNN、SSD、YOLO等模型。这样我们可以通过线下训练好这些模型，通过opencv把模型集成到系统中。

二、opencv调用模型

关于opencv如何调用深度学习模型请参考前面两篇博文：

https://blog.csdn.net/qq_29462849/article/details/85272575
https://blog.csdn.net/qq_29462849/article/details/85262609

opencv做了一个很好的tensorflow接口，这个时候就不得不提tensorflow object detection api，里面集成了SSD、Faster RCNN、Mask RCNN等多种模型，opencv和其无缝对接。

关于tensorflow object detection api的安装和配置请参考：

https://blog.csdn.net/zong596568821xp/article/details/82015126
https://blog.csdn.net/chuquanchang1051/article/details/79804965

如果在windows下配置时，出现缺少vc++14，那么下载安装即可。windows下如果proto成功了，但是安装不上，没关系，直接把整个object_detection模块拿来用就好。

三、训练数据

由于object detection api训练时采用的是tfrecords文件，所以需要进行转换，对voc数据来说，需要把xml文件转换成csv文件，源代码如下：

import os
import glob
import pandas as pd
import xml.etree.ElementTree as ETdef xml_to_csv(path):xml_list = []for xml_file in glob.glob(path + '/*.xml'):tree = ET.parse(xml_file)root = tree.getroot()for member in root.findall('object'):value = (root.find('filename').text,int(root.find('size')[0].text),int(root.find('size')[1].text),member[0].text,int(member[4][0].text),int(member[4][1].text),int(member[4][2].text),int(member[4][3].text))xml_list.append(value)column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']xml_df = pd.DataFrame(xml_list, columns=column_name)return xml_dfdef main():image_path = os.path.join('./train_data/tank/test/', 'Annotations')xml_df = xml_to_csv(image_path)xml_df.to_csv('./train_data/tank/test/csv/test.csv', index=None)print('Successfully converted xml to csv.')main()

当把xml文件转换成csv文件后，需要对csv文件进行转换，生成最终得tfrecords文件，转换代码如下所示：

"""
Usage:# From tensorflow/models/# Create train data:python generate_tfrecord.py --csv_input=data/train_labels.csv  --output_path=train.record# Create test data:python generate_tfrecord.py --csv_input=data/test_labels.csv  --output_path=test.record
"""
from __future__ import division
from __future__ import print_function
from __future__ import absolute_importimport os
import io
import pandas as pd
import tensorflow as tf
import sys
from PIL import Image# sys.path.append("F:\setup\tf\models-master\research\object_detection")
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDictflags = tf.app.flags
flags.DEFINE_string('csv_input', default='./train_data/tank/train/csv/train.csv',help='')
flags.DEFINE_string('output_path', default='./train_data/tank/train/tf_record/train.record',help='')
flags.DEFINE_string('image_dir', default='./train_data/tank/train/JPEGImages/',help='')
FLAGS = flags.FLAGS# TO-DO replace this with label map
def class_text_to_int(row_label):if row_label == 'tank':return 1elif row_label == 'white':return 2else:return 0def split(df, group):data = namedtuple('data', ['filename', 'object'])gb = df.groupby(group)return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]def create_tf_example(group, path):with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:encoded_jpg = fid.read()encoded_jpg_io = io.BytesIO(encoded_jpg)image = Image.open(encoded_jpg_io)width, height = image.sizefilename = group.filename.encode('utf8')image_format = b'jpg'xmins = []xmaxs = []ymins = []ymaxs = []classes_text = []classes = []for index, row in group.object.iterrows():xmins.append(row['xmin'] / width)xmaxs.append(row['xmax'] / width)ymins.append(row['ymin'] / height)ymaxs.append(row['ymax'] / height)classes_text.append(row['class'].encode('utf8'))classes.append(class_text_to_int(row['class']))tf_example = tf.train.Example(features=tf.train.Features(feature={'image/height': dataset_util.int64_feature(height),'image/width': dataset_util.int64_feature(width),'image/filename': dataset_util.bytes_feature(filename),'image/source_id': dataset_util.bytes_feature(filename),'image/encoded': dataset_util.bytes_feature(encoded_jpg),'image/format': dataset_util.bytes_feature(image_format),'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),'image/object/class/text': dataset_util.bytes_list_feature(classes_text),'image/object/class/label': dataset_util.int64_list_feature(classes),}))return tf_exampledef main(_):writer = tf.python_io.TFRecordWriter(FLAGS.output_path)path = os.path.join(FLAGS.image_dir)examples = pd.read_csv(FLAGS.csv_input)grouped = split(examples, 'filename')for group in grouped:tf_example = create_tf_example(group, path)writer.write(tf_example.SerializeToString())writer.close()output_path = os.path.join(os.getcwd(), FLAGS.output_path)print('Successfully created the TFRecords: {}'.format(output_path))if __name__ == '__main__':tf.app.run()

把数据准备好后，需要进行一些配置，主要是samples/config文件夹下面的.config文件。
在这里插入图片描述
这里以ssd_mobilenet_v1_coco.config为例，当使用这个模型时候需要首先下载预训练的一些参数，下载链接：模型。
更多模型请参考：model_zoo

需要对ssd_mobilenet_v1_coco.config进行修改，主要有num_class，batch_size，还有一些文件路径。

model {
ssd {
num_classes: 2 //几类物体就写几类
box_coder {
faster_rcnn_box_coder {
y_scale: 10.0
x_scale: 10.0
height_scale: 5.0
width_scale: 5.0
}

按照格式，改下路径即可。

fine_tune_checkpoint: "model_zoo/ssd_mobilenet_v1_coco_11_06_2017/model.ckpt"
from_detection_checkpoint: true
# Note: The below line limits the training process to 200K steps, which we
# empirically found to be sufficient enough to train the pets dataset. This
# effectively bypasses the learning rate schedule (the learning rate will
# never decay). Remove the below line to train indefinitely.
num_steps: 20000
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
ssd_random_crop {
}
}
}train_input_reader: {
tf_record_input_reader {
input_path: "train_data/train/tf_record/train.record"
}
label_map_path: "model_zoo/ssd_mobilenet_v1_coco_11_06_2017/object-detection.pbtxt"
}eval_config: {
num_examples: 100
# Note: The below line limits the evaluation process to 10 evaluations.
# Remove the below line to evaluate indefinitely.
#max_evals: 10
}eval_input_reader: {
tf_record_input_reader {
input_path: "train_data/test/tf_record/test.record"
}
label_map_path: "model_zoo/ssd_mobilenet_v1_coco_11_06_2017/object-detection.pbtxt"
shuffle: false
num_readers: 1
}

关于.pbtxt文件是自己建立的，格式如下所示：

item {id: 1name: 'tank'
}
item {id: 2name: 'white'
}

然后，找到train.py，配置好路径，可以训练了~~~

# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================r"""Training executable for detection models.This executable is used to train DetectionModels. There are two ways of
configuring the training job:1) A single pipeline_pb2.TrainEvalPipelineConfig configuration file
can be specified by --pipeline_config_path.Example usage:./train \--logtostderr \--train_dir=path/to/train_dir \--pipeline_config_path=pipeline_config.pbtxt2) Three configuration files can be provided: a model_pb2.DetectionModel
configuration file to define what type of DetectionModel is being trained, an
input_reader_pb2.InputReader file to specify what training data will be used and
a train_pb2.TrainConfig file to configure training parameters.Example usage:./train \--logtostderr \--train_dir=path/to/train_dir \--model_config_path=model_config.pbtxt \--train_config_path=train_config.pbtxt \--input_config_path=train_input_config.pbtxt
"""import functools
import json
import os
import tensorflow as tffrom object_detection.builders import dataset_builder
from object_detection.builders import graph_rewriter_builder
from object_detection.builders import model_builder
from object_detection.legacy import trainer
from object_detection.utils import config_utiltf.logging.set_verbosity(tf.logging.INFO)flags = tf.app.flags
flags.DEFINE_string('master', '', 'Name of the TensorFlow master to use.')
flags.DEFINE_integer('task', 0, 'task id')
flags.DEFINE_integer('num_clones', 1, 'Number of clones to deploy per worker.')
flags.DEFINE_boolean('clone_on_cpu', False,'Force clones to be deployed on CPU.  Note that even if ''set to False (allowing ops to run on gpu), some ops may ''still be run on the CPU if they have no GPU kernel.')
flags.DEFINE_integer('worker_replicas', 1, 'Number of worker+trainer ''replicas.')
flags.DEFINE_integer('ps_tasks', 0,'Number of parameter server tasks. If None, does not use ''a parameter server.')
flags.DEFINE_string('train_dir', default='training_model/tank/',help='')flags.DEFINE_string('pipeline_config_path', default='samples/configs/ssd_mobilenet_v1_coco.config',help='')flags.DEFINE_string('train_config_path', '','Path to a train_pb2.TrainConfig config file.')
flags.DEFINE_string('input_config_path', '','Path to an input_reader_pb2.InputReader config file.')
flags.DEFINE_string('model_config_path', '','Path to a model_pb2.DetectionModel config file.')FLAGS = flags.FLAGS@tf.contrib.framework.deprecated(None, 'Use object_detection/model_main.py.')
def main(_):assert FLAGS.train_dir, '`train_dir` is missing.'if FLAGS.task == 0: tf.gfile.MakeDirs(FLAGS.train_dir)if FLAGS.pipeline_config_path:configs = config_util.get_configs_from_pipeline_file(FLAGS.pipeline_config_path)if FLAGS.task == 0:tf.gfile.Copy(FLAGS.pipeline_config_path,os.path.join(FLAGS.train_dir, 'pipeline.config'),overwrite=True)else:configs = config_util.get_configs_from_multiple_files(model_config_path=FLAGS.model_config_path,train_config_path=FLAGS.train_config_path,train_input_config_path=FLAGS.input_config_path)if FLAGS.task == 0:for name, config in [('model.config', FLAGS.model_config_path),('train.config', FLAGS.train_config_path),('input.config', FLAGS.input_config_path)]:tf.gfile.Copy(config, os.path.join(FLAGS.train_dir, name),overwrite=True)model_config = configs['model']train_config = configs['train_config']input_config = configs['train_input_config']model_fn = functools.partial(model_builder.build,model_config=model_config,is_training=True)def get_next(config):return dataset_builder.make_initializable_iterator(dataset_builder.build(config)).get_next()create_input_dict_fn = functools.partial(get_next, input_config)env = json.loads(os.environ.get('TF_CONFIG', '{}'))cluster_data = env.get('cluster', None)cluster = tf.train.ClusterSpec(cluster_data) if cluster_data else Nonetask_data = env.get('task', None) or {'type': 'master', 'index': 0}task_info = type('TaskSpec', (object,), task_data)# Parameters for a single worker.ps_tasks = 0worker_replicas = 1worker_job_name = 'lonely_worker'task = 0is_chief = Truemaster = ''if cluster_data and 'worker' in cluster_data:# Number of total worker replicas include "worker"s and the "master".worker_replicas = len(cluster_data['worker']) + 1if cluster_data and 'ps' in cluster_data:ps_tasks = len(cluster_data['ps'])if worker_replicas > 1 and ps_tasks < 1:raise ValueError('At least 1 ps task is needed for distributed training.')if worker_replicas >= 1 and ps_tasks > 0:# Set up distributed training.server = tf.train.Server(tf.train.ClusterSpec(cluster), protocol='grpc',job_name=task_info.type,task_index=task_info.index)if task_info.type == 'ps':server.join()returnworker_job_name = '%s/task:%d' % (task_info.type, task_info.index)task = task_info.indexis_chief = (task_info.type == 'master')master = server.targetgraph_rewriter_fn = Noneif 'graph_rewriter_config' in configs:graph_rewriter_fn = graph_rewriter_builder.build(configs['graph_rewriter_config'], is_training=True)trainer.train(create_input_dict_fn,model_fn,train_config,master,task,FLAGS.num_clones,worker_replicas,FLAGS.clone_on_cpu,ps_tasks,worker_job_name,is_chief,FLAGS.train_dir,graph_hook_fn=graph_rewriter_fn)if __name__ == '__main__':tf.app.run()

训练好之后，需要把生成的模型转换为.pb模型，这时需要用到export_inference_graph.py。

import tensorflow as tf
from google.protobuf import text_format
from object_detection import exporter
from object_detection.protos import pipeline_pb2slim = tf.contrib.slim
flags = tf.app.flagsflags.DEFINE_string('input_type', 'image_tensor', 'Type of input node. Can be ''one of [`image_tensor`, `encoded_image_string_tensor`, ''`tf_example`]')
flags.DEFINE_string('input_shape', None,'If input_type is `image_tensor`, this can explicitly set ''the shape of this input tensor to a fixed size. The ''dimensions are to be provided as a comma-separated list ''of integers. A value of -1 can be used for unknown ''dimensions. If not specified, for an `image_tensor, the ''default shape will be partially specified as ''`[None, None, None, 3]`.')
flags.DEFINE_string('pipeline_config_path', 'samples/configs/ssd_mobilenet_v1_coco.config','Path to a pipeline_pb2.TrainEvalPipelineConfig config ''file.')
flags.DEFINE_string('trained_checkpoint_prefix', 'training_model/tank/model.ckpt-7589','Path to trained checkpoint, typically of the form ''path/to/model.ckpt')
flags.DEFINE_string('output_directory', 'pb_model/ssd_mobilenet_v1_coco_11_06_2017_tank', 'Path to write outputs.')
flags.DEFINE_string('config_override', '','pipeline_pb2.TrainEvalPipelineConfig ''text proto to override pipeline_config_path.')
flags.DEFINE_boolean('write_inference_graph', False,'If true, writes inference graph to disk.')
tf.app.flags.mark_flag_as_required('pipeline_config_path')
tf.app.flags.mark_flag_as_required('trained_checkpoint_prefix')
tf.app.flags.mark_flag_as_required('output_directory')
FLAGS = flags.FLAGSdef main(_):pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()with tf.gfile.GFile(FLAGS.pipeline_config_path, 'r') as f:text_format.Merge(f.read(), pipeline_config)text_format.Merge(FLAGS.config_override, pipeline_config)if FLAGS.input_shape:input_shape = [int(dim) if dim != '-1' else Nonefor dim in FLAGS.input_shape.split(',')]else:input_shape = Noneexporter.export_inference_graph(FLAGS.input_type, pipeline_config, FLAGS.trained_checkpoint_prefix,FLAGS.output_directory, input_shape=input_shape,write_inference_graph=FLAGS.write_inference_graph)if __name__ == '__main__':tf.app.run()

转换成.pb模型之后，就可以使用opencv调用了~，opencv调用时，需要改下ssd_mobilenet_v1_coco.pbtxt对应的类别数，也就是真正的类别数+1，在最后几行。。。然后就可以调用了