Object Detection by Tensorflow 2.x

Technology, 2025-03-26

Computer vision is one of the fastest-growing domains, and deep learning based approaches are now widely applied to real-world problems such as face recognition and cancer detection.

One of the most effective tools is the Tensorflow Object Detection API: take one of its pre-trained models, replace the last layer for the particular problem you are trying to solve, and fine-tune the model.

The API now supports Tensorflow 2.x. There are good reasons to use TF2 instead of TF1. For example, eager execution, which was introduced in TF1.5 to make coding simpler and debugging easier, is enabled by default, and new state-of-the-art (SOTA) models such as CenterNet, ExtremeNet and EfficientDet are available. The latest version as of writing is Tensorflow 2.3.

The API is backward compatible, but it is still easier to use TF1 if your runtime cannot be upgraded. A TF1 version of this step-by-step guide is available as a separate post.

Autonomous Vehicle (Image by Author)

Here, I take traffic light detection for a self-driving car as the example: red, yellow and green traffic lights have to be detected at high frequency (~10 fps), even on the car's limited computing resources, which may be restricted to older libraries because of dependencies.

Please note that this post only describes object detection by a machine learning approach. An actual self-driving car uses Lidar, Radar, GPS and maps, applies various filters for localization, object detection, trajectory planning and so on, and then drives actuators to accelerate, decelerate or turn the car, all of which is beyond the scope of this post.

The pre-trained model used is CenterNet Resnet50, which achieves high efficiency with reasonably good accuracy by means of a cascade corner pooling and center pooling architecture, with Resnet50 as the backbone network. This fits the use case well.

High-level image of the CenterNet Resnet50 architecture, from the CenterNet paper: https://arxiv.org/abs/1904.08189, and the Resnet paper: https://arxiv.org/abs/1512.03385

As for training data, KITTI (http://www.cvlibs.net/datasets/kitti/index.php) provides a comprehensive dataset for autonomous driving.

The steps below use Google Colab. If you train in a local environment, please refer to the README in the repo.

Part 1: Preparation on Local PC

1. Clone the project repo or create a new one

If you created a new repo, make the following directories:

mkdir annotations          # To store annotated inputs
mkdir images               # To store training images
mkdir models               # To store training pipeline config and training output
mkdir pre-trained-models   # To store pre-trained models
mkdir scripts              # For scripts
mkdir exported-models      # To store exported models after training

When creating a new repo, also copy all the scripts from the project repo's scripts directory into your own scripts directory.

2. Select which pre-trained model to use

Go to the Tensorflow 2 Detection Model Zoo on GitHub and download the model that fits your purpose. As of writing, all models there have been trained on the COCO dataset, which works well in most cases. Speed and accuracy (mAP) are a trade-off. I selected CenterNet Resnet50 V1 FPN 512x512, whose Speed is 27 and mAP is 31.2. Indeed, this is both faster and considerably more accurate than the SSD Mobilenet v2 in TF1 (Speed 31, mAP 21) that I used in the TF1 version of this guide!

You can of course choose a more complex model if your environment allows a much larger model size and a slower inference time.
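The trade-off above can be written down as a small selection rule. The following is only a sketch: it assumes the Speed column is per-image latency in milliseconds, uses just the two figures quoted in this post, and the helper name `pick_model` is hypothetical. Model zoo timings come from the benchmark hardware, so the real latency on a car's computer will differ.

```python
# Candidate models with the Speed and mAP figures quoted in the text.
# Assumption: Speed is per-image latency in milliseconds.
CANDIDATES = [
    {"name": "CenterNet Resnet50 V1 FPN 512x512", "speed_ms": 27, "map": 31.2},
    {"name": "SSD Mobilenet v2 (TF1)", "speed_ms": 31, "map": 21.0},
]


def pick_model(candidates, target_fps):
    """Return the highest-mAP candidate whose latency fits the frame budget."""
    budget_ms = 1000.0 / target_fps  # time available per frame
    feasible = [c for c in candidates if c["speed_ms"] <= budget_ms]
    if not feasible:
        return None  # nothing is fast enough for this frame rate
    return max(feasible, key=lambda c: c["map"])


# ~10 fps is the detection frequency mentioned for the traffic light use case.
best = pick_model(CANDIDATES, target_fps=10)
print(best["name"])
```

With a 100 ms budget both candidates qualify, so the rule falls back to the higher mAP.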

Once the model is decided, unarchive it and copy pipeline.config. This is the config that was used for the original training, and it needs to be modified later because we use it for fine-tuning. Alternatively, you can copy a sample pipeline config from the Tensorflow model repo, since you don’t actually need to download the model itself when using the Colab environment.

3. Prepare training data

As is always the case for supervised learning, you need to spend a few hours manually labeling the input data. Save the training images under the images directory; my repo provides traffic light images from a simulator, labeled in Pascal VOC format. There are a number of labeling tools available, and labelImg is one of the simplest classic ones for bounding boxes. Make sure you have Python 3 and simply install it from PyPI, or refer to its installation guide.

Save the xml files in the same directory. The good news is that there is no need to prepare tens of thousands of labels: because this is transfer learning, 100 labels per category will yield good results for fine-tuning.
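Before moving on, it can help to check how many labels each category actually has. The sketch below counts object names in Pascal VOC style XML; the sample annotation, the file name in it, and the helper names are all made up for illustration, and the threshold of 100 is simply the rule of thumb from the paragraph above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical annotation in the Pascal VOC layout that labelImg writes.
SAMPLE_XML = """<annotation>
  <filename>sample_0001.jpg</filename>
  <size><width>800</width><height>600</height><depth>3</depth></size>
  <object>
    <name>red</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>140</xmax><ymax>150</ymax></bndbox>
  </object>
  <object>
    <name>red</name>
    <bndbox><xmin>300</xmin><ymin>60</ymin><xmax>340</xmax><ymax>160</ymax></bndbox>
  </object>
</annotation>"""


def count_labels(xml_documents):
    """Count labelled objects per class name across annotation documents."""
    counts = Counter()
    for doc in xml_documents:
        root = ET.fromstring(doc)
        for obj in root.findall("object"):
            counts[obj.findtext("name")] += 1
    return counts


def under_labelled(counts, classes, minimum=100):
    """Return the classes that have fewer than `minimum` labels."""
    return [name for name in classes if counts.get(name, 0) < minimum]


label_counts = count_labels([SAMPLE_XML])
print(label_counts)
print(under_labelled(label_counts, ["green", "yellow", "red"]))
```

In practice the documents would be read from the xml files saved next to the images rather than from an inline string.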

4. Create Label Map

This is just a list of labels. The names should match the annotated labels from the previous step. Save the file as annotations/label_map.pbtxt

item {
    id: 1
    name: 'green'
}

item {
    id: 2
    name: 'yellow'
}

item {
    id: 3
    name: 'red'
}
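A mismatch between the label map and the annotated names is easy to miss, so a quick check can be worthwhile. This is a minimal sketch with hypothetical helper names: it parses only the simple id/name layout shown above using a regular expression (it is not a general protobuf text parser), and it assumes ids should start at 1 and run contiguously, since id 0 is reserved for the background class.

```python
import re

LABEL_MAP_TEXT = """
item {
    id: 1
    name: 'green'
}
item {
    id: 2
    name: 'yellow'
}
item {
    id: 3
    name: 'red'
}
"""

# Matches one "item { id: N name: '...' }" entry in the simple layout above.
ITEM_PATTERN = re.compile(
    r"item\s*\{\s*id:\s*(\d+)\s*name:\s*['\"]([^'\"]+)['\"]\s*\}"
)


def parse_label_map(text):
    """Return {name: id} for every item found in the label map text."""
    return {name: int(item_id) for item_id, name in ITEM_PATTERN.findall(text)}


def check_label_map(label_map, annotated_names):
    """Return (ids_are_1_to_n, names used in annotations but missing from the map)."""
    ids = sorted(label_map.values())
    ids_ok = ids == list(range(1, len(ids) + 1))
    missing = sorted(set(annotated_names) - set(label_map))
    return ids_ok, missing


label_map = parse_label_map(LABEL_MAP_TEXT)
print(label_map)
print(check_label_map(label_map, ["red", "green", "Red"]))
```

The second print flags 'Red' because the comparison is case sensitive, which mirrors how a capitalised label in an annotation would fail to match the map.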

The next step of training set preparation is to split the images into a train set and a test set, and then generate a TFRecord file for each set from the xml files. This can be done locally, but I included it in the steps on Colab.

5. Edit pipeline.config

Some changes are mandatory; others are optional but make the config fit the training data better.

model {
  center_net {
    num_classes: 3  # [MUST] Update this to the number of classes in the training images
    feature_extractor {
      type: "resnet_v1_50_fpn"  # [MUST] Make sure this matches the model
    }
    image_resizer {
      keep_aspect_ratio_resizer {  # [OPTIONAL] Change to the resizer which fits the training images
        min_dimension: 512
        max_dimension: 512
        pad_to_max_dimension: true
      }
    }
    ## omitted ##
  }
}

train_config: {
  batch_size: 8  # [MUST] Update based on the machine - a larger size may cause an out-of-memory error
  num_steps: 10000  # [MUST] Update to the number of steps you would like to perform
  ## omitted ##
  optimizer {
    adam_optimizer: {
      epsilon: 1e-7
      learning_rate: {
        cosine_decay_learning_rate {  # [MUST] Update to match the total steps and the lr shape. This affects model performance.
          learning_rate_base: 1e-3
          total_steps: 10000
          warmup_learning_rate: 2.5e-4
          warmup_steps: 1000
        }
      }
    }
    use_moving_average: false
  }
  max_number_of_boxes: 3  # [MUST] Update to the max number of objects per image
  unpad_groundtruth_tensors: false

  fine_tune_checkpoint_version: V2
  fine_tune_checkpoint: "/content/models/research/pretrained_model/checkpoint/ckpt-0"  # [MUST] Update to match the file path. /content is for Colab.
  fine_tune_checkpoint_type: "fine_tune"  # [MUST] Update to "fine_tune" for transfer learning. "classification" is for original training.
}

train_input_reader: {
  label_map_path: "annotations/label_map.pbtxt"  # [MUST] Update to match the file path.
  tf_record_input_reader {
    input_path: "annotations/train.record"  # [MUST] Update to match the file path.
  }
}

eval_config: {
  metrics_set: "coco_detection_metrics"
  use_moving_averages: false
  batch_size: 1;
}

eval_input_reader: {
  label_map_path: "annotations/label_map.pbtxt"  # [MUST] Update to match the file path.
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "annotations/test.record"  # [MUST] Update to match the file path.
  }
}
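To get a feel for the learning rate shape that the cosine_decay_learning_rate values describe, the schedule can be sketched in plain Python. This is the usual form of linear warmup followed by cosine decay, written under the assumption that it approximates what the API does; it is not the API's own implementation, and the function name is hypothetical. The defaults are the values from the config above.

```python
import math


def cosine_decay_with_warmup(step,
                             learning_rate_base=1e-3,
                             total_steps=10000,
                             warmup_learning_rate=2.5e-4,
                             warmup_steps=1000):
    """Sketch of linear warmup followed by cosine decay (assumed shape)."""
    if step < warmup_steps:
        # Ramp linearly from warmup_learning_rate up to learning_rate_base.
        slope = (learning_rate_base - warmup_learning_rate) / warmup_steps
        return warmup_learning_rate + slope * step
    # Fraction of the decay phase completed, clipped at 1.0 past total_steps.
    progress = min((step - warmup_steps) / (total_steps - warmup_steps), 1.0)
    return 0.5 * learning_rate_base * (1.0 + math.cos(math.pi * progress))


for step in (0, 500, 1000, 5500, 10000):
    print(step, cosine_decay_with_warmup(step))
```

The rate starts at the warmup value, peaks at the base value when warmup ends, and reaches zero at total_steps, which is why total_steps should be kept in line with num_steps when either is changed.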

Push the changes to GitHub and you are all set!

Part 2: Google Colab

Next, go to Google Colab and create a new notebook. traffic-light-detection-tf2.ipynb in my repo is a sample.

1. Install required libraries


# Select Tensorflow 2.x on Colab
%tensorflow_version 2.x
!pip install -q pillow lxml jupyter matplotlib cython pandas contextlib2
!apt-get install -qq protobuf-compiler
!pip install -q pycocotools tf_slim

2. Set up the variables


import os

# Repo URL
repo_url = 'https://github.com/yuki678/driving-object-detection'

# Models
MODELS_CONFIG = {
    'ssd_mobilenet_v2': {
        'model_name': 'ssd_mobilenet_v2_320x320_coco17_tpu-8',
        'model_path': '/models/tf2/my_ssd_mobilenet_v2/',
        'pipeline_file': 'pipeline.config'
    },
    'ssd_mobilenet_v2_fpn': {
        'model_name': 'ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8',
        'model_path': '/models/tf2/my_ssd_mobilenet_v2_fpnlite/',
        'pipeline_file': 'pipeline.config'
    },
    'my_centernet_resnet50_v1_fpn': {
        'model_name': 'centernet_resnet50_v1_fpn_512x512_coco17_tpu-8',
        'model_path': '/models/tf2/my_centernet_resnet50_v1_fpn/',
        'pipeline_file': 'pipeline.config'
    },
    'my_centernet_resnet101_v1_fpn': {
        'model_name': 'centernet_resnet101_v1_fpn_512x512_coco17_tpu-8',
        'model_path': '/models/tf2/my_centernet_resnet101_v1_fpn/',
        'pipeline_file': 'pipeline.config'
    }
}

# Select a model to use.
selected_model = 'my_centernet_resnet50_v1_fpn'

model_name = MODELS_CONFIG[selected_model]['model_name']
model_path = MODELS_CONFIG[selected_model]['model_path']
pipeline_file = MODELS_CONFIG[selected_model]['pipeline_file']

# Set Repository Home Directory
repo_dir_path = os.path.abspath(os.path.join('.', os.path.basename(repo_url)))

# Set Label Map (.pbtxt) path and pipeline.config path
label_map_pbtxt_fname = repo_dir_path + '/annotations/label_map.pbtxt'
pipeline_fname = repo_dir_path + model_path + pipeline_file

# Set .record path
test_record_fname = repo_dir_path + '/annotations/test.record'
train_record_fname = repo_dir_path + '/annotations/train.record'

# Set output directories and clean up
model_dir = repo_dir_path + '/training/'
output_dir = repo_dir_path + '/exported-models/'

!rm -rf {model_dir} {output_dir}
os.makedirs(model_dir, exist_ok=True)
os.makedirs(output_dir, exist_ok=True)

3. Clone Tensorflow model repo

Clone, compile protocol buffers, set PYTHONPATH and install.

# Clone Tensorflow model repo
%cd /content
!git clone --quiet https://github.com/tensorflow/models.git

# Compile protocol buffers
%cd /content/models/research
!protoc object_detection/protos/*.proto --python_out=.

# Set environment variables
import os
os.environ['PYTHONPATH'] += ':/content/models:/content/models/research/:/content/models/research/slim/'

# Install libraries
!pip install .

# Test
!python object_detection/builders/model_builder_test.py

Then, install the COCO API for evaluation.

# Coco Installation (Optional, required when using Coco Evaluation)
%cd /content
!git clone --quiet https://github.com/cocodataset/cocoapi.git
%cd cocoapi/PythonAPI
!make
!cp -r pycocotools /content/models/research/

4. Download the pre-trained model

%cd /content/models/research

import os
import shutil
import glob
import urllib.request
import tarfile

MODEL_FILE = model_name + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/tf2/20200711/'
DEST_DIR = '/content/models/research/pretrained_model'

if not (os.path.exists(MODEL_FILE)):
    urllib.request.urlretrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)

tar = tarfile.open(MODEL_FILE)
tar.extractall()
tar.close()

os.remove(MODEL_FILE)
if (os.path.exists(DEST_DIR)):
    shutil.rmtree(DEST_DIR)
os.rename(model_name, DEST_DIR)

# Check downloaded files
!echo {DEST_DIR}
!ls -alh {DEST_DIR}

# Set fine tune checkpoint
fine_tune_checkpoint = os.path.join(DEST_DIR, "checkpoint/ckpt-0")
print("fine_tune_checkpoint: ", fine_tune_checkpoint)

5. Clone the project repo

Make sure the local changes are committed to the master branch and pushed to GitHub beforehand.

import os
%cd /content

# Clean up
!rm -rf {repo_dir_path}

# Clone
!git clone {repo_url}

# Pull (just in case the repo already exists)
%cd {repo_dir_path}
!git pull

# Check if label map and pipeline files exist
assert os.path.isfile(label_map_pbtxt_fname), '`{}` not exist'.format(label_map_pbtxt_fname)
assert os.path.isfile(pipeline_fname), '`{}` not exist'.format(pipeline_fname)

6. Process input images (training dataset)

First, split the images into a train set and a test set.

%cd {repo_dir_path}

# Split images to train:test = 9:1
!python scripts/partition_dataset.py -x -i images/ -r 0.1

# Check test images
!ls images/test
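The partition script lives in the repo and is not reproduced in this post. As a rough idea of what a 9:1 split involves, here is a minimal sketch; the function name, the fixed seed and the file names are assumptions for illustration, and a real script would also move each image's xml file along with it (which is presumably what the -x flag requests).

```python
import random


def split_train_test(filenames, test_ratio=0.1, seed=0):
    """Shuffle deterministically and split file names into (train, test)."""
    shuffled = sorted(filenames)           # stable starting order
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_test = int(round(len(shuffled) * test_ratio))
    test = sorted(shuffled[:n_test])
    train = sorted(shuffled[n_test:])
    return train, test


# Hypothetical image names standing in for the contents of images/.
names = ["img_{:03d}.jpg".format(i) for i in range(50)]
train_files, test_files = split_train_test(names)
print(len(train_files), len(test_files))
```

Fixing the seed keeps the test set stable between runs, so evaluation results stay comparable when the notebook is re-executed.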

Then, convert the xml files to csv.

# Create train data:
!python scripts/xml_to_csv.py -i images/train -o annotations/train_labels.csv

# Create test data:
!python scripts/xml_to_csv.py -i images/test -o annotations/test_labels.csv
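The xml_to_csv.py script itself is in the repo. To illustrate the kind of flattening it performs, here is a sketch that turns one Pascal VOC annotation into csv rows, one row per bounding box. The column layout, the helper names and the sample annotation are assumptions for illustration and may not match the repo's script exactly.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumed column layout: one row per labelled box.
COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]

# Hypothetical Pascal VOC annotation for a single image.
SAMPLE_XML = """<annotation>
  <filename>light_0001.jpg</filename>
  <size><width>800</width><height>600</height><depth>3</depth></size>
  <object>
    <name>green</name>
    <bndbox><xmin>350</xmin><ymin>120</ymin><xmax>380</xmax><ymax>200</ymax></bndbox>
  </object>
</annotation>"""


def voc_to_rows(xml_text):
    """Flatten one annotation document into a list of csv rows."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        rows.append([
            filename, width, height, obj.findtext("name"),
            int(box.findtext("xmin")), int(box.findtext("ymin")),
            int(box.findtext("xmax")), int(box.findtext("ymax")),
        ])
    return rows


def rows_to_csv(rows):
    """Serialise rows (with a header line) to csv text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buffer.getvalue()


rows = voc_to_rows(SAMPLE_XML)
csv_text = rows_to_csv(rows)
print(csv_text)
```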

Finally, convert the csv files to TFRecord format.

# Create train data:
!python scripts/generate_tfrecord.py -c annotations/train_labels.csv -i images/train -x images/train -o annotations/train.record -l annotations/label_map.pbtxt

# Create test data:
!python scripts/generate_tfrecord.py -c annotations/test_labels.csv -i images/test -x images/test -o annotations/test.record -l annotations/label_map.pbtxt

# Check
assert os.path.isfile(test_record_fname), '`{}` not exist'.format(test_record_fname)
assert os.path.isfile(train_record_fname), '`{}` not exist'.format(train_record_fname)
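One detail worth knowing about this conversion: the records consumed by the Object Detection API store box corners as fractions of the image width and height rather than as pixels. The sketch below shows that normalization with a basic sanity check. The function name is hypothetical, and it is an assumption that the repo's generate_tfrecord.py does the equivalent internally.

```python
def normalize_box(xmin, ymin, xmax, ymax, width, height):
    """Convert pixel box corners to fractions of the image size."""
    if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
        # Catches swapped corners and boxes that fall outside the image.
        raise ValueError("box does not fit inside the image")
    return (xmin / width, ymin / height, xmax / width, ymax / height)


# A 200x150 pixel box inside an 800x600 image.
print(normalize_box(200, 150, 400, 300, 800, 600))
```

A check like this is a cheap way to catch annotation mistakes before they silently end up in the training records.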

7. Set up Tensorboard

It is very useful to launch Tensorboard to monitor the training progress. In the TF1 version we used an ngrok tunnel, but now a magic command is available to launch Tensorboard within the notebook!

# Set log directory for tensorboard to watch
LOG_DIR = model_dir

# Clean up the directory
!rm -rf {LOG_DIR}/*

# Use magic command to launch tensorboard within the notebook
%load_ext tensorboard
%tensorboard --logdir {LOG_DIR}

8. Train! Finally :)

Simply execute model_main_tf2.py. Note that TF1 uses model_main.py with slightly different arguments.

%cd {repo_dir_path}
!python /content/models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path={pipeline_fname} \
    --model_dir={model_dir} \
    --alsologtostderr

This takes a while: depending on the model and parameters, anywhere from a few minutes to hours or more. Have a couple of cups of tea, or even dinner, a shower, some TV and a sleep…

9. Evaluate (Optional but recommended)

In TF1, evaluation was done within the training process, but in TF2 it has been separated. This is useful when running locally but not so much in Colab. Anyway, you can run it after training by specifying checkpoint_dir, which makes the script run in evaluation mode. Set eval_timeout, or kill the process once evaluation is done, because otherwise it keeps waiting for new checkpoints indefinitely.

%cd {repo_dir_path}
!python /content/models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path={pipeline_fname} \
    --model_dir={model_dir} \
    --checkpoint_dir={model_dir} \
    --eval_timeout=60

Adjust the parameters in pipeline.config if the loss does not decrease, accuracy/recall does not improve, or the learning curve is not as expected. Aim for a total loss under 2.

At each evaluation step, it also provides a nice comparison between ground truth and detections in the images.
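The COCO detection metrics selected by metrics_set rest on intersection over union (IoU) between a ground truth box and a detected box; COCO mAP averages precision over IoU thresholds from 0.5 to 0.95. To make that comparison concrete, here is a minimal IoU sketch. The function name is hypothetical, and boxes are assumed to be (ymin, xmin, ymax, xmax), the order in which the API returns detection boxes.

```python
def iou(box_a, box_b):
    """Intersection over union of two (ymin, xmin, ymax, xmax) boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])

    intersection = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


ground_truth = (0.0, 0.0, 2.0, 2.0)
detection = (1.0, 1.0, 3.0, 3.0)
print(iou(ground_truth, detection))
```

A detection that barely overlaps the ground truth scores close to 0 and only counts as a match at the loosest thresholds, which is one reason a visually plausible box can still score a low mAP.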

10. Export the outputs

Don’t forget to export and download the trained model, which will be used for inference. Again, this step uses exporter_main_v2.py instead of the exporter_main.py used in TF1.

%cd {repo_dir_path}
!python /content/models/research/object_detection/exporter_main_v2.py \
    --input_type image_tensor \
    --pipeline_config_path {pipeline_fname} \
    --trained_checkpoint_dir {model_dir} \
    --output_directory {output_dir}

# Check the output files
!echo {output_dir}
!ls -lsr {output_dir}

Once the model is successfully exported, archive and download it.

# Archive the exported model
%cd {repo_dir_path}
!tar zcvf trained_model.tar.gz {output_dir}

# Download the archive from Colab
from google.colab import files
files.download('trained_model.tar.gz')

11. Predict! (Optional)

The whole purpose of training a model is, of course, to use it for inference. Here I use the test images for demonstration, but these can be any new images in which you would like to detect objects.

First, configure the image paths:

import os

# Use images in test dir (update this if you have other images for inference)
IMAGE_DIR = os.path.join(repo_dir_path, "images", "test")
IMAGE_PATHS = []

for file in os.listdir(IMAGE_DIR):
    if file.endswith(".jpg") or file.endswith(".png"):
        IMAGE_PATHS.append(os.path.join(IMAGE_DIR, file))

IMAGE_PATHS

Then, load the trained model:


%cd /content

import time
import tensorflow as tf  # Added as the Colab instance often crashes
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as viz_utils

# Label Map path
PATH_TO_LABELS = label_map_pbtxt_fname
# Saved model path
PATH_TO_SAVED_MODEL = os.path.join(output_dir, "saved_model")

print('Loading model...', end='')
start_time = time.time()

# Load saved model and build the detection function
detect_fn = tf.saved_model.load(PATH_TO_SAVED_MODEL)

end_time = time.time()
elapsed_time = end_time - start_time
print('Done! Took {} seconds'.format(elapsed_time))

# Set category index
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

Now, run inference on each image in the image paths.

import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')   # Suppress Matplotlib warnings

# This is required to display the images.
%matplotlib inline

for image_path in IMAGE_PATHS:
    print('Running inference for {}... '.format(image_path), end='')

    # Puts image into numpy array to feed into tensorflow graph.
    # Note that by convention we put it into a numpy array with shape
    # (height, width, channels), where channels=3 for RGB.
    image_np = np.array(Image.open(image_path))

    # The input needs to be a tensor, convert it using `tf.convert_to_tensor`.
    input_tensor = tf.convert_to_tensor(image_np)
    # The model expects a batch of images, so add an axis with `tf.newaxis`.
    input_tensor = input_tensor[tf.newaxis, ...]

    # input_tensor = np.expand_dims(image_np, 0)
    detections = detect_fn(input_tensor)

    # All outputs are batched tensors.
    # Convert to numpy arrays, and take index [0] to remove the batch dimension.
    # We're only interested in the first num_detections.
    num_detections = int(detections.pop('num_detections'))
    detections = {key: value[0, :num_detections].numpy()
                  for key, value in detections.items()}
    detections['num_detections'] = num_detections

    # detection_classes should be ints.
    detections['detection_classes'] = detections['detection_classes'].astype(np.int64)

    image_np_with_detections = image_np.copy()

    viz_utils.visualize_boxes_and_labels_on_image_array(
        image_np_with_detections,
        detections['detection_boxes'],
        detections['detection_classes'],
        detections['detection_scores'],
        category_index,
        use_normalized_coordinates=True,
        max_boxes_to_draw=20,
        min_score_thresh=.30,
        agnostic_mode=False)

    plt.figure(figsize=(12, 8))
    plt.imshow(image_np_with_detections)
    print('Done')
plt.show()
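For the self-driving use case, the drawn boxes eventually have to become a single decision about the light. The sketch below is one simple way to do that, not part of the original notebook: it keeps detections above the same 0.30 score threshold used for drawing and returns the most confident one. The function name, the `ID_TO_NAME` mapping (mirroring the label map) and the sample values are assumptions for illustration.

```python
# Assumed to mirror annotations/label_map.pbtxt.
ID_TO_NAME = {1: "green", 2: "yellow", 3: "red"}


def strongest_light(classes, scores, id_to_name, min_score=0.30):
    """Return (label, score) of the most confident detection, or None."""
    best = None
    for class_id, score in zip(classes, scores):
        if score < min_score:
            continue  # ignore weak detections
        if best is None or score > best[1]:
            best = (id_to_name[int(class_id)], float(score))
    return best


# Made-up values shaped like detections['detection_classes'] and
# detections['detection_scores'] for a single image.
sample_classes = [3, 1, 3]
sample_scores = [0.92, 0.41, 0.12]
print(strongest_light(sample_classes, sample_scores, ID_TO_NAME))
```

Because the function only iterates over the two sequences, the numpy arrays from the loop above can be passed in directly.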

Here are some sample outputs. The model detects and classifies the traffic lights in all cases with a good confidence level, even at the faster inference speed.
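Whether the model really meets the ~10 fps target depends on the hardware, so it is worth measuring. The helper below is a minimal sketch with hypothetical names: `measure_fps` times any callable over a list of inputs after a short warm-up, and `fake_detect` merely sleeps so the example runs without a model. In the notebook, one would pass `detect_fn` and a list of prepared input tensors instead; the warm-up matters there because the first call to a loaded model usually includes one-off setup cost.

```python
import time


def measure_fps(fn, inputs, warmup=1):
    """Return the number of calls per second of fn over inputs, after a warm-up."""
    for item in inputs[:warmup]:
        fn(item)  # warm-up calls are not timed
    start = time.perf_counter()
    for item in inputs:
        fn(item)
    elapsed = time.perf_counter() - start
    return len(inputs) / elapsed if elapsed > 0 else float("inf")


def fake_detect(image):
    """Stand-in for a detection function; pretends inference takes ~5 ms."""
    time.sleep(0.005)
    return {"num_detections": 0}


fps = measure_fps(fake_detect, list(range(10)))
print("{:.1f} fps".format(fps))
```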

This post explained how to use the Tensorflow Object Detection API 2.x for training and how to perform inference with the fine-tuned model. If you use Tensorflow 1.x, please see the TF1 version of this post. Sample code and images are available in my GitHub repo.

Translated from: https://towardsdatascience.com/object-detection-by-tensorflow-2-x-e1199558abc
