Object Detection service with YOLO and FastAPI

Deploy a custom-trained machine learning model that detects objects in images

Xin Cheng
9 min read · Sep 19, 2022

In my previous machine learning articles (1, 2, 3), I covered an important computer vision task: image classification. In this article, I want to explore the “object detection” task and deploy the machine learning model with FastAPI, so it can be consumed through a RESTful API interface.

The following article provides a good overview of object detection.

YOLO is a family of deep neural networks commonly used for the “object detection” task, and it is very fast. The latest version of YOLO is v7. We will use YOLO in this article.

COCO dataset format

In the “object detection” world, COCO (which stands for Common Objects in Context) is a dataset format used for object detection research. At a high level, it uses JSON for image annotation: it specifies the image location and other basic information and, most importantly, it annotates the bounding boxes in each image with the corresponding object category (e.g. which location in the image is an apple, which is a banana, etc.). The following is a partial sample of COCO JSON. Notice that annotations[].image_id refers to an actual images[].id and annotations[].category_id refers to a categories[].id, so you know which annotation refers to which bounding box/location in which image, and the category of that bounding box.
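A minimal, partial sketch of the COCO JSON structure (the values below are illustrative, not from a real dataset; bbox is [x, y, width, height] in pixels):

{
  "images": [
    {"id": 9, "file_name": "000000000009.jpg", "width": 640, "height": 480}
  ],
  "annotations": [
    {
      "id": 1,
      "image_id": 9,
      "category_id": 51,
      "bbox": [1.08, 187.69, 611.59, 285.84],
      "area": 120057.13,
      "iscrowd": 0
    }
  ],
  "categories": [
    {"id": 51, "name": "bowl", "supercategory": "kitchen"}
  ]
}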

For more information, refer to the following articles about COCO format.

Use the code in the link below to download a sample COCO dataset.

You can use pycocotools to parse a COCO dataset in Python. Below are some samples that parse annotations and render them on the image, as in this article.

# instances with bounding boxes and category text
from pathlib import Path

import matplotlib.pyplot as plt
import skimage.io as io
from pycocotools.coco import COCO

dataDir = Path('coco/images/val2017')
annFile = Path('coco/annotations/instances_val2017.json')
coco = COCO(annFile)
imgIds = coco.getImgIds()
imgs = coco.loadImgs(imgIds[-3:])
# squeeze=False keeps axs two-dimensional even when there is a single image
_, axs = plt.subplots(len(imgs), 2, figsize=(10, 5 * len(imgs)), squeeze=False)
for img, ax in zip(imgs, axs):
    I = io.imread(dataDir / img['file_name'])
    annIds = coco.getAnnIds(imgIds=[img['id']])
    anns = coco.loadAnns(annIds)
    ax[0].imshow(I)
    ax[1].imshow(I)
    plt.sca(ax[1])
    coco.showAnns(anns, draw_bbox=True)
    for ann in anns:
        cat = coco.loadCats(ann['category_id'])
        cat_name = cat[0]['name']
        # label each bounding box with its category name
        ax[1].text(ann['bbox'][0], ann['bbox'][1], cat_name, style='italic',
                   bbox={'facecolor': 'white', 'alpha': 0.7, 'pad': 5})

# person keypoints
dataDir = Path('coco/images/val2017')
annFile = Path('coco/annotations/person_keypoints_val2017.json')
coco = COCO(annFile)
imgIds = coco.getImgIds()
imgs = coco.loadImgs(imgIds[-1:])
_, axs = plt.subplots(len(imgs), 2, figsize=(10, 5 * len(imgs)), squeeze=False)
for img, ax in zip(imgs, axs):
    I = io.imread(dataDir / img['file_name'])
    annIds = coco.getAnnIds(imgIds=[img['id']])
    anns = coco.loadAnns(annIds)
    ax[0].imshow(I)
    ax[1].imshow(I)
    plt.sca(ax[1])
    coco.showAnns(anns, draw_bbox=False)

Model inference with YOLOv5 and FastAPI on Google Colab

When you run a FastAPI server on Google Colab, you want to expose the endpoint publicly, so that you can access it from the Internet. Previously you could use ngrok as described in this article and this colab notebook. However, it seems ngrok now requires you to register an account before using it. I also tried this colab notebook, but python minimal_server.py just hangs. I finally found colab-xterm, which solved the problem; you can run multiple xterms in parallel (e.g. server and client) after enabling third-party cookies.

!pip install colab-xterm
%load_ext colabxterm

Then run

%xterm
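Inside the xterms you can then start the server and the client side by side, for example (the script names follow the repo referenced below and are illustrative):

# first xterm: start the FastAPI server
python server_minimal.py
# second xterm: send a test request
python client_minimal.py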

I followed the repo below.

Quick recap of the FastAPI server code. FastAPI is a fast (high-performance) Python web framework for building APIs.

Typically, each time you hit a FastAPI endpoint hosting a machine learning model, the following happens:

  1. Based on your routing setup, FastAPI routes your request to the function annotated with the route
  2. Your code then uses the model (the model is typically loaded globally, to avoid loading it on every request)
  3. The model’s predict function (or a forward pass of the model) produces the prediction result
  4. The result is converted into JSON

The main uvicorn server code tells the server to listen at localhost:8000 and run the app instance, which is the FastAPI instance.

from fastapi import FastAPI

app = FastAPI()

# Python main entrypoint
if __name__ == '__main__':
    import uvicorn
    # app_str - the ASGI application to run, in the format "<module>:<attribute>"
    app_str = 'server_minimal:app'
    uvicorn.run(app_str, host='localhost', port=8000, reload=True, workers=1)

The main object detection API endpoint. FastAPI uses a decorator to say that the combination of the “/” path and the HTTP POST method should be routed to the process_home_form function. The function takes the passed-in file stream and model name, gets the prediction result, converts it to the required JSON format, and sends it back.

# imports used by this endpoint (the FastAPI app instance is defined above)
from io import BytesIO

import torch
from fastapi import File, Form, UploadFile
from PIL import Image

@app.post("/")
async def process_home_form(file: UploadFile = File(...),
                            model_name: str = Form(...)):
    '''
    Requires an image file upload and a model name (e.g. yolov5s).
    Returns: JSON response with a list of lists of dicts.
    Each dict contains class, class_name, confidence, normalized_bbox.
    Note: because this is an async method, the YOLO inference is a blocking
    operation.
    '''
    # the model is loaded per request here; load it globally to avoid this cost
    model = torch.hub.load('ultralytics/yolov5', model_name,
                           pretrained=True, force_reload=False)
    # This is how you decode + process the image with PIL
    results = model(Image.open(BytesIO(await file.read())))
    json_results = results_to_json(results, model)
    return json_results
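The helper results_to_json is defined in the repo; here is a minimal sketch of what such a function could look like, assuming YOLOv5’s results.xyxyn output (one tensor per image, each row being x1, y1, x2, y2, confidence, class, with coordinates normalized):

def results_to_json(results, model):
    # one list per image; each detection becomes a JSON-friendly dict
    return [
        [
            {
                'class': int(pred[5]),
                'class_name': model.names[int(pred[5])],
                'normalized_bbox': [float(x) for x in pred[:4]],
                'confidence': float(pred[4]),
            }
            for pred in image_preds  # rows: x1, y1, x2, y2, conf, class
        ]
        for image_preds in results.xyxyn
    ]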

If you look at the client code, you will see it is posting to http://localhost:8000, passing the model name and the image file stream.

import json
from pprint import pprint

import requests as r

def send_request(image='../images/zidane.jpg', model_name='yolov5s'):
    res = r.post("http://localhost:8000",
                 data={'model_name': model_name},
                 files={'file': open(image, "rb")})  # pass the files here
    pprint(json.loads(res.text))

if __name__ == '__main__':
    send_request()
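Equivalently, you can test the endpoint from a shell with curl (assuming the server is running locally and zidane.jpg is in the current directory):

curl -X POST http://localhost:8000 -F "model_name=yolov5s" -F "file=@zidane.jpg"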

YOLOv5 object detection RESTful API server

Object detection API client

In the real world, you will build a container image with the necessary dependencies (e.g. the Python runtime, the Python packages needed by FastAPI (e.g. fastapi, uvicorn) and YOLO (e.g. pytorch), and the machine learning model). Below is a sample Dockerfile.
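A minimal sketch of such a Dockerfile, assuming the server code lives in server_minimal.py and a requirements.txt pins fastapi, uvicorn, torch and the other YOLOv5 dependencies (both file names are illustrative):

FROM python:3.9-slim

WORKDIR /app

# install FastAPI, uvicorn, torch and the other YOLOv5 dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# copy the server code and the model weights into the image
COPY server_minimal.py .
COPY weights/ ./weights/

EXPOSE 8000

# serve the FastAPI app with uvicorn
CMD ["uvicorn", "server_minimal:app", "--host", "0.0.0.0", "--port", "8000"]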

YOLO transfer learning/custom training

You may also want to train your own model on your own data, starting from a pre-trained YOLO model. The YOLOv5 repo provides a tutorial. To use it, you need to provide the data in the following format.

  • images and labels folders
  • For each image, there needs to be a .txt file in the labels folder with the same name, e.g. data/images/train/000000000009.jpg needs to have data/labels/train/000000000009.txt containing the annotations
  • The label file format: each line has five numbers; the first is the category id, the other four denote the bounding box (normalized x_center, y_center, width, height). Each file can have multiple lines for multiple annotations, for example:

0 0.35750000000000004 0.53875 0.463334 0.39499999999999996
  • Training configuration file to provide information about the data and labels, e.g. in the following, you have a path for the dataset root, the relative paths of the training, validation and test datasets, and the list of classes under names

path: ../datasets/coco128  # dataset root dir
train: images/train2017  # train images (relative to 'path') 128 images
val: images/train2017  # val images (relative to 'path') 128 images
test:  # test images (optional)

# Classes
names:
  0: person
  1: bicycle
  2: car
  3: motorcycle
  4: airplane

YOLOv5 also separates the backbone (which extracts image features) from the head layers. Generally you want to keep the backbone layers’ weights untouched during transfer learning; the “freeze” parameter keeps the backbone layers frozen. The basic command to train on a custom dataset is

python train.py --img <image size> --batch <batch size> --epochs <epoch> --data <training config file/coco128.yaml> --weights <pretrained model> --freeze <number of layers to freeze>
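For example (illustrative values; in YOLOv5, --freeze 10 freezes the 10 backbone layers):

python train.py --img 640 --batch 16 --epochs 100 --data coco128.yaml --weights yolov5s.pt --freeze 10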

Use custom-trained model

python detect.py --weights runs/train/exp2/weights/best.pt --img 640 --conf 0.25 --source data/images
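To serve the custom-trained weights through the FastAPI endpoint shown earlier, you can load them with torch.hub’s 'custom' model instead of a pre-trained model name (a minimal sketch; the weights path comes from your own training run):

import torch

# load custom-trained weights instead of a pre-trained model name
model = torch.hub.load('ultralytics/yolov5', 'custom',
                       path='runs/train/exp2/weights/best.pt')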

For a quick-start project, FastAPI is fine for deploying a machine learning model. For large-scale production workloads, a mature model-serving framework is needed. That will be covered in another article.

Appendix

Tool to work with COCO format

The sahi Python package can generate COCO format (it is actually a lightweight vision library for object detection).

YOLO

History of YOLO: older versions of YOLO are based on Darknet, but from v5 onward, most implementations are based on PyTorch.

Deploy machine learning model with FastAPI

Object detection using language other than Python

YOLO transfer learning

PyTorch Lightning-based YOLO may not be mature yet; I only found some related posts.
