How to run inference with a pre-trained YOLOv8 object detection model on a single image
How to log image and prediction data locally using Babylog
How to load the logged binaries and view the logged information
Prerequisites
For this tutorial, we need the ultralytics and babylog Python libraries. We recommend installing them in a virtual environment.
pip install babylog
pip install ultralytics
Running a YOLOv8 model on a sample image
First, we load a sample image and initialize the YOLO detector with a pre-trained model.
from ultralytics import YOLO
import cv2
model = YOLO("yolov8n.pt")  # load the pre-trained YOLOv8 nano model
img = cv2.imread("bus.jpg")
Next, we run inference on the downloaded image with YOLO and measure the inference time. Note that we discard the first few runs and average the inference time (latency) over a number of samples.
import time
import numpy as np
times = []
for i in range(20):
    # not using cuda time measurement since inference is on cpu
    start_time = time.perf_counter()
    results = model(img)
    if i > 3:  # discard the first few (warm-up) measurements
        times.append(time.perf_counter() - start_time)
mean_latency = int(np.mean(times) * 1000) # ms
print('Avg execution time (ms): {}'.format(mean_latency))
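The warm-up-and-average pattern above can be pulled into a small reusable helper. The following is a minimal sketch using only the standard library; the name `measure_latency_ms`, its parameters, and the `dummy_inference` stand-in are hypothetical and not part of ultralytics or Babylog.

```python
import statistics
import time


def measure_latency_ms(fn, n_runs=20, n_warmup=4):
    """Call fn() n_runs times and return the mean latency in milliseconds.

    The first n_warmup calls are timed but excluded from the average,
    since early runs often include one-off setup costs.
    """
    if n_runs <= n_warmup:
        raise ValueError("n_runs must be greater than n_warmup")
    timings = []
    for i in range(n_runs):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        if i >= n_warmup:  # keep only the runs after warm-up
            timings.append(elapsed)
    return statistics.mean(timings) * 1000.0


# Stand-in workload for illustration; in this tutorial the callable
# would be something like `lambda: model(img)`.
def dummy_inference():
    time.sleep(0.002)


latency_ms = measure_latency_ms(dummy_inference, n_runs=10, n_warmup=3)
print("Avg execution time (ms): {:.1f}".format(latency_ms))
```

Using `time.perf_counter()` rather than `time.time()` is a deliberate choice here: it is a monotonic, high-resolution clock intended for measuring short durations.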
Finally, we convert the detected bounding box information to the babylog standard.
boxes = results[0].boxes.numpy() # Boxes object for bbox outputs
boxes_xywh = boxes.xywh # bounding boxes in (center x, center y, width, height) format
cls = boxes.cls # detected classes
conf = boxes.conf # detection confidence
# convert the bounding boxes to babylog's format
boxes_babylog = [{'x': int(box[0]),
                  'y': int(box[1]),
                  'width': int(box[2]),
                  'height': int(box[3]),
                  'confidence': float(conf_),
                  'classification': {model.names[int(cls_)]: 1.0}}
                 for box, conf_, cls_ in zip(boxes_xywh, conf, cls)
                 ]
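To make the shape of the converted data easier to inspect without running a model, the same conversion can be sketched as a standalone function. The function name `to_babylog_boxes` is hypothetical, and the sample boxes, confidences, and class mapping are made-up values for illustration, not real model output.

```python
def to_babylog_boxes(boxes_xywh, confidences, class_ids, class_names):
    """Build a list of detection dicts from parallel sequences.

    boxes_xywh:  iterable of (x, y, width, height) boxes
    confidences: iterable of detection confidences
    class_ids:   iterable of numeric class ids (ints or floats)
    class_names: mapping from integer class id to class name
    """
    return [
        {
            "x": int(box[0]),          # int() truncates toward zero
            "y": int(box[1]),
            "width": int(box[2]),
            "height": int(box[3]),
            "confidence": float(conf),
            "classification": {class_names[int(cls_id)]: 1.0},
        }
        for box, conf, cls_id in zip(boxes_xywh, confidences, class_ids)
    ]


# Made-up inputs for illustration only.
names = {0: "person", 5: "bus"}
sample = to_babylog_boxes(
    boxes_xywh=[(400.6, 480.2, 780.0, 500.9), (150.0, 650.0, 190.0, 500.0)],
    confidences=[0.87, 0.75],
    class_ids=[5.0, 0.0],
    class_names=names,
)
print(sample[0])
```

Each detection ends up as a plain dictionary of Python ints, floats, and strings, which is the structure the conversion above builds from the NumPy outputs.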
Logging predictions using babylog
Babylog provides a standard for efficient logging of prediction data: model information, inference statistics, predictions, and raw input images are bundled into a single binary file. In this tutorial we only log the data locally, but Babylog can also log directly to the cloud, stream the data via TCP, or log at pre-defined intervals. Babylog's example configuration lists the full set of configuration parameters. Logging prediction data takes only a few lines of code; for a full list of what you can log, please refer to the documentation.
import time
from babylog import Babylog, VisionModelType, InferenceDevice
bl = Babylog("babylog.config.yaml") # save_local is True by default
bl.log(
    image=img,
    model_type=VisionModelType.DETECTION,
    model_name="yolov8n_pretrained",
    model_version="0.0.1",
    latency=mean_latency,
    inference_device=InferenceDevice.CPU,
    detection=boxes_babylog,
)
bl.shutdown()
Viewing the logged information with babylog
We use the LoggedPrediction class to load the logged binary file.
from babylog import LoggedPrediction
# logfile_path must point to the binary file written by bl.log() in the previous step
logged_prediction = LoggedPrediction.from_path(logfile_path)
predicted_image = logged_prediction.image # raw image that was logged
Next, we draw the detected bounding boxes and overlay the detected class labels and confidence scores on the input image.
import numpy as np
for detection in logged_prediction.detection:
    # getting each logged bounding box; (x, y) is the box center
    x = detection['x']; y = detection['y']
    w = detection['width']; h = detection['height']
    top_left = (x - w // 2, y - h // 2)
    bottom_right = (x + w // 2, y + h // 2)
    # visualizing the bounding boxes
    color = (np.array([0., 0., 1.]) * 255).astype(np.uint8).tolist()
    text = '{}:{:.1f}%'.format(
        detection['classificationResult'][0]['className'],
        detection['confidence'] * 100)
    txt_color = (255, 255, 255)
    font = cv2.FONT_HERSHEY_SIMPLEX
    # text size is measured at scale 0.8 and halved below, since the label is drawn at scale 0.4
    txt_size = cv2.getTextSize(text, font, 0.8, 1)[0]
    cv2.rectangle(predicted_image, top_left, bottom_right, color, 2)
    txt_bk_color = (np.array([0.8, 0., 0.8]) * 255).astype(np.uint8).tolist()
    # filled rectangle behind the label text
    cv2.rectangle(
        predicted_image,
        (top_left[0], top_left[1] + 1),
        (top_left[0] + int(0.5 * txt_size[0]) + 1, top_left[1] + int(1.5 * txt_size[1])),
        txt_bk_color,
        -1
    )
    cv2.putText(predicted_image, text, (top_left[0], top_left[1] + txt_size[1]),
                font, 0.4, txt_color, thickness=1)
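The corner arithmetic in the loop above treats the logged x and y as the center of each box. As a small illustration of that step in isolation, here is a sketch with a hypothetical helper `center_to_corners` and a made-up detection; neither comes from Babylog.

```python
def center_to_corners(x, y, width, height):
    """Convert a center-based box to (top_left, bottom_right) pixel corners.

    Uses integer division, so a box with an odd width or height
    comes out one pixel smaller along that axis.
    """
    half_w, half_h = width // 2, height // 2
    return (x - half_w, y - half_h), (x + half_w, y + half_h)


# Made-up detection for illustration only.
detection = {"x": 100, "y": 60, "width": 40, "height": 21}
top_left, bottom_right = center_to_corners(
    detection["x"], detection["y"], detection["width"], detection["height"]
)
print(top_left, bottom_right)  # (80, 50) (120, 70)
```

OpenCV's `cv2.rectangle` expects two opposite corners as integer pixel coordinates, which is why the conversion is needed before drawing.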
Finally, we overlay inference statistics and model information on the image and display it with OpenCV's imshow. Click here to view the image with the overlaid information.