🎥 A YOLOv8 guide to logging computer vision data with Babylog

In this tutorial, we will learn the following:

  • How to run inference with a pre-trained YOLOv8 object detection model on a single image

  • How to log image and prediction data locally using Babylog

  • How to load the logged binaries and view the logged information

Pre-requisites

For this tutorial, we need to install the ultralytics and babylog libraries in Python. It is best to set up a virtual environment for this purpose.

pip install babylog
pip install ultralytics

Running a YOLOv8 model on a sample image

First, we load a sample image and initialize the YOLO detector with a pre-trained model.

from ultralytics import YOLO
import cv2
 
model = YOLO("yolov8n.pt")  # load a pretrained YOLOv8 nano model
img = cv2.imread("bus.jpg")
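
If you do not have bus.jpg locally, you can fetch the sample image before the cv2.imread call above. The URL below points to the bus.jpg sample from the Ultralytics assets; any other local image works just as well:

import urllib.request

# download the sample image used in this tutorial (skip if you already have an image)
urllib.request.urlretrieve("https://ultralytics.com/images/bus.jpg", "bus.jpg")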

Next, we run inference on the downloaded image with YOLO and measure the inference time. Note that we discard the first few runs and average the inference time (latency) over a number of samples.

import time
import numpy as np

times = []
for i in range(20):
    # not using CUDA time measurement since inference runs on the CPU
    start_time = time.time()
    results = model(img)
    if i > 3:  # discard the first few (warm-up) measurements
        times.append(time.time() - start_time)
mean_latency = int(np.mean(times) * 1000)  # ms
print('Avg execution time (ms): {}'.format(mean_latency))
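
Since we already collected the individual samples, we can also look at the spread of the latency rather than just its mean. The snippet below is a small optional addition that reports the 50th and 95th percentile of the samples gathered above:

# optional: report the spread of the latency samples collected above
p50, p95 = np.percentile(np.array(times) * 1000, [50, 95])
print('Latency p50/p95 (ms): {:.1f}/{:.1f}'.format(p50, p95))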

Finally, we convert the detected bounding box information to the babylog standard.

boxes = results[0].boxes.numpy()  # Boxes object for bbox outputs
boxes_xywh = boxes.xywh  # bounding boxes in x, y, width, height format
cls = boxes.cls  # detected classes
conf = boxes.conf # detection confidence

# convert the bounding boxes to babylog's format
boxes_babylog = [{'x': int(box[0]), 
                  'y': int(box[1]), 
                  'width': int(box[2]), 
                  'height': int(box[3]), 
                  'confidence': float(conf_),
                  'classification': {model.names[int(cls_)]: 1.0}}
                 for box, conf_, cls_ in zip(boxes_xywh, conf, cls)
                 ]
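
To sanity-check the conversion, you can print one of the converted entries; the keys are exactly the ones built in the comprehension above:

# inspect the first converted detection (if any objects were detected)
if boxes_babylog:
    print(boxes_babylog[0])
    # -> a dict with 'x', 'y', 'width', 'height', 'confidence' and 'classification'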

Logging predictions using babylog

Babylog provides a standard for efficiently logging prediction data: model information, inference statistics, predictions, and the raw input image are bundled into one file in binary format. In this tutorial, we only log this data locally, but it is also possible to log it directly to the cloud or stream it via TCP, and to log data at pre-defined intervals. You can find the full list of configuration parameters in an example here. Logging prediction data with Babylog takes only a few lines of code. For a full list of what you can log, please refer to the documentation.

from babylog import Babylog, VisionModelType, InferenceDevice

bl = Babylog("babylog.config.yaml")  # save_local is True by default

bl.log(
    image=img,
    model_type=VisionModelType.DETECTION,
    model_name="yolov8n_pretrained",
    model_version="0.0.1",
    latency=mean_latency,
    inference_device=InferenceDevice.CPU,
    detection=boxes_babylog,
)

bl.shutdown()
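
The same call also works inside a loop, for example when logging frames from a video file or camera. The sketch below is a hypothetical extension of the code above: the to_babylog_boxes helper and the video source are assumptions for illustration, and it reuses model, bl, VisionModelType and InferenceDevice from the earlier snippets (run it before calling bl.shutdown()):

import time

# hypothetical sketch: log a prediction for every frame of a video source,
# reusing the same bl.log arguments as above
def to_babylog_boxes(result):
    boxes = result.boxes.numpy()
    return [{'x': int(b[0]), 'y': int(b[1]),
             'width': int(b[2]), 'height': int(b[3]),
             'confidence': float(c),
             'classification': {model.names[int(k)]: 1.0}}
            for b, c, k in zip(boxes.xywh, boxes.conf, boxes.cls)]

cap = cv2.VideoCapture("video.mp4")  # or a camera index such as 0
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    start = time.time()
    result = model(frame)[0]
    bl.log(
        image=frame,
        model_type=VisionModelType.DETECTION,
        model_name="yolov8n_pretrained",
        model_version="0.0.1",
        latency=int((time.time() - start) * 1000),
        inference_device=InferenceDevice.CPU,
        detection=to_babylog_boxes(result),
    )
cap.release()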

Viewing the logged information with babylog

We use the LoggedPrediction class to load the logged binary file.
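
The path of the binary depends on where Babylog was configured to save (presumably set in babylog.config.yaml). If you want to pick the most recently written file automatically, a minimal sketch could look like this; the directory and the file-matching pattern are placeholders you should adjust to your configuration:

import glob
import os

# placeholder: point this at the directory your babylog configuration saves to
log_dir = "path/to/your/log/directory"
candidates = [p for p in glob.glob(os.path.join(log_dir, "**", "*"), recursive=True)
              if os.path.isfile(p)]
logfile_path = max(candidates, key=os.path.getmtime)  # most recently written file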

from babylog import LoggedPrediction

logged_prediction = LoggedPrediction.from_path(logfile_path)
predicted_image = logged_prediction.image  # raw image that was logged

Next, we draw the detected bounding boxes and overlay the detected class labels and confidence scores on the input image.

import numpy as np

for detection in logged_prediction.detection:
    # get each logged bounding box (x, y is the box center)
    x = detection['x']
    y = detection['y']
    w = detection['width']
    h = detection['height']
    top_left = (x - w // 2, y - h // 2)
    bottom_right = (x + w // 2, y + h // 2)

    # visualize the bounding box with its class label and confidence
    color = (np.array([0., 0., 1.]) * 255).astype(np.uint8).tolist()
    text = '{}:{:.1f}%'.format(detection['classificationResult'][0]['className'],
                               detection['confidence'] * 100)
    txt_color = (255, 255, 255)
    font = cv2.FONT_HERSHEY_SIMPLEX
    txt_size = cv2.getTextSize(text, font, 0.4, 1)[0]

    cv2.rectangle(predicted_image, top_left, bottom_right, color, 2)
    txt_bk_color = (np.array([0.8, 0., 0.8]) * 255).astype(np.uint8).tolist()
    cv2.rectangle(
        predicted_image,
        (top_left[0], top_left[1] + 1),
        (top_left[0] + txt_size[0] + 1, top_left[1] + int(1.5 * txt_size[1])),
        txt_bk_color,
        -1
    )
    cv2.putText(predicted_image, text, (top_left[0], top_left[1] + txt_size[1]),
                font, 0.4, txt_color, thickness=1)

Finally, we overlay inference statistics and model information on the image, and display the image with imshow using OpenCV. Click here to check the image with the overlaid information.

# Overlay inference stats
font = cv2.FONT_HERSHEY_SIMPLEX
device = logged_prediction.inference_stats['inferenceDevice']
latency = logged_prediction.inference_stats['latency']
stats = 'Inference stats: {}, {} ms'.format(device, latency)
predicted_image = cv2.putText(predicted_image, stats, (50, 50),
                              font, 1.0, (0, 255, 0), thickness=2)

# Overlay model info
model_version = logged_prediction.model['version']
model_name = logged_prediction.model['name']
stats = 'Model info: {} v{}'.format(model_name, model_version)
predicted_image = cv2.putText(predicted_image, stats, (50, 100),
                              font, 1.0, (0, 255, 0), thickness=2)

# Display the predicted image
cv2.imshow("Babylogged Prediction", predicted_image)
cv2.waitKey(0)
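
If you are running in a headless environment (for example the Colab notebook linked below), cv2.imshow cannot open a window; saving the annotated image to disk is a simple alternative:

# alternative for headless environments: write the annotated image to disk
cv2.imwrite("babylogged_prediction.jpg", predicted_image)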

Try it out yourself!

You can try out this tutorial in a Colab notebook. Follow us on LinkedIn and GitHub for more tutorials!
