YOLOv9 Architecture Explained | Stunning Vision AI

YOLOv9 Architecture Explained
By Dr. Priyanto Hidayatullah and Refdinal Tubagus

YOLOv9, the latest version in the YOLO object detection series, was released by Chien-Yao Wang and his team in February 2024. This version introduces two methods: Programmable Gradient Information (PGI) and the Generalized Efficient Layer Aggregation Network (GELAN). Both aim to address the information loss that occurs as data passes through the layers of a deep learning network, and to improve computational efficiency.

Architecture

Fig. 1 YOLOv9-C Architecture

YOLO is a single-stage object detection algorithm, which means it looks at an image only once. In general, a YOLO model consists of three sections: the backbone, the neck, and the head. The backbone is the part of the architecture that handles feature extraction. The neck is built with a pyramid network method, which combines features from multiple layers of the backbone. Multiple heads then detect objects at different resolutions.
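The backbone, neck, and head flow can be sketched with a toy example that tracks only grid sizes, not real layers. It assumes three feature levels at strides 8, 16, and 32, which is a common choice in YOLO-style detectors; the function names are hypothetical and are not taken from the YOLOv9 source code.

```python
# Toy sketch of a single-stage detector's data flow (shapes only, no real layers).
# Assumption: three feature levels at strides 8, 16, and 32.
STRIDES = (8, 16, 32)


def backbone(image_size):
    """Feature extraction: return one square grid size per stride."""
    return [image_size // stride for stride in STRIDES]


def neck(grid_sizes):
    """Feature fusion: a pyramid network mixes information across levels
    but keeps one output per level, so the grid sizes are unchanged."""
    return list(grid_sizes)


def heads(grid_sizes):
    """Detection: each head predicts once per grid cell at its own resolution."""
    return {f"{g}x{g}": g * g for g in grid_sizes}


def detect(image_size):
    """A single pass over the image: backbone -> neck -> heads."""
    return heads(neck(backbone(image_size)))


predictions = detect(640)
print(predictions)                 # {'80x80': 6400, '40x40': 1600, '20x20': 400}
print(sum(predictions.values()))   # 8400
```

Under these assumptions, a 640x640 input gives three grids, one per head, and every cell is predicted in the same pass over the image.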

In YOLOv9 there is an additional section, namely the auxiliary. The auxiliary makes the training process more reliable by providing additional information that links the input data to the target output. This addresses the problem of information being lost as it passes through the layers of a deep learning network.

During inference, the auxiliary can be removed to speed up the model without reducing accuracy.
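To see why a training-only section can be dropped at inference, consider the minimal sketch below. It is not the PGI implementation. `ToyDetector` and its scale values are hypothetical, and the only point is that the main output never depends on the auxiliary output.

```python
class ToyDetector:
    """Toy model with a main branch and an optional training-only auxiliary.

    Hypothetical illustration; the numbers stand in for real layers.
    """

    def __init__(self, with_auxiliary=True):
        self.main_scale = 2.0
        self.aux_scale = 0.5 if with_auxiliary else None

    def forward(self, x, training=False):
        main_out = self.main_scale * x
        if training and self.aux_scale is not None:
            # The auxiliary output gets its own loss term during training,
            # giving the shared layers an extra supervision signal.
            return main_out, self.aux_scale * x
        return main_out


full = ToyDetector(with_auxiliary=True)    # model as trained
slim = ToyDetector(with_auxiliary=False)   # auxiliary removed for deployment

train_out = full.forward(3.0, training=True)   # (main, auxiliary) pair
same = full.forward(3.0) == slim.forward(3.0)  # identical inference output
print(train_out, same)
```

Because the main branch is computed the same way in both models, removing the auxiliary saves computation at inference without changing the prediction.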

How to Draw the Architecture?

Fig. 2 How to draw the architecture from the YAML file

This architecture image is based on the yolov9-c.yaml file, which is located in the models/detect folder. To draw the architecture, we start from the backbone. The numbering starts from 0. The first block in the backbone is Silence, so it is given the number 0. After that, continue to the next blocks, such as Conv, RepNCSPELAN4, and ADown. For details of each block, please refer to the source code, specifically the yolo.py and common.py files.
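The numbering step can be sketched in a few lines of Python. The excerpt below is illustrative only: it uses the block names mentioned above and assumes the `[from, repeats, module, args]` entry layout common to YOLO-style YAML model files, and the argument values are placeholders that should be checked against the real yolov9-c.yaml. The helper `number_blocks` is hypothetical.

```python
# Illustrative excerpt written as a Python list instead of YAML.
# Assumed entry layout: [from, repeats, module, args]; args are placeholders.
BACKBONE_EXCERPT = [
    [-1, 1, "Silence", []],
    [-1, 1, "Conv", [64, 3, 2]],
    [-1, 1, "Conv", [128, 3, 2]],
    [-1, 1, "RepNCSPELAN4", [256, 128, 64, 1]],
    [-1, 1, "ADown", [256]],
]


def number_blocks(layers):
    """Assign each block its number (starting at 0) and resolve its input."""
    rows = []
    for index, (source, _repeats, module, _args) in enumerate(layers):
        # A negative "from" is relative: -1 means the previous block.
        absolute = index + source if source < 0 else source
        # Block 0 has no previous block, so it reads the input image.
        rows.append((index, module, "input" if absolute < 0 else absolute))
    return rows


rows = number_blocks(BACKBONE_EXCERPT)
for index, module, source in rows:
    print(f"{index}: {module} <- {source}")
```

Each printed row corresponds to one box in the drawing, and the resolved input tells you where to draw the incoming arrow.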

To be honest, YOLOv9 is very accurate in some cases, but YOLOv8 performs better in others, so it is worth learning both.

If you want to learn more about YOLOv9 architecture and its details along with its application, you can check out our YOLOv9-YOLOv8-YOLOv7: 3 IN ONE COURSE. With one enrollment, you get all three best models so far. Click this link for more information: https://bit.ly/YOLOv7_v8_v9_Course

How to Cite:

[1] Priyanto Hidayatullah and Refdinal Tubagus, “YOLOv9 Architecture Explained | Stunning Vision AI.” Accessed: Apr. 18, 2024. [Online]. Available: https://article.stunningvisionai.com/yolov9-architecture

@Article{hidayatullah_yolov9_architecture,
  AUTHOR = {Hidayatullah, Priyanto and Tubagus, Refdinal},
  TITLE = {YOLOv9 Architecture Explained},
  URL = {https://article.stunningvisionai.com/yolov9-architecture},
}