Training YOLO with a Custom Dataset: A Step-by-Step Guide
Object detection has become an essential technology in industries such as security, automation, and robotics. YOLO (You Only Look Once) is one of the most popular real-time object detection models due to its speed and accuracy. In this blog post, we walk through training YOLO on your own dataset, from installing dependencies to exporting the trained model for real-world use.
Step 1: Install Dependencies
To begin, install the dependencies for the version you plan to use. This guide covers both YOLOv5, which runs from a cloned repository, and YOLOv8, which ships in the Ultralytics package. For YOLOv5:
# Clone the YOLOv5 repository
git clone https://github.com/ultralytics/yolov5.git
cd yolov5
# Install required packages
pip install -r requirements.txt
For YOLOv8, you can install the Ultralytics package directly:
pip install ultralytics
Step 2: Prepare Your Dataset
YOLO expects one plain-text annotation file per image, with the same base name as the image and one line per object:
<class_id> <x_center> <y_center> <width> <height>
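class_id is a zero-based integer index into the class list. The four box values are normalized to the range 0 to 1: x_center and width are divided by the image width, y_center and height by the image height. Below is the correct dataset folder structure: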
/dataset
├── images
│   ├── train
│   │   ├── img1.jpg
│   │   └── img2.jpg
│   └── val
│       ├── img3.jpg
│       └── img4.jpg
├── labels
│   ├── train
│   │   ├── img1.txt
│   │   └── img2.txt
│   └── val
│       ├── img3.txt
│       └── img4.txt
└── data.yaml
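If your annotations start out as pixel coordinates, the label format and folder layout above can be sketched in a few lines of Python. This is a minimal illustration, not part of YOLO itself: the function names, the (x_min, y_min, x_max, y_max) input convention, and the list of image extensions are all assumptions made for the example.

```python
from pathlib import Path

# Assumed set of image extensions; extend it to match your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def to_yolo_line(class_id, box, image_size):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to one YOLO label line."""
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    if not (0 <= x_min < x_max <= img_w and 0 <= y_min < y_max <= img_h):
        raise ValueError(f"box {box} does not fit inside image {image_size}")
    # Box centre and size, each divided by the matching image dimension.
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def images_without_labels(dataset_root, split):
    """List images in images/<split> that have no labels/<split>/<stem>.txt."""
    root = Path(dataset_root)
    image_dir = root / "images" / split
    label_dir = root / "labels" / split
    missing = []
    for image_path in sorted(image_dir.iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        if not (label_dir / f"{image_path.stem}.txt").exists():
            missing.append(image_path.name)
    return missing


# A 200x200 pixel box in a 640x480 image, labelled as class 0.
print(to_yolo_line(0, (100, 200, 300, 400), (640, 480)))
# 0 0.312500 0.625000 0.312500 0.416667
```

An image without a label file is not necessarily an error, since YOLO treats it as a background image with no objects. The second helper therefore returns a list to review rather than raising an exception.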
Creating the data.yaml File
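This file tells YOLO where the training and validation images are and names the classes. The position of each name in the list is its class_id, so nc must equal the number of names: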
train: /path/to/dataset/images/train
val: /path/to/dataset/images/val
nc: 2 # Number of object classes
names: ['person', 'car'] # Object class names
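One way to keep nc and names from drifting apart is to generate the file from a single class list. The sketch below uses only the Python standard library; the function name build_data_yaml is hypothetical, and it relies on the fact that JSON strings and arrays are also valid YAML, so the output is quoted slightly differently from the hand-written file above.

```python
import json


def build_data_yaml(train_dir, val_dir, class_names):
    """Return data.yaml text with nc derived from the class list."""
    class_names = list(class_names)
    if not class_names:
        raise ValueError("at least one class name is required")
    if len(set(class_names)) != len(class_names):
        raise ValueError("class names must be unique")
    # json.dumps handles quoting; JSON strings and arrays are valid YAML.
    lines = [
        f"train: {json.dumps(str(train_dir))}",
        f"val: {json.dumps(str(val_dir))}",
        f"nc: {len(class_names)}",
        f"names: {json.dumps(class_names)}",
    ]
    return "\n".join(lines) + "\n"


text = build_data_yaml(
    "/path/to/dataset/images/train",
    "/path/to/dataset/images/val",
    ["person", "car"],
)
print(text)
```

Writing the returned text to dataset/data.yaml gives a file with nc: 2 and names: ["person", "car"], where the list order fixes class_id 0 as person and 1 as car.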
Step 3: Train the Model
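To train YOLOv5, run the following command from the yolov5 directory. It fine-tunes the pretrained yolov5s.pt weights for 50 epochs at an image size of 640 pixels with a batch size of 16. The --cache flag caches images to speed up loading, and the --data path should point to your data.yaml: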
python train.py --img 640 --batch 16 --epochs 50 --data dataset/data.yaml --weights yolov5s.pt --cache
For YOLOv8, use:
yolo train model=yolov8n.pt data=dataset/data.yaml epochs=50 imgsz=640
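The same YOLOv8 training run can also be started from Python, which is convenient inside notebooks or larger scripts. The sketch below uses the YOLO class and its train method from the Ultralytics package installed in Step 1; the wrapper function train_yolov8 and its defaults are assumptions that simply mirror the command above.

```python
def train_yolov8(data_yaml="dataset/data.yaml", weights="yolov8n.pt",
                 epochs=50, imgsz=640):
    """Fine-tune a pretrained YOLOv8 checkpoint on a custom dataset.

    Hypothetical convenience wrapper; returns whatever model.train returns.
    """
    # Imported lazily so this file can be loaded without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # loads the pretrained checkpoint
    return model.train(data=data_yaml, epochs=epochs, imgsz=imgsz)


# Example call (starts a real training run, so it is left commented out):
# train_yolov8(data_yaml="dataset/data.yaml", epochs=50)
```

The defaults match the CLI example, so calling train_yolov8() with no arguments should behave like the yolo train command.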
Step 4: Monitor Training Progress
YOLO logs losses, precision, recall, and mAP during training. YOLOv5 stores results in runs/train/exp/ (exp2, exp3, and so on for later runs), while YOLOv8 uses runs/detect/train/ by default. You can visualize YOLOv5 training performance using TensorBoard:
tensorboard --logdir=runs/train
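For a quick check without TensorBoard, the per-epoch metrics can be read from the results.csv file that YOLOv5 writes into the run directory. The sketch below is a minimal standard-library example; the function name best_epoch is hypothetical, and the column name metrics/mAP_0.5 is an assumption that may differ between versions (YOLOv8 uses different column names), so check the header of your own file.

```python
import csv


def best_epoch(results_csv, metric="metrics/mAP_0.5"):
    """Return (epoch, value) for the row with the highest value of `metric`."""
    with open(results_csv, newline="") as handle:
        reader = csv.reader(handle)
        # Column names are padded with spaces in the file, so strip them.
        header = [name.strip() for name in next(reader)]
        if metric not in header:
            raise KeyError(f"column {metric!r} not found; available: {header}")
        epoch_col = header.index("epoch")
        metric_col = header.index(metric)
        rows = [
            (int(float(row[epoch_col])), float(row[metric_col]))
            for row in reader
            if row  # skip blank lines
        ]
    if not rows:
        raise ValueError(f"{results_csv} contains no epochs")
    return max(rows, key=lambda pair: pair[1])


# Example call, assuming the first YOLOv5 run directory:
# epoch, value = best_epoch("runs/train/exp/results.csv")
```

If the best epoch is close to the last one, the model may still be improving and more epochs could help.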
Step 5: Evaluate and Test the Model
Once training is complete, run the best weights on new images. YOLOv5 saves the annotated results under runs/detect/exp/:
python detect.py --weights runs/train/exp/weights/best.pt --img 640 --source test_images/
For YOLOv8, whose weights are saved under runs/detect/train/ by default:
yolo detect predict model=runs/detect/train/weights/best.pt source=test_images/
Step 6: Export for Deployment
YOLO models can be exported to several formats for deployment. For YOLOv5, the following command exports to ONNX and TorchScript:
python export.py --weights runs/train/exp/weights/best.pt --include onnx torchscript
For YOLOv8:
yolo export model=runs/detect/train/weights/best.pt format=onnx
Final Thoughts
Training YOLO with a custom dataset enables real-world object detection for applications such as security, traffic monitoring, and automation. By following this step-by-step guide, you can prepare, train, and deploy your YOLO model effectively.
Would you like help automating the dataset preparation or optimizing training settings? Let us know in the comments!