LEGO Cobot
by Dr. Yuhan Jiang
LEGO Cobot 101
LEGO Cobot first reads the LEGO PDF building instructions, separates out the brick images required for each step, and uses an AI image classification model to identify the categories and quantities of LEGO bricks needed at that step.
Next, LEGO Cobot looks for the available bricks on the table via the camera, and an AI object detection model determines the coordinates and rotation angles of the corresponding LEGO pieces.
Then, LEGO Cobot uses its robotic arm to pick up the LEGO bricks shown in the building instructions. The LEGO players (kids) use the bricks picked by LEGO Cobot to assemble the model by following the building instructions.
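The three steps above can be sketched as a small matching loop. This is only an illustrative sketch; `Detection`, `plan_picks`, and the field names are hypothetical placeholders, not LEGO Cobot's actual API:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    part: str    # LEGO part number, e.g. "3001"
    x: float     # table x coordinate
    y: float     # table y coordinate
    angle: float # rotation angle in degrees

def plan_picks(bricks_needed, bricks_on_table):
    # match each part required by the current instruction step
    # to one brick detected on the table
    available = list(bricks_on_table)
    picks = []
    for part in bricks_needed:
        for det in available:
            if det.part == part:
                picks.append(det)
                available.remove(det)  # each physical brick is picked once
                break
    return picks

table = [Detection("3001", 0.10, 0.20, 45.0), Detection("3020", 0.30, 0.15, 0.0)]
print([d.part for d in plan_picks(["3020", "3001"], table)])  # ['3020', '3001']
```

In the real system, the detection model supplies `bricks_on_table` and the robotic arm consumes the picks.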
Jetson AGX Orin (Jetson SDK 5.1.2) Software Environment
Install Ultralytics Package
Here we install the Ultralytics package on the Jetson with optional dependencies so that we can export the PyTorch models to other formats. We will mainly focus on NVIDIA TensorRT exports, because TensorRT lets us get the maximum performance out of Jetson devices.
Update the package list, install pip, and upgrade it to the latest version
sudo apt update
sudo apt install python3-pip -y
pip install -U pip
Install the ultralytics pip package with optional dependencies
pip install ultralytics[export]
Reboot the device
sudo reboot
Install PyTorch and Torchvision
The ultralytics installation above pulls in Torch and Torchvision. However, these two packages installed via pip are not compatible with the Jetson platform, which is based on the ARM64 architecture. Therefore, we need to manually install a pre-built PyTorch pip wheel and compile/install Torchvision from source.
Uninstall currently installed PyTorch and Torchvision
pip uninstall torch torchvision
Install PyTorch 2.1.0 for JetPack 5.1.2
sudo apt-get install -y libopenblas-base libopenmpi-dev
wget https://developer.download.nvidia.com/compute/redist/jp/v512/pytorch/torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl -O torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl
pip install torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl
Install Torchvision v0.16.2 (matching PyTorch v2.1.0)
sudo apt install -y libjpeg-dev zlib1g-dev
git clone https://github.com/pytorch/vision torchvision
cd torchvision
git checkout v0.16.2
python3 setup.py install --user
Install onnxruntime-gpu
The onnxruntime-gpu package hosted on PyPI does not have aarch64 binaries for the Jetson, so we need to install it manually. This package is needed for some of the exports.
The onnxruntime-gpu packages for the different JetPack and Python versions are listed here. Here we download and install onnxruntime-gpu 1.17.0 with Python 3.8 support.
wget https://nvidia.box.com/shared/static/zostg6agm00fb6t5uisw51qi6kpcuwzd.whl -O onnxruntime_gpu-1.17.0-cp38-cp38-linux_aarch64.whl
pip install onnxruntime_gpu-1.17.0-cp38-cp38-linux_aarch64.whl
Installing onnxruntime-gpu automatically upgrades numpy to the latest version, so we need to reinstall numpy 1.23.5 to fix a compatibility issue:
pip install numpy==1.23.5
Optionally, install NVIDIA's TensorFlow build for JetPack 5.1.2:
sudo pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v512 tensorflow==2.12.0+nv23.06
Jetson: Install librealsense SDK with Debian packages
The docs suggest a simpler method for the latest JetPack versions.
Register the server’s public key
sudo apt-key adv --keyserver keys.gnupg.net --recv-key F6E65AC044F831AC80A06380C8B3A55A6F3EFCDE || sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-key F6E65AC044F831AC80A06380C8B3A55A6F3EFCDE
Add the server to the list of repositories
sudo add-apt-repository "deb https://librealsense.intel.com/Debian/apt-repo bionic main" -u
Install the SDK
sudo apt-get install librealsense2-utils
sudo apt-get install librealsense2-dev
Check the installation using
realsense-viewer
Jetson: Install PyCharm Professional
There is a separate tarball for ARM64 processors.
sudo tar xzf pycharm-*.tar.gz -C /opt/
cd /opt/pycharm-2024.2.4/bin
sh pycharm.sh
On Linux, the installation directory contains the launcher shell script pycharm.sh under bin. For example, if you installed PyCharm to /opt/pycharm, you can run the script using the following command:
/opt/pycharm-2024.2.4/bin/pycharm.sh
You can create a symbolic link to the launcher script in a directory from the PATH environment variable. For example, if you want to create a link named pycharm in /usr/local/bin, run the following command:
sudo ln -s /opt/pycharm-2024.2.4/bin/pycharm.sh /usr/local/bin/pycharm
Since /usr/local/bin should be in the PATH environment variable by default, you should be able to run the pycharm command from anywhere in the shell.
pycharm
LEGO Parts Object Detection with AI
The script below draws each *.xml annotation (part name and bounding box) on its image for visual inspection:
import os
import cv2
import xml.dom.minidom
image_path = "/media/.../B200 LEGO Detection Dataset/images/"
annotation_path = "/media/.../B200 LEGO Detection Dataset/annotations/"
files_name = os.listdir(image_path)
font = cv2.FONT_HERSHEY_SIMPLEX
fontScale = 1
fontColor = (255, 255, 255)
thickness = 2
lineType = 2
for filename_ in files_name:
    filename, extension = os.path.splitext(filename_)
    img_path = image_path + filename + '.png'
    xml_path = annotation_path + filename + '.xml'
    print(img_path)
    img = cv2.imread(img_path)
    if img is None:
        continue  # skip files that are not readable images
    dom = xml.dom.minidom.parse(xml_path)
    root = dom.documentElement
    objects = dom.getElementsByTagName("object")
    print(filename)
    i = 0
    for object in objects:
        name = root.getElementsByTagName("name")[i]
        name_data = name.childNodes[0].data
        bndbox = root.getElementsByTagName('bndbox')[i]
        xmin = bndbox.getElementsByTagName('xmin')[0]
        ymin = bndbox.getElementsByTagName('ymin')[0]
        xmax = bndbox.getElementsByTagName('xmax')[0]
        ymax = bndbox.getElementsByTagName('ymax')[0]
        xmin_data = xmin.childNodes[0].data
        ymin_data = ymin.childNodes[0].data
        xmax_data = xmax.childNodes[0].data
        ymax_data = ymax.childNodes[0].data
        print('Lego Part:', name_data, '@', xmin_data, '\t', ymin_data)
        i = i + 1
        bottomLeftCornerOfText = (int(xmin_data), int(ymin_data))
        cv2.putText(img, str(name_data), bottomLeftCornerOfText, font, fontScale, fontColor, thickness, lineType)
        cv2.rectangle(img, (int(xmin_data), int(ymin_data)), (int(xmax_data), int(ymax_data)), (55, 255, 155), 2)
    # end of one image
    cv2.imshow('xml', img)
    cv2.waitKey(10)
print("all done ====================================")
Object Detection AI Model Training Dataset Preparation
Convert the B200 LEGO Detection Dataset *.xml annotations to YOLO labels (*.txt files). The 200 LEGO part names are listed in part_list below.
import os
import xml.dom.minidom
TXT_EXT = '.txt'
image_path = "/media/jyh/3031-6638/B200 LEGO Detection Dataset/images/"
annotation_path = "/media/jyh/3031-6638/B200 LEGO Detection Dataset/annotations/"
img_width = 2048
img_height = 2048
files_name = os.listdir(image_path)
part_list = ['10247', '11090', '11211', '11212', '11214', '11458', '11476', '11477', '14704', '14719', '14769', '15068',
'15070', '15100', '15379', '15392', '15535', '15573', '15712', '18651', '18654', '18674', '18677', '20482',
'22388', '22885', '2357', '2412b', '2420', '24201', '24246', '2431', '2432', '2436', '2445', '2450',
'2454', '2456', '24866', '25269', '2540', '26047', '2654', '26601', '26603', '26604', '2780', '27925',
'28192', '2877', '3001', '3002', '3003', '3004', '3005', '3008', '3009', '3010', '30136', '3020', '3021',
'3022', '3023', '3024', '3031', '3032', '3034', '3035', '3037', '30374', '3039', '3040', '30413', '30414',
'3062b', '3065', '3068b', '3069b', '3070b', '32000', '32013', '32028', '32054', '32062', '32064', '32073',
'32123', '32140', '32184', '32278', '32316', '3245c', '32523', '32524', '32525', '32526', '32607', '32952',
'33291', '33909', '34103', '3460', '35480', '3622', '3623', '3660', '3665', '3666', '3673', '3700', '3701',
'3705', '3710', '3713', '3749', '3795', '3832', '3937', '3941', '3958', '4032', '40490', '4070', '4073',
'4081b', '4085', '4162', '41677', '41740', '41769', '41770', '42003', '4274', '4286', '43093', '43722',
'43723', '44728', '4477', '4519', '4589', '4599b', '4740', '47457', '48336', '4865', '48729', '49668',
'50950', '51739', '53451', '54200', '59443', '60470', '60474', '60478', '60479', '60481', '60483', '60592',
'60601', '6091', '61252', '6134', '61409', '61678', '62462', '63864', '63868', '63965', '64644', '6536',
'6541', '6558', '6632', '6636', '85080', '85861', '85984', '87079', '87083', '87087', '87552', '87580',
'87620', '87994', '88072', '88323', '92280', '92946', '93273', '98138', '98283', '99206', '99207', '99563',
'99780', '99781', '2429', '2430']
for filename_ in files_name:
    filename, extension = os.path.splitext(filename_)
    img_path = image_path + filename + '.png'
    xml_path = annotation_path + filename + '.xml'
    print(img_path)
    dom = xml.dom.minidom.parse(xml_path)
    root = dom.documentElement
    objects = dom.getElementsByTagName("object")
    print(filename)
    i = 0
    out_file = open(image_path + str(filename) + TXT_EXT, 'w', encoding="utf-8")
    for object in objects:
        name = root.getElementsByTagName("name")[i]
        name_data = str(name.childNodes[0].data)
        bndbox = root.getElementsByTagName('bndbox')[i]
        xmin = bndbox.getElementsByTagName('xmin')[0]
        ymin = bndbox.getElementsByTagName('ymin')[0]
        xmax = bndbox.getElementsByTagName('xmax')[0]
        ymax = bndbox.getElementsByTagName('ymax')[0]
        xmin_data = int(xmin.childNodes[0].data)
        ymin_data = int(ymin.childNodes[0].data)
        xmax_data = int(xmax.childNodes[0].data)
        ymax_data = int(ymax.childNodes[0].data)
        # Box coordinates must be in normalized xywh format (0 to 1): divide
        # x_center and width by the image width, y_center and height by the image height.
        x_center = float(xmin_data + xmax_data) / 2 / img_width
        y_center = float(ymin_data + ymax_data) / 2 / img_height
        width = float(xmax_data - xmin_data) / img_width
        height = float(ymax_data - ymin_data) / img_height
        if name_data not in part_list:
            part_list.append(name_data)
        classIndex = part_list.index(name_data)
        out_file.write("%d %.6f %.6f %.6f %.6f\n" % (classIndex, x_center, y_center, width, height))
        i = i + 1
    out_file.close()
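As a quick sanity check on the coordinate math used in the conversion, the VOC-to-YOLO formula can be run on a single hand-picked box:

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    # pixel corner coordinates -> normalized (x_center, y_center, width, height)
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height

# a 512x512 box centered in a 2048x2048 image
print(voc_to_yolo(768, 768, 1280, 1280, 2048, 2048))  # (0.5, 0.5, 0.25, 0.25)
```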
You may need to copy the *.txt files to the labels folder manually; see the example below.
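A small helper along these lines can do the move; the images/labels sibling-folder layout is the one Ultralytics expects, and `move_labels` is a hypothetical name, not part of the project's code:

```python
import os
import shutil

def move_labels(image_dir, label_dir):
    # move the generated *.txt label files out of the image folder
    # into the labels folder, leaving the images behind
    os.makedirs(label_dir, exist_ok=True)
    moved = 0
    for name in os.listdir(image_dir):
        if name.endswith(".txt"):
            shutil.move(os.path.join(image_dir, name), os.path.join(label_dir, name))
            moved += 1
    return moved
```

For example, `move_labels("images/train", "labels/train")` returns the number of label files moved.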
Install Python 3.8.10 (Windows installer, 64-bit):
https://www.python.org/downloads/release/python-3810/
Install CUDA 12.1 (Windows 11, network installer):
https://developer.nvidia.com/cuda-12-1-0-download-archive
Install PyTorch v2.1.0 (CUDA 12.1 build):
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
Install YOLO:
pip install ultralytics
Training YOLO
from ultralytics import YOLO
# Load a pretrained model (recommended for training)
model = YOLO("yolov8x-p2.pt")
# Train the model
results = model.train(data="E:/LegoCobot/B200LEGO.yaml", epochs=10, imgsz=640, plots=True, device=[0, 1])
Download, or use Notepad to create, the B200LEGO.yaml file (change *.txt to *.yaml), then copy and paste the following text. Adjust the path if needed.
# https://www.yuhanjiang.com/K12/lego-cobot
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: E:/LegoCobot/ # dataset root dir
train: images/train # train images (relative to 'path') 1800 images 200-1999
val: images/val # val images (relative to 'path') 200 images 0-199
test: # test images (optional)
# Classes
names:
0 : 10247
1 : 11090
2 : 11211
3 : 11212
4 : 11214
5 : 11458
6 : 11476
7 : 11477
8 : 14704
9 : 14719
10 : 14769
11 : 15068
12 : 15070
13 : 15100
14 : 15379
15 : 15392
16 : 15535
17 : 15573
18 : 15712
19 : 18651
20 : 18654
21 : 18674
22 : 18677
23 : 20482
24 : 22388
25 : 22885
26 : 2357
27 : 2412b
28 : 2420
29 : 24201
30 : 24246
31 : 2431
32 : 2432
33 : 2436
34 : 2445
35 : 2450
36 : 2454
37 : 2456
38 : 24866
39 : 25269
40 : 2540
41 : 26047
42 : 2654
43 : 26601
44 : 26603
45 : 26604
46 : 2780
47 : 27925
48 : 28192
49 : 2877
50 : 3001
51 : 3002
52 : 3003
53 : 3004
54 : 3005
55 : 3008
56 : 3009
57 : 3010
58 : 30136
59 : 3020
60 : 3021
61 : 3022
62 : 3023
63 : 3024
64 : 3031
65 : 3032
66 : 3034
67 : 3035
68 : 3037
69 : 30374
70 : 3039
71 : 3040
72 : 30413
73 : 30414
74 : 3062b
75 : 3065
76 : 3068b
77 : 3069b
78 : 3070b
79 : 32000
80 : 32013
81 : 32028
82 : 32054
83 : 32062
84 : 32064
85 : 32073
86 : 32123
87 : 32140
88 : 32184
89 : 32278
90 : 32316
91 : 3245c
92 : 32523
93 : 32524
94 : 32525
95 : 32526
96 : 32607
97 : 32952
98 : 33291
99 : 33909
100 : 34103
101 : 3460
102 : 35480
103 : 3622
104 : 3623
105 : 3660
106 : 3665
107 : 3666
108 : 3673
109 : 3700
110 : 3701
111 : 3705
112 : 3710
113 : 3713
114 : 3749
115 : 3795
116 : 3832
117 : 3937
118 : 3941
119 : 3958
120 : 4032
121 : 40490
122 : 4070
123 : 4073
124 : 4081b
125 : 4085
126 : 4162
127 : 41677
128 : 41740
129 : 41769
130 : 41770
131 : 42003
132 : 4274
133 : 4286
134 : 43093
135 : 43722
136 : 43723
137 : 44728
138 : 4477
139 : 4519
140 : 4589
141 : 4599b
142 : 4740
143 : 47457
144 : 48336
145 : 4865
146 : 48729
147 : 49668
148 : 50950
149 : 51739
150 : 53451
151 : 54200
152 : 59443
153 : 60470
154 : 60474
155 : 60478
156 : 60479
157 : 60481
158 : 60483
159 : 60592
160 : 60601
161 : 6091
162 : 61252
163 : 6134
164 : 61409
165 : 61678
166 : 62462
167 : 63864
168 : 63868
169 : 63965
170 : 64644
171 : 6536
172 : 6541
173 : 6558
174 : 6632
175 : 6636
176 : 85080
177 : 85861
178 : 85984
179 : 87079
180 : 87083
181 : 87087
182 : 87552
183 : 87580
184 : 87620
185 : 87994
186 : 88072
187 : 88323
188 : 92280
189 : 92946
190 : 93273
191 : 98138
192 : 98283
193 : 99206
194 : 99207
195 : 99563
196 : 99780
197 : 99781
198 : 2429
199 : 2430
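Rather than typing the 200-entry names: mapping by hand, it can be generated from the same part_list used for the label conversion (truncated here for brevity):

```python
part_list = ['10247', '11090', '2429', '2430']  # truncated; use the full 200-part list
# one "index : part" line per class, in the same order as the YOLO class indices
yaml_lines = ["names:"] + [f"  {i} : {name}" for i, name in enumerate(part_list)]
print("\n".join(yaml_lines))
```

Appending the printed lines to the path/train/val entries gives the complete B200LEGO.yaml.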
Below is the Python 3 code to download images from https://brickarchitect.com/parts/
from bs4 import BeautifulSoup
import requests
import os

# CREATE FOLDER
def folder_create(images):
    try:
        folder_name = input("Enter Folder Name:- ")
        os.mkdir(folder_name)
    except FileExistsError:
        # if a folder with that name already exists, ask for another name
        print("Folder exists with that name!")
        folder_create(images)
        return
    # image downloading start
    download_images(images, folder_name)

# DOWNLOAD ALL IMAGES FROM THAT URL
def download_images(images, folder_name):
    count = 0
    print(f"Total {len(images)} Images Found!")
    for image in images:
        try:
            # in the <img> tag, look for the "src" attribute
            image_link = image["src"]
        except KeyError:
            continue  # no source URL found, skip this tag
        # get the brick part file name, e.g. .../parts/3001.png -> 3001.png
        img_name = image_link[image_link.find("parts/") + len("parts/"):]
        print(img_name)
        try:
            r = requests.get(image_link).content
            with open(f"{folder_name}/{img_name}", "wb") as f:
                f.write(r)
            count += 1
        except requests.RequestException:
            pass  # some images may fail to download
    # it is possible that not all images downloaded
    if count == len(images):
        print("All Images Downloaded!")
    else:
        print(f"Total {count} Images Downloaded Out of {len(images)}")

# MAIN FUNCTION
def main(url):
    r = requests.get(url)                        # fetch the page content
    soup = BeautifulSoup(r.text, 'html.parser')  # parse the HTML
    images = soup.findAll('img')                 # find all images on the page
    folder_create(images)

url = input("Enter URL:- ")
main(url)
input: https://brickarchitect.com/parts/category-1
input: 1 as the example save folder name
Classification Dataset Preparation and Augmentation Strategies
1. The whole class set, for training the image-size-640 model
2. The smaller parts (<80 pixels, per Brick Architect), for training the image-size-64 and 96-pixel models
3. Image augmentation: padding -50 to 50 for 640, padding 0 to 20 for 96, padding 0 to 10 for 64; plus edge-detected and blurred (5x5 filter) variants
4. Adding color images from Brickset.com and LDraw.org:
url = "https://brickset.com/parts/design-" + brick
url = f'https://library.ldraw.org/images/library/official/parts/{brick}.png'
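The padding augmentation in strategy 3 can be sketched with NumPy; `pad_and_resize` is a hypothetical helper (a positive pad adds an edge-replicated border, a negative pad crops), not the exact script used to build the dataset:

```python
import numpy as np

def pad_and_resize(img, pad, out_size):
    # img: HxWx3 uint8 array; positive pad adds an edge-replicated border,
    # negative pad crops that many pixels from each side instead
    if pad >= 0:
        img = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    else:
        img = img[-pad:pad, -pad:pad]
    # nearest-neighbour resize back to the target square size
    h, w = img.shape[:2]
    rows = np.arange(out_size) * h // out_size
    cols = np.arange(out_size) * w // out_size
    return img[rows][:, cols]

img = np.zeros((640, 640, 3), dtype=np.uint8)
print(pad_and_resize(img, 50, 640).shape)  # (640, 640, 3)
```

Sampling the pad value uniformly from the ranges listed above yields one augmented copy per draw.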
YOLOv11 Classification Demo
LEGO Cobot Operation
WidowX 250 S
bot.arm.set_ee_cartesian_trajectory(roll=(brick_ang - 90) / 180 * np.pi, moving_time=0.5)  # pi rad = 180 degrees
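The roll argument converts the detected brick angle from degrees to radians, offset so that a 90-degree brick maps to zero roll. A quick check of that conversion (using math.pi in place of np.pi; `brick_roll` is just a name for this sketch):

```python
import math

def brick_roll(brick_ang):
    # degrees -> radians, with 90 degrees mapping to a roll of 0
    return (brick_ang - 90) / 180 * math.pi

print(round(brick_roll(45.0), 4))  # -0.7854 (i.e. -pi/4)
print(brick_roll(90.0))            # 0.0
```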
Reading LEGO PDF Building Instructions and Separating the Bricks for Each Step
Determine the Lego Brick Sizes, Coordinates, and Rotation Angle
Credit: https://i.sstatic.net/LhZRi.png
https://stackoverflow.com/questions/15956124/minarearect-angles-unsure-about-the-angle-returned
min_area_rectangle = cv2.minAreaRect(coord)  # returns a Box2D structure: ((center_x, center_y), (width, height), angle of rotation); to draw the rectangle, we need its 4 corners
length = min_area_rectangle[1][0]
width = min_area_rectangle[1][1]
center_x = min_area_rectangle[0][0]
center_y = min_area_rectangle[0][1]
rotation = min_area_rectangle[2]
brick_length = max(length, width)
brick_width = min(length, width)
brick_height = height
if length > width:
    rotation = rotation + 180
else:
    rotation = rotation + 90