
Building a mobile and anthropomorphic robotic arm with OpenCV for
traffic sign detection
Romeo Mihai Horeangă1
Coordinator: S.L. Dr. Ing. Vlad Ciobanu
Abstract — Real-time identification of traffic signs is a vital
component of modern vehicles. OpenCV running on a Raspberry Pi
is one approach to this problem. In this project I set out to
develop and build a low-cost, mini-rover-style mobile platform
carrying an anthropomorphic robotic arm, able to make real-time
decisions based on the video input from a camera mounted at the
front of the robot.
I. INTRODUCTION
A. Robots
Industrial robots emerged in response to the human need to
automate production processes, especially repetitive ones. In
addition to fixed industrial robots, mobile robots have lately
come into use. The main activities undertaken by industrial
mobile robots are linked to transporting and handling objects
and sometimes to carrying out processes (painting, inspection,
assembly, etc.).
The robot is a system composed of several elements:
- Mechanical structure
- Sensors
- Actuators
- Track mechanism
Robotics draws on a combination of disciplines: mechanical
engineering, electrical engineering and computer science. Among
the most important components of robots are the sensors, which
allow them to move through their environment.[1]
A robotic system can be reduced to its constituent elements,
namely:
- Operating space
- Energy source
- Information source
- The robot itself
Remote-controlled robots have various scientific uses, including
work in hazardous environments, in the deep ocean, and in space
exploration; police also use them to detect bombs or chemicals.
They are operated by radio control.[2]
B. Mechanical arm
The mechanical arm is the robot subsystem that moves the working
device; it is a mechanism with multiple degrees of freedom, each
driven by its own motor.
1Romeo H. is with the Master of Science in Advanced Computer
Architectures program, Faculty of Automatic Control and Computer
Science, University POLITEHNICA of Bucharest, Bucharest, Romania.
Coordinator: S.L. Dr. Ing. Vlad Ciobanu

The large number of degrees of mobility of the manipulator
mechanism offers the opportunity to achieve a wide variety of
end-effector movements solely by reprogramming the control
system, without requiring changes to its mechanical structure.[3]
Two broad categories of manipulator mechanisms have been
developed:
- Serial mechanisms
- Parallel mechanisms
Serial manipulator mechanisms have the simplest structure,
obtained by chaining kinematic elements starting from a fixed
base element. The kinematic couplings connecting them are
revolute (rotational) and/or prismatic (sliding) joints.
The anthropomorphic robotic arm used for this project has five
revolute joints and a gripper. The first joint is in the base and
rotates about the Z axis; this first revolute coupling is fixed
on the mobile platform.[5]
Fig. 1. Representation of the revolute joints
Denavit-Hartenberg analysis is used to determine the pa-
rameters of direct kinematics.
TABLE I
DENAVIT-HARTENBERG PARAMETERS

i | α_{i-1} | a_{i-1} | d_i | θ_i
1 | 0       | 0       | d_1 | θ_1
2 | 90      | 0       | 0   | θ_2
3 | 0       | a_3     | 0   | θ_3
4 | 0       | a_4     | 0   | θ_4 − 90
5 | −90     | 0       | d_5 | θ_5
6 | 0       | 0       | 0   | gripper
The transformation matrices of the individual couplings,
multiplied together, lead to the equation representing the
composition of the five joint transforms:

$$ T^5_0 = T^0_1 \, T^1_2 \, T^2_3 \, T^3_4 \, T^4_5 =
\begin{bmatrix}
n_x & o_x & a_x & p_x \\
n_y & o_y & a_y & p_y \\
n_z & o_z & a_z & p_z \\
0 & 0 & 0 & 1
\end{bmatrix} $$
The first three columns of the transformation matrix give the
gripper orientation, and the last column represents its position
in space.
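To make the composition of joint transforms concrete, the
following minimal Python sketch (an illustration added here, not
code from the project) builds each link transform from its row in
Table I using the modified Denavit-Hartenberg convention and
multiplies them; the link dimensions d_1, a_3, a_4, d_5 are
placeholder values:

import numpy as np

def dh_transform(alpha_prev, a_prev, d, theta):
    """Modified-DH homogeneous transform for one joint (angles in degrees)."""
    al, th = np.radians(alpha_prev), np.radians(theta)
    return np.array([
        [np.cos(th),             -np.sin(th),              0,           a_prev],
        [np.sin(th) * np.cos(al), np.cos(th) * np.cos(al), -np.sin(al), -d * np.sin(al)],
        [np.sin(th) * np.sin(al), np.cos(th) * np.sin(al),  np.cos(al),  d * np.cos(al)],
        [0, 0, 0, 1],
    ])

# DH rows from Table I; d1, a3, a4, d5 are placeholder link dimensions (m).
d1, a3, a4, d5 = 0.10, 0.12, 0.12, 0.06

def forward_kinematics(t1, t2, t3, t4, t5):
    rows = [(0, 0, d1, t1), (90, 0, 0, t2), (0, a3, 0, t3),
            (0, a4, 0, t4 - 90), (-90, 0, d5, t5)]
    T = np.eye(4)
    for row in rows:
        T = T @ dh_transform(*row)
    return T

print(forward_kinematics(0, 45, -30, 20, 0)[:3, 3])  # gripper position

The upper-left 3x3 block of the result is the gripper orientation
and the last column its position, matching the matrix above.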
C. Image processing
Computer vision is an interesting and dynamic part of cognitive
science and computer science. Following an explosion of interest
in the 1970s and 1980s, the last three decades have been
characterized by significant growth and maturation of the field
and its applications: remote sensing, diagnostic techniques,
autonomous vehicle guidance, biomedical imaging (2D, 3D and 4D)
and automatic surveillance are developing quickly. This progress
can be seen in the growing number of software and hardware
products on the market.[4]
Image processing is defined as any form of signal processing
whose input is an image or video sequence and whose output is
either another image / video or a set of features or parameters
describing the input. Computer vision is the science of making
machines see; sight here is the ability of the system to extract
information from images that can be used to solve a given task.
II. SIMULATIONS
A. Simulating the robotic arm in MATLAB
An application was developed in the MATLAB environment to
establish the workspace and determine the trajectory of the
gripper within it. For this, I used the Robotics Toolbox package
by Peter Corke. Each coupling was defined, along with each link's
dimensions and maximum angle of rotation. The graphical display
was produced with the "plot" function.
Fig. 2. MatLab Simulation of a 4DOF arm
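The simulation above was done with MATLAB's Robotics Toolbox; as
an illustrative analogue in this project's Python environment,
the following minimal sketch uses roboticstoolbox-python (Peter
Corke's Python port, assumed installed). The link dimensions are
placeholders, and RevoluteDH uses the standard DH convention, so
the parameters are only an approximation of Table I:

import numpy as np
from roboticstoolbox import DHRobot, RevoluteDH

# Placeholder link dimensions (m); a rough standard-DH analogue of Table I.
arm = DHRobot([
    RevoluteDH(d=0.10, a=0.0,  alpha=np.pi / 2),
    RevoluteDH(d=0.0,  a=0.12, alpha=0.0),
    RevoluteDH(d=0.0,  a=0.12, alpha=0.0),
    RevoluteDH(d=0.06, a=0.0,  alpha=-np.pi / 2),
], name='4DOF arm')

q = np.radians([0, 45, -30, 20])   # a sample joint configuration
print(arm.fkine(q))                # gripper pose as a homogeneous transform
arm.plot(q, block=True)            # stick-figure plot, like Fig. 2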
B. Robotic arm design in Solidworks
The SolidWorks environment was used to design the robotic arm:
the basic models of the arm, its components, and the final
assembly were created there. (Fig. 3.)
Fig. 3. Anthropomorphic robotic arm in SolidWorks design environment
C. Prehensile mechanisms
The most common gripping systems are those composed of gears and
linkage mechanisms. Fig. 4 shows a parallel gripper; this model
was used in building the robotic arm, being the last element of
the driveline. The actuators of a gripping system must fulfil its
main tasks: ensuring sufficient clamping force, accuracy,
reliability, flexibility and compliance. Depending on the type of
energy used for actuation, the motors can be electric, hydraulic,
pneumatic or of unconventional type. Electric motors are widely
used in gripper control systems thanks to their simplicity.
Hydraulic motors, linear or rotary, are used in applications that
involve large clamping forces, while pneumatic actuation is used
in applications where the necessary forces are lower but
compliance is an important feature. [6] In this case a servo
motor is used to drive the gripping system.
III. SYSTEMS
A. The drive system of the couplings – servo motors
One common solution for actuating the modules of a robotic system
is the so-called servo motor.
Recent years have been marked by representative developments in
servo motors' size, rotation speed and torque. The latest
improvement is the development of digital servos, which have
significant functional advantages over standard (analog) ones.
Basically, a digital servo is similar to a standard one, except
that a microprocessor analyzes the incoming signals and controls
the motor. The differences lie in how the incoming signals are
processed and how power is transmitted to the motor, reducing
dead time, increasing resolution and generating higher holding
torque. Conventional servo motors do not power the motor while at
rest. With digital servos,

Fig. 4. Prehensile mechanisms in SolidWorks design environment
when an order to move is received, or when torque is applied to
the output shaft, the servo answers by supplying voltage to the
motor. This power, effectively the maximum voltage, is
transmitted with on/off pulse modulation at a fixed rate of 50
cycles per second, generating short voltage pulses. Increasing
the duration of each pulse provides speed control, up to the
point where the maximum voltage is applied to the motor,
accelerating the servo arm to its new position. The end of the
servo movement is detected and reported to the servo electronics,
which reduce the power pulses by shortening their duration until
the motor stops.
TABLE II
MECHANICAL, ELECTRICAL AND GEOMETRICAL CHARACTERISTICS

Size (mm)    | Mass (g) | Torque at 4.8 V (Nmm) | Speed (s/45°) | Holding torque
40.5x20x36.1 | 42       | 310                   | 0.21          | 770
The advantages of digital servo motors are:
- First, thanks to the microprocessor, it is possible to receive
the input signal and apply parameters to it before a pulse is
sent to the motor. This means that the pulse duration, and thus
the summed voltage sent to activate the motor, can be adjusted by
programming the microprocessor to match the required functions
and further optimize servo performance.
- Secondly, a digital servo sends pulses to the motor at a
significantly higher frequency. Instead of receiving 50 pulses
per second, the motor receives 300 pulses per second. Although
the pulse duration is reduced in direct proportion to the
frequency, because the motor voltage changes more often the servo
develops larger torque. This means that digital servos respond
more quickly to commands.
Motor control was achieved using the PWM (Pulse-Width Modulation)
channels of the Arduino Mega microcontroller. PWM control means
adjusting the width of a pulse, and it is one of the most common
modern methods of regulating the speed of DC motors. PWM control
can be done in open or closed loop. The pulse width is
proportional to the input voltage U, and the average output
voltage Uout is proportional to the pulse duration. This results
in a linear dependence between Uout and U. Studies so far have
observed the appearance of a hysteresis between the increase and
decrease of the period t. This mode of applying the pulses is
shown in Figure 5.
Fig. 5. Pulse-width modulation
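As an illustration of the proportionality described above, the
following Python sketch (our own hypothetical helper, not project
code) converts a desired servo angle into the pulse width and
duty cycle of a standard 50 Hz hobby-servo signal, assuming the
usual 1–2 ms pulse range:

PERIOD_MS = 1000.0 / 50                  # 50 Hz frame -> 20 ms period
MIN_PULSE_MS, MAX_PULSE_MS = 1.0, 2.0    # typical hobby-servo range (assumption)

def servo_pulse(angle_deg, max_angle=180.0):
    """Map a servo angle to pulse width (ms) and PWM duty cycle (%)."""
    pulse = MIN_PULSE_MS + (MAX_PULSE_MS - MIN_PULSE_MS) * angle_deg / max_angle
    duty = 100.0 * pulse / PERIOD_MS
    return pulse, duty

for a in (0, 90, 180):
    p, d = servo_pulse(a)
    print(f"{a:3d} deg -> {p:.2f} ms pulse, {d:.2f}% duty")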
B. The drive system of DC motors
In order to achieve accurate positioning of the tracked mobile
platform, an H-bridge was used. An H-bridge is an electronic
circuit that allows a voltage to be applied across a load in
either direction. These circuits are often used in robotics and
other applications to allow DC motors to run forwards and
backwards. H-bridges are available as integrated circuits or may
be constructed from discrete components, bipolar transistors or
MOS transistors. The H-bridge takes its name from the usual way
of drawing the circuit (see Figure 6). It is the standard way to
drive a motor in both directions with solid-state switches.
Fig. 6. Principle scheme of bridge H
Operating modes of an H-bridge:
- When switches S1 and S4 are closed and S2 and S3 are open, a
positive voltage is applied to the motor.
- By opening S1 and S4 and closing S2 and S3, the voltage is
reversed, allowing the motor to run in the opposite direction.
- Using the above nomenclature, switches S1 and S2 must never be
closed simultaneously, as this would short-circuit the power
supply (Vin).
The L298 integrated circuit is a mid-level driver in terms of
power handled. It can easily and independently control motors
that require no more than 2 A, in both directions. It is a dual,
bidirectional motor driver (dual H-bridge).
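For illustration, here is a minimal Python sketch of the L298
input logic, driven from the Raspberry Pi's GPIO with the
RPi.GPIO library (in the project the motors are driven by the
Arduino; the pin numbers below are placeholders):

import RPi.GPIO as GPIO

IN1, IN2, EN = 17, 27, 22          # placeholder BCM pin numbers
GPIO.setmode(GPIO.BCM)
GPIO.setup([IN1, IN2, EN], GPIO.OUT)
pwm = GPIO.PWM(EN, 1000)           # 1 kHz PWM on the enable pin for speed
pwm.start(0)

def drive(direction, speed_pct):
    """direction: +1 forward, -1 reverse, 0 stop; speed in percent."""
    GPIO.output(IN1, direction > 0)   # IN1 high / IN2 low -> forward
    GPIO.output(IN2, direction < 0)   # IN2 high / IN1 low -> reverse
    pwm.ChangeDutyCycle(speed_pct if direction else 0)

drive(+1, 60)   # forward at 60% duty
drive(-1, 60)   # reverse
drive(0, 0)     # stop (both inputs low, PWM off)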
C. Raspberry Pi
The Raspberry Pi is a small computer running operating systems
compatible with its ARM processor, for example Linux
distributions.
Specifications:
- Quad-core Broadcom BCM2837 64-bit CPU, 1.2 GHz board clock speed
- 1 GB RAM
- 40 GPIO pins
- 4 x USB 2.0 ports
- 4-pole stereo output
- HDMI port and 10/100 Ethernet
- Micro SD card slot
The system on a chip (SoC) used in the first-generation Raspberry
Pi is roughly equivalent to the chips used in older smartphones.
As its manufacturers show, the Raspberry Pi is a microcomputer
that can be plugged into a monitor or TV and used with a standard
keyboard and mouse. It can serve as a regular computer, allows
learning programming languages such as Python or Scratch, and can
perform the same tasks as a normal desktop computer: internet
browsing, high-definition video playback, spreadsheets and word
processing.
Moreover, the Raspberry Pi is perfectly able to interact with the
outside world and is very easy to use in development projects,
from creating music to detecting people, environmental
observation (footage of animal behavior, use of IR cameras),
observation drones, etc.
The Raspberry Pi 2 integrates a newer Broadcom processor, the
BCM2836. It has four cores at 900 MHz, providing six times the
performance of the older BCM2835 previously used. Moreover, it
includes 1 GB of RAM, double the first model, and is compatible
with all accessories for the Model B+.
D. Raspberry Pi – Arduino communication
Communication between the two components is essential in this
project, and the most effective way to transmit data from one to
the other is via a serial link. Let us go into a little more
detail on serial communication: it is bidirectional (meaning the
Arduino can both send and receive messages) and is generally used
for diagnosing programs and for interacting with various
peripherals.
The most common functions in programming Arduino boards are void
setup() – the function for initializing values – and void loop()
– the main program loop. A library is a collection of related
functions.[7]
A very important aspect of libraries is that they contain
pre-written functions. This greatly speeds up the development of
a project and also reduces the knowledge needed for its
implementation.
The Serial library functions used are:
- Serial.begin() – initializes the Serial library; the speed
parameter is given in bits per second
- Serial.read() – reads data from the serial port
- Serial.print() – prints data to the serial port
- Serial.println() – prints data followed by a new line
A simple explanation of serial communication: it is a form of I/O
(input / output) in which the bits of a byte are transferred one
by one, as a synchronized sequence on a single wire. Important to
remember:
- 1 bit is represented by 1 or 0
- 1 byte is a group of 8 bits
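On the Raspberry Pi side, this serial link can be handled with
the pySerial library. A minimal sketch (the device name, baud
rate and command string are assumptions that depend on the actual
setup):

import serial

# Open the Arduino's serial port. On a Pi the device is typically
# /dev/ttyACM0 or /dev/ttyUSB0; the baud rate must match Serial.begin().
ser = serial.Serial('/dev/ttyACM0', 9600, timeout=1)

ser.write(b'arm:open\n')       # hypothetical command parsed by the Arduino sketch
reply = ser.readline()         # one line of the Arduino's reply (Serial.println)
print(reply.decode(errors='replace').strip())
ser.close()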
Fig. 7. Graphical representation of serial connection between Arduino and
Raspberry Pi
IV. THE SYSTEM OF ARTIFICIAL VISION AND IMAGE PROCESSING
This chapter of the project studies how autonomous robots use the
images / video stream perceived from a video camera to identify
objects of interest. The artificial-vision and image-processing
system uses classification algorithms and visual characteristics.
This method was first implemented by Viola and Jones [8] and
later complemented by Lienhart and Maydt [8], who proposed new
algorithms for detecting objects in real time. The advantage of
these new detection methods is that the algorithms used have a
very high detection rate and can be applied to any type of
object. One application where these detection methods can be used
is the detection of faces and facial features. The face-detection
method is based on a classifier named AdaBoost [8] and uses Haar
features [8].
The object-detection algorithm proposed by Viola and Jones [8] is
based on features computed from the input; these can be encoded
so that more information is contained in one place, and the
processing speed is higher than with direct use of the pixels.
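This encoding is the integral image, in which each pixel stores
the sum of all pixels above and to its left, so that any
rectangular feature can be evaluated with four array lookups. A
minimal numpy illustration (our own sketch, not code from the
project):

import numpy as np

img = np.random.randint(0, 256, (240, 320)).astype(np.int64)

# Integral image with a zero row/column pad so lookups need no bounds checks.
ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(y, x, h, w):
    """Sum of pixels in img[y:y+h, x:x+w] from four integral-image lookups."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

# A two-rectangle Haar feature: left half minus right half of a window.
feature = rect_sum(10, 10, 24, 12) - rect_sum(10, 22, 24, 12)
print(feature, rect_sum(10, 10, 24, 24) == img[10:34, 10:34].sum())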

Fig. 8. Haar features used in detecting objects[8]
The features, used for the first time in [9], where they served
to detect human faces, have undergone several modifications. This
is mainly because in the first occurrence of the method only
rectangular features, represented with two to four modules, were
introduced. Modifications to this initial representation appeared
in [10], a work that removed the four-module rectangular model
and defined a new representation in the form of rotated Haar
features. The introduction of this new representation leads to an
increase in system performance (i.e., a reduction in the number
of incorrect detections) of up to 10%.
A. Python
In this project I used Python, which is the best option
considering the use of the Raspberry Pi 2. It allows interaction
with the client in terms of acquiring the input signal,
processing it, and then making decisions based on the results.
B. OpenCV
For the purposes of this project, the OpenCV library was an
essential tool for the video-capturing aspect of the mobile
platform. Due to the library's flexible compatibility and
hardware-acceleration standards, it was easily integrated with
the Python language and works perfectly with the Raspberry Pi 2.
V. OPENCV IN PRACTICE
A. Initial Tests, Results & Analysis
In image processing, object recognition means establishing the
presence of an object belonging to certain predefined classes in
a static frame (image) or in a video sequence (video stream).
Objects can be classified into simple and complex objects. In the
case of simple objects, recognition is equivalent to the
determination of edges (edge detection), regions of a certain
color, textures, contours, etc. In the case of complex objects,
for example human face detection, detection is done using various
learning methods based on specific traits (classifiers). Viola
and Jones proposed a new approach to object recognition using a
cascade of classifiers that describe the object to be recognized.
The algorithm proposed by the two researchers aimed primarily at
human face detection, but classifiers can also describe other
classes of objects to be recognized. The classifier is the basic
unit for object detection; the type of classifier used is called
a Haar-like feature, being similar to a Haar-type function.[11]
In this work, the OpenCV library was used on the Linux operating
system on a Raspberry Pi platform. To initially test the
functionality of this library with the Python language (OpenCV
version 2.4), we implemented a simple detection system for
primitive shapes, in this case a circle of a certain color. To
determine the color, a combination of Hue, Saturation and Value
(brightness) was used (the HSV color space); the result of the
logical AND between the three thresholded components is the
desired color for detection.
Each of these three components was set manually; composed
together, they correspond to a color in the RGB space. For the
detection of the circle shape, a Hough transform was used for the
segmentation of edges, a method that connects edge pixels based
on the shape of the object.
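A minimal sketch of this test (the threshold values and file
names are placeholders; the project's actual values were tuned by
hand):

import cv2
import numpy as np

frame = cv2.imread('sign.jpg')                      # placeholder input image
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

# Threshold the HSV components around "blue"; the ranges are placeholders.
mask = cv2.inRange(hsv, np.array([100, 100, 50]), np.array([130, 255, 255]))
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((5, 5), np.uint8))

# Hough transform on the (blurred) mask to find circular edges.
mask = cv2.GaussianBlur(mask, (9, 9), 2)
circles = cv2.HoughCircles(mask, cv2.HOUGH_GRADIENT, 1, 50,
                           param1=100, param2=30, minRadius=10, maxRadius=200)
if circles is not None:
    for x, y, r in np.uint16(np.around(circles[0])):
        cv2.circle(frame, (x, y), r, (0, 255, 0), 2)   # overlay detected circle
cv2.imwrite('detected.jpg', frame)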
Fig. 9. Raspberry Pi – Python + OpenCV – detecting primitives
(Hough circles)
Figure 9 shows the detection of a circle, in this case the blue
circle of a traffic sign. As can be seen, the "closing" window
shows in white only what the camera perceives as blue. This is
the result of composing hue, saturation and brightness,
represented in the windows "HueComp" for hue, "SatComp" for
saturation and "ValComp" for brightness.
These tests led to the conclusion that the method used is
effective for color-dependent detection of primitive shapes,
managing to identify and classify the shape with high precision.
The circle drawn with the OpenCV library overlaid the traffic
sign's edge with a very low error at close range, and this can be
used to estimate the distance between the object of interest and
the camera.
B. Generating a Haar-type classifier
Reasons for generating a Haar-like classifier:
- Classification of objects regardless of their shape and color.
- Superior processing speed compared to direct use of the pixels.
- High accuracy rate.

The main innovation brought by Viola and Jones was to not analyze
the image itself directly, but only certain rectangular
"features" within it. These features are inspired, by analogy
with the analysis of complex waveforms, by the orthogonal system
of Haar functions, and are known as "Haar-like features" after
the mathematician Alfréd Haar.
First, if a color image is analyzed, it is converted to
grayscale, in which only levels of brightness / luminance appear,
the color information being discarded.
Fig. 10. Haar classifier structure[12]
Finally, Viola and Jones use 38 levels for their cascade of
classifiers [10], the first five of them containing 2, 10, 25, 25
and 50 features respectively, with over 6000 features in total
across all levels. The number of features for each level was
determined empirically, by trial and error, for the first few
levels. The criteria were to keep false positives below a certain
threshold at each level while maintaining a high rate of correct
recognition (a very low rate of false negatives). Features were
added to each level in stages until the proposed performance
criteria were achieved at that level.
Figure 10 describes the structure used to generate a Haar-type
classifier: it requires a set of positive images that contain the
object to be identified, and a set of negative images, whose
content must be totally different from that of the positive
images. For example, to generate a Haar classifier for a traffic
sign, the positive set consisted of pictures of the respective
traffic sign, while the negative images were pictures of what
would typically be in the background of signs, that is, images of
nature and of the city.
Fig. 11. Selecting the region of interest for the positive set in
MATLAB

To generate the classifier I used MATLAB's Computer Vision System
Toolbox. It allows loading the pictures of the positive set and
selecting the region of interest within each of them.
It is desirable to use a larger number of pictures for the
positive set, as this produces a higher detection rate. After the
regions of interest are selected, MATLAB provides a table of the
pictures used and of each respective region of interest, to be
used later in generating the classifier. A large set of negative
image samples must also be introduced, from which the function
will generate negative patterns. To achieve the highest possible
detection accuracy, the number of stages must be as large as
possible: with each stage, the function removes false positives
and the correct recognition rate improves. [12]
Negative samples are not specified explicitly. Instead, the
trainCascadeObjectDetector function automatically generates
negative samples from the negative images provided by the user,
which must not contain objects of interest.
Before each new stage, the function runs the detector comprising
the stages already trained on the negative images. Every object
detected in these images is a false positive, and such samples
are used as negatives for the next stage. In this way, each new
stage of the cascade is trained to correct the mistakes made by
the earlier stages.[12]
Sometimes classifier training may end early. For example, suppose
training stops after seven stages, even though we set the number
of stages to 20; the function probably could not generate enough
negative samples. If we run the function again with the number of
stages set to seven, we do not get the same result, because the
positive and negative samples for each stage are recalculated for
the new number of stages.
Upon completion of the classifier training, MATLAB returns an XML
file containing the trained cascade used by the object-detection
algorithm. It is loaded into the program as a variable of type
cv2.CascadeClassifier, with the path of the generated ".xml" file
passed as a parameter.
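A minimal sketch of this loading and detection step in Python
(the file names are placeholders):

import cv2

# Load the cascade trained in MATLAB (the file name is a placeholder).
stop_cascade = cv2.CascadeClassifier('stop_sign.xml')

img = cv2.imread('road.jpg')                       # placeholder test image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)       # Haar works on grayscale
signs = stop_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=10)
for (x, y, w, h) in signs:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)
cv2.imwrite('road_detected.jpg', img)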
C. Testing the generated classifier
After generating the classifier, it was loaded on a Raspberry Pi
and tested in a traffic sign detection application.
Fig. 12. Selecting region of interest for positive set in MatLab
The number of frames per second in this case, while using only
one classifier to detect a single region of interest, was around
2 FPS.
D. Testing with multiple classifiers
After the test with one classifier, I decided to try and
detect more traffic signs using more Haar cascade classifiers,
each one made independent using a different set of positive
samples, but with the same set of samples as negative.
Fig. 13. Selecting region of interest for positive set in MatLab
In this test, the window size was set to a lower value than in
the first one. Nevertheless, the program managed to detect both
regions of interest at the same number of FPS.
Further tests concluded that the more cascade classifiers are
loaded, the lower the frame rate gets. Another test for this case
used six different classifiers to detect six regions of interest;
it managed to properly detect all six of them, but at a cost of
0.5 frames per second. This is due to the low processing power of
the Raspberry Pi 2 B+ and to the large resolution of the input
image. It can be solved, or at least improved, by using
multi-threading to read the input stream from the camera in a
separate thread/core and keep the camera input at a high FPS. By
using threads to improve the FPS, the I/O latency decreases as
the FPS increases.
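A sketch of this threaded-capture approach using the imutils
library discussed below (the camera index and window name are
assumptions):

from imutils.video import WebcamVideoStream
import cv2

# A dedicated thread continuously grabs frames from camera 0, so the main
# loop never blocks on camera I/O and always gets the most recent frame.
vs = WebcamVideoStream(src=0).start()
try:
    while True:
        frame = vs.read()                 # returns instantly
        # ... detection work on `frame` happens here ...
        cv2.imshow('frame', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
finally:
    vs.stop()                             # stop the capture thread cleanly
    cv2.destroyAllWindows()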
E. Performance benchmark for OpenCV
To conclude this experiment, we have to test and see the actual
results; for this I used the imutils library for Python.[13]
$ python fps_demo.py
[INFO] sampling frames from webcam...
[INFO] elapsed time: 7.31
[INFO] approx. FPS: 13.68
[INFO] sampling THREADED frames from webcam...
[INFO] elapsed time: 3.79
[INFO] approx. FPS: 26.12
As we can see from this experiment, the initial frame rate, which
was around 14 frames per second, increased by 90.93% just by
using threaded I/O for the camera input stream.
In our project we also want to use the Raspberry Pi's screen to
show the output of the image processing, therefore the
cv2.imshow function will be used. This changes the overall
behavior of the program, since cv2.imshow is just another stream
of data, in this case a video stream, a form of I/O; only that
instead of reading a frame from the video stream, it sends one as
an output frame to our display.
Now, to test this, we can run the test with --display 1:

$ python fps_demo.py --display 1
[INFO] sampling frames from webcam...
[INFO] elapsed time: 10.17
[INFO] approx. FPS: 9.83
[INFO] sampling THREADED frames from webcam...
[INFO] elapsed time: 9.65
[INFO] approx. FPS: 10.37
In this case we got only a 5.49% increase in frames per second,
far from the 90.93% increase of the first test. However, this
example also used the cv2.waitKey(1) function, which is necessary
for the frame to be displayed on our screen.
Overall, the cv2.imshow function is recommended for debugging our
program, but if the final product does not require it, there is
no reason to include it, as it will decrease the FPS.
F. Testing ORB: Feature matching
"This algorithm was brought up by Ethan Rublee, Vincent
Rabaud, Kurt Konolige and Gary R. Bradski in their paper
ORB: An efficient alternative to SIFT or SURF in 2011."[14]
ORB (Oriented FAST and Rotated BRIEF) for feature matching in
OpenCV is an algorithm that uses FAST to detect stable keypoints,
selects the strongest features, finds their orientation and
computes the descriptors using BRIEF (the coordinates, in pairs
or k-tuples, are rotated according to the measured
orientation).[14]
Fig. 14. ORB – 50 matches
One advantage of ORB is the efficiency of detection and
description on standard CPUs, which makes it a good choice for
our Raspberry Pi's low processing capability. On average, the ORB
detector matches up to 23.50 features in a run time of 15.33 ms,
which, compared with other feature detectors/matchers, is good
performance and efficiency.[15]
In Figure 14 the STOP sign was loaded as an ORB object, with the
optional parameter nfeatures, which gives the number of features
to detect, set to 50.
For feature detection, ORB approximates the rotation angle in
increments of 2π/30 (12 degrees) and builds a table of
precomputed BRIEF patterns according to the orientation of the
keypoints. For any feature set of n binary tests at locations
(x_i, y_i), a 2×n matrix is defined.
Fig. 15. Brute-Force Matching with ORB Descriptors
For Figure 15 we created a BFMatcher (brute-force descriptor
matcher) object to use for matching features, with crossCheck set
to True for better results. Further, the cv2.drawMatches function
takes the first picture, set as the query image, with its ORB
keypoints and descriptors; the second picture, as the training
image, with its ORB keypoints and descriptors; and the number of
matches, which for this example was set to 50, because that is
the number of features that have to match in the application used
further on. It then draws the found matches of keypoints from one
image to the other.
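A minimal sketch of this matching step (the image file names are
placeholders):

import cv2

query = cv2.imread('stop_sign.jpg', cv2.IMREAD_GRAYSCALE)   # clean STOP sign
train = cv2.imread('scene.jpg', cv2.IMREAD_GRAYSCALE)       # scene to search

orb = cv2.ORB_create(nfeatures=50)               # detect up to 50 features
kp1, des1 = orb.detectAndCompute(query, None)
kp2, des2 = orb.detectAndCompute(train, None)

# Brute-force matcher with Hamming distance (binary BRIEF descriptors);
# crossCheck=True keeps only mutually-best matches.
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)

out = cv2.drawMatches(query, kp1, train, kp2, matches[:50], None, flags=2)
cv2.imwrite('matches.jpg', out)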
G. Using a Haar cascade with ORB feature matching
OpenCV's Haar feature-based cascade classifier works well for
detecting the object of interest in a simple scene, but it can
produce many false positives if we lower the minNeighbors
parameter (which specifies how many neighbors each candidate
rectangle should have in order to be retained), thus making the
object detection less strict.
1) Case 1: Lower minNeighbors parameter: In this case we get many
false positives, since we are less strict regarding the detection
of the region of interest, regardless of how well and how long
the cascade was trained or how many positive images we used for
it.
2) Case 2: Higher minNeighbors parameter: With this approach, the
number of false-positive appearances is lowered, but at the cost
of not detecting the positive region of interest at some points,
for example when the sign is slightly rotated or at certain
distances from the camera. This problem is a false negative: the
object is within range and clearly visible, but the program fails
to detect the region.
3) Case 3: Lower minNeighbors parameter + ORB: The solution was
to leave the minNeighbors parameter low enough to detect the
objects of interest, as well as other false-positive matches, and
then treat each detected region with the ORB feature-matching
method. Each region is inspected to make sure it is actually a
STOP sign.
There are many techniques available in OpenCV to inspect objects
(regions), but in this project I used the methods tested above.
As for the principle of operation, the program first reads the
input stream from the camera. The next step is to load the Haar
cascade into a classifier object (cv2.CascadeClassifier), as well
as to load a picture of a clean STOP sign into an ORB object, to
be used further as a model for matching features against the
regions first detected by the Haar classifier. Subsequently, we
run a detection on the camera's input stream by calling
detectMultiScale on the variable holding the cascade classifier,
with the image parameter (a matrix of type CV_8U containing an
image) set to a frame from the camera's input stream. Note that
the input stream is converted to grayscale, colors being
disregarded, since Haar classifiers work only with grayscale
features (see Fig. 8.). The scaleFactor, set to 1.1, specifies
how much the image size is reduced at each image scale;
experiments concluded that 1.1 is the optimum value, as it
detects the sign at any scale/distance. The last parameter of the
detection call is minNeighbors, set to 10, representing the
number of neighbors each candidate rectangle should have in order
to be retained. Next, the program finds the keypoints and
descriptors for the STOP sign loaded as the feature-matching
comparison image, passing it as an InputArray to the
detectAndCompute method of the ORB object created earlier. After
these preliminary setups, the program starts looping through all
the candidate objects detected by the Haar cascade classifier: it
crops the object from the input frame, finds the keypoints and
descriptors of the candidate region, matches the descriptors,
checks whether the threshold for matched features is met (in this
project we used MATCH_THRESHOLD = 50) and, if it is, draws the
rectangular box at its coordinates.[16]
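Condensed into code, this pipeline looks roughly as follows (a
sketch: the file names, camera index and window name are
placeholders; MATCH_THRESHOLD = 50 as in the project):

import cv2

cascade = cv2.CascadeClassifier('stop_sign.xml')   # trained Haar cascade
orb = cv2.ORB_create()
model = cv2.imread('stop_sign.jpg', cv2.IMREAD_GRAYSCALE)
kp_m, des_m = orb.detectAndCompute(model, None)    # model keypoints/descriptors
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
MATCH_THRESHOLD = 50

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Permissive Haar pass with the parameters described above.
    for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.1,
                                                 minNeighbors=10):
        roi = gray[y:y + h, x:x + w]
        kp_r, des_r = orb.detectAndCompute(roi, None)
        if des_r is None:
            continue
        # ORB verification: keep the region only if enough features match.
        if len(bf.match(des_m, des_r)) >= MATCH_THRESHOLD:
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 0, 255), 2)
    cv2.imshow('detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()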
VI. RESULTS
OpenCV 3.2.0 has many dependencies and it was not easy to install
on an embedded Linux machine like the Raspberry Pi; however,
after building and compiling it, the results were as expected.
The difficulty of recognizing road signs comes from outdoor
lighting conditions, as well as reflections and shadows; from
obstacles such as trees, buildings or vehicles; and from the
blurring caused by the movement of the vehicle the camera is
mounted on. The number of frames per second was improved by using
a threaded program, and the false-positive appearances were
removed, or at least lowered to a very low rate; even if the
traffic sign has an obstacle in front of it, as long as the
classifier and the feature-matching algorithm still manage to
match enough features, it will still be detected. A second
problem is the current draw of the Raspberry Pi, because it needs
a considerable current to function; the webcam also requires some
current, making the Raspberry a large consumer.

As for the heat, it does not go over 50°C, and the processor load
stays around 50%.
VII. CONCLUSIONS
This project has in mind an inexpensive solution for a mobile
platform capable of autonomy by identifying traffic signs with
the lowest possible false-positive rate. The STOP sign used in
this paper is just an example; the approach can cover more
traffic signs, since a Haar cascade can be trained with all the
traffic signs one desires, which can then be further
distinguished with the feature-matching approach.
For this build, the Linux operating system gives numerous
advantages, besides the fact that it resides on a memory card and
can be easily swapped just by changing the card in the Raspberry,
giving the user the ability to use a different operating system
(or, in this case, a Linux with a different configuration). The
Python language was used because it works perfectly with the
Raspberry and is easy to write and debug. Another advantage of
using Python was that the program could be written and tested on
a PC and then moved to the Raspberry, where it would work the
same.
VIII. FUTURE WORK
This project focused on the computer vision of the robot, making
it detect objects of interest with no false-positive appearances
and improving its speed of detection. The whole concept of robots
with computer vision focuses on achieving tasks, so that will be
the next step for this project: making it identify objects, pick
them up with the gripper, and sort them or place them in marked
areas.
REFERENCES
[1] ***, https://ro.wikipedia.org/wiki/Robot
[2] ***, https://ro.wikipedia.org/wiki/Robot_industrial
[3] Alexandru Năstase, "Mecanica roboților – Mecanisme
manipulatoare seriale", Galați, 2012, pp. 6-8
[4] Milan Sonka, Vaclav Hlavac, Roger Boyle, Image Processing,
Analysis, and Machine Vision, 3rd ed., ISBN-10: 0-495-24438-4,
ISBN-13: 978-0-495-24428-7
[5] Mohammed Reyad AbuQassem, Simulation and Interfacing of 5 DOF
Educational Robot Arm, Islamic University of Gaza, June 2010
[6] Țârliman Doina, Cercetări privind sistemele de prehensiune
ale roboților industriali acționate cu ajutorul mușchilor
pneumatici, Brașov, 2014
[7] ***, http://www.roroid.ro/un-pic-mai-multe-despre-comunicarea-
seriala/
[8] Gigel Măceșanu, Tiberiu T. Cociaș, Sisteme de Vedere
Artificială, Editura Universității Transilvania din Brașov, 2016
[9] P. Viola and M. Jones, "Rapid Object Detection Using a
Boosted Cascade of Simple Features," Proc. of the 2001 IEEE
Computer Society Conf. on Computer Vision and Pattern
Recognition, Vol. 1, Kauai, USA, 2001, pp. 511–518
[10] R. Lienhart and J. Maydt, "An Extended Set of Haar-like
Features for Rapid Object Detection," Proc. of the 2002 Inter.
Conf. on Image Processing (ICIP), New York, USA, 2002, pp.
900–903
[11] Alexandru Mihail Itu, Grasping Strategies Contributions in
Real and Virtual Environments Using a Three Fingered
Anthropomorphic Gripper, Brașov, 2010
[12] MathWorks, http://www.mathworks.com/help/vision/ug/train-a-cascade-
object-detector.html
[13] ***, http://www.pyimagesearch.com/2015/12/21/increasing-webcam-fps-
with-python-and-opencv/
[14] Ethan Rublee, Vincent Rabaud, Kurt Konolige, Gary R.
Bradski, "ORB: An efficient alternative to SIFT or SURF," ICCV
2011, pp. 2564–2571
[15] Frazer K. Noble, Comparison of OpenCV's Feature Detectors
and Feature Matchers
[16] Ross D. Milligan, Road sign detection using OpenCV ORB,
rdmilligan.wordpress.com
