Faculty of Automation and Computers
Master's degree program: Automotive Embedded Software

SMART TRAFFIC SIGN DETECTION ON AN AUTONOMOUS CAR

Dissertation thesis
Ș.l. dr. ing. Răzvan BOGDAN
Ș.l. dr. ing. [anonimizat],
2018
The car has become a symbol of modern society. It is no longer just a means of transport, but also an extension of the driver's image.
The new millennium has started a race, one that might catch all the drivers on this planet unprepared: driving will no longer belong to the human, but to an electronic "brain".
Specifically, autonomous driving will bring a revolutionary change in how transport takes place, because cars will drive themselves from A to B, the driver's role being simply to set the destination.
So the first argument in favor of autonomous cars is much simpler than many want to accept: personal comfort. Unlike a human driver, a computer does not get tired and performs all its calculations at an extremely high speed.
The main advantage of an autonomous car is that, while it drives itself, the time spent in traffic can be used for relaxation (reading a book, watching a movie, etc.) or for activities from the "time is money" category (preparing a presentation, videoconferencing or other things specific to busy business people).
Self-driving cars also promise greater safety. Many accidents are caused by human error, such as distraction, fatigue, or failing to adapt to road conditions; humans are prone to mistakes to a far greater degree than machines, at the expense, especially, of the victims.
An autonomous car takes over acceleration, braking and steering. It relies on a set of sensors and cameras, the information being processed by dedicated software.
An intelligent car which detects signs
Safety systems now operate both before and after a crash. Thus, they rely on information not only from the car itself, but also from other cars in traffic.
The subject of this dissertation thesis is to build an autonomous car which can detect traffic signs using OpenCV. The development platform is a Raspberry Pi 3 Model B running a Debian-based Linux distribution. Its main specifications are:
Quad Core CPU
1GB RAM
1.2 GHz board clock speed, Broadcom BCM2837 64-bit CPU
40 GPIO Pins
4 x USB 2 Ports
4 Pole Stereo Output
HDMI Port
10/100 Ethernet
Micro SD Card Slot
BCM43143 Wi-Fi and Bluetooth Low Energy (BLE) on the board
The system also uses a web camera, an ultrasonic distance sensor, a dual H-bridge driver to control the car, and a smartphone with an Android application used as a real-time display.
The autonomous car
Previous work
One of the related scientific papers presents a study on recognizing traffic sign patterns using a Neural Network technique. The images were pre-processed with several image processing techniques, such as Gaussian filtering, Canny edge detection, contour extraction and ellipse fitting. The neural network stages were then performed to recognize the traffic sign patterns. The system was trained to produce highly accurate classifications of sign patterns while reducing the computational cost of the proposed method.
Another diploma thesis was about designing an algorithm for the recognition of speed limit traffic signs and integrating it into an application. The difficulty of this task was to achieve a high recognition rate, aggravated by the variety of possible image properties. As in the previous paper, training was needed; an attempt was made to increase the robustness of the classifier by extending the training data with randomly perturbed versions of existing samples.
The module implemented in that thesis used Histograms of Oriented Gradients and a linear classifier trained by Linear Discriminant Analysis. The module was able to handle a wide range of different image segment sizes, but had difficulties with night images. The performance was also reduced for blurred signs and for signs that were hard to locate in the image.
However, not all of the papers used the same classification techniques. One of them aimed to automatically detect and recognize traffic signs from captured images as part of an Advanced Driver Assistance System (ADAS). It was implemented on a Raspberry Pi running Raspbian Stretch, using Python together with the OpenCV library to implement the image processing algorithms related to traffic sign recognition.
Furthermore, a German thesis introduced a real-world benchmark data set for traffic sign detection with carefully chosen evaluation metrics, baseline results, and a web interface for comparing approaches. Their idea was to separate sign detection from classification, but to measure performance on relevant categories of signs in order to allow benchmarking of specialized solutions. Popular detection approaches were used, such as the Viola-Jones detector based on Haar features and a linear classifier relying on HOG descriptors, as well as an algorithm exploiting shape and color in a model-based Hough voting scheme.
To sum up, the papers above tried to create algorithms based on different classifiers to detect traffic signs in difficult circumstances such as bad weather or blurred images. This topic is widely researched, and because solutions are needed to make autonomous cars ever safer, these algorithms will always need improvement.
Theoretical notions
Android
Android is an open source operating system running on most of the smart phones currently on the market. At the same time, it can also be considered a development platform, giving anyone the possibility to develop new applications.
Overview of the Android architecture
The Android operating system is based on a Linux kernel. It is optimized for minimal energy consumption, especially when running on mobile phones where the power supply is a battery.
Above this level is the Hardware Abstraction Layer (HAL), a layer that abstracts the hardware for easier access from software and for better portability from one platform to the next.
Next follows the Native Libraries level, which contains a number of protocols and facilities used by the entire system.
The Android Runtime is the level at which applications run. For this, a dedicated virtual machine, Dalvik or ART, is available, depending on the version of Android used.
The Android Framework level contains the necessary framework for the services offered to develop and run applications.
The last level is represented by the applications themselves.
Application development can be done using the Java programming language for application sources, and the XML format is used for the graphical interface.
Integrated development environments such as Eclipse / Android Studio and others can be used.
A useful aspect of developing applications for Android is that they can run on other phones with the same Android version as the one the application was developed for; porting to a later version is often not necessary, since compatibility is quite high.
Practically due to this architecture, an application does not depend on the hardware implementation specific to a particular platform type, but only on the existing services.
Thus, each Android-powered platform (phone / tablet) has its own drivers with its own features, and the software developer is not affected by how they are interconnected (memory / ports / pins) but just by their existence.
By abstraction, the software's independence from hardware is reached.
C
C is a programming language that supports structured programming; it appeared in 1972 and is still one of the most used programming languages today.
Over time, several compilers have appeared for this language, for both computer and embedded software development.
This programming language is closely integrated with UNIX operating systems, having originally been used to rewrite that operating system.
Jessie
The operating system running on the Raspberry Pi 3 is Raspbian Jessie, a Debian-based Linux distribution. It offers a graphical interface and programming possibilities in various languages: C, C++, Java, Python and others.
Jessie Graphics Interface and Available Programming Languages
The structure of the directories in a Linux-based distribution has the form /SystemDirectory, where:
– / is the root directory specific to this type of system
– SystemDirectory can be dev (system devices), etc (system configuration), home (users' base directories) and more.
So any absolute path starts with / and continues with the path to the desired directory or file.
Programming was done using the C programming environment called Geany Programmer's Editor.
Python
Python is a high-level, object-oriented programming language that emphasizes code readability. Its purpose is to be a dynamic language, accessible to software developers, using fewer lines of code than other well-known programming languages.
The inclusion of all these structures, together with the functions that allow their manipulation and processing, as well as many other function libraries, is due to the "batteries included" philosophy: Guido van Rossum and the community around the language consider that a programming language is not practical if it does not come with a set of libraries that are important to most developers.
For this reason, Python includes libraries for files, archives and XML, and a set of libraries for working with the network and the main Internet protocols (HTTP, Telnet, FTP). A large number of web platforms are built with Python, and its suitability as a language for CGI programming is beyond doubt; for example, YouTube, one of the world's highest-traffic sites, is built on Python.
OpenCV
OpenCV (Open Source Computer Vision) is a library of programming functions mainly aimed at real-time computer vision. Originally developed by Intel, the library is cross-platform and free for use under the open-source BSD license.
The OpenCV project was initially an Intel Research initiative to advance CPU-intensive applications, part of a series of projects including real-time ray tracing and 3D display walls. Its goals were to:
– Advance vision research by providing not only open but also optimized code for basic vision infrastructure (no more reinventing the wheel).
– Disseminate vision knowledge by providing a common infrastructure that developers could build on, so that code would be more readily readable and transferable.
– Advance vision-based commercial applications by making portable, performance-optimized code available for free, with a license that does not require the resulting code to be open or free itself.
OpenCV is written in C++ and its primary interface is C++, but it still retains an older, less comprehensive though extensive C interface. Bindings exist for Python, Java and MATLAB/Octave.
PWM
Pulse Width Modulation (PWM) consists in encoding information in the width of the generated pulse. The duty cycle of a PWM signal is calculated with the relation D = Ti / T, where Ti is the duration of the pulse and T is the period of the signal.
Thus, every percentage of such a signal carries a meaningful value for the application, as opposed to a TTL signal that can only have two states (high, low).
PWM signals are control signals of some power transistors used in switching converters. A PWM modulator has the function of commanding a switch and is an important and complex part of a switching voltage regulator.
The principle of making such a PWM modulator consists of making an electronic scheme that contains:
– sawtooth generator
– error amplifier
– comparator
The voltage at the output of the error amplifier is compared to the value of the sawtooth signal. If the amplifier output is greater than the sawtooth value, the comparator output is logic '1', meaning Ton; if it is less, the comparator output is logic '0', meaning Toff.
If the output voltage tends to increase, the feedback voltage rises above the reference voltage, so the output voltage of the error amplifier decreases, resulting in a shorter time for which the comparator output stays at logic '1'. If the output voltage drops, the comparator output stays at logic '1' for longer. This change of pulse width according to the output voltage is what the duty cycle expresses.
Duty cycle example
If the output voltage is constant, it is maintained at the desired value by the negative feedback.
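The relation D = Ti / T above can be captured in a small helper for reasoning about PWM signals; this is an illustrative sketch (the function name is not from the thesis code):

```python
def duty_cycle(t_on, period):
    """Duty cycle D = Ti / T, returned as a fraction between 0 and 1."""
    if period <= 0 or not (0 <= t_on <= period):
        raise ValueError("need 0 <= t_on <= period and period > 0")
    return t_on / period

# A pulse that is high for 1 ms out of a 4 ms period has a 25% duty cycle
print(duty_cycle(0.001, 0.004))  # 0.25
```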
Hardware components description
Hardware components
The presented system comprises the hardware components listed in Table 1.
Table 1. Hardware components
The following subchapter presents a short specification of each hardware component.
Raspberry Pi 3
Raspberry Pi 3 Model B is the third generation of the Raspberry Pi. This little computer can be used for many applications and outperforms the Raspberry Pi Model B+ and the Raspberry Pi 2 Model B.
Raspberry Pi 3
Technical specifications:
Processor: Broadcom BCM2837 chip, Quad Core ARM Cortex-A53 64-bit processor at 1.2 GHz, 802.11 b/g/n wireless LAN and Bluetooth 4.1 (Bluetooth Classic and LE)
GPU: Dual Core VideoCore IV Multimedia Co-Processor providing OpenGL ES 2.0, hardware-accelerated OpenVG and 1080p30 H.264 high-profile decoding; capable of 1 Gpixel/s, 1.5 Gtexel/s or 24 GFLOPs, with texture filtering and DMA infrastructure
Memory: 1GB LPDDR2
Operating system: boots from the MicroSD card, running a Linux or Windows 10 IoT operating system
Dimensions: 85 x 56 x 17 mm
Power: Micro USB socket, 5 V, maximum 2.5 A
Connectors:
– Ethernet: 10/100 BaseT Ethernet socket
– Video output: HDMI (rev. 1.3 and 1.4), composite RCA (PAL and NTSC)
– Audio output: 3.5 mm jack, HDMI; USB: 4 x USB 2.0
– GPIO connector: 40-pin 2.54 mm (100 mil) expansion header, 2×20 strip, providing 27 GPIO pins as well as 3.3 V, 5 V and GND power lines
– Camera connector: 15-pin MIPI Camera Serial Interface (CSI-2)
– Display connector: Display Serial Interface (DSI), 15-way flat flex cable with 2 data lanes and one clock lane
– Memory card slot: push/pull Micro SDIO
Raspberry Pi connections
The following figure explains the special functions and the mapping of the extended 40-pin header.
GPIO pins functions and their mapping
Wiring Pi
Web camera
The web camera used is Logitech C170. This has the following technical specifications:
Video call (640 x 480 pixels)
Video capture: up to 1024 x 768 pixels
Built-in microphone with noise reduction
Hi-Speed USB 2.0 certified
Logitech C170
Ultrasonic distance sensor
This sensor is capable of measuring distances between 2 cm and 400 cm with a precision that can reach 3 mm. Each HC-SR04 module includes an ultrasonic transmitter, a receiver and a control circuit.
For operation the sensor needs 4 pins: VCC (power), Trig (trigger), Echo (receiver) and GND. This component has the following characteristics:
Operating voltage: DC 5V
Operating current: 15mA
Measuring angle: 15°
Distance range: 2 cm – 4 m
Ultrasonic distance sensor HC-SR04
The timing diagram is shown below. To start ranging, supply a short 10 µs pulse on the trigger input; the module then sends out an 8-cycle burst of ultrasound at 40 kHz and raises its Echo line. The width of the Echo pulse is proportional to the distance to the object, so the range can be calculated from the time interval between sending the trigger signal and receiving the echo signal. Formula: distance [cm] = pulse width [µs] / 58, or distance [inch] = pulse width [µs] / 148; equivalently, range = high-level time × speed of sound (340 m/s) / 2.
Timing Diagram
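The conversion formulas above (µs / 58 for centimeters, or high-level time × 340 m/s / 2) can be expressed as small helpers. A sketch with illustrative names:

```python
SPEED_OF_SOUND_CM_S = 34000  # 340 m/s expressed in cm/s

def echo_us_to_cm(pulse_width_us):
    """Distance from the Echo pulse width, using the datasheet formula us / 58."""
    return pulse_width_us / 58.0

def echo_time_to_cm(pulse_width_s):
    """Distance from the round-trip time: the sound travels there and back."""
    return pulse_width_s * SPEED_OF_SOUND_CM_S / 2.0

# A 580 us echo pulse corresponds to roughly 10 cm
print(echo_us_to_cm(580))         # 10.0
print(echo_time_to_cm(0.000580))  # ~9.86, the two formulas agree closely
```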
Dual H-bridge
L298N is a high-voltage monolithic integrated circuit, a dual H-bridge designed to accept standard TTL logic levels for control.
It can be used to control relays, solenoids, DC motors or stepper motors. Each bridge can be activated independently via the ENA and ENB pins.
The module also contains a 5 V voltage regulator circuit that allows operation at high supply voltages.
Specifications:
Operating voltage: up to 40V
Operating current: up to 3A (25W total)
Low saturation
Overload protection
Can operate with 2 motors simultaneously
High noise immunity: logic level "0" for input voltages up to 1.5 V
Integrated 78M05 voltage regulator; to avoid damaging this integrated circuit, use an external 5 V logic supply when the supply voltage exceeds 12 V
Dual H-bridge
Smartphone with Android
A smartphone with Android is a type of phone which runs the open source Android operating system and is capable of executing several operations that improve the user's life.
Most smartphones have more or less the following features:
A high-performance, low-power processor based on the ARM architecture
RAM for applications
Wi-Fi connection
Bluetooth connection
Accelerometer
GPS
Camera / video
High speed internet access using the mobile internet provided by the SIM card
Touch screen
Such phones have been favored for their ease of use, affordability, connectivity, and an operating system that enables applications to be developed by anyone who wants to do so.
Smartphone
Bluetooth
Structure and operation:
Bluetooth is a standard that covers wireless transfers, low power and short distances;
IEEE 802.15 is a standard that covers wireless transfers to PANs (Personal Area Network);
PAN: a local network where all stations are controlled by one person or one family;
IEEE 802.15 covers Bluetooth and IEEE 802.15.3 and IEEE 802.15.4 standards;
Bluetooth works in the 2.4 GHz band;
Bluetooth is designed for multiple users; a piconet is a small network with a maximum of 8 users;
A maximum of 10 piconets may exist in a Bluetooth cell;
Each connection is encrypted for protection against unauthorized access or interference;
Types of applications:
– Data and voice access points: Provides wireless data and voice transfer between a mobile station and a fixed one;
– Cable replacement: allows computers to be connected to printers, keyboards and mice, remote control of home appliances, connecting headphones to phones, etc.; the range is a maximum of 10 m but can be extended up to 100 m;
– Ad-hoc networks: any 2 Bluetooth stations can establish a connection if each is within the other's range;
Wi-Fi
Wi-Fi is the commercial name for technologies built on the IEEE 802.11 family of communication standards used to establish wireless local area networks (WLANs) at speeds equivalent to those of Ethernet wired networks. Wi-Fi support is provided by various hardware devices, and almost all modern operating systems for personal computers (PCs), routers, mobile phones, game consoles, and the most advanced TVs.
The IEEE 802.11 standard describes communication protocols at the host-to-network level of the TCP/IP model, corresponding to the physical and data link layers of the OSI model. This means that IEEE 802.11 implementations must receive packets from the network protocols (IP) and handle their transmission, avoiding collisions with other stations that want to transmit.
Access to the network can also be controlled by some simple techniques that are limited but sufficient to deter occasional intrusions.
One such technique is to configure the access point so that it does not publicly broadcast its SSID. The SSID (Service Set IDentifier) is a name that an access point sends periodically to make its presence known to stations that want to join the network. Stopping the transmission of this signal hides the presence of the network from a superficial attacker, while still allowing stations that know the SSID of the access point to connect. This solution cannot protect the system from more determined intruders, since intercepting the frames transmitted between the access point and the connected stations can provide the information needed to access the network.
Another equally simple but equally inefficient technique is MAC address filtering. As in Ethernet, network access devices are uniquely identified by a physical (MAC) address. An access point can be configured to allow network access only to stations whose MAC address is on a finite list. By the same technique of listening to legitimate traffic on the network, however, an intruder can find the MAC address of a legitimate station, spoof that address and gain access by claiming to be that station.
At the physical level, the transmission technologies use the 2.4 GHz radio band, a license-free band. Due to the free use of this band, it is also used by other technologies such as Bluetooth or cordless phones, which may sometimes cause interference, although the transmission power of all these devices is generally low. The first technique is FHSS (Frequency Hopping Spread Spectrum), which, in order to allocate frequencies efficiently in the 2.4 GHz band, implies a periodic change of the transmission frequency, driven by pseudo-random numbers generated by the communicating stations. The other radio technology is DSSS (Direct Sequence Spread Spectrum). Both offer transfer rates of up to 1 or 2 Mbps.
Architecture
Hardware architecture
The scheme in Fig. 14 shows exactly how the ultrasonic sensor is connected to the Raspberry Pi.
Sensor connected to Raspberry Pi
The following picture shows the connection of the dual H-bridge to the Raspberry Pi, as well as how they are powered.
Dual H-bridge connected to Raspberry Pi
All the components were connected to Raspberry Pi and powered by a battery pack of 7.5V and an external battery.
Components connected to Raspberry Pi
The picture above shows the actions of the components and their connections to the Raspberry Pi.
Actions of the components
Software architecture
The UML diagram shows all the actions of the software application. The smartphone runs the Android application, which mediates all the actions needed to make the car autonomous.
UML diagram for the software application
Fig. 19 shows the classes used in the software architecture, as well as the variable names of the application.
Classes used in the application
The last picture shows the actions of the car's sensor and at what distance it starts to detect obstacles.
Actions of the car sensor
Implementation
Hardware implementation
The first phase of implementation was to install OpenCV on the Raspberry Pi and then to connect the web camera.
Connection of the web camera
The schemes were made in the Fritzing development environment, which helped to connect all the hardware components.
The second phase of implementation was to connect the ultrasonic distance sensor HC-SR04 and to test at what distance it detects obstacles.
Connection of the sensor
The pins of the sensor were connected to the pins of the Raspberry Pi as follows:
Trigger pin to pin 0 from Wiring Pi
Echo pin to pin 2 from Wiring Pi
VCC pin to physical pin 2 (5 V)
GND pin to physical pin 6 (GND)
Then, a motorized car chassis was connected through the dual H-bridge, which controls the motor and the direction of the car. A battery was used to power all the hardware components.
Connection of the dual H-bridge
The pins of the dual H-bridge were connected to the pins of the Raspberry Pi as follows:
Reverse pin to pin 26 from Wiring Pi
Forward pin to pin 23 from Wiring Pi
Direction left pin to pin 25 from Wiring Pi
Direction right pin to pin 27 from Wiring Pi
All the components connected
Fig. 19 shows all the components connected in the car. The Raspberry Pi is powered by an external 5 V battery and the car is powered by a 7.5 V battery pack.
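Using the Wiring Pi pin assignment listed above, the mapping from a motion command to H-bridge pin states can be sketched as pure logic. The dictionary and function names here are illustrative, not taken from the thesis code:

```python
# Wiring Pi pin numbers from the wiring description above (illustrative pin map)
PINS = {"reverse": 26, "forward": 23, "left": 25, "right": 27}

def drive_command(direction=None, turn=None):
    """Return the desired state (0/1) of each H-bridge control pin."""
    states = {pin: 0 for pin in PINS.values()}
    if direction in ("forward", "reverse"):
        # only one of forward/reverse is ever driven high at a time
        states[PINS[direction]] = 1
    if turn in ("left", "right"):
        states[PINS[turn]] = 1
    return states

print(drive_command("forward", "left"))
```

On the real hardware these states would be written to the GPIO pins; here the function only computes them, which also makes the command logic easy to test off-target.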
Software implementation
The application installed on the smartphone has the following main functionality:
Obtaining the IP address of the web server from Raspberry Pi by Bluetooth.
Using the obtained IP address for viewing the image from the web camera through Wi-Fi.
Controlling the speed of the motor using Bluetooth.
Built for the Android platform, the smartphone application exchanges information with the Raspberry Pi in real time through wireless technology, namely the Bluetooth and Wi-Fi communication mediums.
Bluetooth makes it possible to control the speed of the car by setting the driving motor speed from 0 to 255, these values representing the Pulse Width Modulation (PWM) duty cycle.
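The 0–255 value received over Bluetooth maps linearly onto the PWM duty cycle; an illustrative sketch of the conversion (the helper name is not from the application code):

```python
def pwm_to_duty_percent(value):
    """Map the application's 0-255 motor speed value to a duty cycle percentage."""
    if not (0 <= value <= 255):
        raise ValueError("PWM value must be between 0 and 255")
    return value * 100.0 / 255.0

print(pwm_to_duty_percent(128))  # ~50.2%, the mid-scale setting shown in the GUI example
print(pwm_to_duty_percent(255))  # 100.0
```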
Wi-Fi enables real-time image sharing from the Raspberry Pi. It was chosen over Bluetooth because of its larger bandwidth, which transfers the images faster. The initial implementation used Bluetooth for transferring the images, but this was unsatisfactory because of lost synchronization and long delays.
The smartphone application graphical user interface and functionality is presented in the following images.
Smartphone application requiring Bluetooth to be enabled at startup
Smartphone application device select window
Smartphone application connecting to last used device
Smartphone application PWM set to 128
Smartphone application graphical user interface
Furthermore, the graphical user interface provides the following functionalities:
Reconnect last device button.
Area for displaying the processed image
Selected speed for motor
Scroll bar for selecting motor speed (between 0-255)
Button used for changing the device used for Bluetooth
Bluetooth connection status:
NOT CONNECTED!
CONNECTING…
CONNECTED!
Once connected to the Raspberry Pi, the smartphone application receives the IP address of the server and then displays the real-time video feed in the "area for displaying the processed image" (item 2 above), all this using the Wi-Fi communication medium.
Forbidden sign detection and transmission to the smartphone application
Image acquisition is done using the web camera, which is connected to the Raspberry Pi through a USB 2.0 connector, so the data is acquired over serial communication. The received image is in BGR (Blue, Green, Red) format, specific to the bitmap format.
In order to get the desired result, the following steps are performed on the image:
Preprocessing
Processing
Postprocessing
Preprocessing is the first step performed after image acquisition and it is important, since it can improve the image contrast and clarity. The image contrast can be increased by one of the following methods:
Histogram equalization
CLAHE (Contrast-limited adaptive histogram equalization)
These two methods give better results when the image is processed in grayscale, since the subsequent processing is also done in grayscale.
The following images show a visual comparison between the two preprocessing image methods.
Original image
Grayscale image
Histogram equalization
CLAHE
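In the application these steps rely on OpenCV's cv2.equalizeHist and cv2.createCLAHE. As a minimal NumPy-only sketch of what plain histogram equalization does to an 8-bit grayscale image (CLAHE additionally clips the histogram and works on local tiles):

```python
import numpy as np

def equalize_hist(gray):
    """Plain histogram equalization for an 8-bit grayscale image (NumPy sketch)."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # first non-zero CDF value
    denom = gray.size - cdf_min
    if denom == 0:                     # constant image: nothing to stretch
        return gray.copy()
    # Remap gray levels so the output histogram is approximately uniform
    lut = np.round(np.clip(cdf - cdf_min, 0, None) * 255.0 / denom).astype(np.uint8)
    return lut[gray]

# A low-contrast two-level image gets stretched to the full 0-255 range
img = np.array([[100, 100], [140, 140]], dtype=np.uint8)
print(equalize_hist(img))  # [[0 0] [255 255]]
```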
Processing is done in real time using the OpenCV module and consists in traffic sign detection. This step is further explained in the "OpenCV traffic sign detection" paragraph.
Postprocessing is also done using OpenCV and consists in marking the detected traffic signs on the processed image. The marking is done using rectangles with different border colors, as shown in Table 2, "Detected traffic signs".
Detected signs
The OpenCV module uses a grayscale preprocessed image to detect traffic signs. Each frame is processed in real time, and the result consists of detected blobs with characteristics similar to the ones used in the training process. As seen in the image above ("Detected signs"), the detected blobs are then highlighted. The following image shows the blob used for training:
Blob image example
The image is processed by the OpenCV module using a set of trained classifiers for different traffic signs. The detected traffic signs are then highlighted on the initial image using rectangles of different colors. These signs and their highlighting colors are shown in the table below.
Table 2. Detected traffic signs
The training of the detection algorithm used the following steps:
A set of at least 15 images containing only the sign, called blobs, was used initially.
From this set of blobs, 128 samples were created for each initial image by combining it with a negative image.
Blob image example
In this way, a minimum of 15 × 128 = 1920 samples was obtained.
Sample image example
Afterwards, the training began using all the samples and 1000 negative images.
Negative image example
The training for each sign took from 5 hours to 5 days, depending on the computation capabilities of the computer used.
20 stages were applied for each training, and for some of the signs the training stopped earlier because the target false positive rate was reached.
The output of the training is an XML file representing the classifier for the respective sign, used afterwards for detection at runtime.
The system processes the image from the webcam and then performs segmentation algorithms on the frames. The same segmentation algorithms are used for both training and detection.
For the traffic sign detection stage, Haar cascades based on the Haar features of an object are used. This is a machine learning approach in which a cascade function is trained from a set of positive and negative images, as explained in the steps above.
The training algorithm used the following scripts:
create_samples.sh – used for creating samples
128 samples were created for each blob, resulting in 1920 samples in total
Width and height were set to 48
For each blob, the obtained samples were saved in the Samples folder, and the position of the blob within each sample was stored in samples$count.txt (where $count is a numbering variable from 0 to the number of initial blobs, usually 15)
At the end of sample creation, the samples$count.txt files were merged into a single text file, positives.txt, each line containing the position of the blob in the generated sample image
A vector file was then created from the positive samples, containing all the blobs
numSamples=128
w=48
h=48
count=0
for i in Positives/*.jpg
do
echo $i
echo $count
./opencv_createsamples -img $i \
-bg Negatives/negatives.txt \
-info Samples/samples$count.txt \
-num $numSamples -maxxangle 0.0 -maxyangle 0.0 \
-maxzangle 0.3 -bgcolor 255 -bgthresh 8 \
-w $w -h $h
count=$((count+1))
sleep 1
done
count=$((count-1))
cat Samples/samples*.txt > Samples/positives.txt
echo $((numSamples*count))
./opencv_createsamples \
-info Samples/positives.txt \
-bg Negatives/negatives.txt \
-vec samples.vec \
-num $((numSamples*count)) -w $w -h $h
train_cascade.sh – used for training the cascade classifier based on Haar features
The number of stages was usually chosen to be 20
Width and height were set to 48
The number of positives was 1000, while the number of negatives was 600
The vector file generated above was used as input
A minimum hit rate of 0.995 was set in order to reduce false positives
An XML file resulted at the end of the training, in the Output folder
numStages=20
if [ $# -eq 1 ]; then
numStages=$1
fi
OUTPUT_DIRECTORY=../Output
w=48
h=48
cd Negatives
../opencv_traincascade -data $OUTPUT_DIRECTORY \
-vec ../samples.vec \
-bg negatives.txt \
-numPos 1000 -numNeg 600 -numStages $numStages \
-precalcValBufSize 1024 -precalcIdxBufSize 1024 \
-featureType HAAR \
-minHitRate 0.995 -maxFalseAlarmRate 0.5 \
-w $w -h $h -numThreads 1000
cd ..
The following folder structure is mandatory for the scripts to run successfully:
Folder structure used for training phase
Where:
Negatives – contains the negative images, i.e. images that do not contain the traffic sign used for training
Output – output folder of the training algorithm
Positives – contains the traffic sign blobs
Samples – contains all the generated samples and their samples$count.txt generated files after running create_samples.sh
opencv_* – all the necessary OpenCV files used by the training algorithm
Although the scripts are written for a Linux environment, they were also run on a Windows machine using the Cygwin environment/console, which is why the folder structure is shown on a Windows machine. Also, even though the Raspberry Pi is a powerful pocket PC with a Linux-based environment, training was not done on it due to its limited RAM, since the training algorithm is greedy with regard to memory resources, especially at runtime.
For the processing phase, the class TrafficSignDetect(object) was defined. This class provides the main functionality of the system, including the web server and the image preprocessing (histogram equalization or CLAHE), processing (traffic sign detection) and postprocessing (highlighting detected signs).
The following figure gives an overview of the methods this class implements:
TrafficSignDetect class
The parameters have the following significance:
count1 and count2 – used for rendering the image on the web server
USE_CLAHE and USE_HISTOGRAM – enable/disable the contrast enhancement applied during the preprocessing of the image
cascades[], scale[], neighbors[], minSizeX[], minSizeY[], flags[], red[], green[], blue[] – arrays containing the detection parameters for each sign
camera – video camera object
The methods have the following description:
Init()
Initializes all the parameters:
#Initialize program
self.count1 = 0
self.count2 = 0
self.USE_CLAHE = 0
self.USE_HISTOGRAM = 0
self.cascades = []
self.scale = []
self.neighbors = []
self.minSizeX = []
self.minSizeY = []
self.flags = []
self.red = []
self.green = []
self.blue = []
Adds the traffic signs which can be detected:
#Add forbidden
self.cascades.append(cv2.CascadeClassifier("Cascades/cascade_forbidden.xml"))
self.scale.append(1.4) #1.1 -> 1.5
self.neighbors.append(5) #3 -> 6
self.minSizeX.append(50)
self.minSizeY.append(50)
self.flags.append(cv2.CASCADE_SCALE_IMAGE) #cv2.CASCADE_SCALE_IMAGE
self.red.append(0xFF)
self.green.append(0x45)
self.blue.append(0x00)
Initializes camera:
#Initialize camera
self.camera = VideoStream(src=0).start()
time.sleep(0.05)
self.camera.resolution = (60, 30)
self.camera.framerate = 25
Prepares CLAHE:
#Create CLAHE (Contrast Limited Adaptive Histogram Equalization)
self.clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
ProcessImage()
Converts image to grayscale
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
Applies histogram equalization or CLAHE, if enabled:
if(self.USE_CLAHE == 1):
    cl1 = self.clahe.apply(gray)
    gray = cl1
if(self.USE_HISTOGRAM == 1):
    hist = cv2.equalizeHist(gray)
    gray = hist
Detects traffic signs:
for i in range(0, len(self.cascades)):
    #search occurrences
    detections = self.cascades[i].detectMultiScale(
        gray,
        self.scale[i],
        self.neighbors[i],
        self.flags[i],
        (self.minSizeX[i], self.minSizeY[i])
    )
Highlights detected signs:
#draw rectangles
for (x, y, w, h) in detections:
    cv2.rectangle(frame, (x, y), (x+w, y+h), (self.blue[i], self.green[i], self.red[i]), 3)
GetFrame()
Fetches frame from web camera and reduces its dimensions for faster processing
image = self.camera.read()
self.image = cv2.resize(image, (0,0), fx=0.25, fy=0.25)
Calls ProcessImage()
Saves the frame into JPEG file in order to transfer it to the web server as a binary file
cv2.imwrite(str(self.count1) + ".jpg", image)
self.count1 = self.count1 + 1
self.count1 = self.count1 % self.camera.framerate
time.sleep(0.05)
if(self.count1 >= 1):
    self.count2 = self.count1 - 1
else:
    self.count2 = self.camera.framerate - 1 - self.count1
return open(str(self.count2) + '.jpg', 'rb').read()
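The count1/count2 bookkeeping implements a small ring buffer of JPEG files: the frame is written at index count1, the index is advanced with wrap-around at the camera framerate, and count2 recovers the index of the frame just written so it can be served. The indexing logic can be isolated in a small sketch (a hypothetical helper mirroring the GetFrame() logic above, not part of the class):

```python
def advance(count1, framerate):
    """After a frame was saved at index `count1`, return the next write
    index and the index of the frame to serve (the one just written),
    both wrapping around at `framerate`."""
    count1 = (count1 + 1) % framerate
    count2 = count1 - 1 if count1 >= 1 else framerate - 1
    return count1, count2

# With a framerate of 25, the served index is always the last one written
print(advance(0, 25))   # -> (1, 0)
print(advance(24, 25))  # -> (0, 24)
```

Note that in the original else branch count1 is always 0, so subtracting count1 there has no effect on the result.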
DeInit()
Stops camera and OpenCV processing
self.camera.stop()
cv2.destroyAllWindows()
Results and discussions
The following parameters were seen to influence the detection performance:
The number of training stages for Haar cascades.
The preprocessing of the image which was done by increasing the contrast through histogram equalization or CLAHE.
The parameters used by the detector function (detectMultiScale) which are:
Scale
Neighbors
Minimum size of the detection
An increased number of training stages for the Haar cascades provides better performance at runtime. During training it was observed that 20 stages is optimal for most of the signs.
Since the blob detection is done on a grayscale copy of the original image, both histogram equalization and CLAHE generate a clearer image as input for the detector. Because either method increases the contrast of the input image, each of them provides added value; the optimal one is CLAHE, which is an adaptive method.
The following detection parameters values were found to be optimal for most of the signs:
Scale: 1.1 – 1.4. A value lower than this interval would create more false positives and a value higher than this interval would make the detection harder.
Neighbors: 3 – 6. A value lower than this interval would create more false positives and a value higher than this interval would make the detection harder.
Minimum size: (50, 50) – (100, 100). A value lower than this interval would create more false positives and a value higher than this interval would make the detection harder.
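The influence of the scale parameter can be understood through the size of the image pyramid the detector builds: each level scales the search by the scale factor until the detection window no longer fits in the image. A rough estimate of the number of levels (an illustrative sketch of how cascade detectors are generally described, not taken from the OpenCV sources):

```python
import math

def pyramid_levels(image_dim, window_dim, scale):
    """Approximate number of detection scales between the window size
    and the image size for a given scale factor."""
    return int(math.log(image_dim / window_dim) / math.log(scale)) + 1

# For a 480-pixel image dimension and the 48-pixel training window:
print(pyramid_levels(480, 48, 1.1))  # -> 25
print(pyramid_levels(480, 48, 1.4))  # -> 7
```

A smaller scale factor therefore means many more levels to evaluate, which explains both the higher false-positive count and the longer processing time observed at the lower end of the interval.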
Give way sign detection trained with 20 stages
Forbidden sign detection and transmission to the smartphone application
Although two signs can be observed in Fig. 36, only one is detected; the second sign was detected once it reached a size similar to the first one. It can also be seen that a small artefact covers the sign in the left corner, and even so the detection worked just as expected. Fig. 37 shows the second forbidden sign being detected as well.
Second forbidden sign detection and transmission to the smartphone application
Based on the detections, the signs are classified by their priority (interdiction signs have a higher priority than permission signs). In order to know which sign applies at the moment of detection, supposing that multiple signs are detected in the same image, the blob size tells the algorithm which actions can be taken. These actions can be:
– Do nothing, maintain speed, if a permissive sign is encountered.
– Reduce speed to make a maneuver, if a sign like roundabout is encountered.
– Stop, if a sign like a red traffic light (semaphore) is encountered.
The system respects important traffic rules, such as traffic light priority > sign priority, especially when both are detected in the same image.
When the car slows down or stops, the system tries to detect possible obstacles using the ultrasonic sensors; afterwards, the previous speed is resumed.
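The priority and decision rules above can be sketched as a small selection function (the sign names, the priority values and the tuple format are illustrative assumptions, not the system's actual data structures):

```python
def decide_action(detections):
    """Pick an action from the detected signs, using the priority order
    traffic light > interdiction/maneuver sign > permission sign.

    `detections` is a list of (sign_type, blob_area) tuples; among signs
    of equal priority, the largest blob is assumed to be the sign that
    currently applies.
    """
    priority = {"red_light": 2, "roundabout": 1, "permission": 0}
    actions = ["maintain_speed", "reduce_speed", "stop"]
    if not detections:
        return "maintain_speed"
    best = max(detections, key=lambda d: (priority[d[0]], d[1]))
    return actions[priority[best[0]]]

print(decide_action([("permission", 900), ("roundabout", 400)]))  # -> reduce_speed
print(decide_action([("roundabout", 500), ("red_light", 100)]))   # -> stop
```

The second call shows the traffic light overriding a sign detected in the same image, regardless of blob size, as described above.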
Traffic sign detection decision diagram
The following images show traffic sign detection examples.
During development and testing the following issues were encountered:
Traffic light (semaphore) detection is strongly influenced by the light the semaphore emits; thus a vehicle-to-traffic-light communication similar to the one Audi develops (http://madrives.com/audi-develops-a-semaphore-online-assistant/) would be much more suitable than simple visual detection.
Some of the signs are different depending on the country and this can increase the number of false positives and/or missed sign detection.
Since light has a very important role in the detection phase, an infrared camera is recommended for night-time use.
The parameters used for the detection of one sign cannot always be used for the detection of another, since they are specific to each sign.
Even though the Raspberry Pi is a powerful computer, at some moments it showed its limits, especially when the image resolution was higher; more powerful processing units would be desirable, since higher resolutions are preferable for better detection.
Conclusion
The development and improvement of technology has enabled the use of more and more processing systems and processing power in today's cars. "Ubiquitous computing" has made it possible to create smarter, faster, lower-power and smaller computing systems which can be integrated into automotive ECUs.
Furthermore, cars can perform more complex operations by including ECUs with multiple cores, which can now provide multiple active safety features such as real-time traffic sign detection using powerful cameras, obstacle detection using radar systems and, increasingly often, highly automated driving.
The usage of powerful computation units has increasingly become a necessity in automotive safety and in the beginnings of autonomous driving.
The purpose of using such systems is to reduce the number of accidents and of the false positives which can appear in both camera-based sign detection and radar-based obstacle detection. It has been seen recently that false positives can lead to catastrophic decisions of autonomous cars, as when a self-driving Uber car classified a woman crossing the street with a bicycle as a false positive and took no action to save her life.
Also, ethical algorithms may be the greatest challenge in the field of artificial intelligence. As technology and costs allow the installation of autonomous systems on more and more cars, a serious analysis of algorithmic morality is needed, and it has never been more urgent than now.
This is a lesson which must be learned not only by Uber, but by the whole automotive industry, in order to create the "car of tomorrow": a car which will be capable of driving itself, will successfully replace human drivers and, moreover, will create added value by reducing the risk of driving errors to nearly zero.
This dissertation thesis could be extended in one or more possible development directions. One such direction may be to make the car park itself when it detects the specific sign for that maneuver.
References
Autonomous car picture, https://hackernoon.com/on-apple-wading-deeper-into-the-autonomous-car-industry-24aee6b41cef
Autonomous car detecting traffic signs, https://library.ctr.utexas.edu/ctr-publications/0-6849-1.pdf
Auranuch Lorsakul and Jackrit Suthakorn, “Traffic Sign Recognition Using Neural Network on OpenCV: Toward Intelligent Vehicle/Driver Assistance System”, Mahidol University.
Martin Rudorfer, “Design and Implementation of a Classification Algorithm for Speed Limit Traffic Sign Recognition”, Technical University of Berlin.
N. Radhakrishnan, S. Maruthi, "Real-time Indian traffic sign detection using Raspberry Pi and OpenCV", November 2017.
Sebastian Houben, Johannes Stallkamp, Jan Salmen, Marc Schlipsing, and Christian Igel, “Detection of Traffic Signs in Real-World Images: The German Traffic Sign Detection Benchmark”
Android architecture, https://info448-s17.github.io/lecture-notes/introduction.html
PWM duty cycle, http://cs.curs.pub.ro/wiki/pm/lab/lab3
GPIO Raspberry Pi, http://www.me.cit.nihonu.ac.jp/lab/yanagisawa/for_student/raspberry_pi_gpio.html
Raspberry Pi Pins, https://www.element14.com/community/community/raspberrypi/raspberrypi3/content?filterID=contentstatus[published]~objecttype~objecttype[document]&filterID=contentstatus[published]~language~language%5Bcpl%5D
Wiring Pi, https://www.pinterest.co.uk/pin/655133077016947604/
Logitech camera webcam, http://www.logitech.com/en-in/product/webcam-c170
Raspberry Pi 3 datasheet, http://www.farnell.com/datasheets/2020826.pdf
Ultrasonic distance sensor HC-SR04 datasheet, https://www.mouser.com/ds/2/813/HCSR04-1022824.pdf
Dual H-bridge L298n datasheet, http://www.st.com/en/motor-drivers/l298.html
Audi develops a semaphore online assistant, http://madrives.com/audi-develops-a-semaphore-online-assistant/