University “Politehnica” of Bucharest
Faculty of Electronics, Telecommunications and Information
Technology
CONVEYOR BELT OBJECT
COUNTING USING A VIDEO CAMERA
Diploma Thesis
submitted in partial fulfillment of the requirements for the Degree of
Engineer in the domain Electronics, Telecommunications and
Information Technology, study program Applied Electronics (ETC – ELAeng)
Thesis Advisor: S. L. Dr. Ing. Bogdan Hurezeanu
Student: Mihai Ionut Andreana
2018
Contents
Figure List
Acronym list
1. Digital images
   1.1. Digital imaging
   1.2. The RGB system
   1.3. Image representation
   1.4. File format and metadata
2. Python
   2.1. Python features
   2.2. Object Instantiation
   2.3. The images module
3. OpenCV
   3.1. Computer Vision
   3.2. OpenCV components
4. OpenCV-Python
   4.1. Introduction
   4.2. Reading an image
   4.3. Displaying an image
   4.4. Capturing Video from Camera
   4.5. Playing Video from file
   4.6. Saving a Video
5. Raspberry Pi
   5.1. Raspberry Pi features and specifications
   5.2. Raspberry Pi Camera
   5.3. System initialization
6. System implementation
   6.1. Video recording
   6.2. Background subtraction
   6.3. Morphological operations
   6.4. Contours
   6.5. Object counting
Figure List
Figure 1.1 – Representation of a 3×3 image
Figure 3.1 – Image codification
Figure 3.2 – OpenCV components
Figure 4.1 – Displaying an image in OpenCV
Figure 5.1 – Raspberry Pi 3 Model B
Figure 5.2 – Raspberry Pi ports
Figure 5.3 – Raspberry Pi Camera
Figure 5.4 – NOOBS
Figure 5.5 – Configuration menu
Figure 5.6 – Sources.list editing
Figure 6.1 – Conveyor belt assembly
Figure 6.2 – Raspivid command
Figure 6.3 – BackgroundSubtractorMOG applied to the original image
Figure 6.4 – BackgroundSubtractorMOG2 applied to the original image
Figure 6.5 – Binarization styles
Figure 6.6 – Closing
Figure 6.7 – Contour of the objects
Figure 6.8 – Counting the objects
Acronym list
RGB – Red, Green, Blue
TIFF – Tagged Image File Format
JPEG – Joint Photographic Experts Group
OS – Operating System
CV – Computer Vision
MLL – Machine Learning Library
HMM – Hidden Markov Model
SBC – Single-Board Computer
HDMI – High-Definition Multimedia Interface
USB – Universal Serial Bus
CSI – Camera Serial Interface
NOOBS – New Out of the Box Software
Introduction
People have always been curious. It is in human nature to travel, explore and express the world
that we live in as well as we can. Images are all around us and without them, the universe
would be lifeless and covered in black – an empty space, frozen in time. Images remind us of
past experiences, places that we visited, and moments spent with our friends. This is what makes
our world so beautiful.
The artistic domain, including pictures, began with the cave symbols made by our ancestors,
some of which have survived to the present day. Technology continued to evolve and, in 1961,
the history of the digital camera began, when Eugene F. Lally started to think about how a
mosaic photosensor could be used to capture digital images.
The first generally recognized digital camera was a prototype developed in 1975 by the
Eastman Kodak engineer Steven Sasson and it weighed nearly 4 kg. The resolution was 0.01
megapixels and the time required to record a digital photograph was 23 seconds. In recent years,
digital cameras have become more and more sophisticated, turning into everyday objects, often
integrated into other modern gadgets such as smartphones. [4]
The exponential evolution of technology and the human thirst for knowledge led to the idea
of replacing the human observer with a machine which could process the information much faster
and more accurately, giving birth to the image processing domain. Using computer vision, a computer is
able to see and to be aware of its surrounding environment, giving us the possibility to develop many
useful and complex applications such as face detection, object tracking and object counting.
The purpose of this project is to develop a system that is able to count objects moving
on a conveyor belt. The thesis is split into six parts. Chapter 1 consists of an introduction to the image
processing domain, essential in understanding the way we can process images. Chapter 2 focuses
on the Python programming language, presenting its features and some basic operations that can be
done with it. Chapter 3 is an introduction to the open-source OpenCV library, which is discussed
further in Chapter 4, where the features of OpenCV-Python and the basic methods that help in
achieving the final purpose of this paper are presented. Chapter 5 presents the Raspberry Pi SBC,
with its main specifications and features, and the Raspberry Pi Camera module. Chapter 6
contains the system implementation steps, describing the stages of the application, from the system
initialization to the actual counting of the objects on the conveyor belt.
1. Digital images
1.1. Digital imaging
Digital image processing is the processing of two-dimensional data using a computer. Digital
imaging often involves large quantities of data: the binary computer language and the visual
complexity of the images usually lead to large files.
Digital images are made of many small elements called pixels. We can define an image as
a system of values organized in a structured array having the pixel as its main element. The pixels
have a rectangular shape and they are organized in vertical and horizontal lines, forming a grid. The
image size is directly related to the dimensions of the pixel array: the image width is given by
the number of array columns and the image height by the number of rows. Within a single
pixel, the light intensity and the color remain unchanged, and all the pixels in a grid have the
same dimension. When we refer to a specific pixel from the array, we use its x and y
coordinates. In the coordinate system of an image matrix, x increases from left to right and y
increases from top to bottom, so the origin is located in the top left corner. [5]
Figure 1.1 – Representation of a 3×3 image [8]
Image size is often different from the real-world representation of the image, as it is
given by the number of pixels within the image. The resolution of an image can be described
as the image quality. When the resolution increases, the image becomes sharper and more
detailed because it contains more information.
1.2. The RGB system
To represent a digital image, another parameter called intensity is also required. If all the
pixels had the same value, we would see a uniform shade. A bit is the defining element of
the intensity in an image; a bit can only be 0 or 1. In a standard digital image, there is an 8-bit
range of values, which results in 256 different intensity values used to represent the transition
from one level of brightness to another.
Each pixel corresponds to a color. The RGB system is a standard scheme used to represent
colors. The letters of its name symbolize the color components (red, green, blue) that the human
retina can perceive. These three components, pieced together, can form unique color values. Unlike
humans, a computer does not know about colors, so it has to represent these color values as integers
that will be translated by the display hardware into the colors seen by the human eye. Given
that a color component can have a value from 0 through 255, maximum saturation of a
component is assigned the value 255, while the value 0 corresponds to the full absence of that
component. In the table below, we can see some common colors and their related RGB values.
[10]
Table 1.1 – Some colors and their corresponding RGB values
The total number of RGB color values is given by all the possible value combinations and it
is equal to 256^3 = 16,777,216 different values. The RGB system represents a true color system, but
the human eye cannot see the difference between color values that are too close to each other.
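As a small illustration of the above, the following sketch (a minimal example using Python and NumPy, not part of the thesis code; the yellow pixel value is only illustrative) stores a single 8-bit RGB pixel and counts the possible colors:

import numpy as np

# a 1x1 "image" holding a single 8-bit RGB pixel (full red, full green, no blue = yellow)
pixel = np.array([[[255, 255, 0]]], dtype=np.uint8)

r, g, b = pixel[0, 0]
print(r, g, b)       # 255 255 0
print(256 ** 3)      # 16777216 possible RGB values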
1.3. Image representation
To represent a photographic image, a computer must use digital information that involves
discrete values such as integers or the characters of a text. The things from the physical world that we
can perceive consist of analog information, which has a continuous domain of values.
The first playback and recording devices for sound and images were all analog. We have to
map the continuous analog data from a real visual scene into a pattern of discrete values using a
process based on a technique called sampling.
If we sample a large number of the color and intensity values projected onto a sensing
medium like the human retina, the digital information may represent an image that would look to
the human eye very much like the one from the original scene.
The discrete color values at different points over a two-dimensional grid represent the pixels,
and they can be measured using sampling devices. Theoretically, the more sampled pixels we
have, the more continuous and closer to reality the resulting image will appear. In practice, it is
difficult for the human eye to distinguish objects closer to each other than 0.1 mm, so we need a
sampling of about 10 pixels for each linear millimeter (roughly 62,500 pixels per square inch).
Digital image detection techniques use results and mathematical methods from shape
detection, artificial intelligence, computer science and many other scientific domains. To
simplify the understanding of computer vision, we can identify two levels in the algorithmic chain:
low-level image processing and high-level understanding of the image.
High-level processing is based on knowledge, using algorithms that are capable of achieving
the expected results; artificial intelligence methods are used most of the time. High-level
computer vision tries to mimic human understanding and the capacity of taking a decision
according to the information embedded in the image.
Computer vision and the understanding process are closely related to prior knowledge
of the image content. An image can be described by a formal model, but this model
remains unchanged. Although the initial model can be built from previous general knowledge,
the high-level processing continuously extracts new information from the images, renewing and
clarifying that knowledge.
Most of the image processing models were proposed in the 1970s. Recent research
tries to find more efficient algorithms and apply them on parallel machines in order to reduce the
impact of the large number of operations required for processing.
A difficult and still unsolved problem is the automated specification of the pre-processing
stages needed to solve a specific task. The human operator is usually the one who chooses the
sequence of operations, and achieving the expected target is strongly related to intuition and previous
experience.
Because of the development of new analysis methods and the improvements in computer
processing capacity made in recent years, the interest in movement detection and analysis has also
increased. The input of a movement analysis system is usually a sequence of images, which
considerably increases the volume of processed data. Movement analysis is most of the time
implemented in real-time analysis systems, such as the orientation of an autonomous robot. Another
problem related to the movement analysis domain is finding information about the objects embedded
in the images, including both moving and stationary objects.
1.4. File format and metadata
The file format and metadata of a rich digital master image have to be selected respecting
well-known, documented and approved standards, such that the highest level of flexibility of
use is preserved and its conservation over time is better assured.
A rich digital master should have a lossless file format. Currently, the most popular choice
is uncompressed TIFF 6.0, though the wavelet compression of JPEG 2000 shows great
promise too.
The metadata of digitized photographs represents the properties associated with the original
image and it also identifies technical information related to the digital images. The technical and the
descriptive metadata can be used to display, search, store, retrieve, handle and maintain the state of
the digital images, especially in a digital image repository. There are several standards that can be
used to guide the capture and generation of metadata.
2. Python
2.1. Python features
Python is a dynamic, multi-paradigm programming language created in 1989 by the
programmer Guido van Rossum. Even today, van Rossum is a leader of the software developer
community that works on the improvement of Python and its reference implementation, CPython,
written in C. Python is a multifunctional language used by companies such as Google, Yahoo! and many
others for web application development, but there are also scientific and entertainment
applications written in Python. Its growing popularity and programming power made Python a
main programming language used by specialized developers, and it also started to be taught in
some universities. Many Unix-based systems, such as Linux, BSD and Mac OS X, come with the
CPython interpreter.
Python is focused on making the code as clean and simple as possible and its syntax allows
developers to implement their ideas in a clearer and more precise manner than in other programming
languages like C. Regarding the programming paradigm, Python can be used as an object-oriented,
functional or procedural language. It has an automatic memory management mechanism which uses
a "garbage collector" service. Another advantage of this language is the large
number of available standard libraries.
Many programmers love Python because it provides increased productivity. As
there is no compilation step, the edit-test-debug cycle is incredibly fast. Debugging
Python programs is relatively simple: a bad input or a bug will never cause a segmentation fault.
Instead, when an error is discovered by the interpreter, it raises an exception. If no handler
was defined for the exception, the interpreter prints a stack trace. A source-level debugger gives
the possibility to inspect local and global variables, evaluate arbitrary expressions, step through
the code line by line, and so on. For quick debugging of a program, we can also add some print
statements to the source, an approach that can be very effective because of the fast edit-test-debug
cycle. [7]
2.2. Object Instantiation
Firstly, we must create an object (an instance of a class); only then can we apply
methods to it. The process by which an object is created is called instantiation. Programming
languages such as Python can automatically create objects like strings, numbers or lists when they
are encountered as literals. For other classes of objects, such as those without literals, the
programmer must explicitly instantiate them. To instantiate a class, we can use the following
syntax:
<variable> = <class>(<arguments>)
The expression on the right side is called a constructor and it looks like a function call. The number
of arguments given to the constructor may vary and they represent the initial values of the object's
attributes or data required to construct the object. For optional arguments, the program will
automatically provide default values.
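For example (a minimal sketch using a standard-library class, not taken from the thesis code):

from fractions import Fraction

# explicit instantiation: <variable> = <class>(<arguments>)
half = Fraction(1, 2)

# literals such as strings and lists are instantiated automatically
name = "Raspberry Pi"
values = [1, 2, 3]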
2.3. The images module
The images module is a package of resources that allows the developer to load an
image, view it in a window, manipulate its RGB values and save it to a file. The images module is,
like turtlegraphics, an open-source, non-standard Python tool. This module has a class called Image
which describes an image as a grid of RGB values in a two-dimensional space. There are multiple
ways to create an instance of this class, such as loading an image from a file, making an image from
scratch or as the result of processing other images. After we create an Image object, we can
use the attributes and methods defined by the class to easily process and manipulate it. The Image
class contains several methods that are very useful in image processing, such as:
Table 2.1 – The Image methods
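A hedged usage sketch of this module is given below; the method names (Image, getWidth, getHeight, getPixel, setPixel, save) are assumed from the module's typical documentation rather than confirmed here, so they should be checked against the installed version.

# assumes the non-standard images module is available next to the script;
# the method names used below are assumed and may differ between versions
import images

img = images.Image("kali.gif")                  # load an image from a file
for y in range(img.getHeight()):
    for x in range(img.getWidth()):
        r, g, b = img.getPixel(x, y)            # read the RGB values of one pixel
        gray = (r + g + b) // 3
        img.setPixel(x, y, (gray, gray, gray))  # write a grayscale value back
img.save("kali_gray.gif")                       # save the modified image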
3. OpenCV
3.1. Computer Vision
OpenCV (Open Computer Vision) is an open-source library started in 1999 and first
released in 2000. The library is written in the C and C++ programming languages and it can be run
on several operating systems such as Linux, Windows and Mac OS X. The library is in a continuous
improvement and optimization process for interfaces to Python, Ruby, Matlab and other
programming languages. It was designed to be computationally efficient and is optimized for
real-time applications.
OpenCV offers a software platform for machine vision that helps people rapidly develop
complex vision applications. The library contains more than 500 functions which cover many
artificial vision areas such as visual inspection of products in manufacturing, medical
imaging, security, camera calibration, stereo vision and robotics.
Because computer vision and machine learning are closely related, OpenCV also contains a
general-purpose machine learning library (MLL). This sub-library is focused on statistical
pattern recognition and clustering of models.
Computer vision can be described as the process of transforming data from a photo or
video camera into a decision or a new representation. These transformations are made in order to
obtain the expected results. The input stream can also contain information about the photo/video
camera (for example if it is installed in a vehicle) or the distance to the target object.
Other examples of the computer vision concept are the transformation of an image into a grayscale
image or the removal of camera movements in a sequence of images.
In a machine vision system, a processor receives a matrix of numbers from the camera,
but usually there is no built-in recognition of the model and no automated control of
focus and aperture, so vision systems are not very smart yet.
Figure 3.1 – Image codification [6]
Any value from this grid has a noise component that reduces the quantity of useful
information, but this is the maximum data that can be obtained from a machine vision system. The
difficult task is to transform this grid of numbers into perception. The same 2D image could
represent any of an infinite number of possible combinations of 3D scenes even if the data were perfect,
but the data is corrupted by noise and distortions coming from weather changes, illumination,
reflections, movements, imperfections in the lens and mechanical adjustment, the
integration time of the sensor (motion blur) or the electrical noise of the sensor.
Another problem occurring in computer vision is the noise in the image.
Usually, we deal with the noise by using statistical methods. For example, it is nearly impossible to
detect an edge in an image only by comparing a point with its close neighbors. But if we look at
the statistics over a local region, edge detection becomes much easier. A real edge should appear
as a sequence of comparisons between neighboring pixels over a local region, each one
consistent with the others. It is also possible to compensate for the noise through statistical analysis
of the image over time.
OpenCV provides some basic tools useful in solving problems related to machine vision.
Sometimes, the high-level functionalities in the library will be enough to solve difficult, complex
computer vision problems, and the basic components of the library are enough to create a
complete solution for almost any problem related to computer vision.
3.2. OpenCV components
OpenCV has evolved from an Intel Research initiative aimed at advancing CPU-intensive
applications.
OpenCV is mainly built from five components, four of which are represented below.
Figure 3.2 – OpenCV components
The OpenCV components contain the basic image processing mechanisms and
higher-level artificial vision algorithms. The ML library (machine learning) provides statistical
classifiers and data clustering tools. HighGUI contains the I/O routines and functions used for
storing and loading images. CXCore contains the data structures and the basic content.
The CvAux component (not present in the image) contains the HMM (Hidden Markov
Model) and some experimental algorithms.
4. OpenCV-Python
4.1. Introduction
OpenCV-Python is the Python API of OpenCV and it combines the best elements of the
Python language and the OpenCV C++ API. Python is one of the most used general-purpose
programming languages and it was started by Guido van Rossum. It rapidly became very popular,
especially for its simplicity and readability.
Python is slower when compared to languages like C or C++, but it can be efficiently
extended with them. This feature allows us to write computationally intensive code in C/C++
and develop a Python wrapper for it so we can use it as a Python module. This is useful mainly
because the code will run as fast as the original C/C++ code and Python also makes it very easy
to write complex programs. We can say that OpenCV-Python is actually a Python wrapper over the
C++ implementation. [9]
The Python support for Numpy facilitates the development of complex projects even further.
Numpy is a library highly optimized for numerical operations. Any OpenCV array structure can be
converted to and from Numpy arrays. As a consequence, we can combine any operations that we do
in Numpy with OpenCV. Many other libraries, such as SciPy and Matplotlib, also support Numpy
and can be used for the same purpose.
4.2. Reading an image
To read an image we can use the cv2.imread() method. If the image is not located in the working
directory, we have to specify its full path. The second argument is a flag that
specifies how the image should be read.
▪ cv2.IMREAD_COLOR: used to load a color image. This is the default flag. Using
it, the image transparency will be ignored.
▪ cv2.IMREAD_GRAYSCALE: used if we want to load the image in grayscale
mode.
▪ cv2.IMREAD_UNCHANGED: if it is set, the image will be loaded including the alpha
channel. Otherwise, the alpha channel is discarded.
Instead of these flags, we can also use the integers 1, 0 or -1.
Example:
import numpy as np
import cv2
# Load the image in grayscale mode
imag = cv2.imread('kali.jpg', 0)
The "0" argument corresponds to the grayscale mode.
4.3. Displaying an image
To display an image, we can use the cv2.imshow() method. The image window will
automatically match the image size. The first argument of the method is a string that represents the
window name and the second argument represents the image. We can create an unlimited number of
windows, but they must have different names.
cv2.imshow('kali',imag)
cv2.waitKey(0)
cv2.destroyAllWindows()
On a Windows machine, the image should look like this:
Figure 4.1 – Displaying an image in OpenCV
The cv2.waitKey() function is used for keyboard binding and it takes as argument the time in
milliseconds. If any key is pressed within the specified time, the program continues. If the argument is
0, it will wait indefinitely for a keyboard event. It can also recognize specific keystrokes and react
to that event.
The cv2.destroyAllWindows() function can be used to close all the created windows. If we
want to close only one specific window, we can use cv2.destroyWindow() and pass it the window
name as argument.
To save an image, we can use the function cv2.imwrite(). The first argument represents the
file name and the second argument represents the image that we want to save.
cv2.imwrite('kali.jpg', imag)
The image will be saved in the working directory, in JPG format.
4.4. Capturing Video from Camera
Sometimes, we may want to capture a live stream with the video camera. OpenCV gives us a
pretty simple interface to do this task. Next, we will capture a video with the camera, convert it into
grayscale video and then display it.
To capture a video, we need to create a VideoCapture object. It can take as argument
either a video file name or a device index. The device index is the number that indicates
which camera to use. Usually, only one camera is connected, so we can just pass 0 (or -1). We can also
choose the second camera by passing 1, and so on. Then, we can capture the video frame by frame.
After we finish, we must release the capture.
import numpy as np
import cv2

cpt = cv2.VideoCapture(0)
while(True):
    # Capture frame-by-frame
    ret, frame = cpt.read()
    # operations on the frame
    gr = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Display the resulting frame
    cv2.imshow('frame', gr)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
# release the capture
cpt.release()
cv2.destroyAllWindows()
The cpt.read() method returns a boolean (True/False). When the frame is correctly read, it
returns True, so we can check when the video ends by checking the return value.
Sometimes, cpt may not initialize the capture. In this case, the code will show an error. We
can check whether it is initialized with the method cpt.isOpened(). If the return value is True, it is OK.
Otherwise we have to open it using cpt.open().
We can also read some properties of the video using the method cpt.get(propId), where propId is
an integer from 0 to 18. Each number corresponds to a video property. Some of the values can be
changed using cpt.set(propId, value), where value represents the new value we want.
For example, we can determine the frame width and height with cpt.get(3) and cpt.get(4).
The default value is 640x480 but, if we want to change it to 320x240 for example, we simply use
ret = cpt.set(3,320) and ret = cpt.set(4,240).
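Put together, a minimal sketch of reading and changing these two properties looks like this (property ids 3 and 4 are the frame width and height, as stated above):

import cv2

cpt = cv2.VideoCapture(0)
# property ids 3 and 4 correspond to the frame width and height
print(cpt.get(3), cpt.get(4))
ret = cpt.set(3, 320)
ret = cpt.set(4, 240)
cpt.release()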
If we get errors, we can use another camera application (like Cheese in Linux) to
check whether the camera is working fine.
4.5. Playing Video from file
Playing a video is almost the same as capturing one. We only have to replace the camera index with
the video file name. We should also use an appropriate time for cv2.waitKey() while displaying the
frames. If it is too small, the video will be extremely fast and if it is too high, the video will be
very slow (this way we can watch videos in slow motion).
import numpy as np
import cv2

cpt = cv2.VideoCapture('test.avi')
while(cpt.isOpened()):
    ret, frame = cpt.read()
    gr = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cv2.imshow('frame', gr)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cpt.release()
cv2.destroyAllWindows()
4.6. Saving a Video
We captured a video and processed it frame by frame; now we should save it. For images,
this is pretty easy, as we simply use cv2.imwrite(). In this case, a little more work is required.
This time we must create a VideoWriter object. We have to specify the output file name (e.g.
out.avi). Then we have to specify the FourCC code, the number of frames per second (fps) and the
frame size. The last argument is the isColor flag: if its value is True, the encoder expects
color frames, otherwise it works with grayscale frames.
FourCC is a 4-byte code used to identify the video codec. It can be passed as
cv2.VideoWriter_fourcc('M','J','P','G') or as cv2.VideoWriter_fourcc(*'MJPG') for MJPG.
The following code will capture from the camera, flip each frame vertically and save the result.
import numpy as np
import cv2

cpt = cv2.VideoCapture(0)
# Define the codec and create a VideoWriter object
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('out.avi', fourcc, 20.0, (640, 480))
while(cpt.isOpened()):
    ret, frame = cpt.read()
    if ret == True:
        frame = cv2.flip(frame, 0)
        # write the flipped frame
        out.write(frame)
        cv2.imshow('frame', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break
# Release everything when the job is finished
cpt.release()
out.release()
cv2.destroyAllWindows()
5. Raspberry Pi
5.1. Raspberry Pi features and specifications
Raspberry Pi is an SBC (Single-Board Computer) with the dimensions of a credit card, made
in the UK by the Raspberry Pi Foundation to promote the study of informatics in schools.
Even though it has small dimensions (85 mm x 56 mm), Raspberry Pi can act just like a normal
computer, having the usual functionalities such as running an operating system (Linux or Windows)
and running applications like games, text editors, programming environments, music and video players
or web applications. The main differences between Raspberry Pi and a personal computer (PC) or
laptop are the reduced dimensions and computational power of the Raspberry Pi, as
it is less powerful than a PC, but it also has a reduced cost. Raspberry Pi can be compared with a
tablet or a netbook system without an integrated display or keyboard. Raspberry Pi also gives us
the possibility to connect different electronic components specific to embedded systems:
sensors, buttons, LCD displays, etc. The possibility to customize the software (the
operating system, the applications) and the possibility of interconnection with other electronic
components make Raspberry Pi a computing system that can be used for many interesting
personal projects. It is a computer which can be integrated in the electronic and mechanical systems
implemented by the user.
Despite its small dimensions, Raspberry Pi 3 has many integrated peripherals, completely
covering the functionality of a computing system (audio, video, USB ports, network connectivity).
To make the Raspberry Pi board functional, we need the following components:
• An HDMI cable and a monitor with HDMI input. If we do not have one, we can use an
HDMI-DVI or HDMI-VGA adapter.
• An AC adapter with 5V output, minimum 2.5A and a microUSB jack. It is
recommended to use the official adapter or a good one that can provide the right
voltage and current to power the Raspberry Pi 3 board.
• A USB mouse and keyboard. They are required for the initial configuration.
• A microSD memory card with a minimum capacity of 8GB.
• A UTP network (patch-cord) cable if the system is used in a local network.
Figure 5.1 – Raspberry Pi 3 Model B
5.2. Raspberry Pi Camera
The Raspberry Pi camera module is a customized add-on for Raspberry Pi and it can
be connected using the port labeled "Camera", located near the HDMI output. It uses the
dedicated CSI interface, which was specially created for this purpose. The CSI bus is
capable of handling fairly high data rates and it carries exclusively pixel data.
The board itself is relatively small, approximately 25 mm x 23 mm x 9 mm. Its
weight is also reduced (approximately 3 g), making it perfect for mobile applications or other
applications where the weight and the size are important. The camera sensor is connected to
the BCM2835 processor on the Pi via the CSI bus, a high-bandwidth link that carries
pixel data from the camera back to the processor. This bus transports the information
along the ribbon cable that attaches the camera to the Raspberry Pi.
The sensor has a resolution of 8 megapixels and a fixed-focus lens on the board.
Regarding the video quality, the camera supports video resolutions of 1080p at 30 fps, 720p at 60 fps
and 640x480p at 60/90 fps. It can also capture static images up to 3280x2464 pixels and it is
compatible with all Raspberry Pi versions.
Figure 5.2 – Raspberry Pi ports
Figure 5.3 – Raspberry Pi Camera
5.3. System initialization
Before we can use the Raspberry Pi, we have to insert the microSD memory card that
contains the operating system. If the card does not have a preinstalled operating system, we have to
install one manually. Raspberry Pi can run multiple operating system distributions, such as Linux or
Windows 10 IoT Core. The official operating system offered by Raspberry Pi is Raspbian, a Linux
distribution that is easy to use and recommended for beginners.
If we have an official microSD memory card from the Raspberry Pi manufacturer, installing
the operating system is easier because the memory card already contains NOOBS (New Out Of the
Box Software), which comes with a graphical interface from which we can install one of the different
operating systems supported by Raspberry Pi.
Figure 5.4 – NOOBS
If we want to install an operating system other than Raspbian, we must ensure that the device is
connected to the internet via cable or WiFi. After the operating system has been installed, the system
will reboot and we can make some configurations to better fit our needs. To open the terminal, we can
click on its icon or use the Ctrl+Alt+T key combination. To open the configuration menu, we type
"sudo raspi-config" in the terminal.
Figure 5.5 – Configuration menu
The sudo command allows privileged execution of commands, with administrative rights
on the system. After the operating system installation, we should change the default password. The
default username is "pi" and the password is "raspberry". Also, to ensure system hardening, the SSH
and VNC services should be disabled if we do not need remote access to the system.
To update the operating system packages to the latest version, we use the "sudo apt-get
update" and "sudo apt-get upgrade" commands in the terminal. We can also upgrade the operating
system to a newer release, if available, by typing the "sudo apt-get update" and "sudo apt-get
dist-upgrade" commands in the terminal.
If we want to capture images or videos, we have to connect the camera to the Raspberry Pi
board and activate the camera module. This can be done from the Raspberry Pi Configuration
window. To connect the camera, we first insert the ribbon cable into the dedicated CSI
communication port. The colored side of the ribbon cable must be facing the Ethernet port and, after
it has been introduced into the connector, we have to push down the connector to make sure it stays
fixed. During this procedure, the board should be disconnected from the power supply.
To check that everything is working as intended, we can execute the following command in
the terminal: python -c "import picamera". If no error message is received, it means that all the
steps were correctly executed.
After all the initial steps are finished and the system is functional, we run the following
commands:
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install python-opencv libopencv-dev -y
The last command installs the packages required to run Python scripts that use the
OpenCV library.
If the packages cannot be installed because the server hosting the resources is offline or the
Raspberry Pi cannot connect to it, we can run the following command:
cat /etc/apt/sources.list
This command shows the list of servers accessed by Raspberry Pi when it downloads
packages or upgrades applications.
To edit this list, we run the following command:
sudo nano /etc/apt/sources.list
We delete the # symbol from the third line and save the file.
Figure 5.6 – Sources.list editing
After this, we open an instance of Python from the following path: Start -> Programming ->
Python 3 (IDLE). We create a new window from File -> New or by pressing the Ctrl+N key
combination.
6. System implementation
This chapter presents the architecture of the implemented software module. This
architecture will be used to accomplish the purpose of the project. Firstly, to realize the object
tracking, detection and counting, we need a hardware assembly made from a Raspberry Pi
board, which is externally powered, and a Raspberry Pi video camera connected to it.
After the assembly of the components, the camera must be strategically placed in order to
obtain a good view over the conveyor belt and the objects on it, preferably above the
conveyor belt. Before the video recording, it is necessary to have a functional conveyor belt (Fig.
6.1) and the camera should be placed such that the view area of the lens contains only
the width of the belt on which the objects are placed. The conveyor belt system can be observed in
Fig. 6.1. This system is composed of a white belt of relatively small dimensions, two rotating
cylinders, a motor and a wooden support, and it is controlled by an application on a Raspberry Pi
system.
Figure 6.1 – Conveyor belt assembly
6.1. Video recording
After assembling the system and placing the camera at the best viewing angle, the video
recording can be initialized. To start recording, it is necessary to connect to the Raspberry Pi and
access the camera. This can be done in two ways: from the terminal window or from a Python
script.
Figure 6.2 – raspivid command
The image above shows the raspivid command used to record a video with the camera
module [10]. For a simple recording, only a few arguments of the command are necessary.
To specify the output filename, the -o argument is used, followed by the file path. The next
necessary argument is -t, which specifies the length of the video, expressed in
milliseconds. For example, -t 10000 records a video with a length of ten
seconds. The default duration, in case no time argument is specified, is five seconds. If a
vertical flip of the image is required, it can be done using the -vf argument. For a horizontal flip, the
-hf argument can be used.
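Putting these arguments together, a typical invocation (using the same output path as in the Python example below) would be:

raspivid -o /home/pi/video.h264 -t 10000 -vf

This records a ten-second, vertically flipped video to /home/pi/video.h264.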
To access the camera from a Python script, a few lines of code are necessary.
from picamera import PiCamera
from time import sleep

camera = PiCamera()
camera.start_preview()
camera.start_recording('/home/pi/video.h264')
sleep(10)
camera.stop_recording()
camera.stop_preview()
The camera.start_preview() method allows the system to view the environment through the
camera lens without recording. This method is useful for placing the camera over a region of
interest. The camera.stop_preview() method stops the camera preview.
The camera.start_recording() method is used to start the actual recording of the video and it
takes as argument the path where the video will be saved. The camera.stop_recording() method is
used to stop the video recording.
The sleep() call specifies the time between camera.start_recording() and
camera.stop_recording(), indicating the length of the video.
To play the recorded video, the omxplayer command followed by the video name is used.
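For the recording above, this corresponds to:

omxplayer /home/pi/video.h264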
After some recording sessions, it was observed that the camera placement plays an
essential role in the motion tracking accuracy. Because the conveyor belt is white and it
reflects external sources of light, it is important to have an even ambient light intensity or to adjust the
camera position so that the reflected light does not compromise the recording, as it
changes the color of the objects and can also be detected as a moving entity with its own area,
because the conveyor belt is not stationary and it creates the impression of movement. Another
solution for an improved video quality is to change the material of the belt and use a
higher-quality camera.
The application proposed in this chapter is sensitive to motion, detecting any kind of
movement, including everything outside the conveyor belt. For this reason, it is recommended to
capture only the conveyor belt.
Firstly, some video recordings were made for the purpose of application development, but
the application can also track and count objects in real time. To accomplish this task, the MyObject
class was created in order to detect and count the objects on the conveyor belt. This class
defines methods such as updateCoordinates() and counting(). The updateCoordinates()
method stores the current x and y coordinates of the object and also appends the previous
coordinates to a tracking list. The counting() method checks whether the tracking list contains at
least two positions, in which case it takes the last two positions from the list and verifies whether the
y coordinate of the last position passes a certain threshold line at which the objects are counted. In
that case the direction is set and the returned value is True.
Apart from the constructor, which initializes the list and the (x, y) coordinates of the object,
there are some getters that return the object attributes. For the motion tracking and
detection, we have to import the OpenCV library, which is useful in processing the images.
The Object module will also be imported in the CountingObjectsBelt script, as well as the numpy and
time libraries.
The cv2.VideoCapture() method is used to capture the video. The "0" argument is
passed when we want to capture video from the default system camera. If a path is given instead, the
video file from that path is opened. In order to apply the image processing operations necessary
for the detection of the objects, a frame-by-frame analysis of the video is required. For this
purpose, the read() function should be called as long as the video stream is opened. To check whether
the video is still recording or opened, the isOpened() method is called as the condition of a
while loop.
6.2. Background subtraction
In order to detect the motion of the belt and the objects placed on it, the first step is the
background subtraction from the video frames. In this context, some experiments were
made, leading to the conclusion that certain background subtraction algorithms are better than
others for the problem at hand.
Background subtraction is a very important step in vision-based applications. In cases
such as a room with some objects of interest, like people or cars, we need to isolate these objects and
ignore everything else. In most cases, the images are fairly complex and the task of background
subtraction becomes even more complicated when the images contain shadows of the objects,
because the shadow is also moving and the subtraction marks it as foreground, complicating the
process. The most important background subtraction algorithms implemented in OpenCV are
BackgroundSubtractorMOG and BackgroundSubtractorMOG2.
BackgroundSubtractorMOG is a Gaussian Mixture-based Background/Foreground
Segmentation Algorithm that models each background pixel by a mixture of K
Gaussian distributions. For this method, the number of Gaussian mixtures K can take values in the
range 3-5. The mixture weights correspond to the time proportions that the colors stay in the
scene: the colours staying longer and more static are the most probable background colours. This
method can be passed a few parameters, such as the length of the history (the number of frames
taken into consideration for background extraction), the number of Gaussian mixtures, the background
ratio and the noise strength. Before subtracting, we need to create an object using the function
cv2.bgsegm.createBackgroundSubtractorMOG() and, inside the video loop, call the apply() function of
the previously created object to get the foreground mask. [11]
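A minimal sketch of this usage is given below (the cv2.bgsegm module path corresponds to the OpenCV 3.x contrib build used in the annex code; the input file name is taken from the earlier examples):

import cv2

cap = cv2.VideoCapture('test.avi')
# create the subtractor once, before the video loop
fgbg = cv2.bgsegm.createBackgroundSubtractorMOG()

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # apply() returns the foreground mask for the current frame
    fgmask = fgbg.apply(frame)
    cv2.imshow('MOG foreground', fgmask)
    if cv2.waitKey(30) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()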
Figure 6.3 – BackgroundSubtractorMOG applied to the original image
BackgroundSubtractorMOG2 is also a Gaussian Mixture-based Background/Foreground
Segmentation Algorithm. As an improvement of the previous algorithm, it uses an adaptive Gaussian
mixture model for background subtraction, selecting the appropriate number of Gaussian
distributions for each pixel. Compared to BackgroundSubtractorMOG, it adapts better to changing
video scenes caused by factors such as illumination changes. Like the previous algorithm, it also
requires the creation of a background subtractor object. This algorithm also gives us the possibility
of specifying whether shadows should be detected or not. When this feature is enabled, it detects
the objects and marks them in white and the shadows in gray, with the drawback of decreasing the
speed. This feature of detecting and marking the objects and shadows can be observed in
Figure 6.4. [12]
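A short sketch of the MOG2 variant with shadow detection enabled (the detectShadows parameter is part of the standard OpenCV Python API; the input file name is illustrative):

import cv2

cap = cv2.VideoCapture('test.avi')
# detected objects appear white and their shadows gray in the resulting mask
fgbg2 = cv2.createBackgroundSubtractorMOG2(detectShadows=True)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    fgmask = fgbg2.apply(frame)
    cv2.imshow('MOG2 foreground', fgmask)
    if cv2.waitKey(30) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()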
Figure 6.4 – BackgroundSubtractorMOG2 applied to the original image
6.3. Morphological operations
After extracting the background from the images, the image undergoes a binarization
process, which is obtained with the cv2.threshold() method. It receives as arguments the input
image (a grayscale image on which the binarization will be applied), the threshold value, the new
value of the pixels above the threshold and a flag representing the style of thresholding. For the
problem at hand, only the brightest pixels were kept, those with an intensity value above 220. All of
these pixel values were changed to the maximum intensity (255), corresponding to white, while the
other pixels were given the lowest value (0), corresponding to black. This was done according to the
cv2.THRESH_BINARY style, which can be seen, along with other binarization styles, in
Figure 6.5. [1]
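A minimal sketch of this binarization step (the input file name is illustrative; in the actual application the input is the foreground mask produced by the background subtractor):

import cv2

mask = cv2.imread('foreground_mask.png', cv2.IMREAD_GRAYSCALE)
# pixels brighter than 220 become 255 (white); all others become 0 (black)
ret, imgBinary = cv2.threshold(mask, 220, 255, cv2.THRESH_BINARY)
cv2.imwrite('binary_mask.png', imgBinary)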
Figure 6.5 – Binarization styles
Next, for a better view of the moving objects, another pre-processing step is required. This
one is a morphological operation called Closing, which combines the two basic operations, erosion
and dilation (a closing consists of a dilation followed by an erosion).
The idea of erosion is to slide a kernel through the binary image, as in a 2D convolution.
The pixel values are modified according to the kernel size: they remain white only if all
the pixels under the kernel are white, otherwise their color is changed to black, corresponding
to an erosion. Practically, all the pixels near the boundary are discarded, depending on the kernel
dimensions, so the size of a white object decreases. This is useful for removing small white
noise and separating different objects. A large kernel removes considerable white noise, at
the risk of shrinking the important elements of the image too much.
The dilation operation is the opposite of erosion. It increases the white area of the
foreground objects by changing the intensity of the pixels to the maximum if at least one pixel is under
the given kernel. Dilation is typically applied after an erosion used for noise removal: after removing
the noise with erosion, the object's area may become so small that it is no longer correctly detected,
so we need to increase its area again using dilation. Experimenting with different kernel sizes, it was
concluded that the optimal kernel size is 35 by 35 pixels, which is enough to remove the white noise
from the binary images.
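A sketch of this morphological step, as applied in the annex code (the input file name is illustrative; MORPH_CLOSE applies the dilation/erosion pair with the given kernel):

import numpy as np
import cv2

imgBinary = cv2.imread('binary_mask.png', cv2.IMREAD_GRAYSCALE)
closeKernel = np.ones((35, 35), np.uint8)

# closing with the 35x35 kernel used in the application
mask = cv2.morphologyEx(imgBinary, cv2.MORPH_CLOSE, closeKernel)

# the two basic operations are also available separately
eroded = cv2.erode(imgBinary, closeKernel, iterations=1)
dilated = cv2.dilate(eroded, closeKernel, iterations=1)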
Figure 6.6 – Closing [2]
6.4. Contours
After the closing process, all the contours are extracted using the method
cv2.findContours(). The contour of an object can be defined as a curve joining the continuous points
along the boundary of an object that have the same color or intensity. For better detection accuracy of
the contours, it is recommended to apply this function to a binary image. The method may
modify the input image, replacing it with a black image containing only the contours of the objects.
This can be observed in Figure 6.7. In Python, the returned contours form a list of all the contours
within the image and each individual contour is a Numpy array of (x, y) coordinates of the boundary
points of the object. Besides the contours, this method also returns a hierarchy of the contours; the call
takes a contour retrieval mode and an approximation method as parameters. There are several retrieval
modes, but the one used in this paper is RETR_TREE, which retrieves all the contours and
reconstructs a full hierarchy of nested contours. As the approximation mode,
CHAIN_APPROX_SIMPLE was used, which compresses vertical, horizontal and diagonal segments
and leaves only their end points. [3]
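A short sketch of the contour extraction on a binary image (the three-value return signature matches the OpenCV 3.x API used in the annex code; other OpenCV versions return only two values, and the input file name is illustrative):

import cv2

mask = cv2.imread('binary_mask.png', cv2.IMREAD_GRAYSCALE)
# OpenCV 3.x returns (image, contours, hierarchy)
_, contours, hierarchy = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

for contour in contours:
    # the area of each contour can be used to filter out small noise regions
    area = cv2.contourArea(contour)
    print('contour area:', area)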
Figure 6.7 – Contour of the objects
For each contour in the contour list, it is necessary to extract the contour area using the
method cv2.contourArea(contour). This method takes a contour as argument and calculates its area.
If the object contour area passes a certain threshold, we move forward with the processing steps
in order to count the objects. To do this, the image moments must be calculated, as they help us
obtain the center of mass of the objects. This can be done with the cv2.moments() method, which
receives a contour as input parameter and returns the moments used to calculate the centroid
coordinates with the following equations:
Cx = M10 / M00
Cy = M01 / M00
The center of mass of the object, represented by the (Cx, Cy) coordinates, is used when creating a new
object of the MyObject class and it is appended to the object tracking list. The centroid is drawn in
the original frame with the cv2.circle() method, which receives as input parameters the image on
which the circle will be drawn, the center of the circle, the radius, the colour and the thickness. In this
case, a radius of five pixels was used, with the colour code (0, 255, 255), which represents the
maximum values for green and red, corresponding to yellow. The -1 thickness argument specifies that
the circle is drawn filled.
cv2.circle(frame, (cx,cy), 5, (0, 255, 255), -1)
After finishing these steps, we need to view the direction of the moving object by bounding it
in a rectangle. A straight bounding rectangle does not take the rotation of the object into consideration,
so its area will not be minimal. It can be found with the method cv2.boundingRect(), which receives a
contour as input parameter and returns the top-left (x, y) coordinates of the rectangle together with its
width and height. The rectangle can be drawn using the method cv2.rectangle(), which takes as
arguments the image, the top-left corner coordinates and the bottom-right corner coordinates of the
rectangle.
cv2.rectangle(frame, (x,y), (int(x+w), int(y+h)), (0,255,0), 2)
To put text on an image, it is required to specify the font that will be used, the text, the
coordinates of the position where it will be inserted, the color, the thickness and other optional
characteristics.
font = cv2.FONT_HERSHEY_SIMPLEX
text = 'Count: ' + str(count)
cv2.putText(frame, text, (10, 40), font, 0.5, (255, 255, 255), 2, cv2.LINE_AA)
6.5. Object counting
In order to count the objects, after instantiating a MyObject object with an ID and
the centroid coordinates, we add it to a list of objects whose positions are monitored across
frames. For the problem at hand, the objects are tracked on the y axis. For each new frame, after
extracting the contour, the contour area and the object moments, it is checked whether the centroid has
shifted, in which case the object centroid is updated. Updating the coordinates of an object also
appends the new coordinates to the object tracking list.
To count the objects, the counting() method is called for each object. If the last two
y coordinates of the centroid lie on opposite sides of the counting line specified in the arguments, the
method returns True. In this case, the counter is incremented and also displayed in the
terminal together with the corresponding timestamp.
Figure 6.8 – Counting the objects
After all the previous operations have been performed, we save a video containing all the
processing methods applied to the input images, such as drawing the rectangles, the centers of
mass, the line after which the objects are counted and the real-time changing text. To do this, the
cv2.VideoWriter() constructor is called in order to create an object to which each modified frame is
written.
out = cv2.VideoWriter('outpy.avi', fourcc, 30, (int(width), int(height)))
The first argument of the VideoWriter() constructor represents the output file name, the second
argument establishes the video codec, the next one is the video frame rate and the last argument is
the frame size of the output video.
Conclusion
This project presented a system that is able to correctly track, detect and count objects
moving on a conveyor belt. This system can find its applicability in the industrial field wherever the
counting of multiple objects is required, representing a labor-saving system. The counting
system can be improved with different features such as detecting objects of different colors and
recognizing their shape and texture. With these improvements, the application would be able to sort
different objects and identify unknown ones, a feature that can be used in the detection of
unwanted or bad products on the assembly line of a factory.
In order to correctly detect the objects, a result that influences the counting process, there
are some important factors that should be taken into consideration. The first one is the camera
alignment with respect to the conveyor belt. The view area of the camera should contain only the
conveyor belt. Any other environmental elements external to the conveyor belt will result in a more
complicated image analysis process, leading in some cases to compromised results such as the
detection of unwanted objects and a high false positive ratio. Another important aspect is the color
and the material of the conveyor belt. It is recommended that the color be black, without
producing any light reflections. Any reflection of the light can disturb the background extraction
process, as it can change the color of the objects and can also be detected as a moving entity with its
own area, because the conveyor belt is moving.
In the current paper, it can be observed that, even when using a white conveyor belt with a
shiny surface, the influence of the light reflection was minimized by choosing the right image
processing operations, resulting in the correct counting of the products recorded with a video
camera.
It has been observed that choosing an optimal background subtraction algorithm can highly
influence the output of the system. A bad choice of algorithm can lead to a noisy foreground, and a
high volume of noise can negatively influence the detection of the objects of interest. The noise in
the images can be reduced using morphological operations like those presented in the paper
(erosion, dilation, closing, etc.). The kernel parameters used in these morphological
operations were chosen manually after multiple trial-and-error iterations. This led to an
almost complete removal of the noise, leaving the image only with the wanted objects. Because the
parameters were chosen manually, this works almost perfectly for the tested objects, but it may not
work very well in other, unknown environments. For the moment, the output of an automated
selection of the methods and parameters cannot be predicted, because the computer cannot
distinguish between a wanted and an unwanted object without human intervention.
In conclusion, this project focused on the implementation of a general object counting
system based on the image analysis and processing methods provided by the OpenCV library.
Multiple combinations of these methods were tried in order to build a robust system, but it was
concluded that the correct detection and counting of the objects depends not only on the software
application but also on other factors such as the quality of the hardware equipment and the
environmental conditions.
Bibliography
1. https://docs.opencv.org/3.4/d7/d4d/tutorial_py_thresholding.html
2. https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_morphological_ops/py_morphological_ops.html
3. https://docs.opencv.org/3.1.0/dd/d49/tutorial_py_contour_features.html
4. https://www.cnet.com/news/photos-the-history-of-the-digital-camera/
5. https://sites.google.com/site/learnimagej/image-processing/what-is-a-digital-image
6. https://www.docsity.com/en/non-linear-hypotheses-machine-learning-and-artificial-intelligence-lecture-slides/177199/
7. https://www.python.org/doc/essays/blurb/
8. https://www.dyclassroom.com/image-processing-project/how-to-get-and-set-pixel-value-in-java
9. http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_setup/py_intro/py_intro.html#intro
10. http://home.wlu.edu/~lambertk/classes/101/Images.pdf
11. P. KadewTraKuPong and R. Bowden, "An improved adaptive background mixture model for real-time tracking with shadow detection", 2001
12. Z. Zivkovic, "Improved adaptive Gaussian mixture model for background subtraction", 2004
Annexes
Object.py
from random import randint
import time


class MyObject:
    def __init__(self, i, cxi, cyi):
        self.i = i
        self.x = cxi
        self.y = cyi
        self.tracks = []
        self.R = randint(0, 255)
        self.G = randint(0, 255)
        self.B = randint(0, 255)
        self.done = False
        self.state = '0'
        self.direction = None

    def getRGB(self):
        return (self.R, self.G, self.B)

    def getTracks(self):
        return self.tracks

    def getId(self):
        return self.i

    def getState(self):
        return self.state

    def getDirection(self):
        return self.direction

    def getX(self):
        return self.x

    def getY(self):
        return self.y

    def updateCoordinates(self, xn, yn):
        # store the previous position and update the current one
        self.tracks.append([self.x, self.y])
        self.x = xn
        self.y = yn

    def setDone(self):
        self.done = True

    def Over(self):
        return self.done

    def counting(self, threshold):
        # return True when the object crosses the counting line going down
        if len(self.tracks) >= 2:
            if self.state == '0':
                if self.tracks[-1][1] > threshold and self.tracks[-2][1] <= threshold:
                    self.state = '1'
                    self.direction = 'down'
                    return True
                else:
                    return False
            else:
                return False
CountingObjectsBelt.py
import numpy as np
import cv2
import Object
import time

count = 0
cap = cv2.VideoCapture(0)  # Open video file or camera
width = cap.get(3)   # get frame width
height = cap.get(4)  # get frame height
background = cv2.bgsegm.createBackgroundSubtractorMOG(history=20, nmixtures=3, backgroundRatio=0.7, noiseSigma=30)
closeKernel = np.ones((35, 35), np.uint8)
GAUSSIAN_SMOOTH_FILTER_SIZE = (21, 21)
ADAPTIVE_THRESH_BLOCK_SIZE = 19
ADAPTIVE_THRESH_WEIGHT = 9
ObjectArea = (width * height) / 250
line_down_position = int(3 * (height / 5))
up_limit = int(1 * (height / 5))
down_limit = int(4 * (height / 5))
point1 = [0, line_down_position]
point2 = [width, line_down_position]
Line1 = np.array([point1, point2], np.int32)
Line1 = Line1.reshape((-1, 1, 2))
mx = int(width / 2)
my = int(height / 2)
# Fields
font = cv2.FONT_HERSHEY_SIMPLEX
objects = []
pid = 1
# Define the codec and create VideoWriter object. The output is stored in 'outpy.avi'.
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('outpy.avi', fourcc, 30, (int(width), int(height)))

while(cap.isOpened()):
    ret, frame = cap.read()  # read a frame
    foregroundmask = background.apply(frame)  # apply the background subtractor
    try:
        ret, imgBinary = cv2.threshold(foregroundmask, 220, 255, cv2.THRESH_BINARY)
        '''
        foregroundmask1 = cv2.GaussianBlur(foregroundmask, GAUSSIAN_SMOOTH_FILTER_SIZE, 0)
        ret, imgBinary1 = cv2.threshold(foregroundmask1, 220, 255, cv2.THRESH_BINARY)
        '''
        mask = cv2.morphologyEx(imgBinary, cv2.MORPH_CLOSE, closeKernel)
    except:
        # if there are no more frames to show...
        print('EOF')
        break
    _, contours0, hierarchy = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for contour in contours0:
        area = cv2.contourArea(contour)
        if area > ObjectArea:
            Moment = cv2.moments(contour)
            cx = int(Moment['m10'] / Moment['m00'])
            cy = int(Moment['m01'] / Moment['m00'])
            x, y, w, h = cv2.boundingRect(contour)
            newObject = True
            if cy in range(up_limit, down_limit):
                for i in objects:
                    if abs(cx - i.getX()) <= w and abs(cy - i.getY()) <= h:
                        newObject = False
                        i.updateCoordinates(cx, cy)
                        if i.counting(line_down_position) == True:
                            count += 1
                            print("ID:", i.getId(), 'crossed going down at', time.strftime("%c"))
                        break
                    if i.getState() == '1':
                        if i.getDirection() == 'down' and i.getY() > down_limit:
                            i.setDone()
                    if i.Over():
                        index = objects.index(i)
                        objects.pop(index)
                        del i
                if newObject == True:
                    p = Object.MyObject(pid, cx, cy)
                    objects.append(p)
                    pid += 1
            cv2.circle(frame, (cx, cy), 5, (0, 255, 255), -1)
            img = cv2.rectangle(frame, (x, y), (int(x + w), int(y + h)), (0, 255, 0), 2)
    text = 'Count: ' + str(count)
    frame = cv2.polylines(frame, [Line1], False, (255, 255, 255), thickness=2)
    cv2.putText(frame, text, (10, 40), font, 0.5, (255, 255, 255), 2, cv2.LINE_AA)
    out.write(frame)
    cv2.imshow('Frame', frame)
    # Abort and exit with 'Q' or ESC
    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break

cap.release()  # release video file
out.release()
cv2.destroyAllWindows()  # close all OpenCV windows