A NEW REGION-BASED IMAGE RETRIEVAL IN THE TRANSFORMED DOMAIN
Amina Belalia⋆, Kamel Belloulata†
⋆Ecole Superieure d'Informatique de Sidi Bel Abbes, Algeria.
†Telecommunications Department, University of Djillali Liabes, BP 89, Sidi Bel Abbes, Algeria.
ABSTRACT
Content-Based Image Retrieval (CBIR) is the process of searching digital images in a large database based on features. As the majority of images are stored in compressed format, features can be extracted directly from the Discrete Cosine Transform (DCT) for JPEG-compressed images. This paper proposes a new Region-Based Image Retrieval approach using the Shape-Adaptive Discrete Cosine Transform (SA-DCT). In this retrieval system, an image has a prior segmentation alpha plane, which is defined exactly as in MPEG-4. Therefore, an image is represented by segmented regions, and each region is associated with a feature vector derived from DCT and SA-DCT coefficients. Any region of the query image can be selected as the main object. For images without salient objects, users can still select the whole image as the query. The results show that our approach is able to identify main objects and reduce the influence of the background, and thus improves the performance of image retrieval in comparison with conventional Content-Based Image Retrieval based on the DCT.
Index Terms — Content-Based Image Retrieval (CBIR), DCT, Segmentation, Region-Based Image Retrieval (RBIR), SA-DCT, MPEG-4.
1. INTRODUCTION
The proliferation of digital imaging devices, storage and networking systems has resulted in large volumes of images. Efficient retrieval of images stored in such large databases has become desirable. Content-Based Image Retrieval (CBIR) offers a way of retrieving images according to their visual content [1, 2]. Feature extraction is the basis of CBIR; it is the process of transforming the input image into a set of feature vectors. The representation of feature vectors is called a feature descriptor. A feature descriptor can be either global or local. A global descriptor uses the visual features of the whole image (conventional CBIR), whereas a local descriptor uses the visual features of regions or objects (Region-Based Image
Retrieval, RBIR). To obtain local visual descriptors, we need to perform a complete object segmentation [3] to obtain semantically meaningful objects (like a car or a horse). Unlike early CBIR approaches, which compute a global descriptor of images, Region-Based Image Retrieval (RBIR) extracts features of the segmented regions and performs similarity comparisons at the level of the region [4, 5, 6, 7]. During retrieval, the RBIR system provides the mask of the segmented regions of the query image and assigns several properties, such as the regions to be matched, the features of the regions, and even the weights of the different features [8]. The matching of the regions is restricted to be one-to-one, i.e., one region of an image can only match one region of another image.
[Footnote] This work was supported by the Partenariat Hubert Curien PHC-TASSILI under grant No. 12MDU864. The authors are grateful for this financial support.
For compressed images, features can be extracted directly from images in their compressed format by using, for example, the Discrete Cosine Transform (DCT) [9] or the Discrete Wavelet Transform (DWT) [10], which are part of the compression process. Recently, numerous approaches have been proposed that process DCT-block images for information extraction [11, 12, 13]. The improvement brought by the DCT is explained by its decorrelation properties, feature preservation and reduction in complexity. It is evident that the use of DCT block transformation in combination with other techniques results in good retrieval performance. In most cases the DCT can be seen as a preprocessing step followed by a more or less sophisticated method for extracting structural features [14]. It has been shown that, in some cases, transform-based processing gives significantly better results than direct pixel-based processing [10].
In this paper we propose to optimize our earlier approach [15, 9], based on the Shape-Adaptive DCT (SA-DCT) [16], for RBIR. In CBIR using DCT blocks [14], the boundary blocks contain pixels from an object and either from the background or from another object (Fig. 1). To alleviate this problem, we propose to apply the SA-DCT [16], which takes into account a prior segmentation of the image into regions [9], so that the matching of the regions is restricted to be one-to-one, i.e., one region of an image can only match one region of another image. The paper is organized as follows. In Section 2, we briefly review the theory of Content-Based Image Retrieval (CBIR) in the DCT domain. In Section 3, the Shape-Adaptive DCT transform is described and the proposed Region-Based Image Retrieval is presented. In Section 4 numerous experimental results are shown, whereas in Section 5 we draw conclusions.

Fig. 1. Block with two segments S1 and S2, belonging to different regions, and its decomposition into two blocks with active pixels (grey), either in S1 or in S2, and inactive pixels in U (white).
2. CONTENT-BASED IMAGE RETRIEVAL USING THE DCT TRANSFORM AND THE HISTOGRAM OF ITS PATTERNS
Let I be the intensity of an image to be transformed and x = (x, y) the spatial coordinates of a pixel in this image. Let {b_1, ..., b_N} be the set of N non-overlapping blocks, i.e., collections of pixel coordinates, partitioning the image. To permit compact notation, let I_{b_i} denote the restriction of the image intensity I to the block b_i; in other words, I_{b_i} = {I(x) : x ∈ b_i}. Each block b_i is then transformed to the frequency domain, Î_{b_i} = DCT(I_{b_i}), by the Discrete Cosine Transform (DCT). This transform is originally defined in 1-D form and can be used to construct a separable 2-D transform. It has been found useful for source coding, especially image and video coding. Let u = (u, v) be the 2-D frequency of a DCT coefficient and Î(u) the DCT transform of an N×N image represented by pixel values I(x) for x, y = 1, ..., N.
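As a concrete illustration of the block transform just described, the following sketch (in Python with NumPy; the function names are ours, not from the paper) computes the orthonormal 2-D DCT of every non-overlapping 4×4 block of an image:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal 1-D DCT-II basis matrix of size n x n."""
    k = np.arange(n)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)  # DC row has its own scaling
    return C

def blockwise_dct(image, n=4):
    """Partition the image into non-overlapping n x n blocks b_i and return
    the transformed blocks I_hat_bi = C @ I_bi @ C.T (separable 2-D DCT)."""
    C = dct_matrix(n)
    h, w = image.shape
    out = np.empty_like(image, dtype=float)
    for i in range(0, h, n):
        for j in range(0, w, n):
            out[i:i+n, j:j+n] = C @ image[i:i+n, j:j+n] @ C.T
    return out
```

In each transformed block, the entry at position (0, 0) is the DC coefficient; the remaining 15 entries are the AC coefficients used in Section 2.1.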
2.1. AC-Pattern and its histogram
In this study, 4×4 DCT blocks are considered. The proposed approach selects 9 coefficients out of the 15 AC coefficients in each block and uses their statistical information to construct the AC-Pattern (Fig. 2). This selection gathers these 9 coefficients into 3 groups: horizontal (Group H), vertical (Group V) and diagonal (Group D). For each group, the sum of the coefficients is calculated first, and then the squared differences between each coefficient and the sum of its group are calculated. Finally, these squared differences of the three groups are used to construct the AC-Pattern. This selection is retained because of its ability to represent the local structure of the block content [15]. Compared with the method of [14], which uses all 15 AC coefficients per pattern, this selection considerably reduces the complexity of the feature vector. To construct the AC-Pattern histogram of an image, we simply count the number of occurrences of these AC-Patterns in the image, which yields the AC-Pattern histogram H_AC. Furthermore, there is one special AC-Pattern in which all the AC coefficients are zero (Fig. 3); this AC-Pattern mainly corresponds to uniform blocks of the image [15]. So, in consideration of computation time
Fig. 2. The process of forming the Texture-Pattern [15]: a) three groups of AC coefficients extracted from the 4×4 DCT block (H: C1, C2, C3; V: C4, C8, C12; D: C5, C10, C15); b) sums of each group, S1 = C1 + C2 + C3, S2 = C4 + C8 + C12, S3 = C5 + C10 + C15; c) sums of squared differences, T1 = Σ_{i∈{1,2,3}} (C_i − S1)², T2 = Σ_{i∈{4,8,12}} (C_i − S2)², T3 = Σ_{i∈{5,10,15}} (C_i − S3)²; d) the resulting Texture-Pattern (T1, T2, T3).
Fig. 3. Histogram of the first 50 AC-Patterns with highest frequency of occurrence for the Horse image (Content-Based Image Retrieval).
and efficiency, we select only those AC-Patterns with the highest frequency of occurrence to construct the histogram [14].
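The three-group construction above can be sketched as follows (a minimal Python sketch; the group membership follows the geometry described in the text, with the DC coefficient at position (0, 0)):

```python
import numpy as np

def ac_pattern(block):
    """AC-Pattern of a 4x4 DCT block: for each of the three groups
    (H = first row, V = first column, D = main diagonal of AC coefficients),
    compute the group sum S, then the sum of squared differences between
    each coefficient and S."""
    groups = {
        "H": [block[0, 1], block[0, 2], block[0, 3]],   # C1, C2, C3
        "V": [block[1, 0], block[2, 0], block[3, 0]],   # C4, C8, C12
        "D": [block[1, 1], block[2, 2], block[3, 3]],   # C5, C10, C15
    }
    pattern = []
    for coeffs in groups.values():
        s = sum(coeffs)                                    # group sum
        pattern.append(sum((c - s) ** 2 for c in coeffs))  # squared differences
    return tuple(pattern)
```

The AC-Pattern histogram H_AC of an image is then simply the count of each distinct pattern over all of its blocks.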
2.2. DC-Pattern and its histogram
Unlike the AC-Patterns above, which describe local feature information inside each block (intra-block), DC-Patterns integrate more global features by using gradients between each block and its neighbors (inter-block). The DC-DirecVec [14] is defined and used as the feature for DC-Patterns. For a given DC value, the differences between it and the eight neighboring DC values are calculated and arranged to produce the DC-Pattern. Following the same observations as for the AC-Pattern histogram H_AC, we select the dominant DC-Patterns to construct the DC-Pattern histogram H_DC.
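A minimal sketch of these inter-block differences follows (our own simplification; the full DC-DirecVec construction of [14] involves a further arrangement of the differences that we do not reproduce here):

```python
import numpy as np

def dc_patterns(dc_grid):
    """For each interior block of a grid of DC coefficients, return the
    differences between its DC value and the eight neighboring DC values."""
    h, w = dc_grid.shape
    patterns = {}
    # 8-neighborhood offsets, scanned in a fixed order
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            patterns[(i, j)] = tuple(dc_grid[i, j] - dc_grid[i + di, j + dj]
                                     for di, dj in offs)
    return patterns
```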
Fig. 4. Combined histogram of the first 50 AC-Patterns and 50 DC-Patterns with highest frequency of occurrence for the Flower image (Content-Based Image Retrieval).
2.3. Feature descriptor and similarity measurement

For each block, the AC-Pattern is formed from 9 coefficients, and the DC-Pattern is constructed from the DC coefficient of the block itself and those of its 8 neighboring blocks. The concatenation of the AC-Pattern (intra-block features) and DC-Pattern (inter-block features) histograms (Fig. 4) is used to perform image retrieval. In this context, the descriptor is defined as follows:

H = [(1 − α) × H_AC, α × H_DC]   (1)

where α is a weight parameter that controls the relative impact of the AC-Pattern and DC-Pattern histograms.
Many distances have been used to define the similarity of two color histogram representations [17]. In our system, the similarity between the query and the images in the database is assessed by the Chi-squared distance (eq. 2) between their corresponding feature descriptors. The Chi-squared test is used to compare two binned data sets and to determine whether they are drawn from the same distribution function.

Dis(i, j) = Σ_{k=1}^{M} (H_i(k) − H_j(k))² / (H_i(k) + H_j(k))   (2)

where k indexes the components of the descriptors, i and j denote the two descriptors being compared, and M is the dimension of the descriptor.
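Equations (1) and (2) translate directly into code. A sketch (the small eps guard against 0/0 on bins that are empty in both histograms is our addition, not part of the paper's formulation):

```python
import numpy as np

def combined_descriptor(h_ac, h_dc, alpha=0.5):
    """Weighted concatenation of the AC- and DC-Pattern histograms (eq. 1)."""
    return np.concatenate([(1 - alpha) * h_ac, alpha * h_dc])

def chi2_distance(h_i, h_j, eps=1e-12):
    """Chi-squared distance between two descriptors (eq. 2); eps avoids
    division by zero for bins that are empty in both histograms."""
    return float(np.sum((h_i - h_j) ** 2 / (h_i + h_j + eps)))
```

Identical descriptors give a distance of 0; completely disjoint one-hot histograms give the maximal per-bin contribution.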
3. PROPOSED REGION-BASED APPROACH

In classical Content-Based Image Retrieval (CBIR) (Section 2), the blocks are defined independently of the image content. The boundary blocks contain pixels from an object and either from the background or from another object (Fig. 1). Thus, independent retrieval of objects is not possible. Also, the retrieval quality may suffer, since pixels on different sides of the boundary may have different intensity characteristics; by applying the standard DCT to such a block, the spectral properties of these pixels are mixed up, making the search for a good region-to-region correspondence unreliable. In particular, a sharp intensity transition will cause significant spectral oscillations. To alleviate this problem, we propose a new Region-Based Image Retrieval (RBIR) approach that takes into account a prior segmentation of the image into regions, so as to perform similarity comparisons at the granularity of the region, independently of the other regions (or the background). This allows independent retrieval of individual objects, thus permitting new functionalities.
3.1. Principle of the proposed system

First, the system segments every image into two regions, foreground and background, and then extracts low-level texture features of each region by using the DCT and the SA-DCT (see Fig. 5a). During retrieval, the user selects a region of interest from the query image as the query region. It is assumed that each image has at least one dominant region (the main object: the horse in Fig. 6) that expresses its semantics. The system calculates the low-level features of the query region. Then a subset of N images is retrieved from the database. This set consists of those images that contain regions (objects) of the same concept as the query region (object). These N images are ranked according to their distance (eq. 2) to the query image. The low-level features are based on the combined histogram (eq. 1) of AC- and DC-Patterns, for the foreground only or for the full image (foreground + background) (see Fig. 5b). For the full image, the histograms can be combined by concatenating the foreground and background histograms (see Fig. 6). A more general concatenation is defined by applying β (eq. 3), a weight parameter that controls the relative impact of the foreground histogram H_Fore and the background histogram H_Back. In this context, the global descriptor is defined as follows:

H_T = [(1 − β) × H_Fore, β × H_Back]   (3)
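The retrieval loop of this section can be sketched end to end (hypothetical helper and variable names; the descriptors are assumed to be the histograms of eq. 1 or eq. 3, and `chi2` is the distance of eq. 2):

```python
import numpy as np

def chi2(h_i, h_j, eps=1e-12):
    """Chi-squared distance of eq. 2 (eps avoids 0/0 on empty bins)."""
    return float(np.sum((h_i - h_j) ** 2 / (h_i + h_j + eps)))

def retrieve(query_desc, database, n=10):
    """Rank database images (name -> descriptor) by their distance to the
    query descriptor and return the n closest as (name, distance) pairs."""
    ranked = sorted(((name, chi2(query_desc, d)) for name, d in database.items()),
                    key=lambda pair: pair[1])
    return ranked[:n]
```

In the actual system the query descriptor comes from the user-selected region, and the database descriptors are precomputed off-line.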
3.2. Region-Based Image Retrieval using the Shape-Adaptive DCT

After image segmentation with our algorithm [18], we need to extract region features. We propose to apply the shape-adaptive DCT (SA-DCT) [16] to each segment S of the boundary blocks (Fig. 1) of the region of interest, and the classical DCT to the interior blocks of the object. The basic concept of the SA-DCT is to perform vertical 1-D DCTs on the active pixels first, and then to apply horizontal 1-D DCTs to the vertical DCT coefficients with the same frequency index. Fig. 7 illustrates the procedure. The final coefficients of the SA-DCT are located in the upper-left corner of each block. The number of SA-DCT coefficients is identical to the number of active pixels. The most important benefit of the SA-DCT is its capability to adapt to arbitrarily-shaped regions; the method falls
Fig. 5. Illustration of the combined histogram of the AC-Patterns and DC-Patterns: (a) decomposition of foreground and background into boundary blocks (SA-DCT) and interior blocks (DCT); (b) weighted concatenation of the foreground and background histograms.
back to the standard DCT on rectangular image blocks. Despite the lack of a rigorous theoretical justification for the SA-DCT, its performance is surprisingly high and closely approaches that of advanced methods based on basis orthogonalization [19]. Moreover, the SA-DCT can be implemented in real time, whereas the orthogonalization-based approaches are very demanding memory- and CPU-wise. Due to these properties, the SA-DCT algorithm has become a common tool for coding arbitrarily-shaped image regions [20] and, in particular, has been incorporated into MPEG-4 [21]. In this paper, we use a variant of the SA-DCT, called ΔDC-SA-DCT [22]. It improves the performance of the SA-DCT by means of two additional processing steps: extraction of the DC component from the segment S before performing the forward SA-DCT, and a ΔDC correction carried out during the inverse SA-DCT.
Recall that Î_bi is the DCT-transformed block of intensity I_bi. Let P^n_bi be the segment S^n_bi (the n-th segment of boundary block b_i) after the SA-DCT. Note that the shape of P^n_bi is different from that of S^n_bi, due to the executed vertical and horizontal shifts, but the number of pixels is unchanged. Also, let Î^n_bi(u) be an SA-DCT coefficient in P^n_bi at frequency u. To construct the
Fig. 6. Combined histogram of the AC-Patterns and DC-Patterns, with highest frequency of occurrence, for the foreground and background of the Horse image (Region-Based Image Retrieval).
Fig. 7. Illustration of the SA-DCT: (a) arbitrarily-shaped region; (b) vertical alignment followed by vertical 1-D DCTs; (c) horizontal alignment followed by horizontal 1-D DCTs.
AC-Pattern (Section 2.1), we select at most 9 coefficients in each extrapolated segment Ĩ^n_bi, where Ĩ^n_bi is the extrapolated n-th segment of the SA-DCT-transformed block intensity:

Ĩ^n_bi(u) = Î^n_bi(u) if u ∈ P^n_bi, and v otherwise.   (4)

Although various values of v could be used, the to-be-padded coefficients are at higher frequencies; therefore a logical choice, which we adopt here, is to set v to zero (zero padding).
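The SA-DCT procedure of Fig. 7, combined with the zero padding of eq. 4, can be sketched as follows (our own minimal implementation, without the ΔDC extraction of [22]): the active pixels of each column are shifted to the top and transformed with a length-adaptive 1-D DCT, then the rows of the resulting coefficients are shifted left and transformed again; positions outside P^n_bi stay at v = 0.

```python
import numpy as np

def dct1(v):
    """Orthonormal 1-D DCT-II of a length-n vector."""
    n = len(v)
    k = np.arange(n)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C @ v

def sa_dct(block, mask):
    """Shape-adaptive DCT sketch: compact the active pixels (mask == True) of
    each column upward and transform them, then compact each row of vertical
    coefficients leftward and transform again. Padded positions stay zero."""
    n = block.shape[0]
    tmp = np.zeros_like(block, dtype=float)
    tmask = np.zeros_like(mask)
    # vertical pass: per column, shift active pixels to the top and transform
    for j in range(n):
        col = block[mask[:, j], j]
        if len(col):
            tmp[:len(col), j] = dct1(col)
            tmask[:len(col), j] = True
    out = np.zeros_like(tmp)
    # horizontal pass: per row, shift the vertical coefficients left and transform
    for i in range(n):
        row = tmp[i, tmask[i, :]]
        if len(row):
            out[i, :len(row)] = dct1(row)
    return out
```

Because both passes are orthonormal, the number of coefficients equals the number of active pixels, and the energy of the active pixels is preserved; with a full mask the procedure reduces to the standard separable 2-D DCT.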
4. EXPERIMENTAL RESULTS

To test the effectiveness of the proposed system, we have conducted a number of tests on images taken from the widely used COREL database. Corel-1000 is a real-world image database collected by Wang et al. [23], including 1000 images of size 256×384 or 384×256; it is a subset of the Corel database. These images are classified into 10 semantic categories: Africans, buildings, beach, buses, dinosaurs, elephants, flowers, horses, mountains and food. Therefore, the data set has 10 thematically diverse image categories, each containing 100 images. All the images are stored in JPEG format. Images are considered similar if the distance (eq. 2) between their feature descriptors is under a given threshold. The performance can then be evaluated by precision and recall. Precision indicates the retrieval accuracy and is defined as the ratio of the number of relevant retrieved images to the total number of retrieved images. Recall indicates the ability to retrieve relevant images from the database; it is defined as the ratio of the number of relevant retrieved images to the total number of relevant images in the database. Relevant images are those in the same category. In evaluating the effectiveness of the RBIR (DCT + SA-DCT) system in comparison to the CBIR (classical DCT only) system, the one that gives the higher precision value at the same recall value is the more effective system. Seven categories of images are tested. The comparison of the average precision-recall results, on the COREL database, between the proposed Region-Based approach and the conventional Content-Based approach [14] is shown in Fig. 8. The best performance is obtained by using the global combined descriptor (eq. 3) of the foreground and the background together. We have applied a weight of β = 0.33 on the background descriptor and 0.66 on the foreground descriptor. So, the proposed system attempts to overcome the limitation of global-based retrieval systems by emphasizing the target objects and minimizing the
influence of the background. We can conclude, from the experimental results, that the proposal improves the performance on the COREL database. Furthermore, fewer AC coefficients and fewer histogram bins are used, leading to lower computation time.

Fig. 8. The average precision-recall results, on the COREL database, of our Region-Based approach (with mask) and the conventional Content-Based approach (without mask) [14].
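The two evaluation measures used above are straightforward to compute (a sketch; `retrieved` and `relevant` stand for sets of image identifiers):

```python
def precision_recall(retrieved, relevant):
    """Precision = |retrieved ∩ relevant| / |retrieved|;
    recall = |retrieved ∩ relevant| / |relevant|."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

A precision-recall curve like Fig. 8 is obtained by sweeping the retrieval threshold (or the number of returned images) and plotting the resulting (recall, precision) pairs.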
5. CONCLUSION

Building on conventional Content-Based Image Retrieval (CBIR) techniques in the DCT domain, a new Region-Based Image Retrieval (RBIR) system based on the SA-DCT is proposed to improve retrieval accuracy. Consequently, one can retrieve a region without reference to information about the other regions of the image. Clearly, this permits interesting operations such as object-based querying. Since the automatic computation of semantically meaningful objects is extremely difficult, our approach, by exploiting a prior segmentation, can delegate the segmentation to sophisticated, high-performance, and necessarily CPU-intensive algorithms that can be executed off-line. The proposed method has the power of capturing both local and global features, and of making use of both semantic and low-level features. The experimental results indicate its efficiency, high retrieval ratio and lower complexity. In the future, we will enhance our technique for images that have more than one main object. By integrating the properties of all the regions in an image, this may reduce the adverse effect of inaccurate segmentation.
6. REFERENCES
[1] A. W. M. Smeulders, M. Worring, and S. Santini, "Content-based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349–1380, 2000.
[2] R. Datta, D. Joshi, J. Li, and J. Z. Wang, "Image retrieval: Ideas, influences, and trends of the new age," ACM Comput. Surv., vol. 40, no. 2, pp. 5:1–60, 2008.
[3] Y. T. Wu, F. Y. Shih, J. Shi, and Y. T. Wu, "A top-down region dividing approach for image segmentation," Pattern Recognition, vol. 41, no. 1, pp. 1948–1960, 2008.
[4] J. Z. Wang, J. Li, and G. Wiederhold, "SIMPLIcity: Semantics-sensitive integrated matching for picture libraries," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 9, pp. 947–963, 2001.
[5] Y. Liu, D. S. Zhang, G. Lu, and W. Y. Ma, "A survey of content-based image retrieval with high-level semantics," Pattern Recognition, vol. 40, no. 1, pp. 262–282, 2007.
[6] Y. Liu, D. S. Zhang, and G. Lu, "Region-based image retrieval with high-level semantics using decision tree learning," Pattern Recognition, vol. 41, no. 1, pp. 2554–2570, 2008.
[7] C. Chiang, Y. Hung, H. Yang, and G. Lee, "Region-based image retrieval using color-size of watershed regions," Journal of Visual Communications and Image Representation, vol. 20, pp. 167–177, 2009.
[8] F. Jing, H. Zhang, and B. Zhang, "An efficient and effective region-based image retrieval framework," IEEE Transactions on Image Processing, vol. 13, no. 5, pp. 699–709, 2004.
[9] A. Belalia, K. Belloulata, and K. Kpalma, "Region-based image retrieval in the compressed domain using shape-adaptive DCT," Multimed Tools Appl, vol. 75, no. 17, pp. 10175–10199, 2016.
[10] L. Belhallouche, K. Belloulata, and K. Kpalma, "A new approach to region based image retrieval using shape adaptive discrete wavelet transform," International Journal of Image, Graphics and Signal Processing (IJIGSP), vol. 8, no. 1, pp. 1–14, 2016.
[11] G. Feng and J. Jiang, "JPEG compressed image retrieval via statistical features," Pattern Recognition, vol. 36, no. 4, pp. 977–985, 2003.
[12] C. Chang, J. Chuang, and Y. Hu, "Retrieving digital images from a JPEG compressed image database," Image and Vision Computing, vol. 22, no. 6, pp. 471–484, 2004.
[13] D. Edmundson, G. Schaefer, and M. Celebi, "Robust texture retrieval of compressed images," in Proceedings ICIP-12 (IEEE International Conference on Image Processing), Oct. 2012, vol. IV, pp. 2421–2424.
[14] D. Zhong and I. Defee, "DCT histogram optimization for image database retrieval," Pattern Recognition Letters, vol. 26, no. 14, pp. 2272–2281, 2005.
[15] K. Belloulata, L. Belhallouche, A. Belalia, and K. Kpalma, "Region based image retrieval using shape-adaptive DCT," in Proceedings ChinaSIP-14 (2nd IEEE China Summit and International Conference on Signal and Information Processing), July 2014, pp. 470–474.
[16] T. Sikora and B. Makai, "Shape-adaptive DCT for generic coding of video," IEEE Transactions on Circuits and Systems for Video Technology, vol. 5, no. 1, pp. 59–62, Feb. 1995.
[17] A. Ferman, M. Tekalp, and R. Mehrotra, "Robust color histogram descriptors for video segment retrieval and identification," IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, no. 5, pp. 497–508, 2002.
[18] W. Zou, K. Kpalma, and J. Ronsin, "Semantic segmentation via sparse coding over hierarchical regions," in Proceedings ICIP-12 (IEEE International Conference on Image Processing), Sept. 2012, pp. 2577–2580.
[19] M. Gilge, T. Engelhardt, and R. Mehlan, "Coding of arbitrarily shaped image segments based on a generalized orthogonal transform," Signal Processing: Image Communication, vol. 1, no. 2, pp. 153–180, Oct. 1989.
[20] K. Belloulata and J. Konrad, "Fractal image compression with region-based functionality," IEEE Transactions on Image Processing, vol. 11, no. 4, pp. 351–362, 2002.
[21] ISO/IEC JTC1/SC29/WG11, "MPEG-4 Version 2 Visual Working Draft Revision 2.0," Feb. 1998.
[22] P. Kauff and K. Schüür, "Shape-adaptive DCT with block-based DC separation and ΔDC correction," IEEE Transactions on Circuits and Systems for Video Technology, vol. 8, no. 3, pp. 237–242, June 1998.
[23] "http://wang.ist.psu.edu/~jwang/test1.tar," last accessed Jan. 2013.