Using Co-occurrence and Granulometry Features for Content Based Image Retrieval

This communication presents a novel system for Content Based Image Retrieval (CBIR) using Granulometry and Color Co-occurrence Features (CCF). These features are extracted directly from images using a visual codebook. Relative distance measures are used to quantify the similarity between the stored images and the query image. Results show that the proposed method of using Granulometry and CCF is superior to most state-of-the-art CBIR systems. The proposed system is tested on the Wang image database, which contains 1000 images in different categories. The performance of the system, quantified using the Average Precision Rate (APR), is very encouraging.


Introduction
Progress in computer network technology, combined with relatively cheap high-volume data storage devices, has resulted in huge growth in the number of digital images. This in turn increases the complexity and time of searching for a desired image in the data. There are two types of image retrieval system. The first is text-based, where images are retrieved based on keywords, tags, etc. A tag cannot describe the content of an image in a single word, so a search retrieves images that have similar tags but may have different content. The second type, content-based retrieval, uses the content of images to represent and access them. The term "content" may refer to colors, shapes, texture or any other information that can be derived from the image itself. Certain features are extracted to characterize the images, and the memory occupied by these features is much less than the size of the image data. A great deal of research has been carried out over the past few decades [1] [2] to address the problem of Content Based Image Retrieval. Images are retrieved from the database according to evaluation criteria provided by the user; for example, similarity of image content, similarity of edge patterns and similarity of image color have been used for evaluation. The process works by accessing, browsing and retrieving similar images from the image database in real-time applications.
The authors in [3] mention various approaches developed for capturing image information. One approach involves constructing the image in the DCT domain, as in [4]. In another approach, an image description scheme such as the MPEG-7 Visual Content Descriptor (VCD) is required to describe the visual content of images [5]. The MPEG-7 VCD also includes other descriptors: the Color Descriptor (CD), the Texture Descriptor (TD) and the Shape Descriptor (SD). In distributed systems this standard offers a great advantage, in the sense that the user can remotely modify and recalculate the content descriptor instead of transferring images between locations. In [6], the authors discuss a CBIR approach in which the different scene categories of images are identified from an enormous database. Spatial pyramids and orderless bag-of-features image representations are the two features exploited for identification. This approach proved better than the existing approaches, and its results are encouraging in terms of natural scene classification. The approach in [7] stood out in scene categorization because the integrated spatial envelope representation describes a scene image with very low dimensionality. In [6], the authors proposed another method that captured images with much lower dimensionality than earlier approaches. To achieve the desired results, the concepts of over-completeness and field design are exploited in the image classification process.
A CBIR system can handle compressed image data efficiently: the feature extractor generates an image feature without performing any decompression. In Block Truncation Coding (BTC), an input image (the query image) is divided into several blocks, which are subsequently quantized so that the first and second statistical moments of the original image are maintained; the scheme comprises both an encoding and a decoding stage. The two extreme color quantizers used in CCF, namely the low and high quantizers, are produced at the end of BTC encoding. BTC decoding performs the reverse procedure, replacing the bitmap information with the high and low quantizers, and needs no auxiliary information. The image quality remains acceptable at the end of the process, and entropy coding maintains the size of the data stream. The authors in [8] demonstrate an example of a CBIR system developed using BTC. The method used by [1] corresponds to BTC, where each image block has a bitmap image and two quantized values, namely low and high. The results of the approach in [9] were shown to be better than all earlier approaches at that time. The BTC-based image retrieval system was later improved in [2] and [5], where the RGB color space is used for extraction of the image feature descriptor. With BTC encoding, the bit pattern codebook and histogram are derived from each color channel separately.
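As an illustration of the moment-preserving quantization mentioned above, the following is a minimal sketch of classic BTC applied to a single grayscale block. It is not the implementation used in this work: the function names `btc_encode_block` and `btc_decode_block`, the flat list representation of a block and the thresholding at the block mean are assumptions made for the example.

```python
import math
from typing import List, Tuple


def btc_encode_block(block: List[float]) -> Tuple[float, float, List[int]]:
    """Encode one block as (low quantizer, high quantizer, bitmap).

    The two quantizers are chosen so that the decoded block keeps the
    mean and variance of the original block (classic BTC, assumed here).
    """
    n = len(block)
    mean = sum(block) / n
    std = math.sqrt(sum((p - mean) ** 2 for p in block) / n)
    bitmap = [1 if p >= mean else 0 for p in block]
    q = sum(bitmap)  # number of pixels at or above the mean
    if q == 0 or q == n:
        # Flat block (possibly after rounding): one level is enough.
        return mean, mean, bitmap
    low = mean - std * math.sqrt(q / (n - q))
    high = mean + std * math.sqrt((n - q) / q)
    return low, high, bitmap


def btc_decode_block(low: float, high: float, bitmap: List[int]) -> List[float]:
    """Replace each bitmap entry with the high or the low quantizer."""
    return [high if bit else low for bit in bitmap]


# Hypothetical 2x2 block, flattened row by row.
low, high, bitmap = btc_encode_block([2, 4, 4, 6])
decoded = btc_decode_block(low, high, bitmap)
decoded_mean = sum(decoded) / len(decoded)
decoded_var = sum((p - decoded_mean) ** 2 for p in decoded) / len(decoded)
```

For this block the decoded values have the same mean (4) and variance (2) as the original pixels, up to floating-point rounding, which is the property the text refers to as maintaining the first and second moments.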
Stability and simplicity are the two most important reasons why the BTC technique is preferred over others, and it has inspired further enhancements discussed in the literature [7,8]. The BTC and Halftoning-based Block Truncation Coding (HBTC) techniques are suitable for applications requiring real-time implementation owing to their computational simplicity. HBTC uses a halftone image rather than the bitmap image used in BTC. The two quantizers of an image block in HBTC are the minimum and maximum values found in that block. In dithering-based BTC, a dithering approach termed void-and-cluster halftoning is used to generate the bit pattern configuration of the bitmap image. To generate quality images, [10] exploits the Human Visual System (HVS). Ordered-Dither Block Truncation Coding (ODBTC) is one such dither-based approach. Applications that require privacy and ownership protection, such as watermarking schemes, use ODBTC owing to its efficiency, low computational complexity and granulometry feature. In this work, we propose a new and efficient CBIR system based on CCF and Granulometry features. The paper is organized as follows: the next section describes the methodology of the proposed system, followed by the detailed experiments and results in section 3. Lastly, we present a brief conclusion and future perspectives.

Methodology
The proposed CBIR method is briefly described in this section. We show how features are extracted from the images and how, on the basis of those features, the similarity of two images is measured. Figure 1 shows the complete procedure of the proposed methodology. Two types of features, namely CCF and Granulometry, are extracted from the images; together they capture the texture and color information of an image. CCF and Granulometry are combined into a single feature vector and stored in a feature vector database. When a query image is presented to the system, its feature vector is calculated and compared against every feature vector in the database, and images are retrieved on the basis of the similarity measure. The number of retrieved images is chosen by the user: if the user wants to retrieve 20 images, only the 20 most similar images are retrieved.
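The overall flow described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the names `combine_features`, `retrieve` and `l1_distance` are hypothetical, the feature values are made up, and a plain L1 distance stands in for the similarity measure of the proposed system.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]


def combine_features(ccf: Sequence[float], granulometry: Sequence[float]) -> Vector:
    """Concatenate the two descriptors into a single feature vector."""
    return list(ccf) + list(granulometry)


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Placeholder distance (assumption), used only to make the sketch run."""
    return sum(abs(x - y) for x, y in zip(a, b))


def retrieve(
    query: Vector,
    database: Dict[str, Vector],
    k: int,
    distance: Callable[[Sequence[float], Sequence[float]], float] = l1_distance,
) -> List[Tuple[str, float]]:
    """Return the k database entries closest to the query, nearest first."""
    scored = [(name, distance(query, vec)) for name, vec in database.items()]
    scored.sort(key=lambda item: item[1])
    return scored[:k]


# Hypothetical feature vector database with three images.
database = {
    "img_a": combine_features([0.5, 0.5], [0.1, 0.9]),
    "img_b": combine_features([0.9, 0.1], [0.8, 0.2]),
    "img_c": combine_features([0.6, 0.4], [0.2, 0.8]),
}
query = combine_features([0.5, 0.5], [0.1, 0.9])
top2 = retrieve(query, database, k=2)
```

Here `k` plays the role of the user-chosen number of images to retrieve; with `k=2` only the two nearest database entries are returned.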

Color Co-Occurrence Feature
Pixels are the basic units of an image and carry its color and content information. Characteristics of an image such as its color distribution can be acquired by means of color co-occurrence: by calculating the occurrence probability of a pixel, specific color information can be retrieved. The color co-occurrence matrix, calculated from the image, is another way of representing its spatial information, and the color co-occurrence feature is in turn calculated from this matrix, as shown in figure 2. Two color quantizers are used in the computation of the CCF. A color codebook is used to index the maximum and minimum quantizers. The CCF is computed from the color co-occurrence matrix, which is calculated using the color codebook. The codebook is generated from training vectors using LBG vector quantization (LBGVQ). Let the minimum and maximum quantizers be $R_{\min}$ and $R_{\max}$; then the code word generated by the minimum quantizer is given in equation 1 and that generated by the maximum quantizer in equation 2, on the basis of the conditions given in equation 3. The CCF is computed from the color co-occurrence matrix using equation 4.
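A minimal sketch of this computation is given below. It assumes that each image block contributes one minimum and one maximum quantizer color, that each color is indexed by its nearest codeword, and that the CCF is the normalized count of (minimum index, maximum index) pairs. The function names and the two-entry codebook are hypothetical, and the codebook is fixed by hand here rather than trained with vector quantization.

```python
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]


def nearest_codeword(color: Color, codebook: Sequence[Color]) -> int:
    """Index of the codeword with the smallest squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - w) ** 2 for c, w in zip(color, codebook[i])),
    )


def color_cooccurrence_feature(
    min_quantizers: Sequence[Color],
    max_quantizers: Sequence[Color],
    codebook: Sequence[Color],
) -> List[float]:
    """Flattened, normalized co-occurrence matrix of codeword index pairs."""
    if not min_quantizers or len(min_quantizers) != len(max_quantizers):
        raise ValueError("need one min and one max quantizer per block")
    n = len(codebook)
    counts = [[0] * n for _ in range(n)]
    for lo, hi in zip(min_quantizers, max_quantizers):
        counts[nearest_codeword(lo, codebook)][nearest_codeword(hi, codebook)] += 1
    total = len(min_quantizers)
    # Row-major flattening: entry i * n + j is the relative frequency of (i, j).
    return [counts[i][j] / total for i in range(n) for j in range(n)]


# Hypothetical two-color codebook (dark, bright) and four image blocks.
codebook = [(0, 0, 0), (255, 255, 255)]
min_q = [(10, 10, 10), (20, 0, 5), (200, 220, 210), (0, 0, 0)]
max_q = [(250, 250, 250), (30, 30, 30), (255, 240, 250), (240, 255, 255)]
ccf = color_cooccurrence_feature(min_q, max_q, codebook)
```

For a codebook of size N the feature has N × N entries that sum to one, so it can be compared across images of different sizes.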

Granulometry Feature
Mathematical morphology theory provides many tools for image analysis and processing [5]. It offers two basic types of transformation, morphological opening and closing. Both are defined with respect to a mask of a given shape and size, known as the "structuring element". These methods are used in many image processing and analysis tasks such as edge detection, segmentation, restoration, shape analysis and enhancement [3]. The concept of granulometry was first introduced by G. Matheron in the study of porous materials. A granulometry is built on an image transformation that depends on a parameter.
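One common way to compute a granulometry, sketched below, is to apply morphological openings with structuring elements of increasing size and record how much image content each step removes (often called the pattern spectrum). The text does not specify these details, so the square structuring element, the clipping of windows at the image border and the function names are all assumptions made for illustration.

```python
from typing import Callable, List

Image = List[List[int]]


def _window_filter(img: Image, radius: int, reduce_fn: Callable) -> Image:
    """Apply reduce_fn (min or max) over a square window, clipped at borders."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            values = [
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            row.append(reduce_fn(values))
        out.append(row)
    return out


def opening(img: Image, radius: int) -> Image:
    """Morphological opening: erosion (min) followed by dilation (max)."""
    return _window_filter(_window_filter(img, radius, min), radius, max)


def pattern_spectrum(img: Image, max_radius: int) -> List[int]:
    """Image volume removed when the window radius grows from r to r + 1."""
    volumes = [sum(map(sum, opening(img, r))) for r in range(max_radius + 1)]
    return [volumes[r] - volumes[r + 1] for r in range(max_radius)]


# Hypothetical 7x7 binary image: one 3x3 square and one isolated pixel.
image = [[0] * 7 for _ in range(7)]
for y in range(1, 4):
    for x in range(1, 4):
        image[y][x] = 1
image[5][5] = 1

spectrum = pattern_spectrum(image, max_radius=2)
```

In this toy image the isolated pixel disappears under the 3x3 opening and the 3x3 square disappears under the 5x5 opening, so the spectrum reflects the size distribution of the bright structures.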

Similarity Measure
Relative distance measures are used to measure the similarity between the database images and the query image [10]. In this method, two types of features are extracted from the images, i.e. color co-occurrence features (CCF) and granulometry features. A set of similar images is retrieved from the image database and indexed according to the calculated similarity measure: the lowest distance indicates the database image most similar to the query image presented to the algorithm, and vice versa. The similarity between the query image feature vector (the combination of CCF and granulometry) and a database feature vector is calculated using equation 5.

Where,
The distances are sorted in ascending order, as shown in figure 3, and a user-specified number of similar images is then retrieved, i.e. the images with the smallest distance to the query image are considered nearest.
Figure 3 shows the distance from the query image to all images in the database, of which there are 1000 in this case. The distance 0 corresponds to the query image itself, because its feature vector is also in the feature database.
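A relative distance of the kind referred to in this section can be sketched as below. The exact form is an assumption: each absolute difference is divided by the sum of the two feature values, the CCF and granulometry parts are weighted by hypothetical factors `alpha_ccf` and `alpha_gran`, and a small `eps` avoids division by zero. Non-negative feature values are assumed.

```python
from typing import Sequence


def relative_distance(q: Sequence[float], t: Sequence[float], eps: float = 1e-12) -> float:
    """Sum of element-wise differences, each scaled by the size of the pair."""
    return sum(abs(a - b) / (a + b + eps) for a, b in zip(q, t))


def combined_distance(
    q_ccf: Sequence[float],
    q_gran: Sequence[float],
    t_ccf: Sequence[float],
    t_gran: Sequence[float],
    alpha_ccf: float = 1.0,   # hypothetical weight of the CCF part
    alpha_gran: float = 1.0,  # hypothetical weight of the granulometry part
) -> float:
    """Weighted sum of the relative distances of the two feature types."""
    return alpha_ccf * relative_distance(q_ccf, t_ccf) + alpha_gran * relative_distance(
        q_gran, t_gran
    )


# Hypothetical feature values for a query image and one database image.
q_ccf, q_gran = [0.25, 0.5, 0.0, 0.25], [1.0, 9.0]
t_ccf, t_gran = [0.5, 0.25, 0.25, 0.0], [3.0, 7.0]

self_distance = combined_distance(q_ccf, q_gran, q_ccf, q_gran)
other_distance = combined_distance(q_ccf, q_gran, t_ccf, t_gran)
```

Comparing an image with itself gives a distance of exactly zero, which matches the observation that the query image appears at distance 0 when its own feature vector is in the database.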

Experimental Setup and Results
A number of experiments were performed to evaluate the performance of the system. The two proposed features were extracted and stored in a feature vector; in the same way, the feature vectors of all images were computed and stored in a feature vector database. When a query image is presented to the system, its feature vector is computed and the distance from that vector to every feature vector in the database is calculated. Images similar to the query are retrieved, as shown in figure 5, and indexed on the basis of their similarity distance from the query image. Every image in the database is presented to the system as a query image in turn, and the average precision rate is calculated.

Performance Evaluation
The performance of the proposed method is evaluated using the Average Precision Rate (APR). During the performance evaluation, every image is used as a query image in turn, and the average of all precisions is computed and tabulated in table 1; the overall APR is calculated by averaging over all categories. The metric used for performance evaluation, i.e. the APR, is defined following [2] in equation 10.
A larger APR value indicates better performance of the retrieval system.
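The evaluation procedure can be sketched as follows, assuming the usual definition in which the precision of one query is the fraction of retrieved images that belong to the same category as the query, and the APR is the mean precision over all queries. The function names and the example labels are hypothetical.

```python
from typing import List, Sequence, Tuple


def precision(query_label: str, retrieved_labels: Sequence[str]) -> float:
    """Fraction of retrieved images that share the query's category."""
    relevant = sum(1 for label in retrieved_labels if label == query_label)
    return relevant / len(retrieved_labels)


def average_precision_rate(results: Sequence[Tuple[str, List[str]]]) -> float:
    """Mean precision over all (query label, retrieved labels) pairs."""
    return sum(precision(q, r) for q, r in results) / len(results)


# Hypothetical retrieval results for two queries.
results = [
    ("horses", ["horses", "horses", "bus", "horses"]),
    ("bus", ["bus", "food"]),
]
apr = average_precision_rate(results)
```

In this toy case the two precisions are 0.75 and 0.5, so the APR is 0.625.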

Database
The Wang image database [11] consists of a total of 1000 images, classified into ten categories. Sample images from the database are shown in figure 4. It consists of images of food, mountains, elephants, buildings, flowers, beaches, dinosaurs, people, horses and buses.

Conclusion
In this paper, a content based image retrieval system is proposed using color co-occurrence and granulometry features. Inter-class similarity and intra-class variability remain challenging issues that need to be addressed. The proposed method is simple yet effective and, as discussed in the results, gives better results than other state-of-the-art methods in terms of average precision rate. In future work, this technique can be extended to content based video retrieval as well.