Improved JPEG image watermarking in data compression domain using block selection strategy

Nowadays, the joint photographic experts group (JPEG) format is still the most dominant among the digital image formats used in daily life. Therefore, watermarking JPEG images is important and useful for many applications such as archive management and image authentication. Along with the embedding capacity and visual quality that must be considered for uncompressed images, the storage size of the watermarked JPEG image should also be considered. In this paper, based on the philosophy behind the JPEG encoder and the statistical properties of discrete cosine transform (DCT) coefficients, we propose a new JPEG image watermarking scheme in the data compression domain, in which only coefficients with values 1 and -1 are expanded to carry message bits, coefficients with values 2 and -2 are shifted, and all other coefficients remain unchanged. Moreover, a block selection strategy based on the number of 1 and -1 coefficients in each 8×8 block is proposed, which can be used to adaptively choose DCT coefficients for data embedding. Experimental results demonstrate that the proposed method achieves high embedding capacity and good visual quality. Furthermore, the storage size increase of the watermarked JPEG image is kept small.


Introduction
Image watermarking is the process of embedding a hidden stream of bits in a digital image. Digital watermarking has many applications such as broadcast monitoring, device control, owner identification, proof of ownership, transaction tracking, content authentication, and copy control [1]. The main requirements of a watermarking system are robustness, imperceptibility, and capacity. There is a trade-off between these factors, and the importance of each depends on the application.
The algorithms found in the literature can be divided according to the domain of insertion into three categories:
• Watermarking in Spatial Domain
• Watermarking in Spectral Domain
• Watermarking in Data Compression Domain
Algorithms belonging to the last category have recently become of great interest. Secret data embedded in the compressed image would not be easily found and people would not suspect that there is something important hidden in the image.
Furthermore, among the digital image formats used in daily life, the joint photographic experts group (JPEG) format is the most popular, and it is also one of the oldest. Most current media broadcasting corporations and digital devices use JPEG image compression to store information in graphic form. Reversible data hiding in the JPEG compression domain is useful for image archive management, image authentication, and image privacy.
As the JPEG image format is the most used compressed format, many related algorithms have been developed for hiding information in JPEG images. Upham [2] proposed a famous hiding tool for JPEG images named Jpeg-Jsteg, where the hidden data are embedded into the least significant bits (LSBs) of the quantized DCT coefficients whose values are not 0, 1 or -1. Fridrich et al. [3] developed a new idea to losslessly compress the LSB plane of some selected JPEG mode coefficients to make space for reversible data embedding. Chang et al. [4] proposed a lossless
Many algorithms for JPEG image watermarking have been proposed in the past few years. However, improving the embedding capacity, visual quality, and file size preservation remains an active field of research. In this paper, some new insights on how to select DCT coefficients for data hiding are analysed first. Then, on the basis of these analyses, we present a new scheme for JPEG image watermarking. The main differences between our method and the previously proposed ones are as follows:

EAI Endorsed Transactions
(i) The secret message bits are only embedded in AC coefficients with values 1 and -1.
(ii) AC coefficients with values 2 and -2 are shifted.
(iii) All other coefficients remain unchanged in the embedding process.
(iv) A novel block selection strategy based on the number of AC coefficients with values 1 and -1 in each 8×8 block is proposed.
Experimental results demonstrate that by using the proposed scheme, high embedding capacity and good visual quality can be easily obtained. Meanwhile, the storage size of the host JPEG file is well preserved. The remainder of this paper is organized as follows. In Section 2, the proposed JPEG image watermarking in the data compression domain is introduced. Experimental results and comparison with previous works are discussed in Section 3. Finally, we conclude in Section 4.

Proposed Scheme
In this section, an overview of JPEG compression is discussed first, and then, we introduce the proposed JPEG image watermarking in the data compression domain. In subsection 2.2, the embedding of the AC coefficients will be explained. Subsection 2.3 explains the block selection method. Subsection 2.4 explains the recovery of the original DCT coefficients and the extraction of the payload.

Overview of JPEG Compression
JPEG compression is an algorithm developed by the Joint Photographic Experts Group with the aim of minimizing the file size of photographic image files. The key steps of the JPEG compression process [5,6] are shown in Fig. 1. The legacy JPEG encoder consists of three parts: discrete cosine transform (DCT), quantization, and entropy encoding. The original image is divided into non-overlapping 8×8 pixel blocks; each block is level-shifted by subtracting 128 (for 8-bit samples), transformed using the two-dimensional DCT, quantized, and finally entropy coded using Huffman tables. In the quantization step, each DCT coefficient is divided by the corresponding entry of the quantization table and rounded to the nearest integer as follows:
$$D(u,v) = \mathrm{round}\left(\frac{F(u,v)}{Q(u,v)}\right)$$
where F(u,v) represents the original DCT coefficient, Q(u,v) represents the corresponding value in the quantization table, and D(u,v) represents the quantized DCT coefficient. The JPEG standard does not mandate a particular quantization table, but it provides a recommended table in its annex. This means that anyone can use their own quantization table, typically by scaling the elements of the recommended table according to a quality factor [5,6]. Different quantization tables yield different image qualities and compression ratios. The recommended quantization table is shown in Table 1. The quality factor (QF) is a value ranging from 1 to 100, and the scaled quantization table Qs is obtained from the recommended quantization table value Q and a scale factor SF that depends on QF, where [x] represents the round operator of x.
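The quantization and table-scaling steps can be illustrated with a short Python sketch. It assumes that Table 1 is the luminance table recommended in the JPEG standard, and it uses the scaling convention of the widely used IJG libjpeg implementation (SF = 5000/QF for QF < 50, SF = 200 − 2·QF otherwise, and Qs = ⌊(Q·SF + 50)/100⌋ clamped to at least 1). Whether this is exactly the equation intended above is an assumption; the function names are illustrative only.

```python
import math

# Assumed content of Table 1: the recommended luminance quantization table.
Q_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def scale_factor(qf):
    """Scale factor SF for a quality factor QF in 1..100 (IJG convention, assumed)."""
    qf = min(max(qf, 1), 100)
    return 5000 // qf if qf < 50 else 200 - 2 * qf


def scaled_table(qf, base=Q_LUMA):
    """Scaled table Qs: each entry is (Q*SF + 50) // 100, clamped to 1..255."""
    sf = scale_factor(qf)
    return [[min(max((q * sf + 50) // 100, 1), 255) for q in row] for row in base]


def quantize(f, q):
    """D = round(F / Q), rounding halves away from zero."""
    x = f / q
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


if __name__ == "__main__":
    qs = scaled_table(70)
    print(qs[0])                  # first row of the table scaled for QF = 70
    print(quantize(-415.38, 16))  # a large negative DCT value quantized with Q = 16
```

With this convention QF = 50 reproduces the recommended table unchanged, while higher quality factors shrink the table entries and therefore produce more nonzero (and more ±1) quantized coefficients.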
The quantized DCT coefficients are arranged in a zigzag scanning order. The very first coefficient in a quantized coefficient vector is referred to as the direct current (DC) coefficient, whereas the 63 others are referred to as the alternating current (AC) coefficients. The AC coefficients are encoded in a specific run-length encoding (RLE) format as intermediate symbols (R/C, V). Here, R (0 ≤ R ≤ 15) denotes the zero run length (i.e., the number of zero AC coefficients before the next nonzero AC coefficient), C (1 ≤ C ≤ 10) denotes the category of the next nonzero AC coefficient (i.e., the number of bits needed to represent its amplitude), as shown in Table 2, and V denotes the value of that coefficient. There are 160 combinations of R and C. Each R/C is encoded with a variable-length code (VLC) from the Huffman table; the code length ranges from 1 to 16 bits. In addition, because the run length of zero coefficients may exceed 15, the value R/C = F0H is defined to represent a run length of 15 zero coefficients followed by a coefficient of zero amplitude (this can be interpreted as a run length of 16 zero coefficients), and a special value R/C = 00H is used to code the end-of-block (EOB) when all remaining coefficients in the block are zero, where "H" indicates that "F0" and "00" are hexadecimal numbers. Thus, there are in total 162 different VLC codes in the standard Huffman table.
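A minimal Python sketch of how the 63 AC coefficients of one block (already in zigzag order) turn into intermediate symbols may help here. The tuple layout (R, C, V) and the function names are illustrative assumptions; Huffman coding of the R/C part is not shown.

```python
def category(v):
    """Category C: number of bits needed to represent |v| (0 for v == 0)."""
    return abs(v).bit_length()


def rle_ac(ac):
    """Turn AC coefficients (zigzag order) into (R, C, V) symbols.

    (15, 0, None) stands for R/C = F0H (sixteen zeros) and
    (0, 0, None) stands for R/C = 00H (end of block).
    """
    symbols = []
    # Index of the last nonzero coefficient; everything after it is covered by EOB.
    last_nz = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    run = 0
    for v in ac[: last_nz + 1]:
        if v == 0:
            run += 1
            if run == 16:
                symbols.append((15, 0, None))
                run = 0
        else:
            symbols.append((run, category(v), v))
            run = 0
    if last_nz < len(ac) - 1:
        symbols.append((0, 0, None))
    return symbols


if __name__ == "__main__":
    ac = [5, 0, 0, -1] + [0] * 59
    for symbol in rle_ac(ac):
        print(symbol)
```

Note that changing a nonzero coefficient to another nonzero value keeps the run lengths intact; only the category C and the value V of that symbol can change.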
Note that all R/C values are encoded with variable-length Huffman codes, but the V values are not Huffman coded. Each V value is encoded with a variable-length integer (VLI) code, whose length in bits is given in the first column of Table 2. They are appended to the Huffman-coded R/C to form the final JPEG bitstream.
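The VLI code can be sketched in a few lines of Python. The sketch follows the usual JPEG convention, in which a positive value is written as its C-bit binary representation and a negative value v is written as the C low-order bits of v − 1; the function names are illustrative assumptions.

```python
def category(v):
    """Number of bits needed to represent |v|."""
    return abs(v).bit_length()


def vli_bits(v):
    """VLI code of a nonzero coefficient value as a bit string."""
    c = category(v)
    if c == 0:
        return ""  # zero coefficients carry no VLI bits
    if v < 0:
        v += (1 << c) - 1  # same as keeping the C low-order bits of v - 1
    return format(v, "0{}b".format(c))


if __name__ == "__main__":
    for v in (1, -1, 2, -2, 3, -3):
        print(v, category(v), vli_bits(v))
```

Under this coding, a coefficient changed from ±1 to ±2 moves from category 1 to category 2, so its VLI code grows by one bit and its R/C symbol changes, whereas a coefficient changed from ±2 to ±3 stays in category 2.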
The DC coefficients are coded differently, using differential pulse code modulation (DPCM): they are coded as (Size, Value) pairs. The code for the Size is derived from Table 3 and the code for the Value is simply its binary code. The DC value of each 8×8 block is large and varies across blocks but is often close to that of the previous block. DPCM therefore encodes the difference between the DC coefficients of the current and the previous 8×8 block as follows:
$$\mathrm{Diff}_i = DC_i - DC_{i-1}$$
where $\mathrm{Diff}_i$ is the DC differential value of the i-th block, and $DC_i$ and $DC_{i-1}$ are the quantized DC coefficients of the current and previous blocks.
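As a small illustration, the DPCM step can be sketched in Python as below. The predictor is assumed to start at 0 for the first block, and Size is taken as the number of bits needed for the magnitude of the difference; the function names are illustrative only.

```python
def dpcm_encode(dc_values):
    """Return a (Size, Diff) pair for each block's quantized DC coefficient."""
    pairs = []
    prev = 0  # predictor for the first block (assumed to be 0)
    for dc in dc_values:
        diff = dc - prev
        pairs.append((abs(diff).bit_length(), diff))
        prev = dc
    return pairs


def dpcm_decode(pairs):
    """Rebuild the DC coefficients from the (Size, Diff) pairs."""
    values = []
    prev = 0
    for _size, diff in pairs:
        prev += diff
        values.append(prev)
    return values


if __name__ == "__main__":
    dcs = [52, 55, 50]
    pairs = dpcm_encode(dcs)
    print(pairs)
    print(dpcm_decode(pairs) == dcs)
```

Because neighbouring blocks usually have similar DC values, the differences are small and fall into low Size categories.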
Finally, the symbol string is Huffman coded to obtain the final compressed bitstream. After pre-pending the header, we obtain the final JPEG file. For more detail, please refer to the JPEG guidelines released by International Telecommunication Union [7].

Selection of Coefficients for Embedding
The very first coefficient in a quantized coefficient block is referred to as the DC coefficient, whereas the 63 others are referred to as AC coefficients. The proposed method embeds the payload only in the AC coefficients valued either +1 or −1. For each AC coefficient C, the following rule is used to embed a payload bit b ∈ {0,1} and create the watermarked AC coefficient C′:
$$C' = \begin{cases} C + \mathrm{sign}(C)\cdot b, & |C| = 1 \\ C + \mathrm{sign}(C), & |C| = 2 \\ C, & \text{otherwise} \end{cases}$$
where
$$\mathrm{sign}(x) = \begin{cases} 1, & x > 0 \\ 0, & x = 0 \\ -1, & x < 0 \end{cases}$$
A bit b is embedded if and only if C is either +1 or −1. If C is equal to +2 or −2, it is shifted by +1 or −1, respectively. Notice that all other coefficients, including the zero coefficients, are not modified, as explained in the beginning of this section. Before we continue with the rest of the explanation, the following histogram shifting terms are summarized for clarity:
• Embeddable coefficients: AC coefficients valued either +1 or −1.
• Shiftable coefficients: AC coefficients valued either +2 or −2.
• Unchangeable coefficients: all other AC coefficients.
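The embedding rule described in this subsection can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the input is assumed to be the AC coefficients of one block, and once the payload is exhausted the remaining ±1 coefficients are assumed to be left as they are.

```python
def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def embed_block(ac, bits):
    """Embed payload bits into the AC coefficients of one block.

    Returns (watermarked coefficients, number of payload bits consumed).
    """
    out = []
    used = 0
    for c in ac:
        if abs(c) == 1 and used < len(bits):
            # Embeddable coefficient: expand by the payload bit.
            out.append(c + sign(c) * bits[used])
            used += 1
        elif abs(c) == 2:
            # Shiftable coefficient: move away from zero by one.
            out.append(c + sign(c))
        else:
            # Zero, |c| >= 3, or payload exhausted (assumption): unchanged.
            out.append(c)
    return out, used


if __name__ == "__main__":
    marked, used = embed_block([1, -1, 2, -2, 0, 3, -1], [1, 0, 1])
    print(marked, used)
```

A bit 0 leaves a ±1 coefficient untouched, a bit 1 turns it into ±2, and every original ±2 becomes ±3, so the number of embeddable coefficients in a block is also its capacity in bits.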

Block selection strategy
We propose two block selection strategies: the first is based on the number of AC coefficients valued either +1 or −1 in each block, and the second is based on the number of zero-valued AC coefficients. A block with a higher number of those coefficients is placed at the top with the highest embedding priority, and a block with fewer of those coefficients is placed at the bottom with the lowest embedding priority. Using this statistical feature, embedding only in +1 and −1 coefficients, starting with the selected blocks, effectively reduces the distortion in the watermarked image. Additionally, the file size increase is lessened. Modifying other AC coefficients increases the file size, because whenever an AC coefficient is modified, an extra symbol needs to be coded. Therefore, the proposed method leads to smaller distortion and a smaller file size increase.

EAI Endorsed Transactions on Internet of Things 10 2020 -02 2021 | Volume 6 | Issue 24 | e4 M. Zairi, T. Boujiha and A. Ouelli
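The block ranking can be sketched as below. The sketch assumes each block is a list of quantized coefficients in zigzag order with the DC coefficient at index 0, that ties keep their original block order, and that blocks without any ±1 coefficient are skipped; these details and the function names are assumptions, not taken from the paper.

```python
def count_pm1(block):
    """Number of AC coefficients valued +1 or -1 (index 0 is the DC coefficient)."""
    return sum(1 for c in block[1:] if abs(c) == 1)


def count_zero(block):
    """Number of zero-valued AC coefficients."""
    return sum(1 for c in block[1:] if c == 0)


def rank_blocks(blocks, key):
    """Block indices sorted by decreasing key; ties keep their original order."""
    return sorted(range(len(blocks)), key=lambda i: key(blocks[i]), reverse=True)


def select_blocks(blocks, payload_bits, key=count_pm1):
    """Pick blocks in priority order until their capacity covers the payload."""
    chosen = []
    capacity = 0
    for i in rank_blocks(blocks, key):
        if capacity >= payload_bits:
            break
        cap = count_pm1(blocks[i])  # one bit per embeddable coefficient
        if cap:
            chosen.append(i)
            capacity += cap
    return chosen, capacity


if __name__ == "__main__":
    blocks = [[10, 1, -1, 0, 2], [8, 0, 0, 0, 0], [5, 1, 1, -1, 0]]
    print(select_blocks(blocks, 4, key=count_pm1))
    print(select_blocks(blocks, 3, key=count_zero))
```

Passing `count_pm1` as the key corresponds to the first strategy and `count_zero` to the second; the extractor has to rank the blocks in the same way to visit them in the same order.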

Extraction and image recovery
In this proposed scheme, the payload should be extracted and the original host image should be recovered correctly without any errors. The extraction and recovery are done simultaneously.
The message or the watermark extraction and the host image restoration can be described as: where b̃ and C̃ represent the extracted message bit and the recovered AC coefficient, respectively.
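A possible extraction and recovery routine, obtained by inverting the embedding rule of Section 2.2, is sketched below. It is an assumption rather than the authors' exact formula: a watermarked coefficient valued ±1 is read as bit 0, ±2 as bit 1, and ±3 is assumed to be a shifted ±2. Under the rules as stated, a watermarked ±3 could also be an unchanged original ±3, so this sketch restores the host exactly only when the selected blocks contained no ±3 coefficients. The function names are hypothetical.

```python
def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def extract_block(marked):
    """Return (extracted bits, recovered AC coefficients) for one block."""
    bits = []
    recovered = []
    for c in marked:
        a = abs(c)
        if a == 1:
            bits.append(0)               # bit 0, coefficient was not changed
            recovered.append(c)
        elif a == 2:
            bits.append(1)               # bit 1, original value was +1 or -1
            recovered.append(c - sign(c))
        elif a == 3:
            recovered.append(c - sign(c))  # assumed to be a shifted +2 or -2
        else:
            recovered.append(c)          # unchangeable coefficient
    return bits, recovered


if __name__ == "__main__":
    bits, host = extract_block([2, -1, 3, -3, 0, -2])
    print(bits)
    print(host)
```

The extractor is also assumed to know the payload length and to visit the blocks in the same priority order as the embedder, so that trailing ±1 coefficients that carried no data are not read as extra 0 bits.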

Experimental results
In our experiments, we used 16 well-known images from the USC-SIPI database, including Lena, Baboon, Airplane (F-16), and House. These images are compressed using the JPEG standard with optimized Huffman tables. The proposed method was evaluated in terms of visual quality and file size increase. The peak signal-to-noise ratio (PSNR) is used to evaluate the visual quality of the watermarked JPEG image, and it is calculated between the original JPEG image and the watermarked JPEG image. The file size increase is measured as the difference in bytes between the watermarked and the original JPEG files.
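For reference, the PSNR between two 8-bit grayscale images follows the standard definition PSNR = 10·log10(255² / MSE). The sketch below assumes both decoded images are given as equally sized flat lists of pixel values; the function names are illustrative.

```python
import math


def mse(original, marked):
    """Mean squared error between two equally sized pixel sequences."""
    if len(original) != len(marked) or not original:
        raise ValueError("images must be non-empty and of equal size")
    return sum((a - b) ** 2 for a, b in zip(original, marked)) / len(original)


def psnr(original, marked, peak=255):
    """PSNR in dB; infinite when the two images are identical."""
    error = mse(original, marked)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


if __name__ == "__main__":
    a = [100, 100, 100, 100]
    b = [101, 99, 100, 100]
    print(round(psnr(a, b), 2))
```

Here the PSNR would be computed on the decoded pixels of the original and the watermarked JPEG files, so it reflects only the distortion added by embedding, not the distortion caused by JPEG compression itself.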

Visual Quality
To evaluate visual quality, the PSNR is calculated between the watermarked JPEG image and the original JPEG image. Table 4 shows the PSNR values for the four listed images with a quality factor of 70 and payload sizes ranging from 8000 bits to 20000 bits. The complete PSNR results for the four images with payload sizes ranging from 2000 bits to 20000 bits are shown in Fig. 2. Fig. 3 shows the four JPEG images compressed with a quality factor of 70 and watermarked with a payload of 16000 bits. The experiment shows that the proposed method produces good visual quality. The proposed method was furthermore evaluated by comparing its visual quality against three state-of-the-art schemes [8,9,10]. The experiment shows that when we select the blocks with the most zero coefficients, the proposed method achieves a higher PSNR than all the previous works. However, as the payload increases, the block selection no longer has an effect because all embeddable AC coefficients are used, as can be seen for the Lena and F-16 images in Table 4 and Fig. 2 for a payload size of 20000 bits.

File Size Preservation
In JPEG image watermarking, file size preservation must be considered along with visual quality. The file size of the watermarked images obtained by the proposed method is on average smaller when we select the blocks with the most coefficients valued -1 and 1. Table 8 shows the file size increase for each watermarked JPEG image using payload sizes ranging from 8000 bits to 20000 bits. Fig. 4 shows that embedding only in AC coefficients valued 1 or -1 and shifting only AC coefficients valued 2 or -2 reduces the file size increase of the marked JPEG image.

Similarly, the proposed method was evaluated by comparing its file size preservation against the three state-of-the-art schemes [8,9,10]. The experiment shows that when we select the blocks with the most coefficients valued 1 and -1, the proposed method preserves the file size better than all the previous works. However, as the payload increases, the block selection no longer has an effect because all embeddable AC coefficients are used.
In the present work, watermarking an image involves expanding some AC coefficients to carry data and shifting others, which inevitably increases the file size and reduces the visual quality. Block selection strategies significantly improve the visual quality and reduce the file size increase. By selecting the blocks with the most coefficients valued -1 and 1, we obtain more AC coefficients to embed in, fewer coefficients to shift, and fewer blocks needed to embed the entire hidden message, which ultimately improves size preservation, as shown in Tables 9 and 10. In contrast, selecting the blocks with the most zero coefficients, that is, blocks with the fewest coefficients valued -1 and 1, leads to fewer embeddable AC coefficients in each block, so the distortion is distributed over more blocks, which improves the visual quality, as shown in Tables 5, 6, and 7. There is a trade-off between these two factors, and the importance of each depends on the application.

Conclusion
A new watermarking technique for JPEG images is of great significance because of its wide range of applications. However, any embedding in a JPEG image inevitably introduces distortion and increases the file size. In this paper, we present a new scheme for JPEG image watermarking. The main contributions of this paper are as follows:
1) On the basis of the philosophy behind the encoder and the distribution of quantized DCT coefficients, a new DCT coefficient modification-based method is proposed. In the proposed method, only AC coefficients with values 1 and -1 are expanded to carry the message bits and only AC coefficients with values 2 and -2 are shifted.
2) A novel block selection strategy has been proposed in this paper, which may result in better visual quality and a smaller storage size of the marked JPEG file.
3) The proposed scheme has strong practicability because high embedding capacity and good visual quality can be easily obtained. Meanwhile, the storage size increase of the watermarked JPEG image can be kept small.
Experimental results demonstrate that our proposed method can achieve better performance both in visual quality and file size preservation compared to the previous works in [8][9][10][11][12][13]. Future work includes the extension of the proposed technique to other categories and types of images, for example, color images, DICOM images, and JPEG 2000, which is a wavelet-based image compression standard [14][15][16].