A Rekeying Scheme for Encrypted Deduplication Storage based on NTRU

Rekeying is a common way to protect outsourced data against key compromise and to enable data owners to enforce access control on their data. However, existing rekeying schemes are difficult to apply to encrypted deduplication systems, which use message-locked encryption to allow the server to perform deduplication on users' outsourced data. In this paper, we propose a new rekeying scheme named REEDBN, which leverages a proxy re-encryption scheme based on NTRU to reduce the communication cost of the system and the computational overhead for clients during rekeying. We implement a prototype of our scheme and conduct testbed experiments. The results show that our system incurs much less communication and computational overhead for clients than the previous scheme. Users can even rekey their outsourced data on mobile terminals that have only limited computation power.


Introduction
In the era of big data, deduplication is an effective way to save storage space: it improves storage utilization by removing redundant copies. In cloud storage systems, users are usually billed by the amount of data transmitted and stored [1]. Client-side deduplication not only saves storage space but also reduces the amount of data transferred between clients and servers, which greatly decreases the cost for both [2].

Recently, data security has attracted more and more attention. Users tend to upload encrypted data for confidentiality, so that the server cannot learn anything about the outsourced data. However, traditional encryption conflicts with deduplication: it turns identical messages into indistinguishable ciphertexts, which makes duplicates hard to detect.

Convergent encryption (CE) [3] is the first attempt to achieve encrypted deduplication. In CE, clients use the hash of the original data as the symmetric key to encrypt data. Since the encryption key is the same if users have the same file, the server can detect duplicates from the encrypted data. However, CE is inherently vulnerable to brute-force attacks [4], as it cannot provide confidentiality guarantees for predictable messages. DupLESS [4] generates encryption keys through a dedicated key server, which introduces a system-wide secret into key generation. It is hard for attackers to perform brute-force attacks without knowing the system-wide secret. However, DupLESS is inefficient for chunk-level deduplication because key generation is slow [5].

* Corresponding author. Email: 2472727110@qq.com

Although current encrypted deduplication systems [6][7][8] can provide data confidentiality to some extent, user data faces other threats in different scenarios.
For example, it is hard for data owners to rekey their outsourced data [9] when the encryption key is compromised. Suppose that a file $F$ stored on the server is encrypted by the key $h(F)$, where $h(\cdot)$ is a cryptographic hash function. Once $h(F)$ is breached by an adversary, the data owner has to rekey the file with a new convergent key $h'(F)$, where $h'(\cdot)$ is another cryptographic hash function. However, to do so, the data owner needs to download the entire file, decrypt it, and upload the new ciphertext encrypted by the new key $h'(F)$ to the server. Besides, the data owner also needs to broadcast the new hash function $h'(\cdot)$ to all other owners for deduplication. Hence, rekeying in encrypted deduplication systems based on CE suffers from limitations.

REED [10] designs a rekeying-aware encrypted deduplication storage system. It uses the all-or-nothing transform (AONT) [7] and message-locked encryption (MLE) [12] to transform user data into stubs and trimmed packages. User data cannot be recovered without knowing both the stubs and the trimmed packages. A client uses its private key to encrypt the stubs, which are only a small part of the entire data. REED applies deduplication to the trimmed packages, whose size is much larger than that of the stubs, to maintain storage efficiency. When the outsourced data needs to be rekeyed, the client just needs to re-encrypt the encrypted stubs. Since the stubs are tiny, REED achieves lightweight rekeying in encrypted deduplication storage. However, rekeying in REED still requires clients to download the stubs, encrypt them with a new key, and upload the new encrypted stubs. As the file size increases, so does the stub size. When rekeying is frequent and the file is relatively large, REED requires considerable communication and computation, especially from clients, who need to encrypt and decrypt stubs constantly.
It is hard for users to rekey their outsourced data on mobile terminals or other terminals with limited computing power. In some real-world scenarios, staff turnover is common in enterprises and is accompanied by changes of data access rights, which require rekeying of the enterprise's outsourced data. Under such frequent changes of access rights, REED incurs considerable computational and transmission overhead.

This paper focuses on the aforementioned problems in REED and proposes REEDBN, a rekeying scheme for encrypted deduplication storage based on NTRU [13] [16]. REEDBN applies NTRUReEncrypt [23] on top of REED and reduces the bandwidth overhead of the system and the computational overhead for clients during rekeying. Even if outsourced data needs to be rekeyed frequently, the overhead for clients remains negligible. Users can practically rekey their data even on mobile terminals that have limited computing power.

In REEDBN, clients encrypt stubs with NTRU before uploading them to the server. When a client needs to rekey its data, it computes a re-encryption key and uploads it to the server. After receiving the re-encryption key, the server uses it to generate a new ciphertext that can only be decrypted by the new key. The client does not need to download any data from the server during rekeying, nor does it need to encrypt or decrypt stubs. All re-encryption is done by the server, which has strong computing power. Besides, the transmission between the client and the server during rekeying contains only a re-encryption key, and the only computing overhead for the client is the computation of that key. Therefore, users can rekey large amounts of outsourced data whenever and wherever they want, even from a smartphone.

Our contributions in this paper are summarized as follows:

• First, we propose REEDBN, which uses NTRUReEncrypt to implement rekeying in an encrypted deduplication storage system. In contrast to previous schemes, it has less bandwidth and computational overhead for clients during rekeying.
• Second, REEDBN can also integrate existing primitives, such as ciphertext-policy attribute-based encryption (CP-ABE) [15], to control the access privileges of user files.
• Finally, we implement a prototype of REEDBN and conduct testbed experiments to evaluate its performance. The results show that REEDBN achieves a significant increase in rekeying speed while incurring only a little extra encryption and storage overhead. No matter how large an outsourced file is, the time for a client to rekey is kept within a few milliseconds. Rekeying a 900MB file only needs 21.14ms in REEDBN.

The remainder of the paper proceeds as follows. Section 2 introduces the preliminaries that REEDBN is based on. Section 3 introduces the REED system, which our solution builds on. Section 4 presents the architecture, security model, and design goals of our scheme. Section 5 describes the specific design of REEDBN. Section 6 describes the implementation details of REEDBN. Section 7 evaluates the experimental results. Section 8 concludes the paper.

Preliminaries
In this section, we introduce the cryptographic primitives used in our encrypted deduplication scheme.

Message-locked encryption
Message-locked encryption (MLE) is an encryption scheme based on data contents. In MLE, both the keys and the tags of data are generated from its content. The same data is encrypted into the same ciphertext, which facilitates deduplication. Traditional MLE can protect unpredictable messages. However, for predictable data, this type of MLE is subject to offline brute-force dictionary attacks. Server-aided MLE introduces a system-wide secret during key generation through a dedicated key server [4]. Adversaries cannot launch brute-force attacks without knowing this secret.
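The following minimal sketch illustrates the two key-derivation styles described above. It is an illustration under stated assumptions, not the construction used in REED or DupLESS: the keystream is a toy SHA-256 counter-mode stand-in for a real cipher, and `server_aided_key` uses a plain HMAC with a hypothetical `system_secret` in place of the oblivious protocol a real key server would run. All function names are hypothetical.

```python
import hashlib
import hmac

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (stand-in for a real cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def convergent_key(message: bytes) -> bytes:
    """Traditional MLE / CE: the key is derived from the content alone."""
    return hashlib.sha256(message).digest()

def server_aided_key(message: bytes, system_secret: bytes) -> bytes:
    """Server-aided MLE (simplified): the key also depends on a system-wide secret.
    A real deployment would not reveal the fingerprint to the key server."""
    fingerprint = hashlib.sha256(message).digest()
    return hmac.new(system_secret, fingerprint, hashlib.sha256).digest()

def mle_encrypt(key: bytes, message: bytes) -> bytes:
    """Deterministic encryption: same key and message give the same ciphertext."""
    return bytes(m ^ k for m, k in zip(message, keystream(key, len(message))))

mle_decrypt = mle_encrypt  # XOR with the same keystream inverts the encryption

def tag(ciphertext: bytes) -> str:
    """Deduplication tag computed over the ciphertext."""
    return hashlib.sha256(ciphertext).hexdigest()
```

Two owners of the same data derive the same key and therefore the same ciphertext and tag, which is what lets the server deduplicate without seeing the plaintext. With the server-aided variant, an attacker who does not know the system-wide secret cannot enumerate candidate messages offline.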

The NTRU Encryption Scheme
NTRU [13] is a lattice-based cryptosystem. Due to its potential resistance to quantum attacks, NTRU is considered a promising post-quantum cryptosystem. Among public-key cryptosystems, its outstanding efficiency, which is even comparable to that of AES, has attracted much research interest. The polynomial coefficients in NTRU are relatively small and allow efficient multiplication on computationally constrained devices [17]. NTRU also has homomorphic properties and can be used to construct proxy re-encryption schemes. These advantages are the reason we choose it.

EAI Endorsed Transactions on
Security and Safety 10 2020 -01 2021 | Volume 8 | Issue 27 | e4

There are three important parameters in the NTRU encryption scheme, denoted $n$, $q$, and $p$. $n$ is a prime that determines the calculation range of the entire scheme. All operations in NTRU work over the quotient ring $R = \mathbb{Z}[x]/(x^n - 1)$, whose elements have degree less than $n$. $q$ is also a prime, and the ciphertext space is $R_q = R/qR$. The parameter $p$ is a polynomial with a small norm, such as $p = 2$ or $p = x + 3$. A message is an element $m \in R/pR$. We now introduce the algorithms in the NTRU encryption scheme.

KeyGen(): Sample $f'$, $g$ from $D_{\mathbb{Z}^n,\sigma}$, where $D_{\mathbb{Z}^n,\sigma}$ denotes the Gaussian distribution with deviation $\sigma$ on $\mathbb{Z}^n$. Let $f = p \cdot f' + 1$. [...] $\in R/pR$.

NTRUReEncrypt
NTRUReEncrypt [23] is a proxy re-encryption scheme based on the NTRU encryption scheme. It adds a re-encryption key generation algorithm, RekeyGen, and a re-encryption algorithm, ReEnc. The instantiation of the scheme requires that $f \equiv 1 \pmod{p}$. The algorithms of NTRUReEncrypt are as follows.

KeyGen(): It outputs the private key $sk = f$ and the public key $pk = h$.

Enc($pk$, $m$): Given a plaintext $m$ and the public key $pk$, the algorithm outputs the ciphertext $c$.

RekeyGen($sk_A$, $sk_B$): To re-encrypt a ciphertext that was encrypted under $pk_A$, a re-encryption key is needed. Given the original private key $sk_A = f_A$ and the new private key $sk_B = f_B$, the algorithm computes $rk_{A \to B} = f_A \cdot f_B^{-1}$ and outputs the re-encryption key $rk_{A \to B}$.

ReEnc($rk_{A \to B}$, $c_A$): The re-encryption key is used to re-encrypt the ciphertext $c_A$ encrypted under $pk_A$. ReEnc samples $e'$ from a discrete Gaussian distribution and computes $c_B = c_A \cdot rk_{A \to B} + p e'$. The new ciphertext $c_B$ can only be decrypted with $sk_B$; $sk_A$ is no longer valid. The algorithm outputs the re-encrypted ciphertext $c_B$.

Dec($sk$, $c$): Given a private key $sk$ and a ciphertext $c$, it outputs the plaintext $m$.
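To make the algebra of re-encryption concrete, here is a toy sketch of the five algorithms. It is an illustration only and is not secure. The parameters ($n = 11$, $p = 3$, $q = 12289$), the helper names, the use of uniform ternary noise instead of a discrete Gaussian, and the ring inverse computed by Gauss-Jordan elimination are all assumptions chosen so that the example is short and decryption provably never wraps around modulo $q$. The standard NTRU forms $h = p \cdot g \cdot f^{-1}$ and $c = h s + p e + m$ are used for KeyGen and Enc.

```python
import random

N, P, Q = 11, 3, 12289  # toy ring degree, small modulus p, prime modulus q (assumed values)

def mul(a, b):
    """Multiply two elements of Z_Q[x]/(x^N - 1) (cyclic convolution)."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % N] = (out[(i + j) % N] + ai * bj) % Q
    return out

def add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]

def scale(k, a):
    return [(k * x) % Q for x in a]

def small(rng):
    """Ternary noise polynomial; stands in for a discrete Gaussian sample."""
    return [rng.choice((-1, 0, 1)) for _ in range(N)]

def inverse(f):
    """Solve f * u = 1 in the ring via Gauss-Jordan elimination mod Q (None if singular)."""
    rows = [[f[(i - j) % N] % Q for j in range(N)] + [1 if i == 0 else 0] for i in range(N)]
    for col in range(N):
        piv = next((r for r in range(col, N) if rows[r][col]), None)
        if piv is None:
            return None
        rows[col], rows[piv] = rows[piv], rows[col]
        inv = pow(rows[col][col], -1, Q)
        rows[col] = [(v * inv) % Q for v in rows[col]]
        for r in range(N):
            if r != col and rows[r][col]:
                k = rows[r][col]
                rows[r] = [(v - k * w) % Q for v, w in zip(rows[r], rows[col])]
    return [rows[i][N] for i in range(N)]

def keygen(rng):
    """Return (sk, pk) = (f, h) with f = 1 + p*f' and h = p*g*f^{-1}."""
    while True:
        f = scale(P, small(rng))
        f[0] = (f[0] + 1) % Q
        f_inv = inverse(f)
        if f_inv is not None:
            return f, mul(scale(P, small(rng)), f_inv)

def enc(h, m, rng):
    """c = h*s + p*e + m, with message coefficients in {0, ..., p-1}."""
    return add(add(mul(h, small(rng)), scale(P, small(rng))), m)

def dec(f, c):
    """m = ((c*f mod q), centered) mod p; valid because f = 1 (mod p)."""
    return [((x - Q if x > Q // 2 else x) % P) for x in mul(c, f)]

def rekeygen(f_a, f_b):
    """rk_{A->B} = f_A * f_B^{-1}."""
    return mul(f_a, inverse(f_b))

def reenc(rk, c_a, rng):
    """c_B = c_A * rk_{A->B} + p*e'."""
    return add(mul(c_a, rk), scale(P, small(rng)))
```

With these definitions, `dec(f_b, reenc(rekeygen(f_a, f_b), enc(h_a, m, rng), rng))` returns `m`: multiplying $c_B$ by $f_B$ gives $c_A f_A + p e' f_B$, and every term other than $m$ is a multiple of $p$. The party running `reenc` only ever sees $rk_{A \to B}$ and ciphertexts, never $f_A$ or $f_B$ on their own.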

REED
Since REEDBN is an improvement of REED, we introduce REED as background in this section, focusing on its encryption and rekeying.

Main Idea
REED [10] is a rekeying-aware encrypted deduplication storage system. It follows a client-server architecture and deploys a key manager to achieve server-aided MLE. In REED, clients can outsource their data to a server, and the server can perform deduplication on them. There are two types of keys in REED: a file-level secret key and a chunk-level MLE key. REED uses the all-or-nothing transform (AONT) to transform user data. To preserve deduplication, REED uses convergent AONT (CAONT), which replaces the random key with an MLE key. During encryption, a client transforms its data into stubs and trimmed packages using CAONT and the chunk-level MLE key. After obtaining both the stubs and the trimmed packages, the client encrypts the stubs with the file-level secret key. The outsourced data in REED is thus protected by both the file-level secret key and the chunk-level MLE key. During rekeying, the client only needs to renew the file-level secret key. Because of the property of CAONT, user data is unrecoverable if the stubs are unavailable. Since the stubs are small, the overhead during rekeying is reduced. The server applies deduplication to the trimmed packages, which are large, to save storage space.

Encryption Schemes
There are two encryption schemes in REED: a basic encryption and an enhanced encryption. The former is more efficient, while the latter is resilient against key leakage at the cost of a more expensive encryption.

The basic encryption: To upload a file $F$, the client splits $F$ into several chunks $\{M_1, M_2, \ldots, M_n\}$. Each chunk $M_i$ corresponds to a chunk-level MLE key $K_i$ obtained from the key manager. For a chunk $M_i$ with chunk-level key $K_i$, the workflow of the basic encryption is as follows:

1. Choose a fixed-size string $c$ and concatenate $M_i$ with $c$ to obtain $(M_i \| c)$.
2. Choose a block $S$ of the same size as $(M_i \| c)$ and compute a pseudo-random mask $G(K_i) = E(K_i, S)$, where $E(\cdot)$ denotes a symmetric-key encryption function using key $K_i$. Note that both $c$ and $S$ are publicly known.
[...] $H(\cdot)$ is a hash function.
5. Trim the last few bytes (e.g., 64 bytes) from the package as the stub $s_i$, and keep the remaining part as the trimmed package $T_i$.
6. Generate a file-level secret key $k$ and encrypt all stubs of $F$ using AES to get $E_k(s_1, s_2, \ldots, s_n)$.
7. The client uploads the encrypted stubs $E_k(s_1, s_2, \ldots, s_n)$ and all trimmed packages $\{T_1, T_2, \ldots, T_n\}$ to the server.

To reconstruct $F$, follow these steps in reverse.

The enhanced encryption:

[...]
3. Use $h$ and $S$ to compute the pseudo-random mask $G(h)$, where $S$ is a publicly known block of the same size as $C \| c$. Then XOR it with the package $(C \| c)$ to compute $C' = (C \| c) \oplus G(h)$.
4. Divide $C'$ into several fixed-size pieces whose size is the same as that of $h$. XOR all these pieces and $h$ to compute the tail $t$.
5. Concatenate $C'$ and $t$ to obtain $(C' \| t)$. Trim the last few bytes from the package $(C' \| t)$ as the stub $s_i$, and keep the remaining part as the trimmed package $T_i$.
[...]
7. The client uploads the encrypted stubs $E_k(s_1, s_2, \ldots, s_n)$ and all the trimmed packages $\{T_1, T_2, \ldots, T_n\}$ to the server.

To reconstruct $F$, the client needs both the encrypted stubs and the trimmed packages. First, the client decrypts the encrypted stub with $k$ and concatenates $C'$ and $t$ to obtain $(C' \| t)$. Then it divides $C'$ into pieces and XORs them with $t$ to get $h$. It recovers $(C \| c) = C' \oplus G(h)$. Finally, the client obtains $M_i = D(K_i, C)$, where $D(\cdot)$ is an MLE decryption function.
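The masking, tail, and trimming steps of the enhanced scheme can be sketched as follows. This is a hedged illustration rather than REED's implementation: deriving $h$ as the SHA-256 hash of the input package, using SHA-256 in counter mode as the mask $G(h)$ instead of $E(h, S)$, and the function names are all assumptions of this sketch. The MLE encryption of the chunk and the encryption of the stub with the file-level key are omitted.

```python
import hashlib

STUB_SIZE = 64   # bytes trimmed from the end of each package, as in the text
HASH_SIZE = 32   # size of h (SHA-256 digest), an assumption of this sketch

def mask(key: bytes, length: int) -> bytes:
    """Stand-in for G(h): SHA-256 in counter mode."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def fold(data: bytes, size: int) -> bytes:
    """XOR all size-byte pieces of data together (last piece zero-padded)."""
    acc = bytes(size)
    for i in range(0, len(data), size):
        acc = xor(acc, data[i:i + size].ljust(size, b"\0"))
    return acc

def caont_pack(body: bytes):
    """Return (trimmed_package, stub) for an input package body = (C || c)."""
    h = hashlib.sha256(body).digest()            # assumed derivation of h
    c_prime = xor(body, mask(h, len(body)))      # C' = (C || c) XOR G(h)
    tail = xor(fold(c_prime, HASH_SIZE), h)      # t = XOR of the pieces of C' and h
    package = c_prime + tail                     # (C' || t)
    return package[:-STUB_SIZE], package[-STUB_SIZE:]

def caont_unpack(trimmed: bytes, stub: bytes) -> bytes:
    """Invert caont_pack; both parts are required to recover the body."""
    package = trimmed + stub
    c_prime, tail = package[:-HASH_SIZE], package[-HASH_SIZE:]
    h = xor(fold(c_prime, HASH_SIZE), tail)      # XOR of the pieces with t yields h
    body = xor(c_prime, mask(h, len(c_prime)))   # (C || c) = C' XOR G(h)
    if hashlib.sha256(body).digest() != h:
        raise ValueError("package or stub is corrupted")
    return body
```

Because the 64-byte stub contains the tail, anyone holding only the trimmed package cannot recompute $h$ and therefore cannot remove the mask. That is why protecting just the stub with the file-level key is enough to protect the whole chunk.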

Rekeying
To rekey file $F$, the client downloads the encrypted stubs from the server and decrypts them with the file-level secret key $k$. Then the client encrypts the stubs with a new file-level secret key $k'$ and uploads the new ciphertext $E_{k'}(s_1, s_2, \ldots, s_n)$, as shown in Figure 1. However, if the stubs are relatively large or rekeying is frequent, rekeying in REED incurs considerable bandwidth and computational overhead. Besides, on mobile terminals (e.g., smartphones) that have only limited computational power, the repeated encryption and decryption makes rekeying difficult.
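As a rough worked example of why this cost grows with file size, the sketch below estimates the stub traffic of one REED rekeying operation. The 64-byte stub per chunk comes from the text; the 8 KB average chunk size and the function name are assumptions for illustration, and cipher padding and protocol headers are ignored.

```python
STUB_SIZE = 64  # bytes of stub per chunk, as stated for REED

def reed_rekey_traffic(file_size: int, avg_chunk_size: int) -> int:
    """Bytes moved by one rekeying: stubs are downloaded once and uploaded once."""
    chunks = -(-file_size // avg_chunk_size)   # ceiling division
    stub_bytes = chunks * STUB_SIZE
    return 2 * stub_bytes

# A 900 MB file with an assumed 8 KB average chunk size.
traffic = reed_rekey_traffic(900 * 2**20, 8 * 2**10)
print(traffic)  # 14745600 bytes, i.e. about 14 MB per rekeying
```

Under these assumptions, the traffic is proportional to the number of chunks, so doubling the file size doubles the data a client must download, re-encrypt, and upload.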

Overview of REEDBN
To further reduce the cost of rekeying, we propose REEDBN, a cross-user, server-side encrypted deduplication storage system. Users can outsource their data to REEDBN to save local storage space. Both MLE and NTRU are used in REEDBN to protect user data. More importantly, REEDBN achieves a new lightweight rekeying scheme in which the client only needs to compute a re-encryption key, so the bandwidth overhead during rekeying is negligible.

Architecture
REEDBN is composed of clients, a key manager, and a server.
Server: REEDBN deploys a server to provide data storage and data management services for multiple users. It applies deduplication to user data to save storage space. When receiving data from clients, the server checks whether the data is a duplicate and stores it only if it is not. Furthermore, we assume the server has strong computing power and can perform costly calculations.

Clients: Users outsource their data to the server through clients. Since REEDBN performs block-level deduplication, clients need to divide user data into chunks. After chunking, clients generate trimmed packages and stubs from the chunks, and then encrypt the stubs with NTRU. The encrypted stubs and trimmed packages are uploaded to the server. Besides, clients can send a rekeying request to the server to rekey outsourced data efficiently.

Key Manager: To protect both predictable and unpredictable chunks, REEDBN deploys a key manager. It achieves server-aided MLE by generating a key for a message based on both the message content and a system-wide secret. The system-wide secret held by the key manager helps REEDBN resist brute-force attacks.
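The server-side duplicate check described above can be pictured with the following minimal sketch. The class name, the use of SHA-256 fingerprints, and the in-memory dictionary are assumptions for illustration; the prototype's actual index and storage layout are not described here.

```python
import hashlib

class DedupStore:
    """Keeps one physical copy of each distinct trimmed package."""

    def __init__(self):
        self.blobs = {}      # fingerprint -> stored trimmed package
        self.refcount = {}   # fingerprint -> number of logical references

    def put(self, trimmed_package: bytes) -> bool:
        """Store a package; return True only if it was new (not a duplicate)."""
        fp = hashlib.sha256(trimmed_package).hexdigest()
        self.refcount[fp] = self.refcount.get(fp, 0) + 1
        if fp in self.blobs:
            return False
        self.blobs[fp] = trimmed_package
        return True

    def stored_bytes(self) -> int:
        return sum(len(b) for b in self.blobs.values())
```

Since trimmed packages are produced deterministically from chunk contents, identical chunks uploaded by different users map to the same fingerprint and are stored once. Encrypted stubs are per-file and would be stored separately, without deduplication.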

Threat Model and Design Goals
In REEDBN, we assume the server and the key manager to be "honest-but-curious". They follow our proposed protocol, but they are curious about user data and try to learn about it. In this paper, we consider two kinds of adversaries. The first are external adversaries, who can compromise the server to access all ciphertexts outsourced by users. The second are internal adversaries, who can collude with some valid clients to obtain information about unauthorized data. Since REEDBN is a server-side encrypted deduplication storage system, there is no side-channel attack to consider.
REEDBN has two main design goals. The first is data confidentiality. No adversary can learn anything about data beyond their access scope, even if they compromise the server or collude with other clients. The second goal is secure and highly efficient rekeying. Revoked users cannot access unauthorized data after rekeying, and the server cannot learn anything about the data during rekeying. Besides, the bandwidth overhead and computational effort for clients during rekeying are negligible.

REEDBN Design
We introduce NTRUReEncrypt [23] into REED. In REEDBN, a client only needs to upload a re-encryption key during rekeying, without downloading any data from the server. REEDBN thus reduces the bandwidth and computational overhead of clients during rekeying, so users can rekey their data even on mobile terminals. Even if users need to rekey their data frequently, the communication overhead in the system remains small.

Main Idea
We observe that the only way to avoid data transmission, encryption, and decryption during rekeying is to transform the original encrypted data stored on the server into a new ciphertext. However, if a client directly uploads both the original key and a new key to the server and allows the server to generate the new ciphertext, the data will be insecure: the untrusted server can learn the keys and obtain the data. Proxy re-encryption [14] is a good way to transform a ciphertext without the server knowing the encryption key. It is commonly used in data sharing schemes for cloud storage [24][25][11]. We apply proxy re-encryption to the rekeying of encrypted deduplication storage. The purpose of proxy re-encryption in our scheme is rekeying rather than data sharing. Therefore, REEDBN does not need a trusted proxy server, which is hard to find in real-world scenarios. The re-encryption key can be generated directly by the data owners themselves.

The proxy re-encryption scheme we choose is NTRUReEncrypt [23] because of its efficiency. To use NTRUReEncrypt [23], the outsourced data must be encrypted by NTRU. If clients encrypted the whole file with NTRU, encryption performance would degrade and the ciphertext expansion of NTRU would incur considerable storage overhead. Therefore, we encrypt only the stubs in REED with NTRU. Since the stubs are small, our scheme causes only limited additional overhead during encryption compared with REED.

Operations
In this subsection, we give the detailed operations of file uploading, downloading, and rekeying in REEDBN.

File uploading: Suppose that a user wants to upload a file $F$. A client transforms $F$ into stubs and trimmed packages using either the basic encryption or the enhanced encryption, just as in REED. In contrast to REED, the client encrypts the stubs using NTRU rather than AES. Therefore, the stubs need to be encoded into polynomials that NTRU can encrypt. We encode the binary bits in the stubs into the coefficients of polynomials. The client runs the algorithm KeyGen() to output $(pk, sk)$ as described in Section 2.3. Then the client encrypts the encoded stubs $s$ using the NTRU encryption algorithm to obtain the ciphertext $c = \mathrm{Enc}(pk, s)$, where $\mathrm{Enc}(\cdot)$ is the NTRU encryption algorithm. Finally, the client uploads the trimmed packages and $c$ to the server (see Figure 2).

Fig. 2. File uploading in REEDBN
File downloading: The client downloads the trimmed packages and the ciphertext $c$. The client uses the NTRU private key $sk$ to decrypt $c$ and obtain the stubs $s = \mathrm{Dec}(sk, \mathrm{Enc}(pk, s))$, where $\mathrm{Dec}(\cdot)$ is the NTRU decryption algorithm. With the stubs and the trimmed packages, the client can reconstruct the entire user file (see Figure 3).

Fig. 3. File downloading in REEDBN
Rekeying: When a user needs to rekey their outsourced data, the client generates a re-encryption key. Suppose that the stubs stored on the server are encrypted under the key pair $(pk_A, sk_A)$ and the user wants to update the key pair to $(pk_B, sk_B)$. The client inputs the secret keys $sk_A$ and $sk_B$ to the re-encryption key generation algorithm $\mathrm{RekeyGen}(\cdot)$ to compute the re-encryption key $rk_{A \to B} = \mathrm{RekeyGen}(sk_A, sk_B)$, and uploads $rk_{A \to B}$ to the server. The server inputs the re-encryption key $rk_{A \to B}$ and the stored ciphertext $c = \mathrm{Enc}(pk_A, s)$ to the re-encryption algorithm $\mathrm{ReEnc}(\cdot)$ and calculates the transformed ciphertext $c' = \mathrm{ReEnc}(rk_{A \to B}, c)$, as shown in Figure 4. At this point, the encrypted stubs stored on the server can only be decrypted with the new key pair $(pk_B, sk_B)$, and the original key pair $(pk_A, sk_A)$ is no longer valid. Therefore, REEDBN can protect against key compromise.

Dynamic Access Control
Similar to REED, REEDBN can also integrate CP-ABE to control the access privileges to user files. Every file in REEDBN corresponds to an NTRU key pair $(pk, sk)$. The data owner creates an access policy for their outsourced files. The client then encrypts $(pk, sk)$ using CP-ABE under the access policy and uploads the encrypted key pair to the server. Only users who satisfy the access policy can decrypt the encrypted key pair and then decrypt the entire file. When the data owner needs to change the access privileges of outsourced files, the client generates a new NTRU key pair for the file and encrypts it using CP-ABE under the new access policy. Then the client uploads the new CP-ABE ciphertext to the server. From this point on, only users who satisfy the new access policy can decrypt the new CP-ABE ciphertext.

Analysis
REEDBN is designed to rekey encrypted deduplication storage efficiently. In this subsection, we analyze two aspects of REEDBN: data confidentiality, and the performance and security of rekeying. We use the following four propositions to illustrate the security and performance of REEDBN. In all these propositions, we assume that the cryptographic primitives used in REEDBN, such as MLE, OPRF, NTRU, and CP-ABE, are secure.

Proposition 1. Neither the server nor the key manager in REEDBN can obtain any information about user data.

Proof. The clients interact with the key manager to get the server-aided MLE keys using the OPRF protocol. The hash of a file is blinded, so the key manager cannot learn anything about the hash. Clients encrypt data before outsourcing them to the server. The server has neither the MLE key nor the NTRU key pair, so it cannot get any information about the file plaintext. When users want to rekey their data, clients send $rk_{A \to B}$ to the server. During ciphertext conversion, user data is always encrypted, and the server cannot learn any plaintext. Therefore, the honest-but-curious server and key manager cannot obtain any information about user data.

Proposition 2. Neither external adversaries nor internal adversaries can access any unauthorized data.

Proof. External adversaries can access all trimmed packages, encrypted stubs, and the CP-ABE ciphertext of the NTRU key pair by compromising the server. Since they do not have any key, the encrypted data cannot be decrypted. Therefore, user data is secure because of the security of MLE, NTRU, and CP-ABE. Internal adversaries can collude with valid clients, but their private keys do not satisfy the access policy defined by the CP-ABE ciphertext. Hence, the CP-ABE ciphertext and the encrypted stubs cannot be decrypted. Because of the property of CAONT, internal adversaries also cannot learn anything about the file without the stubs.

Proposition 3. Revoked users cannot access unauthorized data after rekeying.

Proof. Suppose that the stubs of a file are originally encrypted with $sk_1$ and the data owner changes the encryption key to $sk_2$. The revoked key $sk_1$ cannot decrypt the rekeyed stubs because of the security of NTRUReEncrypt. The new key $sk_2$ is encrypted by CP-ABE, and the revoked users' private keys do not satisfy the access policy set in the ciphertext. Therefore, the revoked $sk_1$ is invalid and the revoked users cannot access $sk_2$. Because of the nature of CAONT, users cannot obtain any information about the data without decrypting the encrypted stubs.

Proposition 4. REEDBN has negligible bandwidth overhead and computational effort for clients during rekeying.

Proof. Table 1 compares the overheads during rekeying in REED and REEDBN. During rekeying in REEDBN, a client does not need to download anything from the server and only needs to upload a re-encryption key $rk_{A \to B}$, so the bandwidth overhead is very small. In contrast to REED, the encrypted stubs in REEDBN do not need to be decrypted and re-encrypted by clients. The only thing a client needs to do is generate a re-encryption key, so its computational effort is also negligible. Therefore, users can rekey their data on any device with limited computational power. These characteristics give REEDBN good application prospects.

Table 1. Comparison of overheads during rekeying in REED and REEDBN

Implementation

Implementation detail
We extend the open-source system REED to implement the prototype of REEDBN. The cryptographic operations in REEDBN are implemented with OpenSSL [21]. We add NTRU to REED and implement it on top of NTL [22], a portable C++ library. NTL is thread-safe and provides efficient algorithms and data structures for polynomials and big integers. Using this high-performance library for arithmetic operations greatly improves the efficiency of the system.

REEDBN consists of three separate C++ programs for clients, a server, and a key manager. The setup for REEDBN is the same as for REED. A client supports both fixed-size and variable-size chunking, and the stub size is 64 bytes for each chunk. We write all the stubs of a file into a stub file, which is encrypted by the NTRU encryption algorithm. REEDBN uses the OPRF protocol to blind the file fingerprint so that the key manager cannot learn the data. The server can receive file data from multiple clients and performs cross-user deduplication on the trimmed packages.

In contrast to REED, the client needs to encode the binary bits in stub files into the coefficients of polynomials, since the unit of NTRU encryption is a polynomial. We encode by concatenation. First, the client reads a specific number of characters from the stub file and treats them as ASCII characters of 8 bits each. Since the parameter $p$ limits the message space, we shift and concatenate every $\ell/8$ characters to form an $\ell$-bit integer, where $\ell$ denotes the number of bits carried by one coefficient. Finally, we obtain the $\ell$-bit integers and encode them as polynomial coefficients, which gives the encoded messages. After the stub file is encoded into polynomials, the client encrypts them with NTRU.
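The shift-and-concatenate encoding described above can be sketched as follows. The coefficient width of 16 bits, the polynomial length of 8 coefficients, the zero padding, and the function names are assumptions chosen for a compact example; the prototype's actual parameters are not given here.

```python
def encode_stub(data: bytes, bits: int = 16, n: int = 8):
    """Pack every bits/8 bytes into one integer coefficient; group n coefficients per polynomial."""
    width = bits // 8
    padded = data + b"\0" * (-len(data) % width)          # pad to a whole coefficient
    coeffs = [int.from_bytes(padded[i:i + width], "big")  # shift-and-concatenate the bytes
              for i in range(0, len(padded), width)]
    coeffs += [0] * (-len(coeffs) % n)                    # pad to a whole polynomial
    return [coeffs[i:i + n] for i in range(0, len(coeffs), n)]

def decode_stub(polys, length: int, bits: int = 16) -> bytes:
    """Inverse of encode_stub; length is the original number of bytes."""
    width = bits // 8
    raw = b"".join(c.to_bytes(width, "big") for poly in polys for c in poly)
    return raw[:length]

stub = bytes(range(64))        # one 64-byte stub
polys = encode_stub(stub)      # 4 polynomials of 8 coefficients each
```

Each coefficient must stay below the bound imposed by the message space, so the width has to be chosen together with the NTRU parameters. The original length is kept alongside the encoded data so that padding can be removed after decryption.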

Optimization
REEDBN leverages optimization techniques such as multi-threading and the fast Fourier transform to improve NTRU encryption performance.

Multi-core and multi-thread: Combining multiple cores with multiple threads allows the operating system to schedule threads onto different cores so that parts of a program run in parallel, which greatly improves execution efficiency. When uploading and downloading, we use multi-threading to handle multiple chunks in parallel and reduce transmission time. When encrypting and decrypting files, we use the same technique to parallelize polynomial calculations and increase computing efficiency.

Fast Fourier Transform (FFT): The basic idea of the FFT [19] is to exploit the symmetry and periodicity of the exponential factors in the transform, combining terms so that duplicate calculations are eliminated and the number of multiplications is reduced. Our scheme involves a large number of polynomial multiplications in both encryption and decryption. Besides, we need to ensure that the ciphertext is in the ring $R_q$ and the plaintext is in $R/pR$. Hence, a modulo operation is required for all results, which also involves polynomial multiplication. To shorten the time consumed by these steps, we use the FFT algorithm. The FFT also fits multi-threading well: we divide the polynomial coefficients into two parts according to parity and create two threads to perform the DFTs [19] of the two parts in parallel. The butterfly diagram [18] is used to make sure the output of the FFT is in order.
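The parity split mentioned above is the core of the radix-2 FFT. The following single-threaded sketch shows it applied to multiplication in $\mathbb{Z}_q[x]/(x^n - 1)$; the function names are hypothetical, and it is not the NTL-based code of the prototype. It uses floating-point arithmetic, so it is exact only while the products stay small. An implementation working with large coefficients modulo $q$ would need an exact transform or coefficient splitting.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)   # the two halves are independent and could
    odd = fft(a[1::2], invert)    # be computed by two threads in parallel
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w              # butterfly: combine the halves
        out[k + n // 2] = even[k] - w
    return out

def cyclic_mul(a, b, q):
    """Multiply a and b in Z_q[x]/(x^n - 1) using FFT-based convolution."""
    n = len(a)
    size = 1
    while size < 2 * n:
        size *= 2
    fa = fft([complex(x) for x in a] + [0j] * (size - n))
    fb = fft([complex(x) for x in b] + [0j] * (size - n))
    prod = fft([x * y for x, y in zip(fa, fb)], invert=True)
    linear = [round((v / size).real) for v in prod]   # undo scaling, drop float noise
    out = [0] * n
    for i, v in enumerate(linear):
        out[i % n] = (out[i % n] + v) % q             # reduce modulo x^n - 1 and q
    return out
```

For example, `cyclic_mul([1, 2, 3], [4, 5, 6], 97)` gives `[31, 31, 28]`: the linear product has coefficients 4, 13, 28, 27, 18, and the terms of degree 3 and 4 wrap around onto degrees 0 and 1.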

Evaluation
We evaluate REEDBN using machines equipped with a quad-core 2.7GHz Intel Core-i7-7500U, a 5400RPM SATA hard disk, and 8GB RAM, running 64-bit Ubuntu 16.04.12. All results reported in this section are averaged over more than 10 runs. We use both synthetic and real-world datasets in the evaluation. The synthetic dataset consists of artificial files with random binary bits, and the real-world dataset is collected by the File system and Storage Lab (FSL) [20]. To evaluate the performance of REEDBN, especially compared with the enhanced encryption scheme in the original REED, we mainly measure the speed of data encryption, upload, and rekeying, as well as the storage overhead of REEDBN and REED.

Evaluation on synthetic datasets
In this subsection, we compare the performance differences between REEDBN and REED in the processes of encrypting, uploading, and rekeying using synthetic dataset. Figure 5(a) shows the time delay in encryption of files with different size in REED and REEDBN. The encryption time consists of the time that transforming file data into the encrypted stubs and trimmed packages. It can be seen that the speed of encryption be-tween these two schemes is close. The speed of encryption in REED is 10.5% higher than REEDBN. This difference mainly comes from the encryption algorithm used in these two schemes. We use NTRU to encrypt the stubs in REEDBN, which is a little bit slower than AES. In Figure  5(b), the upload time of REED and REEDBN is also close. The speed of upload in REED is 10.8% higher than REEDBN. The upload time denotes the delay of sending all data to the server. Due to the ciphertext expansion problem in NTRU, the amount of data that clients need to upload in REEDBN is more than that in REED. Since the size of the stub is relatively small, the extra bandwidth overhead in REEDBN is little. Therefore, REEDBN incurs a little bit more upload time overhead compared with REED. Figure 5(c) shows the speeds of encryption under REED and REEDBN versus the average chunk size. As the average block size increased, the encryption performance of both systems improved. This is because the larger the average block size, the fewer blocks need to be processed. For REEDBN, the amount of data encrypted by NTRU is also reduced, thereby the encryption overhead of the system is further reduced.  The most obvious gap between REEDBN and REED is the speed of rekeying, which is also the motivation of this paper. The differences between REEDBN and REED in rekeying focus on the communicational overhead in the system and computational efforts for clients. Figure 5(d) shows the rekeying performance for files with different sizes both in REED and REEDBN. 
For REED, we measure the performance of its active revocation scheme. The rekeying delay in REED grows as the file size increases, while the rekeying delay in REEDBN is stable and does not change with the file size. This is because, in REED, the encrypted stubs need to be downloaded, re-encrypted, and uploaded back to the server. As the file size increases, so does the size of the stub data, which multiplies the data transfer and the computational overhead for clients during rekeying. In contrast, REEDBN outsources the expensive ciphertext conversion to the server, which has strong computing power. Regardless of the file size, the only computational overhead for the client is the generation of a re-encryption key, and the only communication overhead in the system is the transfer of that key, which is several bytes in size.
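The scaling argument above can be illustrated with a rough traffic model. This is a sketch under stated assumptions rather than a measurement: it assumes one fixed-size stub per chunk, and the chunk size, stub size, and re-encryption key size below are hypothetical values chosen only to show the trend.

```python
def reed_rekey_traffic(file_size: int, avg_chunk_size: int, stub_size: int) -> int:
    """Bytes moved by a REED client: every stub is downloaded and re-uploaded."""
    num_chunks = -(-file_size // avg_chunk_size)  # ceiling division
    return 2 * num_chunks * stub_size


def reedbn_rekey_traffic(rekey_size: int) -> int:
    """Bytes moved by a REEDBN client: only the re-encryption key is uploaded."""
    return rekey_size


# Hypothetical parameters (assumptions, not values taken from the paper).
AVG_CHUNK_SIZE = 8 * 1024  # assumed average chunk size in bytes
STUB_SIZE = 64             # assumed stub size per chunk in bytes
REKEY_SIZE = 1024          # assumed re-encryption key size in bytes

for size_mb in (64, 256, 1024):
    file_size = size_mb * 1024 * 1024
    reed = reed_rekey_traffic(file_size, AVG_CHUNK_SIZE, STUB_SIZE)
    reedbn = reedbn_rekey_traffic(REKEY_SIZE)
    print(f"{size_mb:>5} MB file: REED {reed} B, REEDBN {reedbn} B")
```

In this model the REED traffic grows linearly with the file size, while the REEDBN traffic is a constant that depends only on the key size.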

Evaluation on real-world datasets
In this subsection, we evaluate the storage cost and encryption performance of REEDBN on a real-world dataset, namely the FSL dataset [20]. Figure 6(a) shows the storage cost of REEDBN. Although REEDBN applies deduplication only to the trimmed packages and suffers from the ciphertext expansion of NTRU, it still achieves good storage performance. We compare the size of the original data, before encryption and deduplication, with the total size of the trimmed packages and encrypted stubs stored on the server, for both REED and REEDBN. The storage overhead decreases significantly after deduplication. After 22 days, the deduplication ratios of REED and REEDBN are 89.3% and 87.6%, respectively. The deduplication ratio is defined as the ratio of duplicate data to original data. The ciphertext expansion in NTRU incurs a small extra storage overhead in REEDBN.
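As a worked illustration of the deduplication ratio defined above, the following sketch treats the duplicate data as the difference between the original size and the size stored on the server. That interpretation is an assumption, the function name is hypothetical, and the byte counts are made-up values chosen only so that the result matches the 89.3% figure.

```python
def deduplication_ratio(original_bytes: int, stored_bytes: int) -> float:
    """Return duplicate data / original data, with duplicate = original - stored."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    if not 0 <= stored_bytes <= original_bytes:
        raise ValueError("stored_bytes must lie between 0 and original_bytes")
    return (original_bytes - stored_bytes) / original_bytes


# Hypothetical sizes: 1000 units of original data, 107 units left on the server.
ratio = deduplication_ratio(original_bytes=1000, stored_bytes=107)
print(f"deduplication ratio: {ratio:.1%}")
```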

EAI Endorsed Transactions on Security and Safety 10 2020 - 01 2021 | Volume 8 | Issue 27 | e4

Fig. 6. Performance on real-world datasets

Figure 6(b) compares the size of the trimmed packages with that of the encrypted stubs in both REED and REEDBN. Compared with the trimmed packages, the stub data is relatively small. The storage overhead of the encrypted stubs is larger in REEDBN than in REED because of the ciphertext expansion in NTRU. The ratio of stubs to total storage is 17.5% in REED and 25.9% in REEDBN. We also measure the encryption performance of REEDBN over the days of the dataset. Figure 6(c) shows that REEDBN also performs well on real-world datasets, with encryption performance close to that of REED.
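The stub shares reported for REED and REEDBN can be turned into a back-of-the-envelope estimate of how much larger the stubs become in REEDBN. This sketch assumes that the trimmed packages occupy the same space in both systems, which the text does not state explicitly, so the result is only indicative; the function name is hypothetical.

```python
def stub_to_package_ratio(stub_share: float) -> float:
    """Convert the stubs' share of total storage into a stub/package size ratio."""
    if not 0.0 < stub_share < 1.0:
        raise ValueError("stub_share must lie strictly between 0 and 1")
    return stub_share / (1.0 - stub_share)


REED_STUB_SHARE = 0.175    # stubs / total storage in REED
REEDBN_STUB_SHARE = 0.259  # stubs / total storage in REEDBN

# Assumption: trimmed packages have equal size in both systems, so the ratio
# of the two stub/package ratios equals the growth of the stub data itself.
implied_expansion = (
    stub_to_package_ratio(REEDBN_STUB_SHARE) / stub_to_package_ratio(REED_STUB_SHARE)
)
print(f"implied stub expansion: {implied_expansion:.2f}x")
```

Under that assumption, the reported shares suggest that the NTRU-encrypted stubs take roughly 1.65 times the space of the stubs in REED.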

Conclusion
In this paper, we present REEDBN, a new NTRU-based rekeying scheme for encrypted deduplication storage. It takes advantage of NTRUReEncrypt to significantly reduce the computational effort for clients and the bandwidth cost in the system during rekeying. To rekey their outsourced data, clients only need to compute a re-encryption key and upload it to the server, without downloading any data from the server. The expensive re-encryption process is performed on the server side. We implement a REEDBN prototype and evaluate its performance. The results show that REEDBN indeed reduces both the bandwidth overhead in the system and the computational effort for clients.