1 Cryptoleq: A Heterogeneous Abstract Machine for Encrypted and Unencrypted Computation Oleg Mazonka, Nektarios Georgios Tsoutsos, Student Member, IEEE, and Michail Maniatakos, Member, IEEE Abstract The rapid expansion and increased popularity of cloud computing comes with no shortage of privacy concerns about outsourcing computation to semi-trusted parties. Leveraging the power of encryption, in this paper we introduce Cryptoleq: an abstract machine based on the concept of One Instruction Set Computer, capable of performing general-purpose computation on encrypted programs. The program operands are protected using the Paillier partially homomorphic cryptosystem, which supports addition on the encrypted domain. Full homomorphism over addition and multiplication, which is necessary for enabling general-purpose computation, is achieved by inventing a heuristically obfuscated software re-encryption module written using Cryptoleq instructions and blended into the executing program. Cryptoleq is heterogeneous, allowing mixing encrypted and unencrypted instruction operands in the same program memory space. Programming with Cryptoleq is facilitated using an enhanced assembly language that allows development of any advanced algorithm on encrypted datasets. In our evaluation, we compare Cryptoleq’s performance against a popular fully homomorphic encryption library, and demonstrate correctness using a typical Private Information Retrieval problem. Index Terms Abstract Machine, Compiler, Encrypted Computation, Obfuscation, One Instruction Set Computer, Heterogeneous Computer, Homomorphic Encryption, Paillier. Copyright c 2016 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org. O. Mazonka and M. Maniatakos are with the Department of Electrical and Computer Engineering, New York University Abu Dhabi, UAE (e-mail: om22@nyu.edu; michail.maniatakos@nyu.edu). N. G. Tsoutsos is with the Department of Computer Science and Engineering, New York University, New York (e-mail: nektarios.tsoutsos@nyu.edu). This draft has been accepted for publication in the IEEE Transactions on Information Forensics & Security. The final version of this article is available online at http://ieeexplore.ieee.org (DOI 10.1109/TIFS.2016.2569062). 2 I. I NTRODUCTION Contemporary computing paradigms, such as cloud and pervasive computing, have become increasingly popular as they allow outsourcing computation to a typically more powerful or dedicated set of machines. From Bitcoin mining [1] and Mersenne primes search [2], to commercial cloud services offered by major industry companies, outsourced computation requires code execution in a remote machine. One fundamental concern with such paradigms, however, is the privacy of the outsourced data [3]. In addition to the legitimate third party that performs the outsourced computation, additional concerns arise in light of side channel attacks [4] or even hardware Trojans [5]–[7]. Fortunately, cryptographic primitives such as homomorphic encryption can be leveraged to address those privacy concerns, and eventually return control of the data back to the legitimate information owner [8], [9]. As soon as fully homomorphic encryption (FHE) became theoretically possible [10]–[13], the academic interest in FHE applications has increased accordingly. From secure cloud computation [14] and verifiable computation [15], to multiparty computation [16] and message authenticators [17]. In addition, partial homomorphic encryption (PHE) has recently been leveraged for verifiable computation [18]. Despite the wide range of applications that can benefit from FHE schemes, their efficiency has been a concern, and their practicality has been questioned [19], [20]. More practical implementations of FHE, such as [21], have already been instantiated in the HElib software library, but fully homomorphic operations could still have overheads in the order of seconds [22]. In addition, HElib only recently has evolved to support bootstrapping and rencryption [23], and applications that use this library to implement generic/interactive computer programs that process homomorphic data are yet to be seen. On the other hand, PHE schemes are more practical than their FHE counterparts, and what the former lack in range of supported operations, they gain in efficiency. Indeed, PHE schemes typically require straightforward operations on ciphertexts, such as modular multiplication, which can be implemented very efficiently (e.g. [24]–[27]). Our observation is that PHE could be sufficient for practical applications of outsourced computation, where the applicable threat model can afford obfuscation as an adequate mitigation control to protect the privacy of processed data. Of course, PHE is less powerful than FHE in terms of computational completeness, and end-to-end encryption is traded for performance in computations. This tradeoff will allow us to design and implement a new programming language capable of processing homomorphically encrypted data with practical computational cost. Problem formulation: This work addresses the problem of protecting the privacy of sensitive data, when these data are being processed within semi-trusted containers and the computation is outsourced. 3 Fig. 1. Cryptoleq abstraction layers. In addition, this work combines privacy preservation with a practicality requirement, which, so far, is difficult to achieve through FHE schemes. Likewise, this work also addresses the usability problem of theoretical approaches, which prevents them from seeing wide deployment and being used in everyday applications. Our contribution: In this work we propose Cryptoleq, a new programming language based on a single instruction computer architecture, which processes homomorphic data natively. Cryptoleq defines a universal computer for processing encrypted and unencrypted data together within the same program memory space. Data encryptions are generated using Paillier PHE, and only homomorphic addition is natively supported. Universal computation, however, requires support for both addition and multiplication, and in this work, we simulate multiplication with heuristically obfuscated re-encryption implemented using Cryptoleq instructions. Notably, we devised a PHE operating model with a single instruction that supports both encryption and decryption within itself ; this effectively enables unifying re-encryption with the executing program. Overall, we claim the following: • Design and implementation of Cryptoleq, which expands single instruction computing with native support for homomorphic data using a novel bit layout representation. Cryptoleq supports programs written without privacy protections, as well as protected execution using encrypted data under full encryption or heuristic obfuscation modes, depending on the need to multiply encrypted values. • A practical framework for Cryptoleq with extended assembly language, compiler, and emulator for executing Cryptoleq programs on different platforms (e.g. x86, ARM). Roadmap: The paper is organized on a bottom-up traversal of Cryptoleq abstraction layers (fig. 1). Following a preliminaries discussion on homomorphic encryption and single instruction architectures (section II), the Cryptoleq abstract machine is presented in section III. Multiplication on encrypted operands is then presented in section IV, while section V discusses the Cryptoleq Enhanced Assembly Language (CEAL) that accelerates program development in Cryptoleq. In section VI, we evaluate the correctness and performance of Cryptoleq using a Private Information Retrieval case study, and compare its efficiency 4 against a popular FHE library [23], while related work and conclusions appear in sections VIII and IX. II. P RELIMINARIES A. Homomorphic Encryption Homomorphic encryption allows the application of mathematical operations directly on encrypted data so that the results after decryption would correspond to applying matching operations on unencrypted data. This can be expressed as: v = f (u) ⇐⇒ Enc(v) = g(Enc(u)) (1) where v and u are some vectors in the unencrypted domain, Enc(·) is the encryption operation over each vector element, and f and g are related functions. For example, if Enc(z) = az then Enc(x) · Enc(y) = Enc(x+y) so the homomorphism is between addition in unencrypted data and multiplication in encrypted. In this case u is a vector of two elements, v is a scalar, f is scalar addition, and g is scalar multiplication. Having one homomorphic mathematical operation is not sufficient for universal computation, as not all functions can be expressed through that operation. On the other hand, when we have both addition and multiplication, there exists a zero element that is both an additive identity and multiplicative absorbing element, and this allows evaluating any function. Indeed, addition and multiplication on a binary ring would define a Turing complete pair of logic gates [28], and such pair is sufficient to construct a universal machine. A homomorphic encryption scheme supporting only one operation is called partially homomorphic, while a scheme supporting two orthogonal operations is called fully homomorphic. In this work, we employ the Paillier additive PHE scheme that uses a modulus N as its security parameter [29]. This modulus is defined as the product of two primes, and message m is encrypted as rN g m mod N 2 . In Paillier, g is an encryption base parameter, while r is randomly selected to provide semantic security to the probabilistic scheme [30]. B. One Instruction Set Computer One instruction set computer (OISC) is a computer architecture which supports only one instruction and is able to perform universal computation [31]. It operates on memory organized as a sequence of memory cells, while processor instructions and data reside in unified memory space, following the vonNeumann model. There exist three OISC categories [32]: (i) Transport Triggered Architecture Machines, (ii) Bit Manipulation Machines, and (iii) Arithmetic Based Machines. Since OISC has only one instruction, it is an attractive choice for homomorphic computations for several reasons. In particular, OISC does not require the use of opcodes, since there is no need to discriminate 5 between different instructions; the single instruction is fully defined by its arguments and the microoperations associated with that instruction are predefined. On the other hand, if the underlying architecture has multiple opcodes, they could either be unencrypted, so information about the executed algorithm can be leaked, or could be encrypted, so the encrypted processor would have to obliviously compute all possible opcodes for the given arguments and homomorphically combine their results. Furthermore, since OISC architectures typically apply a simple mathematical operation over their arguments, this operation can be directly ported to the encrypted domain by identifying its homomorphic counterpart. Thus, due to their simplicity, OISC architectures are naturally compatible with homomorphic encryption relations (eq. 1) and can be extended to support processing of encrypted arguments. One popular Turing-complete OISC is Subleq, which stands for subtract and branch if less than or equal to zero [33]. Its abstract machine is easily implemented both in hardware and emulating software, as it is sufficiently simple and computationally efficient compared to other OISC variants. In this work, we leverage some Subleq principles for constructing our computational model, as presented in the following section. III. C RYPTOLEQ A BSTRACT M ACHINE A. Description of the Model The Cryptoleq abstract machine is a processor model operating on a sequence of memory cells. Each memory cell has an address and a value, while any cell value may also be used as a memory address. The processor has an instruction pointer IP and executes instructions read from memory. Each such instruction consists of three operands, named A, B and C . Let [·] denote a dereference operation so that [X] represents a memory cell with address X , as well as its value. The three instruction operands are fetched from memory as follows: A = [IP]; B = [IP + 1]; C = [IP + 2]. (2) As soon as A and B are fetched, referenced memory cells [A] and [B] are accessed using the values of A and B as addresses (fig. 2). Then, the instruction modifies [B] and IP values according to the following two steps: [B] =O1 ([A], [B])  C, if O2 ([B]) ≤ 0 IP =  IP + 3, otherwise (3) where O1 and O2 are operations that will be explicitly defined in the following paragraphs (eq. 5). O1 results in a “cell type” value, while O2 results in an integer. 6 Fig. 2. Instruction and data organization in Cryptoleq. As soon as the first step of eq. 3 is executed (i.e. operation O1 ), the result is stored back to the memory cell pointed by B . Next, operation O2 is performed on the updated value of [B] and the result is compared with zero. We further define a “Less or equal to zero” compound Boolean operation, named Leq, over a cell value argument: def Leq(x) = (O2 (x) ≤ 0). (4) If Leq([B]) test is True, then value C is assigned to IP. This is effectively a control flow branch to the address of the cell where C points to. On the other hand, if Leq([B]) is False, the IP is updated with the address of the cell immediately after C (i.e. if IP was pointing to A, then it is increased by 3) and the execution sequence is repeated for the next instruction. Additional details regarding the increment of IP and address sequence organization are presented in section III-B. O operations: In Cryptoleq, operations O1 and O2 are defined as follows:   x−1 −1 2 O1 (x, y) = x · y mod N , O2 (x) = . N (5) Throughout this paper we use exponent (−1) for reciprocal1 in the corresponding modulus; to avoid ambiguity we also use (·)−1 ξ , where placeholder subscript ξ explicitly defines the modulus for the reciprocal. The floor brackets notation b·c refers to the integer part of the fraction in the equation above. Operations O1,2 have been judiciously selected in order to: 1) support computation on encrypted operands, compatible with the chosen encryption scheme (as it will become evident in section III-B), and 2) support computation on unencrypted operands, compatible with existing Turing-complete single instruction languages (further discussed in section III-D) 1 Modular inversion is an expensive operation; section III-E presents how this operation can be avoided. 7 B. Encryption Scheme Let N be a cryptographic parameter equal to the product of two primes. The Paillier encryption scheme is defined as a unique correspondence between a value x from Z∗N 2 and values m and r from ZN and Z∗N respectively: x = rN gm mod N 2 (6) where g is a generator in Z∗N 2 . We select g = 1 + N k with a random k coprime to N . Our selection of g is less general than the original Paillier encryption definition [29], where g can be any number from Z∗N 2 with its order being nonzero and multiple of N . This selection of g is equally secure to [29] and sufficient for our scheme, with the order of g being N , and the rest of encryption and decryption being the same up to notation. In our encryption scheme, r serves as the probabilistic part of encryption, while m is the plaintext value to be protected. Based on our selection of g , eq. 6 becomes: x = Enc(m) ≡ rN (1 + N km) mod N 2 . (7) Here m can be written inside the brackets because modulus N 2 reduces all higher powers of N . Moreover, decryption requires knowledge of k as defined above, and φ that is the value of Euler’s totient function of N : −1 xφ(kφ)N − 1 m = Dec(x) = N r = (xg −m Nφ−1 ) mod N, (8) mod N. Eqs. (7) and (8) provide a clear method to encrypt and decrypt numbers comprising the data and executable code of a program, while operations O1,2 in eq. 5, along with the abstract machine description, define a method to evaluate a program. Operation O1 (eq. 5) is homomorphic to subtraction of m’s with recombination of random parts, as follows: x−1 y = (rx−1 ry )N g my −mx mod N 2 . (9) Thus, multiplication of y by the inverted x corresponds to subtraction of their unencrypted values. Having the subtraction operation also brings the addition functionality to the programming paradigm, considering x + y = x − (0 − y). TS-notation: Our observation about the aforementioned encryption scheme is that the encrypted value x is also bijective to a unique pair of values t and s: x = 1 + Nt + s (10) 8 with t = [0, N − 1], s = [0, N − 2], and (s + 1) being coprime to N . If eq. 10 reconstructs x from t and s, the converse is:   x−1 t= N and s = ((x − 1) mod N ). (11) To distinguish these two equivalent representations, in this work we name X value the left-hand side of eq. 10 and TS value the corresponding t and s pair. The view of an encrypted value x (X value) as t and s (TS value) has additional benefits: First, as elaborated in section III-C, a simple definition of negative values for O2 is possible. Examining O2 in eq. 5 and t in eq. 11, we observe that comparing t with zero is equivalent to Leq(·) comparison in eq. 4. In addition, TS-values are useful when s is zero, since subtraction of t values demonstrates the same homomorphic property as the encrypted m in eq. 9: x−1 y = (1 − N tx )(1 + N ty ) = (12) = 1 + N (ty − tx ) 2 mod N . Leveraging this property, the Cryptoleq processor can execute instructions using TS values and generate the same result as if the instructions were executed on a backwards compatible Subleq processor. Moreover, a TS pair with s 6= 0 can still be used in addition and subtraction because a special “unit” for each particular s can be defined. Let U = 1 + N u be a unit that increases t values by 1. Then, combining this definition with eq. 10 and operation O1 (eq. 5), we can solve for u: U −1 · (1 + N t + s) = 1 + N (t + 1) + s mod N 2 , (13) u = (1 + s)−1 mod N. Using this definition, U can be used as a unit to predictably change the t part of any encrypted value. Since memory cell addresses in Cryptoleq are of the same type as cell values, arranging these cells in a sequence without introducing the notion of U , is problematic: when an operand is treated as an address, naive homomorphic addition or subtraction of a constant does not correspond to an equal increment or decrement in the address value. Introducing the U unit is essential for defining the notion of sequence in encrypted values and use them as memory addresses. Hence, address arithmetic in eqs. (2) and (3) becomes a simple modular multiplication with a power of U ; moreover, if TS values are used, address arithmetic is expressed as a direct increment or decrement of the t part. C. Range of values The Leq operation in eq. 4 is required to make a decision on what values within the range (0, N ) are considered less than zero. Thus, to introduce the concept of “negative” numbers, we break the values in range (0, N ) into two groups: 9 Fig. 3. Schematic display of Cryptoleq valid number range. • positive is a number whose most significant bit position is less than the most significant bit position of N , and • negative is a number whose most significant bit is at the same bit position as the most significant bit of N . This grouping is natural, as it is similar to the commonly accepted bit representation of numbers in conventional computers. Since N is not a power of 2, the range of negative numbers is actually smaller than the range of positive numbers. To ensure symmetry between positives and negatives, as expected in computer arithmetic, we further restrict the range of valid positive numbers. Therefore, we define a “valid” positive number as a number that: 1) has a corresponding negative counterpart, congruent to modulus N , and 2) does not become a negative number when doubled. Since 2blog2 N c limits the range of positive numbers, N − 2blog2 N c defines the range of negative numbers. A natural parameter governed by the above definition of valid positive numbers can be expressed as the largest power 2β with: k j β = log2 (N − 2blog2 N c ) (14) so that any number greater than zero and less that 2β is a valid positive number in Cryptoleq programs. Hence, the whole range of integers in a program is divided into the following four classes shown in fig. 3: (i) zero, (ii) valid positive numbers < 2β , (iii) valid negative numbers, and (iv) other numbers with undefined behavior if used in the program arithmetic. The β parameter introduced in eq. 14 defines numbers available to the arithmetic of Cryptoleq programs and provides a strong guarantee that each positive number has a corresponding negative, so that the Leq test (eq. 4) is defined correctly. Furthermore, β is required by the multiplication algorithm presented in section IV-C. It is worthwhile to note that some N moduli may correspond to a relatively small β value, which may not be sufficient to accommodate the required range of numbers in a program; selecting such an N modulus, however, has very low probability as the bitsize of N increases. 10 Bit representation of numbers in Cryptoleq is somewhat different from conventional representations. Programs can use a value either encrypted (as eq. 7), or unencrypted, as a TS value with t = m and s = 0. We name this latter case open representation: x = Open(m) = 1 + N m. (15) Typical bitwise operations over this bit representation of numbers are not naturally supported and require emulation within the program using multiplication and division. Using our definitions for operations O1,2 , the notion of negative numbers, and the ability to organize memory with sequential addresses, we can fully define our Cryptoleq abstract machine. Cryptoleq is able to obliviously execute code independent of whether values are constructed using encryption (eq. 7) or the open representation (eq. 15). This is beneficial since: • it provides backward compatibility to other single instruction architectures, and • both encrypted and unencrypted operands can be mixed within the same program to allow heterogeneous execution. D. Mixed-mode execution A Cryptoleq program can operate on open values (eq. 15), as well as encrypted ones (eq. 7). For the former case, program instructions effectively perform homomorphic subtraction (eq. 12) of the t parts (recall that open values have s = 0 by definition), and compare the t value directly with zero, as this is equivalent to Leq(·) comparison (eq. 4) since O2 (x) (eq. 5) and t (eq. 11) yield the same value for the same x. Hence, the effective operation that instructions perform on open values is the same as if plaintext values were processed, and if operations O1,2 in eq. 5 were redefined as: O1Subleq (x, y) = y − x and O2Subleq (x) = x. (16) Indeed, if the set of eqs. (2) to (5) formally defines the Cryptoleq abstract machine, then redefining O1,2 as in eq. 16 would be sufficient to define a Subleq abstract machine instead. Therefore, we can apply eq. 15 on all instructions and data of a Subleq program and run it natively using Cryptoleq. In practice, encrypted Cryptoleq programs can use open values in several cases (e.g. for iteration or array indexing). Coexisting open and encrypted values in a program does not affect correctness since their interrelated usage is limited: while an encrypted value can be multiplied by an open value to produce an encrypted result, they cannot be added together or converted from one to the other. Indeed, a property of secure encryption schemes is that encrypted values exist in their own domain and cannot be manipulated to leak data. To actually break the barrier between the encrypted and unencrypted domains, mixing open and encrypted values, obfuscated decryption and re-encryption capabilities can be used (section IV-B). 11 Fig. 4. Memory organization and inverted values. Cryptoleq instructions always treat open and encrypted values the same, as they are indistinguishable to the processor. In addition, modular multiplication of two open or two encrypted values is always homomorphic to adding the corresponding plaintexts. Nevertheless, Leq(·) comparisons (eq. 4) are expected to be performed only on open values, as applying this comparison on encrypted values with an unknown random part r would yield an unpredictable result. In some cases, the Leq comparison can be applied to encrypted values as well, but only if the compiler or the programmer can track modifications of the random part in encrypted values. In the latter case, it is possible to discriminate between expected encryptions. Mixed-mode provides the flexibility to prioritize the variables to be kept private, given specified performance constraints, experimentally demonstrated in section VI. E. Implementation remarks Memory organization: Since the Cryptoleq abstract machine represents memory addresses using encrypted values, any potential implementation with standard memory modules would require a translation from these encrypted value to natural memory indices. In this work, we propose three different memory organization approaches: 1) In the first approach, the memory is organized as a collection of sectors (fig. 4). Each sector is a collection of continuous segments (i.e. sequences of memory cells) and all cell addresses within one sector share the same s value, while all cell addresses within one segment have sequential t values. Incrementing a t value by a unit (eq. 13) returns the next cell address. 2) In the second approach, we employ the x values as addresses. Since the size of x is much larger compared to physical addresses, a “hash map” is used for translation. 3) The third memory organization also uses x values of addresses but implements a self-balancing binary search tree (e.g. red-black tree) data structure to allow efficient search of the physical address 12 corresponding to an x value. The first memory organization type is a natural selection when implementing the memory in hardware, while the second and third types are good matches for software emulations of Cryptoleq that leverage standard libraries (e.g. C++ STL). Operand representation in memory: A second implementation decision is related to the representation of Cryptoleq operands in memory, as TS or X values. The first memory organization type discussed above (i.e. use of sectors) is by design compatible with TS storage, since access to a memory location is direct. The second and the third organization types (i.e. use of hash map and binary search tree respectively) are generally faster when the X representation is used, since x values are used directly for address mapping. A benefit of the X representation is its convenience for multiplying operands (eq. 5), while Leq comparisons (eq. 4) can be implemented using comparisons with precomputed values as: Leq(x) = (x < 1 + N ) ∨ (x > N · 2blog2 N c ). (17) On the other hand, the TS representation requires temporary transformation to X before multiplying operands, but it is more convenient for Leq comparisons. An extensive comparison of the various options appears in section VI-B1. Avoiding modular inversion: Another important optimization is avoiding the modular inversions required in eq. 5, since such inversions incur high performance overheads. In this work, we propose to precompute and store the inverted values alongside the non-inverted ones. Then, a runtime inversion can be implemented by swapping the stored pair. In this case, the Cryptoleq processor must update both values in the pair when [B] is updated, so an additional modular multiplication is necessary: for example, if one stored pair is {x, x−1 } and another is {y, y −1 }, operation O1 in eq. 5 should be applied twice to create an updated pair {x−1 · y, x · y −1 }. The benefit of this optimization is the evasion of expensive modular inversions, but this is traded with the aforementioned additional multiplication and doubling of required memory. This inverted representation can be used with both TS and X notations; thus, there are four distinct operand representation configurations (as presented in fig. 4): (i) TS, (ii) X, (iii) TS with inverted TS (i.e. TS&ITS), and (iv) X with inverted X (i.e. X&IX). Note that t0 and s0 for inverted TS are not the modular inverses of t and s, but are computed using eq. 11 on x−1 . Selecting β : Eq. 14 essentially defines the maximum ranges for valid positive and negative numbers in the program’s arithmetic. Nevertheless, if β is always set to maximum using this equation, increasing the size of N (i.e. the security parameter) would actually reduce efficiency, since the maximum possible β increases as well. In practice, Cryptoleq allows programmers to fix β to a value smaller than the one in eq. 14. Indeed, an arithmetic range up to 64 bits for example is sufficient for most programs. As 13 demonstrated by our experiments in section VI-B, restricting β to standard program arithmetic sizes has a significant improvement on execution overheads for large N . F. Completeness of Cryptoleq Based on the previous discussion, our Cryptoleq abstract machine supports programs that homomorphically add or subtract values and can perform conditional jump based on non-encrypted values (these are the same basic operations as a typical OISC architecture [31]). An important concept that has not been introduced yet, however, is support for multiplication2 and comparison within programs. Other arithmetic OISC implementations address this problem at the software level, using multiplication algorithms expressed in native instructions. The same can be done in Cryptoleq. Multiplication algorithms, however, are optimally implemented using conditional jumps and in Cryptoleq this is available only for non-encrypted values (i.e. only such values can support multiplication). This limitation exists since conditional jumps over encrypted values could immediately leak side channel information about those values. Since multiplication (along with addition) is necessary for universal computation, one of our contributions is designing a special software function G that is used to implement a multiplication algorithm for encrypted values without conditional jumps, as well as to support encrypted comparisons. IV. M ULTIPLICATION OVER E NCRYPTED VALUES A. Notation Before introducing our algorithm for multiplication of encrypted values, as well as defining the corresponding software modules, we provide a brief description of our notation. For the remaining of this section, we employ special notation for binary and unary operations applied on ciphertexts, to avoid uncertainty compared to standard arithmetic operations applied on plaintexts (such as regular addition, subtraction, negation and multiplication). Therefore, we define the following signs on any ciphertexts x and y (defined as in eq. 7): 1) Plus sign with hat (x +̂ y ): Is a binary operator that denotes modular multiplication of ciphertexts x and y : x +̂ y ≡ Enc(Dec(x) + Dec(y)) = x · y mod N 2 . 2) Minus sign with hat (x −̂ y ): Is a binary operator that denotes modular multiplication of ciphertext x with the modular multiplicative inverse of another ciphertext y : 2 Division can be built on top of multiplication. 14 x −̂ y ≡ Enc(Dec(x) − Dec(y)) = x · y −1 mod N 2 . Likewise, the unary minus operator (−̂y ) is equivalent to Enc(0) −̂ y . 3) Star sign (x ? y ): Is a binary operator that denotes execution of alg. 1 on ciphertexts x and y : x ? y ≡ Enc(Dec(x) · Dec(y)) mod N 2 . In addition to the aforementioned operators, we also define the tilde accent ( e· ) over a given plaintext m, to denote encryption of this message with a new Paillier randomization parameter, as in eq. 7: m e ≡ Enc(m). Furthermore, as already defined in eq. 15, for a given ciphertext x, the bar accent x denotes the open value corresponding to that ciphertext. B. Function G Function G is a software module that performs heuristically obfuscated decryption and re-encryption. Our goal is to define the simplest mathematical function which is sufficient for designing any other complex algorithm in Cryptoleq, such as encrypted multiplication or comparison. Definition: Let x and y be encrypted values based on eq. 7, and let y 0 = rN · y mod N 2 be defined as a re-encrypted y with a new randomization parameter r. Following the notation in section IV-A, we define function G is as follows:  e 0, if Leq(x) G(x, y) = y 0 , otherwise. (18) Function G takes two encrypted arguments as inputs and returns the second argument modified with new randomization when the unencrypted value of the first argument is positive; otherwise, it returns an encryption of zero. In both cases, a different randomization parameter is used each time. Our observation is that function G can be expressed using Cryptoleq instructions. Moreover, since Cryptoleq’s O1 operation is multiplication (eq. 5), exponentiation is also easy to implement as a combination of squaring and multiplication operations based on the bit expansion of the exponent. In Cryptoleq, if an encrypted value is raised to exponent φ(kφ)−1 N , it actually decrypts into an open representation (as in eq. 15): −1 −1 xφ(kφ)N = (rN (1 + N km))φ(kφ)N = −1 = (1 + N km)φ(kφ)N = = 1 + N kφ(kφ)−1 N m = 1 + Nm (19) mod N 2 . Note that rN φ = 1 mod N 2 . Cryptoleq programs do not have to retain a copy of φ(kφ)−1 N , as its obfuscated bit expansion is sufficient to define the sequence of multiplications in calculating the result of function G; this sequence can be statically generated during program compilation. In case programmers 15 choose to use a fixed β , as mentioned at the end of section III-E, any plaintext value can be multiplied by a random coefficient s < 2βmax −βf ixed without affecting the result of Leq(x) in eq. 18. Hence, as a security enhancement, Cryptoleq programs can retail the obfuscated bit expansion of s · φ(kφ)−1 N and calculate the result of function G correctly. As it appears in the next section IV-C function G is adequate to implement a multiplication algorithm on encrypted values. Moreover, appendix A illustrates an example of how function G is being calculated. C. Multiplication Algorithm The multiplication algorithm3 is a software library procedure that produces an encrypted product given two encrypted factors. The requirements for this algorithm are: (i) it should be based only on addition and subtraction, (ii) it uses function G, and (iii) it cannot use any conditional jumps on encrypted values. The non-branching constraint ensures that the algorithm iterations are always deterministic and do not depend on any argument values. Moreover, section III-C defines the ranges of valid positive and negative values for Cryptoleq programs so that the β parameter can be used to determine the algorithm iterations without any risk of overflowing positive values. Another observation is that it is sufficient to construct a multiplication algorithm for positive values only, since multiplying arbitrary values reduces to multiplying positives using alg. 1: x ? y = (z1 ? z3 ) −̂ (z2 ? z3 ) z1 = (G(x, e 1) ? G(y, e 1)) +̂ (G(−̂x, e 1) ? G(−̂y, e 1)) (20) z2 = (G(x, e 1) ? G(−̂y, e 1)) +̂ (G(−̂x, e 1) ? G(y, e 1)) z3 = |x| ? |y| using the following formula for absolute values: |x| = G(x, x) +̂ G(−̂x, −̂x). (21) Eq. 20 expresses the multiplication of two arbitrary numbers x and y , based on the multiplication of their absolute values as well as multiplications of function G outputs (either e 0 or e 1 in this case). In addition, eq. 20 uses negations, which are defined as the modular additive inverses of the given values. Our proposed non-branching multiplication procedure is presented in alg. 1. The algorithm is the homomorphic equivalent of “shift & add” multiplication, and uses function G to add the value of the 3 This should not be confused with multiplication used for executing instructions (i.e. O1 in eq. 5), which is homomorphic to addition in the unencrypted domain. Here, the presented algorithm is built on top of this homomorphic addition and is denoted with the ? operator. 16 Algorithm 1 Multiplication (?): top level 1: procedure M ULTIPLY(x, y ) 2: sum ← e 0 3: for (β + 1) times do 4: z ← Div2(x) 5: bit ← x −̂ (z +̂ z) 6: sum ← sum +̂ G(bit, y) 7: y ← y +̂ y 8: x←z 9: 10: 11: . β is a global parameter . Div2 to be defined in alg. 2 end for return sum end procedure Algorithm 2 Division by 2 1: procedure D IV 2(x) 3: sum ← e 0 β p ← 2f 4: for β times do 2: 2 5: p2 ← Half2(p2 ) 6: y ← sum +̂ p2 7: y ← (y +̂ y) −̂ x 8: sum ← sum +̂ G(e 1 −̂ y, p2 ) 9: 10: 11: . Half2 to be defined in alg. 3 end for return sum end procedure multiplicand to an accumulator, if the least significant bit (LSB) of the multiplier is not zero, before left shifting the multiplicand and right shifting the multiplier. Homomorphic isolation of the LSB entails 3 steps: invoking alg. 2 that performs the homomorphic equivalent of integer division by 2 (i.e., a right shift); adding the quotient to itself (i.e., a left shift); and subtracting from the multiplier. Alg. 2 is also expressed as a non-branching algorithm and uses the Half2() auxiliary procedure (alg. 3). The input to Half2() can only be a power of 2 and the procedure divides its input by 2, using function G and absolute values as in eq. 21. An optimization for Half2() is to use a precalculated lookup table, which is our 17 Algorithm 3 Division by 2 of power of 2 1: procedure H ALF 2(x) 2: sum ← e 0 3: p2 ← e 1 4: for β times do 5: y ← (p2 +̂ p2 ) −̂ x 6: y ← G(|y| , e 1) 7: y ← G(e 1 −̂ y, p2 ) 8: sum ← sum +̂ y 9: p2 ← p2 +̂ p2 10: end for 11: return sum 12: . |·| - eq. 21 end procedure current implementation, reducing the number of calls to function G from O(β 3 ) to O(β 2 ). A numerical example on the execution of alg. 1 is presented in appendix B. V. C RYPTOLEQ E NHANCED A SSEMBLY L ANGUAGE A. Overview Cryptoleq is complemented by a developed piece of software named Cryptoleq Enhanced Assembly Language (CEAL). It is designed to aid in writing extended programs. Specifically, since the algorithms enabling universal computation on encrypted data have to be written in Cryptoleq, their complexity requires more expressive language than mere sequence of Cryptoleq instructions. CEAL allows writing symbolic notation avoiding explicit memory address manipulation. A CEAL compiler automates several procedures required for proper program execution, including, but not limited to, memory arrangement, address resolution, expression evaluation and macro definition substitution. It can also generate function G, as discussed in section IV-B. This section discusses the fundamental differences of CEAL compared to typical compilers4 . CEAL provides syntax for supporting encrypted values, either directly hardcoding them in a program or generating them during compilation time. The compiler can also generate random encrypted values 4 A detailed description of CEAL is out of scope of this paper and can be found at [34]. 18 to be stored in memory or values to be used as memory addresses. So the program data can be easily initialized with encrypted values, as well as assigned to randomly selected memory addresses. B. Arrays Similar to other high level programming languages, CEAL can allocate memory arrays and access memory cells by either indexes or pointers. This is supported by Cryptoleq instructions via self-modifying code. In order to access memory indirectly – via a pointer – for reading or writing, an instruction is used to write the address into the corresponding operand of the target instruction. The CEAL compiler allocates the required amount of memory, initializes it and assigns the memory addresses. An array can be placed in a continuous block of memory with all elements sharing the s part of their addresses. CEAL syntax provides three options for placing an array: (i) in the default block of memory with s = 0, (ii) starting from an explicitly defined address, or (iii) at a random position. In the last two cases, the addresses of the array elements are indistinguishable from encrypted values. Array element access is facilitated using pointer arithmetic, and more specifically modification of t part: the program uses the unit value, as discussed in eq. 13, to navigate through the array elements. C. Library, Multiplication and Function G Macros are introduced into CEAL to simplify repetition of instruction sequences, to make a local scope for names, and to build more complex constructions, such as functions. CEAL defines functions as special constructs built by small macros passing the execution to specific points in the program – function entries – while copying the return address in advance, so when the function finishes execution the control flow is passed back to the calling point. These macros also prepare the arguments and return values. In order to reuse useful general macros, a library written in CEAL can be attached to a program by an include directive. This is similar, for example, to C include directive and C standard library linking. The CEAL library implements several macros and algorithms, most notably multiplication of encrypted and unencrypted values, function G, and equal operation on encrypted values. Specifically, the developed CEAL library defines two multiplication algorithms as functions: one for encrypted and one for unencrypted values. The reason the algorithms are different is that multiplication of unencrypted values can use conditional jump, hence can be implemented in a more optimal way. On the contrary, encrypted multiplication in Cryptoleq is implemented by code written in CEAL implementing the three algorithms of section IV-C. 19 The current implementation of the CEAL compiler and the accompanying library, uses a specific built-in directive to generate the function G during compilation of a program. This directive takes three arguments: the first is a constant expression corresponding to the bit expansion of φ(kφ)−1 N , while the remaining arguments are the names of two macros implementing squaring (for bits 0 in the bit expansion), as well as squaring and multiplying (for bits 1).5 Using the provided bit expansion, the compiler generates a list of macro calls to either the squaring macro, or the squaring and multiplying one, mapping each such call to the corresponding bit in the bit expansion (starting from the least significant one, without generating any branching code). This list of macros is used at runtime to compute the decryption of the first ciphertext argument of function G. The compiler completes the implementation of function G, by referencing library code to perform the following steps: (i) generate new encryptions e 0 (using previous random encryptions of zero as randomness seed); (ii) depending on the result of a Leq test (eq. 4) on the decryption of the first ciphertext argument, the code returns either a newly computed e 0, or a re-encryption of the second ciphertext argument as e 0 +̂ y . At runtime, the aforementioned list of macro calls accepts a ciphertext x as input, and computes the decryption of x (as in eq. 19). The decryption result (x) is calculated with the assistance of a memory cell acting as an accumulator A, initialized to plaintext value 1 (for each decryption). If the next macro call in the list corresponds to squaring (i.e., there was a bit 0 in the expansion of φ(kφ)−1 N ), the memory cell containing x is updated with the square of x. Otherwise, if the next macro call in the list corresponds to multiplying and squaring (i.e., there was a bit 1 in the expansion of φ(kφ)−1 N ), A is updated with the product of x with A, before x is updated with the square of x. After all macros in the list are executed, the correct decryption result is computed. Finally, the executed code generates new e 0 values, and based on Leq(x), it either returns a e 0, or a re-encryption of the second ciphertext argument y . D. Equal function In addition to the four basic operations already discussed (i.e. addition, subtraction, multiplication and division), Cryptoleq programs require additional arithmetic operations over encrypted values. A prominent example is equivalence of two encrypted values that is homomorphic to equality of their plaintexts. Therefore, we define the equivalence comparison operation as follows:  e 1, if Dec(x) = Dec(y) Equal(x, y) = e 0, otherwise. 5 (22) Note that this multiplication and squaring refer to the standard modular multiplication, not the ? operation using alg. 1. 20 Algorithm 4 Private Information Retrieval 1: procedure P IR(input) 2: array table[6][2] 3: sum ← e 0 4: for i = 0 to 5 do 5: key ← table[i][0] 6: val ← table[i][1] 7: sum ← sum +̂ (val ? Equal(key, input)) 8: end for 9: return sum 10: . Equal returns e 0 or e 1 (eq. 22) end procedure This function can be easily expressed via function G: Equal(x, y) = e 1 −̂ (G(x −̂ y, e 1) −̂ G(y −̂ x, e 1)). (23) From a theoretical standpoint addition and multiplication operations are sufficient for Turing completeness. Hence, equality checks in the encrypted domain can be implemented using an bitwise OR operation over all bit encryptions of the difference of two numbers. In this work, the introduction of the Equal function is actually a practicality feature and not a theoretical requirement. It should be emphasized that the result of the Equal function is an encrypted value, hence no information about the compared plaintexts is actually leaked. Cryptoleq programs are not allowed to branch based on encrypted values, so the result of the Equal function cannot be used for control flow decisions. VI. P ERFORMANCE E VALUATION In this section, we assess Cryptoleq’s performance with respect to various memory organization options, encrypted operand representations, as well as different security and β parameter sizes, via a classic PIR case study. In addition to demonstrating correct functionality of our framework, the PIR case study allows us to report statistics over different operation modes, identify the most efficient configurations, and highlight the advantages of heterogeneous computation by obliviously mixing encrypted and unencrypted data. Moreover, we report Cryptoleq’s raw performance figures for encrypted addition, subtraction and multiplication operations, while comparing its efficiency to the HElib FHE library [23]. 21 TABLE I C OMPARISON OF DIFFERENT MEMORY MODELS AND OPERAND REPRESENTATIONS ( IN SECONDS ). TS X TS&ITS X&IX Sectors 3244 3236 79.3 57.0 Hash 3208 3201 61.9 16.3 Red-Black tree 3222 3195 61.8 15.1 A. PIR Case Study Private Information Retrieval (PIR) is a classic example of applications which require private computation. In the simplest scenario, the user maintains an encrypted database on an untrusted server. At some point, the user desires to retrieve some data from the database, without revealing any information about the inquiry itself, data stored in the database or the result of the inquiry. As a simple example, the database can be visualized as a table of key-value pair entries, e.g. {1:6, 2:7, 3:8, 4:9, 5:0, 6:1}. An inquiry to the database is a particular key and the expected output is the corresponding value. So, in this example, when the key 3 is requested the output returned to the user should be 8. As mentioned before, both the key input and the return result are encrypted. Therefore, PIR entails a brute-force search through all encrypted entries, secretly comparing database keys with the encrypted input, eventually returning the encrypted value when the keys match. A straightforward algorithm of the PIR example appears in alg. 4. The flow of the algorithm ensures that (i) no data are ever decrypted; and (ii) there is no branching based on sensitive data. This algorithm is implemented in CEAL in a similar fashion: the program iterates through the entries of the table, accumulating the result of the multiplication of the value of each entry and the Boolean result of the equal function between the encrypted key and the encrypted input. Alg. 4 is ported to CEAL using 24 lines of code, and when compiled it occupies approximately 30000 memory cells. B. Cryptoleq Performance Analysis 1) Memory options: Section III-E describes the various options for memory organization and operand representation. Table I shows the execution time of the PIR example running with 32-bit N for various configurations. The times are shown in seconds. The table columns define the memory cell type: TS, X, TS & inverted TS (TS&ITS), and X & inverted X (X&IX). For the last two configurations, as mentioned in section III-E, the memory storage requirement doubles since inverted values are also stored along the 22 Fig. 5. Calculation time (in seconds) and Instruction count (in 105 units) vs N bit size. original values. The rows of table I correspond to the selected memory type: Sector type, Hash map, or Red-Black tree. The results show that Hash and Red-Black tree with inverted X memory outperform all other configurations. Therefore, for the rest of the results, we select Red-Black tree and X&IX as the baseline configuration. It should be emphasized that the poor performance of Sectors is expected, as Cryptoleq performance is evaluated in emulation mode. Sectors are expected to outperform the complex Hash and RB-trees when implemented in real hardware. 2) Size of N : The next set of results investigates the effect of the security parameter size to the performance of Cryptoleq. Runtime (in seconds) and number of instructions (in 105 units) are jointly presented in fig. 5, for security parameter N ranging from 16 to 1024 bits. Since the data type range, defined by β (section III-E), affects the number of instructions executed, we perform two sets of experiments: default β , calculated to be exactly N bit size minus 3, and a restricted β = 8. With regards to unlimited β , we can quickly observe that the instruction and time overhead is prohibitive even for small sizes of N . Fortunately, as data type values over 64-bit do not natively exist in modern computer architectures and have to be emulated through big number libraries, the infeasibility of using unlimited β does not affect the practicality of Cryptoleq. Indeed, the fixed β results of fig. 5 showcase that when β is restricted to 8 bits, the performance and instruction overhead remain in the feasible range. An interesting observation is the difference in the rate of the increase of execution time to the number of instructions. Specifically, the runtime performance overhead increases roughly 3 times, while the instruction overhead grows approximately 2 times for 23 TABLE II P ERFORMANCE IMPACT OF β SELECTION . β Calls to G Instructions executed 8 498 4688612 16 1746 16696340 32 6546 64671092 64 25362 266779700 each doubling of the size of N . This is attributed to the fact that operand sizes more than 64 bits require extensive use of big number libraries, affecting runtime performance given the same number of instructions. 3) β performance: The multiplication algorithm (section IV-C) encompasses two enclosed loops proportional to β ; therefore, for each doubling of β , multiplication should take roughly 4 times longer. In this subsection we explore the performance implications of various β commonly used as native data type sizes, namely 8, 16, 32 and 64 bits. All experiments have been executed for security parameter size N = 1024, which is the minimum standard, not yet broken size. Table II presents the number of calls to function G, as well as the total number of instructions, for different β . The quadratic dependency to β , along with some expected overhead, can be extracted from the results. Since N is the same for all four cases, the running time is exactly proportional to the total number of instructions. The results corroborate that data type sizes typically used in programming languages (8-64 bits) can be practically used in Cryptoleq for acceptable, with regards to their security, key sizes. 4) Mixed-mode simulation: The results presented in the previous subsections combined the overhead of all stages required for the execution of the PIR example, including compilation time and loading time. As expected, compilation and loading times are proportional to the size of the program. Table III breaks down the execution time to the individual stage overheads, and presents several statistics for different implementations of the PIR algorithm: • Secure: The originally implemented PIR algorithm. • Mixed 1: Key values of the databases are now open (i.e. function Equal compares two open values and outputs encrypted 1 or 0). • Mixed 2: Open key values (Mixed 1), and removal of the secure multiplication (line 7) when key is not equal to input. • Open: All values are open (not encrypted). 24 TABLE III S TATISTICS FOR DIFFERENT OPERATION MODES . Secure Mixed 1 Mixed 2 Open Compilation, s 26.5 8.8 8.8 5.1 Loading, s 28.6 21.5 21.5 8.8 Execution, s 35.8 34.1 8.8 0.4 Time total, s 91.0 64.5 39.2 14.3 G calls 498 486 81 0 Open mult 0 0 0 6 Secure equal 6 0 0 0 Secure mult 6 6 1 0 Instr open 95847 13014 2499 1803 Instr secure 3053659 2990930 8981 0 Instr mixed 1535106 1499425 5173 0 Instr total 4688612 4503369 16653 1803 For the following experiments, security parameter size N is 1024 bits and β is fixed at 8 bits. While the total execution time of the Secure mode is 91 seconds, only 35.8 seconds is the running time. Compilation time incurs significant overhead as the compiler is generating the encrypted representation of the operands and program constants. When a smaller number of encrypted values is used, compilation times decreases proportionally, as shown by the compilation overheads of Mixed 1, Mixed 2 and Open. Similarly, loading time is also considerable for the Secure mode, due to the selected baseline configuration for memory cell representation: X&IX mode requires calculation of the inversion of encrypted memory cells during loading time. The overhead of the loading process remains significant for Mixed 1 and Mixed 2, due to the oblivious way the processor is treating open and encrypted values (modular inversions for open values are also computed). The difference in the execution times of the various versions of the PIR algorithm can be explained by the individual statistics shown in table III. Open instructions are classified as instructions where both A and B operands use open representation (i.e. their s part is zero), secure instructions have both operands encrypted while mixed instructions include one operand from each category. As expected, in the Secure mode, both secure multiplication and equal functions were called 6 times since the PIR database contains 6 entries. Furthermore, when the security requirements are reduced (modes Mixed 1 and Mixed 2), number of function G calls and secure instructions decreases, allowing faster computation. The latter highlights the heterogeneous property of the presented Cryptoleq abstract machine, where 25 Fig. 6. Experimental comparison on the addition, subtraction and multiplication overheads between Cryptoleq and HElib [23]. The results demonstrate that Cryptoleq additions and subtractions are 4 and 5 orders of magnitude faster than amortized HElib additions and subtractions respectively (over 500 operations). This large performance difference is expected, since Cryptoleq does not require expensive re-encryption operations periodically. Likewise, Cryptoleq’s multiplication (alg. 1) is about twice as fast compared to HElib’s amortized multiplication (over 9 operations). security requirements can be fine-tuned and prioritized according to performance implications, all within the same abstract machine supporting seamless manipulation of encrypted and unencrypted operands. C. Comparison to HElib In this section, we compare the raw performance of Cryptoleq’s addition, subtraction and multiplication against HElib FHE software library [23]. A major difference between Cryptoleq and HElib is the use of re-encryption operation; in the former, re-encryption (using function G) is needed to support each multiplication of ciphertexts, while in the latter it is needed to enable bootstrapping and refresh the noise of ciphertexts after a finite number of operations [22]. Notably, Cryptoleq does not require re-encryption for additions or subtractions, while HElib requires it after a small number of multiplications or a larger number of additions/subtractions. For the latter, this is necessary since multiplications accumulate a large amount of ciphertext noise, while additions/subtractions accumulate only little noise. In all cases, HElib’s encryption parameters control the noise tolerance of each ciphertext. Our experimental comparisons between Cryptoleq and HElib are presented in fig. 6. We evaluated both libraries on an Intel i7-3770 processor with 4GB memory, running GMP 6.1.0 and NTL 9.6.4 on Ubuntu 14.04. For our experiments, we configured a Cryptoleq security parameter size N of 1024 bits and a HElib security level of 76 bits (as in [22]), which offers approximately the same key strength [35]. Using this configuration in HElib, we were able to support up to 9 multiplications or up to 500 26 additions/subtractions, before a re-encryption operation was necessary; hence our experimental results for HElib report the amortized cost of additions, subtractions and multiplications, incorporating the cost of re-encryption over a fixed number of operations. Contrary to Cryptoleq where there is no noise accumulation, HElib ciphertexts require re-encryption depending on the size/levels of the arithmetic “circuit” computing each particular ciphertext, as well as the existing noise of other ciphertexts used as arguments in the aforementioned “circuit”. To compensate for this case by case cost, our experiments calculate an average case overhead, where a single operation is applied on ciphertexts with comparable noise levels. As reported in fig. 6, Cryptoleq’s subtractions and additions are 5 and 4 orders of magnitude faster than HElib’s addition and subtraction respectively.6 Similarly, in fig. 6 it is shown that encrypted multiplication is twice as fast in Cryptoleq, as it is in HElib. An important remark on comparing Cryptoleq to HElib is that the former provides a comprehensive framework for general purpose computation, rather than a mathematical library to evaluate arithmetic circuits. Indeed, since HElib is designed to evaluate polynomials, it first requires converting arbitrary programs into circuits featuring negations, additions and multiplications. In HElib, these conversions do not generally scale for larger input sizes (i.e., deeper circuits should be defined), and programmers should ensure bootstrapping operations are performed on each ciphertext, before noise levels reach maximum thresholds. Conversely, the CEAL compiler handles different input sizes automatically, without having to generate new Cryptoleq instruction sequences (i.e., the same instructions can support arbitrary data sizes). VII. C RYPTOLEQ S ECURITY C ONSIDERATIONS A. Design Objectives & Protection Strategy A major objective for the development and design of the Cryptoleq framework is native support for data privacy when computation is outsourced to semi-trusted parties. Our strategy to address this requirement is to protect memory confidentiality using two methods: (i) probabilistic encryption, and (ii) obfuscated decryption and re-encryption, depending on the type of mathematical operations required by the program. Thus, the following alternative protection modes are supported: 1) Encryption mode, where all memory locations are protected through Paillier PHE encryption, proven asymptotically secure as the security parameter increases. This mode is applicable when 6 Note that subtraction is Cryptoleq’s native operation, so addition is implemented using 2 subtractions along with a third subtraction for cleanup. 27 the mathematical operations in a program can reduce to subtraction of values. Prominent examples in this mode are algorithms for secure elections systems that tally votes. 2) Heuristic obfuscation mode, where all memory locations are also protected through encryption (identical to the previous one), and only in case a program needs to perform multiplication or comparison operations, obfuscated decryption and re-encryption is performed on the fly. Prominent examples in this mode are sorting and searching algorithms or Z-transforms on discrete-time signals. For mixed-mode execution and backward compatibility, Cryptoleq also supports an unencrypted mode, where memory contents are stored in plaintext format. B. Additional Discussion on Obfuscation Mode As a general protection mechanism, obfuscation can be leveraged for creating public-key cryptosystems from secret-key ones, as well as for converting public-key cryptosystems to homomorphic ones [36]. For the former, one could allow public encryption but private decryption, by obfuscating the encryption function corresponding to the secret key. Similarly, for the latter, full homomorphism can be added by creating an obfuscated algorithm that decrypts, applies a function on plaintexts, and re-encrypts the result using the cryptosystem’s key-pair. An important negative result from [36], however, is that strong obfuscator programs (called Virtual Black Boxes) do not generally exist, as there are inherently unobfuscatable functions. It is possible, however, to construct weaker obfuscator programs, based on the notion of indistinguishability obfuscation (iO). The latter notion mandates that the obfuscations of two programs that compute the same function and have the same size, must be computationally indistinguishable [36]. Even though this notion does not ensure that all information is hidden within the program, it was shown that indistinguishability obfuscators correspond to the “best-possible” obfuscators for a particular program [37]. The first construction for general purpose iO was proposed in 2013, leveraging FHE, cryptographic multilinear maps and matrix branching program for computation [38]; the key idea was to use a core iO obfuscator to protect a small fully homomorphic decryption program, to bootstrap the fully homomorphic evaluation of a much larger generic program. One limitation of cryptographic multilinear maps, however, is that their security is still investigated by the academic community, and several attacks [39]–[42], as well as mitigations [39], [43], [44] have been reported recently. Moreover, new type of attack (called zeroizing) that was not previously considered in security arguments and has yet to be demonstrated, has recently been deemed possible [39]–[41]. Considering that the discovery of secure (“clean”) multilinear maps is still an open problem in cryptography [45], as well as the attacks that have been reported recently, assurance in constructions 28 based on multilinear maps may be limited for now. In addition, the use of FHE as part of iO, effectively limits the practicality of such obfuscators. Indeed, recent performance evaluation of iO for a simple 16-bit point function showed prohibitive overheads (e.g., over 9 hours for obfuscation and at least 3 hours for evaluation, requiring more than 31GBs of program size [46], or a cluster of more than 20 computers [47]), while a 2-bit multiplication circuit could be obfuscated in 1027 years and evaluated in 108 years [48]. Evidently, the current state of the art with respect to cryptographic obfuscation does not allow practical implementations. Alternatively, obfuscation can also be approached heuristically, based on high performance on evaluation metrics, such as potency and resilience, at a reasonable cost. Indeed, it is possible to attain a realistic threat model where heuristic approaches that rely on obscurity are sufficient for most privacy-aware applications in a white-box setting. In this work, we focus on a heuristic approach for obfuscation, and function G is designed under those assumptions. C. Security Model & Limitations Following the description of our heterogeneous framework, we can further define our assumed security model. Both in case of encryption and obfuscation modes, all memory values within Cryptoleq programs are protected using a secure probabilistic PHE scheme. Thus, due to the security guarantees of that scheme, it is not generally possible to leak any plaintext information, either by examining ciphertexts in memory or by manipulating them using O1,2 operations (eq. 5). Moreover, in our case we do not consider active adversaries [49], as data confidentiality is the asset to be protected, and allowing malleability is a requirement for data manipulation. Instead of attacking the encryption scheme, a powerful adversary may attempt to analyze side-channel information of the Cryptoleq execution itself, such as IP trace patterns, memory access events or obfuscated decryptions, and extrapolate information about the protected values or the type of the running algorithm. Nevertheless, we remark that execution traces may depend only on unprotected data, since Leq comparisons (eq. 4) and branch decisions are not defined over ciphertexts, yielding unpredictable results. Naturally, the security guarantees of the underlying encryption scheme are not continuously applicable when function G is invoked to perform obfuscated decryption and re-encryption7 . Indeed, our code-based obfuscation can provide heuristic guarantees (compared to a provably secure cryptosystem), and thus our threat model is considering rational, semi-honest adversaries that are unable to deobfuscate function G routines or extract obfuscated bit-expansions of decryption key material stored inside the executable. 7 Since function G returns probabilistic ciphertexts, re-encryption also requires a trustworthy source of randomness. 29 An adversary may attempt to analyze the bit-expansion of φ(kφ)−1 N in an effort to recover φ and ultimately the PHE private key. Using a random blinding coefficient s, however, as mentioned in section IV-B, would make recovering φ from the bit expansion of s · φ(kφ)−1 N significantly harder. It may also be sufficient for an adversary that holds a copy of the executable, to detect the entry point and argument locations of the obfuscated function G and reuse this code until after decryption happens; nevertheless, when the blinding coefficient s is used, adversaries can only retrieve random multiples of the actual plaintexts. Since the resilience of the underlying obfuscation is critical, Cryptoleq can leverage self-modifying code, on-the-fly code generation, and oblivious mixing of encrypted computation with unencrypted, to increase obfuscation protections. Specifically for the blinding coefficient s, selfmodifying code allows runtime re-randomization of the bit expansion of s · φ(kφ)−1 N with new random s, without affecting correctness. As the previous discussion shows, there exist cases where deobfuscation could be a concern for the security of the system, as we employ code-based heuristic obfuscation methods when provably secure encryption cannot be applied. Still, in the rational, semi-honest adversarial model, where cryptanalysis or deobfuscation threats combined with memory snooping are not applicable, the heuristic guarantees of code-based obfuscation are sufficient for practical applications. Moreover, if the protected program does not use any multiplication or comparison operations, no obfuscated decryption routines are included in the executable and memory contents are protected with provably secure encryption. An important remark is that, since our re-encryption approach in function G generates new probabilistic ciphertexts on every invocation, no plaintext information about the inputs of function G is leaked to its outputs. Similarly, operations O1,2 (eq. 5) do not leak plaintext information either, as previously mentioned. Hence, if there existed an attack algorithm Adv that could predict plaintext information with probability better than a random guess, only by observing homomorphic operations and newly generated ciphertexts, then Adv would also break a semantically secure PHE scheme. VIII. R ELATED W ORK In the area of encrypted computation both theoretical and practical approaches have been proposed in the past. The authors of [50] have designed Ascend, which is a secure processor that employs Oblivious RAM to obfuscate memory accesses to encrypted external memory. Specifically, in Ascend, a block cipher accelerator decrypts all oblivious RAM read operations before instructions and data move into the processor caches, while data evicted from these caches are re-encrypted before oblivious memory storage. The processor contains a shared block cipher key and the chip itself is considered tamper-proof. Similarly, in Aegis [51], the authors propose a secure processor, which supports integrity attestation of 30 executed software similar to Trusted Platform Modules (tamper-evident execution), and provides privacy protection for off-chip memory using non-malleable symmetric encryption (tamper-resistant execution). The Aegis chip contains a permanent private key that is necessary to decrypt other symmetric keys used to protect data and instructions within program binaries. A theoretical approach on encrypted computation, leveraging FHE, is presented in [52], where the authors discuss an encrypted processing unit that supports static programs and provide proof of its correctness. Similarly, the authors of [14] use FHE circuits for outsourcing execution to the cloud, and further discuss the requirement of selective decryptions in order to detect termination of FHE encrypted programs (called the termination problem). In [53], the authors demonstrate general function evaluation and zeroknowledge protocols focusing on universally composable security, while [15] leverages homomorphic hashing for verifiable computation delegation. Encrypted computation using secure containers has emerged as a protection strategy for commodity processors as well. In [54], the authors present new architectural features that allow the creation of encrypted enclaves, to provide privacy and integrity protection from potentially malicious processes running on the same system with elevated privileges. In this case, a memory encryption engine protects all traffic between the processor and the system main memory. In the same direction, the hardware enforced isolation discussed in [55] provides integrity protection even in case the operating system kernel is under attack, using a cryptographic coprocessor provisioned with secret keys during fabrication. IX. C ONCLUSIONS In this paper, we have presented a new computational model based on the concept of single instruction architecture, able to execute programs whose instruction operands have been encrypted using Paillier PHE scheme. Universal computation is achieved by introducing a software function, which adds multiplication to the abstract machine’s native addition and subtraction operations. This function is expressed using the only available instruction. We have also developed an enhanced assembly language to facilitate the development of complex programs, in addition to a compiler and an emulator. We evaluated this framework and our experimental results show that Cryptoleq incurs practical overhead when used with typical range of valid numbers. Cryptoleq allows for several future improvements with regards to performance and security. The former can be improved through the introduction of high-radix representations (e.g. Montgomery), and advanced runtime techniques (such as automatic detection of open values to replace homomorphic multiplication with plaintext addition). Similarly, binary obfuscation is also a heavily researched topic and future work 31 will explore the application of such techniques to Cryptoleq binaries to enhance the obfuscation offered by our framework. R ESOURCES The executable files for the CEAL compiler and emulator, as well as sample Cryptoleq programs can be found at [34]. A PPENDIX A F UNCTION G C ALCULATION E XAMPLE To illustrate the function G operation, we select a small security parameter N = p·q = 3·5 = 15, β = 2 and a random parameter k = 2. Then, a message m1 = 3 can be encrypted as x = 415 · (1 + 15 · 2 · 3) = 109 mod 152 , using eq. 7 and a random r1 = 4. Similarly, a message m2 = 1 can be encrypted as y = 215 · (1 + 15 · 2 · 1) = 158 mod 152 using a random r2 = 2. For N = 15, we have Euler’s −1 −1 φ = (p − 1) · (q − 1) = 2 · 4 = 8 and (kφ)−1 N = (2 · 8)15 = 1; hence, φ(kφ)N equals 8 · 1 = 8 or 10002 in binary representation. Computing the value G(x, y) entails raising ciphertext x = 109 to power 10002 (as in eq. 19), before reducing to mod 152 at each step. As discussed in section V-C, the latter can be performed using squaring and multiplying over the bits of exponent 10002 . Initializing our accumulator A to 1, we have x ← 1092 = 181 for the LSB of 10002 , x ← 1812 = 136 for the second bit, x ← 1362 = 46 for the third bit, and A ← A · 46 = 46 for the non-zero MSB (squaring x for the MSB is unnecessary, as A already holds the open value x = 46). Using eq. 15, we can confirm that (46 − 1)/15 = 3 = m1 . Subsequently, we use eq. 17 to compute Leq(46) = F alse, since 46 > (1 + 15) and 46 < (15 · 2blog2 15c = 120). After selecting a new random r = 7, function G generates e 0 = 715 = 118. Finally, function G returns a re-encryption of argument y = 158, as e 0 +̂ y = 118 · 158 = 194. We confirm that ciphertext 194 is another encryption of m2 = 1, since 1948 = 16 = 1 + (15 · 1) (using eq. 19) is the open value for plaintext 1. Note that all steps are mod 152 . Alternatively, choosing m1 = 13 as our plaintext, we get x = 415 · (1 + 15 · 2 · 13) = 184 (using a random r1 = 4). Then, raising 184 to 10002 as before, would return x = 196 = 1 + (15 · 13) (i.e., the open value for 13). In this case Leq(196) = T rue, since 196 > (15 · 2blog2 15c = 120) (eq. 17). Finally, using a random r = 8, function G generates and returns e 0 = 815 = 107. As before, all steps are mod 152 . 32 A PPENDIX B E NCRYPTED M ULTIPLICATION E XAMPLE We demonstrate Cryptoleq’s ciphertext multiplication (alg. 1) using a simple example, with security parameter N = p · q = 7 · 11 = 77, β = 3 and random parameter k = 3. If x = 477 · (1 + 77 · 3 · 2) = 1248 is the encryption of m1 = 2 with a random r1 = 4, and y = 577 · (1 + 77 · 3 · 3) = 3776 with random r2 = 5 is the encryption of m2 = 3, then for alg. 1 it should hold Dec(x ? y) = 2 · 3 = 6. Unless noted otherwise, all operations are mod 772 . A. Integer Division by 2 for Power of 2 Inputs (Alg. 3) j−1 for j ∈ [1, β] (i.e., returns an encryption of the Given an ciphertext 2ej , auxiliary alg. 3 returns 2g next smaller power of 2). Since we have β powers of 2, as mentioned in section IV-C, this algorithm j−1 as a list of β (key:value) pairs. Since may be replaced by a loop-up table encoding Half2(2ej ) = 2g β = 3 in this example, we have 2e3 = 1481 for r = 2, 2e2 = 1307 for r = 3, 2e1 = 1248 for r = 4, and e 1 = 2390 with r = 5; hence, Half2 = [(1481 : 1307), (1307 : 1248), (1248 : 2390)]. The latter can also be re-randomized using a new e 0 and updating all 2ej in the table with e 0 +̂ 2ej . B. Integer Division by 2 for Any Input (Alg. 2) The returned value of Div2(x) is equivalent to Enc(bDec(x)/2c), so alg. 2 is homomorphic to integer division by 2 (which is useful for isolating an LSB in homomorphic “shift & add” multiplication, as in alg. 1). To compute Div(x) for x = 1248 using the steps of alg. 2, we first initialize sum with e 0 = 3155 (using a random r = 3). Moreover, p2 is initialized with 2e3 = 1481, which matches the first key of the Half2 look-up table. In the first loop iteration, p2 is updated with Half2(1481) = 1307, and local variable y is assigned 3155 · 1307 = 2930. Then, we compute the squaring 2930 · 2930 = 5637, as well as the inverse x−1 = 1248−1 = 4442, before updating local y with 5637 · 4442 = 1387. After generating a new e 1 = 3481 using r = 4, we find the inverse y −1 = 1387−1 = 4544 and compute e 1 −̂ y = 3481 · 4544 = 5021. Using the latter and p2 , we compute G(5021, 1307) = 1812; internally, function G generates its own e 0 = 1812 with a random r = 6, raises 5021 to power φ(kφ)−1 N = 180 to get open value 5545, and computes Leq(5545) = T rue (since 5545 > (77 · 2blog2 77c = 4928) using eq. 17).8 Finally, sum is increased to 3155 · 1812 = 1304. 8 Recall that if the result of Leq(·) was F alse, function G would have returned 1812 · 1307 = 2613 instead of 1812. 33 Since β = 3, the loop is executed two more times: in the second iteration p2 ← Half2(1307) = 1248, y ← 1304·1248 = 2846, then y ← 702·4442 = 5559, e 1 −̂ y = 3481·5272 = 1577, and G(1577, 1248) = 1697 (using e 0 = 1697 with r = 5, 1577180 = 5853 and Leq(5853) = T rue). Then, sum is updated with 1304 · 1697 = 1371. Similarly, in the last iteration, p2 ← Half2(1248) = 2390, y ← 1371 · 2390 = 3882, then y ← 4335 · 4442 = 4607, e 1 −̂ y = 3481 · 148 = 5294, and G(5294, 2390) = 3137 · 2390 = 3174 (using e 0 = 3137 with r = 8, 5294180 = 78 and Leq(78) = F alse). Finally, the algorithm returns sum = 1371 · 3174 = 5597. Correctness is verified since Dec(Div2(1248)) = Dec(5597) = (5597180 − 1)/77 = 1 mod 77 (using eq. 8), which equals bDec(1248)/2c = b2/2c = 1. The interested reader may also verify alg. 2 for y = 3776 encrypting m2 = 3: using the same r values for the different e 0 and e 1 as in the previous iterations, we get Div2(3776) = 5597. Since Dec(5597) = 1 = b3/2c, the result is also correct. C. Encrypted Multiplication (Top Level Alg. 1) Alg. 1 is homomorphic to traditional “shift and add” multiplication, and the returned value of Multiply(x, y) is equivalent to Enc(Dec(x) · Dec(y)). If x = 1248 and y = 3776 (for m1 = 2 and m2 = 3 respectively), then Dec(Multiply(x, y)) should equal plaintext 6. To verify correctness, we initialize sum with e 0 = 5163 (using r = 9) and execute the main loop β + 1 = 4 times. Inside the loop, z is updated with Div2(1248) = 5597, before this value is squared to 5597·5597 = 3502 and inverted to 3502−1 = 5030. Then, bit is updated with 1248 · 5030 = 4558, which corresponds to an encryption of the LSB of m1 = Dec(1248) = 2 = 102 , since Dec(4558) = 0. We then compute G(4558, 3776) = 5714 (using a new e 0 = 5714 with r = 4) and update sum with 5163 · 5714 = 4607 (note that this corresponds to a homomorphic “add”). Finally, y gets 3776 · 3776 = 4860 (i.e., y is homomorphically “shifted”), and x is updated with 5597 (note that Dec(5597) = 1) In the second iteration, we get will get z2 ← 2302, bit2 ← 5597·(2302·2302)−1 = 4225, G(4225, 4860)2 = 1697 · 4860 = 181 (using r = 5 for e 0 = 1697), sum2 ← 4607 · 181 = 3807, y2 ← 4860 · 4860 = 4393, and x2 ← 2302. Now, Dec(sum2 ) = 6, Dec(y2 ) = 12, and Dec(x2 ) = 0. If we continue iterating, we have z3 ← 2302, bit3 ← 2302 · (2302 · 2302)−1 = 2743, G(2743, 4393)3 = 5163 (using r = 9 for e 0), sum3 ← 3807 · 5163 = 906, y3 ← 4393 · 4393 = 5483, and x3 ← 2302 (note that Dec(sum3 ) = 6, Dec(y3 ) = 24 and Dec(x3 ) = 0).9 At last, z4 ← 2302, bit4 ← 2743, G(2743, 5483)4 = 3791 (with r = 2 for e 0), sum4 ← 906 · 3791 = 1755, y4 ← 5483 · 5483 = 3259, and x4 ← 2302. The returned value sum = 1755 is correct since Dec(1755) = (1755180 − 1)/77 = 6 = m1 · m2 . 9 Note that, in this simple example, z remains unchanged since the same r values were used in each invocation of Div2. 34 ACKNOWLEDGMENT This work was partially sponsored by the NYU Abu Dhabi Global Ph.D. Student Fellowship program, as well as the NYU Abu Dhabi Research Enhancement Fund. R EFERENCES [1] M. B. Taylor, “Bitcoin and the age of bespoke silicon,” in International Conference on Compilers, Architectures and Synthesis for Embedded Systems. IEEE Press, 2013, p. 16. [2] G. Woltman and S. Kurowski, “The great internet mersenne prime search,” [Online]. Available: http://www.mersenne.org, 2004. [3] R. Chow, P. Golle, M. Jakobsson, E. Shi, J. Staddon, R. Masuoka, and J. Molina, “Controlling data in the cloud: outsourcing computation without outsourcing control,” in Cloud Computing Security Workshop. ACM, 2009, pp. 85–90. [4] Y. Zhang, A. Juels, M. K. Reiter, and T. Ristenpart, “Cross-VM side channels and their use to extract private keys,” in Computer and Communications Security (CCS), 2012, pp. 305–316. [5] N. G. Tsoutsos, C. Konstantinou, and M. Maniatakos, “Advanced techniques for designing stealthy hardware trojans,” in Design Automation Conference (DAC), 2014, pp. 1–4. [6] G. T. Becker, F. Regazzoni, C. Paar, and W. P. Burleson, “Stealthy dopant–level hardware trojans,” in Cryptographic Hardware and Embedded Systems Workshop, 2013, pp. 197–214. [7] N. G. Tsoutsos and M. Maniatakos, “Fabrication attacks: Zero-overhead malicious modifications enabling modern microprocessor privilege escalation,” IEEE Transactions on Emerging Topics in Computing, vol. 2, no. 1, pp. 81–93, 2014. [8] K.-M. Chung, Y. Kalai, and S. Vadhan, “Improved delegation of computation using fully homomorphic encryption,” in Advances in Cryptology–CRYPTO 2010. Springer, 2010, pp. 483–501. [9] N. G. Tsoutsos and M. Maniatakos, “The HEROIC Framework: Encrypted Computation Without Shared Keys,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 34, no. 6, pp. 875–888, 2015. [10] C. Gentry, “Fully homomorphic encryption using ideal lattices,” in ACM Symposium on Theory of Computing, 2009, pp. 169–178. [11] M. Van Dijk, C. Gentry, S. Halevi, and V. Vaikuntanathan, “Fully homomorphic encryption over the integers,” in Advances in Cryptology–EUROCRYPT 2010. Springer, 2010, pp. 24–43. [12] C. Gentry, “A fully homomorphic encryption scheme,” Ph.D. dissertation, Stanford University, 2009. [13] N. Smart and F. Vercauteren, “Fully homomorphic encryption with relatively small key and ciphertext sizes,” Cryptology ePrint Archive, Report 2009/571, 2009, http://eprint.iacr.org/. [14] M. Brenner, J. Wiebelitz, G. Von Voigt, and M. Smith, “Secret program execution in the cloud applying homomorphic encryption,” in Digital Ecosystems and Technologies Conference (DEST), 2011, pp. 114–119. [15] D. Fiore, R. Gennaro, and V. Pastro, “Efficiently verifiable computation on encrypted data,” in Computer and Communications Security (CCS). ACM, 2014, pp. 844–855. [16] A. López-Alt, E. Tromer, and V. Vaikuntanathan, “On-the-fly multiparty computation on the cloud via multikey fully homomorphic encryption,” in ACM Symposium on Theory of Computing. ACM, 2012, pp. 1219–1234. [17] R. Gennaro and D. Wichs, “Fully homomorphic message authenticators,” in Advances in Cryptology-ASIACRYPT 2013. Springer, 2013, pp. 301–320. 35 [18] Y. Zhang, C. Papamanthou, and J. Katz, “Alitheia: Towards practical verifiable graph processing,” in Computer and Communications Security (CCS). ACM, 2014, pp. 856–867. [19] M. Naehrig, K. Lauter, and V. Vaikuntanathan, “Can homomorphic encryption be practical?” in Cloud Computing Security Workshop. ACM, 2011, pp. 113–124. [20] B. Schneier, “Homomorphic encryption breakthrough,” [Online]. Available: http://www.schneier.com/blog/archives/2009/ 07/homomorphic enc.html, 2009, (Accessed: 11/13/15). [21] Z. Brakerski, C. Gentry, and V. Vaikuntanathan, “(Leveled) fully homomorphic encryption without bootstrapping,” in Innovations in Theoretical Computer Science Conference, 2012, pp. 309–325. [22] S. Halevi and V. Shoup, “Bootstrapping for HElib,” in Advances in Cryptology–EUROCRYPT 2015. Springer, 2015, pp. 641–670. [23] ——, “HElib: Design and implementation of a homomorphic-encryption library,” [Online]. Available: https://github.com/ shaih/HElib. [24] A. Daly and W. Marnane, “Efficient architectures for implementing montgomery modular multiplication and RSA modular exponentiation on reconfigurable logic,” in ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2002, pp. 40–49. [25] T. Blum and C. Paar, “Montgomery modular exponentiation on reconfigurable hardware,” in Proceedings of the 14th IEEE Symposium on Computer Arithmetic. IEEE, 1999, pp. 70–77. [26] C. Mclvor, M. McLoone, and J. V. McCanny, “Fast Montgomery modular multiplication and RSA cryptographic processor architectures,” in Conference Record of the Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, vol. 1. IEEE, 2003, pp. 379–384. [27] P. L. Montgomery, “Modular multiplication without trial division,” Mathematics of computation, vol. 44, no. 170, pp. 519–521, 1985. [28] Y. Hu, W. J. Martin, and B. Sunar, “Enhanced flexibility for homomorphic encryption schemes via CRT,” Proc. ACNS, Springer Verlag, LNCS, pp. 93–110, 2012. [29] P. Paillier, “Public-key cryptosystems based on composite degree residuosity classes,” in Advances in cryptology– EUROCRYPT’99, 1999, pp. 223–238. [30] S. Goldwasser and S. Micali, “Probabilistic encryption,” Journal of computer and system sciences, vol. 28, no. 2, pp. 270–299, 1984. [31] W. F. Gilreath and P. A. Laplante, Computer Architecture: A Minimalist Perspective. Springer, 2003, pp. 55–69. [32] O. Mazonka, “Bit copying - the ultimate computational simplicity,” Complex Systems, vol. 19, pp. 1–23, 2009. [33] O. Mazonka and A. Kolodin, “A simple multi-processor computer based on subleq,” arXiv preprint arXiv:1106.2593, 2011. [34] O. Mazonka, “Cryptoleq implementation repository,” [Online]. Available: https://github.com/momalab/cryptoleq/. [35] E. B. Barker, W. C. Barker, W. E. Burr, W. T. Polk, and M. E. Smid, “SP 800-57. Recommendation for Key Management, Part 1: General (revised),” National Institute of Standards & Technology, Gaithersburg, MD, United States, Tech. Rep., 2007. [36] B. Barak, O. Goldreich, R. Impagliazzo, S. Rudich, A. Sahai, S. Vadhan, and K. Yang, “On the (im) possibility of obfuscating programs,” in Advances in Cryptology–CRYPTO 2001, 2001, pp. 1–18. [37] S. Goldwasser and G. N. Rothblum, “On best-possible obfuscation,” in Theory of Cryptography. Springer, 2007, pp. 194–213. [38] S. Garg, C. Gentry, S. Halevi, M. Raykova, A. Sahai, and B. Waters, “Candidate indistinguishability obfuscation and 36 functional encryption for all circuits,” in Foundations of Computer Science (FOCS), 2013 IEEE 54th Annual Symposium on. IEEE, 2013, pp. 40–49. [39] D. Boneh, D. J. Wu, and J. Zimmerman, “Immunizing multilinear maps against zeroizing attacks.” IACR Cryptology ePrint Archive, vol. 2014, p. 930, 2014. [40] J.-S. Coron, T. Lepoint, and M. Tibouchi, “Cryptanalysis of two candidate fixes of multilinear maps over the integers.” IACR Cryptology ePrint Archive, vol. 2014, p. 975, 2014. [41] C. Gentry, S. Halevi, H. K. Maji, and A. Sahai, “Zeroizing without zeroes: Cryptanalyzing multilinear maps without encodings of zero.” IACR Cryptology ePrint Archive, vol. 2014, p. 929, 2014. [42] J. H. Cheon, K. Han, C. Lee, H. Ryu, and D. Stehlé, “Cryptanalysis of the multilinear map over the integers,” in Advances in Cryptology–EUROCRYPT 2015. Springer, 2015, pp. 3–12. [43] J.-S. Coron, T. Lepoint, and M. Tibouchi, “New multilinear maps over the integers,” in Advances in Cryptology–CRYPTO 2015. Springer, 2015, pp. 267–286. [44] S. Garg, C. Gentry, S. Halevi, and M. Zhandry, “Fully secure functional encryption without obfuscation.” IACR Cryptology ePrint Archive, vol. 2014, p. 666, 2014. [45] J. Zimmerman, “How to obfuscate programs directly,” in Advances in Cryptology-EUROCRYPT 2015. Springer, 2015, pp. 439–467. [46] D. Apon, Y. Huang, J. Katz, and A. J. Malozemoff, “Implementing cryptographic program obfuscation.” IACR Cryptology ePrint Archive, vol. 2014, p. 779, 2014. [47] D. J. Bernstein, A. Hülsing, T. Lange, and R. Niederhagen, “Bad directions in cryptographic hash functions,” in Information Security and Privacy. Springer, 2015, pp. 488–508. [48] S. Banescu, M. Ochoa, N. Kunze, and A. Pretschner, “Idea: Benchmarking indistinguishability obfuscation–a candidate implementation,” in Engineering Secure Software and Systems. Springer, 2015, pp. 149–156. [49] P. Paillier and D. Pointcheval, “Efficient public-key cryptosystems provably secure against active adversaries,” in Advances in Cryptology-ASIACRYPT99. Springer, 1999, pp. 165–179. [50] C. W. Fletcher, M. v. Dijk, and S. Devadas, “A secure processor architecture for encrypted computation on untrusted programs,” in Scalable Trusted Computing Workshop, 2012, pp. 3–8. [51] G. E. Suh, D. Clarke, B. Gassend, M. Van Dijk, and S. Devadas, “Aegis: architecture for tamper-evident and tamper-resistant processing,” in International Conference on Supercomputing. ACM, 2003, pp. 160–171. [52] P. T. Breuer and J. P. Bowen, “A fully homomorphic crypto-processor design,” in Engineering Secure Software and Systems. Springer, 2013, pp. 123–138. [53] R. Canetti, A. Jain, and A. Scafuro, “Practical UC security with a global random oracle,” in Computer and Communications Security (CCS). ACM, 2014, pp. 597–608. [54] F. McKeen, I. Alexandrovich, A. Berenzon, C. V. Rozas, H. Shafi, V. Shanbhogue, and U. R. Savagaonkar, “Innovative instructions and software model for isolated execution,” in Hardware and Architectural Support for Security and Privacy Workshop (HASP). ACM, 2013, pp. 1–10. [55] Apple Computer, “iOS Security (whitepaper),” [Online]. Available: https://www.apple.com/business/docs/iOS Security Guide.pdf, 2015.