

Dual universality of hash functions and its applications to classical and quantum cryptography

Toyohiro Tsurumaru and Masahito Hayashi

arXiv:1101.0064v2 [quant-ph] 24 Jan 2011

Abstract

In this paper, we introduce the concept of dual universality of hash functions and present its applications to various quantum and classical communication models including cryptography. We begin by establishing the one-to-one correspondence between a linear function family $\mathcal{F}$ and a code family $\mathcal{C}$, and thereby defining ε-almost dual universal$_2$ hash functions, as a generalization of the conventional universal$_2$ hash functions. Then we give a security proof for the Bennett-Brassard 1984 protocol, where the Shor-Preskill–type argument is used, but nevertheless ε-almost dual universal$_2$ functions can be used for privacy amplification. We show that a similar result applies to the quantum wire-tap channel as well. We also apply these results on quantum models for investigating the classical wire-tap channel and randomness extraction, and obtain various new results, including the existence of a deterministic hash function that is universally secure against different types of wire-tapper. For proving these results, we present an extremely simple argument by simulating the classical channels by quantum channels, where the strength of Eve’s wire-tapping can be measured by the phase bit error rate. Under this setting of quantum simulation, we demonstrate that our ε-almost dual universal$_2$ functions are more relevant for security than the conventional ε-almost universal$_2$ hash functions, by showing that the former functions correspond to a linear code family having an appropriate phase-error correcting property. These examples suggest the importance of quantum approaches in classical settings of information theory, as well as the dual universality of hash functions.

I. INTRODUCTION

The concept of universal hash functions [4] has a variety of cryptographic applications, for example, information-theoretically secure signatures, hash functions for the wire-tap channel, and privacy amplification [28]. In this paper, we introduce the concept of ε-almost dual universal hash functions and present its applications to various quantum and classical communication models including cryptography. The first application is to a security proof for quantum key distribution (QKD) and the quantum wire-tap channel. Next we apply our results on these quantum models to the classical wire-tap channel and randomness extraction, and obtain various new results, such as the existence of a deterministic hash function that is universally secure against different types of wire-tappers. To prove these results, we present an extremely simple argument that simulates the classical channels by quantum channels, where the strength of Eve’s wire-tapping can be measured by the phase bit error rate. These examples suggest the importance of quantum approaches in classical settings of information theory, as well as the dual universality of hash functions.

We begin in Section II by reviewing the conventional universal hash functions, i.e., the properties of ε-almost universal$_2$ functions. Then we restrict ourselves to linear hash functions on $\mathbb{F}_2^n$, the $n$-dimensional vector space over the finite field $\mathbb{F}_2$, and establish a one-to-one correspondence between a linear hash function family $\mathcal{F}$ and a linear code family $\mathcal{C}$, by using the simple fact that the kernel of a linear function is a linear space, and thus can be considered as a code. This correspondence allows us to define not only the code family $\mathcal{C}$ of a given universal hash function family $\mathcal{F}$, but also the dual code family $\mathcal{C}^\perp$ corresponding to it. Under this setting, interestingly, a simple algebraic argument shows that the universality of $\mathcal{C}$ (i.e., the property of $\mathcal{C}$ being universal$_2$) also guarantees that of $\mathcal{C}^\perp$ (see Fig. 1).

For example, (1) if $\mathcal{C}$ is universal$_2$, or equivalently, 1-almost universal$_2$, then $\mathcal{C}^\perp$ is 2-almost universal$_2$, but nevertheless, (2) for an ε-almost universal$_2$ code family $\mathcal{C}$ with ε > 1, the dual code family $\mathcal{C}^\perp$ is not necessarily ε-almost universal$_2$, as can be seen from an explicit counterexample. These results lead us to introduce a new class of hash functions called an ε-almost dual universal$_2$ hash function family, as a set of hash functions whose kernels form an ε-almost dual universal$_2$ code family. This concept is indeed a generalization of the conventional universality$_2$, since a universal$_2$ hash function family is a special case of our ε-almost dual universal$_2$ family. As we shall show in subsequent sections, this weaker notion of universality has applications in many communication models.

In Section III, we also introduce the concept of the permuted code family, as the set of codes obtained by permuting the bits of a given code $C$. Then we show the existence of a code $C$ whose permuted family $\mathcal{C}_C$ is $(n+1)$-almost dual universal$_2$, with $n$ being the bit length of $C$. A code $C$ of this type is particularly useful when the setting of our communication model is invariant under bit permutations, since the average performance of the code $C$ equals that of an $(n+1)$-almost dual universal$_2$ code family. Due to this property, the permuted code family plays a key role in showing the existence of a deterministic hash function that works universally for different types of channels.

T. Tsurumaru is with Mitsubishi Electric Corporation, Information Technology R&D Center, 5-1-1 Ofuna, Kamakura-shi, Kanagawa, 247-8501, Japan. M. Hayashi is with Graduate School of Information Sciences, Tohoku University, Aoba-ku, Sendai, 980-8579, Japan, and Centre for Quantum Technologies, National University of Singapore, 3 Science Drive 2, Singapore 117542.


In Section IV, as a preparation for later sections, we apply the results of Sections II and III to error correction. We show that a code $C \in \mathcal{C}$ serves as a good code when it is chosen randomly from an ε-almost universal$_2$ code family $\mathcal{C}$.

In Section V, we apply these results to the security proof of a QKD protocol called the Bennett-Brassard 1984 (BB84) protocol [1]. We use a proof technique of the Shor-Preskill type, which reduces the security of a secret key to the error correcting property of the Calderbank-Shor-Steane (CSS) quantum error correcting code (e.g., [26], [10], [29], [14]). This proof technique is elegant and widely used, but it also has a drawback: it requires the implementation of the classical CSS code in actual QKD systems, which can be difficult especially for large block lengths. On the other hand, by using the quantum de Finetti representation theorem, Renner [25] has shown the security of the BB84 protocol using universal$_2$ hash functions for privacy amplification, which can be implemented easily in practice (see, e.g., [9]). The security proof of the present paper combines the best of both worlds; that is, it is based on the Shor-Preskill formalism, but it nevertheless allows the use of ε-almost dual universal$_2$ functions. Note that the restriction on hash functions is relaxed here, since, as mentioned earlier, the conventional universal$_2$ function family is a special case of ε-almost dual universal$_2$ families.

Then, in Sections VI and VII, we apply our results on QKD to the quantum and classical wire-tap channels. In these models, a sender Alice has channels to two receivers, i.e., an authorized receiver Bob and an unauthorized receiver Eve, often referred to as a wire-tapper. The channels from Alice to Bob and to Eve are not necessarily restricted to any type, but we assume that they are both specified when we analyze the security. The main issue here is to obtain the asymptotic secure transmission rate with appropriate coding protocols.

The net transmission rate is given as the information transmission rate $R'$ to Bob minus the sacrifice bit rate $R$. The former rate can be treated in the framework of error correcting codes. The latter rate corresponds to a privacy amplification process. Under these settings, in Section VI, we consider a specific type of quantum wire-tap channel where Alice and Bob are connected by the Pauli channel. By applying our results on QKD to this model, we show that an ε-almost dual universal$_2$ function family is sufficient for removing Eve’s information. Then, by using the invariance of the channel under bit permutations, we also show the existence of a deterministic hash function that works universally, that is, a hash function whose construction does not depend on the phase error probability caused by the wire-tapper.

Next, in Section VII, we consider the classical wire-tap channel model, where all channels are classical. Here we only assume that the channel from Alice to Bob is binary; besides that, all channels are general, i.e., they are not restricted to any form, unlike the Pauli channel of the previous section. In these channels, Alice and Bob use an ε-almost dual universal$_2$ function family for privacy amplification. In order to analyze the security of this model, we simulate the classical channels by quantum channels, where the strength of Eve’s wire-tapping is reflected in the phase bit error rate. Applying the idea of quantum error correcting codes, we show the strong security for the case where the input alphabet is {0, 1}. As shown in Section II, the class of ε-almost dual universal$_2$ functions is wider than that of universal$_2$ functions. We also give a counterexample to strong security under ε-almost universality$_2$ for privacy amplification. This example tells us that ε-almost dual universality$_2$ is more relevant for strong security than ε-almost universality$_2$ (cf. Fig. 1).
Then, using the result of Section III, we construct a deterministic hash function that is universally secure regardless of the form of Eve’s channel. This example suggests the importance of quantum approaches in the classical setting, as well as the dual universality of hash functions.

[Figure 1: diagram relating the classes "ε-almost universal hash functions", "universal hash functions", "dual universal hash functions", "ε-almost dual universal hash functions", and "strongly secure hash functions", with the labeled examples "example given in Subsection 7.3", "modified Toeplitz matrices", and "permuted code family given in Section 3".]

Fig. 1. Relation among linear hash functions (when ε increases as a polynomial of n). The modified Toeplitz matrices are given by a concatenation (X, I) of the Toeplitz matrix X and the identity matrix I, mentioned in Section II.

Finally in Section VIII, we show that the above result on wire-tap channels can be used to analyze the performance of randomness extraction. Again by noting the invariance under bit permutations, we can construct a deterministic hash function that works universally even when the original random source is an arbitrary binary distribution with probability p. It should also be noted that the asymptotic generation rate of our extractor is larger than the minimum entropy achieved in the conventional method [18].


II. DUAL UNIVERSALITY OF A CODE FAMILY

A. Linear universal hash functions as a linear code family

We start by reviewing the basic properties of universal$_2$ hash functions. Consider sets $A$ and $B$, and a function family $\mathcal{F}$ consisting of functions from $A$ to $B$; that is, $\mathcal{F}$ is a set of functions $\mathcal{F} = \{f_r \mid r \in I\}$ with $f_r : A \to B$, where $I$ denotes a set of indices $r$ of hash functions. Our purpose is to select $f_r$ with equal probability and use it as a hash function, and for this purpose we always let $|A| \ge |B| \ge 2$. We say that a function family $\mathcal{F}$ is ε-almost universal$_2$ [4], [31] if, for any pair of different inputs $x_1, x_2$, the collision probability of their outputs is upper bounded as

$$\Pr\left[f_r(x_1) = f_r(x_2)\right] = \frac{1}{|I|}\,\#\{\, r \in I \mid f_r(x_1) = f_r(x_2) \,\} \le \frac{\varepsilon}{|B|}. \tag{1}$$

The parameter ε appearing in (1) is shown to be confined in the region

$$\varepsilon \ge \frac{|A| - |B|}{|A| - 1}, \tag{2}$$

and in particular, a function family $\mathcal{F}$ attaining equality in (2) is called an optimally universal$_2$ function family [27]. On the other hand, a family $\mathcal{F}$ with ε = 1 is simply called a universal$_2$ function family.

Two important examples of universal$_2$ hash function families are the Toeplitz matrices (see, e.g., [20]) and multiplications over a finite field (see, e.g., [4], [2]). A modified form of the Toeplitz matrices is also shown to be universal$_2$, which is given by a concatenation $(X, I)$ of the Toeplitz matrix $X$ and the identity matrix $I$ [16]. The (modified) Toeplitz matrices are particularly useful in practice, because there exists an efficient multiplication algorithm using the fast Fourier transform with complexity $O(n \log n)$ (see, e.g., [9]).

In this paper, we focus only on linear functions over the finite field $\mathbb{F}_2$. We assume that the sets $A$, $B$ are $\mathbb{F}_2^n$, $\mathbb{F}_2^m$ respectively with $n \ge m$, and that the $f_r$ are linear functions over $\mathbb{F}_2$. Note that, in this case, there is a kernel $C_r$ corresponding to each $f_r$, which is a vector space of dimension $n - m$ or more. Also note that, conversely, when given a vector subspace $C_r \subset \mathbb{F}_2^n$ of dimension $n - m$ or more, one can always construct a linear function

$$\tilde f_r : \mathbb{F}_2^n \to \mathbb{F}_2^n / C_r \cong \mathbb{F}_2^l, \quad l \le m. \tag{3}$$

This means that, by considering $C_r$ as an error-correcting code$^1$, we can always identify a linear hash function $f_r$ and an error-correcting code $C_r$. Since $f_r(x_1) = f_r(x_2)$ is equivalent to $f_r(x_1 - x_2) = 0$ by linearity, the definition (1) of an ε-almost universal$_2$ function family takes, in this terminology, the form

$$\forall x \in \mathbb{F}_2^n \setminus \{0\}, \quad \Pr\left[\tilde f_r(x) = 0\right] \le 2^{-m}\varepsilon, \tag{4}$$

which can further be rewritten as

$$\forall x \in \mathbb{F}_2^n \setminus \{0\}, \quad \Pr\left[x \in C_r\right] \le 2^{-m}\varepsilon. \tag{5}$$

This shows that the set of kernels $\mathcal{C} = \{C_r \mid r \in I\}$ contains sufficient information for determining whether a function family $\mathcal{F} = \{f_r \mid r \in I\}$ is ε-almost universal$_2$ or not.

To see this in more detail, we give explicit constructions. For later convenience, we denote a generating matrix of a code $C$ by $G(C)$, so that the rows of $G(C)$ are basis vectors of $C$. We also denote a parity check matrix of $C$ by $H(C)$; hence one may choose $H(C) = G(C^\perp)$.

If one wants to construct $C_r$ from $f_r$, let $x$ be a column vector, and write the linear function $f_r$ as $y = f_r(x) = M_r x$ by using an $m \times n$ matrix $M_r$. Here $M_r$ corresponds to a parity check matrix of the error-correcting code $C_r$, and thus the row vectors of $M_r$ span $C_r^\perp$.

Conversely, if one wants to construct a linear function $\tilde f_r : \mathbb{F}_2^n \to \mathbb{F}_2^m$ from a code $C_r$, one proceeds as follows. First, let $l := \dim C_r^\perp \le m$, and take a basis of $C_r^\perp \subset \mathbb{F}_2^n$ as $\{u_1, \dots, u_l\}$ and a basis of $\mathbb{F}_2^m$ as $\{v_1, \dots, v_m\}$. Then define a matrix $\tilde M_r = \sum_{i=1}^{l} v_i u_i^T$, and let $\tilde f_r(x) = \tilde M_r x$.

It should be noted that this construction of $\tilde f_r$ has an ambiguity that comes from the choices of the bases $\{u_i\}$ and $\{v_i\}$. With the above procedure, even when one constructs $C_r$ from $f_r$, and then $\tilde f_r$ from the obtained $C_r$, the functions $\tilde f_r$ and $f_r$ need not coincide in general. In this paper, however, we do not worry about this ambiguity, because (i) the ambiguity does not affect the property of $\tilde f_r$ being ε-almost universal$_2$, and (ii) the ambiguity is absent after all when we actually implement and operate universal hash functions for cryptographic purposes; in such cases, we never think of $C_r$ as a vector space, but rather specify the matrices $M_r$ or basis sets of $C_r$ explicitly. Note that a similar situation arises with error-correcting codes as well; i.e., it is convenient to interpret $C_r$ as a mathematical vector space when one analyzes the code theoretically, but in practice one can never implement a code as a program or a circuit without specifying the basis vectors, or equivalently, the parity check and the generating matrices.

$^1$ For the present, we take the standpoint that any vector subspace of $\mathbb{F}_2^n$ is a code, whether or not it can actually correct errors.
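As a sanity check of condition (5), the following minimal sketch enumerates the family of all $m \times n$ Toeplitz matrices over $\mathbb{F}_2$ for small parameters and computes the smallest ε for which (5) holds. The function names (`toeplitz`, `in_kernel`, `epsilon_of_family`), the indexing convention for the Toeplitz entries, and the choice $n = 5$, $m = 3$ are assumptions made for illustration only; they do not come from the paper.

```python
from fractions import Fraction
from itertools import product


def toeplitz(bits, m, n):
    """m x n Toeplitz matrix over F_2: entry (i, j) depends only on i - j."""
    return [[bits[i - j + n - 1] for j in range(n)] for i in range(m)]


def in_kernel(M, x):
    """True if M x = 0 over F_2, i.e. x lies in the code C_r = ker M."""
    return all(sum(a & b for a, b in zip(row, x)) % 2 == 0 for row in M)


def epsilon_of_family(matrices, m, n):
    """Smallest eps with Pr[x in C_r] <= 2^{-m} eps for all nonzero x, as in (5)."""
    worst = Fraction(0)
    for x in product((0, 1), repeat=n):
        if not any(x):
            continue  # condition (5) only concerns nonzero x
        hits = sum(in_kernel(M, x) for M in matrices)
        worst = max(worst, Fraction(hits, len(matrices)))
    return worst * 2**m


# Illustrative (assumed) parameters; a Toeplitz matrix is fixed by n + m - 1 bits.
n, m = 5, 3
family = [toeplitz(bits, m, n) for bits in product((0, 1), repeat=n + m - 1)]
print(epsilon_of_family(family, m, n))
```

For these parameters the computed value is exactly 1, in line with the statement that the Toeplitz matrices form a universal$_2$ family. Exact rational arithmetic is used so that the comparison with 1 is not affected by rounding.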

B. Dual universality of a code family

From these arguments, we define the universality of error-correcting codes as follows.

Definition 1: We define the minimum (respectively, maximum) dimension of a code family $\mathcal{C} = \{C_r \mid r \in I\}$ as $t_{\min} := \min_{r \in I} \dim C_r$ (respectively, $t_{\max} := \max_{r \in I} \dim C_r$).

Definition 2: We define the dual code family $\mathcal{C}^\perp$ of a given linear code family $\mathcal{C} = \{C_r \mid r \in I\}$ as the set of all dual codes of $C_r$. That is, $\mathcal{C}^\perp = \{C_r^\perp \mid r \in I\}$.

Definition 3: We say that a linear code family $\mathcal{C} = \{C_r \subset \mathbb{F}_2^n \mid r \in I\}$ of minimum (or maximum) dimension $t$ is an ε-almost universal$_2$ code family if the following condition is satisfied:

$$\forall x \in \mathbb{F}_2^n \setminus \{0\}, \quad \Pr\left[x \in C_r\right] \le 2^{t-n}\varepsilon. \tag{6}$$
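Definitions 2 and 3 can be made concrete with a brute-force sketch. The code below takes the family of all one-dimensional codes in $\mathbb{F}_2^3$, builds the dual family, and computes for each the smallest ε satisfying (6). The helper names (`span`, `dual`, `epsilon`) and the toy family are hypothetical choices for illustration, not constructions from the paper.

```python
from fractions import Fraction
from itertools import product


def span(basis, n):
    """All F_2-linear combinations of the basis vectors (tuples of bits)."""
    words = {tuple([0] * n)}
    for b in basis:
        words |= {tuple(w_i ^ b_i for w_i, b_i in zip(w, b)) for w in words}
    return frozenset(words)


def dual(code, n):
    """Dual code: all y with inner product (x, y) = 0 for every x in the code."""
    return frozenset(
        y for y in product((0, 1), repeat=n)
        if all(sum(a & b for a, b in zip(x, y)) % 2 == 0 for x in code)
    )


def epsilon(family, n, t):
    """Smallest eps with Pr[x in C_r] <= 2^{t-n} eps for all nonzero x, as in (6)."""
    worst = max(
        Fraction(sum(x in C for C in family), len(family))
        for x in product((0, 1), repeat=n) if any(x)
    )
    return worst * Fraction(2**n, 2**t)


n = 3
nonzero = [x for x in product((0, 1), repeat=n) if any(x)]
family = [span([x], n) for x in nonzero]      # all 1-dimensional codes (assumed toy family)
dual_family = [dual(C, n) for C in family]    # their duals: 2-dimensional codes

print(epsilon(family, n, t=1))       # 4/7
print(epsilon(dual_family, n, t=2))  # 6/7
```

In this toy case every nonzero vector lies in exactly one of the seven codes and in exactly three of the seven dual codes, which gives the two printed values.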

As in the case of a universal$_2$ function family, ε is bounded from below by (2) as $\varepsilon \ge (2^n - 2^{n-t})/(2^n - 1)$. For the case where ε achieves this minimum, we say that $\mathcal{C}$ is optimally universal$_2$. Similarly, if ε = 1, we call $\mathcal{C}$ a universal$_2$ code family. We also introduce the notion of dual universality as follows.

Definition 4: We say that a code family $\mathcal{C}$ is ε-almost dual universal$_2$ if the dual family $\mathcal{C}^\perp$ is ε-almost universal$_2$.

Hence, accordingly,

Definition 5: A linear function family $\mathcal{F} = \{f_r \mid r \in I\}$ is ε-almost dual universal$_2$ if the kernels $C_r$ of $f_r$ form an ε-almost dual universal$_2$ code family.

An explicit example of a dual universal$_2$ function family (with ε = 1) can be given by the modified Toeplitz matrices mentioned earlier [14], i.e., a concatenation $(X, I)$ of the Toeplitz matrix $X$ and the identity matrix $I$. This example is particularly useful in practice because it is both universal$_2$ and dual universal$_2$ (cf. Fig. 1), and also because there exists an efficient algorithm with complexity $O(n \log n)$.

With these preliminaries, we can present the following main theorem of this section:

Theorem 1: Given an ε-almost universal$_2$ code family $\mathcal{C}$ of minimum dimension $t$, the dual code family $\mathcal{C}^\perp$ is a $\left(2(1 - 2^{t-n}\varepsilon) + (\varepsilon - 1)2^t\right)$-almost universal$_2$ code family with maximum dimension $n - t$. That is, for all $x \in \mathbb{F}_2^n \setminus \{0\}$, the dual code family $\mathcal{C}^\perp$ satisfies

$$\Pr\left[x \in C_r^\perp\right] \le (1 - 2^{t-n}\varepsilon)2^{-t+1} + \varepsilon - 1. \tag{7}$$

In other words, the code family $\mathcal{C}$ is also $\left(2(1 - 2^{t-n}\varepsilon) + (\varepsilon - 1)2^t\right)$-almost dual universal$_2$.

Proof: For $x, y \in \mathbb{F}_2^n$, let

$$p_x := \Pr\left[x \in C_r^\perp\right], \tag{8}$$

$$V_x := \{y \in \mathbb{F}_2^n \mid (x, y) = 0\} = \{x, 0\}^\perp, \tag{9}$$

where $(x, y)$ denotes the inner product of $x, y$. Since $\#(V_x \setminus \{0\}) = 2^{n-1} - 1$,

$$2^{t-n}\varepsilon\,(2^{n-1} - 1) = \sum_{y \in V_x \setminus \{0\}} 2^{t-n}\varepsilon \ge \sum_{y \in V_x \setminus \{0\}} \Pr\left[y \in C_r\right]. \tag{10}$$

Now, (i) if $x \in C_r^\perp$, then $C_r \subset V_x$, and we have $\dim(C_r \cap V_x) = \dim C_r \ge t$. Hence it follows that $\#(C_r \cap V_x \setminus \{0\}) = \#(C_r \setminus \{0\}) \ge 2^t - 1$. On the other hand, (ii) if $x \notin C_r^\perp$, we have $\dim(C_r \cap V_x) \ge t - 1$, and thus $\#(C_r \cap V_x \setminus \{0\}) \ge 2^{t-1} - 1$. Because $\sum_{y \in V_x \setminus \{0\}} \Pr[y \in C_r]$ is equal to the average of $\#(C_r \cap V_x \setminus \{0\})$ over $r$, relations (i) and (ii) yield

$$\sum_{y \in V_x \setminus \{0\}} \Pr\left[y \in C_r\right] \ge p_x(2^t - 1) + (1 - p_x)(2^{t-1} - 1) = 2^{t-1} + p_x 2^{t-1} - 1. \tag{11}$$

Combining (10) and (11), we have $2^{t-n}(2^{n-1} - 1)\varepsilon \ge 2^{t-1} + p_x 2^{t-1} - 1$, which leads to inequality (7).

Theorem 2: Inequality (7) of Theorem 1 is tight. That is, for an integer $t \le n$, an element $x \in \mathbb{F}_2^n \setminus \{0\}$, and a positive real number $\varepsilon \le \frac{2 - 2^{1-t}}{1 - 2^{1-n}}$, there exists an ε-almost universal$_2$ code family $\mathcal{C}$ with minimum dimension $t$ satisfying the equality of (7).

In the above theorem, the real number $\varepsilon = \frac{2 - 2^{1-t}}{1 - 2^{1-n}}$ is the maximum number satisfying $(1 - 2^{t-n}\varepsilon)2^{-t+1} + \varepsilon - 1 \le 1$.

Proof: Fix $x \in \mathbb{F}_2^n$. Then define a code family $\mathcal{A} = \{A_r\}$ in $\mathbb{F}_2^n$ as follows. Choose randomly a $t$-dimensional subspace of $V_x = \{y \in \mathbb{F}_2^n \mid (x, y) = 0\}$. That is, select $t$ linearly independent elements from $V_x$ randomly, and let them span a subspace $A_r$. Then one has:

$$y \in V_x \setminus \{0\}: \quad \Pr\left[y \in A_r\right] = \frac{2^t - 1}{2^{n-1} - 1}. \tag{12}$$

We also define another code family $\mathcal{B} = \{B_r\}$ as follows. First choose a $(t-1)$-dimensional subspace of $V_x$ randomly, and then add to it a basis element $z \notin V_x$, so that they form a $t$-dimensional subspace in total. Then the following equalities hold:

$$y \in V_x \setminus \{0\}: \quad \Pr\left[y \in B_r\right] = \frac{2^{t-1} - 1}{2^{n-1} - 1}, \tag{13}$$

$$y \notin V_x: \quad \Pr\left[y \in B_r\right] = 2^{t-n}. \tag{14}$$

Finally, define a code family $\mathcal{C} = \{C_r\}$ by combining $\mathcal{A}$ with probability $p$ and $\mathcal{B}$ with probability $1 - p$, where $p$ is defined by

$$p := \left(1 - 2^{t-n}\varepsilon\right)2^{-t+1} + \varepsilon - 1. \tag{15}$$

One may object that this construction using the probability $p$ deviates from our definition of a universal$_2$ code family, in which each element $C_r$ is chosen with uniform probability. One way to cure this problem is to include multiple copies of $\mathcal{A}$ and $\mathcal{B}$ in $\mathcal{C}$. For example, if $p = a/b$ with $a, b \in \mathbb{N}$, then construct $\mathcal{C}$ as a combination of $a$ copies of $\mathcal{A}$ and $b - a$ copies of $\mathcal{B}$. From (12), (13), and (14), it is straightforward to see that $\mathcal{C}$ is ε-almost universal$_2$. Also note that, since $x \in C_r^\perp$ holds only when $\mathcal{A}$ is chosen, we have

$$\Pr\left[x \in C_r^\perp\right] = p. \tag{16}$$

Hence, $\mathcal{C}$ indeed attains the equality of (7).

We give some useful examples of Theorems 1 and 2. We apply these results to several communication models in later sections.

Corollary 1: The following relations hold for a code family $\mathcal{C}$ and the dual family $\mathcal{C}^\perp$:
1) If $\mathcal{C}$ is optimally universal$_2$, $\mathcal{C}^\perp$ is also optimally universal$_2$. In other words, an optimally universal$_2$ family $\mathcal{C}$ is also optimally dual universal$_2$.
2) If $\mathcal{C}$ is universal$_2$ (i.e., 1-almost universal$_2$), $\mathcal{C}^\perp$ is 2-almost universal$_2$. In other words, a universal$_2$ family $\mathcal{C}$ is also 2-almost dual universal$_2$.
3) For ε > 1, however, an ε-almost universal$_2$ family $\mathcal{C}$ is not necessarily ε′-almost dual universal$_2$. That is, there is an example of an ε-almost universal$_2$ family $\mathcal{C}$ with $\max_x \Pr[x \in C_r^\perp] = 1$.

Proof: Items 1 and 2 are obvious. For item 3, choose ε so that the right hand side of (7) equals 1; by Theorem 2, this value is attained.

C. Generalization to subcode and extended code families

For the application to quantum key distribution, it is convenient to generalize the concept of a universal$_2$ code family to families $\mathcal{C}_2 = \{C_{2r}\}$ consisting solely of extended codes of a fixed code $C_1$.

Definition 6: Let $C_1 \subset \mathbb{F}_2^n$ be a fixed $m$-dimensional code. A code family $\mathcal{C}_2 = \{C_{2r} \mid r \in I\}$ is called an extended code family of $C_1$ if each $C_{2r}$ is an extended code of $C_1$, i.e., $\forall r \in I$, $C_1 \subset C_{2r}$.
An extended code family $\mathcal{C}_2$ of $C_1$ is called an ε-almost universal$_2$ extended code family of $C_1$ with minimum (or maximum) dimension $t$ if

$$\forall x \in \mathbb{F}_2^n \setminus C_1, \quad \Pr\left[x \in C_{2r}\right] = \Pr\left[[x] \subset C_{2r}\right] \le 2^{t-n}\varepsilon,$$

where $[x]$ denotes the coset with the representative $x$ in $\mathbb{F}_2^n / C_1$.

By considering the universality of the dual code family of such an extended code family, we are naturally led to the following definition of universal$_2$ subcode families.

Definition 7: Let $C_1 \subset \mathbb{F}_2^n$ be a fixed $m$-dimensional code. A code family $\mathcal{C}_2 = \{C_{2r} \mid r \in I\}$ is called a subcode family of $C_1$ if each $C_{2r}$ is a subcode of $C_1$, i.e., $\forall r \in I$, $C_{2r} \subset C_1$. A subcode family $\mathcal{C}_2$ of $C_1$ is called an ε-almost universal$_2$ subcode family of $C_1$ with minimum (or maximum) dimension $t$ if

$$\forall x \in C_1 \setminus \{0\}, \quad \Pr\left[x \in C_{2r}\right] \le 2^{t-m}\varepsilon.$$

One explicit construction of $\mathcal{C}_2$ is to first let $\mathcal{D} = \{D_r \subset \mathbb{F}_2^m \mid r \in I\}$ be a universal$_2$ code family with minimum dimension $t$, and then define the generating matrix of $C_{2r} \in \mathcal{C}_2$ by $G(C_{2r}) := G(D_r)G(C_1)$.

For these types of codes as well, we can prove theorems similar to Theorems 1 and 2.

Theorem 3: Let $C_1 \subset \mathbb{F}_2^n$ be a fixed $m$-dimensional code, and let $\mathcal{C}_2$ be an ε-almost universal$_2$ subcode family of $C_1$ with minimum dimension $t \le m$. Then the dual code family $\mathcal{C}_2^\perp$ is a $\left(2(1 - 2^{t-m}\varepsilon) + (\varepsilon - 1)2^t\right)$-almost universal$_2$ extended code (subcode) family of $C_1^\perp$ with maximum dimension $n - t$. That is,

$$\forall x \in \mathbb{F}_2^n \setminus C_1^\perp, \quad \Pr\left[x \in C_{2r}^\perp\right] \le (1 - 2^{t-m}\varepsilon)2^{-t+1} + \varepsilon - 1. \tag{17}$$

In other words, the subcode family $\mathcal{C}_2$ is also a $\left(2(1 - 2^{t-m}\varepsilon) + (\varepsilon - 1)2^t\right)$-almost dual universal$_2$ extended code family of $C_1$. Moreover, for an integer $t \le m$, an element $x \in \mathbb{F}_2^n \setminus C_1^\perp$, and a positive real number $\varepsilon \le \frac{2 - 2^{1-t}}{1 - 2^{1-m}}$, there exists an ε-almost universal$_2$ subcode family $\mathcal{C}_2$ of $C_1$ with minimum dimension $t$ satisfying the equality of (17).

Proof: For an ε-almost universal$_2$ subcode (extended code) family $\mathcal{C}_2$ of $C_1$, the equivalence relations $C_1 \cong \mathbb{F}_2^n / C_1^\perp \cong \mathbb{F}_2^m$ hold. The proofs of the above theorems with $\mathbb{F}_2^m$ can be applied to this theorem.

Theorem 4: Let $C_1 \subset \mathbb{F}_2^n$ be a fixed $m$-dimensional code, and let $\mathcal{C}_2$ be an ε-almost universal$_2$ extended code family of $C_1$ with minimum dimension $t \ge m$. Then the dual code family $\mathcal{C}_2^\perp$ is a $\left(2(1 - 2^{t-n}\varepsilon) + (\varepsilon - 1)2^{t-m}\right)$-almost universal$_2$ subcode family of $C_1^\perp$ with maximum dimension $n - t$. That is,

$$\forall x \in C_1^\perp \setminus \{0\}, \quad \Pr\left[x \in C_{2r}^\perp\right] \le (1 - 2^{t-n}\varepsilon)2^{-t+m+1} + \varepsilon - 1. \tag{18}$$

In other words, the extended code family $\mathcal{C}_2$ is also a $\left(2(1 - 2^{t-n}\varepsilon) + (\varepsilon - 1)2^{t-m}\right)$-almost dual universal$_2$ subcode family of $C_1$. Furthermore, for an integer $m \le t \le n$, an element $x \in C_1^\perp \setminus \{0\}$, and a positive real number $\varepsilon \le \frac{2 - 2^{1-t+m}}{1 - 2^{1-n+m}}$, there exists an ε-almost universal$_2$ extended code family $\mathcal{C}_2$ of $C_1$ with minimum dimension $t$ satisfying the equality of (18).

Proof: Similarly, for an ε-almost universal$_2$ extended code family $\mathcal{C}_2$ of $C_1$, the equivalence relations $\mathbb{F}_2^n / C_1 \cong C_1^\perp \cong \mathbb{F}_2^{n-m}$ hold. Under this equivalence, $C_{2r}/C_1$ can be regarded as a subspace of $\mathbb{F}_2^{n-m}$ with minimum dimension $t - m$. The proofs of the above theorems with $\mathbb{F}_2^{n-m}$ and the minimum dimension $t - m$ can be applied to this theorem.

Furthermore, the concept of a subcode family and an extended code family can be extended to the case where $C_1$ is also randomly chosen. A family of pairs of codes $\{C_{1,r} \subset C_{2,r}\}_r$ is called an ε-almost universal$_2$ extended code pair family with minimum (or maximum) dimension $t$ when it satisfies the condition

$$\forall x \in \mathbb{F}_2^n \setminus \{0\}, \quad \Pr\left[x \in C_{2,r} \setminus C_{1,r}\right] \le 2^{t-n}\varepsilon.$$

III. PERMUTED CODE FAMILY

In some applications, our setting is invariant under permutations of the order of bits in $\mathbb{F}_2^n$. For example, in the wire-tap channels which we consider in later sections, independent and identically distributed (i.i.d.) channels are assumed, and thus the protocol is invariant under permutations of bits. Then a code $C \subset \mathbb{F}_2^n$ has the same performance as any bit-permuted code of $C$. In order to formulate such situations, we introduce the permuted code family of a code $C$ as the code ensemble consisting of the bit-permuted codes of $C$:

$$\mathcal{C}_C := \{\sigma(C) \mid \sigma \in S_n\}. \tag{19}$$

Here $S_n$ denotes the symmetric group of degree $n$, and $\sigma(i) = j$ means that $\sigma \in S_n$ maps $i$ to $j$, where $i, j \in \{1, \dots, n\}$. The code $\sigma(C)$ is the one obtained by permuting the bits of $C$ by the permutation σ; if $x = (x_1, \dots, x_n) \in C$, then $x_\sigma := (x_{\sigma(1)}, \dots, x_{\sigma(n)}) \in \sigma(C)$.

In what follows, we denote the distribution of the Hamming weight $k$ of codewords in $C$ by $\Pr_C$; that is, the number of codewords with weight $k$ contained in $C$ is $|C| \Pr_C(k)$. Similarly, the weight distribution $\Pr_{\mathcal{C}}$ of a code family $\mathcal{C}$ is obtained by averaging $\Pr_C$ over $C \in \mathcal{C}$ with equal probability. By using these concepts, we can show the existence of a fixed code $C$ whose permuted code family $\mathcal{C}_C$ is ε-almost universal$_2$, with ε being sufficiently small.

Lemma 1: The permuted code family $\mathcal{C}_C$ is an $\varepsilon(C)$-almost universal$_2$ code family, where

$$\varepsilon(C) := \max_{1 \le k \le n} \varepsilon_k(C), \qquad \varepsilon_k(C) := \frac{2^t \Pr_C(k)}{\binom{n}{k}}, \tag{20}$$

and $t$ is the dimension of the code $C$.

Proof: Any code $C' \in \mathcal{C}_C$ has the identical weight distribution $\Pr_C$. By averaging over all $C' \in \mathcal{C}_C$, we see that the code family $\mathcal{C}_C$ also has the weight distribution $\Pr_C$. That is, a code $C' \in \mathcal{C}_C$ contains $2^t \Pr_C(k)$ elements of weight $k$ on average. On the other hand, the number of elements $x \in \mathbb{F}_2^n$ with weight $k$ is $\binom{n}{k}$, and due to the symmetry of $\mathcal{C}_C$ under bit permutations, each of them is contained in some $C' \in \mathcal{C}_C$ with the same probability. Thus, an element $x \in \mathbb{F}_2^n$ with weight $k$ belongs to the code $C' \in \mathcal{C}_C$ with probability $2^t \Pr_C(k) / \binom{n}{k}$. By taking the maximum with respect to $k$, we obtain (20).

Theorem 5: For any $t \le n$, there exists a $t$-dimensional code $C \subset \mathbb{F}_2^n$ such that $\varepsilon(C) \le n + 1$.

Proof: Let $\mathcal{C}$ be a universal$_2$ code family. Then $E\,\varepsilon_k(C) \le 1$. The Markov inequality yields

$$\Pr\{\varepsilon_k(C) \ge n + 1\} \le \frac{1}{n + 1}, \tag{21}$$

and thus

$$\Pr\left(\{\varepsilon_1(C) < n + 1, \dots, \varepsilon_n(C) < n + 1\}^c\right) = \Pr \bigcup_{1 \le k \le n} \{\varepsilon_k(C) \ge n + 1\} \le \frac{n}{n + 1}. \tag{22}$$

Hence, there exists a code $C$ such that

$$\varepsilon_k(C) < n + 1 \tag{23}$$

for $k = 1, \dots, n$.

Similarly, we can define the permuted extended code pair family for a given pair of codes $C_2 \subset C_1$ as the family of code pairs $\mathcal{C}_{C_2 \subset C_1} := \{\sigma(C_2) \subset \sigma(C_1) \mid \sigma \in S_n\}$. We define $\varepsilon(C_1/C_2) := \max_{1 \le k \le n} \varepsilon_k(C_1) - \varepsilon_k(C_2)$. Using the same discussion as the proof of Theorem 5, we can show that the permuted code pair family $\mathcal{C}_{C_2 \subset C_1}$ is an $\varepsilon(C_1/C_2)$-almost universal$_2$ extended code pair family. Furthermore, we can show the following theorem.

Theorem 6: For any $t \le n$ and a code $C_2$, there exists a $t$-dimensional code $C_1 \subset \mathbb{F}_2^n$ such that $C_2 \subset C_1$ and $\varepsilon(C_1/C_2) \le n + 1$.

This theorem can be shown in the same way as Theorem 5 by choosing $\mathcal{C}$ as a universal$_2$ extended code family of $C_2$. In later sections, we use these results for showing the existence of deterministic hash functions that work universally for classical and quantum wire-tap channels and for randomness extraction.
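The counting step in the proof of Lemma 1 can be checked by brute force on a small example. The sketch below builds the permuted family of a toy code in $\mathbb{F}_2^4$ by enumerating all of $S_4$, and verifies that a vector of weight $k$ belongs to a permuted code with probability equal to the number of weight-$k$ codewords divided by $\binom{n}{k}$. The toy code and the helper names (`span`, `permute`, `membership_probability`) are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import permutations, product
from math import comb


def span(basis, n):
    """All F_2-linear combinations of the basis vectors (tuples of bits)."""
    words = {tuple([0] * n)}
    for b in basis:
        words |= {tuple(w_i ^ b_i for w_i, b_i in zip(w, b)) for w in words}
    return frozenset(words)


def permute(code, sigma):
    """sigma(C): each codeword x is mapped to (x_sigma(1), ..., x_sigma(n))."""
    return frozenset(tuple(x[s] for s in sigma) for x in code)


n = 4
C = span([(1, 1, 0, 0), (1, 0, 1, 0)], n)   # assumed toy code of dimension t = 2
family = [permute(C, sigma) for sigma in permutations(range(n))]

# Number of weight-k codewords of C, i.e. 2^t * Pr_C(k) in the notation of the text.
count = {k: sum(1 for x in C if sum(x) == k) for k in range(n + 1)}


def membership_probability(x):
    """Probability that x lies in sigma(C) for a uniformly random sigma in S_n."""
    return Fraction(sum(x in D for D in family), len(family))


# The probability depends on x only through its Hamming weight k.
for x in product((0, 1), repeat=n):
    k = sum(x)
    assert membership_probability(x) == Fraction(count[k], comb(n, k))

print(membership_probability((0, 1, 0, 1)))  # 1/2
```

The family is kept as a list indexed by all permutations, so that repeated codes are counted with their multiplicity, matching the ensemble over $\sigma \in S_n$ in (19).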


IV. APPLICATION TO ERROR CORRECTING CODES

In this section, as a preliminary for later sections, we apply the results of Section II to error correction. We use a code $C \in \mathcal{C}$ chosen randomly from an ε-almost universal$_2$ code family $\mathcal{C}$ for error correction, and show that it indeed serves as a good code. As previous work, for example, Brassard and Salvail applied universal$_2$ codes in the context of information reconciliation (Ref. [3], Theorem 6). Muramatsu and Miyake have also studied a similar problem using a somewhat generalized definition of universal hash functions [24]. Here we present a much simpler evaluation by employing a more restrictive condition for the ensemble of codes than [24].

We consider a noisy channel with additive noise, and denote the probability that the noise $x \in \mathbb{F}_2^n$ occurs by $P^X(x)$. We also denote by $\hat P^X(k)$ the probability that an error with Hamming weight $k$ occurs. In this channel, the sender Alice uses an ε-almost universal$_2$ code family as error correcting codes. The receiver Bob applies the maximum likelihood decoder to his received message. In order to evaluate its performance, we focus on the decoding error probability, i.e., the probability with which the decoder makes a wrong guess. We denote the decoding error probability by $P_e(C)$ for a fixed code $C$.

From now on, we often treat a code $C$ as a random variable that is chosen with equal probability from the ε-almost universal$_2$ code family $\mathcal{C}$. For example, we denote the expectation of a variable $A$ with respect to the random variable $C$ as $E_{C \in \mathcal{C}} A$. In this notation, the main purpose of this section is to evaluate $E_{C \in \mathcal{C}} P_e(C)$, i.e., the average of $P_e(C)$ when $C$ is randomly chosen from $\mathcal{C}$.

First we consider the case where an ε-almost universal$_2$ code family $\mathcal{C}$ with maximum dimension $t_{\max}$ is used. Hence the decoder outputs $t_{\max}$ bits, and the coding rate is $R = t_{\max}/n$. If a bit flip error of $k$ bits occurs in the channel, the average decoding error probability is less than $\min\{2^{nh(k/n)} 2^{t_{\max} - n}\varepsilon, 1\} \le \varepsilon 2^{-n[1 - h(k/n) - R]_+}$ for $\varepsilon \ge 1$, where $[x]_+ := \max\{x, 0\}$ and $h(q) := -q \log_2 q - (1 - q)\log_2(1 - q)$ is the binary entropy function. This is because the decoding error of the maximum likelihood decoder occurs when $\{x : |x| \le k\} \cap (C \setminus \{0\}) \ne \emptyset$. For a given weight distribution $\hat P^X(k)$ of errors, we obtain

$$E_{C \in \mathcal{C}} P_e(C) \le \varepsilon \sum_{k=0}^{n} \hat P^X(k)\, 2^{-n[1 - h(k/n) - R]_+}. \tag{24}$$
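To get a feeling for the size of the right hand side of (24), the following sketch evaluates it numerically. It assumes, purely for illustration, that the error weights follow a binomial distribution with bit-flip probability $p$ (a memoryless binary symmetric channel); the function names and the parameter values $R = 0.3$, $p = 0.05$ are likewise hypothetical.

```python
from math import comb, log2


def h(q):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    return 0.0 if q in (0.0, 1.0) else -q * log2(q) - (1 - q) * log2(1 - q)


def bound_24(n, R, p, eps=1.0):
    """Right-hand side of (24) for binomially distributed error weights (assumption)."""
    total = 0.0
    for k in range(n + 1):
        weight_prob = comb(n, k) * p**k * (1 - p) ** (n - k)  # assumed \hat P^X(k)
        exponent = max(1 - h(k / n) - R, 0.0)                  # [1 - h(k/n) - R]_+
        total += weight_prob * 2.0 ** (-n * exponent)
    return eps * total


# Illustrative parameters: rate R below 1 - h(p), increasing block length n.
for n in (50, 100, 200):
    print(n, bound_24(n, R=0.3, p=0.05))
```

With these assumed parameters the rate lies below $1 - h(p)$, and the printed values shrink as $n$ grows. Since every term of the sum is at most $\hat P^X(k)$, the bound never exceeds ε.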

As to the asymptotic behavior, one can easily see that, when the probability $\hat P^X\{k \mid 1 - h(k/n) > R + \delta\}$ approaches 1 for sufficiently small $\delta > 0$, the right hand side of (24) converges to zero. However, for finite $n$, it is not easy to calculate similar bounds. Hence, we next further assume that the channel is memoryless; that is, the probability distribution $P^X$ of errors $x$ is assumed to be the binary distribution with probability $p$. In this channel, the maximum-likelihood decoder is equivalent to the minimum Hamming distance decoder. In this case, by modifying Gallager’s bound for random coding [8], we can obtain the following simple bound.

Theorem 7: When $P^X(x)$ is given as the $n$-fold independent and identical distribution of the distribution $(1 - p, p)$, the average decoding error probability of error correction using an ε-almost universal$_2$ code family $\mathcal{C}$ with maximum dimension $t_{\max} = nR$ satisfies

$$E_{C \in \mathcal{C}} P_e(C) \le \min_{0 \le s \le 1} \varepsilon^s\, 2^{-n[-sR + E_0(s, p)]}, \tag{25}$$

where

$$E_0(s, p) := s - \log_2 \left[ p^{\frac{1}{1+s}} + (1 - p)^{\frac{1}{1+s}} \right]^{1+s}. \tag{26}$$

This theorem is shown in Appendix A. The function $E_0(s, p)$ defined in (26) is in fact the specialized form of Gallager’s $E_0(s, p)$ for the binary symmetric channel and the uniform input distribution [8]. Hence, by using the method of [8], the right hand side of (25) can be used to evaluate the exponential decreasing rate of $E_{C \in \mathcal{C}} P_e(C)$ with respect to $n$ as follows.

Corollary 2: Under the same conditions as Theorem 7, $E_C P_e(C)$ can be bounded from above as

$$E_C P_e(C) \le 2^{-nE(R, p)} \max\{\varepsilon, 1\}, \tag{27}$$

where $E(R, p)$ is Gallager’s reliability function

$$E(R, p) := \max_{0 \le s \le 1} -sR + E_0(s, p). \tag{28}$$

In particular, $E(R,p)$ is strictly positive for $R < 1 - h(p)$.

Proof of Corollary 2: The first half of the corollary is obvious. Denote the argument of the maximum by $E_R(s,p) := -sR + E_0(s,p)$. Then $E_R(0,p) = 0$, and $\frac{\partial}{\partial s} E_R(s,p)\big|_{s=0} = 1 - h(p) - R > 0$ if $R < 1 - h(p)$. Hence $E_R(s,p)$ attains its positive maximum value at $s \in (0,1]$. (Also see Ref. [8].)

The exponential decreasing rate $E(R,p)$ of (27) can also be verified from (24) by using the type method [7]. Since $\hat{P}^X(k) \le 2^{-nd(q\|p)}$ with $q = k/n$ for the binary symmetric channel [7], the right hand side of (24) can be evaluated as
\[ \begin{aligned} \varepsilon \sum_{k=0}^{n} \hat{P}^X(k)\, 2^{-n[1-h(k/n)-R]_+} &\le \varepsilon (n+1) \max_{k} \hat{P}^X(k)\, 2^{-n[1-h(k/n)-R]_+} \\ &\le (n+1)\,\varepsilon \max_{0\le q\le 1} 2^{-n\left([1-h(q)-R]_+ + d(q\|p)\right)} \\ &= (n+1)\,\varepsilon\, 2^{-n \min_{0\le q\le 1} \left([1-h(q)-R]_+ + d(q\|p)\right)}, \end{aligned} \tag{29} \]
where $d(q\|p) := q \log\frac{q}{p} + (1-q)\log\frac{1-q}{1-p}$. One can see that the exponential decreasing rate of (29) indeed equals $E(R,p)$ by using the relation
\[ \min_{0\le q\le 1} \left\{[1-h(q)-R]_+ + d(q\|p)\right\} = \max_{0\le s\le 1} \left\{-sR + E_0(s,p)\right\}. \tag{30} \]
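Relation (30) can also be checked numerically. The sketch below evaluates both sides on a uniform grid, taking all logarithms base 2 (an assumption consistent with the $\log_2$ in (26)); the function names and the grid resolution are illustrative choices, and the grid search is only an approximation of the true optimum.

```python
import math


def h2(q):
    """Binary entropy in bits."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1.0 - q) * math.log2(1.0 - q)


def kl2(q, p):
    """Binary relative entropy d(q||p) in bits, for 0 < p < 1."""
    total = 0.0
    if q > 0.0:
        total += q * math.log2(q / p)
    if q < 1.0:
        total += (1.0 - q) * math.log2((1.0 - q) / (1.0 - p))
    return total


def e0(s, p):
    """E_0(s, p) as defined in (26)."""
    inner = p ** (1.0 / (1.0 + s)) + (1.0 - p) ** (1.0 / (1.0 + s))
    return s - (1.0 + s) * math.log2(inner)


def exponent_max_form(rate, p, steps=20000):
    """Right hand side of (30): max over s in [0, 1] of -sR + E_0(s, p)."""
    return max(-s * rate + e0(s, p) for s in (i / steps for i in range(steps + 1)))


def exponent_min_form(rate, p, steps=20000):
    """Left hand side of (30): min over q in [0, 1] of [1 - h(q) - R]_+ + d(q||p)."""
    return min(max(1.0 - h2(q) - rate, 0.0) + kl2(q, p)
               for q in (i / steps for i in range(steps + 1)))


if __name__ == "__main__":
    for rate, p in ((0.2, 0.05), (0.5, 0.05), (0.9, 0.05)):
        print(rate, p, exponent_max_form(rate, p), exponent_min_form(rate, p))
```

The two columns agree up to the grid error, and both vanish once $R$ exceeds $1 - h(p)$, in line with Corollary 2.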

The proof of this relation is given, e.g., in Csiszár-Körner [7] in a more general form. However, since a simpler proof of (30) can be given by using the property of additive channels, we reproduce it in Appendix B for readers' convenience.

Now, we consider the case where the sender and the receiver use a fixed $t$-dimensional code $C$ that satisfies the condition of Theorem 5, i.e., a code $C$ whose permuted code family $\mathcal{C}_C$ is $(n+1)$-almost universal$_2$. If the error distribution $P^X$ is permutation invariant, e.g., if the channel is binary symmetric, we have $P_e(C) = P_e(\sigma(C))$ for any permutation $\sigma \in S_n$, which implies that $P_e(C) = \mathrm{E}_{\sigma\in S_n} P_e(\sigma(C))$. In other words, one may evaluate $P_e(C)$ as if the code family $\mathcal{C}_C$ were actually used. Thus, by applying (25) and by noting $n + 1 > 1$, we obtain the inequality
\[ P_e(C) \le (n+1)\, 2^{-nE(R,p)} \tag{31} \]

with $R = t/n$. Note that the code $C$ satisfies this inequality for any $p$.

In the rest of this section, we show that the above results also hold for the case where the information is encoded by the coset $C_1/C_2$ of two given codes $C_1$ and $C_2$ satisfying $C_2 \subset C_1 \subset \mathbb{F}_2^n$. These codes are used for constructions of the quantum Calderbank-Shor-Steane (CSS) codes, and for this reason, they are often called the classical CSS codes. In this section, we restrict ourselves to the following type of classical communication. A message to be sent is a coset $[x] \in C_1/C_2$, and when the sender wants to send $[x]$, she chooses an element from the set $x + C_2$ with equal probability and sends it. On the receiver's side, Bob first applies the maximum likelihood decoder of $C_1$ to the received sequence and obtains an element $y \in C_1$. Then, he obtains a coset $[y] \in C_1/C_2$ as the final decoded message. We denote the decoding error probability of this decoder by $P_e(C_1/C_2)$.

We assume that the subcode $C_2$ is fixed, and the larger code $C_1$ is chosen with equal probability from the ε-almost universal$_2$ extended code family $\mathcal{C}$ of $C_2$ with maximum dimension $t_{\max}$. Again, the purpose of the following discussion is to evaluate $\mathrm{E}_{C_1\in\mathcal{C}} P_e(C_1/C_2)$. By a similar argument as above, when the bit flip error occurs on $k$ bits in the noisy channel, we can show that $\mathrm{E}_{C_1\in\mathcal{C}} P_e(C_1/C_2)$ is less than $\min\{2^{nh(k/n)} \varepsilon 2^{t_{\max}-n}, 1\} \le \varepsilon 2^{-n[1-h(k/n)-R]_+}$, $R = t_{\max}/n$, for $\varepsilon \ge 1$. Thus, for any weight distribution $\hat{P}^X$ of errors, we have
\[ \mathrm{E}_{C_1\in\mathcal{C}} P_e(C_1/C_2) \le \varepsilon \sum_{k=0}^{n} \hat{P}^X(k)\, 2^{-n[1-h(k/n)-R]_+}. \tag{32} \]

If we further assume the channel is memoryless, as a generalization of Theorem 7 and Corollary 2, we have the following.

Theorem 8: When $P^X(x)$ is given as the $n$-th independent and identical distribution of the distribution $(1-p, p)$, then an ε-almost universal$_2$ extended code family $\mathcal{C}$ of $C_2$ with the maximum dimension $t_{\max} = nR$ satisfies
\[ \mathrm{E}_{C_1\in\mathcal{C}} P_e(C_1/C_2) \le \min_{0\le s\le 1} \varepsilon^s\, 2^{-n[-sR+E_0(s,p)]}, \tag{33} \]
and thus
\[ \mathrm{E}_{C_1\in\mathcal{C}} P_e(C_1/C_2) \le 2^{-nE(R,p)} \max\{\varepsilon, 1\}. \tag{34} \]
Similarly, an ε-almost universal$_2$ extended code pair family $\{C_{1,r} \subset C_{2,r}\}$ satisfies
\[ \mathrm{E}_{C_{1,r}\subset C_{2,r}} P_e(C_{1,r}/C_{2,r}) \le \min_{0\le s\le 1} \varepsilon^s\, 2^{-n[-sR+E_0(s,p)]}, \tag{35} \]
\[ \mathrm{E}_{C_{1,r}\subset C_{2,r}} P_e(C_{1,r}/C_{2,r}) \le 2^{-nE(R,p)} \max\{\varepsilon, 1\}. \tag{36} \]

This theorem is also shown in Appendix A in a way similar to Theorem 7.

Finally, for a given code $C_2$, we can choose another fixed code $C_1$ satisfying the condition of Theorem 6, i.e., $C_2 \subset C_1$ and $\varepsilon(C_1/C_2) \le n + 1$. We then assume that the sender and the receiver use this fixed pair for error correction. If the distribution $P^X$ is permutation invariant, we have $P_e(C_1/C_2) = P_e(\sigma(C_1)/\sigma(C_2))$ for any permutation $\sigma \in S_n$, which implies that $P_e(C_1/C_2) = \mathrm{E}_{\sigma\in S_n} P_e(\sigma(C_1)/\sigma(C_2))$. Thus one may evaluate $P_e(C_1/C_2)$ as if the $(n+1)$-almost universal$_2$ permuted code family $\mathcal{C}_{C_2\subset C_1}$ were actually used. Applying (33), we obtain the inequality
\[ P_e(C_1/C_2) \le (n+1)\, 2^{-nE(R,p)}. \tag{37} \]
Note that the code $C_1$ satisfies this inequality for any $p$.

V. QUANTUM KEY DISTRIBUTION

In this section, we apply the results of previous sections to the security proof of quantum key distribution (QKD). In QKD, Alice and Bob need to perform a key distillation protocol to generate a secret key from the sifted key that they obtained as a result of the quantum communication. We consider the following type of the BB84 protocol using a function family $\mathcal{F} = \{f_r : \mathbb{F}_2^m \to \mathbb{F}_2^l \mid r \in I\}$ for privacy amplification.

BB84 protocol using universal hash function family:
1) Alice and Bob establish sifted keys, and estimate the bit error rate by the usual procedure of the BB84 protocol, such as the one given in [26]. That is,
   a) Alice sends Bob qubit states chosen randomly out of $\{|0_z\rangle, |1_z\rangle, |0_x\rangle, |1_x\rangle\}$.
   b) Bob receives and measures them with randomly chosen bases $\{z, x\}$.
   c) By using the authenticated public channel, Bob announces his measurement bases for all qubits, and they keep only the bits for which they chose the same basis.
   d) They reveal randomly sampled bits over the public channel, and calculate the estimated bit error rate. If the rate is too high, they abort the protocol.
   As a result, Alice and Bob obtain sifted keys $k_A, k_B \in \mathbb{F}_2^n$, respectively.
2) Alice picks a random number $r_A \in \mathbb{F}_2^m$, and announces $v = k_A \oplus G(C_1)^T r_A$, with $\oplus$ denoting XOR.
3) Bob calculates $R'_B = k_B \oplus v$ and by correcting its errors using $C_1$, he obtains $R_B \in C_1$. Then he calculates the raw bits $r_B \in \mathbb{F}_2^m$ satisfying $R_B = G(C_1)^T r_B$. (Thus $r_A = r_B$ with high probability.)
4) Alice selects a linear universal$_2$ function $f_r : \mathbb{F}_2^m \to \mathbb{F}_2^l$ randomly and announces it to Bob. Then they calculate secret keys $s_A = f_r(r_A)$ and $s_B = f_r(r_B)$.

By using the widely used proof technique due to Shor and Preskill [26], [10], [29], [14], the unconditional security of this protocol has been shown for the case where $\mathcal{F}$ consists of the completely random linear functions [29], [14].
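Steps 2) to 4) of the protocol above can be sketched in a few lines of Python. The example is a toy under loudly stated assumptions: $C_1$ is taken to be the [7,4] Hamming code, Bob decodes by brute-force minimum distance, the sifted keys differ in exactly one position, and the hash is a completely random linear map given by a random binary matrix. None of these choices come from the text, and the function names are hypothetical.

```python
import itertools
import random

# Generator matrix of the [7,4] Hamming code, used as a toy C1 (an assumption).
G = [
    (1, 0, 0, 0, 1, 1, 0),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 0, 1, 1),
    (0, 0, 0, 1, 1, 1, 1),
]


def xor(a, b):
    """Bitwise XOR of two equal-length bit tuples."""
    return tuple(x ^ y for x, y in zip(a, b))


def encode(r):
    """Return G^T r over F_2, i.e. the codeword of C1 indexed by r."""
    return tuple(sum(r[i] * G[i][j] for i in range(len(G))) % 2
                 for j in range(len(G[0])))


def decode(word):
    """Brute-force minimum-distance decoding; returns the index r of the nearest codeword."""
    return min(itertools.product((0, 1), repeat=len(G)),
               key=lambda r: sum(xor(encode(r), word)))


def linear_hash(matrix, r):
    """Apply a binary matrix to r over F_2 (a linear hash function)."""
    return tuple(sum(a * b for a, b in zip(row, r)) % 2 for row in matrix)


def run_toy_protocol(seed=0, key_len=2):
    """Run steps 2)-4) once and return the pair of final keys (s_A, s_B)."""
    rng = random.Random(seed)
    n, m = len(G[0]), len(G)
    k_a = tuple(rng.randint(0, 1) for _ in range(n))          # Alice's sifted key
    error = [0] * n
    error[rng.randrange(n)] = 1                                # one bit flip (assumed)
    k_b = xor(k_a, tuple(error))                               # Bob's sifted key
    r_a = tuple(rng.randint(0, 1) for _ in range(m))
    v = xor(k_a, encode(r_a))                                  # step 2: public message
    r_b = decode(xor(k_b, v))                                  # step 3: error correction
    hash_matrix = [tuple(rng.randint(0, 1) for _ in range(m))  # step 4: random linear hash
                   for _ in range(key_len)]
    return linear_hash(hash_matrix, r_a), linear_hash(hash_matrix, r_b)


if __name__ == "__main__":
    print(run_toy_protocol(seed=1))
```

Because the Hamming code corrects any single bit flip, the two returned keys coincide in this toy setting; the sketch says nothing about security, which is the subject of the rest of this section.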
On the other hand, by using the quantum de Finetti representation theorem, Renner proved the unconditional security of the BB84 protocol using universal$_2$ hash functions for privacy amplification [25]. In this section, we present a security proof of the Shor-Preskill type that holds with a weaker condition on $\mathcal{F}$, i.e., with $\mathcal{F}$ being an ε-almost dual universal$_2$ family. Note that the condition on $\mathcal{F}$ is indeed relaxed, since, as shown in Sec. II, the universal$_2$ function family is a limited case of ε-almost dual universal$_2$ families. Note also that our method has an extra advantage that, unlike in [25], Alice and Bob do not need to perform random permutations of the sifted key bits. Conversely, if the random permutation is already implemented in one's QKD system, or if the channel is permutation invariant, our hash function can be replaced by the one using the deterministic code obtained in Theorem 6, since the permuted codes of this code form an $(n+1)$-almost dual universal$_2$ code family.

For showing the security, it is convenient to rewrite the protocol in terms of the classical CSS code as follows.

BB84 protocol using code family $C_2$:
1) Alice and Bob establish sifted keys $k_A, k_B \in \mathbb{F}_2^n$ by the same procedure as in the above protocol.
2) Alice picks $R_A \in C_1$ randomly and sends $v = k_A \oplus R_A$ to Bob over the public channel.
3) Bob calculates $R_B = v \oplus k_B$, and by correcting its errors using $C_1$, he obtains $R'_B \in C_1$. (Thus $R_A = R'_B$ with high probability.)
4) Alice selects a code $C_{2r}$ randomly and announces it to Bob. They both obtain secret keys as cosets of $C_{2r}$, i.e., $S_A = R_A + C_{2r}$, $S_B = R'_B + C_{2r}$.

For the sake of simplicity, we will restrict ourselves to this protocol for the rest of this section.

We begin by reviewing some of the known results and clarifying notation. Assume that the quantum channel between Alice and Bob is given by an arbitrary quantum operation $\Lambda$, and thus the sifted key is affected by $\Lambda$.
As discussed in [13], [14], since the above type of the BB84 protocol is invariant under twirling of qubits, without loss of generality, one may consider the Pauli channel $\Lambda_t$ obtained by twirling the original channel $\Lambda$. The Pauli channel $\Lambda_t$ can generally be described by the joint probability distribution $P^{XZ}$ of phase error and bit error (in this section, we call an error in the $x$ basis the phase error, and in the $z$ basis the bit error). That is, $\Lambda_t$ transforms an $n$-qubit state $\rho$ to
\[ \Lambda_t(\rho) = \sum_{x,z\in\mathbb{F}_2^n} P^{XZ}(x,z)\, Z^x X^z \rho \left(Z^x X^z\right)^\dagger, \tag{38} \]
where
\[ Z^x := \sigma_z^{x_1} \otimes \cdots \otimes \sigma_z^{x_n}, \qquad X^z := \sigma_x^{z_1} \otimes \cdots \otimes \sigma_x^{z_n}, \]
with $\sigma_x$ and $\sigma_z$ being the Pauli matrices, and $x = (x_1, \ldots, x_n)$, $z = (z_1, \ldots, z_n) \in \{0,1\}^n$. We denote the marginal distribution of phase error by $P^X(x) = \sum_{z\in\mathbb{F}_2^n} P^{XZ}(x,z)$. As in the previous section, $\hat{P}^X(k)$ denotes the distribution of the Hamming weight $k$ of $x$ obeying $P^X(x)$.

Next, before considering the secret key, we evaluate the security of the sifted key $v$ as an illustration. The result here will also be used in later sections on wire-tap channels and randomness extraction. Let $\rho_{A,E}$ be Alice's and Eve's total system when the first step of the protocol (i.e., the quantum communication part) is finished. If one employs the security criterion that takes into account the universal composability [25], the security of the sifted key can be evaluated by Eve's distinguishability $\|\rho_{A,E} - \rho_A \otimes \rho_E\|_1$, with $\rho_A := \mathrm{Tr}_E\, \rho_{A,E}$ and $\rho_E := \mathrm{Tr}_A\, \rho_{A,E}$.$^2$ Alternatively, one may evaluate the security by Eve's Holevo information $\chi := \mathrm{Tr}\, \rho_{A,E}(\log \rho_{A,E} - \log \rho_A \otimes \rho_E)$. These values are known to be bounded from above as [13], [14]
\[ \|\rho_{A,E} - \rho_A \otimes \rho_E\|_1 \le 2\sqrt{2}\sqrt{P_{ph}}, \tag{39} \]
\[ \chi \le \eta_n(P_{ph}), \tag{40} \]
where $P_{ph}$ is the phase error probability of the channel $\Lambda_t$. That is, $P_{ph} := 1 - P^X(x = 0^n)$. The function $\eta_n$ is defined as
\[ \eta_n(x) := \begin{cases} -x\log x - (1-x)\log(1-x) + nx & \text{if } x \le 1/2, \\ 1 + nx & \text{if } x > 1/2. \end{cases} \tag{41} \]

Now we turn to the security of the secret key. The only difference here is that the key is effectively sent through the quantum channel that is error-corrected by the quantum CSS code corresponding to the classical CSS code $C_1, C_2$. Hence by using essentially the same argument as above, the security can be evaluated by the phase error probability that remains after the quantum error correction. When one sees it in the phase basis (i.e., the $x$ basis), this probability is given by the decoding error probability of the classical CSS code $C_2^\perp/C_1^\perp$, which we denote by $P_{ph}(C_2^\perp/C_1^\perp)$.
Then the security of the secret key can be evaluated as
\[ \|\rho_{A,E} - \rho_A \otimes \rho_E\|_1 \le 2\sqrt{2}\sqrt{P_{ph}(C_2^\perp/C_1^\perp)}, \tag{42} \]
\[ \chi \le \eta_l\left(P_{ph}(C_2^\perp/C_1^\perp)\right). \tag{43} \]
For the case of $C_1 = \mathbb{F}_2^n$, essentially the same relation was noted by Koashi [19] and Miyadera [22].

Then we apply the results of the previous section to evaluate $P_{ph}(C_2^\perp/C_1^\perp)$. In our BB84 protocol, the subcode $C_2 \subset C_1$ is randomly chosen from an ε-almost dual universal$_2$ subcode family $\mathcal{C}$ with minimum dimension $m - l$ of a fixed code $C_1$. This corresponds to the case where the dual code $C_2^\perp$ is chosen from the ε-almost universal$_2$ extended code family of the fixed code $C_1^\perp$ with maximum dimension $n - m + l$. Thus by applying inequality (32), we have
\[ \mathrm{E}_{C_2\in\mathcal{C}} P_{ph}(C_2^\perp/C_1^\perp) \le \varepsilon \sum_{k=0}^{n} \hat{P}^X(k)\, 2^{-n[S-h(k/n)]_+}, \tag{44} \]

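where $S = (m - l)/n$ is the sacrificed bit rate, i.e., the ratio of bits reduced by privacy amplification. Therefore, from (39), (40), and from the concavity of $x \mapsto \sqrt{x}$ and $x \mapsto \eta_l(x)$, we have
\[ \mathrm{E}_{C_2\in\mathcal{C}} \|\rho_{A,E} - \rho_A \otimes \rho_E\|_1 \le 2\sqrt{2}\sqrt{\varepsilon \sum_{k=0}^{n} \hat{P}^X(k)\, 2^{-n[S-h(k/n)]_+}}, \]
\[ \mathrm{E}_{C_2\in\mathcal{C}}\, \chi \le \eta_l\left(\varepsilon \sum_{k=0}^{n} \hat{P}^X(k)\, 2^{-n[S-h(k/n)]_+}\right). \]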

In practical QKD systems, the weight distribution $\hat{P}^X$ needs to be estimated from the bit error rate of sampled bits (see, e.g., [13], [14]). If the phase error rate $p_{ph} = k/n$ is estimated to be less than a certain value $\hat{p}_{ph}$ with the exception of a negligibly small probability, and if $S > h(\hat{p}_{ph})$, then the argument $\varepsilon 2^{-n[S-h(k/n)]_+}$ converges to zero for $n \to \infty$. Asymptotically, it is sufficient to sacrifice $n[h(\hat{p}_{ph}) + \delta]$ bits by privacy amplification with an arbitrary $\delta > 0$.

From the above argument, we see that for the security of QKD, it is sufficient to choose the code $C_2$ from an ε-almost dual universal$_2$ subcode family of $C_1$, while the existing results (e.g., [25]) guarantee the security only when the code $C_2$ is randomly chosen from a universal$_2$ subcode family of $C_1$. Since a universal$_2$ subcode family of $C_1$ is a 2-almost dual universal$_2$ subcode family of $C_1$ (Theorem 4), our condition is strictly weaker than that of [25]. It should also be noted that by setting $C_1 = \mathbb{F}_2^n$, our argument also applies to Koashi's proof technique [19]; that is, random matrices appearing in Koashi's protocol can be replaced by an almost dual universal$_2$ code family.

$^2$ Recall that, in our protocol, Alice is assumed to choose her sifted key uniformly. Hence $\rho_{A,E}$ can generally be described as $\rho_{A,E} := \sum_{v_1,\ldots,v_n} \frac{1}{2^n} |v_1,\ldots,v_n\rangle\langle v_1,\ldots,v_n| \otimes \rho_E(v_1,\ldots,v_n)$, where $\rho_E(v_1,\ldots,v_n)$ denotes Eve's density matrix when Alice's sifted key is $v = (v_1,\ldots,v_n)$. In this case, $\mathrm{Tr}_E\, \rho_{A,E} = \sum_{v_1,\ldots,v_n} \frac{1}{2^n} |v_1,\ldots,v_n\rangle\langle v_1,\ldots,v_n|$ gives the fully mixed state.
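To get a feel for the finite-length numbers, the following Python sketch evaluates the bounds of this section in a simplified setting. It assumes that every phase error pattern has weight at most $n\hat{p}_{ph}$ with $\hat{p}_{ph} \le 1/2$ (so the estimation failure probability is ignored), that the logarithms in (41) are base 2, and that the bound on the phase error probability is capped at 1. These simplifications and the function names are ours.

```python
import math


def h2(q):
    """Binary entropy in bits."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1.0 - q) * math.log2(1.0 - q)


def eta(length, x):
    """The function eta_n(x) of (41), with base-2 logarithms (an assumption)."""
    if x <= 0.5:
        return h2(x) + length * x
    return 1.0 + length * x


def phase_error_bound(n, sacrifice_rate, p_hat, eps=1.0):
    """eps * 2^{-n [S - h(p_hat)]_+}, capped at 1 since it bounds a probability."""
    exponent = max(sacrifice_rate - h2(p_hat), 0.0)
    return min(eps * 2.0 ** (-n * exponent), 1.0)


def key_security_bounds(n, key_len, sacrifice_rate, p_hat, eps=1.0):
    """Return (trace-distance bound, Holevo-information bound) in the style of (42), (43)."""
    p_ph = phase_error_bound(n, sacrifice_rate, p_hat, eps)
    trace_distance = 2.0 * math.sqrt(2.0) * math.sqrt(p_ph)
    holevo = eta(key_len, p_ph)
    return trace_distance, holevo


if __name__ == "__main__":
    for n in (1000, 10000):
        print(n, key_security_bounds(n, key_len=n // 2, sacrifice_rate=0.35, p_hat=0.05))
```

With $S$ below $h(\hat{p}_{ph})$ the phase error bound stays at 1 and both outputs are trivial, which mirrors the condition $S > h(\hat{p}_{ph})$ stated above.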

VI. QUANTUM WIRE-TAP CHANNEL

We apply our results of the previous section on QKD for showing the security in the quantum wire-tap channel model. In this model, the channel from Alice to Bob and the channel from Alice to Eve are both specified. In particular, in this section, we assume that the channel from Alice to Bob is given by the $n$-multiple use of the Pauli channel which is described by the joint distribution $P^{ZX}$ of bit error and phase error on a single qubit system. We also assume that phase error and bit error occur independently, and denote the phase error probability by $p_{ph}$. This corresponds to a limited case of the Pauli channel discussed in the previous section, i.e., $P^{X^n Z^n}(x,z) = \prod_{i=1}^{n} P^X(x_i) P^Z(z_i)$ with $P^X(1) = 1 - P^X(0) = p_{ph}$. As to the channel to Eve, we assume that Eve can access all parts of the environment system corresponding to this channel.

Our goal is to show that Alice can send secret classical information via the quantum channel to Bob by the following coding protocol (cf. the paragraph below (31)). First, Alice chooses a classical CSS code $C_1, C_2$. A message to be sent is a coset $[x] \in C_1/C_2$, and when the sender wants to send $[x]$, she chooses an element from the set $x + C_2$ with equal probability and sends it. On the receiver's side, Bob first applies the maximum likelihood decoder of $C_1$ to the received bit sequence and obtains an element $y \in C_1$. Then, he obtains a coset $[y] \in C_1/C_2$ as the final decoded message.

From Eve's point of view, this protocol is equivalent to the situation where Alice sends her classical information $[x] \in C_1/C_2$ by encoding it to a state $|[x]\rangle$ of the quantum CSS code (see, e.g., [26]). Hence we can evaluate the security of $[x]$ by the same argument as the previous section, i.e., by inequality (42) or by (43), depending on one's security criterion. By noting that the channel between Alice and Bob is i.i.d., we can apply the simple bound given in Theorem 8.
Thus, if a fixed code $C_1$ and an ε-almost dual universal$_2$ subcode family $\mathcal{C}$ of $C_1$ are used, the average of $P_{ph}(C_2^\perp/C_1^\perp)$ satisfies
\[ \mathrm{E}_{C_2\in\mathcal{C}} P_{ph}(C_2^\perp/C_1^\perp) \le 2^{-nE(1-S,\,p_{ph})} \max\{\varepsilon, 1\}. \tag{45} \]

Here $t_{\min} = nS$ is the minimum dimension of $C_2$, which equals the sacrificed bit length, and $t_{\max} = n(1-S)$ is the maximum dimension of $C_2^\perp$. As one can see from Corollary 2, the exponential decreasing rate $E(1-S, p_{ph})$ on the right hand side of (45) is strictly positive for $S > h(p_{ph})$. By using (45), the averages of Eve's distinguishability $\|\rho_{AE} - \rho_A \otimes \rho_E\|_1$ and the Holevo information $\chi = \mathrm{Tr}\, \rho_{AE}(\log \rho_{AE} - \log \rho_A \otimes \rho_E)$ can be evaluated as
\[ \mathrm{E}_{C_2\in\mathcal{C}} \|\rho_{AE} - \rho_A \otimes \rho_E\|_1 \le 2^{-\frac{1}{2} nE(1-S,\,p_{ph}) + \frac{3}{2}} \max\{\sqrt{\varepsilon}, 1\}, \tag{46} \]
\[ \mathrm{E}_{C_2\in\mathcal{C}}\, \chi \le \eta_l\left(2^{-nE(1-S,\,p_{ph})} \max\{\varepsilon, 1\}\right), \tag{47} \]

with $l = \dim C_1 - t_{\min}$ being the length of the message.

In fact, it can be shown that this protocol is also secure even if Alice and Bob use a fixed pair of codes $C_1, C_2$. This is shown by noting that our setting is permutation invariant, and hence the above discussion can be extended to the permuted code pair family. As shown in Section III, given a code $C_1$, we can choose another $t$-dimensional code $C_2$ such that $C_1^\perp \subset C_2^\perp$ and $\varepsilon(C_2^\perp/C_1^\perp) \le n + 1$. Then by combining (37), (42), and (43), we see that the security of $C_1, C_2$ can be evaluated as
\[ \|\rho_{AE} - \rho_A \otimes \rho_E\|_1 \le \sqrt{n+1}\; 2^{-\frac{1}{2} nE(1-S,\,p_{ph}) + \frac{3}{2}}, \tag{48} \]
\[ \chi \le \eta_l\left((n+1)\, 2^{-nE(1-S,\,p_{ph})}\right) \tag{49} \]
with the message length $l = \dim C_1 - t$. Note that the construction of the code $C_2$ is universal in that it does not depend on the value of $p_{ph}$. Hence, the linear map defined by $C_1 \to C_1/C_2$ can be regarded as a type of deterministic universal hash function which is secure for an arbitrarily given quantum Pauli channel.

VII. CLASSICAL WIRE-TAP CHANNEL

A. Security criteria and upper bounds

Next we apply the above results to the classical wire-tap channel model, where all channels are classical. As in the previous section, we assume that the channel from Alice to Bob and the channel to Eve are both specified. However, we stress that, unlike in the previous section, general forms of these channels are considered. That is, the main channel from Alice to Bob is described by a distribution $W^B : i \mapsto W_i^B$, where $W_i^B$ is the general probability distribution describing the outputs of Bob with Alice's input bit $i = 0, 1$. Similarly, the wire-tap channel from Alice to Eve is also given by the general distribution $W^E : i \mapsto W_i^E$. For Alice's input of $n$ bits, the outputs are given as the $n$-multiple use of these channels as
\[ W^{B^n}_{(x_1,\ldots,x_n)}(b_1,\ldots,b_n) := W^B_{x_1}(b_1)\cdots W^B_{x_n}(b_n), \qquad W^{E^n}_{(x_1,\ldots,x_n)}(e_1,\ldots,e_n) := W^E_{x_1}(e_1)\cdots W^E_{x_n}(e_n). \]
In these channels, Alice and Bob perform the same protocol as in the previous section to convey a secret bit string. Alice's message is a coset $[x] \in C_1/C_2$ of a classical CSS code $C_1, C_2$, and Bob decodes it using the maximum likelihood decoder. If one uses the universally composable security criterion, Eve's distinguishability can be evaluated by the $L_1$ distance (variational distance) of the classical distribution, defined by
\[ \frac{1}{|C_1/C_2|} \sum_{[x]\in C_1/C_2} \left\| W^{E^n}_{[x]} - W^{E^n}_{C_1} \right\|_1. \tag{50} \]


Here, for a given subset $Y \subset \mathbb{F}_2^n$, we introduced a distribution $W^{E^n}_Y := |Y|^{-1} \sum_{x\in Y} W^{E^n}_x$, i.e., the average distribution of Eve's output when Alice sends an element of $Y$ with equal probability. Hence $W^{E^n}_{C_1}$ corresponds to the case where Eve obtains no information concerning $[x]$. Alternatively, if one uses the security criterion based on the mutual information between Alice and Eve, it can be written as
\[ I_{C_1/C_2}(A^n : E^n) := H\left(W^{E^n}_{C_1}\right) - \frac{1}{|C_1/C_2|} \sum_{[y]\in C_1/C_2} H\left(W^{E^n}_{[y]}\right). \tag{51} \]

In particular, a protocol on the wire-tap channel is said to have the strong (resp., weak) security when the mutual information $I(A^n : E^n)$ (resp., $\frac{1}{n} I(A^n : E^n)$) converges to zero as $n \to \infty$ (see, e.g., [21]).

We evaluate these quantities by using the results of the previous section. We do this by simulating our classical channels by the quantum Pauli channel, i.e., we consider a quantum channel obtained by the purification of our classical channel. In order to see that it is indeed possible, first denote by $\mathcal{H}_E$ Eve's original system describing the classical channel $W_i^E$, where she receives a state $\sum_e W_i^E(e) |e\rangle\langle e|$ as a result of Alice's input $i$. Then, by choosing another environment system $\mathcal{H}_R$ suitably, one can construct pure states $\rho_{0,E}$ and $\rho_{1,E}$ on the extended system $\mathcal{H}_E \otimes \mathcal{H}_R$ such that $\mathrm{Tr}_{\mathcal{H}_R}\, \rho_{i,E} = \sum_e W_i^E(e) |e\rangle\langle e|$ and the fidelity between $\rho_{0,E}$ and $\rho_{1,E}$ equals the fidelity $F$ between the two distributions $W_0^E$, $W_1^E$ on Eve's side. That is,
\[ \sqrt{\mathrm{Tr}\, \rho_{0,E}\, \rho_{1,E}} = \mathrm{Tr}\left|\sqrt{\rho_{0,E}} \sqrt{\rho_{1,E}}\right| = F := \sum_e \sqrt{W_0^E(e)} \sqrt{W_1^E(e)}. \tag{52} \]

Without loss of security, we may assume that Eve can access not only $\mathcal{H}_E$, but also the larger system $\mathcal{H}_E \otimes \mathcal{H}_R$, since it can only increase Eve's information. Similarly, we may also assume that she can perform any quantum measurement in $\mathcal{H}_E \otimes \mathcal{H}_R$. Then the situation is the same as that of the quantum wire-tap channel discussed in the previous section, and thus the $L_1$ distance (50) can be bounded from above by that of the quantum case, $\|\rho_{A,ER} - \rho_A \otimes \rho_{ER}\|_1$, where
\[ \rho_{A,ER} := \sum_{[y]\in C_1/C_2} \frac{1}{|C_1/C_2|}\, |[y]\rangle\langle[y]| \otimes \sum_{(x_1,\ldots,x_n)\in[y]} \frac{1}{|[y]|}\, \rho_{x_1,ER} \otimes \cdots \otimes \rho_{x_n,ER}. \tag{53} \]

By a similar argument, the mutual information (51) can also be bounded from above by the quantum mutual information $D(\rho_{A,ER} \| \rho_A \otimes \rho_{ER})$.

We advance our analysis further by restricting ourselves to a specific type of the Pauli channel $\Lambda_t$ where bit error and phase error occur independently. Recall that this type of channel was also discussed in the previous section. For each bit, the channel $\Lambda_t$ between Alice and Bob can be described by the Stinespring representation using Eve's state $|\psi\rangle$ and a unitary operator $V$ as
\[ \Lambda_t(\rho) = \mathrm{Tr}_E\left[V \rho \otimes (|\psi\rangle\langle\psi|)_{ER} V^\dagger\right], \tag{54} \]
\[ |\psi\rangle := \sum_{a=0,1} \sum_{b=0,1} \sqrt{p^X(a)\, p^Z(b)}\; |a,b\rangle, \tag{55} \]
\[ V := \sum_{a=0,1} \sum_{b=0,1} (\sigma_x^a \sigma_z^b)_A \otimes (|a,b\rangle\langle a,b|)_{ER}. \tag{56} \]

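If this channel simulates Eve's extended channel, we have
\[ \rho_{i,ER} = \mathrm{Tr}_A\left[V (|i\rangle\langle i|)_A \otimes (|\psi\rangle\langle\psi|)_{ER} V^\dagger\right] \quad \text{for } i = 0, 1, \]
where $|i\rangle_A$ is the qubit state corresponding to Alice's input $i$. Since Eve's states $\rho_{i,ER}$ in the extended space $\mathcal{H}_E \otimes \mathcal{H}_R$ are pure states, they can be represented with a suitable choice of the basis $|0\rangle_{ER}$, $|1\rangle_{ER}$ as
\[ \rho_{i,ER} = (|\varphi_i\rangle\langle\varphi_i|)_{ER} \tag{57} \]
with
\[ |\varphi_i\rangle_{ER} := \sqrt{1 - p_{ph}}\; |0\rangle_{ER} + (-1)^i \sqrt{p_{ph}}\; |1\rangle_{ER} \tag{58} \]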

and $p_{ph} = p^X(1)$. (See also Section VII of [13].) By noting that the fidelity between $|\varphi_0\rangle_{ER}$ and $|\varphi_1\rangle_{ER}$ is $1 - 2p_{ph}$, and from (52), we have $p_{ph} = \frac{1-F}{2}$. Hence we have shown that Eve's classical channel $W_i^E$ with the fidelity $F$ can be simulated by the environment of the above Pauli channel with the phase error probability $p_{ph} = \frac{1-F}{2}$.

The secrecy in the $n$-multiple use of the channel, i.e., in the presence of Eve's channel $W^{E^n}$, can also be evaluated by the $n$-multiple use of this Pauli channel. Therefore, by using (46) and (47), the averages of Eve's distinguishability and the mutual information between Alice and Eve $I_{C_1/C_2}(A^n : E^n)$ are evaluated as follows.


Theorem 9: Consider the classical wire-tap channel model defined at the beginning of this section. If Alice and Bob choose $C_2$ from an ε-almost dual universal$_2$ subcode family $\mathcal{C}$ of $C_1$, the secrecy of the message $[x] \in C_1/C_2$ is given by
\[ \mathrm{E}_{C_2\in\mathcal{C}}\left[\frac{1}{|C_1/C_2|} \sum_{[x]\in C_1/C_2} \left\|W^{E^n}_{[x]} - W^{E^n}_{C_1}\right\|_1\right] \le 2^{-\frac{1}{2} nE(1-S,\,p_{ph}) + \frac{3}{2}} \max\{\sqrt{\varepsilon}, 1\}, \]
\[ \mathrm{E}_{C_2\in\mathcal{C}}\left[I_{C_1/C_2}(A^n : E^n)\right] \le \eta_l\left(2^{-nE(1-S,\,p_{ph})} \max\{\varepsilon, 1\}\right), \]
where $p_{ph} := \frac{1-F}{2}$, and $F := \sum_e \sqrt{W_0^E(e)} \sqrt{W_1^E(e)}$ is the fidelity between Eve's probability distributions $W_0^E$ and $W_1^E$ corresponding to Alice's input $i = 0, 1$. The parameter $t_{\min} = Sn$ is the minimum dimension of $C_2$, which equals the length of the sacrificed bits. The length of the message $[x]$ is given by $l := \dim C_1 - t_{\min}$.

As one can see from Corollary 2, if the sacrifice bit rate $S := t_{\min}/n$ is greater than $h\left(\frac{1-F}{2}\right)$, the decreasing rate $E(1-S, p_{ph})$ is strictly positive, and thus the strong security is guaranteed. Furthermore, employing the same idea as Hayashi [16], we can apply this result to secret key agreement (distillation) even in the classical setting.

As in the case of the quantum wire-tap channel, we can show that our protocol is secure even if Alice and Bob use a fixed pair of codes $C_1, C_2$.

Corollary 3: Consider the same setting as Theorem 9 except that, instead of $C_1$ and a code family $\mathcal{C}$, Alice and Bob use a fixed pair of codes $C_1, C_2$ satisfying the condition of Theorem 6. Then the secrecy of $[x]$ is given by
\[ \frac{1}{|C_1/C_2|} \sum_{[y]\in C_1/C_2} \left\|W^{E^n}_{[y]} - W^{E^n}_{C_1}\right\|_1 \le \sqrt{n+1}\; 2^{-\frac{1}{2} nE(1-S,\,p_{ph}) + \frac{3}{2}}, \tag{59} \]
\[ I_{C_1/C_2}(A^n : E^n) \le \eta_l\left((n+1)\, 2^{-nE(1-S,\,p_{ph})}\right) \tag{60} \]
with message length $l = \dim C_1 - \dim C_2$.

Proof: This is again proved by using the permutation invariance of the channel, i.e., by extending the discussion of Theorem 9 to the permuted code pair family. As shown in Theorem 6, given a code $C_1$, one can choose another code $C_2$ such that $C_1^\perp \subset C_2^\perp$ and $\varepsilon(C_2^\perp/C_1^\perp) \le n + 1$. Then by an argument similar to that for (48) and (49), we obtain (59) and (60).

Note that the construction of the code $C_2$ does not depend on the fidelity $F$, or on the form of the channel to Eve. So, the hash function $C_1 \to C_1/C_2$ can be regarded as a deterministic universal hash function in this classical setting as well. Also, it should be noted that the above discussion can be applied to the general quantum case. That is, when Eve's output state is given as $W_i$ for Alice's input $i = 0, 1$, the same result holds with $F = \mathrm{Tr}\left|\sqrt{W_0}\sqrt{W_1}\right|$.

B. Comparison with existing results

In order to compare our results of this section with existing ones, we here review the history of the studies of the wire-tap channel. For a sacrifice bit rate $R$ greater than the mutual information $I(A : E)$ between Alice and Eve, Wyner [32], and Csiszár and Körner [6] showed the weak security in the sense of Maurer and Wolf [21]. Csiszár [5] showed the strong security with the same sacrifice bit rate in the sense of Maurer and Wolf [21]. Hayashi [12] gave the concrete exponential decreasing rate for the strong security with the same sacrifice bit rate. These studies use completely random coding as the privacy amplification process. That is, no linear functions are used in this process.

Bennett et al. [2] proposed to use universal$_2$ hash functions for privacy amplification. Maurer and Wolf [21] applied this idea to the secret key agreement, which is a different setting from the wire-tap channel. They showed the strong security with universal$_2$ hash functions for privacy amplification. Based on these ideas, Hayashi [16] showed the strong security with universal$_2$ hash functions when the sacrifice bit rate is greater than the mutual information $I(A : E)$. Muramatsu and Miyake [23] considered a more general condition [24] than the ε-almost universal$_2$ functions of the code for privacy amplification. Under this condition, they showed the weak security with the same sacrifice bit rate. However, Watanabe et al. [30] pointed out that their method cannot derive the strong security based on Hayashi's idea [15] in the case of secret key agreement from a correlated source.

Hence, the existing results can be summarized as follows. Suppose that the sacrifice bit rate $S$ is greater than the mutual information $I(A : E) := H\left(\frac{1}{2} W_0^E + \frac{1}{2} W_1^E\right) - \frac{1}{2} H\left(W_0^E\right) - \frac{1}{2} H\left(W_1^E\right)$, with $H(P)$ being the Shannon entropy of the distribution $P$. Then,
(i) The strong security holds if the code $C_2$ is chosen from a universal$_2$ subcode family of $C_1$ [12].
(ii) The weak security holds if the code $C_2$ is chosen from an ε-almost universal$_2$ subcode family of $C_1$ [23], but the strong security is not necessarily guaranteed, because of the counterexample we give in Theorem 10.

On the other hand, our results of this section can be summarized as follows:
(a) An ε-almost dual universal$_2$ subcode family of $C_1$ can guarantee the strong security when the asymptotic sacrifice bit rate $S$ is greater than $h\left(\frac{1-F}{2}\right)$. (Theorem 9)
(b) There exists a deterministic universal hash function that guarantees the strong security when $S$ is greater than $h\left(\frac{1-F}{2}\right)$. The construction of this hash function does not depend on the form of the channel to Eve.
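The two sacrifice rates that appear in this summary, $I(A : E)$ and $h\left(\frac{1-F}{2}\right)$, are easy to compute for a concrete channel. The following Python sketch does so for a pair of output distributions given as tuples of probabilities over the same output alphabet; the function names are hypothetical, and the binary symmetric channel in the usage loop is only an example.

```python
import math


def h2(q):
    """Binary entropy in bits."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1.0 - q) * math.log2(1.0 - q)


def entropy(dist):
    """Shannon entropy in bits of a finite distribution."""
    return -sum(x * math.log2(x) for x in dist if x > 0.0)


def fidelity(w0, w1):
    """Classical fidelity F = sum_e sqrt(W0(e)) sqrt(W1(e)), as in (52)."""
    return sum(math.sqrt(a * b) for a, b in zip(w0, w1))


def rate_dual_universal(w0, w1):
    """Sacrifice rate h((1 - F) / 2) required in (a) and (b)."""
    return h2((1.0 - fidelity(w0, w1)) / 2.0)


def rate_mutual_information(w0, w1):
    """Sacrifice rate I(A:E) = H(W0/2 + W1/2) - H(W0)/2 - H(W1)/2 of (i) and (ii)."""
    mixture = [(a + b) / 2.0 for a, b in zip(w0, w1)]
    return entropy(mixture) - entropy(w0) / 2.0 - entropy(w1) / 2.0


if __name__ == "__main__":
    # Example: Eve's channel is a binary symmetric channel with error probability p.
    for p in (0.01, 0.05, 0.1, 0.25, 0.4):
        w0, w1 = (1.0 - p, p), (p, 1.0 - p)
        print(p, rate_dual_universal(w0, w1), rate_mutual_information(w0, w1))
```

For the binary symmetric channel the two functions reduce to $h\left(\frac{1}{2} - \sqrt{p(1-p)}\right)$ and $1 - h(p)$, respectively.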


In comparison with the existing results, the advantage of our (b) is clear; all existing hash functions are constructed randomly. Note that it is indeed possible to construct a deterministic function from a set of randomized functions by choosing the best one, but the deterministic function thus obtained generally depends on the form of the channel to Eve.

The advantage of our method (a) needs a more complicated explanation. In comparison with (i), method (a) has an advantage that it can guarantee the strong security with a weaker condition on the code $C_2^\perp$. Indeed, as we have shown in Theorem 3, the ε-almost dual universality$_2$ of (a) is strictly weaker than the universality$_2$ of (i). In comparison with (ii), however, it is not clear whether our condition on $C_2$ is weaker or stronger. Rather, the advantage of (a) over (ii) is that (a) can achieve the strong security, while (ii) can only achieve the weak security. The impossibility of the strong security under the condition of (ii) will be shown in Theorem 10 by giving a counterexample.

On the other hand, the disadvantage of our method is the required asymptotic sacrifice bit rate $h\left(\frac{1-F}{2}\right)$, which is larger than that of the existing methods, $H\left(\frac{1}{2} W_0^E + \frac{1}{2} W_1^E\right) - \frac{1}{2} H\left(W_0^E\right) - \frac{1}{2} H\left(W_1^E\right)$. The fact that our rate is indeed larger can be shown by the information processing inequality concerning the quantum relative entropy for the TP-CP map $\rho \mapsto \mathrm{Tr}_{\mathcal{H}_R}\, \rho$. More specifically, if we compare the rates for the case where Eve's channel is a binary symmetric channel with error probability $p$, our required sacrifice bit rate is $h\left(\frac{1}{2} - \sqrt{p(1-p)}\right)$, while that of the existing methods is $1 - h(p)$. As we have plotted in Fig. 2, the difference of the two rates is relatively small for small $p$. This means that our method is particularly effective for small $p$, under the assumption that an ε-almost dual universal$_2$ subcode family $\mathcal{C}$ of $C_1$ can be implemented with a relatively small amount of calculation.

[Figure 2: the two rates plotted against $p$ (horizontal axis, 0 to 0.5); vertical axis from 0 to 1.0.]

Fig. 2. Asymptotic sacrificed bit rates of the classical wire-tap channel model. Upper line: $h\left(\frac{1}{2} - \sqrt{p(1-p)}\right)$ of our method using an ε-almost dual universal$_2$ subcode family (the present paper). Lower line: $1 - h(p)$ of the existing method using a universal$_2$ subcode family (previous papers [32], [6], [5], [12], [16], [23]).

C. ε-almost dual universality2 vs. ε-almost universality2

Finally, as mentioned earlier, we present an example of the classical wire-tap channel model that vividly contrasts the ε-almost dual universality2 with the ε-almost universality2. In this example one sees that, if ε ≥ 2, an ε-almost universal2 subcode family (of $C_1 = \mathbb{F}_2^n$) does not necessarily guarantee the strong security. In other words, choosing the code $C_2$ from an ε-almost universal2 subcode family of $C_1$ is not sufficient for the strong security. Note that we have shown in this section that the ε-almost dual universality2 is indeed sufficient for this purpose. Hence, at least in the setting of this section, the ε-almost dual universality2 is the more relevant criterion for security.

Theorem 10: Assume that the channel from Alice to Bob is noiseless, and the channel to Eve is binary symmetric with error probability p. There exists an example of a 2-almost universal2 code family $\mathcal{C}$ for which the hash functions (i.e., $\mathbb{F}_2^n \to \mathbb{F}_2^n/C_2$ with $C_2 \in \mathcal{C}$) cannot guarantee the strong security.

Proof: Choose an arbitrary universal2 code family $\mathcal{C}' = \{C_2' \subset \mathbb{F}_2^{n-1}\}$. Then define another code family $\mathcal{C}$ in $\mathbb{F}_2^n$, consisting of $C_2 := \{\, x\|0 \mid x \in C_2' \,\}$ with $C_2' \in \mathcal{C}'$. Here, $a\|b$ denotes the concatenation of a and b. Hence for any $C_2 \in \mathcal{C}$, there exists $C_2' \in \mathcal{C}'$ such that $C_2$ consists of the elements $x \in C_2'$ concatenated with a zero. Note that the code family $\mathcal{C}$ is obviously 2-almost universal2, but its dual code family $\mathcal{C}^\perp$ cannot be ε-almost universal2 for any ε < 1, because $x = 0\ldots01 \in C$ for all $C \in \mathcal{C}^\perp$.

When Alice transmits a coset $[x] \in \mathbb{F}_2^n/C_2$ as her secret message, she chooses $x \in [x]$ randomly and sends it to Bob. Due to our construction of $\mathcal{C}$, the n-th bit of x is preserved in [x] as it is, without being canceled by privacy amplification. Since Eve receives this n-th bit with error probability p, Eve's mutual information regarding [x] is at least $1 - h(p)$. Therefore, the strong security does not hold with these hash functions.
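The construction in the proof can be illustrated with a toy computation. The following is only a sketch under assumptions that are not in the paper: n = 5, a randomly drawn subcode of $\mathbb{F}_2^{4}$ standing in for $C_2'$, and hypothetical helper names `span`, `random_code` and `coset`. It checks that every element of a coset $x + C_2$ carries the same last bit as x, which is the bit that Eve can observe through her channel.

```python
import itertools
import random

def span(generators, length):
    """Return the set of all F_2-linear combinations of the generators."""
    code = {(0,) * length}
    for g in generators:
        code |= {tuple(a ^ b for a, b in zip(c, g)) for c in code}
    return code

def random_code(length, num_generators, rng):
    """Span of randomly drawn generators (hypothetical stand-in for C2')."""
    gens = [tuple(rng.randint(0, 1) for _ in range(length))
            for _ in range(num_generators)]
    return span(gens, length)

def coset(x, code):
    """The coset x + code as a set of bit tuples."""
    return {tuple(a ^ b for a, b in zip(x, c)) for c in code}

rng = random.Random(0)
n = 5
c2_prime = random_code(n - 1, 2, rng)   # C2' inside F_2^(n-1)
c2 = {c + (0,) for c in c2_prime}       # C2 = { x || 0 : x in C2' }

# The n-th bit is constant on every coset of C2, so hashing cannot hide it.
last_bit_preserved = all(
    {z[-1] for z in coset(x, c2)} == {x[-1]}
    for x in itertools.product((0, 1), repeat=n)
)
print(last_bit_preserved)
```

The check does not depend on which $C_2'$ is drawn, since only the appended zero matters.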


VIII. APPLICATION TO RANDOMNESS EXTRACTION

The results of the previous section can be applied to show the security of classical randomness extraction. The goal of the randomness extraction model is to extract, from a given source A, the longest possible random string that obeys the uniform distribution. In this section, we consider the n-fold i.i.d. source $A^n$ of a general binary source A. In order to extract randomness from $A^n$, Alice performs privacy amplification by the linear map $\mathbb{F}_2^n \to \mathbb{F}_2^n/C$. That is, on receiving a sequence $x \in \mathbb{F}_2^n$ from $A^n$, she randomly chooses a code C from a code family $\mathcal{C}$, and calculates the corresponding secret bits as a coset $[x] \in \mathbb{F}_2^n/C$. The secrecy of [x] can be evaluated as follows. As one can see from the proof below, we obtain these inequalities by reducing the problem to the wire-tap channel of the previous section.

Theorem 11: Let $A^n$ be the n-fold independent and identically distributed source of a general binary distribution A, given by $(P^A(0), P^A(1)) = (1-p, p)$. Also let $P_C^{A^n}$ denote the distribution of the coset $[x] \in \mathbb{F}_2^n/C$, i.e., $P_C^{A^n}([x]) := \sum_{z \in [x]} P^{A^n}(z)$. Also let $U_C^{A^n}$ denote the uniform distribution on $\mathbb{F}_2^n/C$. Under these conditions, if C is randomly chosen from an ε-almost dual universal2 code family $\mathcal{C}$ with minimum dimension $t_{\min}$, the secrecy of [x] can be evaluated as

$$\mathrm{E}_{C \in \mathcal{C}} \left\| P_C^{A^n} - U_C^{A^n} \right\|_1 \le 2^{-\frac{1}{2} n E(R, p_{\mathrm{ph}}) + \frac{3}{2}} \sqrt{\max\{\varepsilon, 1\}},$$

$$\mathrm{E}_{C \in \mathcal{C}} \left[ n - \dim C - H\left(P_C^{A^n}\right) \right] \le \eta_l\left( 2^{-n E(R, p_{\mathrm{ph}})} \max\{\varepsilon, 1\} \right)$$

with $p_{\mathrm{ph}} = 1/2 - \sqrt{p(1-p)}$.

The minimum dimension $t_{\min}$ equals the bit length reduced by the hash function. Thus the length of the random number is $l = n - t_{\min}$, and the generation rate of the extractor is $R = 1 - t_{\min}/n$.

Proof: We prove this theorem by reducing the problem to the wire-tap channel model. For this purpose, we introduce a fictitious Eve who is correlated with Alice through the joint distribution $Q^{A^n E^n}$. We assume that $Q^{A^n E^n}$ is the n-fold independent and identical extension of $Q^{AE}$ given by

$$Q^{AE}(x_i, y_i) = \begin{cases} \frac{1}{2}(1-p) & \text{for } x_i + y_i = 0 \pmod 2, \\ \frac{1}{2}\,p & \text{for } x_i + y_i = 1 \pmod 2 \end{cases} \tag{61}$$

for $i = 1, \ldots, n$. That is, $Q^{A^n E^n}(x, y) = \prod_{i=1}^n Q^{AE}(x_i, y_i)$ for $x, y \in \{0,1\}^n$. Note that the conditional distribution $Q^{E^n|A^n}(y|x)$ can be regarded as defining the additive channel from Alice to Eve. Also note that the marginal distributions of Alice and of Eve are uniform: $Q^{A^n}(x) = Q^{E^n}(y) = 2^{-n}$ for all x, y.

This setting can be considered as a special case of the classical wire-tap channel where there is no error between Alice and Bob. In this case, the bit error correction is not necessary and we may take $C_1 = \mathbb{F}_2^n$. As to the phase error, from (52) and (61), the fidelity between $W_0^E$ and $W_1^E$ is $F = 2\sqrt{p(1-p)}$. Hence Eve's channel can be simulated by the phase error probability $p_{\mathrm{ph}} = \frac{1}{2} - \sqrt{p(1-p)}$.

We consider Alice's $P^{A^n}(x)$ as the conditional distribution $P^{A^n}(x) = Q^{A^n|E^n}(x \mid y = 0^n)$. Then the corresponding hash value is distributed as $P_C^{A^n}([x]) = \sum_{z \in [x]} Q^{A^n|E^n}(z \mid 0^n)$. In fact, due to the additivity of the channel, we have a more general relation, $P^{A^n}(x + y) = Q^{A^n|E^n}(x|y) = 2^n Q^{A^n E^n}(x, y)$ for all x, y. Thus, by noting the linearity of the hash function, we obtain

$$P_C^{A^n}([x] + [y]) = 2^n \sum_{z \in [x]} Q^{A^n E^n}(z, y) \tag{62}$$

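for arbitrary $[x] \in \mathbb{F}_2^n/C$ and $y \in \mathbb{F}_2^n$. By using (62), the variational distance $\left\| P_C^{A^n} - U_C^{A^n} \right\|_1$ can be rewritten as

$$\begin{aligned}
\left\| P_C^{A^n} - U_C^{A^n} \right\|_1
&= \sum_{[x] \in \mathbb{F}_2^n/C} \left| P_C^{A^n}([x]) - 2^{t_{\min} - n} \right| \\
&= 2^{-n} \sum_{y \in \mathbb{F}_2^n} \sum_{[x] \in \mathbb{F}_2^n/C} \left| P_C^{A^n}([x] + [y]) - 2^{t_{\min} - n} \right| \\
&= 2^{-n} \sum_{y \in \mathbb{F}_2^n} \sum_{[x] \in \mathbb{F}_2^n/C} \left| 2^n \sum_{z \in [x]} Q^{A^n E^n}(z, y) - 2^{t_{\min} - n} \right| .
\end{aligned}$$

Now note that $W_{[x]}^{E^n}(y)$ of the previous section, which is the probability that Eve receives y under the condition that Alice's message is [x], takes the form

$$W_{[x]}^{E^n}(y) = \left( \sum_{z \in [x]} Q^{A^n}(z) \right)^{-1} \sum_{z \in [x]} Q^{A^n E^n}(z, y) = 2^{n - t_{\min}} \sum_{z \in [x]} Q^{A^n E^n}(z, y).$$

Then the above variational distance can be rewritten further as

$$\begin{aligned}
\left\| P_C^{A^n} - U_C^{A^n} \right\|_1
&= \frac{1}{|\mathbb{F}_2^n/C|} \sum_{[x] \in \mathbb{F}_2^n/C} \sum_{y \in \mathbb{F}_2^n} \left| W_{[x]}^{E^n}(y) - 2^{-n} \right| \\
&= \frac{1}{|\mathbb{F}_2^n/C|} \sum_{[x] \in \mathbb{F}_2^n/C} \left\| W_{[x]}^{E^n} - U^{E^n} \right\|_1 ,
\end{aligned}$$

where $U^{E^n}$ is the uniform distribution over $\mathbb{F}_2^n$.

Then by averaging $\left\| P_C^{A^n} - U_C^{A^n} \right\|_1$ over codes $C \in \mathcal{C}$, and applying Theorem 9, the variational distance between the output of the randomness extractor and the true uniform distribution can be evaluated as

$$\mathrm{E}_{C \in \mathcal{C}} \left\| P_C^{A^n} - U_C^{A^n} \right\|_1 \le 2^{-\frac{1}{2} n E(R, p_{\mathrm{ph}}) + \frac{3}{2}} \sqrt{\max\{\varepsilon, 1\}} .$$

By a similar argument, this randomness can also be evaluated in terms of divergence as

$$\begin{aligned}
\mathrm{E}_{C \in \mathcal{C}} \left[ n - \dim C - H\left(P_C^{A^n}\right) \right]
&= \mathrm{E}_{C \in \mathcal{C}}\, D\left( P_C^{A^n} \,\middle\|\, U_C^{A^n} \right)
= \mathrm{E}_{C \in \mathcal{C}}\, I_{\mathbb{F}_2^n/C}(A^n, E^n) \\
&\le \eta_l\left( 2^{-n E(R, p_{\mathrm{ph}})} \max\{\varepsilon, 1\} \right).
\end{aligned}$$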
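The reduction used in this proof can be checked by brute force on a small instance. The sketch below is illustrative only: the block length n = 4, the value p = 0.2 and the two generators of the code C are arbitrary assumptions, and dim C is used in place of $t_{\min}$ because a single fixed code is considered. It verifies that both marginals of the joint distribution (61) are uniform, and that $\| P_C^{A^n} - U_C^{A^n} \|_1$ coincides with the average over cosets of $\| W_{[x]}^{E^n} - U^{E^n} \|_1$.

```python
import itertools
import math

n, p = 4, 0.2
generators = [(1, 1, 0, 0), (0, 1, 1, 1)]   # hypothetical code C (assumption)

def xor(a, b):
    return tuple(u ^ v for u, v in zip(a, b))

space = list(itertools.product((0, 1), repeat=n))
code = {(0,) * n}
for g in generators:
    code |= {xor(c, g) for c in code}
t = int(math.log2(len(code)))               # dim C, playing the role of t_min

def p_source(x):
    """P^{A^n}(x) for the i.i.d. source with P(1) = p."""
    w = sum(x)
    return p ** w * (1 - p) ** (n - w)

def q_joint(x, y):
    """Q^{A^n E^n}(x, y) of (61): a factor (1-p)/2 or p/2 per position."""
    return p_source(xor(x, y)) / 2 ** n

# Both marginals of Q should be uniform on F_2^n.
marg_a = [sum(q_joint(x, y) for y in space) for x in space]
marg_e = [sum(q_joint(x, y) for x in space) for y in space]

# Cosets of C, indexed by their smallest element.
cosets = {}
for x in space:
    cosets.setdefault(min(xor(x, c) for c in code), []).append(x)

# Left-hand side: || P_C - U_C ||_1 computed on the cosets.
lhs = sum(abs(sum(p_source(z) for z in members) - 2.0 ** (t - n))
          for members in cosets.values())

def w_channel(members, y):
    """W_[x](y) = 2^(n-t) * sum over z in [x] of Q(z, y)."""
    return 2 ** (n - t) * sum(q_joint(z, y) for z in members)

# Right-hand side: average over cosets of || W_[x] - U ||_1.
rhs = sum(abs(w_channel(members, y) - 2.0 ** (-n))
          for members in cosets.values() for y in space) / len(cosets)

print(lhs, rhs)
```

Changing the generators or p changes the common value of the two sides but not their agreement, which reflects the additivity of Eve's fictitious channel.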

Furthermore, since our setting is permutation invariant as in the previous two sections, we can show that the secrecy can be guaranteed by using a fixed code C.

Corollary 4: Under the same conditions as Theorem 11, and for a fixed t-dimensional code C satisfying the condition of Theorem 5, the randomness extractor $\mathbb{F}_2^n \to \mathbb{F}_2^n/C$ satisfies

$$\left\| P_C^{A^n} - U_C^{A^n} \right\|_1 \le \sqrt{n+1}\; 2^{-\frac{1}{2} n E(R, p_{\mathrm{ph}}) + \frac{3}{2}},$$

$$n - \dim C - H\left(P_C^{A^n}\right) \le \eta_{n-t}\left( (n+1)\, 2^{-n E(R, p_{\mathrm{ph}})} \right).$$

Note that the construction of C does not depend on the value of $p_{\mathrm{ph}}$. Hence the hash function $\mathbb{F}_2^n \to \mathbb{F}_2^n/C$ using this fixed code C can be regarded as a deterministic universal hash function for randomness extraction. By using this function, both $\left\| P_C^{A^n} - U_C^{A^n} \right\|_1$ and $n - \dim C - H\left(P_C^{A^n}\right)$ converge to zero exponentially when the asymptotic random number generation rate R is smaller than $1 - h\left(1/2 - \sqrt{p(1-p)}\right)$. On the other hand, the asymptotic rate of conventional hash functions is the min-entropy rate $-\log(1-p)$ obtained, e.g., in [18]. Since

$$1 - h\left( \frac{1}{2} - \sqrt{p(1-p)} \right) \ge -\log(1-p) \tag{63}$$

as shown in Appendix C, our hash function indeed achieves a better asymptotic generation rate. This fact is also illustrated in Fig. 3.

Fig. 3. Asymptotic random number generation rates R of randomness extractors, plotted against p for 0 ≤ p ≤ 0.5. The upper line (normal line) shows $h(p)$, which is theoretically optimal. The middle line (thick line) shows $1 - h\left(\frac{1}{2} - \sqrt{p(1-p)}\right)$ of the present paper. The lower line (dashed line) shows $-\log(1-p)$ of the conventional method [18].
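The three curves of Fig. 3 are easy to tabulate. The following sketch assumes logarithms to base 2 and a grid of 51 points on [0, 0.5]; the function names are hypothetical and chosen only for this illustration. It evaluates the optimal rate $h(p)$, the rate $1 - h(1/2 - \sqrt{p(1-p)})$ of the present method, and the min-entropy rate $-\log(1-p)$.

```python
import math

def h(x):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def rate_dual_universal(p):
    """1 - h(1/2 - sqrt(p(1-p))): rate of the present method."""
    return 1.0 - h(0.5 - math.sqrt(p * (1 - p)))

def rate_min_entropy(p):
    """-log2(1 - p): min-entropy rate for p <= 1/2."""
    return -math.log2(1 - p)

grid = [i / 100 for i in range(0, 51)]
rows = [(p, h(p), rate_dual_universal(p), rate_min_entropy(p)) for p in grid]

# Print every tenth grid point: p, optimal rate, present method, min-entropy.
for p, opt, ours, conv in rows[::10]:
    print(f"p={p:.1f}  h(p)={opt:.4f}  ours={ours:.4f}  min-entropy={conv:.4f}")
```

On this grid the three columns are ordered as in the figure, with equality of the last two only at the end points p = 0 and p = 1/2.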

IX. CONCLUSION

In this paper, we first established the one-to-one correspondence between a function family F and a code family C. Then we showed that the universality of C restricts that of C⊥, and by using this fact, introduced a new class of universal code family called an ε-almost dual universal2 code family, which is indeed a generalization of a universal2 code family. We also presented applications of this concept to several communication models, namely, error correction, quantum key distribution, the quantum wire-tap channel, and the classical wire-tap channel.


For example, as to quantum key distribution, we proved the security of the BB84 protocol using ε-almost dual universal2 functions for privacy amplification, by using the Shor-Preskill–type proof technique.

In the context of the wire-tap channel, in Section VII, we have shown the strong security for the classical wire-tap channel when the subcode C2 for privacy amplification is chosen from an ε-almost dual universal2 subcode family of C1. We have also succeeded in showing the existence of a deterministic hash function that does not depend on the form of Eve's channel and achieves the strong security. However, our method has the disadvantage that it requires the sacrifice bit rate S to be greater than the value $h\left(\frac{1-F}{2}\right)$ with $F := \sum_e \sqrt{W_0^E(e)}\sqrt{W_1^E(e)}$. This rate is larger than the mutual information I(A : E) between Alice and Eve. Hence, it is an open problem whether the strong security holds when the sacrifice bit rate S is greater than Eve's mutual information I(A : E), and the subcode C2 for privacy amplification is chosen from an ε-almost dual universal2 subcode family of C1.

Similarly, we have shown that an ε-almost dual universal2 code family can be used for randomness extraction. Combining this fact and the model's invariance under bit permutations, we have constructed a deterministic hash function that is secure against a binary distribution with an arbitrary probability p. Since its construction does not depend on the parameter p, it is indeed a “universal” hash function. The generation rate of the presented method is larger than the min-entropy rate, which is the generation rate of the conventional method [18].

Finally, we explain the relation of our results to a universal quantum CSS code found by Hamada [11] for sending quantum states. In his paper, he focused on an ensemble of classical self-dual codes. Then, combining qubits based on the bit basis and qubits based on the phase basis, he succeeded in constructing a universal quantum CSS code from a set of universal classical self-dual codes by choosing $C_1^\perp = C_2$. His code can be applied to QKD, where Alice can send information by using both the bit basis and the phase basis. On the other hand, it cannot be applied in a straightforward manner to our quantum wire-tap channel model, where only the bit basis is used for sending the classical message. This is because our method employs two codes C1 and C2 chosen separately. Our method for constructing a deterministic universal hash function would not work either, if we were to restrict our codes to self-dual codes. Recall that the key point of our method is the concept of a “permuted code pair family.”

REFERENCES

[1] C. H. Bennett and G. Brassard, “Quantum Cryptography: Public Key Distribution and Coin Tossing,” Proceedings of IEEE International Conference on Computers, Systems and Signal Processing, Bangalore, India, pp.175-179, December 1984.
[2] C. H. Bennett, G. Brassard, C. Crépeau, and U. M. Maurer, “Generalized privacy amplification,” IEEE Trans. Inform. Theory, vol. 41, pp.1915-1923 (1995).
[3] G. Brassard and L. Salvail, “Secret-Key Reconciliation by Public Discussion,” in T. Helleseth (Ed.): Advances in Cryptology - Eurocrypt ’93, LNCS 765, pp.410-423 (1994).
[4] J. L. Carter and M. N. Wegman, “Universal Classes of Hash Functions,” J. Comput. System Sci. 18, pp.143-154 (1979).
[5] I. Csiszár, “Almost Independence and Secrecy Capacity,” Problems of Information Transmission, vol.32, no.1, pp.40-47 (1996).
[6] I. Csiszár and J. Körner, “Broadcast channels with confidential messages,” IEEE Trans. Inform. Theory, vol. 24(3), pp.339-348 (1979).
[7] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems, Academic Press, New York (1981).
[8] R. G. Gallager, Information Theory and Reliable Communication, John Wiley & Sons (1968).
[9] G. H. Golub and C. F. Van Loan, Matrix Computations, Third Edition, The Johns Hopkins University Press, 1996.
[10] D. Gottesman, H.-K. Lo, N. Lütkenhaus, and J. Preskill, “Security of quantum key distribution with imperfect devices,” Quant. Inf. Comput. 5, pp.325-360 (2004).
[11] M. Hamada, “Reliability of Calderbank-Shor-Steane Codes and Security of Quantum Key Distribution,” Journal of Physics A: Mathematical and General, vol.37, no.34, 8303 (2004).
[12] M. Hayashi, “General non-asymptotic and asymptotic formulas in channel resolvability and identification capacity and its application to wire-tap channel,” IEEE Trans. Inform. Theory, vol. 52, no. 4, pp.1562-1575 (2006).
[13] M. Hayashi, “Practical Evaluation of Security for Quantum Key Distribution,” Phys. Rev. A 74, 022307 (2006).
[14] M. Hayashi, “Upper bounds of eavesdropper’s performances in finite-length code with the decoy method,” Phys. Rev. A 76, 012329 (2007); Phys. Rev. A 79, 019901(E) (2009).
[15] M. Hayashi, “Second-Order Asymptotics in Fixed-Length Source Coding and Intrinsic Randomness,” IEEE Trans. Inform. Theory, vol. 54, pp.4619-4637 (2008).
[16] M. Hayashi, “Exponential decreasing rate of leaked information in universal random privacy amplification,” arXiv:0904.0308, to be published in IEEE Trans. Inform. Theory.
[17] See, e.g., J. Justesen and T. Høholdt, A Course in Error-Correcting Codes, European Mathematical Society (2004).
[18] J. Kamp, A. Rao, S. Vadhan, and D. Zuckerman, “Deterministic Extractors For Small-Space Sources,” STOC ’07, to appear in JCSS.
[19] M. Koashi, “Simple security proof of quantum key distribution based on complementarity,” New J. Phys. 11, 045018 (2009).
[20] Y. Mansour, N. Nisan, and P. Tiwari, “The Computational Complexity of Universal Hashing,” in STOC ’90, Proceedings of the twenty-second annual ACM symposium on Theory of computing, pp.235-243 (1990).
[21] U. Maurer and S. Wolf, “Information-theoretic key agreement: From weak to strong secrecy for free,” Advances in Cryptology–EUROCRYPT 2000, LNCS 1807, pp.351-368 (2000).
[22] T. Miyadera, “Information-Disturbance Theorem for Mutually Unbiased Observables,” Phys. Rev. A 73, 042317 (2006).
[23] J. Muramatsu and S. Miyake, “Construction of Codes for Wiretap Channel and Secret Key Agreement from Correlated Source Outputs by Using Sparse Matrices,” arXiv:0903.4014.
[24] J. Muramatsu and S. Miyake, “Hash Property and Coding Theorems for Sparse Matrices and Maximum-Likelihood Coding,” IEEE Trans. Inform. Theory, vol. 56, no. 5, pp.2143-2167 (2010); arXiv:0801.3878.
[25] R. Renner, “Security of Quantum Key Distribution,” PhD thesis, Dipl. Phys. ETH, Switzerland, 2005; arXiv:quant-ph/0512258.
[26] P. W. Shor and J. Preskill, “Simple Proof of Security of the BB84 Quantum Key Distribution Protocol,” Phys. Rev. Lett. 85, pp.441-444 (2000).
[27] D. R. Stinson, “Universal hashing and authentication codes,” in J. Feigenbaum (Ed.): Advances in Cryptology - CRYPTO ’91, LNCS 576, pp.62-73 (1992).
[28] D. R. Stinson, “Universal hash families and the leftover hash lemma, and applications to cryptography and computing,” J. Combin. Math. Combin. Comput. 42, pp.3-31 (2002).
[29] S. Watanabe, M. Ryutaroh, and U. Tomohiko, “Noise Tolerance of the BB84 Protocol with Random Privacy Amplification,” International Journal of Quantum Information, vol.4, no.6, pp.935–946 (2006).
[30] S. Watanabe, M. Ryutaroh, and U. Tomohiko, “Strongly Secure Privacy Amplification Cannot Be Obtained by Encoder of Slepian-Wolf Code,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol.E93-A, no.9, pp.1650–1659 (2010).
[31] M. N. Wegman and J. L. Carter, “New Hash Functions and Their Use in Authentication and Set Inequality,” J. Comput. System Sci. 22, pp.265-279 (1981).
[32] A. D. Wyner, “The wire-tap channel,” Bell. Sys. Tech. Jour., vol. 54, pp.1355-1387 (1975).

APPENDIX A
PROOFS OF THEOREMS 7 AND 8

First, we show Theorem 7. Due to the linearity, it is sufficient to evaluate the probability that the received signal is erroneously decoded to $C \setminus \{0\}$ when $0 \in C$ is sent. Let $P_X^n(x)$ be the n-fold independent and identical extension of the distribution (1 − p, p). Since the phase error x occurs on an n-bit sequence with probability $P_X^n(x)$, applying Gallager's evaluation [8] to this error probability, for $0 \le s \le 1$ and $0 \le a = \frac{1}{1+s}$, we obtain

$$P_e(C) \le \sum_{y \in \mathbb{F}_2^n} P_X^n(y) \left( \sum_{x \in C \setminus \{0\}} \left( \frac{P_X^n(y+x)}{P_X^n(y)} \right)^{a} \right)^{s} = \sum_{y \in \mathbb{F}_2^n} \left(P_X^n(y)\right)^{\frac{1}{1+s}} \left( \sum_{x \in C \setminus \{0\}} \left(P_X^n(y+x)\right)^{\frac{1}{1+s}} \right)^{s} .$$

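Thus, the error probability $P_e(C)$ is bounded from above by this value. Any ε-almost universal2 code family satisfies the inequality

$$\mathrm{E}_{C \in \mathcal{C}} \sum_{x \in C \setminus \{0\}} P_X^n(y+x)^{\frac{1}{1+s}} \le \varepsilon\, 2^{t_{\max} - n} \sum_{x \in \mathbb{F}_2^n} P_X^n(y+x)^{\frac{1}{1+s}} .$$

Taking the average over the ensemble of C, we obtain the upper bound

$$\begin{aligned}
\mathrm{E}_{C \in \mathcal{C}} P_e(C)
&\le \mathrm{E}_{C \in \mathcal{C}} \sum_{y \in \mathbb{F}_2^n} P_X^n(y)^{\frac{1}{1+s}} \left( \sum_{x \in C \setminus \{0\}} P_X^n(y+x)^{\frac{1}{1+s}} \right)^{s} \\
&\le \sum_{y \in \mathbb{F}_2^n} P_X^n(y)^{\frac{1}{1+s}} \left( \mathrm{E}_{C \in \mathcal{C}} \sum_{x \in C \setminus \{0\}} P_X^n(y+x)^{\frac{1}{1+s}} \right)^{s} \\
&\le \sum_{y \in \mathbb{F}_2^n} P_X^n(y)^{\frac{1}{1+s}} \left( \varepsilon\, 2^{t_{\max} - n} \sum_{x \in \mathbb{F}_2^n} P_X^n(y+x)^{\frac{1}{1+s}} \right)^{s} ,
\end{aligned} \tag{64}$$

where the concavity of $x \mapsto x^s$ is used. Since the quantity $\left( \varepsilon\, 2^{t_{\max} - n} \sum_{x \in \mathbb{F}_2^n} P_X^n(y+x)^{\frac{1}{1+s}} \right)^{s}$ does not depend on y, it can be replaced with

$$\left( \varepsilon\, 2^{t_{\max} - n} \sum_{x \in \mathbb{F}_2^n} P_X^n(x)^{\frac{1}{1+s}} \right)^{s} = \varepsilon^s\, 2^{s t_{\max} - s n} \left( \sum_{x \in \mathbb{F}_2^n} P_X^n(x)^{\frac{1}{1+s}} \right)^{s} .$$

Hence, the right hand side of (64) becomes

$$\begin{aligned}
\varepsilon^s\, 2^{s t_{\max} - s n} \sum_{y \in \mathbb{F}_2^n} P_X^n(y)^{\frac{1}{1+s}} \left( \sum_{x \in \mathbb{F}_2^n} P_X^n(x)^{\frac{1}{1+s}} \right)^{s}
&= \varepsilon^s\, 2^{s t_{\max} - s n} \left( \sum_{x \in \mathbb{F}_2^n} P_X^n(x)^{\frac{1}{1+s}} \right)^{1+s} \\
&= \varepsilon^s\, 2^{s t_{\max} - s n}\, 2^{n [s - E_0(s, p)]} .
\end{aligned} \tag{65}$$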

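From this, we obtain Theorem 7.

Next, we show Theorem 8. Due to the linearity, it is sufficient to evaluate the probability that the received signal is erroneously decoded to $C_1 \setminus C_2$ when Alice sends $0 \in C_2$. The difference from the above case is the derivation of (64). This part of the derivation can be replaced as follows.

$$\begin{aligned}
\mathrm{E}_{C_1 \in \mathcal{C}} P_e(C_1/C_2)
&\le \mathrm{E}_{C_1 \in \mathcal{C}} \sum_{y \in \mathbb{F}_2^n} P_X^n(y)^{\frac{1}{1+s}} \left( \sum_{x \in C_1 \setminus C_2} P_X^n(y+x)^{\frac{1}{1+s}} \right)^{s} \\
&\le \sum_{y \in \mathbb{F}_2^n} P_X^n(y)^{\frac{1}{1+s}} \left( \mathrm{E}_{C_1 \in \mathcal{C}} \sum_{x \in C_1 \setminus C_2} P_X^n(y+x)^{\frac{1}{1+s}} \right)^{s} \\
&\le \sum_{y \in \mathbb{F}_2^n} P_X^n(y)^{\frac{1}{1+s}} \left( \varepsilon\, 2^{t_{\max} - n} \sum_{x \in \mathbb{F}_2^n} P_X^n(y+x)^{\frac{1}{1+s}} \right)^{s} .
\end{aligned}$$

Combining this and (65), we obtain Theorem 8. Replacing $C_1$ and $C_2$ by $C_{1,r}$ and $C_{2,r}$ in the derivation, we obtain the evaluations (35) and (36) for $\mathrm{E}_{C_{1,r} \subset C_{2,r}} P_e(C_{1,r}/C_{2,r})$.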
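The step from (64) to (65) can be confirmed by explicit enumeration for a small block length. The sketch below assumes base-2 logarithms in $E_0(s,p) = s - (1+s)\log\left(p^{1/(1+s)} + (1-p)^{1/(1+s)}\right)$, and the parameter values n = 6, p = 0.1, s = 0.5, ε = 2 and $t_{\max}$ = 3 are arbitrary choices made only for this illustration.

```python
import itertools
import math

def e0(s, p):
    """E_0(s, p) = s - (1 + s) * log2(p^(1/(1+s)) + (1-p)^(1/(1+s)))."""
    a = 1.0 / (1.0 + s)
    return s - (1.0 + s) * math.log2(p ** a + (1 - p) ** a)

def bound_by_enumeration(n, p, s, eps, t_max):
    """Right-hand side of (64), summed explicitly over F_2^n."""
    a = 1.0 / (1.0 + s)
    probs = [p ** sum(x) * (1 - p) ** (n - sum(x))
             for x in itertools.product((0, 1), repeat=n)]
    # The inner sum over x does not depend on y, so it is computed once.
    inner = eps * 2.0 ** (t_max - n) * sum(q ** a for q in probs)
    return sum(q ** a for q in probs) * inner ** s

def bound_closed_form(n, p, s, eps, t_max):
    """Closed form (65)."""
    return eps ** s * 2.0 ** (s * (t_max - n)) * 2.0 ** (n * (s - e0(s, p)))

n, p, s, eps, t_max = 6, 0.1, 0.5, 2.0, 3
enumerated = bound_by_enumeration(n, p, s, eps, t_max)
closed = bound_closed_form(n, p, s, eps, t_max)
print(enumerated, closed)
```

The agreement relies only on the product structure of $P_X^n$, so other values of n, p and s can be substituted freely.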


APPENDIX B
PROOF OF EQUATION (30)

In order to prove this equation, it is convenient to introduce another binary distribution $P_\theta = (p_\theta, 1 - p_\theta)$ that is derived from $P = (p, 1-p)$, where $p_\theta$ is defined by

$$p_\theta := \frac{p^\theta}{p^\theta + (1-p)^\theta}$$

with the convention that $p_0 = 0$ if $p = 0$. The distribution $P_\theta$, parameterized by a real number θ ≥ 0, is often called the exponential family of P. We also define a function ψ(θ) by

$$\psi(\theta) := \log\left[ p^\theta + (1-p)^\theta \right].$$

Then the following relations are useful for simplifying calculations of the divergence $d(p\|q)$ and the entropy $h(p)$. For θ ≥ 0, we have

$$\begin{aligned}
\psi'(\theta) &= -d(p_\theta \| p) - h(p_\theta), \\
\psi''(\theta) &\ge 0, \\
h(p_\theta) &= -\theta \psi'(\theta) + \psi(\theta), \\
\frac{d h(p_\theta)}{d\theta} &= -\theta \psi''(\theta) \le 0, \\
d(p_\theta \| p) &= -\psi(\theta) - (1-\theta)\psi'(\theta), \\
d(p_\theta \| p) + h(p_\theta) &= -\psi'(\theta).
\end{aligned}$$
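These relations can be spot-checked numerically. The sketch below is an illustration under assumptions: natural logarithms are used throughout (the identities hold in any fixed base), $\psi'$ is approximated by a central finite difference, and the test point p = 0.2, θ = 0.7 is arbitrary.

```python
import math

def psi(theta, p):
    """psi(theta) = log(p^theta + (1-p)^theta), natural logarithm."""
    return math.log(p ** theta + (1 - p) ** theta)

def p_theta(theta, p):
    """Tilted probability p_theta = p^theta / (p^theta + (1-p)^theta)."""
    return p ** theta / (p ** theta + (1 - p) ** theta)

def h(q):
    """Binary entropy in nats, for 0 < q < 1."""
    return -q * math.log(q) - (1 - q) * math.log(1 - q)

def d(q, p):
    """Binary relative entropy d(q || p) in nats, for 0 < q, p < 1."""
    return q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))

def dpsi(theta, p, step=1e-6):
    """Central finite-difference approximation of psi'(theta)."""
    return (psi(theta + step, p) - psi(theta - step, p)) / (2 * step)

p, theta = 0.2, 0.7
q = p_theta(theta, p)

# Each residual should vanish up to the finite-difference error.
residual_entropy = h(q) - (-theta * dpsi(theta, p) + psi(theta, p))
residual_divergence = d(q, p) - (-psi(theta, p) - (1 - theta) * dpsi(theta, p))
residual_sum = d(q, p) + h(q) + dpsi(theta, p)
print(residual_entropy, residual_divergence, residual_sum)
```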

We shall make frequent use of these formulas in what follows. Note that $E_0(s, p)$ can be rewritten as

$$E_0(s, p) = s - (1+s)\,\psi\left( \frac{1}{1+s} \right).$$

First, we prove Equation (30) for the limited case where the minimum is evaluated over $q = p_\theta$ with 0 ≤ θ ≤ 1.

Lemma 2: If $R < 1 - h(p)$,

$$\min_{0 \le \theta \le 1} d(p_\theta \| p) + [1 - h(p_\theta) - R]_+ = E(R, p). \tag{66}$$

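Proof: $E_R(s, p) = -sR + E_0(s, p)$ is concave with respect to s, since $E_R''(s, p) = -(1+s)^{-3}\,\psi''(1/(1+s)) \le 0$. We define the critical rate $R_c$ by

$$R_c := 1 - h\left(p_{1/2}\right),$$

such that, if $R \le R_c$ (resp., $R \ge R_c$), then $\left.\frac{\partial E_R}{\partial s}\right|_{s=1} \ge 0$ (resp., $\left.\frac{\partial E_R}{\partial s}\right|_{s=1} \le 0$). Then, if $R \le R_c$, the maximum of $E_R$ is attained at s = 1:

$$\begin{aligned}
E(R, p) &= E_R(1, p) = -R + 1 - 2\psi(1/2) \\
&= d\left(p_{1/2} \,\middle\|\, p\right) + 1 - h(p_{1/2}) - R \\
&= \min_{0 \le \theta \le 1} d(p_\theta \| p) + 1 - h(p_\theta) - R.
\end{aligned}$$

The last line follows by noting that $d(p_\theta \| p) + 1 - h(p_\theta) - R$ attains its minimum at θ = 1/2, since $\frac{\partial}{\partial\theta}\left[ d(p_\theta \| p) - h(p_\theta) \right] = 2(\theta - 1/2)\,\psi''(\theta)$ with $\psi''(\theta) \ge 0$. Also by noting that $1 - h(p_{1/2}) - R \ge 0$ for $R \le R_c$, we see that (66) is satisfied for $R \le R_c$.

On the other hand, if $R > R_c$, we have $\left.\frac{\partial E_R}{\partial s}\right|_{s=1} \le 0$, and also $\left.\frac{\partial E_R}{\partial s}\right|_{s=0} > 0$ from $R < 1 - h(p)$. Thus the maximum is attained at $s_R \in (0, 1]$ satisfying $\left.\frac{\partial E_R}{\partial s}\right|_{s=s_R} = 0$, i.e.,

$$\psi\left( \frac{1}{1+s_R} \right) - \frac{1}{1+s_R}\,\psi'\left( \frac{1}{1+s_R} \right) = 1 - R. \tag{67}$$

Hence

$$\begin{aligned}
E(R, p) &= E_R(s_R, p) = -\psi\left( \frac{1}{1+s_R} \right) - \frac{s_R}{1+s_R}\,\psi'\left( \frac{1}{1+s_R} \right) \\
&= d\left( p_{(1+s_R)^{-1}} \,\middle\|\, p \right).
\end{aligned} \tag{68}$$

Note that the condition (67) can also be written as $1 - h\left(p_{(1+s_R)^{-1}}\right) - R = 0$. Then by noting that $d(p_\theta \| p) - h(p_\theta)$ is monotonically increasing for 1/2 ≤ θ ≤ 1, whereas $d(p_\theta \| p)$ is decreasing, we see that the minimum of (66) is attained for $\theta = (1+s_R)^{-1}$. Hence (66) holds for $R > R_c$ as well.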
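Lemma 2 can also be examined numerically. The sketch below rests on two assumptions that are inferred from the proof rather than restated in this appendix: $E(R,p)$ is taken to be $\max_{0 \le s \le 1} [-sR + E_0(s,p)]$, and all logarithms are to base 2. Both sides of (66) are evaluated on a grid for p = 0.1 at one rate below the critical rate, at the critical rate itself, and at one rate above it.

```python
import math

def h(q):
    """Binary entropy in bits."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def d(q, p):
    """Binary relative entropy d(q || p) in bits, for 0 < q, p < 1."""
    return q * math.log2(q / p) + (1 - q) * math.log2((1 - q) / (1 - p))

def psi(theta, p):
    return math.log2(p ** theta + (1 - p) ** theta)

def p_theta(theta, p):
    return p ** theta / (p ** theta + (1 - p) ** theta)

def exponent_max(rate, p, steps=20000):
    """Grid maximum over 0 <= s <= 1 of -s R + E_0(s, p)."""
    return max(-s * rate + s - (1 + s) * psi(1 / (1 + s), p)
               for s in (i / steps for i in range(steps + 1)))

def exponent_min(rate, p, steps=20000):
    """Grid minimum over theta of d(p_theta||p) + [1 - h(p_theta) - R]_+."""
    best = float("inf")
    for i in range(steps + 1):
        q = p_theta(i / steps, p)
        best = min(best, d(q, p) + max(1 - h(q) - rate, 0.0))
    return best

p = 0.1
critical_rate = 1 - h(p_theta(0.5, p))
results = [(rate, exponent_max(rate, p), exponent_min(rate, p))
           for rate in (0.05, critical_rate, 0.4)]
for rate, via_max, via_min in results:
    print(f"R={rate:.4f}  max form={via_max:.6f}  min form={via_min:.6f}")
```

The two columns agree up to the grid resolution, both in the regime where the maximum sits at s = 1 and in the regime where it is attained at an interior point $s_R$.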


Proof of Equation (30): Let

$$\begin{aligned}
M_1 &:= \min_{0 \le q \le 1} d(q \| p) + [1 - h(q) - R]_+ , \\
M_2 &:= \min_{0 \le \theta \le 1} d(p_\theta \| p) + [1 - h(p_\theta) - R]_+ .
\end{aligned}$$

Then from Lemma 2, it suffices to show $M_1 = M_2$. Since $M_1 \le M_2$ holds trivially, it remains to show $M_1 \ge M_2$. Denote the value of q attaining the minimum of $M_1$ by $\tilde{q}$. Then we have

$$d(\tilde{q} \| p) \le d(p_0 \| p) \tag{69}$$

since otherwise,

$$\begin{aligned}
M_1 &> d(p_0 \| p) + [1 - h(\tilde{q}) - R]_+ \\
&\ge d(p_0 \| p) + [1 - h(p_0) - R]_+ \ge M_2 ,
\end{aligned} \tag{70}$$

which contradicts $M_1 \le M_2$. The second line of (70) follows by noting that $h(\tilde{q}) \le h(p_0)$ with $p_0$ being the uniform distribution. Note that this is true even when p = 0 (resp. p = 1) because then $\tilde{q} = 0$ (resp. $\tilde{q} = 1$) due to the condition $d(\tilde{q} \| p) < \infty$.

By a straightforward calculation, one can show that, given an arbitrary combination of p, q, θ satisfying $d(q \| p) = d(p_\theta \| p)$,

$$h(p_\theta) - h(q) = \frac{d(q \| p_\theta)}{1 - \theta} \tag{71}$$

holds. From (69), $d(\tilde{q} \| p) = d(p_{\tilde\theta} \| p)$ holds for some $\tilde\theta \in [0, 1]$. Then by using (71), we see that $h(p_{\tilde\theta}) \ge h(\tilde{q})$, and thus $M_1 \ge M_2$.

APPENDIX C
PROOF OF INEQUALITY (63)

Define two functions

$$f_1(p) := 1 - h\left( \frac{1}{2} - \sqrt{p(1-p)} \right), \qquad f_2(p) := -\log(1-p).$$

Then $f_2$ is convex since $f_2''(p) = (1-p)^{-2} \ge 0$. The concavity of $f_1$ can be shown as follows. By a straightforward calculation,

$$f_1''(p) = \frac{1}{4x^3} \left( 4x + \log\frac{1-2x}{1+2x} \right),$$

where $x := \sqrt{p(1-p)}$. We have $f_1''(p) \le 0$ since the expression in parentheses vanishes at x = 0 and

$$\frac{d}{dx} \left( 4x + \log\frac{1-2x}{1+2x} \right) = 4 - \frac{4}{1-4x^2} \le 0$$

if 0 ≤ p ≤ 1/2 and thus 0 ≤ x ≤ 1/2. Then by noting $f_1(0) = f_2(0)$ and $f_1(1/2) = f_2(1/2)$, we obtain (63).
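The closed form of $f_1''$ can be compared with a finite-difference estimate. The sketch below is an illustration under assumptions: natural logarithms are used so that the derivative matches the stated expression without a constant factor, the additive constant in $f_1$ is irrelevant for the second derivative, and the sample points and step size are arbitrary.

```python
import math

def f1(p):
    """f1(p) = 1 - h(1/2 - sqrt(p(1-p))), with h measured in nats."""
    q = 0.5 - math.sqrt(p * (1 - p))
    return 1.0 + q * math.log(q) + (1 - q) * math.log(1 - q)

def f1_second_closed(p):
    """(4x + log((1-2x)/(1+2x))) / (4 x^3) with x = sqrt(p(1-p))."""
    x = math.sqrt(p * (1 - p))
    return (4 * x + math.log((1 - 2 * x) / (1 + 2 * x))) / (4 * x ** 3)

def f1_second_numeric(p, step=1e-4):
    """Central second difference of f1."""
    return (f1(p + step) - 2 * f1(p) + f1(p - step)) / step ** 2

points = [0.05, 0.1, 0.2, 0.3, 0.4]
checks = [(p, f1_second_closed(p), f1_second_numeric(p)) for p in points]
for p, closed, numeric in checks:
    print(f"p={p:.2f}  closed form={closed:.6f}  finite difference={numeric:.6f}")
```

Both columns are negative at every sample point, in line with the concavity of $f_1$ used in the proof.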