Lecture 7: The Advanced Encryption Standard (AES)

On January 2, 1997, the National Institute of Standards and Technology (NIST) announced an effort to develop a new symmetric-key block cipher as the encryption standard to replace DES. The new algorithm would be named the Advanced Encryption Standard (AES). Unlike the closed design process for DES, an open call for AES algorithms was formally made on September 12, 1997.

The requirements for AES were as follows:
(1) The AES would specify an unclassified, publicly disclosed symmetric-key encryption algorithm(s).
(2) The algorithm(s) must support (at a minimum) a block size of 128 bits and key sizes of 128, 192, and 256 bits, and should be at least as strong as triple DES while being more efficient than triple DES.
(3) It should work on a variety of different hardware.
(4) The algorithm(s), if selected, must be available royalty-free, worldwide.

On August 20, 1998, NIST announced a group of fifteen AES candidate algorithms, submitted by members of the cryptographic community from around the world. Public comments on the fifteen candidates were solicited as the initial review of these algorithms; this initial comment period was called Round 1. Round 1 closed on April 15, 1999. Using the analyses and comments received, NIST selected five algorithms from the fifteen.

The five AES finalist algorithms were MARS (from IBM), RC6 (from RSA Laboratories), Rijndael (from Joan Daemen and Vincent Rijmen), Serpent (from Ross Anderson, Eli Biham, and Lars Knudsen), and Twofish (from Bruce Schneier, John Kelsey, Doug Whiting, David Wagner, Chris Hall, and Niels Ferguson). These finalists received further analysis during a second, more in-depth review period (Round 2).

In Round 2, comments and analysis were sought on any aspect of the candidate algorithms, including, but not limited to, the following topics: cryptanalysis, intellectual property, cross-cutting analyses of all of the AES finalists, overall recommendations, and implementation issues. On October 2, 2000, NIST announced that it had selected Rijndael as the proposed AES.

Outline
- About the Finite Field $GF(p^n)$
- The Basic Algorithm
- The Layers
- Decryption
- Design Considerations

1 About the Finite Field $GF(p^n)$

For every prime power $p^n$, there is exactly one finite field with $p^n$ elements (up to isomorphism). However, for $n \ge 2$ the integers modulo $p^n$ do not form a field, since the congruence $px \equiv 1 \pmod{p^n}$ has no solution, so $p$ has no multiplicative inverse.

Example 1. Construct $GF(2^2)$.
Solution: Let $Z_2[X]$ be the set of polynomials whose coefficients are integers mod 2, such as $X^6 + X + 1$ and $X$. The constant polynomials 0 and 1 are also in $Z_2[X]$. We can add, subtract, and multiply in this set, as long as we work with the coefficients mod 2; for example, $(X^3 + X + 1)(X + 1) = X^4 + X^3 + X^2 + 1$. We can also perform division with remainder, just as with the integers. For example, dividing $X^4 + X^3 + 1$ by $X^2 + X + 1$ gives
\[ X^4 + X^3 + 1 = (X^2 + 1)(X^2 + X + 1) + X, \]
which we can write as $X^4 + X^3 + 1 \equiv X \pmod{X^2 + X + 1}$. Therefore, we can define $Z_2[X] \pmod{X^2 + X + 1}$ to be the set $\{0, 1, X, X + 1\}$ of polynomials of degree at most 1. Under addition and multiplication mod $X^2 + X + 1$, it is a field with 4 elements.
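The arithmetic in Example 1 can be checked mechanically. The following is a minimal Python sketch, not part of the original lecture. It assumes a polynomial over $Z_2$ is stored as an integer whose bit i is the coefficient of $X^i$; the names poly_mul, poly_divmod, and gf4_mul are illustrative.

```python
def poly_mul(a, b):
    """Multiply two polynomials over Z_2 (bit i of an int = coefficient of X^i)."""
    result = 0
    while b:
        if b & 1:          # current coefficient of b is 1: add (XOR) the shifted a
            result ^= a
        a <<= 1            # multiply a by X
        b >>= 1
    return result


def poly_divmod(a, m):
    """Divide a by the nonzero polynomial m over Z_2; return (quotient, remainder)."""
    quotient = 0
    deg_m = m.bit_length() - 1
    while a and a.bit_length() - 1 >= deg_m:
        shift = (a.bit_length() - 1) - deg_m
        quotient ^= 1 << shift   # add X^shift to the quotient
        a ^= m << shift          # subtract (XOR) X^shift * m from the dividend
    return quotient, a


def gf4_mul(a, b):
    """Multiply in Z_2[X] mod X^2 + X + 1; elements are 0, 1, 2 (= X), 3 (= X + 1)."""
    return poly_divmod(poly_mul(a, b), 0b111)[1]


# (X^3 + X + 1)(X + 1) = X^4 + X^3 + X^2 + 1
print(bin(poly_mul(0b1011, 0b11)))
# X^4 + X^3 + 1 = (X^2 + 1)(X^2 + X + 1) + X
print(poly_divmod(0b11001, 0b111))
# Multiplication table of the four elements 0, 1, X, X + 1
for a in range(4):
    print([gf4_mul(a, b) for b in range(4)])
```

In the printed table every nonzero row contains a 1, which is the field property: each nonzero element has a multiplicative inverse.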

1.1 The Construction of the Finite Field $GF(p^n)$

The general procedure for constructing a finite field $GF(p^n)$ is as follows:
(1) Let $Z_p[X]$ be the set of polynomials with coefficients mod $p$.
(2) Choose $P(X)$ to be an irreducible polynomial mod $p$ of degree $n$.
(3) Let $GF(p^n)$ be $Z_p[X] \pmod{P(X)}$. Then $GF(p^n)$ is a field with $p^n$ elements.

# What happens if we do the same construction for two different irreducible polynomials, both of degree $n$? It is possible to show that the two resulting fields are essentially the same (isomorphic).

1.2 Division

Division means multiplying by an inverse, and inverses can be found with the extended Euclidean algorithm.

Example 2. Consider $GF(2^8) = Z_2[X] \pmod{X^8 + X^4 + X^3 + X + 1}$. Find the inverse of $X^7 + X^6 + X^3 + X + 1$.
Solution: The calculation of $\gcd(X^7 + X^6 + X^3 + X + 1,\ X^8 + X^4 + X^3 + X + 1)$ proceeds exactly as for integers (at each step the remainder becomes the new divisor, the divisor becomes the new dividend, and the old dividend is ignored):
\[ X^8 + X^4 + X^3 + X + 1 = (X + 1)(X^7 + X^6 + X^3 + X + 1) + (X^6 + X^2 + X), \]
\[ X^7 + X^6 + X^3 + X + 1 = (X + 1)(X^6 + X^2 + X) + 1. \]
Working backwards through these equations,
\[ 1 = (X^2)(X^7 + X^6 + X^3 + X + 1) + (X + 1)(X^8 + X^4 + X^3 + X + 1). \]
Reducing mod $X^8 + X^4 + X^3 + X + 1$, we obtain
\[ (X^2)(X^7 + X^6 + X^3 + X + 1) \equiv 1 \pmod{X^8 + X^4 + X^3 + X + 1}, \]
so the inverse is $X^2$.
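The computation in Example 2 can be reproduced with a short program. This is a sketch under the assumption that polynomials over $Z_2$ are stored as integers (bit i is the coefficient of $X^i$), so the modulus $X^8 + X^4 + X^3 + X + 1$ is 0x11B; the function names are illustrative, not from the lecture.

```python
def poly_mul(a, b):
    """Multiply two polynomials over Z_2 (carry-less multiplication)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def poly_divmod(a, m):
    """Divide a by the nonzero polynomial m over Z_2; return (quotient, remainder)."""
    quotient = 0
    deg_m = m.bit_length() - 1
    while a and a.bit_length() - 1 >= deg_m:
        shift = (a.bit_length() - 1) - deg_m
        quotient ^= 1 << shift
        a ^= m << shift
    return quotient, a


def gf_inverse(a, modulus=0x11B):
    """Inverse of a modulo an irreducible polynomial, via extended Euclid."""
    if a == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    # Invariant: r0 = s0 * a and r1 = s1 * a (both mod modulus).
    r0, r1 = modulus, a
    s0, s1 = 0, 1
    while r1:
        q, r = poly_divmod(r0, r1)
        r0, r1 = r1, r                      # remainder -> divisor -> dividend
        s0, s1 = s1, s0 ^ poly_mul(q, s1)   # same update for the coefficient of a
    return s0                               # here r0 is the gcd, which is 1


inv = gf_inverse(0b11001011)                # X^7 + X^6 + X^3 + X + 1
print(format(inv, "08b"))                   # the byte for X^2
```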

1.3 $GF(2^8)$

We use $GF(2^8) = Z_2[X] \pmod{X^8 + X^4 + X^3 + X + 1}$ as an example. Every element can be represented uniquely as a polynomial
\[ b_7X^7 + b_6X^6 + b_5X^5 + b_4X^4 + b_3X^3 + b_2X^2 + b_1X + b_0, \]
where each $b_i$ is 0 or 1. The 8 bits $b_7b_6b_5b_4b_3b_2b_1b_0$ represent a byte. For example, $X^7 + X^6 + X^3 + X + 1$ becomes 11001011.

Addition is the XOR of the bits: $(X^7 + X^6 + X^3 + X + 1) + (X^4 + X^3 + 1)$ → 11001011 XOR 00011001 = 11010010 → $X^7 + X^6 + X^4 + X$.

Multiplication by $X$ is a left shift followed by a conditional reduction: $(X^7 + X^6 + X^3 + X + 1)(X)$ → 11001011 (shift left and append a 0) → 110010110 → 110010110 XOR 100011011 (subtract $X^8 + X^4 + X^3 + X + 1$, because the first bit is 1) = 010001101, which is the byte 10001101, that is, $X^7 + X^3 + X^2 + 1$. Multiplication by an arbitrary polynomial is built from repeated multiplications by $X$ and additions.

In summary, the operations in $GF(2^8)$ can be carried out efficiently.
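The byte operations described above translate directly into bit operations. The sketch below assumes bytes are Python integers from 0 to 255; the names gf_add, xtime, and gf_mul are illustrative choices, not notation from the lecture.

```python
AES_MODULUS = 0x11B   # bits of X^8 + X^4 + X^3 + X + 1


def gf_add(a, b):
    """Addition in GF(2^8) is XOR of the bits."""
    return a ^ b


def xtime(a):
    """Multiply a byte by X: shift left, then subtract the modulus if bit 8 is set."""
    a <<= 1
    if a & 0x100:
        a ^= AES_MODULUS
    return a


def gf_mul(a, b):
    """Multiply two bytes using repeated multiplication by X and addition."""
    result = 0
    while b:
        if b & 1:
            result = gf_add(result, a)
        a = xtime(a)
        b >>= 1
    return result


print(format(gf_add(0b11001011, 0b00011001), "08b"))   # the addition example
print(format(xtime(0b11001011), "08b"))                # the multiplication-by-X example
print(gf_mul(0b11001011, 0b00000100))                  # a byte times its inverse from Example 2
```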

2 The Basic Algorithm

For simplicity, we restrict to 128-bit keys, and first give a brief outline of the algorithm. The algorithm consists of 10 rounds. Each round has a round key, derived from the original key. There is also a 0th round key, which is the original 128-bit key. A round starts with an input of 128 bits and produces an output of 128 bits.

There are four basic steps, called layers, that are used to form the rounds:
(1) The ByteSub (BS) Transformation: This nonlinear layer is for resistance to differential and linear cryptanalysis attacks.
(2) The ShiftRow (SR) Transformation: This linear mixing step causes diffusion of the bits over multiple rounds.
(3) The MixColumn (MC) Transformation: This layer has a purpose similar to ShiftRow.
(4) The AddRoundKey (ARK) Transformation: The round key is XORed with the result of the above layer.

A round is then: ByteSub → ShiftRow → MixColumn → AddRoundKey.

Rijndael Encryption
(1) ARK, using the 0th round key.
(2) Nine rounds of BS, SR, MC, ARK, using round keys 1 to 9.
(3) A final round: BS, SR, ARK, using the 10th round key.

# The final round omits the MixColumn layer.
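The round structure above can be written as a small driver. The sketch below is only an illustration of the ordering: the four layers are passed in as functions (the parameter names bs, sr, mc, ark are assumptions), and the usage example substitutes tracing stand-ins that record their names instead of transforming data.

```python
def rijndael_encrypt(state, round_keys, bs, sr, mc, ark):
    """Apply the 128-bit round structure; the four layers are supplied as functions."""
    if len(round_keys) != 11:
        raise ValueError("expected round keys 0 through 10")
    state = ark(state, round_keys[0])                   # (1) ARK with the 0th round key
    for r in range(1, 10):                              # (2) nine full rounds
        state = ark(mc(sr(bs(state))), round_keys[r])
    return ark(sr(bs(state)), round_keys[10])           # (3) final round, no MC


# Tracing stand-ins: each "layer" appends its name to a list.
trace = rijndael_encrypt(
    [], list(range(11)),
    bs=lambda s: s + ["BS"],
    sr=lambda s: s + ["SR"],
    mc=lambda s: s + ["MC"],
    ark=lambda s, k: s + [f"ARK{k}"],
)
print(trace[:5], "...", trace[-3:])
```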

3 The Layers

The 128 input bits are grouped into 16 bytes of 8 bits each, call them $a_{0,0}, a_{1,0}, a_{2,0}, a_{3,0}, a_{0,1}, a_{1,1}, \ldots, a_{3,3}$, and are arranged into a 4 × 4 matrix
\[
\begin{pmatrix}
a_{0,0} & a_{0,1} & a_{0,2} & a_{0,3} \\
a_{1,0} & a_{1,1} & a_{1,2} & a_{1,3} \\
a_{2,0} & a_{2,1} & a_{2,2} & a_{2,3} \\
a_{3,0} & a_{3,1} & a_{3,2} & a_{3,3}
\end{pmatrix}.
\]

In the following, we'll need to work with the finite field $GF(2^8)$. The model of $GF(2^8)$ depends on a choice of irreducible polynomial of degree 8. The choice for Rijndael is $X^8 + X^4 + X^3 + X + 1$. The elements of $GF(2^8)$ can be represented by bytes. They can be added by XOR. They can also be multiplied, as described in Section 1.3. Each nonzero element has a multiplicative inverse.

3.1 The ByteSub Transformation

S-Box (16 × 16), entries in decimal:

 99 124 119 123 242 107 111 197  48   1 103  43 254 215 171 118
202 130 201 125 250  89  71 240 173 212 162 175 156 164 114 192
183 253 147  38  54  63 247 204  52 165 229 241 113 216  49  21
  4 199  35 195  24 150   5 154   7  18 128 226 235  39 178 117
  9 131  44  26  27 110  90 160  82  59 214 179  41 227  47 132
 83 209   0 237  32 252 177  91 106 203 190  57  74  76  88 207
208 239 170 251  67  77  51 133  69 249   2 127  80  60 159 168
 81 163  64 143 146 157  56 245 188 182 218  33  16 255 243 210
205  12  19 236  95 151  68  23 196 167 126  61 100  93  25 115
 96 129  79 220  34  42 144 136  70 238 184  20 222  94  11 219
224  50  58  10  73   6  36  92 194 211 172  98 145 149 228 121
231 200  55 109 141 213  78 169 108  86 244 234 101 122 174   8
186 120  37  46  28 166 180 198 232 221 116  31  75 189 139 138
112  62 181 102  72   3 246  14  97  53  87 185 134 193  29 158
225 248 152  17 105 217 142 148 155  30 135 233 206  85  40 223
140 161 137  13 191 230  66 104  65 153  45  15 176  84 187  22

Write a byte as 8 bits: abcdefgh. Look for the entry in the abcd row and efgh column, where the rows and columns are numbered from 0 to 15. For example, if the input byte is 10001011, we have abcd = 1000 = 8 and efgh = 1011 = 11, so we look in the 9th row and 12th column of the table. The entry is 61, which is 00111101 in binary. The output of ByteSub is again a 4 × 4 matrix of bytes:
\[
\begin{pmatrix}
a_{0,0} & a_{0,1} & a_{0,2} & a_{0,3} \\
a_{1,0} & a_{1,1} & a_{1,2} & a_{1,3} \\
a_{2,0} & a_{2,1} & a_{2,2} & a_{2,3} \\
a_{3,0} & a_{3,1} & a_{3,2} & a_{3,3}
\end{pmatrix}
\to
\begin{pmatrix}
b_{0,0} & b_{0,1} & b_{0,2} & b_{0,3} \\
b_{1,0} & b_{1,1} & b_{1,2} & b_{1,3} \\
b_{2,0} & b_{2,1} & b_{2,2} & b_{2,3} \\
b_{3,0} & b_{3,1} & b_{3,2} & b_{3,3}
\end{pmatrix}.
\]

3.2 The ShiftRow Transformation

The four rows of the matrix are shifted cyclically to the left by offsets of 0, 1, 2, and 3, to obtain
\[
\begin{pmatrix}
c_{0,0} & c_{0,1} & c_{0,2} & c_{0,3} \\
c_{1,0} & c_{1,1} & c_{1,2} & c_{1,3} \\
c_{2,0} & c_{2,1} & c_{2,2} & c_{2,3} \\
c_{3,0} & c_{3,1} & c_{3,2} & c_{3,3}
\end{pmatrix}
=
\begin{pmatrix}
b_{0,0} & b_{0,1} & b_{0,2} & b_{0,3} \\
b_{1,1} & b_{1,2} & b_{1,3} & b_{1,0} \\
b_{2,2} & b_{2,3} & b_{2,0} & b_{2,1} \\
b_{3,3} & b_{3,0} & b_{3,1} & b_{3,2}
\end{pmatrix}.
\]
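As an illustration of the ShiftRow step, here is a minimal Python sketch. It assumes the state is a list of four rows, each a list of four entries; the function names shift_rows and inv_shift_rows are illustrative.

```python
def shift_rows(state):
    """Rotate row i of a 4 x 4 state cyclically to the left by i positions."""
    return [row[i:] + row[:i] for i, row in enumerate(state)]


def inv_shift_rows(state):
    """Rotate row i cyclically to the right by i positions (undoes shift_rows)."""
    return [row[4 - i:] + row[:4 - i] for i, row in enumerate(state)]


# Use labels instead of bytes so the movement of each entry is visible.
b = [[f"b{i}{j}" for j in range(4)] for i in range(4)]
c = shift_rows(b)
for row in c:
    print(row)
```

Using string labels rather than byte values makes it easy to compare the printed rows with the matrix on the right-hand side of the equation for this step.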

3.3 The MixColumn Transformation

The output of the ShiftRow step is a 4 × 4 matrix $(c_{i,j})$ with entries in $GF(2^8)$. Multiply this on the left by a matrix, again with entries in $GF(2^8)$, to produce the output $(d_{i,j})$, as follows:
\[
\begin{pmatrix}
00000010 & 00000011 & 00000001 & 00000001 \\
00000001 & 00000010 & 00000011 & 00000001 \\
00000001 & 00000001 & 00000010 & 00000011 \\
00000011 & 00000001 & 00000001 & 00000010
\end{pmatrix}
\begin{pmatrix}
c_{0,0} & c_{0,1} & c_{0,2} & c_{0,3} \\
c_{1,0} & c_{1,1} & c_{1,2} & c_{1,3} \\
c_{2,0} & c_{2,1} & c_{2,2} & c_{2,3} \\
c_{3,0} & c_{3,1} & c_{3,2} & c_{3,3}
\end{pmatrix}
=
\begin{pmatrix}
d_{0,0} & d_{0,1} & d_{0,2} & d_{0,3} \\
d_{1,0} & d_{1,1} & d_{1,2} & d_{1,3} \\
d_{2,0} & d_{2,1} & d_{2,2} & d_{2,3} \\
d_{3,0} & d_{3,1} & d_{3,2} & d_{3,3}
\end{pmatrix}.
\]
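The matrix product in MixColumn uses $GF(2^8)$ arithmetic for every multiplication and XOR for every addition. The sketch below assumes the state is a list of four rows of byte values; gf_mul, MIX, and mix_columns are illustrative names, and the sample column is an arbitrary choice made for the demonstration.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo X^8 + X^4 + X^3 + X + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


# The MixColumn matrix, written as integers (00000010 = 2, 00000011 = 3, 00000001 = 1).
MIX = [
    [2, 3, 1, 1],
    [1, 2, 3, 1],
    [1, 1, 2, 3],
    [3, 1, 1, 2],
]


def mix_columns(state, matrix=MIX):
    """Return matrix * state, with products in GF(2^8) and sums as XOR."""
    out = [[0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            acc = 0
            for t in range(4):
                acc ^= gf_mul(matrix[i][t], state[t][j])
            out[i][j] = acc
    return out


# First column is (db, 13, 53, 45); the other columns are all ones.
state = [[0xDB, 1, 1, 1], [0x13, 1, 1, 1], [0x53, 1, 1, 1], [0x45, 1, 1, 1]]
mixed = mix_columns(state)
for row in mixed:
    print([format(x, "02x") for x in row])
```

Each row of the matrix XORs to 00000001, so a column whose four bytes are equal is left unchanged; the all-ones columns in the sample state show this.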

3.4 The RoundKey Addition

The round key, derived from the original key, consists of 128 bits, which are arranged in a 4 × 4 matrix $(k_{i,j})$ consisting of bytes. This is XORed with the output $(d_{i,j})$ of the MixColumn step:
\[
\begin{pmatrix}
d_{0,0} & d_{0,1} & d_{0,2} & d_{0,3} \\
d_{1,0} & d_{1,1} & d_{1,2} & d_{1,3} \\
d_{2,0} & d_{2,1} & d_{2,2} & d_{2,3} \\
d_{3,0} & d_{3,1} & d_{3,2} & d_{3,3}
\end{pmatrix}
\oplus
\begin{pmatrix}
k_{0,0} & k_{0,1} & k_{0,2} & k_{0,3} \\
k_{1,0} & k_{1,1} & k_{1,2} & k_{1,3} \\
k_{2,0} & k_{2,1} & k_{2,2} & k_{2,3} \\
k_{3,0} & k_{3,1} & k_{3,2} & k_{3,3}
\end{pmatrix}
=
\begin{pmatrix}
e_{0,0} & e_{0,1} & e_{0,2} & e_{0,3} \\
e_{1,0} & e_{1,1} & e_{1,2} & e_{1,3} \\
e_{2,0} & e_{2,1} & e_{2,2} & e_{2,3} \\
e_{3,0} & e_{3,1} & e_{3,2} & e_{3,3}
\end{pmatrix}.
\]
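The RoundKey addition is an entrywise XOR of two 4 × 4 byte matrices. A minimal sketch, assuming both matrices are lists of four rows of integers; the sample values are arbitrary and only serve the demonstration.

```python
def add_round_key(state, round_key):
    """XOR each byte of the state with the corresponding byte of the round key."""
    return [[s ^ k for s, k in zip(s_row, k_row)]
            for s_row, k_row in zip(state, round_key)]


d = [[16 * i + j for j in range(4)] for i in range(4)]        # sample state
k = [[0xA5 ^ (i + 4 * j) for j in range(4)] for i in range(4)]  # sample round key
e = add_round_key(d, k)
print(add_round_key(e, k) == d)   # XORing with the same key again restores the state
```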

3.5 The Key Schedule

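The original key consists of 128 bits, which are arranged into a 4 × 4 matrix of bytes. Label its four columns $W(0), W(1), W(2), W(3)$. New columns are generated recursively for $i \ge 4$:
If $4 \nmid i$, then $W(i) = W(i-4) \oplus W(i-1)$.
If $4 \mid i$, then $W(i) = W(i-4) \oplus T(W(i-1))$,
where $T(W(i-1))$ is the following transformation of $W(i-1)$. Let the entries of $W(i-1)$ be $a, b, c, d$. Shift them cyclically to obtain $b, c, d, a$; replace each byte by its S-box entry to obtain $e, f, g, h$; and finally XOR the round constant $(10)^{(i-4)/4}$ into the first byte, where 10 denotes the byte 00000010 (the element $X$) and the power is computed in $GF(2^8)$:
\[
W(i-1) = \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}
\to \begin{pmatrix} b \\ c \\ d \\ a \end{pmatrix}
\xrightarrow{\text{S-box}} \begin{pmatrix} e \\ f \\ g \\ h \end{pmatrix}
\to \begin{pmatrix} e \oplus (10)^{(i-4)/4} \\ f \\ g \\ h \end{pmatrix}
= T(W(i-1)).
\]
The round key for the $i$th round consists of the columns $W(4i), W(4i+1), W(4i+2), W(4i+3)$.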
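The recursion for the columns $W(i)$ can be sketched in Python as follows. Two assumptions go beyond the text: the 16 key bytes are taken to fill the matrix column by column (so $W(0)$ is the first four key bytes), and the S-box is computed from its algebraic description rather than typed in as a table. The names expand_key and round_key are illustrative.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo X^8 + X^4 + X^3 + X + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def sbox_entry(x):
    """S-box entry of x: inverse in GF(2^8) (0 maps to 0), then the affine map."""
    y = next((c for c in range(1, 256) if gf_mul(x, c) == 1), 0)
    z = 0
    for i in range(8):
        bit = ((y >> i) ^ (y >> ((i + 4) % 8)) ^ (y >> ((i + 5) % 8))
               ^ (y >> ((i + 6) % 8)) ^ (y >> ((i + 7) % 8)) ^ (0x63 >> i))
        z |= (bit & 1) << i
    return z


SBOX = [sbox_entry(x) for x in range(256)]


def expand_key(key):
    """Return the columns W(0), ..., W(43) of a 16-byte key, each a list of 4 bytes."""
    if len(key) != 16:
        raise ValueError("this sketch handles 128-bit keys only")
    w = [list(key[4 * i:4 * i + 4]) for i in range(4)]   # assumed column-by-column fill
    rc = 1                                               # (10)^((i-4)/4), starting at i = 4
    for i in range(4, 44):
        prev = w[i - 1]
        if i % 4 == 0:
            rotated = prev[1:] + prev[:1]                # (a, b, c, d) -> (b, c, d, a)
            prev = [SBOX[byte] for byte in rotated]      # -> (e, f, g, h)
            prev[0] ^= rc                                # XOR the round constant
            rc = gf_mul(rc, 0b10)                        # next power of X
        w.append([p ^ q for p, q in zip(w[i - 4], prev)])
    return w


def round_key(w, i):
    """4 x 4 matrix whose columns are W(4i), ..., W(4i + 3)."""
    return [[w[4 * i + col][row] for col in range(4)] for row in range(4)]


w = expand_key(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
print(bytes(w[4]).hex(), bytes(w[43]).hex())
```

The sample key is the one used in the key expansion example of FIPS 197, so the printed columns can be compared with the values listed there.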

3.6 The Construction of the S-Box

The S-box has a simple mathematical description. Start with a byte $x_7x_6x_5x_4x_3x_2x_1x_0$ and let $y_7y_6y_5y_4y_3y_2y_1y_0$ be its inverse in $GF(2^8)$. By convention, the byte 00000000, which has no inverse, is sent to 00000000. The entry of $x_7x_6x_5x_4x_3x_2x_1x_0$ in the S-box is the byte $z_7z_6z_5z_4z_3z_2z_1z_0$ computed by
\[
\begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1
\end{pmatrix}
\begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \end{pmatrix}
+
\begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{pmatrix}
=
\begin{pmatrix} z_0 \\ z_1 \\ z_2 \\ z_3 \\ z_4 \\ z_5 \\ z_6 \\ z_7 \end{pmatrix},
\]
where the arithmetic is done mod 2.

Example 3. The inverse of the byte 11001011 in $GF(2^8)$ is 00000100. We calculate
\[
\begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1
\end{pmatrix}
\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}
+
\begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{pmatrix}
=
\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}.
\]

This yields the byte 00011111 = 31. As a check, we look in the S-box at row 1100 + 1 = 13 and column 1011 + 1 = 12 (counting from 1; 1100 and 1011 are 12 and 11 in decimal), and we again obtain the entry 31.
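The construction of the S-box can be carried out directly from the matrix equation. The sketch below writes the matrix and constant vector out explicitly and finds inverses by exhaustive search, which is slow but easy to follow; the names AFFINE_ROWS, AFFINE_CONST, gf_inverse, and sbox_entry are illustrative.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo X^8 + X^4 + X^3 + X + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inverse(x):
    """Inverse by exhaustive search; 00000000 is sent to 00000000 by convention."""
    return next((c for c in range(1, 256) if gf_mul(x, c) == 1), 0)


AFFINE_ROWS = [
    [1, 0, 0, 0, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0, 1, 1],
    [1, 1, 1, 1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1, 1],
]
AFFINE_CONST = [1, 1, 0, 0, 0, 1, 1, 0]


def sbox_entry(x):
    """Compute z = M * y + c (mod 2), where y is the inverse of x, as a byte z7...z0."""
    y = gf_inverse(x)
    y_bits = [(y >> i) & 1 for i in range(8)]            # (y0, y1, ..., y7)
    z_bits = [(sum(m * bit for m, bit in zip(row, y_bits)) + c) % 2
              for row, c in zip(AFFINE_ROWS, AFFINE_CONST)]
    return sum(bit << i for i, bit in enumerate(z_bits))


print(sbox_entry(0b11001011))   # the byte from Example 3
print(sbox_entry(0b10001011))   # the byte from the ByteSub lookup example
```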

4 Decryption

Each of the steps ByteSub, ShiftRow, MixColumn, and AddRoundKey is invertible:
(1) The inverse of ByteSub is another lookup table, called InvByteSub (IBS).
(2) The inverse of ShiftRow is obtained by shifting the rows to the right instead of to the left, yielding InvShiftRow (ISR).
(3) The transformation InvMixColumn (IMC) is given by multiplication by the matrix
\[
\begin{pmatrix}
00001110 & 00001011 & 00001101 & 00001001 \\
00001001 & 00001110 & 00001011 & 00001101 \\
00001101 & 00001001 & 00001110 & 00001011 \\
00001011 & 00001101 & 00001001 & 00001110
\end{pmatrix}.
\]
(4) AddRoundKey is its own inverse.
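That the InvMixColumn matrix really inverts the MixColumn matrix can be checked by multiplying the two over $GF(2^8)$. A sketch, with the matrices written as integers and with illustrative names MC_MATRIX, IMC_MATRIX, and mat_mul:

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo X^8 + X^4 + X^3 + X + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


MC_MATRIX = [
    [0x02, 0x03, 0x01, 0x01],
    [0x01, 0x02, 0x03, 0x01],
    [0x01, 0x01, 0x02, 0x03],
    [0x03, 0x01, 0x01, 0x02],
]
IMC_MATRIX = [
    [0x0E, 0x0B, 0x0D, 0x09],
    [0x09, 0x0E, 0x0B, 0x0D],
    [0x0D, 0x09, 0x0E, 0x0B],
    [0x0B, 0x0D, 0x09, 0x0E],
]


def mat_mul(a, b):
    """Product of two 4 x 4 matrices over GF(2^8) (sums are XORs)."""
    out = [[0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            for t in range(4):
                out[i][j] ^= gf_mul(a[i][t], b[t][j])
    return out


for row in mat_mul(IMC_MATRIX, MC_MATRIX):
    print(row)
```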

Therefore, reversing the encryption steps one at a time gives the decryption:

Rijndael encryption:
ARK
BS, SR, MC, ARK
...
BS, SR, MC, ARK
BS, SR, ARK.

⇒ Rijndael decryption:
ARK, ISR, IBS
ARK, IMC, ISR, IBS
...
ARK, IMC, ISR, IBS
ARK.

We can rewrite the decryption to achieve the same structure as encryption. Clearly, the order of ISR and IBS can be reversed, since IBS acts on each byte separately and ISR only moves bytes around. Let $(m_{i,j})$ denote the MixColumn matrix. Applying MC and then ARK to a matrix $(c_{i,j})$ is given by
\[ (c_{i,j}) \to (m_{i,j})(c_{i,j}) \to (e_{i,j}) = (m_{i,j})(c_{i,j}) \oplus (k_{i,j}). \]
The inverse is obtained by solving $(e_{i,j}) = (m_{i,j})(c_{i,j}) \oplus (k_{i,j})$ for $(c_{i,j})$. Since
\[ (c_{i,j}) = (m_{i,j})^{-1}\big((e_{i,j}) \oplus (k_{i,j})\big) = (m_{i,j})^{-1}(e_{i,j}) \oplus (m_{i,j})^{-1}(k_{i,j}), \]
the process is
\[ (e_{i,j}) \to (m_{i,j})^{-1}(e_{i,j}) \to (m_{i,j})^{-1}(e_{i,j}) \oplus (k'_{i,j}), \]
where $(k'_{i,j}) = (m_{i,j})^{-1}(k_{i,j})$. The first arrow is IMC. Let InvAddRoundKey (IARK) be XORing with $(k'_{i,j})$. We can then use "IMC and then IARK" to replace "ARK and then IMC".
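The exchange of "ARK and then IMC" for "IMC and then IARK" rests on the linearity of the matrix multiplication over $GF(2^8)$. The sketch below checks this numerically on pseudo-random 4 × 4 byte matrices; the seed, the helper names, and the sample data are arbitrary choices made for the illustration.

```python
import random


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo X^8 + X^4 + X^3 + X + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


IMC_MATRIX = [
    [0x0E, 0x0B, 0x0D, 0x09],
    [0x09, 0x0E, 0x0B, 0x0D],
    [0x0D, 0x09, 0x0E, 0x0B],
    [0x0B, 0x0D, 0x09, 0x0E],
]


def inv_mix_columns(state):
    """Multiply the state on the left by the InvMixColumn matrix over GF(2^8)."""
    out = [[0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            for t in range(4):
                out[i][j] ^= gf_mul(IMC_MATRIX[i][t], state[t][j])
    return out


def xor_state(a, b):
    """Entrywise XOR of two 4 x 4 byte matrices (this is what ARK does)."""
    return [[x ^ y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


rng = random.Random(0)
e = [[rng.randrange(256) for _ in range(4)] for _ in range(4)]   # sample state
k = [[rng.randrange(256) for _ in range(4)] for _ in range(4)]   # sample round key

ark_then_imc = inv_mix_columns(xor_state(e, k))                  # ARK, then IMC
k_prime = inv_mix_columns(k)                                     # k' = m^{-1} k
imc_then_iark = xor_state(inv_mix_columns(e), k_prime)           # IMC, then IARK
print(ark_then_imc == imc_then_iark)
```

In an implementation following this idea, the transformed keys $(k'_{i,j})$ would be computed once from the key schedule rather than in every decryption.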

Now, the decryption is given by

Rijndael decryption:
ARK, IBS, ISR
IMC, IARK, IBS, ISR
...
IMC, IARK, IBS, ISR
ARK.

Rijndael Decryption
(1) ARK, using the 10th round key.
(2) Nine rounds of IBS, ISR, IMC, IARK, using round keys 9 to 1.
(3) A final round: IBS, ISR, ARK, using the 0th round key.

# MC is omitted from the last round of encryption so that decryption has exactly the same structure as encryption.

5 Design Considerations

(1) Unlike in a Feistel system, all bits are treated uniformly. This has the effect of diffusing the input bits faster. It can be shown that two rounds are sufficient to obtain full diffusion.

(2) The S-box is constructed in an explicit and simple algebraic way so as to avoid any suspicion of trapdoors built into the algorithm. It is excellent at resisting differential and linear cryptanalysis, as well as interpolation attacks.
(3) The SR step was added to resist truncated differentials and the square attack.
(4) The MC step causes diffusion among the bytes.

(5) The key schedule involves nonlinear mixing of the key bits, since it uses the S-box. The mixing is designed to resist attacks in which part of the key is known. The round constants are used to eliminate symmetries.
(6) The number of rounds was chosen to be 10 because, as of 2004, there were attacks better than brute force on up to seven rounds. No known attack beat brute force for more than seven rounds. It was felt that three extra rounds provide a large enough margin of safety.