The Behaviour of the Cryptographic Primitives in Case of Large Data Volumes

Tomoiaga Radu, Stratulat Mircea
Faculty of Automatics and Computers, University Politehnica of Timişoara
Timisoara, V. Parvan 2, Timisoara, 1900 Romania
Email: [email protected], mircea.stratulat@cs.upt.ro

The quality characteristics of a system are conditioned by its quality attributes; each attribute is measured through one or more metrics, and one or more evaluation elements correspond to each metric [3]. The high costs characteristic of the software development process are correlated with the performance of the resulting products. Performance is established by taking into consideration the time necessary for obtaining the final solution and the results for the proposed problem. The performance level is directly influenced by the quality of the software product. The quality of a computer program can be divided as follows: design quality, execution quality, compliance quality, ease of use and maintainability. Performance evaluation is generally carried out according to characteristics, attributes, metrics and evaluation elements.

Sora, I., in [4], discusses algorithms for parallel computation and devotes a chapter to the evaluation of performance and metrics in parallel programs. The author considers as performance metrics the execution time, cost, efficiency, acceleration and overhead, the goal being the performance gain of parallel programming over sequential programming. For example, the execution time is divided into two components: the sequential execution time (the time from the beginning until the end of the program execution on a sequential computer) and the parallel execution time (the time from the beginning of execution until the last parallel subtask finishes).

Another performance metric is the overhead (in parallel programming, the total parallel overhead), defined as the difference between the total summed work time of all CPUs and the time necessary for the fastest sequential algorithm. The total summed work time of all CPUs is the computation time plus the communication time and the idle time accumulated over all CPUs [4].

A different metric is the acceleration. It arose as an answer to the question: "How many times faster is a problem solved by a parallel algorithm than by a sequential one?" The acceleration is defined as the ratio between the time necessary for running an algorithm on one CPU and the time necessary for solving it in parallel on several identical CPUs. Note that the acceleration evaluates the performance of the algorithm, not that of the parallel computing system [4].

Abstract: Benchmarks of cryptographic algorithms are generally run on small amounts of input data. Because the computation time is short, the test is repeated several times for accuracy, and the benchmark result is calculated as the average of the measured times. In the present paper we analyze the performance of cryptographic algorithms on large data volumes. The large data are stored on external drives (hard disks) and, as a novelty, the access time for the information on the hard disk is added to the encryption time.

Keywords: cryptography, benchmark, primitives, performance analysis

I. INTRODUCTION

Support for benchmark development is provided by the consortia SPEC (Standard Performance Evaluation Corporation) [1] and TPC (Transaction Processing Performance Council) [2], both founded in 1988. They provide benchmarks and guidelines meant to improve the quality of tests. The guidelines appeared as a reaction to the fact that many benchmark programs were created incorrectly and ultimately led to wrong results. To keep up with technological development, benchmarks must be updated periodically so that they continue to yield correct results. Some producers tend to sell their products by quoting results obtained on particular sets of tests that favour the respective product. Moreover, products are designed to run better on certain benchmarks in order to obtain better results than competing products. Benchmarks are even developed so that they run in the shortest possible time on certain products, in order to promote them on the market. To interpret the results of a benchmark correctly, one needs to understand the algorithms and functions used and the performance evaluation metrics.

II. METRICS AND PERFORMANCE EVALUATION

"Performance is determined by taking into consideration the time needed to obtain the final solution and the resources consumed in completely solving the current problems. The performance level is influenced by the way in which the software product was designed and built: design quality, execution quality, conformity quality, capacity for current usage and maintenance capacity" [3]. The quality indices are organized on different levels: characteristics, attributes, metrics and evaluation elements [3].

978-1-4244-6363-3/10/$26.00 ©2010 IEEE


Another metric that also refers to parallel CPUs is the efficiency, defined as the ratio between the acceleration and the number of CPUs. Its value is normally below 1 and reaches 1 only in the ideal case in which the parallel system has no overhead. One more metric can be considered: the cost, which is the total summed work time of all CPUs. The efficiency can therefore also be expressed as the sequential computation time divided by the cost. If the cost of solving a problem with a parallel algorithm equals the execution time of the fastest sequential algorithm, the parallel algorithm is said to be cost-optimal [4].

Pocatilu, P. presents several models in [5]. The first is McCabe's complexity model, also presented in [8], which evaluates program complexity. The second is the COCOMO (COnstructive COst MOdel) model, which estimates the cost, the effort and the time necessary for developing a software product. The Halstead model is also mentioned; it evaluates computational complexity from the point of view of the source code.
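The parallel performance metrics defined in this section can be summarized in formulas. The symbols are introduced here for illustration and do not appear in [4]: $p$ is the number of identical CPUs, $T_1$ the run time on one CPU, $T_p$ the parallel execution time on $p$ CPUs, and $T_s$ the run time of the fastest sequential algorithm.

```latex
\begin{align*}
S   &= \frac{T_1}{T_p}                        && \text{(acceleration)}\\
C   &= p\,T_p                                 && \text{(cost: total summed work time of all CPUs)}\\
T_o &= p\,T_p - T_s                           && \text{(total parallel overhead)}\\
E   &= \frac{S}{p} = \frac{T_1}{p\,T_p} = \frac{T_1}{C} \le 1 && \text{(efficiency)}\\
C   &= T_s                                    && \text{(condition for a cost-optimal algorithm)}
\end{align*}
```

As a purely hypothetical numerical example, assume $p = 4$, $T_1 = T_s = 100$ s and $T_p = 30$ s. Then $S = 100/30 \approx 3.33$, $C = 4 \cdot 30 = 120$ s, $T_o = 120 - 100 = 20$ s and $E = 100/120 \approx 0.83$, so the algorithm is not cost-optimal.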

Because an asymmetric algorithm consumes a great deal of computing power, asymmetric encryption is used for small messages, or for encrypting a key that is then used for symmetric encryption. A symmetric encryption algorithm is a cipher, which can be a block cipher or a stream cipher. Encryption can be described as a known function that transforms the input message, using the key in the process, and yields the encrypted text. The security of such an encryption system lies in keeping the key secret. To recover the message from the encrypted text, the inverse transformation is applied together with the same key. The decryption key does not have to be identical to the one used for encryption, but each must be easy to deduce from the other [10]. For the practical part we have chosen the following symmetric algorithms: DES, 3DES and AES. In designing the AES algorithm, the following criteria were considered: the algorithm had to be resistant to all known attacks, the code had to be compact, the speed had to be high on several platforms, and the design had to be simple [11].

III. CRYPTOGRAPHIC PRIMITIVES

Solga, M. in [6] evaluates the performance of cryptographic functions using four categories of cryptographic primitives: MAC functions, hash functions, symmetric algorithms and asymmetric algorithms. In the present paper only three categories have been chosen: MAC functions, hash functions and symmetric algorithms. This chapter presents the chosen cryptographic primitives, which were selected based on the taxonomy given by Schneier, B. in [7]. SHA384 has been left out because, according to [7], it is of little use: the same computation as for SHA512 is run to obtain the result, and then a number of bits are discarded.
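The remark about SHA384 can be illustrated with a minimal sketch using Python's standard `hashlib` module (Python is an assumption here; the paper's tests use .NET and OpenSSL). It shows that SHA384 and SHA512 process input blocks of the same size and differ in output length. Note that in the standardized algorithms the two functions also start from different initial values, so a truncated SHA512 digest is not equal to the SHA384 digest even though the work performed is the same. The message is an arbitrary illustrative value.

```python
import hashlib

# Arbitrary illustrative input; any byte string works.
message = b"benchmark input"

sha384 = hashlib.sha384(message).digest()
sha512 = hashlib.sha512(message).digest()

# Both functions consume the input in blocks of the same size,
# so they perform the same amount of compression work.
same_block_size = hashlib.sha384().block_size == hashlib.sha512().block_size

# SHA384 keeps 48 of the 64 output bytes, but it starts from different
# initial values, so plain truncation of SHA512 does not reproduce it.
truncation_matches = sha512[:48] == sha384

print(len(sha384), len(sha512), same_block_size, truncation_matches)
```

Since the cost per input byte is the same for both functions, benchmarking SHA512 alone already covers the timing behaviour of SHA384.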

C. Hash Functions
From this category (also known as one-way hash functions, compression functions, message digests, fingerprints, cryptographic checksums [12] or neglectable functions [13]) the following functions have been chosen: SHA1, SHA256, SHA512 and MD5. These functions are iterative. An iterative hash function divides the input into fixed-size blocks. The blocks are processed in order, using a compression function and intermediate recursive stages. The result of the last iteration is the result of the hash function [7]. A hash function must have the following properties [14]:
Mixing transformation. For any input x, the result h(x) must be computationally indistinguishable from a uniform string in the interval $[0, 2^{|h|})$, where |h| is the output length in bits.
Resistance to collision. It must be computationally infeasible to find two inputs x, z with x ≠ z such that h(x) = h(z).
One way. Given a hash value h, it must be computationally infeasible to find an input x such that h = h(x).
Practical efficiency. Given the input x, h(x) must be computable in a sufficiently short time.
Keyless. The algorithm does not use any key in obtaining the result.
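The iterative processing described above is what makes it possible to hash files much larger than the available memory. The following is a minimal sketch, assuming Python's standard `hashlib` module rather than the .NET or OpenSSL implementations used in the tests; the function name `hash_file` and the 1 MiB chunk size are illustrative choices.

```python
import hashlib


def hash_file(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Feed the file to the hash one chunk at a time, so memory use
        # stays constant regardless of the file size.
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()
```

The chunk size passed to `update` is independent of the internal block size of the hash function; the library buffers the input and applies its compression function block by block.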

A. HMAC Algorithms (Message Authentication Codes)
Message authentication code (MAC) algorithms receive as input a secret key and a variable-length string that requires authentication, and produce as output a fixed-length string, called the MAC. One category in this class is represented by the HMAC algorithms: a MAC algorithm built from cryptographic primitives of the hash function type is called HMAC. The algorithms of this category implemented in the practical part are HMACSHA1, HMACSHA256, HMACSHA512 and HMACMD5. For example, the MAC of a message M can be obtained, using a key k and a hash algorithm h, with the formula MAC = h(k║M) [14]. This method is considered simple; in addition, the hash function must be applied once more to the result of the formula [9]. HMAC was defined in FIPS 198, which was later replaced by FIPS 198-1.
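The difference between the simple formula MAC = h(k║M) and the nested HMAC construction can be sketched as follows. This is an illustration in Python using the standard `hashlib` and `hmac` modules, with SHA-256 chosen arbitrarily as the hash function; the function names `naive_mac` and `hmac_sha256_manual` are hypothetical and are not part of the applications tested in the paper.

```python
import hashlib
import hmac


def naive_mac(key: bytes, message: bytes) -> str:
    """The simple prefix construction from the text: MAC = h(k || M)."""
    return hashlib.sha256(key + message).hexdigest()


def hmac_sha256_manual(key: bytes, message: bytes) -> str:
    """HMAC written out by hand: the hash is applied a second time
    to the result of the inner keyed hash."""
    block_size = hashlib.sha256().block_size  # 64 bytes for SHA-256
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()    # long keys are hashed first
    key = key.ljust(block_size, b"\x00")      # then padded to one block
    inner_key = bytes(b ^ 0x36 for b in key)
    outer_key = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(inner_key + message).digest()
    return hashlib.sha256(outer_key + inner).hexdigest()


def hmac_sha256(key: bytes, message: bytes) -> str:
    """The same computation through the standard library."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()
```

The hand-written version exists only to make the two hash applications visible; in practice the library function should be used, and tags should be compared with `hmac.compare_digest` rather than with `==`.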

IV. THE RESULTING APPLICATIONS

This chapter describes the applications implemented in the four development environments. For the Windows platform (XP SP3 and 2000 SP4), the applications were developed in Visual Studio 2008 (more precisely, in Visual C# and Visual Basic .NET). The classes used in the test applications belong to System.Security.Cryptography. To test the algorithms of this benchmark under UNIX, the Kubuntu 9.04 Live CD distribution was selected. The fact that the files were read and written on an NTFS partition is a disadvantage, because the operating system natively runs on EXT2/EXT3, but also an advantage, because the results can be easily compared with those of the Visual Studio applications.

B. Symmetric Algorithms
If the same key is used for both the encryption and the decryption process, the algorithm is called symmetric (a symmetric-key algorithm). If the encryption process uses one key and the decryption process uses another, we talk about asymmetric encryption.


PC | Processor | Memory | HDD
PC1 | Core(TM)2 Duo P8400 @ 2.26GHz, Wolfdale | Slot 1 + Slot 3, Samsung 2048 MBytes (400 MHz) | WD 250 GB, 5400 RPM
PC2 | Core(TM)2 Duo CPU E6750 @ 2.66GHz, Conroe | Slot 1 + Slot 3, Kingston 2048 MBytes (333 MHz) | Seagate 500 GB, 7200 RPM
PC3 | Pentium(R) 4 CPU 3.00GHz, Prescott-2M | XMM1+3, JTAG 256 MBytes (533 MHz) | WD 80 GB, 7200 RPM
PC4 | Dual CPU E2140 @ 1.60GHz, Conroe-1M | Slot 1 + Slot 3, Nanya 512 MBytes (333 MHz) | Maxtor 250 GB, 7200 RPM
PC5 | Intel Core 2 Quad, Yorkfield | Slot 1 + Slot 3, Kingston 2048 MBytes (400 MHz) | 2 HDD WD 1 TB, RAID

Table 4.1 Platform description (information obtained with System Info for Windows)

[Chart "AES comparison": time in seconds (axis from 0 to 2500) on platforms PC1 to PC5, one series per file size from 1 GB to 10 GB.]

Figure 4.1 AES Algorithm Visual Basic

The Visual Studio applications were likewise run on NTFS partitions. The tests consisted in running the algorithms on files of 1 MB, 100 MB, 1 GB, 2 GB, 3 GB, ..., 10 GB. In this phase the algorithms were implemented in Visual Basic, in C# and, under Unix, using the OpenSSL libraries. These tests simulate applications that encrypt files and hard disks (applications like TrueCrypt, Kruptus etc.). The time is calculated as follows (.NET example): TimeSpan duration = stopTime - startTime; duration.TotalMilliseconds. Under Unix, the "time" command is used together with the OpenSSL command, as in the following example: time openssl dgst -md5 test1mb.txt.

Five workstations were used as test platforms, configured as shown in Table 4.1. The platforms were chosen so that the algorithms could be tested on multiple types of units and systems.

For small files, the best performance is provided by the C# implementations. For large files up to 5 GB, the C# computing time increases uniformly and is shorter than that of VB or OpenSSL. Starting with 6 GB files, a more pronounced increase than for VB or OpenSSL can be observed. The only uniform increase across all ten large files is that of OpenSSL, but its times are not better than those obtained with VB or C#. The results show that platform PC2 has the shortest computing time, taken as an average over all tested algorithms and all large test files.

For HMAC SHA256 applied to a 4 GB file on platform PC1, the OpenSSL time is close to that of VB, while C# needs almost twice as long as VB or OpenSSL. For HMAC MD5 tested on a 5 GB file on platform PC4, the times are almost identical.

Comparing the computing times of the hash functions on platform PC1 under OpenSSL for the 8 GB input file, the SHA1 and SHA512 primitives recorded the shortest times, MD5 a longer one and SHA256 the longest, although the timings are close in value. The situation is similar on platform PC4. Although both units have two cores, from different classes, an increase in computing time can be observed in the tests performed on platform PC4. It can be concluded that hash functions are influenced by the amount of memory, given that PC4 has only 1 GB of RAM while PC1 has 3 GB. The fact that SHA256 recorded a longer time than SHA512 comes as a surprise. In C#, SHA512 is the slowest of the four tested algorithms, and the computing time is longer on platform PC4 than on PC1. The timing of platform PC1 using the Visual Basic libraries is the shortest in this segment of tests. The times follow the same increasing trend as in C#, where MD5 has the shortest computing time (regardless of the platform) and SHA512 the longest.
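The measurement procedure described above (start time, stop time, duration in milliseconds, including the time spent reading the file from disk) can be sketched as follows. This is not the authors' Visual Basic, C# or OpenSSL harness but an illustrative Python equivalent built on the standard `hashlib` and `time` modules; the names `time_hash` and `benchmark`, the number of repeats and the chunk size are assumptions.

```python
import hashlib
import time


def time_hash(path, algorithm, repeats=3, chunk_size=1 << 20):
    """Return the mean wall-clock time, in milliseconds, needed to hash a file.

    The measured interval includes reading the file from disk, as in the
    tests described in the text.
    """
    durations = []
    for _ in range(repeats):
        digest = hashlib.new(algorithm)
        start = time.perf_counter()
        with open(path, "rb") as handle:
            # iter() with a sentinel stops when read() returns b"" at end of file.
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                digest.update(chunk)
        digest.hexdigest()
        durations.append((time.perf_counter() - start) * 1000.0)
    return sum(durations) / len(durations)


def benchmark(path, algorithms=("md5", "sha1", "sha256", "sha512")):
    """Time each hash function on the same file; returns {name: milliseconds}."""
    return {name: time_hash(path, name) for name in algorithms}
```

When a file fits in the operating system's file cache, repeated runs may be served from memory instead of the disk, which makes the later runs faster than the first one. For files of several gigabytes, as used in this paper, this effect is smaller, but it should be kept in mind when averaging.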

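[Chart "Hash functions on 1 GB file": time in seconds (axis from 0 to 100) on platforms PC1 to PC5, series MD5, SHA1, SHA256 and SHA512.]

Figure 4.2 Hash functions applied on a 1 GB file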

Next, the performance of the DES algorithm in the three environments is assessed, using the 6 GB file. On platform PC2, DES records its best computing time under Unix. Visual Basic is approximately 48 seconds slower than OpenSSL, and C# is circa 125 seconds slower than OpenSSL and circa 77 seconds slower than VB. In contrast, on platform PC3 DES obtains its worst computing time under Unix; the best computing time is obtained in Visual Basic, followed closely by C#.


PC5 had the worst performance on .NET, although its hardware configuration was better than that of the other platforms. A crucial factor for these results was the RAID configuration of its two hard drives, which was a mirroring RAID. Among the platforms that were the subject of this benchmark a winner can easily be spotted, but the issue is not that simple for the programming languages and the operating systems, where a winner could not be established. Every programming language and application tested performs better than the others at some points of the tests and is beaten by the others in other tests, or in the same test at another level.

In recent years, due to the slow evolution of processors, developers of compute-intensive applications have turned towards other types of processors, and graphics processors were taken into consideration. These were initially designed and developed for 3D rendering, video encoding and decoding, and game engines. Such software requires a large amount of computing power, which the processors of personal computers could not offer. Graphics card developers designed more and more powerful graphics processors and gave software developers the chance to write their own programs that use co-processing on CPU and GPU. Manavski Svetlin presents in [15] the results of using CUDA-compatible NVIDIA graphics cards as hardware accelerators for AES. His best result was on AES 128, for an 8 MB input file, with a performance of 8.28 Gbps; the GPU algorithm was 19.60 times faster than the CPU algorithm.

As future research, we shall try to implement the AES encryption algorithm so that it runs on a graphics unit, using CUDA. In a second stage, we will try to implement, optimize and integrate the algorithm in a software application like OpenSSL, so that it can be used in the application and can benefit from the acceleration of the graphics unit.

[Chart "Symmetric encryption, 1 GB file": time in seconds (axis from 0 to 250) on platforms PC1 to PC4, series AES, DES and 3DES.]

Figure 4.3 Symmetric encrypting (Unix) 1 GB file

[Chart "PC 2 SHA 256": time (axis from 0 to 450000) for file sizes from 1 GB to 10 GB, series VB, C# and OpenSSL.]

Figure 4.4 Linear increase SHA256 PC2

From the tests performed in Visual Basic and C# on platform PC5, it resulted that the AES algorithm needs the same computation time in both. In the case of DES, a difference of approximately 157 seconds between C# and Visual Basic can be noticed. 3DES in C# needs approximately 31 seconds more than in Visual Basic. In general, for symmetric algorithms the computing times on platform PC5 are noticeably longer than those of the other platforms, while for HMAC functions the run times are the shortest compared to the other platforms. Thus, it can be concluded that the read/write time of the storage media influences the performance of the cryptographic primitives.
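One way to examine the influence of the storage media is to account separately for the time spent reading the file and the time spent in the cryptographic primitive. The sketch below does this in Python with a hash function from the standard `hashlib` module, used here as a stand-in because the Python standard library does not provide AES or DES; the function name `split_timing` and the chunk size are illustrative assumptions, and this split was not part of the measurements reported in the paper.

```python
import hashlib
import time


def split_timing(path, algorithm="sha256", chunk_size=1 << 20):
    """Return (read_seconds, hash_seconds) for one pass over the file."""
    digest = hashlib.new(algorithm)
    read_seconds = 0.0
    hash_seconds = 0.0
    with open(path, "rb") as handle:
        while True:
            before_read = time.perf_counter()
            chunk = handle.read(chunk_size)
            after_read = time.perf_counter()
            read_seconds += after_read - before_read
            if not chunk:
                break
            digest.update(chunk)
            # Time elapsed since the read finished is attributed to hashing.
            hash_seconds += time.perf_counter() - after_read
    return read_seconds, hash_seconds
```

A large read share relative to the hashing share would point to the drive, rather than the processor, as the limiting factor on a given platform.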

REFERENCES
[1] SPEC benchmarks, 1998, www.spec.org
[2] Transactions Processing Council, 1998, www.tpc.org
[3] http://facultate.regielive.ro/referate/calculatoare/calitatii_in_industria_software-59516.html
[4] http://www.cs.utt.ro/~ioana/calc_par/c4.ppt
[5] http://revistaie.ase.ro/content/15/Pocatilu.pdf
[6] Solga M., Groza B., Evaluarea performanţelor computaţionale pentru funcţii criptografice simetrice şi asimetrice, pe platformele Windows şi Unix, 2008.
[7] Schneier B., Ferguson N., Schneier's Cryptography Classics Library: Practical Cryptography, 2003.
[8] McCabe T. J., A Testing Methodology Using the Cyclomatic Complexity Metric, Computer Systems Laboratory, National Institute of Standards and Technology, Gaithersburg, 1996.
[9] Anderson R., Security Engineering: A Guide to Building Dependable Distributed Systems, Second Edition, John Wiley and Sons, 2008.
[10] Groza B., Construcţii criptografice hibride, bazate pe tehnici simetrice şi asimetrice - aplicaţii în sisteme de conducere, Teza de Doctorat, Universitatea Politehnica din Timişoara, 2008, www.aut.upt.ro/~bgroza
[11] http://adi.ro/pub/scoala/Master/SemI/SecDate/Materiale%20CURS%202007/
[12] Schneier B., Applied Cryptography, 2nd ed., New York: John Wiley & Sons, 1996.
[13] Groza B., Introducere în Sistemele Criptografice cu Cheie Publică, Universitatea Politehnica din Timişoara, 2007, www.aut.upt.ro/~bgroza
[14] Mao W., Modern Cryptography: Theory and Practice (Hewlett-Packard Professional Books), Prentice Hall, 2003.
[15] Manavski S., CUDA Compatible GPU as an Efficient Hardware Accelerator for AES Cryptography, 2007, www.manavski.com

V. CONCLUSIONS AND FUTURE RESEARCH

The analysis of the benchmark results shows that the performance offered by the processors varies with the size of the input data, the algorithm, the memory characteristics, the programming language and the operating system. Regarding the tests on large data volumes, C# and Visual Basic have almost identical performance from 6 GB to 10 GB on AES; up to 6 GB, C# has a slight advantage (Fig. 4.3). OpenSSL behaves almost the same as Visual Basic up to 9 GB; beyond this size, the computing time for OpenSSL increases unexpectedly. This behaviour is characteristic of PC3, one of the slowest of the chosen platforms. For the hash algorithms (especially SHA256) on platform PC2, Visual Basic and OpenSSL obtained similar times over the entire pool, while C# had lower performance. PC2 has, in general, the best performance over the entire pool of tests. The surprise in these tests was PC5 under Windows 2000.
