A Flexible GridFTP Client for Scheduling of Big Data Transfers

2013 IEEE 16th International Conference on Computational Science and Engineering

Esma Yildirim
Department of Computer Engineering, Fatih University
Buyukcekmece, Istanbul, Turkey
[email protected]

Abstract—Big Data, generated in massive amounts by digital sources ranging from scientific instruments and business transactions to social networks, has changed the way we understand and handle data. It has caused the scientific and business communities, as well as governments, to focus on urgent technologies and policies that provide novel tools for the management, analysis, access and scheduling of Big Data. These tools have to be flexible and scalable enough to manage data at exa-scale with the help of data centers that hold thousands of compute and storage nodes interconnected with high-speed networks. In this study, we target the performance improvements that can be achieved by scheduling Big Data transfers, and we provide a flexible client based on GridFTP, a widely adopted and acclaimed protocol. The latest client provided by the Globus Toolkit project does not meet the needs of highly intelligent, optimized data transfer algorithms. With this flexible client, developers can implement various kinds of scheduling algorithms and apply optimization techniques such as pipelining, parallelism and concurrency in far less restricted use cases. The ability to enqueue, dequeue, combine, sort and divide data transfers into groups makes it easy to apply these techniques, resulting in throughput improvements on high-speed networks. The client was used to implement two different algorithms that exploit these abilities, and both provided performance improvements compared to baseline GridFTP and optimized UDT results.
Keywords-Big Data; data scheduling; GridFTP; client; throughput optimization

I. INTRODUCTION

Big Data refers to large, diverse, complex, longitudinal and distributed datasets generated from various sources such as scientific instruments, sensors, internet transactions, email, video, click streams and all other digital sources available today or in the future [1]. Large scientific experiments, such as environmental and coastal hazard prediction [2], climate modeling [3], genome mapping [4], and high-energy physics simulations [5], [6], generate data volumes reaching hundreds of terabytes per year [7]. Data collected from remote sensors and satellites, dynamic data-driven applications, digital libraries and preservations are also producing extremely large datasets for real-time or offline processing [8], [9]. This data deluge in scientific applications necessitates collaboration and sharing among national and international education and research institutions, which results in frequent large-scale data movement across widely distributed sites. These datasets are only getting bigger, making them harder to manage and share with other researchers. A very similar trend is seen in commercial applications as well. According to a recent study by Forrester Research [10], 77% of the 106 large organizations that operate two or more data centers run regular backup and replication applications among three or more sites. Also, more than 50% of them have over one petabyte of data in their primary data center and expect their inter-datacenter throughput requirements to double or triple over the next couple of years [11]. As a result, Google is now deploying a large-scale inter-datacenter copy service [12], and background traffic has become an important component of Yahoo's aggregate inter-datacenter traffic. Several optical networking initiatives, such as ESnet [13], GEANT2 [14], Internet2 [15], LONI [16], the ANI testbed [17] and XSEDE [18], provide high-speed network connectivity reaching up to 100Gbps to their users to mitigate the data bottleneck. Unfortunately, the majority of users fail to obtain even a fraction of the theoretical speeds promised by these networks due to issues such as sub-optimal protocol tuning, inefficient end-to-end routing, and disk and CPU performance bottlenecks at the sending and receiving sites. Many different transfer protocols have been developed [19], [20], [21], [22], [23], [24] in the transport and application layers to move large amounts of data at high speeds. GridFTP [24] is one of the most advanced and widely adopted protocols used in high-speed networks to move Big Data around. The rate-based protocol UDT [21] is available in GridFTP as an alternative to the underlying TCP protocol; however, UDT has problems filling the bandwidth of high-speed local or metropolitan area networks. Another UDP-based protocol, UFTP [23], is designed to perform well for multicast transfers. FASP, a very successful rate-based protocol used by Aspera [22], compares itself only to the traditional FTP protocol, and there is no study that compares it to the advanced high-speed transfer features of GridFTP. An important issue in Big Data transport, in addition to network and end-system bottlenecks, is the characteristics of the dataset.

Unfortunately, there are very few studies that pay attention to that dimension [25]. Transferring large datasets that consist of many small files introduces additional bottlenecks (e.g., the transport protocol not reaching full network utilization due to short-duration transfers, and connection startup/teardown overhead) on top of those already present when transferring large files (e.g., protocol inefficiency and end-system limitations). GridFTP offers users and developers several optimization techniques to overcome these bottlenecks and increase the total throughput of their data transfers. The protocol is also an essential part of the data schedulers that coordinate data transfers in Grid and Cloud environments [26], [27]. Three of the most powerful features provided by GridFTP are pipelining, parallelism and concurrency. With pipelining, a single control and data channel is used for all files, and the files are sent back-to-back without waiting for the transfer-complete acknowledgement of the previous transfer. With parallelism, different portions of the same file are sent through different data channels, while with concurrency, multiple files are sent through different channels at the same time. Setting the optimal values for these parameters is extremely hard. Unfortunately, the current GridFTP client, globus-url-copy, does not provide the flexibility to change them dynamically for a single dataset. However, it is very important for data scheduling algorithms to be able to reorder, combine and divide datasets and to apply different parameter settings in order to find the optimal schedule. In this study, we analyze the current needs of scalable data scheduling algorithms and develop a client based on the GridFTP client API. This new client allows the division, reordering, combining, enqueuing and dequeuing of dataset transfers and is able to apply different pipelining, parallelism and concurrency settings to different parts of a dataset. Adaptive, optimization-based scheduling algorithms are easily implemented on top of it, and their impact on transfer throughput is measured. Two data scheduling algorithms are implemented using the client API and tested on the FutureGrid testbed [28]. Both algorithms dynamically alter GridFTP parameters and achieve performance improvements that could not be obtained otherwise without knowing the right combination of parameter settings.

II. RELATED WORK

The file transfer scheduling problem dates back to the 1980s. One of the earliest works, published in 1983 [29], proposes list scheduling algorithms in which the transfers are ordered in a list from the largest task to the smallest. This idea has been used in many other algorithms with the addition of extra parameters to sort the transfers [30], [31], [32], [33]. The bandwidth of the path and the size of the file are used to estimate how long a file transfer task will take. This ordering usually gives a near-optimal solution, while other methods, such as integer programming, are applied to find the optimal solution [34].

However, such methods are not feasible in practice, since the whole data transfer graph and the durations of the transfers need to be known beforehand. Other types of scalable algorithms also exist, in which datasets are transferred from multiple replicas residing on different sources [34]. In some cases these datasets are divided so that they can be sent over different paths to make use of additional network bandwidth [35]; in other cases, file transfers on the same path are grouped and sent together [9]. Adaptive algorithms exist [36], [37] in which each file is divided and transferred over multiple streams, or multiple file transfers are started at the same time using the notions of concurrency and parallelism. The levels of concurrency and parallelism are changed continuously based on the currently achieved transfer throughput. Optimization algorithms, on the other hand, do not change these numbers continuously; instead, using a small number of adaptive samplings, they apply mathematical models to find the optimal values of parallelism, pipelining and concurrency [38], [39]. Most of the algorithms presented above are evaluated theoretically or implemented specifically for the needs of the algorithm using protocols such as GridFTP. Modern data schedulers such as Globus Online [27] and Stork [26] queue data transfers and apply their own algorithms to the file transfers. Both provide the GridFTP protocol as the baseline transfer service. Globus Online offers data management capabilities to users as hosted Software-as-a-Service (SaaS) and manages fire-and-forget file transfers for Big Data through thin clients over the Internet. However, it does not provide any optimization capabilities: it sets pipelining, parallelism and concurrency statically for three groups of datasets (average file size less than 50MB, greater than 250MB, and in between). Stork, on the other hand, applies the optimization model in [38] for single file transfers, and it is also possible to set a static concurrency level for tasks submitted to the scheduler; each transfer task is started immediately based on this level. This concurrency notion is different from Globus Online's, where a group of similar transfers is started at the same time. We understand that, by using different scheduling and optimization techniques, data transfer throughput can be increased dramatically if an optimal or near-optimal solution is found. Based on the methods described above, a practical client API for implementing data scheduling algorithms should allow dataset transfers to be:

A. enqueued and dequeued;
B. sorted based on a property;
C. divided into and combined from chunks;
D. grouped by source-destination paths;
E. performed from multiple replicas.
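As a rough illustration of these five capabilities, the following is a minimal sketch of what such a client-side interface could look like in C. It is hypothetical: the names (file_entry_t, chunk_t, enqueue_chunk, and so on) are illustrative assumptions and are not part of GridFTP or of the client described in Section III.

    /* Hypothetical sketch of capabilities A-E for a scheduling-friendly
     * transfer client. Names and types are illustrative only. */
    #include <stddef.h>

    typedef struct {
        char   *source_url;   /* full source path, possibly one of several replicas */
        char   *dest_url;     /* full destination path */
        size_t  size;         /* file size in bytes, used for sorting and chunking */
    } file_entry_t;

    typedef struct {
        file_entry_t *files;  /* files belonging to this chunk */
        size_t        count;
    } chunk_t;

    typedef struct transfer_queue transfer_queue_t;   /* opaque FIFO of chunks */

    /* A. enqueue and dequeue dataset transfers */
    int      enqueue_chunk(transfer_queue_t *q, chunk_t *c);
    chunk_t *dequeue_chunk(transfer_queue_t *q);

    /* B. sort by an arbitrary property (e.g., file size) */
    void sort_chunk(chunk_t *c,
                    int (*cmp)(const file_entry_t *, const file_entry_t *));

    /* C. divide a chunk into smaller chunks, or combine several into one */
    int divide_chunk(const chunk_t *in, size_t files_per_chunk,
                     chunk_t **out, size_t *n_out);
    int combine_chunks(const chunk_t *in, size_t n_in, chunk_t *out);

    /* D. group entries that share the same source-destination path pair */
    int group_by_path(const chunk_t *in, chunk_t **groups, size_t *n_groups);

    /* E. switch an entry to a different replica location */
    int set_replica(file_entry_t *f, const char *replica_url);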

III. IMPLEMENTATION DETAILS

One of the most important shortcomings of the current GridFTP client is that, although it allows parallelism and concurrency to be set statically and applies its own default pipelining value for a directory transfer consisting of a large number of files, it does not allow these settings to be changed during the course of the transfer (pp = pipelining, p = parallelism, cc = concurrency):

globus-url-copy -pp -p 5 -cc 4 src_url dest_url

A list of source and destination URL pairs can also be given as a file-list parameter; however, the benefits of pipelining cannot then be exploited, as the developers indicate:

globus-url-copy -pp -p 5 -cc 4 -f filelist.txt

The client presented in this study provides flexible data structures and functions designed to allow developers to achieve the five goals laid out in the previous section. The main data structure, globus_file_t, holds file information such as the full source and destination paths, the file name and the file size. The file size information can be used for constructing data chunks based on total size, for calculating throughput, and for estimating how long a transfer will take given the bandwidth. The source and destination paths are necessary for combining and dividing the dataset as well as for changing the source based on available replica locations, and the file name is used to reconstruct the full paths. This data structure is used with the list_files function: for a given source and destination URL, list_files contacts the GridFTP server, gathers the file names at the source URL and fills the file list structure. When this function returns, the information about all of the files at the source URL has been collected, along with the number of files. In this way, it is possible to divide, combine, sort, enqueue and dequeue file transfers and to alter their path information. The perform_transfer function is the most important function in this client; it carries out the actual dataset transfer with the given optimization parameters. Rather than operating on the source and destination URLs of a directory as globus-url-copy does, it manipulates the file list based on the given optimization parameters, which set the pipelining, parallelism and concurrency levels. Based on the concurrency level, different pipeline queues are set up, their parallelism levels are set, and the transfers in each queue are carried out at the same time, each by a different handle (a GridFTP client structure responsible for a connection), with as many handles as the concurrency level. The pipelining level can then be set separately for each queue. The chunk size is the total size of the file array and is used in calculating the throughput value. With these basic functions, it is easy to implement auxiliary functions that manipulate the file list structure, such as combining, dividing, sorting and altering. In the following section, we make use of these abilities in different algorithms.
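To make the preceding description concrete, the sketch below shows one possible C layout for globus_file_t and for the list_files and perform_transfer entry points as described above. The exact field names, argument lists and the transfer_opts_t type are assumptions made for illustration; the actual signatures of the client are not reproduced here.

    /* Sketch of the client's core structures as described in the text.
     * Field and parameter names are illustrative assumptions. */
    #include <stddef.h>

    typedef struct {
        char  *source_path;   /* full source path */
        char  *dest_path;     /* full destination path */
        char  *file_name;     /* used to reconstruct full paths */
        size_t size;          /* bytes; used for chunking and throughput math */
    } globus_file_t;

    typedef struct {
        globus_file_t *files; /* all files found at the source URL */
        size_t         count; /* number of files */
    } file_list_t;

    typedef struct {
        int pp;               /* pipelining level per queue */
        int p;                /* parallelism (streams per file) */
        int cc;               /* concurrency (simultaneous handles/queues) */
    } transfer_opts_t;

    /* Contact the GridFTP server, list the files under src_url and fill `list`. */
    int list_files(const char *src_url, const char *dest_url, file_list_t *list);

    /* Transfer `count` files starting at `files`, distributing them over `cc`
     * pipelined queues, each with parallelism `p` and pipelining `pp`.
     * Returns the measured throughput for the chunk (e.g., in Mbps). */
    double perform_transfer(globus_file_t *files, size_t count,
                            const transfer_opts_t *opts);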

IV. ALGORITHMS

In this section, two different algorithms are presented, implemented using the data structures and functions described in Section III. The first is a purely adaptive algorithm that dynamically changes the concurrency level of the dataset transfer, while the second is a hybrid of an adaptive and an optimization algorithm based on mathematical models.

A. Adaptive Concurrency Algorithm

The adaptive algorithm takes a file list structure returned by list_files and the number of files per chunk as input. It divides the file list into chunks, each containing the specified number of files. Starting with a concurrency level of 1, it performs the first chunk transfer and records the returned throughput value. It then increases the concurrency level for subsequent chunk transfers as long as the measured throughput of each chunk transfer is greater than that of the previous one; if the throughput decreases, the concurrency level is decreased as well. Changing the concurrency level based on changes in throughput allows the algorithm to adapt to network variations and to achieve better performance through concurrent transfers. Our client provides the ability to divide dataset transfers into chunks and to apply adaptively changing concurrency levels.

Algorithm 1 Adaptive CC
Require: list_of_files, no_of_files_in_a_chunk, total_number_of_files
  cc <- 1
  while there are more chunks do
    Perform transfer
    if curr_thr < prev_thr and cc >= 2 then
      cc <- cc / 2
    else
      cc <- cc * 2
    end if
  end while
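A minimal sketch of how this adaptive loop could be driven through the client is given below. It reuses the hypothetical file_list_t, transfer_opts_t and perform_transfer from the previous sketch; chunk boundaries and the throughput units are assumptions.

    /* Adaptive concurrency (Algorithm 1), sketched against the hypothetical
     * perform_transfer() from the previous listing. */
    void adaptive_cc_transfer(file_list_t *list, size_t files_per_chunk)
    {
        transfer_opts_t opts = { .pp = 1, .p = 1, .cc = 1 };  /* start with cc = 1 */
        double prev_thr = 0.0;
        size_t offset = 0;

        while (offset < list->count) {
            size_t n = list->count - offset;
            if (n > files_per_chunk)
                n = files_per_chunk;                /* next chunk of the dataset */

            double curr_thr = perform_transfer(list->files + offset, n, &opts);

            /* Double cc while throughput keeps improving; halve it on a drop. */
            if (curr_thr < prev_thr && opts.cc >= 2)
                opts.cc /= 2;
            else
                opts.cc *= 2;

            prev_thr = curr_thr;
            offset  += n;
        }
    }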

B. PP-CC-P Optimization Algorithm

The second algorithm aims to find the optimal values of pipelining, concurrency and parallelism, first through minimal samplings, and then transfers the rest of the dataset with these values. According to Yildirim et al. [39], pipelining is beneficial for file transfers whose sizes are smaller than the bandwidth-delay product (BDP) of the network, while parallelism does not help for small files. The optimal pipelining level for small files is approximately:

    opt_pp ≈ BDP / mean_file_size + 1    (1)

For larger files, a static value of pp = 2 is enough.
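As a rough worked example of Eq. (1), consider a path like the testbed used in Section V (1Gbps with about 80ms of delay): BDP ≈ 10^9 b/s × 0.08 s / 8 ≈ 10MB, so for a dataset with a mean file size of 1MB the formula suggests opt_pp ≈ 10/1 + 1 = 11, i.e., roughly eleven files kept in flight back-to-back on each channel.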


Algorithm 2 Optimal PP
Require: list_of_files, start_index, end_index, total_number_of_files, min_chunk_size, parent_pp, max_pp
  Calculate mean_file_size
  Calculate current_opt_pp
  Calculate mean_file_size_index
  if current_opt_pp != 1 and current_opt_pp != parent_pp and current_opt_pp <= max_pp
     and start_index < mean_file_size_index < end_index
     and current_chunk_size > 2 * min_chunk_size then
    call OPTIMAL_PP on the chunk divided at mean_file_size_index (start_index -> mean_index)
    call OPTIMAL_PP on the chunk divided at mean_file_size_index (mean_index + 1 -> end_index)
  else
    opt_pp <- parent_pp
  end if

Algorithm 3 Optimal CC
Require: list_of_files, total_number_of_files, min_chunk_size, BDP, min_no_of_chunks
  Sort the files
  Create chunks by applying the OPTIMAL_PP algorithm
  while number of chunks < min_no_of_chunks do
    Divide the largest chunk
  end while
  curr_cc <- 1
  prev_thr <- 0
  Perform transfer for the first chunk
  while current throughput > previous throughput do
    Perform transfer for the subsequent chunk
    curr_cc <- curr_cc * 2
  end while
  opt_cc <- prev_cc
  Combine chunks with the same pp values
  Perform transfer for the rest of the chunks with optimal pp and cc
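The recursive division in Algorithm 2 could be sketched in C roughly as follows. It reuses the globus_file_t type from the earlier sketch, assumes the file array is sorted by ascending size, and picks mean_index as the last position whose file size does not exceed the chunk's mean size; all names and the reporting via printf are illustrative assumptions, not the client's actual implementation.

    /* Recursive chunk division by optimal pipelining (a sketch of Algorithm 2).
     * Assumes files[start..end] is sorted by ascending size; illustrative only. */
    #include <stddef.h>
    #include <stdio.h>
    /* globus_file_t as sketched in Section III */

    void optimal_pp(const globus_file_t *files, size_t start, size_t end,
                    size_t min_chunk_size, int parent_pp, int max_pp, size_t bdp)
    {
        size_t chunk_size = 0;
        for (size_t i = start; i <= end; i++)
            chunk_size += files[i].size;

        size_t mean_size = chunk_size / (end - start + 1);
        int    curr_pp   = (int)(bdp / mean_size) + 1;     /* Eq. (1) */
        if (curr_pp > max_pp)
            curr_pp = max_pp;                               /* respect the pp cap */

        /* last index whose file size is at most the mean size */
        size_t mean_index = start;
        while (mean_index < end && files[mean_index + 1].size <= mean_size)
            mean_index++;

        if (curr_pp != 1 && curr_pp != parent_pp &&
            mean_index > start && mean_index < end &&
            chunk_size > 2 * min_chunk_size) {
            /* keep dividing: left sub-chunk holds smaller files, right holds larger */
            optimal_pp(files, start, mean_index, min_chunk_size, curr_pp, max_pp, bdp);
            optimal_pp(files, mean_index + 1, end, min_chunk_size, curr_pp, max_pp, bdp);
        } else {
            /* stopping condition reached: this chunk keeps its parent's pipelining
             * level (for the top-level call, parent_pp would be the Eq. (1) value
             * computed for the whole dataset) */
            printf("chunk [%zu..%zu] -> pp = %d\n", start, end, parent_pp);
        }
    }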

Considering that the dataset at hand consists of files smaller than the BDP, setting different pipelining levels for different groups of chunks with different mean file sizes is a promising way to increase performance. To divide the dataset into chunks and assign an optimal pipelining level to each chunk, a recursive division algorithm is applied (Algorithm 2). The dataset is divided recursively until the optimal pipelining level of a divided chunk becomes equal to that of its parent chunk. There are also other stopping conditions. First, a chunk cannot be smaller than the provided minimum chunk size; this ensures that chunks do not become very small, because larger chunks tend to achieve better throughput. Also, in order not to overwhelm the GridFTP server with back-to-back transfer requests, a maximum pipelining level is given, and no calculated pipelining value can exceed it. Once the chunks are divided and optimal pipelining values are assigned, an adaptive concurrency scheme is applied to the chunks; during the course of the transfers, further division or combination actions may be taken. The next algorithm (Algorithm 3) finds the optimal concurrency level in addition to the optimal pipelining level using an adaptive strategy. It takes the file list, the number of files, the minimum chunk size, the BDP and the minimum number of chunks as input. The first step sorts the files in ascending order of size. Then the OPTIMAL PP algorithm is applied and the file list is recursively divided, with each chunk being assigned its pp value. If the number of created chunks is less than the minimum-number-of-chunks parameter, the largest chunk is divided until there are enough chunks available to apply an adaptive concurrency strategy.

Each chunk is transferred with its optimal pipelining and the current concurrency value, and subsequent chunks are transferred with exponentially increasing concurrency values until the measured throughput starts to drop. At that point the previous concurrency level is chosen as optimal and the rest of the chunks are transferred with this value. Before that, however, if chunks with the same pp value exist, they are combined so that chunk transfers with the same pp and cc values are not conducted separately. This algorithm realizes several of the goals mentioned in Section II, such as sorting, dividing and combining datasets. For datasets with file sizes greater than the BDP, a similar algorithm is applied to add parallelism. In this case the optimal pipelining level is set statically to 2 and a new chunk is created with a precalculated minimum chunk size. The parallelism level is first increased as in Algorithm 3, and after the optimal parallelism value is found, the concurrency value is increased adaptively until an optimal value is found. The rest of the dataset is transferred with these optimal values.

V. EXPERIMENTS

The implemented algorithms, together with the introduced flexible client, are tested on the FutureGrid [28] testbed between two clusters called Sierra and Hotel. This testbed has a wide-area high-speed 1Gbps network, and the delay between the two clusters is around 80ms. The theoretical speed for transferring a single large file that is able to fill the network pipe is around 800-900Mbps. However, it is not possible to state an upper theoretical limit for a set of small files, because it changes based on the characteristics of the dataset. The clusters have the GPFS file system installed.


[Figure 1. 4000 1MB files. a) Adaptive-CC throughput (Mbps) per chunk, b) Adaptive-CC concurrency level per chunk, c) PP-CC-P optimal algorithm chunk size (bytes) per pp-p-cc setting, d) PP-CC-P optimal algorithm throughput (Mbps) per pp-p-cc setting; curves compare the GridFTP fast-pp baseline, optimized UDT, and 250/500/1000 files-in-a-chunk configurations.]

For each experiment, a 4000-file dataset is generated within the allowed disk quota. There are two different cases in terms of the characteristics of the dataset. In the first case, 4000 1MB files are generated, while in the second the dataset consists of 4000 random-size files; the maximum file size in this case is 2MB and the sizes are uniformly distributed, resulting in a mean file size of 1MB. Small file sizes are chosen because pipelining and concurrency are most useful for these types of transfers, which take the form of a large number of small files; most of the problems do not occur for large file transfers. We compare the algorithms' throughput to the baseline GridFTP throughput with data channel caching and the default pipelining level set. We also implemented an optimized UDT client/server, similar to GridFTP data channel caching, for transferring large numbers of small files, in which a single connection is opened and the files are sent back-to-back over the same channel. The results for both cases are presented in Figures 1 and 2. According to Figure 1, the Adaptive-CC algorithm tested with chunk sizes of 250, 500 and 1000 files results in higher concurrency levels and higher throughput values for larger chunk sizes, while for smaller chunks the concurrency level stabilizes around 16-32, resulting in lower throughput values (Figures 1.a and 1.b). The highest throughput achieved is around 750Mbps. In all cases a performance improvement over baseline GridFTP can be observed, and the 500- and 1000-file-chunk cases surpass the performance of the optimized UDT.

For the optimization algorithm (Figures 1.c and 1.d), the x-axis represents the pipelining (pp), parallelism (p) and concurrency (cc) values set for each chunk transfer. The optimal pipelining value is set to 4, and the concurrency level is increased as long as the throughput continues to increase with it. The maximum throughput achieved is around 800Mbps, and the algorithm eventually surpasses the optimized UDT performance once it finds the optimal settings. The chunks are recursively divided into equal lengths, and performance increases as the throughput values adapt. For the second case (Figure 2), since the file sizes are generated randomly, the throughput results are more unstable and lower compared to the first case; the highest throughput achieved is around 500Mbps. For the Adaptive-CC algorithm, concurrency levels are elevated but the throughput differences are less significant (Figure 2.a), resulting in a wavy behavior. The baseline GridFTP is outperformed in most chunk transfers, and some of the 250-file chunks outperform the optimized UDT results. For the optimization algorithm, since the mean file size is less than the BDP, chunks are recursively divided and different optimal pipelining levels are assigned to each chunk (Figures 2.c and 2.d). The differences between the chunk sizes are larger in this case due to the nature of the algorithm (chunks of bigger files tend to have smaller pipelining values, resulting in no further division). The baseline GridFTP is outperformed much more quickly than the optimized UDT, but at the optimal settings the algorithm is more successful than both UDT and baseline GridFTP.


[Figure 2. 4000 randomly generated files (size range 1 Byte-2MB). a) Adaptive-CC throughput (Mbps) per chunk, b) Adaptive-CC concurrency level per chunk, c) PP-CC-P optimal algorithm chunk size (bytes) per pp-p-cc setting, d) PP-CC-P optimal algorithm throughput (Mbps) per pp-p-cc setting; same curves as in Figure 1.]

The highest throughput achieved by the algorithms may change depending on the characteristics of the dataset (e.g., file size and chunk size), but in the end a significant performance improvement is observed in all cases. The protocol bottlenecks are eliminated for transfers of large numbers of small files, and the optimization algorithm, at its optimal settings, performs better than all of the other cases. The algorithms were easy to implement thanks to the flexibility of the proposed client functionality, and further algorithms could be implemented just as easily.

VI. CONCLUSIONS

A flexible GridFTP client has been implemented that is able to accommodate data scheduling algorithms of different natures designed to move Big Data around. The new client overcomes the limitations of the current GridFTP client and allows easy implementation of various kinds of algorithms by allowing optimized values to be set for pipelining, parallelism and concurrency. The adaptive and optimization algorithms implemented with the new client sort, divide and combine datasets easily and improve the achieved throughput. The current structures and functions of the client allow enqueuing and dequeuing of dataset transfers and alteration of source and destination paths. As future work, these auxiliary functions will be implemented in a generic way and, to be able to download from multiple replicas at the same time, the GridFTP client API will be altered so that it is thread-safe. Reusing open connections is another future goal, to reduce the overhead of the client.

ACKNOWLEDGMENT

This material is based upon work supported in part by the National Science Foundation under Grant No. 0910812 to Indiana University for "FutureGrid: An Experimental, High-Performance Grid Test-bed." Partners in the FutureGrid project include U. Chicago, U. Florida, San Diego Supercomputer Center - UC San Diego, U. Southern California, U. Texas at Austin, U. Tennessee at Knoxville, U. of Virginia, Purdue U., and T-U. Dresden.

REFERENCES

[1] National Science Foundation, "Core Techniques and Technologies for Advancing Big Data Science & Engineering (BIGDATA)," Tech. Report.


[2] R. J. T. Klein, R. J. Nicholls, and F. Thomalla, "Resilience to natural hazards: How useful is this concept?" Global Environmental Change Part B: Environmental Hazards, vol. 5, no. 1-2, pp. 35-45, 2003.
[3] J. Kiehl, J. J. Hack, G. B. Bonan, B. A. Boville, D. L. Williamson, and P. J. Rasch, "The National Center for Atmospheric Research Community Climate Model: CCM3," Journal of Climate, vol. 11, no. 6, pp. 1131-1149, 1998.
[4] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, "Basic Local Alignment Search Tool," Journal of Molecular Biology, vol. 215, no. 3, pp. 403-410, October 1990.
[5] A Toroidal LHC ApparatuS Project (ATLAS). [Online]. Available: http://atlas.web.cern.ch/


[6] CMS: The US Compact Muon Solenoid Project. [Online]. Available: http://uscms.fnal.gov/
[7] T. Hey and A. Trefethen, "The data deluge: An e-science perspective," in Grid Computing - Making the Global Infrastructure a Reality. Wiley and Sons, 2003, pp. 809-824.
[8] E. Ceyhan and T. Kosar, "Large scale data management in sensor networking applications," in Proceedings of Secure Cyberspace Workshop, Shreveport, LA, November 2007.
[9] S. Tummala and T. Kosar, "Data management challenges in coastal applications," Journal of Coastal Research, Special Issue No. 50, pp. 1188-1193, 2007.

[25] J. Bresnahan, M. Link, R. Kettimuthu, D. Fraser, and I. Foster, "GridFTP pipelining," in TeraGrid Conference, November 2007.
[26] T. Kosar, M. Balman, E. Yildirim, S. Kulasekaran, and B. Ross, "Stork data scheduler: Mitigating the data bottleneck in e-science," Philosophical Transactions of the Royal Society A, vol. 369, no. 1949, pp. 3254-3267, 2011.
[27] B. Allen, J. Bresnahan, L. Childers, I. Foster, G. Kandaswamy, R. Kettimuthu, J. Kordas, M. Link, S. Martin, K. Pickett, and S. Tuecke, "Software as a service for data scientists," Communications of the ACM, vol. 55, no. 2, pp. 81-88, 2012.

[10] Forrester Research, "The Future of Data Center Wide-Area Networking," info.infineta.com/l/5622/2011-01-27/Y26.

[28] “Futuregrid testbed,” http://www.futuregrid.org.

[11] N. Laoutaris, M. Sirivianos, X. Yang, and P. Rodriguez, “Inter-datacenter bulk transfers with netstitcher,” in ACM SIGCOMM, 2011, pp. 74–85.

[29] E. G. Coffman, Jr., M. R. Garey, D. S. Johnson, and A. S. LaPaugh, "Scheduling file transfers in a distributed network," in PODC, R. L. Probert, N. A. Lynch, and N. Santoro, Eds. ACM, 1983, pp. 254-266.

[12] D. Ziegler, "Distributed Peta-Scale Data Transfer," www.cs.huji.ac.il/~dhay/IND2011.html.

[13] Energy Sciences Network (ESnet). [Online]. Available: http://www.es.net/
[14] GEANT2 network. [Online]. Available: http://www.geant2.net/
[15] Internet2. [Online]. Available: http://www.internet2.edu/
[16] Louisiana Optical Network Initiative (LONI). [Online]. Available: http://www.loni.org/
[17] ARRA/ANI testbed. [Online]. Available: https://sites.google.com/a/lbl.gov/ani-100g-network
[18] TeraGrid/XSEDE. [Online]. Available: http://www.xsede.org/
[19] S. Floyd, "RFC 3649: HighSpeed TCP for large congestion windows."
[20] C. Jin, D. X. Wei, S. H. Low, G. Buhrmaster, J. Bunn, D. H. Choe, R. L. A. Cottrell, J. C. Doyle, W. Feng, O. Martin, H. Newman, F. Paganini, S. Ravot, and S. Singh, "FAST TCP: from theory to experiments," IEEE Network, vol. 19, no. 1, pp. 4-11, Feb. 2005.
[21] "UDT: UDP-based data transfer," http://udt.sourceforge.net/.
[22] Aspera: Moving the world's data at maximum speed. [Online]. Available: http://asperasoft.com

[30] A. Giersch, Y. Robert, and F. Vivien, "Scheduling tasks sharing files from distributed repositories," in Euro-Par, ser. Lecture Notes in Computer Science, M. Danelutto, M. Vanneschi, and D. Laforenza, Eds., vol. 3149. Springer, 2004, pp. 246-253.
[31] G. Khanna, Ü. V. Çatalyürek, T. M. Kurç, P. Sadayappan, and J. H. Saltz, "Scheduling file transfers for data-intensive jobs on heterogeneous clusters," in Euro-Par, ser. Lecture Notes in Computer Science, A.-M. Kermarrec, L. Bougé, and T. Priol, Eds., vol. 4641. Springer, 2007, pp. 214-223.
[32] M. Hu, W. Guo, and W. Hu, "Dynamic scheduling algorithms for large file transfer on multi-user optical grid network based on efficiency and fairness," in ICNS, J. L. Mauri, V. C. Giner, R. Tomas, T. Serra, and O. Dini, Eds. IEEE Computer Society, 2009, pp. 493-498.
[33] Y. Lin and Q. Wu, "On design of bandwidth scheduling algorithms for multiple data transfers in dedicated networks," in ANCS, M. A. Franklin, D. K. Panda, and D. Stiliadis, Eds. ACM, 2008, pp. 151-160.

[23] B. Schuller and T. Pohlmann, "UFTP: High-performance data transfer for UNICORE," in Proceedings of UNICORE Summit, July 2011.

[34] G. Khanna, Ü. V. Çatalyürek, T. M. Kurç, R. Kettimuthu, P. Sadayappan, and J. Saltz, "A dynamic scheduling approach for coordinated wide-area data transfers using GridFTP," in IPDPS. IEEE, 2008, pp. 1-12.

[24] W. Allcock, J. Bresnahan, R. Kettimuthu, M. Link, C. Dumitrescu, I. Raicu, and I. Foster, "The Globus striped GridFTP framework and server," in SC '05: Proceedings of the 2005 ACM/IEEE Conference on Supercomputing. Washington, DC, USA: IEEE Computer Society, 2005, p. 54.

[35] G. Khanna, Ü. V. Çatalyürek, T. M. Kurç, R. Kettimuthu, P. Sadayappan, I. T. Foster, and J. H. Saltz, "Using overlays for efficient data transfer over shared wide-area networks," in SC. IEEE/ACM, 2008, p. 47.


[36] W. Liu, B. Tieman, R. Kettimuthu, and I. Foster, "A data transfer framework for large-scale science experiments," in Proc. 3rd International Workshop on Data Intensive Distributed Computing (DIDC '10), in conjunction with the 19th International Symposium on High Performance Distributed Computing (HPDC '10), Jun. 2010.
[37] M. Balman and T. Kosar, "Dynamic adaptation of parallelism level in data transfer scheduling," in CISIS, L. Barolli, F. Xhafa, and H.-H. Hsu, Eds. IEEE Computer Society, 2009, pp. 872-877.
[38] E. Yildirim, D. Yin, and T. Kosar, "Prediction of optimal parallelism level in wide area data transfers," IEEE Transactions on Parallel and Distributed Systems, vol. 22, no. 12, pp. 2033-2045, 2011.
[39] E. Yildirim, J. Kim, and T. Kosar, "How GridFTP pipelining, parallelism and concurrency work: A guide for optimization of large dataset transfers," in Proceedings of the Network-Aware Data Management Workshop (NDM 2012), November 2012.
