
DESIGNING VIDEO ENCODING FOR EFFICIENT VIDEO PROCESSING BUSINESS SERVICES

Agustinus Biotamalo Lumbantoruan, Swiss German University, EduTown BSD City, Tangerang 15339, Indonesia

Charles Lim, Department of Information Technology, Swiss German University, EduTown BSD City, Tangerang 15339, Indonesia

ABSTRACT Businesses that provide video processing services always face the challenge of speeding up their video-related services so that users come back for more. Video encoding is one of the most time-consuming processes in video-related services. A simple First Come First Serve algorithm is proposed to distribute tasks between servers. Several file transfer methods are explored and shown to provide good performance results for the overall user experience of the services provided. This paper also presents experimental results on both uniform and heterogeneous hardware platforms. At the end of the paper, hardware and software recommendations are presented so that business people can choose the best solution for video-related services on the Internet.

Keywords: Distributed server, video encoder, first come first serve, scheduler.

INTRODUCTION Many aspects affect the performance of video on demand: video file upload, video storage, video processing or encoding, and video streaming. This paper focuses on the video processing aspect. Video encoding is the process of preparing the video for output, where the digital video is encoded to meet proper formats and specifications for recording and playback through the use of video encoder software [1]. At a video on demand service provider, uploaded videos need to be processed or encoded into various formats before they go public. Video encoding takes time, especially if each video uploaded by users is encoded into multiple formats. A method is needed to make the video encoding process faster. The method this paper proposes is to distribute the videos across several encoder servers. After the job batches are distributed, each distributed video encoder server processes each job into multiple formats simultaneously. This method is tested on uniform and heterogeneous hardware platforms. The tests show which kind of hardware platform this solution fits.

Processors keep getting more advanced and faster. This improvement can reduce the time it takes to process a video. This paper exploits the performance of a recent processor by processing a video into two different resolutions with different video bitrates and the same format, MP4. Video bitrate is the rate at which video data is transferred; a higher bitrate means more data is included in a given audio/visual interval [2]. The purpose of this paper is to provide insight into how to accelerate video processing using a distributed method. In addition, this paper shares points that must be considered when applying a distributed video encoding system in heterogeneous and uniform environments.

BACKGROUND Networks are becoming faster, and distributed processors can be more tightly integrated. Individual computers are also becoming more powerful, which means that computer grids are able to solve increasingly complex problems [3]. As individual computers become more powerful, this improvement contributes positively to video encoding performance. However, a network of computers equipped with uniform processors can yield a different result from a network of computers equipped with heterogeneous processors. The result is influenced by two factors: the processing capability of each computer and the size of the job batch produced by the scheduler. The scheduler's job is to create jobs based on a policy. First Come First Serve is the policy used in this paper, chosen for its simplicity and ease of implementation.

RESEARCH METHODOLOGY This research begins with a literature review of books, papers and internet sources about schedulers, First Come First Serve, uniform and heterogeneous processor environments, and video encoders. The objective of this literature review is to understand the fundamentals of scheduling and of the processor environment.

SCHEDULER The scheduler is the part of the distributed video encoder system responsible for building a set of jobs and finding an available distributed video encoder server to execute the new set of jobs. The scheduler looks up a distributed video encoder and sends the jobs through the encoder’s assigned port.

There are two types of scheduling algorithms: preemptive and non-preemptive. In a non-preemptive schedule, the processing of a task on a given processor cannot be suspended until its completion [4]. In a preemptive schedule, the execution of a task on a given processor may be suspended before its completion [5]. The current scheduler runs on a Windows operating system and is programmed in Java 1.6.

First Come First Serve First Come First Serve is largely self-explanatory: processes are executed in the order they arrive, and each is executed to completion [4, http://www.cs.nott.ac.ud/~gxk/courses/g53ops/Scheduling/sched04-fcfs.html].

Figure 1. First Come First Serve Overview (diagram components: Waiting Jobs Database, Scheduler, Local Dispatchers). Based on [5, Uwe Schwiegelshohn and Ramin Yahyapour, On The Design and Evaluation of Job Scheduling Algorithms, p. 12-13, 199, Germany Dortmund: University of Dortmund]

Figure 1 illustrates the First Come First Serve algorithm. The waiting jobs are stored inside a database. The scheduler picks jobs based on the date they were created, up to a total of 5 jobs. For instance, if one job was created a week ago and a new job is created now, the job created a week ago is put in the queue first. The scheduler then looks up an available local dispatcher. A local dispatcher shaded in red in Figure 1 is unavailable or busy. A local dispatcher shaded in blue is available to execute the next jobs. The scheduler server sends the jobs to that available local dispatcher. This is how the scheduling algorithm is implemented. The First Come First Serve algorithm is used because it is simple to implement and does not require much processing power to create a schedule. As in Figure 1, the jobs are created by searching the database based on the date each job was created. The disadvantages of the First Come First Serve algorithm are the convoy effect and its dependence on the order of task arrival. The convoy effect is caused by a transaction waiting for other transactions to commit or finish [6]. In other words, the scheduler must wait for the distributed computers to finish processing their existing jobs.
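The selection and dispatch steps described above can be sketched as follows. This is a minimal illustration, not the authors' Java implementation: the names `Job`, `LocalDispatcher`, `next_batch` and `dispatch` are hypothetical, and an in-memory list stands in for the waiting jobs database.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

MAX_JOBS_PER_BATCH = 5  # batch limit stated in the text


@dataclass
class Job:
    job_id: int
    created_at: datetime


@dataclass
class LocalDispatcher:
    # Hypothetical stand-in for a distributed video encoder server.
    name: str
    busy: bool = False
    assigned: List[Job] = field(default_factory=list)


def next_batch(waiting_jobs: List[Job]) -> List[Job]:
    """Return the oldest waiting jobs, at most MAX_JOBS_PER_BATCH of them."""
    ordered = sorted(waiting_jobs, key=lambda job: job.created_at)
    return ordered[:MAX_JOBS_PER_BATCH]


def dispatch(waiting_jobs: List[Job],
             dispatchers: List[LocalDispatcher]) -> Optional[LocalDispatcher]:
    """Send the next batch to the first available dispatcher, if any."""
    batch = next_batch(waiting_jobs)
    if not batch:
        return None
    for dispatcher in dispatchers:
        if not dispatcher.busy:
            dispatcher.assigned = batch
            dispatcher.busy = True
            for job in batch:
                waiting_jobs.remove(job)
            return dispatcher
    # Every dispatcher is busy, so the batch keeps waiting (convoy effect).
    return None
```

When `dispatch` returns `None` while jobs are still waiting, the scheduler is in the situation the text calls the convoy effect: nothing can be sent until some dispatcher finishes its current batch.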

UNIFORM AND HETEROGENEOUS PROCESSORS ENVIRONMENT Heterogeneous Developing a list of jobs is an easy task; the hardest part is assigning the jobs, especially in a heterogeneous environment. Heterogeneous computing can be viewed in two ways: either as a means of increasing the performance of an application beyond the level it can achieve on any single machine, or as a means of reducing the cost of executing an application without affecting performance [7]. The most difficult problem associated with heterogeneous distributed computing is the mapping and scheduling problem. The question then arises whether the First Come First Serve policy is able to distribute jobs well in a heterogeneous environment. The test results in this paper answer that question.

Uniform Unlike a heterogeneous processors environment, a uniform processors environment in this paper is one equipped with processors of the same model, speed and architecture. The scheduling literature defines the term more broadly: each processor in a uniform parallel machine is characterized by its own computing capacity, with the interpretation that a job that executes on a processor of computing capacity s for t time units completes s × t units of execution. (Observe that identical parallel machines are a special case of uniform parallel machines, in which the computing capacities of all processors are equal.) [8]. As new and faster processors become available, one may choose to improve the performance of a system by upgrading some of its processors. If the only model available is the identical multiprocessors model, all the processors must necessarily be replaced simultaneously. With the uniform parallel machines model, however, one can choose to replace just a few of the processors, or simply add some faster processors while retaining all the previous ones. Having to replace every processor at once is the financial disadvantage of applying distributed video encoding in an environment where all processors must be the same [9].

TEST RESULTS Simultaneous or Parallel Video Encoding Test on a Core i7 Processor

Figure 2, Serial vs Parallel video processing into Multiple Formats

Figure 2 shows the performance of encoding a video into two formats in parallel and in series on a high-end multi-core processor. Encoding into two formats in parallel is much faster because the processor produces both formats at the same time. In serial processing, the processor waits for the 480p encoding to finish before encoding the video into 720p. Parallel processing into multiple formats performed 63.6% faster than serial processing.
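The difference between the two modes can be sketched as below. This is an illustration under stated assumptions, not the encoder used in the test: `encode` is any callable that produces one output format, and `fake_encode` is a hypothetical stand-in that only builds an output name. With a real encoder launched as an external process, the thread pool lets both encodes run at the same time.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def encode_serial(source: str, resolutions: List[str],
                  encode: Callable[[str, str], str]) -> Dict[str, str]:
    """Encode one resolution after another (480p finishes before 720p starts)."""
    return {resolution: encode(source, resolution) for resolution in resolutions}


def encode_parallel(source: str, resolutions: List[str],
                    encode: Callable[[str, str], str]) -> Dict[str, str]:
    """Start every encode at once and wait for all of them to finish."""
    with ThreadPoolExecutor(max_workers=max(1, len(resolutions))) as pool:
        futures = {r: pool.submit(encode, source, r) for r in resolutions}
        return {r: future.result() for r, future in futures.items()}


def fake_encode(source: str, resolution: str) -> str:
    # Hypothetical stand-in: a real implementation would run the encoder here.
    return f"{source}.{resolution}.mp4"


outputs = encode_parallel("clip", ["480p", "720p"], fake_encode)
```

Both functions return the same outputs; only the scheduling of the work differs, which is why the parallel variant can use processor capacity that the serial variant leaves idle.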

Figure 3

Figure 3 and Figure 4 show screenshots of the CPU usage during video processing. When the video encoding process takes place, a command prompt window appears on the screen, as shown in both figures. The top of each screenshot shows the eight-core processor processing the videos. At the top left is the percentage of processor usage while processing the video into various resolutions. Figure 3 shows the processor usage of serial video processing. On average it used 25% of its power to process the video into one resolution, leaving room for the processor to process the video into another format at the same time.

Figure 4

Figure 4 shows the processor usage when a video is processed into two different resolutions simultaneously, or in parallel. The processor usage increased immediately from 25% to more than 50%. This means that for the processor to be fully and efficiently utilized, it should encode as many videos simultaneously as it can.

Uniform Environment Test

Figure 5, Distributed Encoder Server Vs Stand Alone Encoder Server

Figure 5 is a bar graph that illustrates the performance of the stand-alone versus the distributed encoder server for each batch level. The x-axis shows the video batch size. As the batch size gets bigger, the time taken by both systems becomes longer. This test clearly shows that the distributed encoder server performs much faster than the stand-alone encoder server, by 296% on average. In the third test, by contrast, the low-performance processor caused the distributed encoder server to perform slower than the stand-alone encoder server. Therefore the First Come First Serve scheduling algorithm is better suited to a distributed video encoding system whose servers share the same specification than to one whose servers have various specifications. Such a system would yield a similar result, perhaps four to ten times faster than the Pentium 4’s performance. As Stallings notes, “speedup on a multiprocessor actually exceeds what would be expected by simply adding up the number of processors in use” [10]. This means that as the number of processor cores increases, the performance becomes much faster.

Table 1

Batch                               1           2         3           4           5
Batch Size (MB)                     489         649       1960        833         417
Distributed Encoder Server number   1           2         3           4           2
Processor                           Core 2 Duo  Core i7   Pentium 4   Pentium 4   Core i7
Transfer In (seconds)               1080        1720      1140        840         300
Encoding (seconds)                  2160        1523      8340        3480        360
Transfer Out (seconds)              180         180       120         60          60
Processing (seconds)                3420        2820      9600        4380        720

Processing time to encode 5 batches (seconds): 9600

Table 1 shows the performance result of the second test. The colored entries show the processor used during the test. The table lists five batches of jobs, which were distributed to four distributed video encoder servers in the network. The high-end processor in the third test processed the last, or fifth, batch. The main point of Table 1 is that the lower-end processors, the Pentium 4s, received the two largest batches of jobs, while the higher-end processors, the Core i7 and Core 2 Duo, received batches that were smaller than those given to the lower-end processors.
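The total of 9600 seconds reported in Table 1 can be reproduced with a short calculation. The sketch below assumes that the servers work concurrently, that batches on the same server run one after another, and that the server numbers for batches 1 to 5 read 1, 2, 3, 4, 2; the function names are illustrative.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (batch, server number, reported processing time in seconds), from Table 1
TABLE_1: List[Tuple[int, int, int]] = [
    (1, 1, 3420),
    (2, 2, 2820),
    (3, 3, 9600),
    (4, 4, 4380),
    (5, 2, 720),
]


def busy_time_per_server(rows: List[Tuple[int, int, int]]) -> Dict[int, int]:
    """Sum the processing time of all batches handled by each server."""
    totals: Dict[int, int] = defaultdict(int)
    for _batch, server, seconds in rows:
        totals[server] += seconds
    return dict(totals)


def makespan(rows: List[Tuple[int, int, int]]) -> int:
    """Time until the slowest server finishes, with servers running concurrently."""
    return max(busy_time_per_server(rows).values())


print(busy_time_per_server(TABLE_1))  # {1: 3420, 2: 3540, 3: 9600, 4: 4380}
print(makespan(TABLE_1))              # 9600
```

Under these assumptions the Pentium 4 that received the 1960 MB batch alone determines the overall time, while the Core i7 is busy for only 3540 seconds even after taking a second batch.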

Heterogeneous Environment Test

Figure 6

Figure 6 shows the result of the third test, in which the stand-alone encoder server performed 230% faster. This massive slowdown of the distributed system was caused by the job distribution produced by the First Come First Serve scheduling algorithm, which sent jobs to the distributed video encoder servers without considering each processor's capability to process them.

CONCLUSION Processing a video into multiple formats in parallel performed much faster than processing it serially, by about 63.6% on average. Processing a video into two different formats in parallel consumed roughly 40% to 60% of the processing power, compared with roughly 25% for serial video processing. The First Come First Serve scheduling algorithm is a simple and good algorithm when it runs in a uniform environment, where it performs on average 290% faster than the stand-alone encoder server. The drawback is that it did not perform as expected when running in a heterogeneous environment.

RECOMMENDATION Hardware Key business decision makers are recommended to consider high-end processors for their projects. These high-end processors can accelerate the encoding of a video into various outputs in parallel.

Software The current First Come First Serve scheduling algorithm should be enhanced so that it takes each processor’s capability into account when creating and assigning jobs. Making the First Come First Serve scheduling algorithm smarter in this way could improve video encoding performance in a heterogeneous environment.
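One possible direction for this enhancement is sketched below. It is not part of the tested system: batches are still taken in arrival order, but each one goes to the server with the earliest estimated finish time. The function name and the server speeds in the example are hypothetical, and the estimate assumes encoding time is proportional to batch size.

```python
from typing import Dict, List, Tuple


def assign_by_estimated_finish(
    batch_sizes_mb: List[float],
    speeds_mb_per_s: Dict[str, float],
) -> Tuple[Dict[str, List[int]], float]:
    """Assign batches in arrival order to the server expected to finish first.

    Returns the plan (server name -> batch indexes) and the estimated makespan.
    """
    finish = {server: 0.0 for server in speeds_mb_per_s}
    plan: Dict[str, List[int]] = {server: [] for server in speeds_mb_per_s}
    for index, size in enumerate(batch_sizes_mb):
        # Estimated finish time if this batch were appended to each server.
        best = min(
            speeds_mb_per_s,
            key=lambda server: finish[server] + size / speeds_mb_per_s[server],
        )
        finish[best] += size / speeds_mb_per_s[best]
        plan[best].append(index)
    return plan, max(finish.values())


# Hypothetical speeds: the "fast" server encodes twice as quickly as "slow".
plan, estimated_makespan = assign_by_estimated_finish(
    [100, 60, 100], {"fast": 2.0, "slow": 1.0}
)
```

In this example the fast server receives the first and third batches and the slow server only the small second batch, which avoids the situation in Table 1 where the slowest processor received the largest batch.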

REFERENCES

[1] 2011, August, Video Encoding. [Online]. http://www.webopedia.com/TERM/V/video_encoding.html
[2] Simon Slangen. (2009, July) Digital Video Formats and Video Conversion Explained. [Online]. http://www.makeuseof.com/tag/video-conversion-the-technicalities-explained/
[3] Gridcafe. Breaking Moore’s Law. [Paper]. http://www.gridcafe.org/Breaking-Moore-law.html
[4] Teofilo Gonzalez and Sartaj Sahni. (2010, July) Preemptive Scheduling of Uniform Processor Systems, pg 1. [Paper] University Park, Pennsylvania and University of Minnesota, United States of America.
[5] Teofilo Gonzalez and Sartaj Sahni. (2010, July) Preemptive Scheduling of Uniform Processor Systems, pg 1. [Paper] University Park, Pennsylvania and University of Minnesota, United States of America. http://www.huffingtonpost.com/steve-hamby/5-reasons-to-go-cloud_b_898133.html
[6] Jennifer L. Welch (October, 2001, p. 93), Distributed Computing: 15th International Conference, Proceedings (Lecture Notes in Computer Science) [Book] Lisbon, Portugal
[7] J. B. Andrews and C. D. Polychronopoulos. (1991) Pg 345-356, An analytical approach to performance/cost modeling of parallel computers, J. Parallel Distributed Computing, vol. 12
[8] Shelby Funk, Joel Goossens, Sanjoy Baruah, Pg 1, On-Line Scheduling on Uniform Multiprocessors. [Paper] University of North Carolina, United States of America and Universite Libre de Bruxelles, Belgium
[9] Shelby Funk, Joel Goossens, Sanjoy Baruah, Pg 2, On-Line Scheduling on Uniform Multiprocessors. [Paper] University of North Carolina, United States of America and Universite Libre de Bruxelles, Belgium
[10] William Stallings (July 22 2004, 5 edition), Operating Systems, p. 455, Prentice Hall