
NoC-AXI Interface for FPGA-based MPSoC Platforms

Marco Ramirez, Masoud Daneshtalab, Juha Plosila, Pasi Liljeberg
Department of Information Technology, University of Turku

email: {maanrl, masdan, juplos, pakrli}@utu.fi

Abstract- Streaming applications are a keystone of several emerging multimedia services such as DVB-IPTV, VoD and on-line gaming. Because of the high computing requirements and real-time constraints inherent to this kind of application, multi-processor systems-on-chip (MPSoCs) have been proposed as a solution. In addition, FPGA technology has become popular among system-on-chip (SoC) designers due to its low development cost and short time to market. Here we present an FPGA-based MPSoC platform for streaming applications, in which the key component is the AXI interface.

1. INTRODUCTION

IP-based Digital Video Broadcasting, Video on Demand and on-line gaming are among the most popular services on the Internet. Video streaming is very resource demanding, since it involves executing complex video compression algorithms as well as networking tasks. An equally important characteristic of video streaming is its real-time nature: the necessary throughput must be provided while meeting the timing constraints in order to maintain an acceptable quality of service. These stringent requirements make single-processor architectures inadequate for video streaming, so multi-processor systems-on-chip (MPSoCs) have been proposed as an alternative approach [1]-[4].

MPSoCs typically integrate several processing cores, memory modules and complementary circuitry on a single chip. In the near future MPSoCs are expected to contain hundreds of processing elements interconnected by networks-on-chip (NoCs) [3]-[6]. This large amount of computational power makes MPSoCs a suitable solution for embedded systems targeting video streaming applications.

Recently, FPGAs have become a more dominant technology among MPSoC designers. Short development cycles and low costs, as well as a broad spectrum of IP cores, are making FPGA architectures more popular.
This change in the market makes it necessary to have an MPSoC platform that allows prototype implementation and benchmarking. The HeMPS framework (HF) [7] is a popular tool for MPSoC generation. The HF generates a parameterizable MPSoC architecture based on the Plasma

978-1-4673-2256-0/12/$31.00 ©2012 IEEE


processor [8] and the HERMES NoC [9]. A typical instance of the architecture has three main components: processing elements (PEs), routers and a task repository. PEs are wrapper modules containing a Plasma processor, an internal RAM, a network interface and a DMA controller. The network is organized as a 2-D mesh and uses an XY routing algorithm. The task repository is a local memory that stores the applications, which are allocated on demand to the slave PEs.

In this work we present an FPGA-based MPSoC platform named AXI-based MPSoC (AXIM). The first AXIM prototype is based on the HeMPS architecture; however, different processing cores and NoCs will be evaluated in the future.

2. AXIM PLATFORM

The AXIM platform has three main components: a cluster of processing elements (CPE), the AXI subsystem and an embedded controller. AXIM implements the same type of PEs and NoC used by HeMPS. However, there are two key differences between the two architectures. First, AXIM uses the AXI interface (described in Section 3) to provide interoperability between the CPE and the AXI subsystem. Second, the structure and location of the task repository are modified: in AXIM the task repository is split between the local memory of the NoC-AXI interface (NAI) and the platform's external memory.

The AXI subsystem consists of a set of AXI peripheral controllers. The AXIM platform incorporates controllers for external DDR memory, Ethernet, FLASH memory and USB host. Together, these controllers form the set of sinks and sources for the streams to be processed.

The embedded controller (EC) moderates the AXI subsystem. It runs an operating system with the drivers required to configure the system's peripherals during the initialization phase. The EC also runs a web server to handle job requests received via HTTP. When a service is requested, the EC informs the Master PE of the details of the request. The Master PE then performs the task allocation according to the resources currently available in the CPE.
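Since AXIM reuses the 2-D mesh NoC with XY routing mentioned above, a short illustration of that deterministic scheme may be helpful. The following is a minimal Python sketch, not the HERMES implementation: the function names, the port labels and the coordinate orientation (x grows to the east, y grows to the north) are assumptions made for this example only. A packet is first routed along the X dimension until it reaches the destination column, and only then along the Y dimension.

```python
# Minimal sketch of XY routing on a 2-D mesh.
# Port names and axis orientation are illustrative assumptions,
# not taken from the HERMES or HeMPS sources.

def xy_next_hop(current, destination):
    """Return the output port a router at `current` would select."""
    cx, cy = current
    dx, dy = destination
    if dx > cx:
        return "EAST"
    if dx < cx:
        return "WEST"
    # X offset is zero: only now move along Y.
    if dy > cy:
        return "NORTH"
    if dy < cy:
        return "SOUTH"
    return "LOCAL"  # packet has arrived at its destination router


def xy_route(source, destination):
    """Return the list of router coordinates visited from source to destination."""
    step = {"EAST": (1, 0), "WEST": (-1, 0), "NORTH": (0, 1), "SOUTH": (0, -1)}
    x, y = source
    path = [(x, y)]
    while (x, y) != destination:
        sx, sy = step[xy_next_hop((x, y), destination)]
        x, y = x + sx, y + sy
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
```

Because the X offset is always resolved before the Y offset, every source/destination pair has exactly one path, which is what makes the scheme simple to implement in a router.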

A block diagram of the AXIM platform is shown in Fig. 1. To provide interoperability between the CPE and the AXI subsystem, a new element must be introduced into the system: the NoC-AXI interface.

Fig. 1. AXIM platform block diagram.

3. NOC-AXI INTERFACE

A keystone of the prototyped platform is the NoC-AXI interface, since it creates the communication channel between the CPE and the AXI bus. NoC-AXI is also responsible for allocating part of the system's task repository. The NoC-AXI functionality is provided by several internal units:

Network Interface (NI): lies between the NoC-AXI and the NoC, and handles the credit-based data flow control between the router and the rest of the internal units.

AXI units: two units handle the transactions on the AXI bus, one for each type of transaction (read/write). The AXI units are independent, which allows transactions to execute simultaneously.

Control unit (CU): processes the incoming packets from the NI and sends signals to the other units depending on the received command. Three commands have been defined: READ, WRITE and GET TASK. The READ and WRITE commands correspond to the typical read/write memory block operations. The GET TASK command instructs the CU to fetch a task's code from the repository and send it to a PE. The CU also generates the corresponding headers for the outgoing packets.

Local Memory (LM) unit: stores information related to the set of applications available in the repository. The information is kept as pairs of integers describing each task's base address and block size. The LM content is initialized during the FPGA configuration phase, but an update command will be added in the future.

One of the goals of the AXIM platform is to analyze different aspects of the NoC-AXI in order to improve its efficiency for streaming applications. The use of internal buffers, transaction prioritization and reordering, and multiple NoC-AXI interfaces should be explored during this research work.

4. CONCLUSIONS AND FUTURE WORK

This paper presented an AXI-based MPSoC platform together with its NoC-AXI interface. Current work is focused on the simulation and verification of the NoC-AXI. Future work involves implementing the AXIM platform on a Xilinx Virtex-6 FPGA.

REFERENCES

[1] Ahsan Shabbir et al., “Distributed resource management for concurrent execution of multimedia applications on MPSoC platforms,” ICSAMOS, pp. 132-139, 2011.

[2] M. Fattah et al., “Exploration of MPSoC Monitoring and Management Systems,” in Proceedings of IEEE International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC), pp. 13, June 2011, France.

[3] M. Daneshtalab et al., “Adaptive Input-output Selection Based On-Chip Router Architecture,” Journal of Low Power Electronics (JOLPE), Vol. 8, No. 1, pp. 11-29, 2012.

[4] M. Fattah et al., “Transport Layer Aware Design of Network Interface in Many-Core Systems,” in Proceedings of IEEE International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC), 2012.

[5] M. Daneshtalab et al., “Memory-Efficient On-Chip Network with Adaptive Interfaces,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (IEEE-TCAD), Vol. 31, No. 1, pp. 146-159, Jan. 2012.

[6] M. Dehyadegari et al., “An Adaptive Fuzzy Logic-based Routing Algorithm for Networks-on-Chip,” in Proceedings of 13th International Conference on Adaptive Hardware and Systems (AHS), pp. 208-214, June 2011, USA.

[7] E. A. Carara et al., “HeMPS - a framework for NoC-based MPSoC generation,” in Proceedings of ISCAS, pp. 1345-1348, 2009.

[8] http://opencores.org/project,plasma

[9] Fernando Moraes et al., “HERMES: an infrastructure for low area overhead packet-switching networks on chip,” Integration, the VLSI Journal, Vol. 38, No. 1, pp. 69-93, 2004.
