RECONFIGURABLE HARDWARE ACCELERATION OF WLAN SECURITY
Neil Smyth1, Máire McLoone2, John V. McCanny2
1 Amphion Semiconductor Ltd., Belfast, Northern Ireland. [email protected]
2 Institute of Electronics, Communications and Information Technology, Queen’s University Belfast, Northern Ireland. [email protected], [email protected]
ABSTRACT
A novel Wireless Local Area Network (WLAN) security processor is described in this paper. This processor is capable of offloading all security encapsulation in an IEEE 802.11i compliant Medium Access Control (MAC) layer to a reconfigurable hardware accelerator. Embedded software provides flexible support for many other RC4 and AES based security protocols, such as those relevant to Internet Protocol Security (IPSec). The design is primarily targeted at WLAN applications, and as such is capable of performing Wired Equivalent Privacy (WEP), Temporal Key Integrity Protocol (TKIP), Counter mode with CBC-MAC Protocol (CCMP), and Wireless Robust Authentication Protocol (WRAP). The use of dedicated instructions designed for WLAN applications results in reduced instruction code footprints in comparison to general-purpose processors, and provides the high throughput necessary for 54 Mbps IEEE 802.11a/g.
The limited processing power and battery life of wireless devices conflict with the ever-increasing data throughputs of complex security protocols, which are in increasing demand due to the growth of wireless technologies. The nature of frequently changing and evolving security protocols also necessitates the use of devices with re-programmable hardware utilising embedded software. Such devices support variable functionality to counteract security weaknesses, and provide a degree of future-proofing for what will inevitably become legacy hardware. IEEE 802.11i is an optional amendment to the IEEE 802.11 standard that is currently in the last stages of development, and offers enhanced security at the MAC layer. It offers an improved RC4-based scheme for legacy systems together with AES-based encryption in newer WLAN devices. It has also been designed to integrate with IEEE 802.1x to provide a system whereby clients and access points must query an authentication server, offering improved network security. The existing WEP security scheme has effectively been enhanced to an implementation known as TKIP, designed for use in legacy systems, which is currently being offered to consumers through the interim WPA (Wi-Fi Protected Access) standard while IEEE 802.11i is finalised.
The design can implement numerous AES modes of operation and perform various data logic and arithmetic functions. It also has dedicated instructions to perform Michael authentication, a packet authentication algorithm developed for IEEE 802.11i, and 32-bit cyclic redundancy checks (CRC32).
BACKGROUND ON AES AND IEEE 802.11
0-7803-8504-7/04/$20.00 ©2004 IEEE
The AES (Rijndael) Algorithm
AES is a block cipher specified by NIST (the National Institute of Standards and Technology), and is a standardized form of the Rijndael symmetric cipher. AES has a 128-bit block size and a variable key length of 128, 192 or 256 bits. This fast, high-security cipher is currently being introduced into many hardware and software security products, driven by the need for increased security in Internet traffic and many other multimedia products.
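As a small illustration of the parameters above, the following sketch relates each AES key length to its round count and expanded-key size. The round rule (Nr = Nk + 6) is taken from the FIPS-197 specification rather than from this paper, and the function names are hypothetical.

```python
# AES fixes the block size at 128 bits; only the key length varies.
AES_BLOCK_BITS = 128


def aes_rounds(key_bits: int) -> int:
    """Number of AES rounds (Nr) for a key length, using Nr = Nk + 6 from FIPS-197."""
    if key_bits not in (128, 192, 256):
        raise ValueError("AES key length must be 128, 192 or 256 bits")
    nk = key_bits // 32  # key length in 32-bit words
    return nk + 6


def expanded_key_bytes(key_bits: int) -> int:
    """Size of the full key schedule: one 128-bit round key per round, plus one."""
    return (aes_rounds(key_bits) + 1) * AES_BLOCK_BITS // 8


if __name__ == "__main__":
    for bits in (128, 192, 256):
        print(bits, aes_rounds(bits), expanded_key_bytes(bits))
```

The expanded-key size is the amount of storage an implementation would need if it precomputed every round key instead of deriving them as required.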
Figure 1 IEEE 802.11 Device-to-Device interface (block labels: IEEE 802.11 Access Point, IEEE 802.11 Station, MAC, MDUs to/from LLC, Sensed Medium Activity)
Background On IEEE 802.11
The IEEE 802.11 standards define the MAC (Medium Access Control) and PHY (Physical) layers of wireless LAN. The original standard described a wireless communication technology that operated at up to 2 Mbps. The IEEE 802.11b amendment introduced in 1999 increased the maximum throughput to 11 Mbps, while the newly created IEEE 802.11a and 802.11g standards have introduced new technologies to increase the maximum theoretical throughput to 54 Mbps.
As IEEE 802.11 is a form of wireless communication, it does not offer the inherent security of a wired LAN, because wireless communications disseminate information indiscriminately. To offer a level of security similar to that of wired LANs, the optional Wired Equivalent Privacy (WEP) scheme was introduced. This provided a means of confidentiality and authentication in the packetised data, through the use of the RC4 stream cipher for encryption and cyclic redundancy checks to provide a checksum for authentication purposes. WEP has been shown to be a weak security protocol with many flaws, and manufacturers have improved upon the standard by introducing their own amendments and enhancements. To address the need for enhanced security as the uptake of wireless communications increases, IEEE is developing the 802.11i standard for enhanced MAC security. This new standard provides a standardized upgrade of the WEP scheme for implementation on legacy systems, which is known as TKIP. However, new devices are expected to use the higher security AES block cipher.
The basic outline of the processing layers in an IEEE 802.11 station is illustrated in Figure 1. The MAC layer accepts data for transmission in the form of MDUs (MAC Data Units) from the LLC (Logical Link Control) layer in the system. The MAC creates and passes MPDUs (MAC Protocol Data Units) and other Control and Management packetised data (known as frames) to the PHY layer. The PHY performs modulation of the input frames to produce output data suitable for transmission over the wireless medium. By monitoring the activity on the wireless medium through the PHY, the MAC determines whether it can transmit data, and does so only if it believes the wireless medium is inactive.
Encryption and other cryptographic processing of frames in IEEE 802.11i occur at the MAC layer, prior to passing frames to the PHY. All frames delivered to the PHY from the MAC are composed of header fields, an optional data payload field, and a Frame Check Sequence (FCS) composed of a CRC32 checksum for error detection purposes. The security schemes in IEEE 802.11i only alter the data payload, and consequently the FCS field that is calculated over the data field.
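The frame layout described in this section (header fields, an optional payload and a CRC32-based FCS) can be sketched as follows. This is a minimal illustration and not the processor's implementation: it assumes the FCS uses the same CRC-32 as Python's `zlib.crc32`, that it covers the header and payload together, and that it is appended least significant byte first. The header bytes and function names are hypothetical.

```python
import struct
import zlib


def append_fcs(header: bytes, payload: bytes = b"") -> bytes:
    """Build header | payload | FCS, where the FCS is a CRC32 over header and payload."""
    body = header + payload
    fcs = zlib.crc32(body) & 0xFFFFFFFF
    # Assumption: the 32-bit FCS is appended least significant byte first.
    return body + struct.pack("<I", fcs)


def fcs_ok(frame: bytes) -> bool:
    """Recompute the CRC32 over everything but the last 4 bytes and compare."""
    if len(frame) < 4:
        return False
    body, fcs = frame[:-4], frame[-4:]
    return struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF) == fcs


if __name__ == "__main__":
    header = b"\x08\x01" + bytes(22)  # hypothetical 24-byte header
    frame = append_fcs(header, b"example payload")
    print(fcs_ok(frame))  # the untouched frame passes the check
    corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]
    print(fcs_ok(corrupted))  # a single flipped bit is detected
```

Because a security scheme changes the payload bytes, the FCS has to be recomputed after encapsulation, which is why a dedicated CRC32 instruction is useful alongside the ciphers.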
WLAN SECURITY PROCESSOR ARCHITECTURE
The WLAN security processor is composed of the basic elements of any RISC processor: a memory controller to interface with RAM, an operation decode unit, a register bank, write-back logic, an Arithmetic and Logic Unit (ALU), and a barrel shifter. In addition to these units, RC4 and AES encryption coprocessors have been added, accompanied by IEEE 802.11i-specific instructions that provide support for CRC32 checksums and Michael authentication tags. The processor can execute the majority of instructions in two or three cycles. This is attributable to pipelined execution logic, which allows the processor to operate at the target frequency of 80 MHz in Altera Stratix and Xilinx Virtex-II technologies, a common operating frequency in MAC/PHY products. Synchronous read RAM is used to hold the microcode that defines the frame encapsulation schemes, all input frames, and all generated output frame data.
Figure 2 WLAN Security Processor Block Diagram (blocks: fetch and decode logic, execution logic, register bank, ALU with Michael instructions, memory control, AES coprocessor, 256x8 DP RAM, configuration and status; RAM signals: ADDR, WE, WDATA, RE, RDATA; configuration signals: CFG_ENAB, CFG_RW, CFG_ADDR, CFG_IN, CFG_OUT)
The processor executes the embedded software to generate the encapsulated data output. This output can then be read from the RAM by the MAC host microprocessor. The encapsulation code used to operate the processor is implemented as 32-bit opcodes, and up to 8 kbyte of RAM may be devoted to storage of this code. As an example of code size, the WRAP reference code occupies less than 1 kbyte, and offers full WRAP encapsulation and decapsulation.
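The code-size figures above translate directly into instruction counts. The short calculation below is a sketch that assumes 1 kbyte means 1024 bytes; the constant and function names are hypothetical.

```python
OPCODE_BITS = 32          # opcode width quoted in the text
CODE_RAM_KBYTE = 8        # maximum RAM devoted to code, quoted in the text
BYTES_PER_KBYTE = 1024    # assumption: binary kilobytes


def max_opcodes(code_kbyte: int) -> int:
    """Number of 32-bit opcodes that fit in the given amount of code RAM."""
    return code_kbyte * BYTES_PER_KBYTE * 8 // OPCODE_BITS


if __name__ == "__main__":
    print(max_opcodes(CODE_RAM_KBYTE))  # capacity of the full 8 kbyte code store
    print(max_opcodes(1))               # upper bound for a routine under 1 kbyte
```

Under that assumption the code store holds 2048 opcodes, and the WRAP reference code, at less than 1 kbyte, uses fewer than 256 of them.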
Other WLAN security solutions have been designed as single-chip solutions for a variety of applications. For example, Cavium Networks’ NITROX processors offer impressive data throughputs and versatility, but lack the compactness and power efficiency of a dedicated WLAN security solution. WLAN-specific cores, such as Sci-Worx’s WEP IP core and Helion’s 802.11i CCM IP core, are efficient but do not offer the versatility of software. The major advantage of the WLAN security processor over these solutions is that it has been specifically targeted at WLAN applications, and balances the use of hardware encryption coprocessors with a controlling processor. This allows the bulk of the data to be processed efficiently by the coprocessors, while leaving the simple packet processing and data manipulation tasks to the processor’s ALU, thus maintaining speed and efficiency while allowing future upgrades.
The WLAN security processor described in this paper was simulated using ModelTech ModelSim, and verified against a functional C++ model using self-checking testbenches. Test vectors were obtained from various sources, such as NIST, IEEE 802.11 Task Group I, and the IETF, in order to verify the capability of the core to perform AES encryption and the various packet encapsulation schemes.
The AES coprocessor generates the key schedule on the fly, requiring no memory to store it. RC4 requires an initialisation period of 1152 cycles regardless of data payload length. The number of cycles required for initialisation is fixed for each security scheme, and therefore causes a proportionally larger performance degradation in smaller frames, particularly when RC4 is used. This is illustrated in Figure 3 and Figure 4. AES-based CCMP, illustrated in Figure 5, requires less initialisation than RC4-based WEP or TKIP, which is largely attributable to the absence of key initialisation in AES. The lower performance of decryption relative to encryption in WEP, TKIP and CCMP is attributable to additional processing required in the decryption algorithms. WRAP encryption and decryption are largely identical and therefore have an identical processing bandwidth, as seen in the performance illustrated in Figure 6.
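The fixed initialisation cost of RC4 follows from its key schedule, which always processes a 256-entry state however short the payload is. The sketch below shows the standard, publicly documented RC4 algorithm in software together with the timing implied by the 1152-cycle and 80 MHz figures. It is not the coprocessor's implementation, and the `cycles_per_byte` parameter is a hypothetical value, since the paper does not quote per-byte costs.

```python
CLOCK_HZ = 80_000_000     # target clock rate quoted in the text
RC4_INIT_CYCLES = 1152    # fixed RC4 initialisation period quoted in the text


def rc4_key_schedule(key: bytes) -> list:
    """Standard RC4 key scheduling: always 256 swaps, independent of payload length."""
    if not key:
        raise ValueError("key must not be empty")
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) & 0xFF
        state[i], state[j] = state[j], state[i]
    return state


def rc4_crypt(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt (the same operation) by XORing data with the keystream."""
    state = rc4_key_schedule(key)
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + state[i]) & 0xFF
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) & 0xFF])
    return bytes(out)


def rc4_init_time_us() -> float:
    """Time spent on RC4 initialisation per frame at the quoted clock rate."""
    return RC4_INIT_CYCLES / CLOCK_HZ * 1e6


def init_share(payload_bytes: int, cycles_per_byte: float) -> float:
    """Fraction of total cycles spent on initialisation (cycles_per_byte is hypothetical)."""
    return RC4_INIT_CYCLES / (RC4_INIT_CYCLES + payload_bytes * cycles_per_byte)


if __name__ == "__main__":
    print(rc4_crypt(b"Key", b"Plaintext").hex())
    print(rc4_init_time_us())
    print(init_share(64, 1.0), init_share(1500, 1.0))
```

At 80 MHz the 1152 cycles correspond to about 14.4 microseconds per frame, and `init_share` shows that, whatever per-byte cost is assumed, this overhead takes up a larger fraction of the total for short frames than for long ones.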
Figure 3 WEP Performance (x-axis: byte length of MSDU, 0 to 2000)
Figure 4 TKIP Performance (x-axis: byte length of MSDU, 0 to 2000)
Figure 5 CCMP Performance (x-axis: byte length of MSDU, 0 to 2000)
Figure 6 WRAP Performance (x-axis: byte length of MSDU, 0 to 2000)
The core was synthesized using Synplify Pro to create netlists for FPGA implementation. Altera Quartus II and Xilinx Foundation Series 5.2 were used to place and route the netlist onto Altera Stratix and Xilinx Virtex-II devices respectively. Synopsys Design Compiler was also used to synthesize the core using TSMC 0.18 µm standard cell libraries under worst-case conditions. The target clock rate for all implementations was 80 MHz, and the results are given in Table 1 below.
Table 1 WLAN Security Processor Technology Resource Usage (surviving entry: Xilinx Virtex2 -5)
The RAM size may differ depending upon the application. The figures quoted are worst case (i.e. one 256x8 dual-port RAM for RC4 functionality, and one 4096x32 dual-port RAM for packet buffer and instruction memory). The smaller 256x8 dual-port RAM is always required, but the larger buffer RAM may be single-port, and may be reduced in size depending upon the required specification.
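For reference, the worst-case RAM configurations quoted above convert to byte capacities as follows. This is a simple worked calculation; the helper name is hypothetical.

```python
def ram_bytes(words: int, width_bits: int) -> int:
    """Capacity in bytes of a RAM organised as words x width_bits."""
    return words * width_bits // 8


if __name__ == "__main__":
    print(ram_bytes(256, 8))     # RC4 RAM: one byte per entry of a 256-entry state
    print(ram_bytes(4096, 32))   # shared packet buffer and instruction memory
```

The 256x8 RAM holds 256 bytes, which matches the size of the RC4 state array, while the 4096x32 RAM holds 16384 bytes shared between packet buffering and instruction storage.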
Figure 6 Wireless LAN PC Card Block Diagram (blocks: MAC host microprocessor, WLAN Security Processor, dual-port buffer RAM, AHB interface, lower MAC packet buffer / PHY interface, 802.11 b/g PHY)
APPLICATION EXAMPLE – 802.11 B/G WLAN PC CARD WITH ENHANCED MAC SECURITY
The application outlined in Figure 6 above utilizes an 802.11b/g baseband processor interfaced to an AHB system bus, with a packet buffer to queue 802.11 frames and act as a data interface to the MAC layer. It also has a host microprocessor implementing the 802.11 MAC. Adding a WLAN Security Processor allows a high-security 802.11b/g system (operating at 11/54 Mbps) to be designed rapidly, by removing a heavy burden from the MAC host microprocessor. The code size required to implement 802.11i is also reduced in comparison to using a general-purpose processor. This allows the host microprocessor to dedicate itself to other tasks, or gives it spare processor cycles to support additional features.
Providing a software engine on which to execute the packet processing using dedicated cryptographic instructions allows changes to be made to the method of encapsulation, while maintaining the efficiency and high throughput of hardware encryption acceleration. The current fluctuations in the IEEE 802.11i standard can be accommodated by incorporating the WLAN Security Processor into a design, as it can be reprogrammed with new embedded software. The AES and RC4 encryption instructions of the WLAN security processor also greatly reduce the software footprint. The processor may be programmed to perform encapsulation of other packet types, such as IPSec packets utilizing AES or RC4 based encryption. The ability to enable extra functionality by a simple software upgrade can be used as a distinguishing feature of a WLAN product in a competitive marketplace, and offers the user a degree of future-proofing in their wireless LAN system.
In this paper, a novel WLAN Security Processor is described which incorporates IEEE 802.11i-specific instructions and AES and RC4 acceleration. It has been recognized that there is a security processing gap in wireless devices, caused by the low power and relatively low processing capabilities of such devices, and the demands of complex security protocols on microprocessor technologies. The design described here provides a processor designed specifically to perform efficient cryptographic processing of WLAN frames with little intervention from the host microprocessor, allowing more processing power to be used to enhance and improve the services on a wireless handset. For example, the user interface may be more feature-rich and responsive, and there may be less lag experienced when using data services. In addition, dedicated hardware can perform cryptographic functions more efficiently than a general-purpose processor, thus improving battery life.
S. Ravi, A. Raghunathan, and M. Sankaradass, “Securing Wireless Data: System Architecture Challenges,” ISSS’02, October 2002, Kyoto, Japan.
S. Gayal and S. A. Vetha Manickam, “Wireless LAN Security Today and Tomorrow,” Center for Information and Network Security, Pune University, 2002.
IEEE 802.11 Wireless LAN Standards. IEEE 802.11 Working Group, Task Group I, URL:http://grouper.ieee.org/groups/802/11, April 2004.
B. Schneier, Applied Cryptography: Protocols, Algorithms and Source Code in C, John Wiley and Sons, 1996.
Alleged RC4 C++ source code, URL:http://cryptopp.sourceforge.net/docs/ref5/arc4_8cppsource, April 2004.
National Institute of Standards and Technology, AES (Rijndael) Specification and Information, URL:http://csrc.nist.gov/encryption/aes/rijndael, April 2004.
M. Welschenbach, Cryptography in C and C++, Apress, 2001.
J. S. Park and D. Dicoi, “WLAN Security: Current and Future,” IEEE Internet Computing, September/October 2003, URL:http://computer.org/internet/.
J. Williams, “Providing for Wireless LAN Security, Part 2,” IT Professional, IEEE Internet Computing, November/December 2002/2003, URL:http://computer.org/internet/.
Jon A. LaRosa, “WPA: A Key Step Forward in Enterprise-class Wireless LAN (WLAN) Security,” Meetinghouse Data Communications, URL:http://www.meetinghousedata.com/landing/wp.shtml, April 2004.
Cavium Networks, NITROX Processors. URL:http://www.cavium.com, April 2004.
Sci-Worx GmbH, WEP IP Core. URL:http://www.sci-worx.com, April 2004.
Helion Technology, CCM IP Core. URL:http://www.heliontech.com, April 2004.
Internet Engineering Task Force (IETF), IPSec, Requests For Comments and Internet Drafts, URL:http://www.ietf.org, April 2004.