FPGA-Accelerated 3D Reconstruction Using Compressive ... - CiteSeerX

0 downloads 0 Views 154KB Size Report
pectation Maximization (EM) [2] algorithm by introducing Total. Variation(TV) .... These different invocations take different FIFO chan- nels and memory interfaces ... tracer_loop would process a pre-determined number of rays. Note it is possible ...
FPGA-Accelerated 3D Reconstruction Using Compressive Sensing §

Jianwen Chen§ , Jason Cong§ , Ming Yan† and Yi Zou§

Computer Science Department University of California, Los Angeles Los Angeles, CA 90095, USA

† Department of Mathematics University of California, Los Angeles Los Angeles, CA 90095, USA

[email protected],{cong@cs,yanm@math,zouyi@cs}.ucla.edu ABSTRACT

2.

The radiation dose associated with computerized tomography (CT) is significant. Optimization-based iterative reconstruction approaches, e.g., compressive sensing provide ways to reduce the radiation exposure, without sacrificing image quality. However, the computational requirement such algorithms is much higher than that of the conventional Filtered Back Projection (FBP) reconstruction algorithm. This paper describes an FPGA implementation of one important iterative kernel called EM, which is the major computation kernel of a recent EM+TV reconstruction algorithm. We show that a hybrid approach (CPU+GPU+FPGA) can deliver a better performance and energy efficiency than GPU-only solutions, providing 13X boost of throughput than a dual-core CPU implementation.

Typically, image reconstruction requires the number of samples (measurements or observations) that is above the Nyquist limits. By exploiting the sparsity of the objects, the number of samples can be reduced significantly. Compressive sensing technique exploits this fact to perform the reconstruction of signals or images. In our case, suppose the image is x, we make use of the sparsity of |∇x| in the algorithm. EM+TV reconstruction [4] tries to solve the non-linear optimization problem:

Categories and Subject Descriptors

The first term is the TV term and the second one is the EM term. We ignore the mathematic details that can be referred in [4], but show the pseudo-code for the core computing functions instead.

B.7.1 [Integrated Circuits]: Types and Design Styles—Algorithms implemented in hardware

min x

Algorithms, Design, Performance

1. INTRODUCTION The industry trend of CT imaging is moving towards low-dose CT. Although it is possible to reduce the dose directly and apply image-space denoising on the noisy FBP image, a more desired approach is to reduce the number of sampling used and apply compressive sensing-based iterative reconstructions. For a review of the CT image reconstruction and optimization-based iterative schemes, please refer to the recent survey [3]. This paper presents our effort to accelerate one iterative reconstruction algorithm called EM+TV [4], which extends classic Expectation Maximization (EM) [2] algorithm by introducing Total Variation(TV) regularization terms. We implemented the EM kernel completely on virtex 6 FPGAs. Our implementation is done at C-level by using AutoESL high-level-synthesis tool [1] from Xilinx.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. FPGA’12, February 22–24, 2012, Monterey, California, USA. Copyright 2012 ACM 978-1-4503-1155-7/12/02 ...$10.00.

∫ Ω

|∇x| + α

M ∑

((Ax)i − bi log(Ax)i )

i=1

xj ≥ 0, j = 1, · · · , N

2.1 General Terms

ALGORITHM OVERVIEW

(1)

Ray Tracing

EM algorithm is often implemented with a ray-driven forwardprojection and a voxel-driven back-projection. To facilitate hardware sharing, we use ray-driven approach in both forward and backward projections. The code of the forward and backward projection is shown in Figure 1. The code first finds out the direction for the next voxel in the ray, then it performs multiply-and-accumulate operation to accumulate the sinogram or update the image. The tracing stops if the voxel hits the boundary of the object.

2.2

Intersection Computation

The tracer_precal() function is responsible for computing the intersection point of the ray with the object and find out the parameter required for the tracing. Given a source coordinate (sx , sy , sz ) and destination (dx , dy , dz ), the procedure finds out the intersection point with the object which is a cube 0 ≤ x < Nx , 0 ≤ y < Ny , 0 ≤ z < Nz . A number of divisions are used in the procedure.

3. 3.1

IMPLEMENTATION & OPTIMIZATION Parallel Backward Projection

The forward projection can be parallelized easily. A large number of parallel unit can operate on the forward ray tracers simultaneously for different source and detector pairs. For backward projection, there are dependencies among views. Moreover, even within one view, there are conflicts when two parallel units update one pixel. To resolve the data conflicts within one view, atomic functions that guarantee the mutual exclusion of an address in memory, can be used to handle such potential data conflicts. However, our target FPGA platform do not provide atomic operations

/ / EMupdate : r a y−t r a c i n g a l g o r i t h m f o r a l l t he views for a l l the detectors { tracer_precal(); / / f i n d i n i t i a l r a y p a r a m e t e r s / / λx , λy , λz , λ0 , vx , vy , vz , / / Lenx , Leny , Lenz , signx , signy , signz i f ( mode ==0) t e m p s i n o = 0 ; / / f o r w a r d p r o j e c t i o n else v a l u e = sinogram ( . . ) ; / / backward p r o j e c t i o n f o r (i = 0; i < Nx + Ny + Nz ; i + +) / / ( t r a c e r _ l o o p ) { if ( λx