Advanced Imaging

AdvancedImagingPro.com

   

Advanced Imaging Magazine

Updated: January 12th, 2011 09:49 AM CDT

Architecture for Airborne Applications

Processors for Airborne Intelligence, Reconnaissance, and Surveillance
Images courtesy SRC Computers
Figure 1: In the Implicit+Explicit Architecture, Dense Logic Devices (DLDs) encompass a family of components that includes microprocessors, digital signal processors, and some ASICs. These processing elements are all implicitly controlled and are typically made up of fixed logic that is not altered by the user.
Figure 2: Systems can be built with a single MAP processor and microprocessor combination. When more flexibility is desired, Multi-Ported Common Memory accommodating up to three MAP processors, or Hi-Bar switches accommodating thousands of MAP processors, can be employed.
Figure 3: SRC servers that use the Hi-Bar crossbar switch interconnect can incorporate common memory nodes in addition to microprocessor and MAP nodes. Each of these common memory nodes contains an intelligent DMA controller and up to 16 GB of DDR-2 SDRAM.
Figure 4: The MAP processor used in this system was the most powerful SRC-6 MAP processor ever produced. It was coupled to an Intel Pentium microprocessor and used a Fedora Linux operating system.
Figure 5: The second airborne system in production is a 10-module system designed for payload bay 3 of the General Atomics Sky Warrior, but it is also usable in other larger manned and unmanned platforms. It contains a dual Xeon motherboard, a Hi-Bar switch, 750 Gbytes of removable encrypted storage, a 28 VDC power system, a thermal solution, and a mixture of up to 10 MAP processors or common memory modules.
Figure 6: This system is being designed to withstand an operating range from –50C to +50C and an altitude limit in excess of 25,000 feet, and will meet shock and vibration requirements for single engine aircraft weighing less than 12,500 pounds.
Figure 7: A grayscale (GS) pixel’s intensity is simply the pixel’s eight-bit numeric value, but for a color pixel the intensity information is distributed among the individual RGB values. To obtain the intensity value for an RGB pixel, each 24-bit RGB value is transformed from the RGB color space to the Hue-Saturation-Intensity (HSI) color space. The intensity values for all pixels in both frames are then histogrammed. From these two intensity histograms, a statistical Cumulative Distribution Function (CDF) is created and then normalized for each frame. A mapping function is created from these two normalized CDF arrays to map each original color pixel intensity value to a new intensity value, such that the distribution of new intensity values matches the GS pixel intensity distribution. The original intensity values are re-mapped and the new HSI image is transformed back into the RGB color space.
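The histogram-matching flow described in the Figure 7 caption can be sketched in a few lines of Python. This is a minimal illustration only, not the SRC MAP implementation: the function names are hypothetical, the intensity is assumed to be the integer mean of R, G and B (the usual HSI intensity definition), and the conversion back from HSI to RGB is omitted.

```python
def rgb_intensity(rgb):
    """Integer HSI-style intensity of one (r, g, b) pixel: the mean of the channels (assumption)."""
    r, g, b = rgb
    return (r + g + b) // 3

def histogram(values, bins=256):
    """Count how many pixels fall on each intensity level."""
    counts = [0] * bins
    for v in values:
        counts[v] += 1
    return counts

def normalized_cdf(counts):
    """Running sum of the histogram, scaled so the last entry is 1.0."""
    total = sum(counts)
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running / total)
    return cdf

def build_mapping(src_cdf, ref_cdf):
    """For each source level, pick the lowest reference level whose CDF reaches the source CDF."""
    mapping, j, last = [], 0, len(ref_cdf) - 1
    for s in src_cdf:
        # Both CDFs are non-decreasing, so j only ever moves forward.
        while j < last and ref_cdf[j] < s:
            j += 1
        mapping.append(j)
    return mapping

def match_intensities(src, ref, bins=256):
    """Re-map source intensities so their distribution follows the reference frame."""
    mapping = build_mapping(normalized_cdf(histogram(src, bins)),
                            normalized_cdf(histogram(ref, bins)))
    return [mapping[v] for v in src]

# Toy frames: a dark "color" frame and a brighter grayscale reference.
color_intensities = [0, 0, 1, 1, 2, 2, 3, 3]
gray_intensities = [10, 10, 20, 20, 30, 30, 40, 40]
matched = match_intensities(color_intensities, gray_intensities, bins=64)
```

In this toy case each of the four dark levels is moved onto the reference level that sits at the same point of the cumulative distribution.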
Figure 8: The MAP processor’s GCM Bank 0 acts as a frame buffer for the RGB image and GCM Bank 1 acts as a frame buffer for the GS image. In stage 0, two RGB and six GS pixel intensities are histogrammed in parallel every clock. The integer RGB intensity calculation is part of the RGB histogramming pipeline. After all pixel intensities for both frames are histogrammed, stage 1 calculates the CDF arrays for both histograms, for all histogram bins in parallel. Stage 2 normalizes both CDF arrays in parallel, a single-precision floating-point (SPFP) calculation. Stage 3 uses both normalized CDF arrays to generate the histogram-matching map array. Finally, stage 4 re-reads the RGB image data from GCM Bank 0, two RGB pixels per clock, and calculates the HSI pixel values. The two integer intensity values select two new intensity values from the map array generated in stage 3. The two new intensity values are cast to SPFP and, together with the two SPFP pixel hue and saturation values, are converted back to the 24 bpp RGB color space and stored in GCM Bank 1.
Figure 9: The CPU normalized cross-correlation application is a single-threaded, serial implementation of the algorithm shown in Figure 8.

By David B. Pointer
SRC Computers

The polar format algorithm has a low computational cost, but it has limitations that make the backprojection algorithm more attractive. For instance, with backprojection the user can choose any imaging grid, while only one imaging grid is available with the polar format algorithm. The backprojection algorithm also intrinsically allows pulses to be added to or subtracted from an image, a capability that is unavailable in any other imaging algorithm.

The backprojection computational technique is broken down into the following steps:

• The image, or a slice of a 3-D object, is broken down into a set of 1-D projections

• Each projection is filtered individually

• The filtered projections are backprojected together

• The original image or cross-section is reconstructed
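The steps above can be sketched as a small filtered backprojection in pure Python. This is a generic tomographic illustration under stated assumptions, not the AFRL or SRC radar code: the function names are hypothetical, projections use nearest-bin sampling, and the filter is the standard discrete ramp (Ram-Lak) kernel with unit sample spacing.

```python
import math

def project(image, angles):
    """Step 1: break a square image into one 1-D projection per angle."""
    n = len(image)
    c = (n - 1) / 2.0
    sinogram = []
    for theta in angles:
        ct, st = math.cos(theta), math.sin(theta)
        bins = [0.0] * n
        for y in range(n):
            for x in range(n):
                k = int(round((x - c) * ct + (y - c) * st + c))
                if 0 <= k < n:
                    bins[k] += image[y][x]
        sinogram.append(bins)
    return sinogram

def ramp_filter(proj):
    """Step 2: filter one projection with the discrete Ram-Lak kernel."""
    n = len(proj)
    out = [0.0] * n
    for i in range(n):
        acc = 0.0
        for j in range(n):
            d = i - j
            if d == 0:
                h = 0.25
            elif d % 2 == 0:
                h = 0.0
            else:
                h = -1.0 / (math.pi * d) ** 2
            acc += h * proj[j]
        out[i] = acc
    return out

def backproject(sinogram, angles, n):
    """Steps 3 and 4: smear every filtered projection back across the grid and sum."""
    c = (n - 1) / 2.0
    image = [[0.0] * n for _ in range(n)]
    for bins, theta in zip(sinogram, angles):
        ct, st = math.cos(theta), math.sin(theta)
        for y in range(n):
            for x in range(n):
                k = int(round((x - c) * ct + (y - c) * st + c))
                if 0 <= k < n:
                    image[y][x] += bins[k]
    return image

# Toy input: a single bright point in a 17 x 17 image, viewed from 36 angles.
n = 17
angles = [i * math.pi / 36 for i in range(36)]
point = [[0.0] * n for _ in range(n)]
point[9][5] = 1.0
filtered = [ramp_filter(p) for p in project(point, angles)]
recon = backproject(filtered, angles, n)
peak = max((recon[y][x], y, x) for y in range(n) for x in range(n))
```

The innermost accumulation in `backproject` is the part that dominates run time, since every projection contributes to every pixel of the output grid.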

The backprojection algorithm is also used extensively in medical imaging. Medical applications that use computed tomography include Single Photon Emission Computerized Tomography (SPECT), Positron Emission Tomography (PET), Computed Tomography (CT) or Computed Axial Tomography (CAT) scans, and Magnetic Resonance Imaging (MRI).

Conversion Motivation

The algorithm developers at the Air Force Research Lab (AFRL) use MATLAB for prototyping and long-term evaluation of their computational techniques. Generating a single 2D image from the entire input data volume takes approximately 1.3 to 1.5 hours, depending on microprocessor performance.

The AFRL group was very interested in dramatically reducing this computation time to enhance the productivity of their algorithm developers.

Microprocessor Implementation

This implementation converted all of the original MATLAB code into C. The imaging routines implemented on the MAP processor took advantage of many of the optimization techniques supported by the MAP Compiler. These optimizations included spreading the computational array across multiple on-board memory banks, using Block RAM arrays, using two User Logic Chips, and overlapping DMA transfers with computation. A pipelined compute loop completes one iteration every clock. The major consumer of computational time was the summation of the contributions of each swath to every voxel in the 2D image. The architecture of the Series H processor provides the ability to structure the algorithm in novel ways.
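To make the one-iteration-per-clock statement concrete, the following sketch estimates the run time of a fully pipelined loop. It is a simple model, not a measurement: the pipeline depth and clock rate used in the example are hypothetical values chosen for illustration, since the article gives neither.

```python
def pipelined_cycles(iterations, pipeline_depth):
    """Clocks for a fully pipelined loop: fill the pipeline once, then one result per clock."""
    if iterations <= 0:
        return 0
    return pipeline_depth + iterations - 1

def loop_seconds(iterations, pipeline_depth, clock_hz):
    """Convert the cycle count to seconds for a given clock rate."""
    return pipelined_cycles(iterations, pipeline_depth) / clock_hz

# Hypothetical numbers: 1,000,000 iterations, a 50-stage pipeline, a 100 MHz clock.
cycles = pipelined_cycles(1_000_000, 50)
seconds = loop_seconds(1_000_000, 50, 100e6)
```

For long loops the fill time is negligible, so the iteration count divided by the clock rate is a good first estimate of the loop time.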


