Images courtesy SRC Computers
Figure 1: In the Implicit+Explicit Architecture, Dense Logic Devices (DLDs) encompass a family of components that includes microprocessors, digital signal processors, and some ASICs. These processing elements are all implicitly controlled and typically are made up of fixed logic that is not altered by the user.
Figure 2: Systems can be built with a single MAP processor and microprocessor combination. When more flexibility is desired, Multi-Ported Common Memory accommodating up to three MAP processors and Hi-Bar switches accommodating thousands of MAP processors can be employed.
Figure 3: SRC servers that use the Hi-Bar crossbar switch interconnect can incorporate common memory nodes in addition to microprocessor and MAP nodes. Each of these common memory nodes contains an intelligent DMA controller and up to 16 Gbytes of DDR-2 SDRAM.
Figure 4: The MAP processor used in this system was the most powerful SRC-6 MAP processor ever produced. It was coupled to an Intel Pentium microprocessor and used a Fedora Linux operating system.
Figure 5: The second airborne system in production is a 10-module system designed for payload bay 3 of the General Atomics Sky Warrior, but it is also usable in other, larger manned and unmanned platforms. It contains a dual Xeon motherboard, a Hi-Bar switch, 750 Gbytes of removable encrypted storage, a 28 VDC power system, a thermal solution, and a mixture of up to 10 MAP processors or common memory
Figure 6: This system is being designed to withstand an operating range from –50 °C to +50 °C and an altitude limit in excess of 25,000 feet, and to meet shock and vibration requirements for single-engine aircraft weighing less than 12,500 pounds.
Figure 7: A grayscale (GS) pixel’s intensity is simply the pixel’s eight-bit numeric value, but for a color pixel the intensity information is distributed among the individual RGB values. To obtain the intensity value for an RGB pixel, each 24-bit RGB value is transformed from the RGB color space to the Hue-Saturation-Intensity (HSI) color space. The intensity values for all pixels in both frames are then histogrammed. From these two intensity histograms, a statistical Cumulative Distribution Function (CDF) is created and then normalized for each frame. A mapping function is created from these two normalized CDF arrays to map each original color pixel intensity value to a new intensity value such that the new intensity value distribution matches the GS pixel intensity value distribution. The original intensity values are re-mapped and the new HSI image is transformed back into the RGB color space.
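The histogram, CDF, and mapping steps described in the Figure 7 caption can be sketched in a few lines of Python. This is only an illustrative sketch, not SRC's MAP implementation: the function names are hypothetical, the intensity is assumed to follow the common HSI definition I = (R + G + B) / 3 in integer arithmetic, and the mapping rule (pick the first GS level whose normalized CDF reaches the color pixel's CDF value) is one conventional way to match histograms.

```python
from bisect import bisect_left

def intensity(r, g, b):
    """Integer intensity of an 8-bit RGB pixel (assumed HSI definition: channel mean)."""
    return (r + g + b) // 3

def histogram(values, bins=256):
    """Count how many pixels fall on each of the 256 intensity levels."""
    counts = [0] * bins
    for v in values:
        counts[v] += 1
    return counts

def normalized_cdf(counts):
    """Running sum of the histogram, divided by the pixel count so it ends at 1.0."""
    total = sum(counts)
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running / total)
    return cdf

def build_map(src_cdf, ref_cdf):
    """For each source level, choose the first reference level whose CDF reaches it."""
    last = len(ref_cdf) - 1
    return [min(bisect_left(ref_cdf, p), last) for p in src_cdf]

# Tiny example: four color pixels and four grayscale pixels.
rgb_pixels = [(30, 30, 30), (60, 60, 60), (90, 90, 90), (120, 120, 120)]
gs_pixels = [100, 150, 200, 250]

rgb_cdf = normalized_cdf(histogram([intensity(*p) for p in rgb_pixels]))
gs_cdf = normalized_cdf(histogram(gs_pixels))
level_map = build_map(rgb_cdf, gs_cdf)
```

In this toy case the four color intensities 30, 60, 90, and 120 map to the grayscale levels 100, 150, 200, and 250, so the re-mapped color image takes on the grayscale frame's intensity distribution.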
Figure 8: The MAP processor’s GCM Bank 0 acts as a frame buffer for the RGB image and GCM Bank 1 acts as a frame buffer for the GS image. In stage 0, two RGB and six GS pixel intensities are histogrammed in parallel every clock. The integer RGB intensity calculation is part of the RGB histogramming pipeline. After all pixel intensities for both frames are histogrammed, stage 1 calculates the CDF arrays for both histograms for all histogram bins in parallel. Stage 2 normalizes both CDF arrays in parallel, a single-precision floating-point (SPFP) calculation. Stage 3 uses both normalized CDF arrays to generate the histogram-matching Map array. Finally, stage 4 re-reads the RGB image data from GCM Bank 0, two RGB pixels per clock, and calculates the HSI pixel values. The two integer intensity values select two new intensity values from the Map array (generated in stage 3). The two new intensity values are cast to SPFP and, together with the two SPFP pixel hue and saturation values, are converted back to the 24 bpp RGB color space and stored in GCM Bank 1.
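The final re-mapping stage in the Figure 8 caption can be illustrated with a short Python sketch. It deliberately swaps in a shortcut rather than performing an explicit HSI round trip: under the standard HSI definitions, scaling R, G, and B by the same factor leaves hue and saturation unchanged, so the pixel is simply scaled by the ratio of new to old intensity. The function name `remap_pixel` and the parameter `level_map` (standing in for the Map array) are hypothetical, and this is not a description of how the MAP processor implements the stage.

```python
def remap_pixel(rgb, level_map):
    """Give an RGB pixel the intensity selected by level_map, keeping hue and saturation.

    Assumption: intensity is the integer channel mean, and uniform scaling of the
    channels is used in place of an explicit RGB -> HSI -> RGB conversion.
    """
    r, g, b = rgb
    old = (r + g + b) // 3
    if old == 0:
        # A (near-)black pixel carries no usable hue; emit a gray of the new intensity.
        new = level_map[0]
        return (new, new, new)
    scale = level_map[old] / old
    # Clip to the 8-bit range; clipping can shift saturation for very bright pixels.
    return tuple(min(255, round(c * scale)) for c in (r, g, b))

# Example with a hypothetical map that doubles every intensity level.
doubling_map = [min(255, 2 * i) for i in range(256)]
brighter = remap_pixel((10, 20, 30), doubling_map)
```

Here the pixel (10, 20, 30) has intensity 20, the map selects 40, and the result is (20, 40, 60), which keeps the same channel ratios at twice the intensity.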
Figure 9: The CPU normalized cross-correlation application is a single-threaded, serial implementation of the algorithm shown in Figure 8.
SRC Computers (Colorado Springs, Colo.) has developed a new hardware architecture and programming environment that deliver orders of magnitude more performance per processor than current high-performance microprocessors. This new architecture is called the IMPLICIT+EXPLICIT™ Architecture. Systems built with this architecture execute the user’s code, written in ANSI-standard high-level languages such as C or Fortran, on a mixture of tightly coupled, implicitly controlled microprocessors and explicitly controlled, reconfigurable MAP processors. This allows the programmer to use both implicitly controlled functions, such as running a standard Linux operating system and executing legacy code, and explicitly controlled features, such as application-specific data prefetch, data access, and functional units. The architecture is applicable to systems ranging in size from handheld devices to large multi-rack systems.
The fundamental IMPLICIT+EXPLICIT Architecture is shown in Figure 1. In this architecture, the explicit and implicit processors are peers with respect to their ability to access system memory contents. In this fashion, overhead associated with having both types of processors working together on the same program is minimized. This allows the programmer to utilize whichever processor type is best for a given portion of the overall application without concern for control handoff penalties.
In this architecture, Dense Logic Devices (DLDs) encompass a family of components that includes microprocessors, digital signal processors, and some ASICs. These processing elements are all implicitly controlled and typically are made up of fixed logic that is not altered by the user. These devices execute software-directed instructions on a step-by-step basis in fixed logic having predetermined interconnections and functionality. Direct Execution Logic (DEL), on the other hand, is a family of components that is explicitly controlled and is typically reconfigurable. This family includes Field Programmable Gate Arrays (FPGAs), Field Programmable Object Arrays (FPOAs), and Complex Programmable Logic Devices (CPLDs). These devices allow the programmer to establish an optimized configuration of functional units that implements the desired computational, prefetch, and/or data access functionality, maximizing the parallelism inherent in the particular code. The SRC implementation of a Direct Execution Logic processor is the MAP® processor.
Both DLD and DEL-based processing elements are interconnected as peers to a shared system memory in one fashion or another. It is not required that interconnects support cache coherency since data sharing can be implemented in an explicit fashion.
Computation in the MAP processor uses dynamic logic, which conforms to the application rather than forcing the application into a fixed microprocessor architecture where one size must fit all. This delivers the most efficient circuitry for any particular code in terms of the precision of the functional units and the parallelism that can be found in the code. The result is a dynamic, application-specific processor that can evolve along with a given code and can be reprogrammed in a fraction of a second to handle different portions of the code. The MAP processor combines the performance of a special-purpose computer with the economy of a general-purpose machine.