Why FPGA for DSP
This means that the binary point is located in the second position from the left. The next two boxes control how the result value is limited on the left-hand and right-hand sides. For the left-hand side, it is possible to choose between wrapping and saturation. A number of rounding approaches are possible for the right-hand side. All the other parameters refer to the VHDL generation.
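The effect of the two choices can be illustrated with a minimal Python sketch. The function name `quantize`, its parameters and the 8-bit word with 6 fractional bits are assumptions made for illustration only; they are not the settings or the API of the tool described above.

```python
import math


def quantize(value, total_bits, frac_bits, overflow="saturate", rounding="nearest"):
    """Quantize a real value to a signed two's-complement fixed-point format.

    total_bits: word length including the sign bit (hypothetical parameter).
    frac_bits:  number of bits to the right of the binary point.
    """
    scaled = value * (1 << frac_bits)

    # Right-hand side: decide what happens to the bits below the LSB.
    if rounding == "nearest":
        raw = math.floor(scaled + 0.5)   # round half up
    elif rounding == "truncate":
        raw = math.floor(scaled)         # drop the extra bits (toward minus infinity)
    else:
        raise ValueError("unknown rounding mode")

    lo = -(1 << (total_bits - 1))        # most negative representable integer
    hi = (1 << (total_bits - 1)) - 1     # most positive representable integer

    # Left-hand side: decide what happens when the value does not fit.
    if overflow == "saturate":
        raw = max(lo, min(hi, raw))      # clip to the nearest limit
    elif overflow == "wrap":
        raw = (raw - lo) % (1 << total_bits) + lo   # two's-complement wrap-around
    else:
        raise ValueError("unknown overflow mode")

    return raw / (1 << frac_bits)


# Assumed format: 8-bit word, 6 fractional bits, range -2.0 to 1.984375.
print(quantize(2.5, 8, 6, overflow="saturate"))   # 1.984375
print(quantize(2.5, 8, 6, overflow="wrap"))       # -1.5
print(quantize(0.7, 8, 6, rounding="nearest"))    # 0.703125
print(quantize(0.7, 8, 6, rounding="truncate"))   # 0.6875
```

With these assumed parameters, an out-of-range input of 2.5 is clipped to the largest representable value when saturating, but reappears as a negative number when wrapping, which is why saturation is usually the safer choice for audio signals.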
In principle, all that is now missing is the assignment of the physical pins and the clock to the algorithm. In the current example, the serial interface to the codec was implemented in VHDL. The relevant files are therefore added to the design. This can then, for example, be loaded onto an evaluation board via JTAG and be executed immediately.
This makes it clear that the selected FPGA is more than generously sized for a single equalizer. Flip-flops were primarily used for the serial interface and for shift registers. Look-up tables are used when adding the filters. The design presented here is completely parallel, with the result that an output value is calculated in every FPGA clock cycle. This means that the solution is optimized for the processing speed.
However, to achieve this, the equalizer occupies 20 of the 40 available 18x18-bit multipliers on the chip. The filters could have been implemented more economically in sequential form. However, the required development time would have been a little greater and the maximum achievable data rate would have been reduced. The data rate, in particular, indicates the phenomenal arithmetic performance offered by today's FPGAs.
The equalizer presented here processes 18 k samples per second. If adapted correctly, our equalizer would therefore be able to cope with more than audio channels.
Stettbacher, Stettbacher Signal Processing
In terms of their size and processing speeds, modern FPGAs (Field Programmable Gate Arrays) have attained a level that makes it possible not only to perform individual mathematical operations but also to accommodate entire signal processing algorithms.
See also the useful article from Xilinx on this subject. The DSP is a specialised microprocessor, typically programmed in C, perhaps with assembly code for performance.
It is well suited to extremely complex maths-intensive tasks, with conditional processing. It is limited in performance by the clock rate, and the number of useful operations it can do per clock. In contrast, an FPGA is an uncommitted "sea of gates". The device is programmed by connecting the gates together to form multipliers, registers, adders and so forth. Using the Xilinx Core Generator this can be done at a block-diagram level.
An FPGA's performance is limited by the number of gates it has and the clock rate. In comparison with the DSP this gives M multiplies per second. When sample rates grow above a few MHz, a DSP has to work very hard to transfer the data without any loss. This is because the processor must use shared resources such as memory buses, or even the processor core, which can be prevented from taking interrupts for some time.
A DSP is optimised for use of external memory, so a large data set can be used in the processing. FPGAs have a limited amount of internal storage, so they need to operate on smaller data sets. However, FPGA modules with external memory can be used to eliminate this restriction. A DSP is designed to offer simple re-use of the processing units; for example, a multiplier used for calculating an FIR can be re-used by another routine that calculates FFTs.
If a major context switch is required, the DSP can implement this by branching to a new part of the program. In contrast, an FPGA needs to build dedicated resources for each configuration.
If the configurations are small, then several can exist in the FPGA at the same time.
The input registers, output registers and adder unit are present in the DSP48 slice. There are a few additional slices required for sample and coefficient address generation and control. If the system specification required a higher-performance FIR filter, a parallel structure could be implemented.
This structure, which is also commonly referred to as a systolic FIR filter, uses pipelining and adder chains to exploit maximum performance from the DSP48 slice.
The input is fed into a cascade of registers that acts as the data sample buffer. Each register delivers a sample to a DSP48 slice, where it is multiplied by the respective coefficient.
The adder chain stores the partial products, which are then successively combined to form the final result. No external logic is required to support the filter, and the structure can be extended to support any number of coefficients.
This is the structure that can achieve maximum performance, because there is no high-fanout input signal. From this example, you can clearly see that the FPGA not only significantly outperforms a classic digital signal processor, but it does so with much lower clock rates and therefore lower power consumption.
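The behaviour of the two structures can be compared with a small cycle-by-cycle model in Python. This is only a sketch under simplifying assumptions: the function names are hypothetical, the register placement (two data registers and one sum register per stage) is an idealised model rather than the exact DSP48 pipeline, and integer arithmetic stands in for fixed-point hardware.

```python
def fir_sequential_mac(samples, coeffs):
    """Sequential reference: one multiply-accumulate per tap for each output."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, h in enumerate(coeffs):
            if n - k >= 0:
                acc += h * samples[n - k]
        out.append(acc)
    return out


def fir_systolic(samples, coeffs):
    """Idealised systolic model: every tap has its own multiplier and adder.

    Assumption: the sample advances two registers per stage, while the
    partial sum advances one register per stage along the adder chain.
    """
    n_taps = len(coeffs)
    if n_taps == 0:
        raise ValueError("at least one coefficient is required")

    data = [0] * (2 * (n_taps - 1) + 1)   # register cascade for the samples
    acc = [0] * n_taps                    # one partial-sum register per stage
    outputs = []

    # Feed extra zeros so the last results can drain out of the pipeline.
    for x in list(samples) + [0] * (n_taps - 1):
        data = [x] + data[:-1]            # shift the sample cascade by one
        # All stages update in the same clock cycle, each using the
        # previous stage's partial sum from the cycle before.
        acc = [(acc[k - 1] if k > 0 else 0) + coeffs[k] * data[2 * k]
               for k in range(n_taps)]
        outputs.append(acc[-1])

    # The first n_taps - 1 outputs are pipeline latency, not filter results.
    return outputs[n_taps - 1:]


print(fir_sequential_mac([3, 4], [1, 2]))   # [3, 10]
print(fir_systolic([3, 4], [1, 2]))         # [3, 10]
```

Both functions produce the same results. The difference is that the sequential version needs one pass through the inner loop per tap for every output, whereas the systolic model delivers one output per simulated clock cycle once the pipeline is full.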
The device may be further tailored to take advantage of data sample rate specifications that may fall in between the extremes of sequential MAC operation and full parallel operation. You may also consider additional trade-offs between performance and resource utilisation involving symmetric coefficients, interpolation, decimation, multiple channels or multirate. If the system sample rate is below a few kilohertz and is a single-channel implementation, the DSP may be the obvious choice.
However, as sample rates increase beyond a couple of megahertz, or if the system requires more than a single channel, FPGAs become more attractive. At high data rates the DSP may struggle to capture, process and output the data without any loss.
This is due to the many shared resources, buses and even the core within the processor. The FPGA, however, can dedicate resources to each of these functions. DSPs are instruction based, not clock based. Typically, three to four instructions are required for any mathematical operation on a single sample. The data must first be captured at the input, then forwarded to the processing core, cycled through that core for each operation and then released through the output.
In contrast, the FPGA is clock based, so every clock cycle has the potential ability to perform a mathematical operation on the incoming data stream. Since the DSP operates on instructions or code, the programming mechanism is standard C or, for higher performance, low-level assembly. This code may have high-level decision trees or branching operations, which may prove difficult to implement in an FPGA.
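The contrast between instruction-based and clock-based processing, and the range between sequential MAC and fully parallel operation, can be put into rough numbers. The following Python sketch is a back-of-the-envelope estimate only. The function names, the 100 MHz clock and the 64-tap filter are assumptions chosen for illustration, and the default of four instructions per operation is taken from the upper figure quoted above; real DSPs with single-cycle MAC units will do better than this crude model suggests.

```python
import math


def dsp_max_sample_rate(clock_hz, n_taps, instr_per_op=4):
    """Crude upper bound for an instruction-based processor.

    Assumes one operation per tap and instr_per_op instructions
    (one per clock cycle) for every operation.
    """
    return clock_hz / (n_taps * instr_per_op)


def fpga_multipliers_needed(sample_rate_hz, clock_hz, n_taps):
    """Multipliers needed when each one is time-shared between taps.

    With clock_hz // sample_rate_hz clock cycles available per sample,
    each multiplier can serve that many taps before the next sample arrives.
    """
    cycles_per_sample = clock_hz // sample_rate_hz
    if cycles_per_sample < 1:
        raise ValueError("sample rate exceeds the clock rate")
    return math.ceil(n_taps / cycles_per_sample)


CLOCK_HZ = 100_000_000   # assumed clock rate
N_TAPS = 64              # assumed filter length

print(dsp_max_sample_rate(CLOCK_HZ, N_TAPS))                   # 390625.0
print(fpga_multipliers_needed(1_000_000, CLOCK_HZ, N_TAPS))    # 1  (sequential MAC)
print(fpga_multipliers_needed(10_000_000, CLOCK_HZ, N_TAPS))   # 7  (in between)
print(fpga_multipliers_needed(100_000_000, CLOCK_HZ, N_TAPS))  # 64 (fully parallel)
```

Under these assumptions a single multiplier is enough at 1 MHz, while a fully parallel structure with one multiplier per tap is needed when a new sample arrives on every clock cycle.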
IP cores are available for FPGAs addressing video, image-processing, communications, automotive, medical and military applications.