Sunday, June 20, 2010

Synphony C Compiler

Following its acquisition of Synfora, Synopsys now provides a C compiler for FPGA and ASIC development, currently focused mainly on prototyping. It competes directly with Mentor's CatapultC, Forte's Cynthesizer, and Cadence's C-to-Silicon tools:

High Level Synthesis with Synphony C Compiler
Delivering the Lowest Power hardware for Mobile and Consumer Devices

Design teams are under growing pressure to create faster, cheaper, better products. Increasingly, power consumption is becoming the most critical differentiator, and designers struggle to identify where power is being consumed and how to reduce it. Synphony C Compiler is the industry’s first algorithmic synthesis tool that automatically optimizes power consumption at the system level using a variety of techniques, including automatic multi-level clock gating insertion along with the necessary control logic. Synphony C Compiler has delivered savings of up to 50% using this technique.

About Synphony C Compiler
Synopsys's Synphony C Compiler creates application accelerators from sequential, untimed C algorithms for complex processing hardware in video, imaging, wireless and security domains. Synphony generates RTL, verification test-benches, SystemC models at multiple levels of accuracy, software driver and interoperability scripts.

Synphony C Compiler delivers high productivity gains by creating application accelerators from high level C/C++ code and automating the verification down through implementation. Synphony C Compiler achieves excellent QoR through a unique parallelizing compiler, using multi-level hierarchical abstraction and IP reuse.
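
To make "sequential, untimed C" concrete, here is a minimal sketch of the kind of source such a flow starts from. It is plain C and not Synphony-specific: the function name fir_filter and the constant NUM_TAPS are hypothetical, and no tool directives are shown because the text above does not give their syntax.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_TAPS 4  /* hypothetical filter length, chosen for illustration */

/*
 * Sequential, untimed FIR filter: out[i] = sum over k of coeff[k] * in[i - k].
 * The source contains no clocks, registers, or handshakes; how many
 * multipliers to build and which cycle each operation runs in would be
 * decided by the synthesis tool, not written here.
 */
void fir_filter(const int16_t *in, int32_t *out, size_t n,
                const int16_t coeff[NUM_TAPS])
{
    assert(in != NULL && out != NULL && coeff != NULL);

    for (size_t i = 0; i < n; i++) {
        int32_t acc = 0;
        for (size_t k = 0; k < NUM_TAPS; k++) {
            /* Samples before the start of the input are treated as zero. */
            if (i >= k) {
                acc += (int32_t)coeff[k] * (int32_t)in[i - k];
            }
        }
        out[i] = acc;
    }
}
```

The same function can be compiled and run on a workstation as a reference model, which is what makes a C-level test bench reusable against the generated hardware.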

Synphony C Compiler
Synphony C Compiler introduces a major innovation in algorithmic synthesis, automatic multi-level clock gating insertion, to enable power optimizations at the system level and eliminate all manual work. In traditional RTL (Register Transfer Level) design methodologies, inserting clock gating at a block level is usually a manual effort because it requires knowledge of when the block is inactive. Using Synphony C Compiler, the designer uses directives to specify where to insert clock gating, and Synphony does the rest automatically. In all cases the user can make changes without having to change the algorithm or the code.

As a result, Synphony C Compiler allows designers to retain all the productivity benefits of automated synthesis and verification, including reduced design and verification time and the ability to react very rapidly to changes in the design specification, while optimizing the IC power consumption.

Key Capabilities

Coarse-Grain Clock Gating:
Synphony C Compiler builds the clock gating infrastructure to turn off complete blocks at the top level of the design; for example, to turn off the complete quantize stage of an imaging pipeline. Of critical value is the control logic that will indicate when the block can be turned off. Synphony C Compiler understands exactly when every block is active versus idle, so the clock enable logic can be designed automatically. There is no need for time consuming manual analysis to decide “when” a block can be turned off.
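
The idea of deriving a clock enable from a block's active/idle schedule can be illustrated with a small software model. This is only a sketch under stated assumptions: the schedule is supplied by hand rather than by a synthesis tool, the names clock_activity, count_clock_events and clock_event_savings are hypothetical, and counting register clock events is a rough proxy for dynamic power, not a power measurement.

```c
#include <assert.h>
#include <stddef.h>

/* Clock-event counts for one block over a simulated run. */
typedef struct {
    unsigned long ungated;  /* register clock events with a free-running clock */
    unsigned long gated;    /* register clock events when idle cycles are gated */
} clock_activity;

/*
 * active[c] is nonzero when the block has work to do in cycle c.
 * num_regs is the number of registers clocked inside the block.
 */
clock_activity count_clock_events(const int *active, size_t cycles,
                                  unsigned long num_regs)
{
    assert(active != NULL);

    clock_activity result = {0, 0};
    for (size_t c = 0; c < cycles; c++) {
        /* Without gating, every register sees every clock edge. */
        result.ungated += num_regs;
        /* With gating, the clock enable follows the block's active flag. */
        if (active[c]) {
            result.gated += num_regs;
        }
    }
    return result;
}

/* Fraction of clock events removed by gating, in the range 0.0 to 1.0. */
double clock_event_savings(clock_activity a)
{
    if (a.ungated == 0) {
        return 0.0;
    }
    return 1.0 - (double)a.gated / (double)a.ungated;
}
```

For a block that is active in two cycles out of every four, this model reports that half of the clock events are removed. The hard part in practice is knowing the active flag for every cycle, which is the piece the text says the compiler derives automatically.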

Fine-Grain Clock Gating:
There may be significant power savings from turning off only portions of a block, for example a TCAB (tightly coupled accelerator block) used in a top-level block or in another TCAB. As with coarse-grain clock gating, Synphony C Compiler automates clock gating insertion for TCABs hierarchically.

Automatic Functional Verification:
Synphony C Compiler provides automatic functional verification to check the sequencing of clock gating for both coarse-grained and fine-grained clock gating.

Integration with downstream tools:
Synphony C Compiler automatically generates waveforms in VCD/FSDB formats to enable power measurement in down-stream power analysis tools.

Key Benefits
  • Significant power savings: >50% for some applications
    • Power savings are over-and-above what can be achieved with gate-level clock gating in down-stream tools
  • Fully automated and easy to use
    • Eliminates the time-consuming manual effort of inserting clock gating, verifying it, and measuring power with down-stream tools

  1. The following table shows the power savings achieved on two customer designs using Synphony C Compiler. Most of the benefits in the video design come from coarse-grained clock gating, whereas most of the benefits in the wireless design come from fine-grained clock gating. Although it is not always possible to predict which technique will deliver the best results, Synphony C Compiler makes it easy to rapidly create the designs and measure the results.

                       Fine-Grained    Coarse + Fine-Grained
                       Clock Gating    Clock Gating
    Video design       50%             53%
    Wireless design    4.73%           22.4%

  2. A design for a low density parity check (LDPC) decoder for the next generation wireless handset SoC achieved 23.5% reduction in dynamic power over an identical design using a standard flow.
  3. An evaluation of the effectiveness of the approach using 8 complex applications in video, imaging and wireless domains demonstrated the following:
    • Up to 50% reduction in dynamic power for executing a single task and up to 30% savings while executing a large number of tasks
    • Average power reduction of 22% for a single task and 15% over multiple tasks

Synphony Model Compiler
High Level Synthesis with Synphony Model Compiler

Figure 1: Synphony Model Compiler provides a faster, more automated path from
high level algorithm descriptions to FPGA or ASIC, prototypes, and verification flows.

Faster and More Efficient Model Creation
Modeling environments are popular for algorithm design and exploration because they allow concise representations of behavior at very high levels of abstraction. These environments provide sophisticated design capture, simulation, and analysis tools for multiple domains. However, problems arise when the designer needs to translate the design intent into its RTL counterpart for use with ASIC or FPGA implementation tools. In particular, traditional methods have proven to be very time consuming and error prone because the design must be re-coded and re-verified in the RTL domain. The Synphony Model Compiler solution addresses these problems by providing an easy and automated method to synthesize high-level algorithmic representations from the Simulink/MATLAB model-based environment.

Optimizations, Exploration, and Verification from a Single Model
Synphony Model Compiler enables rapid exploration of architectural tradeoffs from a single model and reduces errors and risk by maintaining consistent verification across multiple architecture choices and target technologies. Given the user-specified target and architectural constraints, the HLS engine automatically optimizes at multiple levels by applying pipelining, scheduling, and binding optimizations across the entire system, including IP blocks, and throughout the design hierarchy. Synphony Model Compiler also includes advanced technology characterizations that utilize Synplify Premier or Design Compiler for FPGA or ASIC, respectively. This provides the accurate timing estimation needed to make device-specific optimizations across FPGA and ASIC targets. More importantly, it increases the reliability of verification through these design project phases, regardless of whether the target is FPGA prototyping, fast architecture exploration, or ASIC implementation.

C-Output for Earlier Software Development and Faster System Validation
The difficult and time consuming effort of creating models for system validation and functional verification is a major challenge in today’s system modeling and verification environments. Synphony Model Compiler addresses this challenge by combining its highly efficient modeling flow with C-Output model generation. In addition to optimized RTL, the HLS engine generates flexible, high performance fixed-point ANSI-C models that can be used in virtual platforms for early software development and a variety of other system simulation environments.
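
For readers unfamiliar with fixed-point C models, the sketch below shows the general style: integer arithmetic with an explicit binary point, rounding, and saturation. It is hand-written for illustration and is not output from Synphony Model Compiler. The Q1.15 format and the names q15_saturate, q15_mul and q15_dot are assumptions, and the code relies on arithmetic right shift of negative values, which is implementation-defined in C but is what mainstream compilers provide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define Q15_FRAC_BITS 15
#define Q15_ROUND     (1 << (Q15_FRAC_BITS - 1))  /* half an LSB, for rounding */

/* Clamp a wide intermediate value to the 16-bit Q1.15 range. */
static int16_t q15_saturate(int64_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Multiply two Q1.15 values, rounding to nearest and saturating. */
int16_t q15_mul(int16_t a, int16_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;  /* Q2.30 intermediate */
    return q15_saturate((product + Q15_ROUND) >> Q15_FRAC_BITS);
}

/* Dot product of two Q1.15 vectors with a wide accumulator. */
int16_t q15_dot(const int16_t *a, const int16_t *b, size_t n)
{
    assert(a != NULL && b != NULL);

    int64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        /* Accumulate full-precision products; round once at the end. */
        acc += (int64_t)a[i] * (int64_t)b[i];
    }
    return q15_saturate((acc + Q15_ROUND) >> Q15_FRAC_BITS);
}
```

Because every operation is bit-exact integer arithmetic, a model in this style can be compared word for word against RTL simulation results, which is what makes it usable in a virtual platform.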

Synphony Model Compiler brings these capabilities together for the first time in a single environment that supports complete, integrated solutions with Synopsys’ FPGA implementation, ASIC implementation, and hardware-assisted verification flows.

Improved Reliability and Time to Market
The benefits of Synphony Model Compiler are the ability to validate algorithm concepts much earlier in the design cycle, catch functional and system level problems much earlier, and more rapidly explore design space tradeoffs. With a more automated flow from higher levels of abstraction, Synphony Model Compiler gives system and algorithm designers much more power to realize these benefits and significantly improve the reliability and time to market of their ASIC and FPGA projects.

Features and Benefits
Synthesizable fixed-point high level IP model library
  • Eliminates writing of fixed-point models from scratch
  • Faster verification at higher levels of abstraction
  • Offers more control over results
High Level Synthesis Optimizations and Transformations
  • Automatic system-wide pipeline insertion, scheduling, and resource sharing
  • IP-aware micro architecture optimization
  • Automatic retiming and pipelining at the architecture level
  • Automatic scheduling for area optimization
  • Target-aware optimization for FPGAs and ASICs
Integrated ASIC Flow
  • Automatic generation of RTL constraints and scripts for Design Compiler
  • Advanced timing estimation using Design Compiler
  • Rapid architecture exploration of speed, area and power tradeoffs
Integrated FPGA Flow
  • Automatic generation of RTL constraints and scripts for Synplify Pro / Synplify Premier
  • Advanced timing estimation using Synplify Pro / Synplify Premier
  • Optimized mapping to advanced FPGA device resources such as hardware multipliers, MACs, adders, memories and shift registers
RTL Testbench Generation
  • Automatic generation of test vectors and scripts for RTL verification in VCS
C-model Generation for Software Development and System Validation
  • Fast model creation for C-based verification
  • Begin software development earlier using virtual prototypes

Synfora Introduces PICO Extreme

New technology enables the implementation of larger and complex sub-systems
By Gabe Moretti

EDA DesignLine

Venice, Florida - Synfora, Inc. has announced the availability of PICO Extreme™, and called it a breakthrough in algorithmic synthesis technology. The PICO platform automatically creates complex hardware sub-systems (application engines) from sequential untimed C algorithms. Tools based on the PICO platform allow designers to explore programmability, performance, power, area and clock frequency. PICO Extreme enables the implementation of larger and more complex sub-systems using a recursive system composition methodology based on Synfora's innovative tightly coupled accelerator blocks (TCAB) technology.

The technology is based on the recognition that when using C to describe hardware implementations, a C procedure is semantically equivalent to a Verilog module or a VHDL entity. Therefore both recursion and hierarchy can be used to increase the efficiency of designers and tools alike. Users are able to designate parts of their algorithm as custom building blocks.

These application-specific building blocks are C procedures that can be designed and verified standalone and then automatically integrated and scheduled as if they were primitive computing elements. In addition, TCABs can be composed of TCABs providing recursive composition of blocks to an arbitrary depth. This composition methodology improves the ability of the compiler to find better optimization, which improves performance and reduces area. With PICO Extreme, building hardware with pre-created blocks reduces the total runtime.
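
The composition idea can be pictured in ordinary C as procedures built from other procedures. The sketch below is an illustration only: it uses no PICO directives (their syntax is not given here), and the sum-of-absolute-differences example with the names abs_diff, row_sad4 and block_sad4x4 is hypothetical. Each function stands in for a block that could be designed and verified standalone and then reused one level up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Leaf building block: absolute difference of two pixels. */
static uint32_t abs_diff(uint8_t a, uint8_t b)
{
    return (a > b) ? (uint32_t)(a - b) : (uint32_t)(b - a);
}

/* Second-level block, composed of the leaf block: SAD over a row of 4 pixels. */
static uint32_t row_sad4(const uint8_t *x, const uint8_t *y)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < 4; i++) {
        sum += abs_diff(x[i], y[i]);
    }
    return sum;
}

/* Top level, composed of the second-level block: SAD over a 4x4 pixel block
 * stored row by row in 16 consecutive bytes. */
uint32_t block_sad4x4(const uint8_t *cur, const uint8_t *ref)
{
    assert(cur != NULL && ref != NULL);

    uint32_t sum = 0;
    for (size_t row = 0; row < 4; row++) {
        sum += row_sad4(cur + 4 * row, ref + 4 * row);
    }
    return sum;
}
```

Each level can be tested on its own in C before it is used by the level above, which mirrors the "design and verify standalone, then integrate" flow described in the article.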

Along with the TCAB technology, PICO Extreme also delivers the following capabilities for reduced power and ease of integration into the SoC:

  • An advanced clock gating scheme that enables the designer to gate the clock of a complete processing function (loop nest) as a single entity, halting any activity within the processing function (including the clock tree) while requiring only one clock gating cell.
  • The ability to extract and export mapping information that enables C-RTL equivalence checking tools to verify the equivalence between PICO-generated RTL and C. This information includes design latency/throughput, bit-accurate mapping of external C variables and stream functions to RTL block interfaces including scalar, stream and memory ports, and bit-accurate mapping of internal C variables to RTL wires, registers and memory objects.
  • An option to create an OCP-IP compliant host interface to ease integration into the rest of the SoC

Here is a comparison between different FPGA C compilers:

BDTI Certified Results for Synfora PICO High-Level Synthesis Tool

An FPGA-based implementation of a complex video motion analysis algorithm (BDTI Optical Flow application) using Synfora’s PICO C synthesis tools outperformed a traditional DSP processor implementation on throughput by a factor of more than 40x, achieving a processing rate of 204 frames per second, and provided a 30x price/performance advantage over DSPs. The PICO implementation required fewer modifications to the reference code than the DSP implementation to achieve the best performance.

According to BDTI’s Optical Flow application analysis, the overall development efforts for the FPGA-based system and the DSP-based system were comparable, even though somewhat different skill sets were required. The evaluation also found that the PICO High Level Synthesis platform produced results with an area efficiency comparable to a hand-coded RTL design. On the second BDTI application, the wireless receiver, the design implemented with the PICO High Level Synthesis platform required only 6.4% of FPGA resources compared to 5.9% for the hand-coded design.

To evaluate the PICO High Level Synthesis platform, BDTI used two complex DSP applications. The first is an Optical Flow video motion analysis application, which was used to compare the performance and price-performance of an FPGA-based implementation using PICO C synthesis tools with an implementation on a TI TMS320 DSP using TI’s software development tool chain. The second is a wireless receiver application, which was used to compare the cost efficiency of an implementation obtained using the PICO C synthesis flow on a Xilinx FPGA with an implementation that used hand-coded RTL.

In addition, BDTI engineers using PICO C synthesis tools to independently implement designs scored the tool on a number of usability metrics including out-of-the-box experience, ease of use, the extent of modification to the reference code, skill level required, the effort required to get to a first compiling version and the total effort required.

BDTI is an independent analysis firm that employs a rigorous evaluation methodology to measure the quality of results (performance and price-performance of designs) and usability (productivity and ease-of-use) of DSPs, FPGAs, and high-level synthesis tools. BDTI benchmark suites are recognized world-wide by processor vendors and systems developers alike as a trusted means to understand the relative capabilities of embedded processing devices and tools.

More info: BDTI Certified Results for the Synfora PICO High-Level Synthesis Tool
