Tuesday, December 30, 2008

XUP and H.264 MP HD Decoder

The following paper presents the architecture, design, validation, and hardware prototyping of the main architectural blocks of a main profile H.264/AVC decoder: inverse transforms and quantization, intra prediction, motion compensation, and the deblocking filter. These architectures were designed to reach high throughput and to be easily integrated with the other H.264/AVC modules. The architectures, all fully H.264/AVC compliant, were completely described in VHDL and validated through simulation and FPGA prototyping. They were prototyped on a Digilent XUP V2P board containing a Xilinx Virtex-II Pro XC2VP30 FPGA. The post place-and-route synthesis results indicate that the designed architectures can process 114 million samples per second and, in the worst case, 64 HDTV frames (1920x1080) per second, allowing their use in H.264/AVC decoders targeting real-time HDTV applications.

http://www.sbc.org.br/bibliotecadigital/download.php?paper=623


Finite-State Machine


Externally, an FSM is defined by its primary inputs, its outputs, and the clock signal. The clock signal determines when the inputs are sampled and when the outputs take their new values. Internally, the machine stores a state that is updated at each tick of the clock. There are two major types of FSM. If the primary outputs depend on the current state only, it is a Moore machine.



If the primary outputs are a function of both the primary inputs and the current state, it is known as a Mealy machine.
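The difference between the two types can be sketched in a few lines of Python. This is only an illustrative software model, not hardware code: the rising-edge detector, the state names (`LOW`, `EDGE`, `HIGH`) and the function names are assumptions made up for this example. Each loop iteration stands for one clock tick.

```python
# Mealy machine: the output depends on the current state AND the current input.
def mealy_rising_edge(bits):
    prev = 0  # state: the input value seen on the previous tick
    out = []
    for x in bits:
        out.append(1 if (prev == 0 and x == 1) else 0)
        prev = x  # state update at the clock tick
    return out


# Moore machine: the output depends on the current state only.
MOORE_NEXT = {
    ("LOW", 0): "LOW",   ("LOW", 1): "EDGE",
    ("EDGE", 0): "LOW",  ("EDGE", 1): "HIGH",
    ("HIGH", 0): "LOW",  ("HIGH", 1): "HIGH",
}
MOORE_OUT = {"LOW": 0, "EDGE": 1, "HIGH": 0}


def moore_rising_edge(bits):
    state = "LOW"
    out = []
    for x in bits:
        out.append(MOORE_OUT[state])    # output read from the state alone
        state = MOORE_NEXT[(state, x)]  # state update at the clock tick
    return out
```

For the same input sequence, the Moore version reports each edge one tick later than the Mealy version, because the edge must first be captured in the state before it can appear at the output.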



This paper gives an example of an FSM in VHDL, with ISE used to view the waveforms.



The following gives an example of an FSM in Bluespec.



Wednesday, December 24, 2008

Future HD Video

SAD implementation in FPGA hardware

This paper proposes a new unit, intended to augment a general-purpose core, that performs a SAD (sum of absolute differences) operation. The implementation can easily be extended to perform the complete SAD operation.

http://citeseer.ist.psu.edu/old/487250.html

Other related papers and similar documents are also listed there.
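For readers unfamiliar with the operation, SAD can be sketched in software as follows. This is a minimal reference model, not the paper's hardware design: the function names and the row-by-row split are assumptions chosen to mirror the idea of extending a smaller SAD unit to a complete block.

```python
def sad_row(cur_row, ref_row):
    """Sum of absolute differences between two rows of pixel values."""
    if len(cur_row) != len(ref_row):
        raise ValueError("rows must have the same length")
    return sum(abs(c - r) for c, r in zip(cur_row, ref_row))


def sad_block(cur_block, ref_block):
    """SAD of a whole block, accumulated from the per-row results."""
    if len(cur_block) != len(ref_block):
        raise ValueError("blocks must have the same number of rows")
    return sum(sad_row(c, r) for c, r in zip(cur_block, ref_block))
```

In motion estimation, the candidate reference block with the smallest `sad_block` value is taken as the best match for the current block.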




Tuesday, December 23, 2008

Useful Bluespec Examples

To store or read test vectors in a testbench, a RegFile can be used, with its contents initialized from a file at the start of simulation.

// Copyright 2008 hdfpga.blogspot.com  All rights reserved.
package Tb;

import FIFO ::*;
import Vector ::*;
import FIFOF ::*;
import RegFile::*;

(* synthesize *)
module mkTb (Empty);
  RegFile#(Bit#(9), Bit#(8)) rFile <- mkRegFileLoad("test.dat", 0, 15);
  Reg#(Bit#(9)) cnt <- mkReg(0);
  rule readAndDisp(cnt < 15);
       $display("#%03d: 0x%02x", cnt, rFile.sub(cnt));
       cnt <= cnt + 1;
  endrule
  rule finished(cnt == 15);
       $display("Finished");
       $finish(0);
  endrule
endmodule: mkTb
endpackage: Tb
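
The data file itself can be produced with a small script. The sketch below is written in Python and assumes that `mkRegFileLoad` reads a plain-text file in the Verilog hex memory format, with one hexadecimal value per line; the helper names are hypothetical and are not part of any Bluespec tool.

```python
def hex_memory_lines(values, bits=8):
    """Format integers as fixed-width hex strings, one per memory entry."""
    digits = (bits + 3) // 4  # hex digits needed for the given bit width
    lines = []
    for v in values:
        if not 0 <= v < (1 << bits):
            raise ValueError("value %r does not fit in %d bits" % (v, bits))
        lines.append(format(v, "0%dx" % digits))
    return lines


def write_hex_memory(path, values, bits=8):
    """Write the values to a text file, one hex entry per line."""
    with open(path, "w") as f:
        f.write("\n".join(hex_memory_lines(values, bits)) + "\n")
```

Calling `write_hex_memory("test.dat", range(16))` would write 16 entries, covering the address range 0 to 15 used by the testbench above.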

Moreover, to feed data from a RegFile to an application:

// Copyright 2008 hdfpga.blogspot.com  All rights reserved.
package mkTb;

import FIFO ::*;
import Vector ::*;
import FIFOF ::*;
import RegFile::*;
import Connectable::*;
import GetPut::*;

interface IInputGen;
    interface Get#(Vector#(4, Bit#(8))) ioout;
endinterface

interface ITestApp;
    interface Put#( Vector#( 4, Bit#(8)) ) ioin;
    interface Get#( Vector#(16, Bit#(8)) ) ioout;
endinterface

(* synthesize *)
module mkInputGen( IInputGen );

    RegFile#(Bit#(9), Vector#(4, Bit#(8))) rfile <- mkRegFileLoad("test.hex", 0, 4);
   
    FIFO#(Vector#(4, Bit#(8)))   outfifo <- mkFIFO;
    Reg#(Bit#(9))    index   <- mkReg(0);

    rule output_byte (index < 4);
       //$display( "inputbyte %x", rfile.sub(index) );
       outfifo.enq(rfile.sub(index));
       index <= index+1;
    endrule

    rule end_of_file (index == 4);
       $finish(0);
      //outfifo.enq(EndOfFile);
    endrule
   
    interface Get ioout = fifoToGet(outfifo);
   
endmodule

(* synthesize *)
module mkTestApp( ITestApp );

    RWire#(Vector#(16, Bit#(8))) pix_out    <- mkRWire;
    RWire#(Vector#( 4, Bit#(8))) pix_in     <- mkRWire;

    Reg#(Bit#(9)) cnt <- mkReg(0);
    Reg#(Bit#(4)) step <- mkReg(0);

    rule process_mode( isValid(pix_in.wget()));
     Vector#(4, Bit#(8)) pix = fromMaybe( ?, pix_in.wget() );
     // Print each of the four received bytes.
     for (Integer k = 0; k < 4; k = k + 1)
        $display("#%03d: 0x%02x", k, pix[k]);
    endrule

    interface Put ioin;
       method Action put( Vector#(4, Bit#(8)) pix ) if (step <= 3);
     //$display ("%d 0x%02x", cnt, pix[cnt]);
            pix_in.wset(pix);
         endmethod
    endinterface

    interface Get ioout;
         method ActionValue#(Vector#(16, Bit#(8))) get() if (isValid(pix_out.wget));
              return fromMaybe(?, pix_out.wget());
         endmethod
    endinterface
endmodule

(* synthesize *)
module mkTb (Empty);

    IInputGen     inputgen    <- mkInputGen();
    ITestApp     testApp    <- mkTestApp();

    Reg#(Bit#(8)) x <- mkReg(0);
    Reg#(Bit#(9)) cnt <- mkReg(0);

    mkConnection( inputgen.ioout, testApp.ioin );

    rule connect;
        ///Vector#(4, Bit#(8)) pix = newVector;
        ///let x <- inputgen.ioout.get();
        //$display ("IO out %0d", x);
        //$display ("IO out 0x%02x", x);
        ///pix = x;
        ///TestApp.ioin.put(pix);
        cnt <= cnt + 1;
    endrule

  rule finished(cnt == 4);
     $display("Finished");
      $finish(0);
   endrule  
endmodule: mkTb

endpackage: mkTb


Thursday, December 18, 2008

Bluespec

Bluespec presents the hardware designer an exciting new way to simplify the complexity of constructing control logic while retaining full control over the architecture and performance of the design.

Bluespec’s ESL synthesis toolset for control logic and complex datapath designs significantly accelerates hardware design & reduces verification costs delivering:

  1. Over a 50% reduction in time to a verified design;
  2. Less than 50% of the bugs compared to RTL design;
  3. Design exploration and feature changes can be made correctly and much more quickly

http://www.xilinx.com/products/design_tools/logic_design/advanced/esl/bluespec.htm

Bluespec learning documentation:

http://sites.google.com/a/bluespec.com/learning-bluespec/Home/BSV-Documentation

Import C to Bluespec:


Some blogs about Bluespec:

Thursday, December 11, 2008

External Memory Controller for XUP

http://ce.et.tudelft.nl/publicationfiles/1203_672_soc_conf_rev3_1.pdf

An implementation of an On Chip Memory (OCM) based Double Data Rate external memory controller (OCM2DDR) for Virtex-II Pro is described. The proposed OCM2DDR controller comprises a Data Side OCM (DSOCM) bus interface module, read and write control logic, a halt read module, and the Xilinx DDR controller IP core. The presented design supports 16 MB of external DDR memory and 32-to-64-bit data conversion for single read and write operations. The implementation uses 1063 slices of the Virtex-II Pro FPGA and runs at 100 MHz. The major benefits of the proposed design are high bandwidth to external memory with reduced and more predictable access times compared to the Xilinx PLB DDR controller implementation. More specifically, read and write accesses are 2.44 and 4.25 times faster, respectively, than with the PLB-based solution.

Simple Speedups for XUP Board

  http://www.et.byu.edu/groups/ececmpsysweb/cmpsys.2005.winter/teams/ibox/groups/ibox/public/brian/index.html

This document describes a couple of easy tricks to help speed up the many multimedia applications that one might find useful to port to the ML-XUP board.

Sunday, December 7, 2008

Low-cost FPGAs and H.264


Suhel Dhanani from Ocean Logic published a paper "Video encoding with low-cost FPGAs for multi-channel H.264 surveillance":



FPGA-to-ASIC Conversion Flow

Wolfgang Hoeflich from AMI Semiconductor described how a high-definition video scaler ASIC was quickly created using a flexible FPGA-to-ASIC conversion flow. This ensured reproduction of the FPGA functionality and enabled first time fully functional silicon supporting video resolutions up to 1080p.

http://www.ent.eetchina.com/PDF/2006NOV/DTCOL_2006NOV16_AVDE_TA_01.pdf?SOURCES=DOWNLOAD 

In 2007, Synplicity Inc. released the Identify Pro tool, which allows full visibility into FPGA-based ASIC prototyping.

http://www.eetasia.com/ART_8800467308_480400_NP_4afd3bf3.HTM

http://www.eetindia.co.in/ART_8800466635_1800000_NP_0db8aeb6.HTM

SimGen is an EDIF/VHDL/FPGA to ASIC Conversion Utility and Simulation Generator for Tanner Tools EDA.

http://mc1soft.com/simgen/

ON Semiconductor provides FPGA-to-ASIC conversion services.

http://www.onsemi.com/PowerSolutions/content.do?id=16582

Epson also offers FPGA-to-ASIC conversion.

http://www.eea.epson.com/portal/pls/portal/docs/1/603441.PDF

NEC also offers a conversion flow and explains why such a conversion is needed.

http://www.cin.ufpe.br/~if729/arquivos/manuais/nec_asic_fpga.pdf

HD Video Encoding with DSP and FPGA


TI proposed using digital signal processors (DSPs) to handle the vast majority of video encoding applications unaided, with FPGAs as co-processors to offload certain tasks in even the most demanding video applications.


HD Video Test Clips and Video Format Conversion


Test clips at 1280 x 720 (720p) or 1920 x 1080 (1080p) can be downloaded from

http://www.microsoft.com/windows/windowsmedia/musicandvideo/hdvideo/contentshowcase.aspx

The WMV files can be converted to other formats such as YUV using ffmpeg:


The command is
ffmpeg -i test.wmv -s 1920x1088 -vframes number test.yuv

To convert from yuv422 to yuv420:
ffmpeg -pix_fmt yuv422p -s 704x576 -i 704_576_100f_422.yuv -pix_fmt yuv420p 704_576_100f_420.yuv


To convert to AVI, use
ffmpeg -s 1920x1088 -i test.yuv -vcodec copy test.avi

To convert to JPEG (for example, yuv2jpeg), use
ffmpeg -i test.264 -vframes 1 testvid%d.jpeg

To convert from BMP to YUV 4:2:0 or 4:2:2 (for example, bmp2yuv, bmp2yuv420, or bmp2yuv422), use
ffmpeg -f image2 -s 1600x1200 -vcodec bmp -i 1600x1200.bmp -pix_fmt yuv420p test420.yuv
ffmpeg -f image2 -s 1600x1200 -vcodec bmp -i 1600x1200.bmp -pix_fmt yuv422p test422.yuv
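
Raw YUV files carry no header, so the frame size passed with -s and the pixel format determine how the bytes are interpreted. The Python sketch below works out the size of one frame and the number of frames in a file. It assumes 8-bit planar data (yuv420p or yuv422p); the function names are made up for this example and are not part of ffmpeg.

```python
def frame_bytes(width, height, pix_fmt):
    """Size in bytes of one 8-bit planar YUV frame."""
    luma = width * height
    chroma_w = (width + 1) // 2  # chroma is halved horizontally in both formats
    if pix_fmt == "yuv420p":
        chroma_h = (height + 1) // 2  # and also halved vertically in 4:2:0
    elif pix_fmt == "yuv422p":
        chroma_h = height             # full vertical resolution in 4:2:2
    else:
        raise ValueError("unsupported pixel format: %s" % pix_fmt)
    return luma + 2 * chroma_w * chroma_h  # Y plane plus U and V planes


def frame_count(file_size, width, height, pix_fmt):
    """Number of whole frames in a raw YUV file of the given size."""
    return file_size // frame_bytes(width, height, pix_fmt)
```

For the 704x576 clip above, one yuv422p frame takes 811008 bytes and one yuv420p frame takes 608256 bytes, so the converted file should be three quarters of the size of the original.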

1080i music video clips are listed here:

Some popular CIF and QCIF test video sequences are listed on the following website. All the sequences are in 4:2:0 YUV format and are compressed in the 7-Zip format.

Saturday, December 6, 2008

Running Linux on a Xilinx XUP Board


http://www.hybridos.crhc.uiuc.edu/20060623-XUP-Linux-Tutorial-REVISION-FINAL.pdf

A tutorial for booting a fully functional operating system based on the Linux 2.4 kernel on a Xilinx University Program Virtex II-Pro based development board was presented by John H. Kelm. Furthermore, a reconfigurable hardware accelerator that can be accessed directly by applications or via a character device driver was described.

Crosstool, a software package created by Dan Kegel that allows x86 Linux machines to target the PowerPC 405 core of the XUP board, was used.



MEMOCODE and XUPV2P


The MEMOCODE (ACM-IEEE International Conference on Formal Methods and Models for Co-Design) contest has been sponsored by Xilinx, Bluespec, CEDA, and Nokia for some time.

In MEMOCODE 2007 (the 5th), the basic design challenge was to implement a high-performance matrix-matrix multiplication (MMM) using any HW and SW design methodology and targeting any FPGA development platform of the contestants' choice.

http://www.ece.cmu.edu/~jhoe/distribution/mc07contest/

In MEMOCODE 2008, hardware-accelerated crypto sorter designs were proposed for the HW/SW co-design contest. The goal was to sort an encrypted database of records, partitioning the problem between a PowerPC processor and the dedicated hardware resources available on a Xilinx Virtex-II Pro FPGA. The MIT team won the top honor. The code is on OPENCORE. The documentation can also be downloaded from OPENCORE.

The following link lists some of the submissions and the corresponding documentation:

http://rijndael.ece.vt.edu/memocontest08/everybodywins/
