EE Times-India
 

Simultaneously optimise data path, loop unrolling factor

Posted: 19 Aug 2015

Keywords: IP core, data path, unrolling factor, design space exploration, CDFG

In the semiconductor design flow, the proportion of in-house development is gradually shrinking because of factors such as time-to-market pressure and the expectation of mass production. Moreover, with the emergence of complex SoC designs, semiconductor companies increasingly rely on outsourcing macro-component design to third-party vendors; the finished designs are then imported back as IP cores. IP cores are custom, application-specific designs tailor-made to strict user specifications and constraints.

However, the IP design process is convoluted and non-trivial for loop-based control data flow graph (CDFG) applications, because it requires advanced optimisation techniques that can explore not only the optimal data path architecture but also the optimal loop unrolling factor (UF).

Design space exploration framework

Digital IP cores are re-usable designs that are employed in SoCs as macro-components. The design process of IP cores can initiate from any design abstraction level, such as behavioural level/system level or register transfer level (RTL).

When initiated at RT level, the design involves hardware description language (HDL) coding, with subsequent optimisations made in the code, followed by RTL synthesis and simulation/emulation. However, IP core design initiated at the behavioural/system level is non-trivial. This is because IP core design through behavioural synthesis in the context of CDFGs involves a complex mechanism called design space exploration (DSE), which is responsible for trading off multiple orthogonal parameters, such as:

  1. maximisation of DSE speed and minimisation of DSE inaccuracy,
  2. maximisation of circuit speed and minimisation of circuit area/power, and
  3. maximisation of loop unrolling for the highest speed and minimisation of the power incurred by loop unrolling[1].

A concurrent trade-off among the three aforesaid orthogonal parameters also implies that the DSE process should be capable of simultaneous data path optimisation and loop unrolling optimisation, which has not been addressed by the industry so far. Generally, such optimisation problems are intractable and considered NP-hard (non-deterministic polynomial-time hard). To solve such complex problems, it is essential to map advanced nature-inspired meta-heuristics onto the DSE framework, such as particle swarm optimisation (PSO) based DSE[1] and bacterial foraging optimisation algorithm (BFOA) based DSE[2].

Now that the background has been laid, let's discuss the procedure to optimise an IP core for CDFGs. I propose deploying the PSO-DSE framework[1] to solve the IP core optimisation problem for CDFGs. First, to convert the generic PSO algorithm into a DSE framework, the following mapping is used:

  1. Velocity of particle in search space (v) → Exploration drift in design space.
  2. Particle encoding (Xi) → Design solution in the space.
  3. Number of dimensions of particle encoding (D) → Number of resource types in CDFG + UF depth.

After the mapping is complete, the generic PSO function, which involves a velocity sub-function and variables such as the inertia weight, the acceleration coefficients, and the social and cognitive components, needs to be customised for the DSE problem. The modified PSO function is as follows:

$$V_{id}^{+} = \omega V_{id} + b_1 r_1 [R_{lb_i} - R_{id}] + b_2 r_2 [R_{gb} - R_{id}]$$ (eqn. 1)

$$R_{id}^{+} = R_{id} + V_{id}^{+}$$ (eqn. 2)

In equation (1), $\omega V_{id}$ is the inertia component, $b_1 r_1 [R_{lb_i} - R_{id}]$ is the cognitive component and $b_2 r_2 [R_{gb} - R_{id}]$ is the social component. Here $\omega$ is the inertia weight, $b_1$ and $b_2$ are acceleration coefficients, $r_1$ and $r_2$ are random numbers in the range [0, 1], $R_{lb_i}$ is the resource value of the local best position of the i-th particle, $R_{gb}$ is the resource value of the global best particle position, and $R_{id}$ is the resource value of the current position of the i-th particle (the subscript d indexes the D dimensions of the particle encoding).

In equation (2), $R_{id}^{+}$ is the next position of the i-th particle and $V_{id}^{+}$ is its updated velocity.
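
As a rough illustration, equations (1) and (2) above can be sketched in Python for a single particle. The function name `update_particle`, the default values of `omega`, `b1` and `b2`, and the choice to draw fresh random numbers for every dimension are assumptions made for this sketch; they are not taken from the PSO-DSE framework[1]. Rounding or clamping the result to valid resource counts is not covered here.

```python
import random


def update_particle(position, velocity, local_best, global_best,
                    omega=0.5, b1=2.0, b2=2.0, rng=random):
    """Apply eqn. 1 and eqn. 2 to every dimension d of one particle.

    omega, b1 and b2 are illustrative defaults (assumed values).
    rng is any object with a random() method returning a float in [0, 1).
    """
    new_position = []
    new_velocity = []
    for r_id, v_id, r_lb, r_gb in zip(position, velocity,
                                      local_best, global_best):
        r1 = rng.random()  # assumption: drawn independently per dimension
        r2 = rng.random()
        inertia = omega * v_id
        cognitive = b1 * r1 * (r_lb - r_id)
        social = b2 * r2 * (r_gb - r_id)
        v_next = inertia + cognitive + social      # eqn. 1
        new_velocity.append(v_next)
        new_position.append(r_id + v_next)         # eqn. 2
    return new_position, new_velocity
```

A particle that already sits on both its local best and the global best with zero velocity does not move, because all three components evaluate to zero.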

The encoding of the i-th particle must include both the data path resource configuration (Rx) and the loop unrolling factor (UF), i.e.

Xi = (Rx, UF),

where Rx is a vector of resource types of the data path.
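
One possible way to hold this encoding in code is sketched below. The class name `Particle`, its field names and the example resource counts are hypothetical, and the sketch assumes the unrolling factor occupies a single dimension of the encoding.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Particle:
    """Hypothetical container for the encoding Xi = (Rx, UF)."""
    resources: Tuple[int, ...]  # Rx: one entry per resource type
    unroll_factor: int          # UF

    def as_vector(self) -> Tuple[int, ...]:
        # Flatten (Rx, UF) into one vector, with UF as the last dimension.
        return self.resources + (self.unroll_factor,)

    @property
    def dimensions(self) -> int:
        # Assumption: the number of resource types plus one dimension for UF.
        return len(self.resources) + 1


# Example with two made-up resource types and an unrolling factor of 4.
example = Particle(resources=(2, 1), unroll_factor=4)
```

Here `example.as_vector()` gives `(2, 1, 4)` and `example.dimensions` gives `3`.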

The initial search space must comprise at least three particles. The first particle is encoded with the minimum resource value for each type and the minimum unrolling factor, i.e.:

Rx = (1, 1, ..., 1) and UF = 1

The second particle is encoded with the maximum resource value for each type and the maximum unrolling factor, i.e.:

Rx = (R1max, R2max, ..., RDmax) and UF = I,

where I is the total loop iteration count of the CDFG application.
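
The two boundary particles described above could be built as follows. This is only a sketch: the function name `boundary_particles` and the example values are assumptions, and the remaining particles of the initial population are not described in this section, so they are left out.

```python
def boundary_particles(max_resources, iteration_count):
    """Return the first two initial particles as (Rx, UF) pairs.

    max_resources   -- maximum number of units available per resource type
    iteration_count -- I, the total loop iteration count of the CDFG
    """
    # First particle: minimum resources (one of each type) and UF = 1.
    first = ([1] * len(max_resources), 1)
    # Second particle: maximum resources of each type and UF = I.
    second = (list(max_resources), iteration_count)
    return first, second


# Example with two made-up resource types and a loop of 8 iterations.
first, second = boundary_particles([3, 2], 8)
```

For this made-up input, `first` is `([1, 1], 1)` and `second` is `([3, 2], 8)`.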

