
RTL synthesis requirements for sub-20nm designs

20 Feb 2014  | David Stratman, Sanjiv Taneja


Sub-20nm design has arrived, and it brings a new set of challenges for register-transfer level (RTL) designers as the race for the best performance, power, and area (PPA) continues unabated. The challenges include giga-scale integration of new functionality; new physics effects; new device structures such as FinFETs, multi-Vt and multi-channel devices; interconnect stacks whose resistance characteristics vary widely between the top and bottom layers; and process variation.

These challenges raise several questions. Can RTL synthesis handle giga-scale, giga-hertz designs within a market-relevant timeframe? Can logic synthesis model the interconnect stack and its physical effects accurately and predictively at the RTL stage? How do new device structures affect the trade-off between dynamic and leakage power, and the choice of libraries? This article explores these challenges and gives an overview of state-of-the-art technology for addressing them in a predictable and convergent design flow.


The interconnect challenge
The fundamental requirement for interconnects is to meet the high-speed transmission needs of a chip's global and local signals as feature sizes continue to scale. As illustrated in figure 1, the total interconnect length at 20nm is 6000m per square centimetre of chip area, a 2x increase over 32nm and more than a 3x increase over 65nm.


Figure 1: The total interconnect length in 20nm is 6000m in one square-centimeter of the chip area.


It is no surprise that the "interconnect/gate delay gap" continues to widen, with interconnect delay increasingly determining chip performance (figure 2). The length of Metal 1 and intermediate wires usually shrinks with traditional scaling, so the impact of their delay on performance is minimal. Global interconnects, which have the longest wires, are likely to be affected the most by the degraded delay. Material changes, or some amelioration of the increase in copper (Cu) resistivity, won't be enough to meet overall performance requirements. Figure 2 shows the delay of Metal 1 and global wiring in future generations. Repeaters can be inserted to address the delay in global wiring, but this approach comes at the cost of more power consumption and increased chip area.


Figure 2: The delay of Metal 1 and global wiring in future generations (Source: ITRS, 2011).
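The repeater trade-off described above can be illustrated with a first-order RC model. The sketch below is not taken from the article or from any tool: it uses the common textbook approximation of 0.69·R·C for lumped terms and 0.38·R·C for the distributed wire, and every number in it (wire resistance and capacitance per micrometre, wire length, repeater drive resistance and input capacitance) is a hypothetical value chosen only for illustration.

```python
# Sketch: first-order delay of a long global wire, with and without repeaters.
# All electrical values below are hypothetical, not data from the article.

def staged_delay(r_per_um, c_per_um, length_um, n_stages, r_drv, c_in):
    """Approximate 50% delay (seconds) of a wire split into n_stages segments.

    Each segment is driven by a repeater with output resistance r_drv (ohm)
    and loads the previous stage with input capacitance c_in (farad).
    Lumped terms use 0.69*R*C; the distributed wire term uses 0.38*R*C.
    """
    seg_um = length_um / n_stages
    r_w = r_per_um * seg_um          # wire resistance of one segment
    c_w = c_per_um * seg_um          # wire capacitance of one segment
    per_stage = 0.69 * r_drv * (c_w + c_in) + r_w * (0.38 * c_w + 0.69 * c_in)
    return n_stages * per_stage

R_PER_UM = 1.0        # ohm/um   (hypothetical global-layer wire)
C_PER_UM = 0.2e-15    # F/um     (hypothetical)
LENGTH_UM = 5000.0    # 5 mm global route (hypothetical)
R_DRV = 200.0         # ohm      (hypothetical repeater drive resistance)
C_IN = 5e-15          # F        (hypothetical repeater input capacitance)

def delay_with(n_stages):
    return staged_delay(R_PER_UM, C_PER_UM, LENGTH_UM, n_stages, R_DRV, C_IN)

single_driver = delay_with(1)
best_n = min(range(1, 101), key=delay_with)
best_delay = delay_with(best_n)
added_gate_cap = (best_n - 1) * C_IN  # crude proxy for extra power and area

print(f"one driver     : {single_driver * 1e12:8.1f} ps")
print(f"{best_n:3d} stages     : {best_delay * 1e12:8.1f} ps")
print(f"added gate cap : {added_gate_cap * 1e15:8.1f} fF")
```

Because the wire term grows with the square of the segment length, splitting the route shortens the total delay, while each extra repeater adds switched capacitance and cell area. That is the power and area cost the text refers to.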


The use of heterogeneous multi-core systems is further exacerbating the interconnect challenge as interconnecting a large number of processor and GPU cores creates a large number of criss-crossing wires and spaghetti-like routing congestion.


Hierarchical scaling
High-performance processor cores rely on a large number of metal layers, applying a hierarchical wiring approach where the pitch and thickness at each conductor level is increased steadily to mitigate the performance impact of interconnect delay. From Cu wiring to low-k dielectrics, ASICs have many technology attributes in common with MPUs. Compared to MPU design, however, ASIC design methodology is generally more regular, with Metal 1, intermediate, semi-global (2x intermediate), and global (4x intermediate) wire pitches.
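To see why wider, thicker upper layers help, the sketch below estimates resistance per unit length for the three ASIC wiring tiers named above. It rests on several assumptions that are not in the article: a rectangular wire whose width is half the pitch, a fixed aspect ratio of 2, bulk copper resistivity, and a hypothetical 64nm intermediate pitch. Real nanoscale wires have higher effective resistivity, so the absolute values are optimistic; only the ratios between tiers are the point.

```python
# Sketch: wire resistance per unit length across a hierarchical metal stack.
# Geometry rules and the intermediate pitch are assumptions for illustration.

CU_BULK_RESISTIVITY = 1.68e-8  # ohm*m, bulk copper; nanoscale wires are worse

def resistance_per_um(pitch_nm, aspect_ratio=2.0, resistivity=CU_BULK_RESISTIVITY):
    """Return ohms per micrometre for a rectangular wire.

    Assumes width = pitch / 2 and thickness = aspect_ratio * width.
    """
    width_m = (pitch_nm * 1e-9) / 2.0
    thickness_m = aspect_ratio * width_m
    ohm_per_m = resistivity / (width_m * thickness_m)
    return ohm_per_m * 1e-6

INTERMEDIATE_PITCH_NM = 64.0  # hypothetical value

# Pitch multipliers relative to the intermediate tier, as given in the text.
TIERS = {"intermediate": 1, "semi-global": 2, "global": 4}

tier_resistance = {
    name: resistance_per_um(INTERMEDIATE_PITCH_NM * multiplier)
    for name, multiplier in TIERS.items()
}

for name, r in tier_resistance.items():
    ratio = tier_resistance["intermediate"] / r
    print(f"{name:>12}: {r:7.3f} ohm/um  ({ratio:4.1f}x lower than intermediate)")
```

Under these assumptions, doubling both pitch and thickness quarters the resistance per unit length, so a 4x pitch gives a 16x reduction before any size-dependent resistivity effects are counted.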


Interconnect stack and need for layer-aware optimisation
As depicted in figure 4, interconnect resistance in advanced process nodes (20nm and beyond) can differ by two orders of magnitude between the lower and upper metal layers. The capacitance variation is relatively minor; note that capacitance increases for the upper layers, whereas resistance decreases. Delay calculation and optimisation based on the old "layer-agnostic" assumptions therefore break down. As a result, synthesis tools need to evolve to group the layer stack into multiple bins according to the magnitude of the difference in resistance, and to use this information to drive layer-aware optimisation during synthesis.


Figure 3: These typical cross sections of hierarchical scaling show an MPU device (left) and an ASIC device (right).


Figure 4: This capacitance and resistance per layer data was generated by the Encounter RTL Compiler for a typical advanced-node library.
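The idea of binning layers by resistance can be sketched in a few lines. The per-layer values below are hypothetical and are not read from figure 4; they only mimic the trend described in the text (resistance falling by about two orders of magnitude, capacitance rising slightly). Binning by decade of resistance is an assumed heuristic for illustration and is not a description of how Encounter RTL Compiler or any other synthesis tool groups layers.

```python
# Sketch: group metal layers into resistance bins and compare wire delay.
# Layer data is hypothetical; decade binning is an assumed heuristic.
import math
from collections import defaultdict

# layer name -> (resistance in ohm/um, capacitance in fF/um)
LAYERS = {
    "M1": (25.0, 0.18), "M2": (22.0, 0.18), "M3": (12.0, 0.19),
    "M4": (6.0, 0.19),  "M5": (2.0, 0.20),  "M6": (1.0, 0.20),
    "M7": (0.5, 0.21),  "M8": (0.2, 0.22),  "M9": (0.15, 0.22),
}

def wire_delay_ps(layer, length_um):
    """Distributed-RC delay estimate (0.38 * R * C) in picoseconds."""
    r_per_um, c_ff_per_um = LAYERS[layer]
    r_total = r_per_um * length_um
    c_total = c_ff_per_um * 1e-15 * length_um
    return 0.38 * r_total * c_total * 1e12

# Bin index = how many decades a layer's resistance sits below the worst layer.
r_max = max(r for r, _ in LAYERS.values())
bins = defaultdict(list)
for name, (r, _) in LAYERS.items():
    bins[math.floor(math.log10(r_max / r))].append(name)

LENGTH_UM = 200.0  # hypothetical route length
for index in sorted(bins):
    delays = [wire_delay_ps(name, LENGTH_UM) for name in bins[index]]
    print(f"bin {index}: {', '.join(bins[index]):<16} "
          f"delay {min(delays):6.2f} to {max(delays):6.2f} ps")
```

With these made-up values the same 200 µm route is more than 100 times slower on M1 than on M9, which is why a single layer-agnostic RC estimate cannot represent both ends of the stack.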

