L04 – Verilog Events, Timing, and Testbenches
Ryan Robucci1. References
- (Schaumont 2010) Schaumont, Patrick
A Practical Introduction to Hardware/Software Codesign, Springer ISBN 978-1-4614-3737-6 https://www.springer.com/gp/book/9781441960009
2. Single Assignment Code
Consider the following code:
a=b+1; a=a*3;
This is the same as
a = (b+1)*3;
single assignment to a
can be implemented in hardware with
or perhaps
3. Conversion to Single Assignment Code
- ensure the in a pass of code a given variable is only assigned once
- conversion involves
- introduction of new variAbles and
- renaming to ensure single assignments in a section of code.
a=b+1; a=a*3;
a1=b+1; a2=a1*3;
- simplifies identification of the producing expression
- easier to understand the underlying data-flow
- data-dependency graphs can be generated more easily from single-assignment code
- unique identifiers facilitate association of hardware nets to variables in code
4. MERGE
MERGE is a conceptual tool that can be introduced facilitate code analysis. We'll first motivate MERGE with an example.
Code with Loop/Branch
a=b; for(i=1;i<6;i++){ a=a+i; } //(Schaumont 2010)
Conversion?
a1=b; for(i=1;i<6;i++){ a2= a? + i; //what to use? // for first iteration need a1, // thereafter need a2 } //(Schaumont 2010)
Solution is to use introduce the concept of a MERGE**. A MERGE maps to a mux in hardware, typically creating a loop or feedback.
Still must ensure that no combinatorial loops are formed, adding registers and multiple clock cycles as needed (below a2 may be implemented a register to ensure this).
a1=b; for(i=1;i<6;i++){ a3=MERGE(i==1?a1:a2); //now a1 and a2 will be registers a2=a3+1; } // (Schaumont 2010)
!!! WARNING MERGE IS ONLY A CONCEPT * MERGE is a */concept/, NOT an ACTUAL VERILOG LANGUAGE CONSTRUCT. Do not attempt to use merge in your code.
5. Unrolling and Simplification
Unrolling can be used to create single assignment code:
a1=b; a2=a1+1; a3=a2+2; a4=a3+3; a5=a4+4; a6=a5+5;
hardware if implemented in one clock cycle with 5 addersAlgorithmic simplification , in this case arithmetic simplification can trim unnecessary complexity
a = b+15;
implies a single addition
6. Single Assignment Code Example
This example code describes an iterative algorithm that will be implemented with registers
Original Code
int gcd(int a , int b) { while ( a != b) { if ( a > b ) a = a-b; else b = b-a; } return a; } // Shaumont 2010
Single Assignment Code:
int gcd(int a1 , int b1) { while (MERGE(_?a1:a2)!= MERGE(_?b1:b2){ a3 = MERGE(_?a1:a2); b3 = MERGE(_?b1:b2); if (a3 > b3) a2 = a3-b3; else b2 = b3-a3; } return a2; } // Schaumont 2010
Hardware Implementation:
- Single Assignment Code allows examination of data dependencies and hardware resources such as what can be done in a single clock cycle (combinatorial) and where a register is required.
- These concepts are also important when writing behavioral HDL code in Verilog or VHDL.
7. Synthesizeable Combinatorial Code with attention to Loops
- Today we'll discuss a few constructs which involve a concern of Control Loops and Data (dependency) Loops in procedural HDL
- We'll first concern ourselves with combinatorial behavior and then allow additional flexibility with sequential
- Should not attempt to synthesize any code that implements unresolvable dependency loops as combinatorial HW (executes in one clock cycle)
8. Single Assignment Code
Think of each statement as a node on a graph with the edges denoting dependencies. Nodes can be producers and consumers of values. A graph with loops cannot be directly resolved as a combinatorial circuit. The inputs not generated from within the code are also nodes (they represent an assignment from elsewhere).
always_comb begin //always @ (a,b,c) y1 = a+b; y2 = y1+c; y3 = y1+y2*y2; end
- Note how in this code segment alone the output is not well-defined. It is not clear of y1,y2 are used elsewhere or if they are just working variables for partial results.
Branches can be thought of as multiplexers that depend on the evaluation of conditional expression. A new flag variable resulting from the condition evaluation may be introduced to make this clear.
always_comb begin //always @ (a,b) y1 = a+b; if (y1 < 0) y2 = 0; else y2 = y1; end
explicit flagalways_comb begin //always @ (a,b) y1 = a+b; flag = (y1 < 0); y2 = flag?0:y1; end
- To achieve the status of single assignment code, every variable may only be assigned once
- We may need to convert code to an equivalent single-assignment code to understand its underlying structure. To do this introduce additional variables when variables are assignment more than once
Original Code 1:
y = a+b; s = y+c; y = y+s*s;
Single Assignment Code 1:
y1 = a+b; s = y1+c; y2 = y1+s*s;
Original Code 2:
y = a+b; if (y < 0) y = 0;
Single Assignment Code 2:
y1 = a+b; flag = (y1 < 0); y2 = flag?0:y1;
9. Externally formed feedback loops
- Consumption preceding production: If a procedural block consumes its own outputs before generating them, and external loop may be inferred
module top (u,v,a,b); output logic u; //u has scope outside the procedural block output logic v; //v has scope outside the procedural block input logic a; input logic b; always_comb begin // @ (*) begin u = v & a; //consumes v and a, produces u v= u & b; //consumes u and b, produces v end endmodule
Some synthesis tools might produce an error and halt synthesis, but the SystemVerilog standard only states that
Software tools should perform additional checks to warn if the behavior within an
always_comb
procedure does not represent combinational logic, such as if latched behavior can be inferred.
An advanced code linter can find the issue:
Vivado 2023.2.2 exhibited no difference in behavior for
always_comb
vs.always @(*)
in this code- Inter-block combinational feedback: Combinational loops can be formed among multiple blocks, and this is to be avoided as well.
always @(*) begin: process_u u = y & a; end always @(*) begin: process_y y= u & b; end
10. Verilog Synthesis: Feedback (data dependency loops)
It is important to be able to identify data dependency loops. Unresolvable loops cannot be implemented with combinational hardware.
Example 1
always @ (a,b) begin y = 1; y = y&a; y = y&b; end
- No feedback after substitutions
Example 2
always @ (a,y) begin y = ~(y&a); end
- Unresolveable Feedback
Example 3
always @ (a,b,yA,yB) begin yA = ~(yB&a); yB = ~(yA&b); end
Feedback
Example 4
always @ (posedge clk) begin y = ~(y&a); end
This clearly does not attempt to describe combinatorial hardware it uses an edge-selective trigger to describe sequential hardware. A register
y
is inserted by a synthesizer, but alongside other blocks, the danger is non-deterministic pre-synthesis simulation. This code should use non-blocking assignment.
11. A test for data dependency loops
- A conceptual test to understand if procedural code is implementing strictly combinatorial logic is to set every combinatorial result assigned to x at the entry point in the code, representing an erasure of any previous result lingering in the simulation
- If behavior would change in any case then code is not implementing strictly combinatorial logic
always @ (a,y) begin y = ~(y&a); end
always @ (a,y) begin y=1’bx; //** y = ~(y&a); end
- ❌ when
a
:y='x
always @(a,b) begin y = 1; y = y&a; y = y&b; end
always @(a,b) begin y=1’bx; //** y = 1; y = y&a; y = y&b; end
- ✔️combinatorial
always @(a,b,yA,yB) begin yA = ~(yB&a); yB = ~(yA&b); end
always @(a,b,yA,yB) begin yA=1’bx; yB=1’bx; //** yA = ~(yB&a); yB = ~(yA&b); end
- ❌ when
a==1
:yA='x
- ❌ when
a==1,b==1
:yA='x
yB='x
always @(a,b) begin if (a) p=1; y = ~(p&b); end
always @(a,b) begin p=1’bx; y=1’bx; //** if (a) p=1; y = ~(p&b); end
- ❌ when
a==1,b==1
:y='x
always@(a,b,c,s,y) begin if (s) begin p = (a&b); y = ~(yA|c); end else begin y = a|b; end end
always@(a,b,c,s,y) begin p=1’bx; y=1’bx; //** if (s) begin p = (a&b); y = ~(yA|c); end else begin y = a|b; end end
- ❌ when
s
:p='x
Remember, in pre-synthesis simulation you will see 1'bx
as a result,
but through synthesis the possibility of 1'bx
is instead interpreted
as a "don't care". If your code already achieved complete assignment
without the initialization to 1b'x
, then the initialization should
have no affect.
‼️ Note 'x in Simulation vs. Synthesis 'x in a resulting in simulation output means that in simulation the value was undetermined. When a synthesizer analyzes code and determines that the code assigns 'x, it interprets it as a don't care and may optimizes according to implementation cost.
12. A test for implied loops in mixed logic blocks
- The combinational erasure test works for internal combinatorial values computed in edge-selective-trigger blocks.
- Setting combinatorial values to
'x
at code entry may help identify issues in pre-synthesis simulation - In the following code, does that combinatorial erasure have an effect?
always @ (posedge clk) begin y <= _y; _y = ~(a&b); end
always @ (posedge clk) begin _y = 1’bx; y <= _y; _y = ~(a&b); end
❌ fails test
always @ (posedge clk) begin _y = ~(a&b); y <= _y; end
always @ (posedge clk) begin _y = 1’bx; _y = ~(a&b); y <= _y; end
✔️ passes test
13. Unwanted Synthesized Cycle Iteration Loop (unintended cycle delay)
module littleloop(input clk,input a,input b,output reg y); always @ (posedge clk) begin: label reg _y; y <= _y; _y = ~(a&b); end endmodule
Figure 1: little feedback loop
module littlecorrection(input clk,input a,input b,output reg y); always @ (posedge clk) begin: label reg _y; _y = ~(a&b); y <= _y; end endmodule
Figure 2: no feedback loop
Module with Enable:
- The following modules produce the same result, with feedback from the register. Some give an explicit name to the net attached to the register's data input as _y.
✔️ explicit full specification of _y
in each pass
module enableV1(input clk,input en,input d,output reg y); always @ (posedge clk) begin: label reg _y; if (en) begin _y=d; end else begin _y=y; end y <= _y; end endmodule
Figure 3: explicit _y
❌ wrong style, even if result is the same memory inferred from _y
rather than y
is not proper
module enableV2(input clk,input en,input d,output reg y); always @ (posedge clk) begin: label reg _y; if (en) begin _y=d;\\ end y <= _y; end endmodule
✔️ implied memory default for y
is intended
module enableV3(input clk,input en,input d,output reg y); always @ (posedge clk) begin: label if (en) begin y<=d; end end endmodule
✔️ in any given pass _y
is only consumed after a production
module enableV3(input clk,input en,input d,output reg y); always @ (posedge clk) begin: label logic _y; if (en) begin _y = d; y<=_y; end end endmodule
14. Combinatorial Synthesis: Loops
We will classify traditionally coded loops in procedural code by how they are expected to be interpreted by synthesizers.
- Static Loops: Number if iterations defined at compile time.
- Could directly perform finite loop unrolling
- Often synthesizers cannot convert non-static loops to combinatorial circuit.
- Dynamic-loops are those for which determination of the number of loop iterations required is determined at run-time, usually because the number of loops is determined by a run-time variable.
- In the example below, the condition that is checked before every iteration is dependent on assignments within the body of the loop.
Furthermore, the multiple data movements are problematic
While-like Loop:
temp=datain; count =0; for (index=0;|temp;index=index+1) begin if temp[0]==1 (count=count +1) begin temp = temp>>1; end end
The previous code can be rewritten to have well-defined static loop count and no implied data movement. The constant
8
sets the statically defined loop count.count =0; for(index=0;index<8;index=index+1) begin if temp[index]==1 (count=count +1); end
!!! Tip What can and cannot be synthesized depends on the synthesizer The logic function is not fundamentally excluded from mapping to cominational hardware, as seen in the next example. So, while this code might not be synthesizable by every synthesizer today, others might be able to analyze the code and interpret the combinational hardware function successfully. Also what a given synthesizer can't do today, it may be able to do tomorrow. Your synthesizer should provide documentation on what constructs are allowed and it is worth checking yearly.
15. Synthesis: Feedback (data dependency loops)
Registered output logic (mix comb and seq.) should be separated to understand the dependencies. New variables may be introduced to denote the difference in signals before and after a register.
module counter(input clk, output reg [3:0] count); parameter CNT_MAX=11; always @ (posedge clk) begin if (count == CNT_MAX) count <= 0; else count <= count + 1; end endmodule
Feedback is perhaps unclear here. See rewrite below.
module counter(input clk, output reg [3:0] count); parameter CNT_MAX=11; reg [3:0] count_prereg; always @ (posedge clk) begin count <= count_prereg; end always @ * begin: comb1 reg flag; flag = (count == CNT_MAX); count_prereg = flag?0:(count+1); //could also have written if else using flag condition end endmodule
Figure 4: counter
{ signal : [ { name: "clk", wave: "p...." , node: '..af.' }, { name: "count", wave: "23456", data: "9 10 11 0 1" , node: '..bg.' , phase:-0.25}, { name: "flag", wave: "0.10.", node: '..c..' , phase:-0.5}, { name: "count+1", wave: "34567", data: "10 11 12 1 2", node: '..d..' , phase:-0.75}, { name: "count_prereg", wave: "34567", data: "10 11 0 1 2" , node: '..e..' , phase:-1} ], edge: ['a-~>b','b-~>c','b-~>d', 'c-~>e', 'd-~>e', 'f-~>g'], config: { hscale: 2 }, }
- Feedback across clock cycles is acceptable.
- No feedback in combinational partition.
16. Synthesis: keyword disable
- The keyword
disable
may be used to implement a "break" from a loop. Consider this not yet covered and avoid for now if without further study.
17. Sequential Synthesis: Multi-Cycle Loops
Note that you may be able describe a sequential circuit with non-static loops, but it is common to be NOT SUPPORTED by synthesizers. The following suggests a multi-cycle (multiple clock periods) operation.
"while" loop with iteration synced to a clock:
count=0; for(index=0;|temp;index=index+1) begin @(posedge clk); if temp[0]==1 (count=count +1) temp>>1; end
18. Synthesis: Multi-cycle Operations
- This is a forward note, we'll learn about this more later in the course.
It is typical to employ multi-cycle operations to reduce hardware through resource sharing (reuse of hardware in difference clock cycles) and reduce the critical path lengths.
Implicit state machines to describe multi-cycle operations are not supported for synthesis: In the land of simulation one can model such behavior as follows, though it is usually not supported for synthesis
always @ (posedge clk) begin temp=a*b; //cycle 1: mutiply inputs a,b @(posege clk) y=temp*c; //cycle 2: mutiply by *c end
- We'll want to understand how to implement multi-cycle operations using state machines to create variations based on the number of resources (e.g. multipliers, registers) affordable or desired, as factors like clock-cycle latency and throughput, power, clock freq., etc…
!!! Warning Multi-cycle computations Multi-cycle computations and rarely supported constructs above are shown here for completeness. Later lectures will formally introduce methods for such descriptions. Do not attempt to use these in synthesizable code without further study.
19. Key Points pertaining to Loops
- Combinational hardware with data dependency loops cannot be synthesized
- Static for-loops can be synthesized by being unrolled
- Dynamic for-loops might not be understood by the synthesizer
- Dynamic for-loops with timing control might be synthesized as a "multicycle operation" or a state machine, but many synthesizers do not support this construct
- We'll want to formalize multi-cycle operations as state machines a little later in this course.
20. Generate Constructs (structural for loops, and conditional structuals)
- Outside procedural blocks, keywords
for
andif
are supported
:CUSTOMID: RANDOM-ID-catjy
structural "for loop": for-in-generate
- Uses a special indexing variable defined using genvar. Within a generate block indicated using keyword generate
Use for repetitive structural instantiations
genvar index; generate for (index=0; index < 8; index=index+1) begin: gen_code_label BUFR BUFR_inst ( .O(clk_o[index]), // Clock buffer output .CE(ce), // Clock enable input .CLR(clear), // Clock buffer reset input .I(clk_i[index]) // Clock buffer input ); end endgenerate
for-in-generate example: Adder
An adder is a good example since it involves a repetitive structure with interconnection between instantiations (/use for param) / Study the details of the carefully crafted index in the carry out of each instance
assign cin[0]=1'b0; genvar index; generate for (index=0; index < 8; index=index+1) begin: some_hierarchical_label adder adder_inst ( .cin(cin[index]), .a(a[index]), .b(b[index]), .cout(cin[index+1]), .y(u[index]) ); end endgenerate
- ⭐ Note Updated Syntax in SystemVerilog: In SystemVerilog the
generate
andendgenerate
keywords are optional. Under many use cases, those keywords could be obviously inferred by the interpreter, such as when the for loop utilizes agenvar
variable, and thus the additional keywords are superfluous. - ✏️ Note Hierarchical names of generate blocks
- explicit labels: (if statements, case-items, and for-loops), can be explicitly labeled
- use label on
begin...end
block , such assome_hierarchical_label
in the previous example. - Variables and instances defined within can be accessed with the hierarchy
<parent>.<some_hierarchical_label>.<children>
or in the case of for loops:<parent>.some_some_hierarchical_label[<generation_index>].<children>
- use label on
inferred hierarchical names from conditions : In SystemVerilog, names for conditional statements are implied for unnamed if blocks if involving simple named parameter checking
module top (...); parameter test_bench_only = 0; genvar i; if (test_bench_only) //inferred heirarchy top.test_bench_only logic a; // top.test_bench_only.a ... else logic b; // top.test_bench_only.b
- explicit labels: (if statements, case-items, and for-loops), can be explicitly labeled
21. Concluding Points
- Combinatorial dependency loops cannot be synthesized
- Static for loops can be synthesized by being unrolled
- Dynamic for Loops may not be synthesizable
- Dynamic for Loops with timing control may be synthesized as a "multicycle operation" or a state machine.
- Repetitive/Patterned Structural Instantiations may be done with for…generate loops
- We'll want to formalize multi-cycle operations as state machines.
22. Interesting Synthesis Example of Control Signals and Data
A conditional addition such as
if temp[0]==1'b1 (count=count +1);
are often synthesized with an adder, as in
count=count+temp[0];
simplifying the circuity.
23. Example Exercise and Discussion on Static/Dynamic Loop Determination
Does the for loop below represent a dynamic or static loop?
module myModule(a,... input [4:0] a; reg [29:0] thermocode; integer i; always ... begin thermocode=0; for(i=0; i<a;i++) begin thermocode[i]=1'b1; end end .. endmodule
Hint1: Start by trying to unroll the loop iterations
thermocode[0]=1'b1 thermocode[1]=1'b1 thermocode[2]=1'b1 thermocode[3]=1'b1 ..where to stop?
Where to stop is not obvious until a
is known, which only happens at
"run-time"
How to make the loop static?
Proposal 1: set a
to a fixed value? * this would make the loop static
but since a is an input from another module we want to keep the input
Next Hint: Can you predetermine the possible range of i
, which is
based on the range of a
?
Responses:
- 230
- 0 to 31
- [0,a)
- 25
Lets try to unroll the loop starting with the first iteration i=0.
i=0 involves setting thermocode[0]
When is thermocode[0] set to 1?
Ans: when a>0, resulting in thermocode[0]=a>0
thermocode[0]=a>0 thermocode[1]=a>1 thermocode[2]=a>2 thermocode[3]=a>3 thermocode[4]=a>4 thermocode[5]=a>5 thermocode[6]=a>6 thermocode[7]=a>7 thermocode[8]=a>8 thermocode[9]=a>9 thermocode[10]=a>10 ... thermocode[20]=a>20 thermocode[21]=a>21 thermocode[22]=a>22 thermocode[23]=a>23 thermocode[24]=a>24 thermocode[25]=a>25 thermocode[26]=a>26 thermocode[27]=a>27 thermocode[28]=a>28 thermocode[29]=a>29 //thermocode[30]=a>30 xx thermocode is only 30-bit //thermocode[31]=a>31 xx thermocode is only 30-bit
We can realize that though a
can go up to 29, the thermocode only
includes 30 bits
Now, lets shorthand this long set of repetitive code as a static loop:
for(i=0; i<30;i++) begin thermocode[i]=a>i; end
- So, should the original code be considered "synthesizable" or not?
- Answer:
- depends on the synthesizer. The context in which the loop exists (here the context involves a bounded
a
) may be of importance, but may or may not be taken into account by the synthesizer. - Increasingly smart synthesizers can synthesize dynamic loops, but the designer should understand the equivalent static loop and the relationship to hardware.
- depends on the synthesizer. The context in which the loop exists (here the context involves a bounded