Lecture 18 – Iterative Arithmetic Implementations

Ryan Robucci

References

Sequential Multiplication using Shift-AND-Add

Sequential Multiplication using iterative left shift of input operand (Unsigned)

Sequential Multiplication using iterative right shift of partial result (Unsigned)
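
The two unsigned shift-AND-add variants named above can be sketched in software as follows. This is a minimal illustration, not the slides' hardware design: the function names, the `m`-bit operand width, and the use of a 2m-bit accumulator are assumptions made for the example.

```python
def multiply_shift_left(a: int, b: int, m: int = 8) -> int:
    """Unsigned multiply: left-shift operand a each step, add it when the b bit is 1."""
    product = 0
    for _ in range(m):
        if b & 1:            # AND the current multiplier bit with a
            product += a
        a <<= 1              # the weight of a doubles for the next bit
        b >>= 1              # expose the next multiplier bit
    return product


def multiply_shift_right(a: int, b: int, m: int = 8) -> int:
    """Unsigned multiply: add a into the upper half, then right-shift the partial result."""
    acc = 0                  # models a 2m-bit partial-product register
    for _ in range(m):
        if b & 1:
            acc += a << m    # a is always added at the same (upper) position
        acc >>= 1            # the partial result moves right instead of a moving left
        b >>= 1
    return acc


print(multiply_shift_left(13, 11), multiply_shift_right(13, 11))  # 143 143
```

In the right-shift form the adder input stays at a fixed position, which is one reason that form is often preferred in hardware.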

Simple Iterative ("Long") Division

Depictions of implementations

[Figures: block diagrams of the iterative datapaths. Panels show a variable-shift version and fixed shift-left / shift-right versions, with inputs a and b, an initialize multiplexer, the sign bit, stage shift, load enable, and result registers R and R0.]

Iterative Division of Integers (Restoring/"UNDO")

Q=N/D; assuming n=0, N>D

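  1. Initialize $r=N$
  2. for i = m-1 downto 0:
    if $r \ge D \times 2^{i}$: $q_i=1$ and $r=r-D \times 2^{i}$ (subtract); otherwise $q_i=0$ and $r$ is left unchanged (restore)
  3. Construct Q: $Q=\sum_{i=0}^{m-1}q_i2^i$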

Some liberty has been taken in presenting the algorithms in these slides. Refer to the referenced text, or to the many research papers available, to work out the final details of a hardware implementation.

Example

m:8
Q=N/D=233/5
Initial Remainder:233
       7:R:     233, D*2^i:     640  Decision:restore
       6:R:     233, D*2^i:     320  Decision:restore
       5:R:     233, D*2^i:     160  Decision:subtract
       4:R:      73, D*2^i:      80  Decision:restore
       3:R:      73, D*2^i:      40  Decision:subtract
       2:R:      33, D*2^i:      20  Decision:subtract
       1:R:      13, D*2^i:      10  Decision:subtract
       0:R:       3, D*2^i:       5  Decision:restore
Bit Indexes used for subtraction: [5, 3, 2, 1]
Q:46
Q*D:230
R:3
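
The restoring trace above can be reproduced with a short sketch. The function name and the assumption that $0 \le N < D \times 2^m$ are illustrative choices, not part of the slides.

```python
def restoring_divide(n: int, d: int, m: int = 8):
    """Return (quotient, remainder) for unsigned n / d using m quotient bits.

    Assumes 0 <= n < d * 2**m so that m bits are enough for the quotient.
    """
    r = n
    q = 0
    for i in range(m - 1, -1, -1):
        trial = r - (d << i)      # tentatively subtract D * 2^i
        if trial >= 0:
            r = trial             # keep the subtraction, q_i = 1
            q |= 1 << i
        # otherwise "restore": r is left unchanged and q_i = 0
    return q, r


print(restoring_divide(233, 5))   # (46, 3)
```

In hardware the "restore" is usually realized by simply not loading the remainder register when the trial difference is negative.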

Non-Restoring Division (no UNDO required)

No Restoration (No Restoration Required)

Ex: Solve Q=10.125/3, but this time every weight must be used, either subtracted or added.

Algorithm for Iterative Division of Integers (Non-Restoring)

  1. Initialize $r=N$
  2. for i = m-1 downto 0:
    $q_i = \begin{cases} 1 , & r \ge 0 \text{ (produces subtraction)} \\ \textcolor{red}{-1} , & r < 0 \text{ (produces addition)} \end{cases}$
    $r=r - q_i \cdot (D \times 2^{i})$
  3. Final step: $\textcolor{red}{\text{if } r<0 \text{ then } q_0=0,\ r=r+D}$
  4. Construct Q: $Q=\sum_{i=0}^{m-1}q_i2^i$

Example

m:8
Q=N/D=233/5
Initial Remainder:233
7:R:     233, D*2^i:     640 Decision:sub
6:R:    -407, D*2^i:     320 Decision:add
5:R:     -87, D*2^i:     160 Decision:add
4:R:      73, D*2^i:      80 Decision:sub
3:R:      -7, D*2^i:      40 Decision:add
2:R:      33, D*2^i:      20 Decision:sub
1:R:      13, D*2^i:      10 Decision:sub
0:R:       3, D*2^i:       5 Decision:sub
Bit Indexes used for subtraction: [7, 4, 2, 1, 0]
Bit Indexes used for addition: [6, 5, 3]
Q:47
Q*D:235
R:-2
Remainder Correction:
Q:46
Q*D:230
R:3
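
A minimal sketch of the non-restoring steps, assuming $0 \le N < D \times 2^m$; the function name is hypothetical. The final correction is written as `q -= 1`, which has the same effect as replacing $q_0=1$ by $q_0=0$.

```python
def nonrestoring_divide(n: int, d: int, m: int = 8):
    """Return (quotient, remainder) using quotient digits q_i in {+1, -1}."""
    r = n
    q = 0
    for i in range(m - 1, -1, -1):
        q_i = 1 if r >= 0 else -1     # subtract when r >= 0, add when r < 0
        r -= q_i * (d << i)           # r = r - q_i * (D * 2^i)
        q += q_i * (1 << i)           # accumulate Q = sum of q_i * 2^i
    if r < 0:                         # final remainder correction
        q -= 1
        r += d
    return q, r


print(nonrestoring_divide(233, 5))    # (46, 3)
```

Before correction every weight is used, so the uncorrected quotient is always odd (47 in the example); the last step repairs the cases where the true quotient is even.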

Alternative: Scaling the remainder rather than using progressively smaller weights

Iterative Division of Integers (Restoring, Scaling Remainder)

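  1. Initialize $r=N$
  2. for i = m-1 downto 0:
    if $r \ge D \times 2^{m-1}$: $q_i=1$ and $r=r-D \times 2^{m-1}$ (subtract); otherwise $q_i=0$ (restore)
    $r=2r$ (scales the remainder instead of the weight)
  3. $\textcolor{red}{r=r/2^{m}}$ (corrects scale)
  4. Construct Q: $Q=\sum_{i=0}^{m-1}q_i2^i$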

Example:

7:R:     233, D*2^(m-1):     640 Decision:restore
6:R:     466, D*2^(m-1):     640 Decision:restore
5:R:     932, D*2^(m-1):     640 Decision:subtract
4:R:     584, D*2^(m-1):     640 Decision:restore
3:R:    1168, D*2^(m-1):     640 Decision:subtract
2:R:    1056, D*2^(m-1):     640 Decision:subtract
1:R:     832, D*2^(m-1):     640 Decision:subtract
0:R:     384, D*2^(m-1):     640 Decision:restore
R:768
Scaled Remainder:3
Bit Indexes used for subtraction: [5, 3, 2, 1]
Q:46
Q*D:230
R:3
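
The scaled-remainder trace above can be sketched as follows, assuming $0 \le N < D \times 2^m$. The name `d_fixed` for the constant $D \times 2^{m-1}$ is an illustrative choice.

```python
def restoring_divide_scaled(n: int, d: int, m: int = 8):
    """Restoring division where the remainder is doubled and the divisor weight stays fixed."""
    r = n
    q = 0
    d_fixed = d << (m - 1)        # constant D * 2^(m-1), never shifted again
    for i in range(m - 1, -1, -1):
        if r >= d_fixed:          # subtract, q_i = 1
            r -= d_fixed
            q |= 1 << i
        r <<= 1                   # scale the remainder for the next comparison
    return q, r >> m              # r / 2^m corrects the scale (exact: low m bits are zero)


print(restoring_divide_scaled(233, 5))   # (46, 3)
```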

Iterative Division of Integers (Non-Restoring,Scaling Remainder)

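  1. Initialize $r=N$
  2. for i = m-1 downto 0:
    $q_i = \begin{cases} 1 , & r \ge 0 \text{ (produces subtraction)} \\ -1 , & r < 0 \text{ (produces addition)} \end{cases}$
    $r=2\,(r - q_i \cdot (D \times 2^{m-1}))$
  3. $\textcolor{red}{r=r/2^{m}}$ (corrects scale)
  4. Final step: if $r<0$ then $q_0=0$, $r=r+D$
  5. Construct Q: $Q=\sum_{i=0}^{m-1}q_i2^i$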

Example:

7:R:     233, D*2**(m-1):640  Decision: subtract
6:R:    -814, D*2**(m-1):640  Decision: add
5:R:    -348, D*2**(m-1):640  Decision: add
4:R:     584, D*2**(m-1):640  Decision: subtract
3:R:    -112, D*2**(m-1):640  Decision: add
2:R:    1056, D*2**(m-1):640  Decision: subtract
1:R:     832, D*2**(m-1):640  Decision: subtract
0:R:     384, D*2**(m-1):640  Decision: subtract
Bit Indexes used for subtraction: [7, 4, 2, 1, 0]
Bit Indexes used for addition: [6, 5, 3]
Q:47
Q*D:235
R:-2
Remainder Correction:
Q:46
Q*D:230
R:3
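
A sketch of the non-restoring, scaled-remainder variant traced above, under the same assumption $0 \le N < D \times 2^m$. The function and variable names are hypothetical.

```python
def nonrestoring_divide_scaled(n: int, d: int, m: int = 8):
    """Non-restoring division: add or subtract a fixed D * 2^(m-1), then double r."""
    r, q = n, 0
    d_fixed = d << (m - 1)
    for i in range(m - 1, -1, -1):
        q_i = 1 if r >= 0 else -1
        r = 2 * (r - q_i * d_fixed)   # operate first, then scale the remainder
        q += q_i * (1 << i)
    r //= 1 << m                      # corrects scale; exact because r is a multiple of 2^m
    if r < 0:                         # final remainder correction
        q -= 1
        r += d
    return q, r


print(nonrestoring_divide_scaled(233, 5))   # (46, 3)
```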

Final Change: Iterative Division of Integers (Non-Restoring, Scaling Remainder Before Subtraction)

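  1. Initialize $r=N$
  2. for i = m-1 downto 0:
    $q_i = \begin{cases} 1 , & r \ge 0 \text{ (produces subtraction)} \\ -1 , & r < 0 \text{ (produces addition)} \end{cases}$
    $r=2r - q_i \cdot (D \times 2^{m})$
  3. $r=r/2^{m}$ (corrects scale)
  4. Final step: if $r<0$ then $q_0=0$, $r=r+D$
  5. Construct Q: $Q=\sum_{i=0}^{m-1}q_i2^i$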

Example:

7:N:     233, D*2^m:1280  Decision:sub
6:N:    -814, D*2^m:1280  Decision:add
5:N:    -348, D*2^m:1280  Decision:add
4:N:     584, D*2^m:1280  Decision:sub
3:N:    -112, D*2^m:1280  Decision:add
2:N:    1056, D*2^m:1280  Decision:sub
1:N:     832, D*2^m:1280  Decision:sub
0:N:     384, D*2^m:1280  Decision:sub
Bit Indexes used for subtraction: [7, 4, 2, 1, 0]
Bit Indexes used for addition: [6, 5, 3]
Q:47
Q*D:235
R:-2
Remainder Correction:
Q:46
Q*D:230
R:3
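
The difference from the previous variant is only the order of operations: the remainder is doubled first and the fixed term becomes $D \times 2^m$. A sketch that also records the remainder seen at each step, so it can be compared with the listing above (names are illustrative; assumes $0 \le N < D \times 2^m$):

```python
def nonrestoring_divide_prescaled(n: int, d: int, m: int = 8):
    """Non-restoring division that scales the remainder before adding/subtracting D * 2^m."""
    r, q = n, 0
    d_fixed = d << m                  # constant D * 2^m
    trace = []                        # remainder value at the start of each step
    for i in range(m - 1, -1, -1):
        trace.append(r)
        q_i = 1 if r >= 0 else -1
        r = 2 * r - q_i * d_fixed     # shift first, then add or subtract
        q += q_i * (1 << i)
    r //= 1 << m                      # corrects scale; exact because r is a multiple of 2^m
    if r < 0:                         # final remainder correction
        q -= 1
        r += d
    return q, r, trace


q, r, trace = nonrestoring_divide_prescaled(233, 5)
print(q, r)      # 46 3
print(trace)     # [233, -814, -348, 584, -112, 1056, 832, 384]
```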

Extension of Non-Restoring Division to Signed Operands

Division by Convergence

Solution:

Error:

Algorithm

Each step is two multiplications and a two's complement operation

Convergence of D to 1

(Koren Page 214)
$$\begin{array}{rl} D_\infty & = D \cdot R_1 \cdot R_2 \cdot R_3 \cdots \\ & = (1-y_0) \times \left[(1+y_0)(1+y_0^2)(1+y_0^4)\cdots\right] \\ & = \textcolor{red}{(1+y_0)} \times \left[\textcolor{red}{(1-y_0)}(1+y_0^2)(1+y_0^4)\cdots\right] \end{array}$$
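
A small numerical sketch of division by convergence, written with exact rationals so the factors can be checked against the product above. It assumes the divisor has been normalized so that $D = 1-y_0$ with $|y_0|<1$; the function name and the iteration count are illustrative.

```python
from fractions import Fraction


def divide_by_convergence(n, d, iterations=4):
    """Multiply N and D by the same factors R = 2 - D until D approaches 1.

    Assumes 0 < d < 2 (for example d normalized to [1/2, 1)).
    Returns the pair (scaled N, scaled D); scaled N approximates N / D.
    """
    n, d = Fraction(n), Fraction(d)
    for _ in range(iterations):
        r = 2 - d        # in hardware: a two's complement of the fraction D
        n *= r           # first multiplication
        d *= r           # second multiplication; D becomes 1 - y0^(2^k)
    return n, d


q, d_final = divide_by_convergence(Fraction(1, 2), Fraction(3, 4), iterations=3)
print(float(q), float(d_final))
```

With $D = 3/4$ (so $y_0 = 1/4$), three iterations leave $D = 1 - (1/4)^8$, and the scaled numerator is within about $10^{-5}$ of $2/3$.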

Lookup Tables

Lookup tables can be used to bypass the first few iterations. This use of lookup tables will be discussed in the context of another method.

Division by Reciprocation

Newton-Raphson Iteration

Quadratic Convergence of Division by Reciprocation (rate of reduction of error)

In each iteration, the number of correct bits doubles relative to the previous iteration (1 extra, then 2 extra, then 4 extra, ...).
To show:

Quadratic Convergence of Div by Reciprocation

We can also show the direct convergence $x_i \rightarrow \frac{1}{D}$:
Let the initial guess be
$x_0 = 1$

$$x_1 = 2-D$$

$$\begin{aligned} x_2 &= (2-D)(2-D(2-D))\\ &= (2-D)(\textcolor{green}{1+1}-D(2-D))\\ &= (2-D)(1+1-2D+D^2)\\ &= (2-D)(1+\textcolor{red}{(D-1)^2}) \end{aligned}$$

$$\begin{aligned} x_3 &= \textcolor{purple}{[(2-D)(1+(D-1)^2)]}(2-D\textcolor{purple}{[(2-D)(1+(D-1)^2)]})\\ &= [(2-D)(1+(D-1)^2)](\textcolor{green}{1+1}-D[(2-D)(1+(D-1)^2)])\\ &= [(2-D)(1+(D-1)^2)](1+ 1 - 4 D + 6 D^2 - 4 D^3 + D^4)\\ &= [(2-D)(1+(D-1)^2)](1+\textcolor{red}{(D-1)^4}) \end{aligned}$$

$$x_i = (2-D)(1+(D-1)^2)(1+(D-1)^4)\cdots(1+(D-1)^{2^{i-1}}) = (\textcolor{red}{1}-(D-\textcolor{red}{1}))(1+(D-1)^2)(1+(D-1)^4)\cdots(1+(D-1)^{2^{i-1}})$$

Let $D=1+y$, so $y=D-1$:
$$x_i=(1-y)(1+y^2)(1+y^4)\cdots(1+y^{2^{i-1}})$$
This is the series from earlier; therefore, for $|y|<1$,
$$\lim_{i\rightarrow \infty} x_i = \frac{1}{1+y}=\frac{1}{D}$$
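
The closed forms above can be checked numerically. The sketch below assumes the Newton-Raphson update $x_{i+1} = x_i(2 - D x_i)$, which is what the expansions of $x_1$, $x_2$, $x_3$ imply, and uses exact rationals; the function name is hypothetical.

```python
from fractions import Fraction


def reciprocal_newton(d, iterations):
    """Return [x_0, x_1, ..., x_iterations] for x_{i+1} = x_i * (2 - d * x_i), x_0 = 1."""
    d = Fraction(d)
    x = Fraction(1)                 # initial guess x_0 = 1
    history = [x]
    for _ in range(iterations):
        x = x * (2 - d * x)         # one multiply, one complement, one multiply
        history.append(x)
    return history


d = Fraction(5, 4)                  # y = D - 1 = 1/4
xs = reciprocal_newton(d, 4)
for i, x in enumerate(xs):
    # the relative error 1 - D*x_i is squared at every step: y^2, y^4, y^8, ...
    print(i, float(x), float(1 - d * x))
```

For $i \ge 1$ the printed error equals $y^{2^i}$ exactly, which is the "double the number of bits per iteration" behavior stated earlier.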

Lookup Table for Initial Guess

  • Ex: Div by Reciprocation
    • Given j bits to pre-calculate, provide a table of $2^j$ entries (initial approximations), each stored to the desired accuracy of n bits.
    • The optimal choice for each entry is the function (1/x) evaluated at the center of the interval (proof omitted).
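
A minimal sketch of such a table. It assumes, for illustration only, that the divisor is normalized to $1 \le D < 2$, that the leading j fraction bits of D form the index, and that entries are rounded to n fraction bits; the function names are hypothetical.

```python
def build_reciprocal_table(j: int, n: int):
    """Table of 2**j initial guesses for 1/x on [1, 2), each rounded to n fraction bits."""
    table = []
    for k in range(2 ** j):
        center = 1 + (k + 0.5) / 2 ** j                   # midpoint of the k-th interval
        table.append(round((1 / center) * 2 ** n) / 2 ** n)
    return table


def initial_guess(d: float, table, j: int) -> float:
    """Look up the initial approximation of 1/d using the leading j fraction bits of d."""
    index = int((d - 1) * 2 ** j)
    return table[index]


table = build_reciprocal_table(j=4, n=8)
x0 = initial_guess(1.3, table, j=4)
print(x0, 1 / 1.3)
```

The value returned by `initial_guess` would then replace $x_0 = 1$ as the starting point of the reciprocal iteration, skipping its first few steps.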

Square Root

Iteration Implementation

Multi-Cycle

Loop Unrolling

Pipelining

Lookup Tables for Initial Guess

  • reduce run-time cycles with precomputed iterations