CORDIC

Ryan Robucci

Department of Computer Science and Electrical Engineering
• Spacebar to advance through slides in order • Arrow keys for navigation
• ESC/O-Key to see slide overview • ? to see help
Printable Version

References

CORDIC

  • COordinate Rotation Digital Computer
  • developed by Jack Volder in 1959
  • developed as an iterative algorithm to convert between polar and cartesion coordinates using ONLY shift, add, and subtract operations
  • Has two modes of operation (circular rotation and vectoring) that can be used to supporting computuation of trig,hyperpolic, linear and logarithmic functions
    sin/cos, tan, polar-rect coordinate conversion

Circular rotation mode of operation:
Using an iterative algorithm, transform a vector x0,y0x_0,y_0 to a resulting vector xN,yNx_N,y_N using N iterations

  • inputs :
    • x0,y0: input vector v
    • θ\theta: angle
  • outputs :
    • xN,yN: output vector after N iterations

Our goal is to compute a final x,y using n iterations
xn=x0cos(θ)y0sin(θ)x_n = x_0 \cos (\theta) - y_0 \sin(\theta)
yn=y0cos(θ)+x0sin(θ)y_n = y_0 \cos (\theta) + x_0 \sin(\theta)

Simplifying:

Factor cos(θ)cos(\theta):
xn=cos(θ)(x0y0tan(θ))x_n = \cos (\theta) (x_0 - y_0 \tan(\theta))
yn=cos(θ)(y0+x0tan(θ))y_n = \cos (\theta) (y_0 + x_0 \tan(\theta))

We'll address the two muliplications of each equation to simplify them

Simplifying multiplications by tan(θ)\tan(\theta)

we are going to restrict allowable rotations θ\theta such that tan(θ)=±2i\tan(\theta) = \pm 2^{-i} so that the multiplication can be replaced by a shift and add/sub. We'll predetermine allowable rotations and store them in a table:

i tan(θ)=±2i\tan(\theta)=\pm 2^{-i} tan1(2i)=θi\tan^{-1}(2^{-i}) = \theta_i θi2π\frac{\theta_i}{2\pi} degrees
0 ±20\pm 2^{-0} ±tan1(1)=±0.7853981633974483\pm \tan^{-1}(1) = \pm 0.7853981633974483 ±0.125=1/8th\pm 0.125=1/8^{th} turn ±45.0\pm 45.0
1 ±21\pm 2^{-1} ±tan1(1/2)=±0.4636476090008061\pm \tan^{-1}(1/2) = \pm 0.4636476090008061 ±0.07379180882521663\pm 0.07379180882521663 ±26.565051177077986\pm 26.565051177077986
2 ±22\pm 2^{-2} ±tan1(1/4)=±0.24497866312686414\pm \tan^{-1}(1/4) = \pm 0.24497866312686414 ±0.03898956518868466\pm 0.03898956518868466 ±14.036243467926479\pm 14.036243467926479
3 ±23\pm 2^{-3} ±tan1(1/8)=±0.12435499454676144\pm \tan^{-1}(1/8) = \pm 0.12435499454676144 ±0.019791712080282773\pm 0.019791712080282773 ±7.125016348901799\pm 7.125016348901799
4 ±24\pm 2^{-4} ±tan1(1/16)=±0.06241880999595735\pm \tan^{-1}(1/16) = \pm 0.06241880999595735 ±0.00993426215277042\pm 0.00993426215277042 ±3.5763343749973515\pm 3.5763343749973515
5 ±25\pm 2^{-5} ±tan1(1/32)=±0.031239833430268277\pm \tan^{-1}(1/32) = \pm 0.031239833430268277 ±0.004971973911794637\pm 0.004971973911794637 ±1.7899106082460694\pm 1.7899106082460694
6 ±26\pm 2^{-6} ±tan1(1/64)=±0.015623728620476831\pm \tan^{-1}(1/64) = \pm 0.015623728620476831 ±0.0024865936394752068\pm 0.0024865936394752068 ±0.8951737102110744\pm 0.8951737102110744
7 ±27\pm 2^{-7} ±tan1(1/128)=±0.007812341060101111\pm \tan^{-1}(1/128) = \pm 0.007812341060101111 ±0.0012433726968348697\pm 0.0012433726968348697 ±0.4476141708605531\pm 0.4476141708605531
8 ±28\pm 2^{-8} ±tan1(1/256)=±0.0039062301319669718\pm \tan^{-1} (1/256) = \pm 0.0039062301319669718 ±0.0006216958343570503\pm 0.0006216958343570503 ±0.22381050036853808\pm 0.22381050036853808
9 ±29\pm 2^{-9} ±tan1(1/512)=±0.0019531225164788188\pm \tan^{-1}(1/512) = \pm 0.0019531225164788188 ±0.0003108491029616858\pm 0.0003108491029616858 ±0.1119056770662069\pm 0.1119056770662069
10 ±210\pm 2^{-10} ±tan1(1/1024)=±0.0009765621895593195\pm \tan^{-1}(1/1024) = \pm 0.0009765621895593195 ±0.0001554246997050102\pm 0.0001554246997050102 ±0.05595289189380367\pm 0.05595289189380367
...
  • To achieve other angles not in the restricted set, we can allow a combination of multiple rotations from any of the allowable angles.

  • The series of rotations can be first calculated using an iterative process with an angle accumulator R as follows

  • Initialize R to θ\theta

  • In each iteration i, we will rotate by θi\theta_i or θi-\theta_i according to R: R=RriθiR=R-r_i \theta_i; ri={1,R01,otherwiser_i=\begin{cases}1,&R\ge0\\-1,&\text{otherwise}\end{cases}

    • R \to 0, the total rotation is recorded in rir_i: i=0riθi\sum_{i=0}^\infty r_i \cdot \theta_i

Supported Input Argument Range

  • The range for this approach can be found by assuming rir_i is the same in every iteration (+1):
    • i=0θi2π0.27745\frac{\sum_{i=0}^\infty \theta_i}{2 \pi} \approx 0.27745
    • Therefore the range is just beyond ±\pm 1/4^1/_4 turn (±100\approx \pm 100^\circ)

We'll solve this later with an intialization step

Simplifying multiplications by cos(±θi)\cos (\pm\theta_i)

Iteration:

  • xi+1=cos(±θi)(xiyitan(±θi))x_{i+1} = \cos (\pm\theta_i) (x_i - y_i \tan(\pm\theta_i))
  • yi+1=cos(±θi)(yi+xitan(±θi))y_{i+1} = \cos (\pm\theta_i) (y_i + x_i \tan(\pm\theta_i))

Note:
cos(±θi)=+cos(θi)\cos (\textcolor{red}{\pm}\theta_i) = \textcolor{red}{+}\cos (\theta_i) ; tan(±θi)=±tan(θi)\tan (\textcolor{red}{\pm}\theta_i) = \textcolor{red}{\pm}\tan (\theta_i)

therefore

  • xi+1=cos(θi)(x0riyitan(θi))x_{i+1} = \cos (\theta_i) (x_0 - r_i \cdot y_i tan(\theta_i))
  • yi+1=cos(θi)(y0+rixitan(θi))y_{i+1} = \cos (\theta_i) (y_0 + r_i \cdot x_i tan(\theta_i))

We can pre-calculate each cos as (KiK_i) and use the fact that tan(θi)=2i\tan(\theta_i)=2^i

Ki=cos(tan1(2i))=11+22iK_i = \cos(\tan^{-1}\left(2^{-i}\right))=\frac{1}{\sqrt{1+2^{-2i}}}

Xi+1=Ki[XiriYi2i]=Ki[Xiri(Yi<<i)]X_{i+1} = K_i [X_i-r_i Y_i 2^{-i}]=K_i [X_i-r_i (Y_i \texttt{<<}i)]
Yi+1=Ki[YiriXi2i]=Ki[Yiri(Xi<<i)]Y_{i+1} = K_i [Y_i-r_i X_i 2^{-i}] =K_i [Y_i-r_i (X_i \texttt{<<}i)]

  • Furthermore we can save all the multiplications by KiK_i till the end and multiply just the final result by Ki\prod K_i
  • Hence, each stage is reduced to a shift and an add/sub for X and Y !!!!

Number of Required Iterations

  • If the output is n bits, then an iteration with i=n would involve a shift of n producing an add/sub 0, no update.
    Xn+1=Xnrn(Yi<<n)=Xnri0=XnX_{n+1}=X_n-r_n (Y_i \texttt{<<}n) = X_n-r_i \cdot \cancel0 = X_n
  • Perform iterations until but not including i=n

Initialization Step

Introduce initial rotation to reduce |R| to be less than 2π4\frac{2\pi}{4}

  • May note that tan is unquie only an iterval 2π4<θ<2π4\frac{2\pi}{4} \lt \theta \lt -\frac{2\pi}{4} (Q1 or Q4), but this doesn't turn out to be an issue in the meathod we'll use ...
  • We'll use an alternative, cheap initial rotation without use of trigonometric functions.
  • While only Q2 and Q3 vectors are a problem, Volder proposed to rotate all inputs by ±π/2\pm \pi /2
    • X0=vYX_0 = - v \cdot Y
    • Y0=vX0Y_0 = v \cdot X_0
    • θ0=θvπ2\theta_0 = \theta -v \cdot \frac{\pi}{2}
    • v={+1,θ0>01otherwisev = \begin{cases} +1,&\theta_0>0\\ -1 & {\rm otherwise} \end{cases}
    • Note that Q1 vectors are moved into Q4 and visa versa, which is fine.

Number of Steps (Complexity)

  • n+1 iterations including inital iteraion:
    • initial iteration: check sign, swap, one negate
    • n iterations: check sign, shift-add/sub
  • final scale by Ki\prod K_i can be performed if desired (converges to approximately a factor of 1.647 for a few iterations or more )
    • can also treat this as "system gain" factor and combine with overall system gain
      • such system gain factors are common in DSP operations
      • correction at end may or may not be required based on the application. If comparing to other results with the same system gain (scale factor), the adjustment is not nessisary.

Speed an Pipelining

  • CORDIC requires n+1 iterations
    • single hardware iteration stage

      • if run at system clock, computation suffers latency and low throughput
      • if using run at a fast clock, multi-clock-domain design must be introduced
    • pipeling allows throughput to "match" system clock rate (still have latency)

      • requires hardware per iteration, typically more area
      • stages are simplified as they are speciallized for each iteration. LUT for the allowed θi\theta_i is replaced by a hardwired constant per stage

Computation of SINE and COSINE

  • computed using a special case of vector rotation
  • initialize system with a specific vector: X=1, Y=0
  • final result:
    • X:cos(θ)\cos(\theta)
    • Y:sin(θ)\sin(\theta)

Free multiplication

  • initialize system with X=A, Y=0, θ\theta

    • final result:
      • X:Acos(θ)A \cdot \cos(\theta)
      • Y:Asin(θ)A \cdot \sin(\theta)
  • initialize a pipeline implemention at each time t: X=A(t)X=A(t); Y=0Y=0; θ(t)\theta(t)=Ffclock×2π×t\frac{F}{f_{clock}}\times2\pi\times t

    • result:
      • X(t):A(t(m+1))cos(θ(t(m+1)))A(t-(m+1))\cos(\theta(t-(m+1)))
      • Y(t):A(t(m+1))sin(θ(t(m+1)))A(t-(m+1))\sin(\theta(t-(m+1)))
    • This is a modulator with a latency (m+1)

N-BIT REPRESENTATION of θ\theta

IF mapping the full binary range of n bits to the range of angles around the circle,

  • angle resoution is 2π2n\frac{2\pi}{2^n}
  • the first two bits determine the quandrant
  • two's complement or unsigned inputs can be accepted without modification
  • any modular arithmetic naturally maps to the correct result (mod 2n2^n :: mod 360360^\circ) e.g.
    180+1800180^\circ+180^\circ\to0
    90±1809090^\circ\pm180\to-90^\circ