Lecture12__Communications_and_Clock

Multi-clock/Async Communications and Clock Management

Ryan Robucci


References

For classroom presentation,

$\dagger$ slides with Red Rule Lines are borrowed from
- https://web.stanford.edu/class/archive/ee/ee371/ee371.1066/lectures/lect_02.pdf
- https://web.stanford.edu/class/archive/ee/ee371/ee371.1066/lectures/Old/Older/lect_17_CDR_2up.pdf
- Copyright © 2007 by Mark Horowitz w/ material from David Harris
- Course Website: https://web.stanford.edu/class/archive/ee/ee371/ee371.1066/
$\dagger\dagger$ slides with black rule lines are borrowed from
- http://www.ee.ic.ac.uk/pcheung/teaching/ee3_DSD//Topic 6 - Clocking & Metastability.pdf
- Source slide numbers are noted in slides (e.g. $^{\dagger\dagger_8}$ is slide 8 from PYKC 31-Jan-08 E3.05 Digital System Design Topic 6)
- Imperial College OF SCIENCE, TECHNOLOGY AND MEDICINE
- University of London
  Department of Electrical & Electronic Engineering
  B.Eng. & M.Eng. Third Year Course
  Digital System Design (DSD) Spring 2009
  Peter Y. K. Cheung
  Course Website: http://www.ee.ic.ac.uk/pcheung/teaching/ee3_DSD/
$\dagger\dagger\dagger$ slides present material taken directly from https://www.allaboutcircuits.com/technical-articles/back-to-basics-the-universal-asynchronous-receiver-transmitter-uart/ (aside from bracketed or quoted text marked $\dagger\dagger\dagger\dagger$)
$\dagger\dagger\dagger\dagger$ added text from me
$\dagger\dagger\dagger\dagger\dagger$ Lecture 22: PLLs and DLLs, Weste and Harris
$\dagger\dagger\dagger\dagger\dagger\dagger$ Rabaey, Chandrakasan, and Nikolic, Digital integrated circuits: a design perspective, Prentice-Hall, 2003.

Clock Uncertainty

Correct physical design requires an appreciation for systematic and random errors
For synchronous design, we are concerned with clock skew
- can be systematic, i.e. inherent to the design (imbalanced wiring delays, buffers)
- can be random (e.g. wiring and device variation per manufacturing run, and even between each transistor in a design)
We are also concerned with temporal randomness, e.g. Clock Jitter

In general
- systematic errors can be corrected for at design-time
- manufacturing randomness can be mitigated through design or corrected (potentially fully compensated) by post-manufacturing tunable circuits
- temporal noise like Clock Jitter can in general only be mitigated, though in some cases it can be reduced through filtering

Example of bad clock distribution $^{\dagger\dagger_{18}}$

Imbalanced (systematic/design)

Balanced
Remember, clock skew matters; delay may be OK if balanced

Multiple Clock Domains $^{\dagger\dagger_{19}}$

Many digital systems have more than one clock domain:
Need to synchronize the two clock domains, two basic building blocks:
- Phase-locked loop (PLL)
- Delay-locked loop (DLL)

Example: Classical clock recovery $^{\dagger\dagger_{20}}$

Clocking information embedded in data stream
Use PLL to recover the clock
State of system is stored in analog loop filter

Oversampled Clock/Data Recovery $^{\dagger\dagger_{21}}$

Oversample the data and perform phase alignment digitally
De-couples clock generation from tracking of data
Data must guarantee transitions (edges) to ensure tracking
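
The digital phase alignment can be sketched as follows. This is a minimal illustration, assuming 3x oversampling and ideal, noise-free samples; the names `oversample` and `recover_bits` are hypothetical and not taken from the slides. It votes on where transitions fall within the oversampling window and then samples half a bit later.

```python
from collections import Counter

OSR = 3  # samples per bit (assumed oversampling ratio)

def oversample(bits, phase_offset=0, osr=OSR):
    """Produce osr samples per bit, then drop `phase_offset` leading samples
    to model an unknown phase between transmitter and receiver."""
    samples = [b for b in bits for _ in range(osr)]
    return samples[phase_offset:]

def recover_bits(samples, osr=OSR):
    """Find where transitions occur (index modulo osr), then keep the sample
    half a bit after the most common edge position."""
    edges = Counter(
        i % osr for i in range(1, len(samples)) if samples[i] != samples[i - 1]
    )
    if not edges:
        # No edges means no phase information, hence the transition guarantee.
        raise ValueError("no transitions: cannot recover phase")
    edge_pos = edges.most_common(1)[0][0]   # sample index where a new bit begins
    pick = (edge_pos + osr // 2) % osr      # mid-bit sampling phase
    return [samples[i] for i in range(pick, len(samples), osr)]
```

With `bits = [0, 1, 0, 0, 1, 1, 0, 1]`, `recover_bits(oversample(bits, 1))` returns the original bits even though the receiver starts one sample late. A stream with no transitions raises an error, which is why the data must guarantee edges.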

Phase Alignment in Source Synchronous Systems $^{\dagger\dagger_{22}}$

Timing information carried by reference clock
Use DLL to ensure proper clock phase for sampling
Can correct for skew using DLL or PLL

Purpose of PLLs and DLLs

Phase-Locked Loops (PLLs) and Delay-Locked Loops (DLLs) are two options for removing clock delay
Additional Features:
- Frequency Synthesis (clock multiplication and clock division and the combination of both)
- Clock Conditioning (modifying duty cycle and phase shifting)
- Multi-clock output (such as multiple clocks with a fixed phase or fixed frequency relationship)
  - Allows specification of related clock domains (e.g. known phase)
- Run-Time Adjustments (adjust delay/phase or frequency at run-time)

PLLs and DLLs

https://www.xilinx.com/support/documentation/application_notes/xapp132.pdf

DLL: based on a variable delay line, implemented either as a voltage-controlled delay (analog) or as a chain of delay elements with a selectable output tap along the chain

PLL: based on a variable (voltage-controlled) oscillator

In addition to the tunable element, each includes a Phase Detector, which compares a distributed clock to a reference clock in order to generate a correction signal
The control loop can employ a filter to make small adjustments over time
Parts of DLLs and PLLs:
- Phase-Frequency Detector
- Charge Pump
- Loop Filter
- Tunable Delay/Freq
- Feedback Divider

What is a Delay locked loop? $^{\dagger\dagger_{23}}$

based on delaying input reference clock, so reference clock jitter passes to output
no phase error accumulation
frequency synthesis is difficult, PLL is well-suited for this

Typical Loop Filter $^{\dagger_{L17:S30}}$

A “charge-pump” acting as an ideal integrator

Incremental update signals increase or decrease charge over many cycles

Typical Oscillation/Delay Elements

(oscillator shown)

tunable capacitance, resistance, and current drive are all possible
- http://pages.hmc.edu/harris/cmosvlsi/4e/lect/lect22.pdf
  (slide 10)

What is a Phase locked loop? $^{\dagger\dagger_{24}}$

frequency multiplication is easy
can perform frequency synthesis according to ratio of integers
reference clock jitter reduced by filtering
suffers phase error accumulation

Use of a divider for Freq Multiplication

Ignoring the initial locking procedure, by keeping every n-th edge in sync with the reference we can achieve a synchronized, higher-frequency clock

$\dagger\dagger\dagger\dagger\dagger\dagger$ (Rabaey et al.)
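
The effect of the feedback divider can be sketched with a small behavioural model. It assumes a proportional-only loop evaluated once per reference cycle; the function name `lock_vco` and the gain `kp` are hypothetical choices for illustration, not values from the slides. The loop settles only when the divided VCO output matches the reference, which forces the VCO itself to run at n times the reference frequency.

```python
def lock_vco(f_ref, n, f_start, kp=0.5, ref_cycles=60):
    """Behavioural model of a PLL with a divide-by-n feedback path.

    Once per reference cycle, accumulate the phase difference (in cycles)
    between the reference and the divided VCO output, and steer the VCO
    frequency in proportion to that accumulated phase error.
    """
    phase_err = 0.0
    f_vco = f_start
    for _ in range(ref_cycles):
        # Reference advances 1 cycle; the divided VCO advances f_vco / (n * f_ref).
        phase_err += 1.0 - f_vco / (n * f_ref)
        # Proportional control: the free-running frequency plus a correction.
        f_vco = f_start + kp * n * f_ref * phase_err
    return f_vco
```

For example, `lock_vco(10e6, 8, 70e6)` converges to about 80 MHz: a VCO that free-runs at 70 MHz is pulled to 8 times the 10 MHz reference, leaving a static phase offset in the loop.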

Linear Phase Detectors $^{\dagger_{L17:S27}}$

Correction signal is linearly proportional to error
XOR phase detector: locks at a 90° phase offset
- sensitive to input duty cycle
SR phase detector: locks at a 180° phase offset
- 1-shots remove duty cycle sensitivity
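
The linear characteristic of an XOR phase detector, and its duty-cycle sensitivity, can be checked numerically. This is a sketch that assumes ideal square waves discretised into `n` steps per period; `xor_pd_average` is a hypothetical helper name.

```python
def xor_pd_average(phase_deg, duty=0.5, n=3600):
    """Average output of an XOR phase detector over one period.

    `phase_deg` is the lag of the second input; `duty` is the duty cycle
    of both inputs. Time is discretised into n steps per period.
    """
    shift = round(phase_deg / 360 * n) % n
    high = round(duty * n)
    a = [1 if i < high else 0 for i in range(n)]
    b = [a[(i - shift) % n] for i in range(n)]   # delayed copy of a
    return sum(x ^ y for x, y in zip(a, b)) / n
```

With 50% duty cycle the average rises linearly from 0 at 0° to 1 at 180°, so the mid-scale value 0.5 sits at 90°. With 25% duty cycle the output saturates at 0.5, so the same mid-scale target no longer identifies a unique phase.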

Non-linear Phase Detector $^{\dagger_{L17:S29}}$

An ideal flip-flop should force a loop to lock at 0° phase offset
The set-up time of the flip-flop will introduce phase offset
- Symmetric structures can eliminate this problem [16]
- Can be used to cancel the set-up time of an input sampler [13]

The loop dynamics change:
- The loop is now a “bang-bang” system which dithers around a locking point.
- Risky for a PLL, routinely done for DLLs.
The dither magnitude depends on the delay through the loop and the “loop-gain”
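
The dithering behaviour can be sketched with a first-order model. It assumes integer time units, a fixed update `step` standing in for the loop gain, and an optional `latency` (in cycles) standing in for delay through the loop; all names are hypothetical.

```python
def bang_bang_lock(target, start, step, cycles, latency=0):
    """Simulate a first-order bang-bang loop.

    Each cycle the phase detector reports only early/late, based on the
    delay value from `latency` cycles ago, and the delay moves by `step`.
    Returns the delay value after every cycle.
    """
    delay = start
    past = [start] * (latency + 1)     # delay values visible to the detector
    history = []
    for _ in range(cycles):
        late = past[0] > target        # 1-bit decision on a possibly stale value
        delay += -step if late else step
        past = past[1:] + [delay]
        history.append(delay)
    return history
```

With `target=1000`, `start=0`, `step=30`, the loop ends up alternating between 990 and 1020, a peak-to-peak dither of one step. Adding one cycle of latency widens the dither to 90 (960 to 1050), and a larger step widens it further.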

Timing Loop Performance Parameters $^{\dagger\dagger_{25}}$

Phase Jitter:

Variance in cycle-to-cycle clock period
Related to ability to "filter" reference clock jitter (which could be desired or not)
Phase Offset: Error between output phase and reference phase

Bandwidth: rate at which output phase tracks reference, ability to track changing clock frequency/phase
Acquisition time (to lock)
Frequency range (lock range)
Practical Note on Lock:
- Common to have an output "LOCK" signal to indicate when it is valid to start system operation. Can tie to a reset signal in your design

Zero Delay Buffer

Clock Management with DLL $^{\dagger\dagger_{26}}$

Can eliminate on-chip clock delay (so can PLL)
can also eliminate on-board clock delay by returning a reference from the board (so can PLL)
4 fixed-phase outputs (0, 90, 180, 270) (can be implemented by multiple taps along a delay chain)
- Multiple outputs are always phase-coherent to each other (as opposed to parallel PLLs)
Selectable run-time phase shift of n/256 of the clock period, set through configuration or through increment/decrement steps of 1/256 of the clock period (or 50 picosecond granularity)
Frequency synthesis (2x is common, PLL more capable)
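
As a quick check of what an n/256 phase step means in time, the helper below converts a step count to picoseconds. The 100 MHz clock in the usage note is an assumed example, and `phase_shift_ps` is a hypothetical name.

```python
def phase_shift_ps(clock_hz, n, steps=256):
    """Delay in picoseconds for a shift of n/steps of the clock period."""
    period_ps = 1e12 / clock_hz
    return n * period_ps / steps
```

For a 100 MHz clock (10 ns period), one step is `phase_shift_ps(100e6, 1)` = 39.0625 ps, and 64 steps give 2500 ps, the 90° tap.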

DLL in Xilinx Virtex data/clock alignment $^{\dagger\dagger_{27}}$

Xilinx DLL with various phase outputs $^{\dagger\dagger_{28}}$

DLLs are commonly used to generate outputs at the same frequency but different phases, just by having multiple taps along the delay line

Using DLL to de-skew onboard clock signals $^{\dagger\dagger_{29}}$

FPGA can drive external clocks, and by passing back an external (board level) reference signal they can be adjusted to match a reference clock (at a known phase)
This compensates for unknown loads (common to see with external memory ICs)

Altera Cyclone II PLL (1) $^{\dagger\dagger_{30}}$

Phase-locked loop (PLL) is a closed-loop frequency-control system based on the phase difference between the input clock signal and the feedback clock signal of a controlled oscillator.

Main components:

Phase frequency detector (PFD)
Charge pump & loop filter
Voltage controlled oscillator (VCO)
Counters (N – pre-scale, M – feedback, C – post-scale)

Altera Cyclone II PLL (3) $^{\dagger\dagger_{32}}$

The output frequency is given by:
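
As a sketch, assuming the usual relation for this counter arrangement, $f_{out} = f_{in} \cdot M / (N \cdot C)$ (the VCO runs at $f_{in} \cdot M / N$ and is then post-scaled by $C$), the code below evaluates the output frequency and searches for counter values. The search limit `max_count` is hypothetical, and real devices also restrict the VCO frequency range, which is ignored here.

```python
from fractions import Fraction

def pll_output_hz(f_in_hz, m, n, c):
    """f_out = f_in * M / (N * C): N pre-scale, M feedback, C post-scale."""
    return Fraction(f_in_hz) * m / (n * c)

def find_counters(f_in_hz, f_out_hz, max_count=32):
    """Brute-force the first (M, N, C) that hits f_out exactly, or None.

    max_count is an illustrative limit, not a device specification.
    """
    for n in range(1, max_count + 1):
        for c in range(1, max_count + 1):
            for m in range(1, max_count + 1):
                if pll_output_hz(f_in_hz, m, n, c) == f_out_hz:
                    return m, n, c
    return None
```

For instance, `find_counters(50_000_000, 75_000_000)` returns `(3, 1, 2)`: multiply by 3 in the feedback path, then divide by 2 at the output.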

References for this topic $^{\dagger\dagger_{33}}$

Chapter 8, pp757-773, Digital Design Principles & Practices, John Wakerly.
“Metastability in Altera Devices”, Altera App Note 42.

“Using the ClockLock &ClockBoost PLL Features in APEX Devices”, Altera App Note 115.
“Using the Virtex Delay-Locked Loop”, XAPP-132.
“Advantages of APEX PLLs Over Virtex DLLs”, Altera TB60.

UART $\dagger\dagger\dagger$

UART: universal asynchronous receiver/transmitter
basic UART system provides robust, moderate-speed, full-duplex communication with only three signals: Tx (transmitted serial data), Rx (received serial data), and ground. In contrast to other protocols such as SPI and I2C, no clock signal is required because the user gives the UART hardware the necessary timing information.
Internal “clocks” of systems must be synchronized based on the data signal
System must both know the approximate clock rate and generate internal clocks
The error in the rates must be low enough to assume synchronization for one frame, so frames are kept short (on the order of 10 bits)
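
A rough bound on the allowable rate error follows from this. The sketch assumes a 10-bit frame (1 start, 8 data, 1 stop), resynchronization on the start edge, ideal mid-bit sampling, and ignores oversampling quantization and finite edge rates; `max_clock_mismatch` is a hypothetical name.

```python
def max_clock_mismatch(frame_bits=10, margin_bits=0.5):
    """Largest fractional rate error that keeps the last sample inside its bit.

    The receiver resynchronises on the start edge and samples mid-bit, so
    the last bit is sampled (frame_bits - 0.5) bit times later; the drift
    accumulated by then must stay below `margin_bits`.
    """
    return margin_bits / (frame_bits - 0.5)
```

For the default frame this gives roughly 5% total mismatch between transmitter and receiver, a budget that both ends share. Longer frames shrink the tolerance, which is one reason frames are kept short.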

Start and Stop bits $\dagger\dagger\dagger$

Start bit: The first bit of a one-byte UART transmission. It indicates that the data line is leaving its idle state. The idle state is typically logic high, so the start bit is logic low.
The start bit is an overhead bit; this means that it facilitates communication between receiver and transmitter but does not transfer meaningful data.
Stop bit: The last bit of a one-byte UART transmission. Its logic level is the same as the signal’s idle state, i.e., logic high. This is another overhead bit.
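
The framing can be sketched as a list of line levels. The example assumes 8 data bits sent least-significant bit first (the usual UART convention, though the text above does not state the bit order); `uart_frame` is a hypothetical helper.

```python
def uart_frame(byte, stop_bits=1):
    """Line levels for one byte: start bit (0), 8 data bits, stop bit(s) (1).

    Data bits are emitted LSB first, which is an assumption of this sketch.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte out of range")
    data = [(byte >> i) & 1 for i in range(8)]
    return [0] + data + [1] * stop_bits
```

For `0x41` the frame is `[0, 1, 0, 0, 0, 0, 0, 1, 0, 1]`: ten line levels, of which only eight carry data.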

Baud Rate $\dagger\dagger\dagger$

Baud rate: The approximate rate (in bits per second, or bps) at which data can be transferred.
A more precise definition is the frequency (in bps) corresponding to the time (in seconds) required to transmit one bit of digital data.
For example, with a 9600-baud system, one bit requires 1/(9600 bps) ≈ 104.2 µs. The system cannot actually transfer 9600 bits of meaningful data per second because additional time is needed for the overhead bits and perhaps for delays between one-byte transmissions.
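
The arithmetic can be written out as below. The payload calculation assumes 8 data bits with one start and one stop bit and no idle time between frames; the function names are hypothetical.

```python
def bit_time_us(baud):
    """Duration of one bit in microseconds."""
    return 1e6 / baud

def payload_rate_bps(baud, data_bits=8, overhead_bits=2):
    """Upper bound on useful data rate once overhead bits are counted."""
    return baud * data_bits / (data_bits + overhead_bits)
```

At 9600 baud, `bit_time_us(9600)` is about 104.2 µs and `payload_rate_bps(9600)` is 7680 bps, which shows why the link cannot carry 9600 bits of meaningful data per second.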

Parity Bit $\dagger\dagger\dagger$

Parity bit: An error-detection bit added to the end of the byte.

There are two types—“odd parity” means that the parity bit will be logic high if the data byte contains an even number of logic-high bits, and “even parity” means that the parity bit will be logic high if the data byte contains an odd number of logic-high bits. This may seem counterintuitive, but the idea is that the parity bit ensures that the number of logic-high bits is always even (for even parity) or odd (for odd parity). So if you’re using even parity and the byte has three logic-high bits, the parity bit will be logic high, so that the total number of logic-high bits in the transmitted data (i.e., the byte itself plus the parity bit) is even.
- By forcing the number of logic-high bits to be always even (for even parity) or odd (for odd parity), the parity bit provides a crude error-detection mechanism—if a bit gets flipped somewhere in the transmission process, the number of logic-high bits won’t match the chosen parity mode. Of course, the strategy breaks down if two bits are flipped, so the parity bit is far from bulletproof. If you have a serious need for error-free communication, I recommend a CRC.
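
The parity rules above can be sketched in a few lines. The helper names `parity_bit` and `parity_ok` are hypothetical.

```python
def parity_bit(byte, mode="even"):
    """Parity bit for an 8-bit value under 'even' or 'odd' parity."""
    if mode not in ("even", "odd"):
        raise ValueError("mode must be 'even' or 'odd'")
    ones = bin(byte & 0xFF).count("1")
    bit = ones % 2                      # even parity: 1 when the data has an odd count
    return bit if mode == "even" else bit ^ 1

def parity_ok(byte, parity, mode="even"):
    """True if the received byte and parity bit agree with the chosen mode."""
    return parity_bit(byte, mode) == parity
```

A byte with three logic-high bits such as `0b00000111` gets an even-parity bit of 1. Flipping one data bit makes `parity_ok` return `False`, while flipping two bits goes undetected.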

Synchronizing and Sampling $\dagger\dagger\dagger$

To ensure that an active edge of the receiver clock can occur near the middle of the bit period, the frequency of the baud-rate clock sent to the receiver module is much higher (by a factor of 8 or 16 or even 32) than the actual baud rate.

Let’s say that one bit period corresponds to 16 receiver clock cycles. In this case, synchronization and sampling can proceed as follows:
- The receive process is initiated by the falling edge of the start bit.
- The receiver waits for 8 clock cycles, in order to establish a sampling point that is near the middle of the bit period.
- The receiver then waits 16 clock cycles, which brings it to the middle of the first data-bit period.
- The first data bit is sampled and stored in the receive register, and then the module waits another 16 clock cycles before sampling the second data bit.
- This process repeats until all data bits have been sampled and stored, and then the rising edge of the stop bit returns the UART interface to its idle state.
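
The steps above can be sketched as a software model of the receiver. It assumes the line has already been sampled at 16 times the baud rate into a list of 0/1 values, 8 data bits sent LSB first, and one stop bit; `uart_receive` is a hypothetical name.

```python
OSR = 16  # receiver clock cycles per bit period

def uart_receive(samples, data_bits=8, osr=OSR):
    """Decode one frame from line samples taken at osr x baud.

    Returns the received value, or None if no valid frame is found.
    """
    # 1. Find the falling edge of the start bit (idle is high).
    start = next(
        (i for i in range(1, len(samples))
         if samples[i - 1] == 1 and samples[i] == 0),
        None,
    )
    if start is None:
        return None
    # 2. Wait osr/2 cycles to reach the middle of the start bit.
    mid = start + osr // 2
    if mid >= len(samples) or samples[mid] != 0:
        return None  # line went back high: a glitch, not a start bit
    # 3. Sample each data bit osr cycles apart (LSB first, assumed).
    value = 0
    for k in range(data_bits):
        idx = mid + (k + 1) * osr
        if idx >= len(samples):
            return None
        value |= samples[idx] << k
    # 4. The stop bit must be high.
    stop = mid + (data_bits + 1) * osr
    if stop >= len(samples) or samples[stop] != 1:
        return None
    return value
```

Checking the line again at the middle of the start bit is an addition of this sketch; it rejects short low glitches that would otherwise start a false frame.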

RS232

RS232 Standard specifies voltages and hardware connectors
https://et.wikipedia.org/wiki/RS-232
Commonly referenced and emulated communication, e.g. USB-to-Serial
https://www.cdw.com/product/Iogear-16.6in-USB-to-Serial-RS-232-Adapter/261218

Hardware Flow Control: RTS (Request to Send) and CTS (Clear to Send) are optional additional hardware signals supporting hardware flow control. (additional control signals exist as well)
- Disadvantage is additional physical conductors.
Software flow control instead uses special characters Xoff & Xon which a “receiver” sends to the “sender” to pause and resume transmission respectively.
- Could be sent by either end
- Disadvantage that Xon and Xoff characters may exist in the actual data being transmitted, requiring alternative encoding, sometimes using escape sequences or other methods like headers
- Sending of Xon and Xoff can be delayed as compared to hardware flow control signals, so the communication should not send data too quickly
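
One way to keep Xon and Xoff out of the payload is an escape sequence. The sketch below uses the standard ASCII control codes for Xon (0x11) and Xoff (0x13); the escape byte `0x7D` and the XOR-with-0x20 transform are hypothetical choices for illustration, not part of RS232.

```python
XON, XOFF = 0x11, 0x13   # ASCII DC1 / DC3
ESC = 0x7D               # hypothetical escape byte chosen for this sketch

def escape(payload: bytes) -> bytes:
    """Replace reserved bytes with ESC followed by a transformed byte."""
    out = bytearray()
    for b in payload:
        if b in (XON, XOFF, ESC):
            out += bytes([ESC, b ^ 0x20])   # reserved values never appear raw
        else:
            out.append(b)
    return bytes(out)

def unescape(stream: bytes) -> bytes:
    """Invert escape(): ESC means the next byte is XORed with 0x20."""
    out = bytearray()
    it = iter(stream)
    for b in it:
        out.append(next(it) ^ 0x20 if b == ESC else b)
    return bytes(out)
```

After escaping, any raw `0x11` or `0x13` seen on the line is a genuine flow-control character. The cost is that the encoded stream can be up to twice as long as the payload.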
