

# Jitter in PCIe application on embedded boards with PLL Zero delay Clock buffer

Hermann Ruckerbauer EKH - EyeKnowHow 94469 Deggendorf, Germany Hermann.Ruckerbauer@EyeKnowHow.de



1) PCI-Express Clocking

2) Motivation and Background

2a) Basics

2b) Clocking in different PCIe generations

2c) Different PCIe clocking architectures

3) Clock Compliance Test

4) Conclusion


**EYE KNOW HOW** 



While PCIe data lanes run at speeds up to 8 Gb/s, the RefClk runs at only 100 MHz!

- While the data signals are treated as "high speed", the clock is often treated as a low-speed signal
- But the PCIe specification relies on a well-defined jitter behavior, so it is very important to verify the RefClk behavior carefully

The PCIe RefClk also needs to be measured as part of the compliance measurements in a PCIe system! Jitter as well as signal integrity and AC parameters need to be verified in these tests!




EKH - EyeKnowHow 3/1/2014

## Important Basics: Jitter amplitude



Jitter explanation taken from the Agilent Jitter seminar 2006:
 Jitter Analysis Techniques for High Data Rates (Application Note 1432)

Figure 1. Comparison of an ideal clock and a sinusoidally jittered clock. The jitter amplitude is % UI.



## Important Basics: PLL Jitter Transfer Function


PLL should follow Low Frequency Jitter (e. g. SSC or Wander) and cancel High Frequency Jitter

#### **Solution:** Jitter transfer function of a PLL

- Ideal behavior (blue line)
  - As long as the PLL follows the low-frequency jitter, this jitter is not seen in the eye diagram
- Possible real behavior (red dots)
  - If the PLL/CDR cannot follow this jitter, it will show up in the eye diagram
- Effects like "jitter peaking" are not considered in this picture

Figure: Effect of CDR (and jitter transfer function) on jitter, plotted over modulation frequency (Hz).



#### 1<sup>st</sup> or 2<sup>nd</sup> Order of a PLL\* ?

**About the order of PLL** – The order of a PLL is specified by its transfer function. If there is no filter, the PLL is called a first order PLL. The highest power of s in the denominator is used as an indicator of the loop order. The transfer function below is for a 2nd order loop.

The loop transfer equation can be written in this fashion,

$$H(s) = \frac{\omega_n^2 + 2s\xi\omega_n}{s^2 + 2s\xi\omega_n + \omega_n^2}$$

where

$$\omega_n = \sqrt{\frac{K}{\tau_1}} \qquad \xi = \frac{\omega_n \tau_2}{2}$$
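To make the shape of this transfer function more tangible, the following minimal sketch evaluates H(s) at s = j2πf. The function names, the damping factor of 0.54 and the 8 MHz bandwidth are illustrative assumptions, not values prescribed at this point of the presentation. The bandwidth helper simply solves |H|² = 1/2 for the expression above.

```python
import cmath
import math


def pll_h(f_hz, fn_hz, xi):
    """Second-order PLL jitter transfer function H(s) evaluated at s = j*2*pi*f."""
    s = 2j * math.pi * f_hz
    wn = 2 * math.pi * fn_hz
    return (wn**2 + 2 * s * xi * wn) / (s**2 + 2 * s * xi * wn + wn**2)


def natural_freq_from_bw(f3db_hz, xi):
    """Natural frequency that gives the requested -3 dB bandwidth for H(s)."""
    a = 1 + 2 * xi**2
    return f3db_hz / math.sqrt(a + math.sqrt(a**2 + 1))


def peaking_db(fn_hz, xi, points=2000):
    """Largest gain of |H| in dB, found by scanning from 0.01*fn to 10*fn."""
    best = 0.0
    for i in range(points + 1):
        f = fn_hz * 10 ** (-2 + 3 * i / points)
        best = max(best, abs(pll_h(f, fn_hz, xi)))
    return 20 * math.log10(best)


xi = 0.54      # illustrative damping factor (assumption)
f3db = 8e6     # illustrative -3 dB bandwidth in Hz (assumption)
fn = natural_freq_from_bw(f3db, xi)

for f in (10e3, 1e6, f3db, 50e6):
    gain_db = 20 * math.log10(abs(pll_h(f, fn, xi)))
    print(f"{f / 1e6:8.2f} MHz: {gain_db:7.2f} dB")
print(f"peaking: {peaking_db(fn, xi):.2f} dB")
```

Low-frequency jitter passes with a gain close to 0 dB (the PLL follows it), while jitter far above the bandwidth is attenuated. With this damping factor the scan finds a gain peak of roughly 3 dB below the bandwidth, which is the "jitter peaking" effect.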

\*Source: http://complextoreal.com/tutorials/ Tutorial 18 and 19


## **Clock Specification over PCIe Generations**



## First major impact on clock performance was the change from PCIe 1.0a to 1.1:

"The PCI Express 1.0a specification failed to specify the input bandwidth the reference clock receiver or phase jitter of the reference clock itself. This is important because jitter that lies within the loop bandwidth the receiver PLL for the reference clock will transfer onto the high speed data lines. This hole in the PCI Express specification was corrected in the 1.1 update" \*

## In Gen1 the RefClk spec was part of the CEM spec, in Gen2 it moved to the base spec

Other parameters are added

## Gen3 again added some parameters that are required to be measured for compliance testing

\*Source: Agilent application note 5989-1240EN

**Clock Specification over PCIe Generations** 



Gen2 compliance testing changed to "Dual Port Testing". This test method convolves signal and clock traces and calculates the signal quality for the data traces

"Dual Port" signal compliance testing does not mean separate clock tests are obsolete!

Lower-speed RefClk specs are not subsets of the Gen3 specification. Each generation covers its own frequency requirement

**RefClk Tests are required for each generation separately!** 




## **Clocking Architectures**



#### Three different clocking architectures are defined:

- Common clocked architecture
- Data clocked architecture
- Separate clocked architecture
  - This would require a much tighter clock spec and no SSC in the system. This configuration is not covered in this presentation
- Although called "clocking architectures", these are receiver (RX) implementation architectures

Usually a system designer does not know about the RX clocking architecture of the used devices (e. g. for AddIn Cards). So from a system perspective it is required to support both, common clocked and data clocked architecture!



#### Which Transfer is shown ?

- Host to AddIn Card ?
- Which components are where ?
- How to calculate the transport delay T2 –T1 ?



### How does this map to embedded Applications?

# Common clocked RX architecture

Assuming "Zero Delay" for the clock buffer, there is only a small delta between RefClk and TX transport delay



**Carrier Board** 

"Jitter Domain" on RX is the same for Clk and Data! Small delta for transport delay due to system configuration



**No Transport Delay requirement!** 





**No Transport delay requirement** 



## Common clocked RX architecture: Transfer from AddIn Card








## Common Clocked RX architecture: Transport delay calculation



# Transport delay calculation for embedded systems

### Parameters to be considered:

- Signal flight times on PCB
  - CPU Module
  - Carrier Board
  - AddIn Card
- Delays introduced from connectors
- RX/TX device delays
- "Zero Delay" Buffer

# Common Clocked RX architecture



- 4 inch routing on the AddIn Card for clock and data calculates to:
   2 signals x 4 inch x 175 ps/inch = 1.4 ns
- Assuming a connector delay of 150 ps, and knowing that each of the two signals crosses two connectors, this adds 0.6 ns
- The Zero Delay clock buffer needs to be considered with several parameters, e.g. clock output skew.
  - Reference [4] calculates these to another 2 ns
- TX internal delays also need to be considered with 2 ns
- These delays take 6 ns from the specified 12 ns maximum transport delay


The remaining 6 ns need to be distributed to clock and data lines, so 3 ns for each signal

 With 175 ps/inch this allows about 17 inch of routing on carrier board and CPU module

5 inch of module routing leaves 12 inch of routing on the carrier board

The special implementation with connectors and Zero Delay clock buffer limits the solution space for PCIe clock distribution to around 12 inch of clock routing!
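The budget arithmetic from these slides can be collected in a few lines. This is only a sketch: the function and parameter names are made up for illustration, and the default values are the example numbers used in the slides.

```python
PS_PER_INCH = 175.0                 # PCB propagation delay used in the slides
MAX_TRANSPORT_DELAY_PS = 12_000.0   # maximum transport delay named in the slides


def carrier_routing_budget(addin_inch=4.0, connector_ps=150.0,
                           connectors_per_signal=2, buffer_ps=2_000.0,
                           tx_ps=2_000.0, module_inch=5.0, signals=2):
    """Return (used delay in ps, inches per signal, inches left for the carrier)."""
    addin_ps = signals * addin_inch * PS_PER_INCH                 # 1.4 ns
    connectors_ps = signals * connectors_per_signal * connector_ps  # 0.6 ns
    used_ps = addin_ps + connectors_ps + buffer_ps + tx_ps
    per_signal_ps = (MAX_TRANSPORT_DELAY_PS - used_ps) / signals
    total_inch = per_signal_ps / PS_PER_INCH
    return used_ps, total_inch, total_inch - module_inch


used_ps, total_inch, carrier_inch = carrier_routing_budget()
print(f"used: {used_ps / 1000:.1f} ns, "
      f"module + carrier: {total_inch:.1f} inch, carrier: {carrier_inch:.1f} inch")
```

Changing a single parameter, for example a longer module trace or a slower buffer, shows directly how much carrier board routing remains.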




## Clock Compliance: Test Setup



### Specified clock compliance test Setup

- No Termination and
- 2pF load to GND



RefClk Test Setup

## Clock Compliance: Test Setup



#### The CLB (Compliance Load Board) is not optimized for clock compliance tests

- The normal test configuration connects 50 Ohm SMA cables to the scope input with 50 Ohm termination
- This does not fit the specified test load configuration: no termination and 2 pF load to GND



## Clock Compliance: Test Setup



# Difficult to create a common setup for all measurements.

- Standard configuration in most cases will be SMP/SMA cables to a 50 Ohm scope input
  - Due to the 50 Ohm termination this setup does not allow crossing point and rise/fall time tests
  - Jitter tests will show only a small difference compared to the open circuit test
- Using a differential active probe provides the open circuit, but cannot measure the clock crossing point.
- Best would be to use two single-ended active probes.
  But there is no GND pin at the probe connector of the CLB!

## **Compliance Test**

#### **Clock compliance tests require:**

- AC parametric test
- Jitter test

#### Available tools

- PCI-SIG ClockJitter tool: only jitter evaluation
- Scope vendors' compliance applications

#### Agilent PCIe Gen2 RefClock compliance Test result example

| Pass         | # Failed | # Trials | Test Name                                                                     | Actual Value | Margin  | Pass Limits         |
|--------------|---|---|--------------------------------------------------------------------------------------------|----------|---------|---------------------|
| $\checkmark$ | 0 | 1 | System Board Tx, RMS Random Jitter without crosstalk (PCIE 2.0, 5.0 GT/s)                  | 3.495ps  | 92.7 %  | VALUE <= 48.000ps   |
| $\checkmark$ | 0 | 1 | System Board Tx, Maximum Deterministic Jitter without crosstalk (PCIE 2.0, 5.0 GT/s)       | 5.639ps  | 87.2 %  | VALUE <= 44.000ps   |
| $\checkmark$ | 0 | 1 | System Board Tx, Total Jitter at BER-12 without crosstalk (PCIE 2.0, 5.0 GT/s)             | 54.809ps | 40.4 %  | VALUE <= 92.000ps   |
| $\checkmark$ | 0 | 1 | Reference Clock, High frequency > 1.5MHz RMS Jitter (Common Clk) (PCIE 2.0, 5.0 GT/s)      | 2.42ps   | 21.9 %  | VALUE <= 3.10ps     |
| $\checkmark$ | 0 | 1 | Reference Clock, SSC Residual (Common Clk) (PCIE 2.0, 5.0 GT/s)                            | 19.70ps  | 73.7 %  | VALUE <= 75.00ps    |
| $\checkmark$ | 0 | 1 | Reference Clock, Low frequency 10kHz - 1.5MHz RMS Jitter (Common Clk) (PCIE 2.0, 5.0 GT/s) | 590fs    | 80.3 %  | VALUE <= 3.00ps     |
| $\checkmark$ | 0 | 1 | Reference Clock, SSC Deviation (Common Clk) (PCIE 2.0, 5.0GT/s)                            | 465.9m%  | 6.8 %   | VALUE <= 500.0m%    |
| $\checkmark$ | 0 | 1 | Reference Clock, Maximum SSC Slew Rate (Common Clk) (PCIE 2.0, 5.0GT/s)                    | 3.8fs/UI | 99.5 %  | VALUE <= 750.0fs/UI |
| $\checkmark$ | 0 | 1 | Reference Clock, High frequency > 1.5MHz RMS Jitter (Data Clk) (PCIE 2.0, 5.0 GT/s)        | 2.92ps   | 27.0 %  | VALUE <= 4.00ps     |
| $\checkmark$ | 0 | 1 | Reference Clock, Low frequency 10kHz - 1.5MHz RMS Jitter (Data Clk) (PCIE 2.0, 5.0 GT/s)   | 670fs    | 91.1 %  | VALUE <= 7.50ps     |
| $\checkmark$ | 0 | 1 | Reference Clock, Full SSC Modulation (Data Clk) (PCIE 2.0, 5.0 GT/s)                       | 6.0ps    | 100.0 % | VALUE <= 20.0000ns  |
| $\checkmark$ | 0 | 1 | Reference Clock, SSC Deviation (Data Clk) (PCIE 2.0, 5.0 GT/s)                             | 466.3m%  | 6.7 %   | VALUE <= 500.0m%    |
| $\checkmark$ | 0 | 1 | Reference Clock, Maximum SSC Slew Rate (Data Clk) (PCIE 2.0, 5.0 GT/s)                     | 3.8fs/UI | 99.5 %  | VALUE <= 750.0fs/UI |

#### Agilent PCIe Gen1 RefClock compliance Test result example



Clock Jitter Test Results

| Pass         | # Failed | # Trials | Test Name                                        | Actual Value | Margin  | Pass Limits                   |
|--------------|---|---|-------------------------------------------------------------|-----------|---------|-------------------------------|
|              |   |   | Reference Clock, Phase Jitter (PCIE 1.1)                    | 36.38ps   | 57.7 %  | VALUE <= 86.00ps              |
|              |   |   | Reference Clock, Rising Edge Rate (PCIE 1.1)                | 1.23V/ns  | 18.5 %  | 600mV/ns <= VALUE <= 4.00V/ns |
| $\checkmark$ | 0 | 1 | Reference Clock, Falling Edge Rate (PCIE 1.1)               | 1.13V/ns  | 15.6 %  | 600mV/ns <= VALUE <= 4.00V/ns |
| $\checkmark$ | 0 | 1 | Reference Clock, Differential Input High Voltage (PCIE 1.1) | 418mV     | 178.7 % | VALUE >= 150mV                |
| $\checkmark$ | 0 | 1 | Reference Clock, Differential Input Low Voltage (PCIE 1.1)  | -405mV    | 170.0 % | VALUE <= -150mV               |
| $\checkmark$ | 0 | 1 | Reference Clock, Average Clock Period (PCIE 1.1)            | 2.468kppm | 10.7 %  | -300ppm <= VALUE <= 2.800kppm |
| $\checkmark$ | 0 | 1 | Reference Clock, Duty Cycle (PCIE 1.1)                      | 49.6%     | 48.0 %  | 40.0% <= VALUE <= 60.0%       |
| $\checkmark$ | 0 | 1 | Reference Clock, Variation of VCross (PCIE 1.1)             | 42.3mV    | 69.8 %  | VALUE <= 140.0mV              |
| $\checkmark$ | 0 | 1 | Reference Clock, Absolute Max Input Voltage (PCIE 1.1)      | 410.4mV   | 64.3 %  | VALUE <= 1.1500V              |
| $\checkmark$ | 0 | 1 | Reference Clock, Absolute Min Input Voltage (PCIE 1.1)      | -14.3mV   | 95.2 %  | VALUE >= -300.0mV             |

## Compliance Test: Clocking/RX architectures



#### Data lane compliance testing utilizes the "Dual Port" methodology

- Clock and data are captured with a single acquisition and Sigtest convolves the data to generate a data eye.
- The Dual Port compliance test methodology does not fit the data clocked architecture
- Data lane compliance testing does show problems on clock and data, but does not allow distinguishing the source of the problem.

With the "Dual Port" test setup in PCIe Gen2 compliance tests it is difficult to analyze the root cause of compliance issues!

## Compliance Test Pass/Fail Clock Compliance






## CLK Compliance Test Pass/Fail Clock Compliance

### **Failing Clock Compliance**

| Pass         | # Failed | # Trials | Test Name                                                                     | Actual Value | Margin  | Pass Limits          |
|--------------|---|---|--------------------------------------------------------------------------------------------|-----------|---------|----------------------|
| ×            | 1 | 1 | Reference Clock, High frequency > 1.5MHz RMS Jitter (Common Clk) (PCIE 2.0, 5.0 GT/s)      | 5.19 ps   | -67.4 % | VALUE <= 3.10 ps     |
| $\checkmark$ | 0 | 1 | Reference Clock, SSC Residual (Common Clk) (PCIE 2.0, 5.0 GT/s)                            | 39.21 ps  | 47.7 %  | VALUE <= 75.00 ps    |
| $\checkmark$ | 0 | 1 | Reference Clock, Low frequency 10kHz - 1.5MHz RMS Jitter (Common Clk) (PCIE 2.0, 5.0 GT/s) | 340 fs    | 88.7 %  | VALUE <= 3.00 ps     |
| $\checkmark$ | 0 | 1 | Reference Clock, SSC Deviation (Common Clk) (PCIE 2.0, 5.0GT/s)                            | 471.1 m%  | 5.8 %   | VALUE <= 500.0 m%    |
| $\checkmark$ | 0 | 1 | Reference Clock, Maximum SSC Slew Rate (Common Clk) (PCIE 2.0, 5.0GT/s)                    | 7.1 fs/UI | 99.1%   | VALUE <= 750.0 fs/UI |
| ×            | 1 | 1 | Reference Clock, High frequency > 1.5MHz RMS Jitter (Data Clk) (PCIE 2.0, 5.0 GT/s)        | 6.52 ps   | -63.0 % | VALUE <= 4.00 ps     |
| $\checkmark$ | 0 | 1 | Reference Clock, Low frequency 10kHz - 1.5MHz RMS Jitter (Data Clk) (PCIE 2.0, 5.0 GT/s)   | 530 fs    | 92.9 %  | VALUE <= 7.50 ps     |
| $\checkmark$ | 0 | 1 | Reference Clock, Full SSC Modulation (Data Clk) (PCIE 2.0, 5.0 GT/s)                       | 6.0 ps    | 100.0 % | VALUE <= 20.0000 ns  |
| $\checkmark$ | 0 | 1 | Reference Clock, SSC Deviation (Data Clk) (PCIE 2.0, 5.0 GT/s)                             | 469.3 m%  | 6.1 %   | VALUE <= 500.0 m%    |
| $\checkmark$ | 0 | 1 | Reference Clock, Maximum SSC Slew Rate (Data Clk) (PCIE 2.0, 5.0 GT/s)                     | 7.2 fs/UI | 99.0 %  | VALUE <= 750.0 fs/UI |

### **Passing Clock Compliance**

High-frequency RMS jitter: about 5 ps in the failing case vs. about 2 ps in the passing case

| Pass         | # Failed | # Trials | Test Name                                                                     | Actual Value | Margin  | Pass Limits         |
|--------------|---|---|--------------------------------------------------------------------------------------------|----------|---------|---------------------|
| $\checkmark$ | 0 | 1 | Reference Clock, High frequency > 1.5MHz RMS Jitter (Common Clk) (PCIE 2.0, 5.0 GT/s)      | 2.37ps   | 23.5 %  | VALUE <= 3.10ps     |
| $\checkmark$ | 0 | 1 | Reference Clock, SSC Residual (Common Clk) (PCIE 2.0, 5.0 GT/s)                            | 21.58ps  | 71.2 %  | VALUE <= 75.00ps    |
| $\checkmark$ | 0 | 1 | Reference Clock, Low frequency 10kHz - 1.5MHz RMS Jitter (Common Clk) (PCIE 2.0, 5.0 GT/s) | 590fs    | 80.3 %  | VALUE <= 3.00ps     |
| $\checkmark$ | 0 | 1 | Reference Clock, SSC Deviation (Common Clk) (PCIE 2.0, 5.0GT/s)                            | 465.4m%  | 6.9 %   | VALUE <= 500.0m%    |
| $\checkmark$ | 0 | 1 | Reference Clock, Maximum SSC Slew Rate (Common Clk) (PCIE 2.0, 5.0GT/s)                    | 3.8fs/UI | 99.5 %  | VALUE <= 750.0fs/UI |
| $\checkmark$ | 0 | 1 | Reference Clock, High frequency > 1.5MHz RMS Jitter (Data Clk) (PCIE 2.0, 5.0 GT/s)        | 2.88ps   | 28.0 %  | VALUE <= 4.00ps     |
| $\checkmark$ | 0 | 1 | Reference Clock, Low frequency 10kHz - 1.5MHz RMS Jitter (Data Clk) (PCIE 2.0, 5.0 GT/s)   | 660fs    | 91.2 %  | VALUE <= 7.50ps     |
| $\checkmark$ | 0 | 1 | Reference Clock, Full SSC Modulation (Data Clk) (PCIE 2.0, 5.0 GT/s)                       | 5.7ps    | 100.0 % | VALUE <= 20.0000ns  |
| $\checkmark$ | 0 | 1 | Reference Clock, SSC Deviation (Data Clk) (PCIE 2.0, 5.0 GT/s)                             | 464.3m%  | 7.1 %   | VALUE <= 500.0m%    |
| $\checkmark$ | 0 | 1 | Reference Clock, Maximum SSC Slew Rate (Data Clk) (PCIE 2.0, 5.0 GT/s)                     | 4.0fs/UI | 99.5 %  | VALUE <= 750.0fs/UI |
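The percentage column in these result tables can be reproduced with a simple rule. The rule below is inferred from the numbers shown in the tables, not taken from the compliance application's documentation, and the function name `margin_percent` is hypothetical.

```python
def margin_percent(value, low=None, high=None):
    """Margin in percent of the limit (or limit range); negative means a violation."""
    if low is not None and high is not None:
        # Two-sided limit: distance to the nearer limit relative to the range.
        return 100.0 * min(value - low, high - value) / (high - low)
    if high is not None:
        # Upper limit only: remaining headroom relative to the limit.
        return 100.0 * (high - value) / abs(high)
    if low is not None:
        # Lower limit only: excess over the limit relative to the limit.
        return 100.0 * (value - low) / abs(low)
    raise ValueError("at least one limit is required")


# Values taken from the result tables (ps, %, mV)
print(round(margin_percent(5.19, high=3.10), 1))      # failing HF RMS jitter
print(round(margin_percent(2.37, high=3.10), 1))      # passing HF RMS jitter
print(round(margin_percent(49.6, 40.0, 60.0), 1))     # duty cycle
print(round(margin_percent(418.0, low=150.0), 1))     # differential input high voltage
```

A negative margin, as in the failing high-frequency RMS jitter rows, expresses how far the measured value lies beyond the limit.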

## CLK Clock Compliance Test: Post Processing for Gen2



## The clock compliance test algorithms require quite some post-processing of the data



For failure analysis these functions need to be "rebuilt" outside of the scope's compliance application


## CLK Compliance Test Pass/Fail Clock Compliance

**PASS CLK TIE** 






### **Combined Jitter**



### LF Jitter <1.5MHz HF Jitter > 1.5MHz



How to create this data:

- SSC removal
- Transfer to Frequency domain
- BP and HP Filters
- PLL Transfer function manipulation

HP Filter from 1.5MHz to 50MHz

BP Filter from 0.1 to 1.5MHz
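The band split can be sketched with a brick-wall filter in the frequency domain. This is a simplified illustration on synthetic data: SSC removal and the PLL transfer function weighting are left out, a naive DFT replaces the FFT, and the function name, record length and jitter amplitudes are assumptions. The low-frequency band edges follow the 10 kHz to 1.5 MHz range named in the result tables.

```python
import cmath
import math


def band_rms(tie, fs_hz, f_lo_hz, f_hi_hz):
    """RMS of the TIE content with f_lo < f <= f_hi (brick-wall DFT filter)."""
    n = len(tie)
    twiddle = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    power = 0.0
    for k in range(1, n // 2 + 1):
        f = k * fs_hz / n
        if not (f_lo_hz < f <= f_hi_hz):
            continue
        xk = sum(x * twiddle[(k * i) % n] for i, x in enumerate(tie))
        # Bins below Nyquist appear twice in the two-sided spectrum.
        weight = 1.0 if 2 * k == n else 2.0
        power += weight * abs(xk) ** 2 / n**2
    return math.sqrt(power)


FS = 100e6   # one TIE sample per 100 MHz RefClk cycle
N = 1000     # 10 us record, 100 kHz bin spacing (assumption for the example)
tie = [5e-12 * math.sin(2 * math.pi * 3e6 * i / FS)       # synthetic 3 MHz jitter
       + 1e-12 * math.sin(2 * math.pi * 0.5e6 * i / FS)   # synthetic 500 kHz jitter
       for i in range(N)]

lf_rms = band_rms(tie, FS, 10e3, 1.5e6)   # low-frequency band
hf_rms = band_rms(tie, FS, 1.5e6, 50e6)   # high-frequency band up to 50 MHz
print(f"LF RMS: {lf_rms * 1e12:.2f} ps, HF RMS: {hf_rms * 1e12:.2f} ps")
```

In this synthetic case the 3 MHz component alone contributes about 3.54 ps RMS to the high-frequency band, which would already exceed a 3.10 ps limit, while the low-frequency band stays well below 1 ps.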

## CLK Compliance Test Pass/Fail Clock Compliance



### **RX/TX PLL transfer functions for RMS jitter calculation for common clock RX architecture**

Reference Clock, High frequency > 1.5MHz RMS Jitter (Common Clk) (PCIE 2.0, 5.0 GT/s)

Reference: PCI Express Base Specification, Rev 3.0, Section 4.3.7.3.3, Table 4-

- Test Summary: FAIL
- Test Description: This test verifies that the reference clock TREFCLK-HF-RMS is less than the max
- Pass Limits: <= 3.10 ps
- Reference Clock RMS Jitter (TREFCLK-HF-RMS): 13.70 ps

#### Result Details

- Ref Clock Filter Response (See image)
- Ref Clock TIE Spectra (See image)
- Ref Clock TIE Waveforms (See image)
- Transfer Function: H1: 8MHz, 3.0dB peaking | H2: 16MHz, 3.0dB peaking
- Num UIs Processed: 96.242 kCycles
- RefClkJit (p-p) filtered: 124.22 ps
- RefClkJit (rms) filtered: 13.70 ps
- Connection Type: Chan 2,4 - Direct Connect

#### Trial 1

Trial 1: Ref Clock Filter Response



#### PCIE 2.0 Common Clk Jitter Filter Response, Mag
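A filter response of this kind can be approximated with a PLL difference function. The sketch below rests on several assumptions that are not stated in the slides: the difference is taken as H1(s)·e^(-sT) - H2(s), both PLLs use the second-order form shown earlier with a damping factor of 0.54 (which gives roughly 3 dB peaking for that form), T is set to the 12 ns maximum transport delay, and any additional CDR high-pass is ignored. The function names are hypothetical.

```python
import cmath
import math


def pll_h(f_hz, f3db_hz, xi):
    """Second-order PLL transfer function at s = j*2*pi*f for a given -3 dB bandwidth."""
    a = 1 + 2 * xi**2
    wn = 2 * math.pi * f3db_hz / math.sqrt(a + math.sqrt(a**2 + 1))
    s = 2j * math.pi * f_hz
    return (wn**2 + 2 * s * xi * wn) / (s**2 + 2 * s * xi * wn + wn**2)


def difference_gain(f_hz, t_delay_s=12e-9, xi=0.54):
    """|H1(s)*exp(-s*T) - H2(s)| with H1 = 8 MHz and H2 = 16 MHz bandwidth."""
    s = 2j * math.pi * f_hz
    h1 = pll_h(f_hz, 8e6, xi)    # H1: 8 MHz, as listed in the report
    h2 = pll_h(f_hz, 16e6, xi)   # H2: 16 MHz, as listed in the report
    return abs(h1 * cmath.exp(-s * t_delay_s) - h2)


for f in (10e3, 100e3, 1e6, 3e6, 10e6, 30e6):
    print(f"{f / 1e6:6.2f} MHz: {20 * math.log10(difference_gain(f)):7.1f} dB")
```

Low-frequency jitter is tracked by both PLLs and largely cancels, so the gain is very small there. Jitter in the MHz range is passed differently by the two PLLs and therefore dominates the filtered RMS result.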

## CLK Compliance Test Failure analysis



Based on the information from the clock compliance test, the conclusion is that the bad signal quality on the data is caused by a clock jitter problem.

#### To do a more detailed analysis, the compliance tests need to be "rebuilt" in the normal scope environment.

- As some post-processing features (e.g. the PLL difference function) are special features of the compliance application, this might be difficult.
- Functions that should be available on most scopes directly:
  - Clock recovery by second-order PLL
  - SSC removal (better to switch off SSC for analysis)
  - FFT for transfer of the time-domain signal to the frequency domain
  - Band-pass and high-pass filters
  - TIE measurement for clock jitter
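As an illustration of the TIE step, the following sketch derives the time interval error from a list of clock edge timestamps and looks for the dominant jitter frequency. It is not the scope's algorithm: the ideal clock is estimated with a least-squares line fit instead of a second-order PLL, SSC is assumed to be switched off, and the edge data are synthetic.

```python
import cmath
import math


def tie_from_edges(edges_s):
    """Return (TIE per edge, fitted period); the ideal clock is a least-squares line."""
    n = len(edges_s)
    idx_mean = (n - 1) / 2
    t_mean = sum(edges_s) / n
    slope = (sum((i - idx_mean) * (t - t_mean) for i, t in enumerate(edges_s))
             / sum((i - idx_mean) ** 2 for i in range(n)))
    tie = [t - (t_mean + slope * (i - idx_mean)) for i, t in enumerate(edges_s)]
    return tie, slope


def dominant_frequency(tie, period_s):
    """Frequency of the largest DFT bin of the TIE (naive DFT, DC excluded)."""
    n = len(tie)
    twiddle = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    best_k = max(range(1, n // 2 + 1),
                 key=lambda k: abs(sum(x * twiddle[(k * i) % n]
                                       for i, x in enumerate(tie))))
    return best_k / (n * period_s)


T_REF = 10e-9   # nominal 100 MHz RefClk period
edges = [i * T_REF + 5e-12 * math.sin(2 * math.pi * 3e6 * i * T_REF)
         for i in range(1000)]   # synthetic edges with 3 MHz, 5 ps peak jitter

tie, period = tie_from_edges(edges)
f_jit = dominant_frequency(tie, period)
print(f"period {period * 1e9:.4f} ns, dominant jitter {f_jit / 1e6:.2f} MHz")
```

On real captures the edge timestamps would come from the scope's threshold crossings; the line fit then removes the average period so that only the deviation from the ideal clock remains.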

## **CLK Compliance Test Failure analysis**



### TIE shows ~3MHz Jitter on Clock with standard Scope tool functionality



CLK Compliance Test Failure analysis



Jitter is too fast for the "number 1" jitter source: DC/DC switching power noise

- The 3 MHz noise might already come from the host clock generator, and it might not be possible to improve it at board level
- Possible solution (?): Turn the drawback of embedded systems into an advantage
  - Zero Delay clock buffers can clean up the input clock by optimizing the PLL bandwidth setting

Always configure the Zero Delay clock buffer according to your needs! Measure input clock compliance as well!




## Conclusion



Clock and Data Compliance Test on embedded Systems with PCIe interface is required and really important!

- Don't forget TX AND RX for Data as well!
- 100MHz is NOT low speed
  - On clocks Jitter is important!
  - Edge rates are important for signal quality
- The standard test setup with the CLB does not allow to do all measurements with a single setup
  - Single ended open circuit measurements for e.g. crossing point tests are difficult to implement

Especially for embedded applications, the transport delay delta for common clocked architectures needs to be considered



The convolution of Data and Clock signals in the "Dual Port Test setup" for Gen 2 TX compliance tests makes it difficult to distinguish between clock and data lane related issues.

- Most systems will require compliance to both, common and data clocked RX architecture
- Cascading PLLs can cause issues due to jitter amplification, but might be also used to fix clock jitter issues

For analysis of clock jitter issues it is required to understand the post-processing of the scope's compliance application.




[1] PCI-SIG, "PCI Express Jitter Modeling, Revision 1.0," www.pcisig.com, July 2014, p. 19.

[2] Agilent Technologies, "N5393C PCI Express® 3.0 (Gen3) Software for Infiniium Oscilloscopes," www.agilent.com/find/N5393B, 5989-1240EN, May 2013, p. 15.

[3] Agilent Technologies, "Jitter Analysis Techniques for High Data Rates," cp.literature.agilent.com/litweb/pdf/5988-8425EN.pdf, February 2003, p. 4.

[4] Ernie Buterbaugh, Cypress, "Cypress PerfectTiming II," Perfect Timing II: Design Guide for Clock Generation and Distribution, Chapter 3-3.