Abstract -- We have developed a clock multiplying unit and 4-to-1 multiplexor (CMU/MUX 4:1) that operates at up to 47 Gb/s. The on-board VCO is LC tuned, and the output clock jitter is 4 ps peak-to-peak. The output eye diagram shows a 400 mV swing with rise and fall times of less than 9 ps. The integrated circuit was fabricated in a 120 GHz, 0.25 µm BiCMOS process. Power dissipation is only 1 W for a chip area of 2 mm^2.
The next generation of optical fiber communications will require transmission speeds higher than 40 Gb/s. The OC-768 Synchronous Optical Network (SONET) standard with Forward Error Correction (FEC) will run at 43.2 Gb/s. Silicon germanium (SiGe) technology, with a unity current gain frequency (ft) higher than 120 GHz, is well suited to this market, since it is integrated with CMOS on 8-inch wafers at very high yields.
We describe a 4-to-1 multiplexor (mux) with an integrated clock multiplying unit (CMU) phase locked loop (PLL), which achieves 43.2 Gb/s operation with a 0.4 V eye opening and 4 ps peak-to-peak jitter. Four inputs at 10.8 Gb/s (SONET OC-192 with FEC) are time multiplexed to the higher 43.2 Gb/s rate. A 2.7 GHz or 675 MHz reference is multiplied by the PLL to the half-rate frequency of 21.6 GHz. The output is clocked and multiplexed on both levels of a highly symmetric clock to eliminate duty-cycle-related jitter. The CMU requires a single external capacitor, or RC network, for filtering. The integrated circuit was also produced without the PLL for laboratory test use, where a high-quality 21.6 GHz synthesizer with very low jitter is available. The stand-alone 4:1 mux jitter is less than 3 ps peak-to-peak, limited by measurement accuracy. A companion 1:4 demultiplexor (demux), which also requires an external clock, was developed for bit-error-rate (BER) testing of the multiplexor.
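The rates quoted above are related by simple integer ratios. The following minimal sketch reproduces that arithmetic; the variable names are illustrative and do not come from the design itself.

```python
# Frequency plan implied by the text, kept in integer hertz to avoid rounding.
LINE_RATE_HZ = 43_200_000_000   # serial output rate, 43.2 Gb/s
NUM_LANES = 4                   # four parallel inputs

lane_rate_hz = LINE_RATE_HZ // NUM_LANES   # rate of each input lane
vco_hz = LINE_RATE_HZ // 2                 # half-rate clock: both clock levels select data
first_level_clock_hz = vco_hz // 2         # clock for the first 2:1 mux level

# Multiplication factor from each supported reference up to the VCO frequency.
REFERENCES_HZ = (2_700_000_000, 675_000_000)
pll_multiplication = {ref: vco_hz // ref for ref in REFERENCES_HZ}

if __name__ == "__main__":
    print(f"lane rate:         {lane_rate_hz / 1e9} Gb/s")
    print(f"VCO (half rate):   {vco_hz / 1e9} GHz")
    print(f"first-level clock: {first_level_clock_hz / 1e9} GHz")
    for ref, factor in pll_multiplication.items():
        print(f"reference {ref / 1e6} MHz -> multiply by {factor}")
```

The half-rate choice means the 21.6 GHz VCO never has to run at the full 43.2 GHz bit rate, at the cost of requiring a symmetric clock.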
II. Block diagram
All logic in our chip is fully differential. This has multiple benefits, including reduced voltage swing, phase margin tracking, noise cancellation, and circuit simplification. Figure 1 shows the top-level interconnection of the main blocks in our circuit. All inputs and outputs are CML compatible; the inputs are terminated in 50-ohm load resistors.
Fig. 1. CMU and MUX top schematic
The on-board PLL takes either a 675 MHz or a 2.7 GHz reference into a digital quadricorrelator and locks a divided-down 21.6 GHz voltage controlled oscillator (VCO) to it. Lock is achieved first in frequency and then in phase. This phase detector has the advantage of being naturally multirate and also exhibits very low jitter. Other CMU implementations may show jitter that depends on the loop divider ratio and the amount of idle time between phase detector updates. Our loop bandwidth is higher than 10 MHz, which suppresses noise from the free-running VCO and rejects crosstalk noise inside this bandwidth (see Figure 2).
Fig. 2. The CMU circuit provides a lock detection signal.
A cross-coupled current mode logic (CML) stage, or differential amplifier, forms the core of the negative resistance in the VCO (1). An LC tank is connected to the collectors of the amplifier, and tuning is achieved by voltage control of variable capacitors (varactors). Our tank inductor is center tapped, and a voltage lower than the VCC supply is applied at its center. This prevents forward biasing the emitter followers that buffer the differential outputs. Great care needs to be taken at 21.6 GHz, since RC-loaded gain stages do not have enough gain to pass this frequency easily. Inductive peaking can be used to increase the gain at these clock speeds.
Fig. 3. The VCO is differential with varactor tuning. The voltage at the center of the tank is shifted to prevent saturating the emitter followers.
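As a rough illustration of the tuning mechanism, the ideal LC resonance formula f0 = 1/(2π√(LC)) can be used to estimate the total tank capacitance needed at 21.6 GHz. The inductance below is a hypothetical value chosen only for the example; the paper does not state the actual tank component values, and the model ignores parasitics and loss.

```python
import math

def tank_frequency_hz(l_henry: float, c_farad: float) -> float:
    """Resonant frequency of an ideal (lossless) LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))

def tank_capacitance_f(l_henry: float, f_hz: float) -> float:
    """Total capacitance that resonates with l_henry at f_hz."""
    return 1.0 / ((2.0 * math.pi * f_hz) ** 2 * l_henry)

L_TANK_H = 0.3e-9     # hypothetical inductance, not taken from the paper
F_TARGET_HZ = 21.6e9  # VCO frequency stated in the text

c_needed_f = tank_capacitance_f(L_TANK_H, F_TARGET_HZ)

if __name__ == "__main__":
    print(f"required total capacitance: {c_needed_f * 1e15:.0f} fF")
    # A larger varactor capacitance lowers the oscillation frequency.
    for scale in (0.9, 1.0, 1.1):
        f = tank_frequency_hz(L_TANK_H, c_needed_f * scale)
        print(f"C x {scale:.1f} -> {f / 1e9:.2f} GHz")
```

Because frequency scales with 1/√C, a 10% change in total capacitance moves the oscillation frequency by roughly 5%.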
V. Phase-Frequency Detector
The digital quadricorrelator requires two quadrature phases from the VCO (2). These are provided by the last divider, which has in-phase and quadrature (I and Q) outputs between its two latches in a ring-divider connection. The reference clock samples the quadrature clocks, which run at 2.7 GHz, and digital logic then determines the relative frequency from the phase rotation of the resulting beat frequencies. Next, phase is locked in a bang-bang, or sign-only, fashion by a sampling flip-flop.
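The frequency-detection principle can be sketched behaviorally: each reference edge samples I and Q, the sampled pair identifies a phase quadrant, and the direction in which the quadrant rotates gives the sign of the frequency error. The model below is an idealized illustration under that assumption, with hypothetical function names; it is not the gate-level logic of the chip.

```python
# Quadrant index for each sampled (I, Q) pair, counted in the direction of
# increasing phase of the divided VCO.
QUADRANT = {(True, True): 0, (False, True): 1, (False, False): 2, (True, False): 3}

def sample_quadrant(n: int, f_div_hz: float, f_ref_hz: float) -> int:
    """Quadrant of the divided-VCO phase seen at the n-th reference edge."""
    phase_cycles = (n * f_div_hz / f_ref_hz) % 1.0
    i_bit = phase_cycles < 0.25 or phase_cycles >= 0.75  # cosine-like clock is high
    q_bit = phase_cycles < 0.5                           # sine-like clock is high
    return QUADRANT[(i_bit, q_bit)]

def frequency_error_sign(f_div_hz: float, f_ref_hz: float, num_samples: int = 1000) -> int:
    """+1 if the divided VCO is fast, -1 if slow, 0 if no net rotation is seen."""
    rotation = 0
    previous = sample_quadrant(0, f_div_hz, f_ref_hz)
    for n in range(1, num_samples):
        current = sample_quadrant(n, f_div_hz, f_ref_hz)
        step = (current - previous) % 4
        if step == 1:
            rotation += 1      # beat phase advanced by one quadrant
        elif step == 3:
            rotation -= 1      # beat phase retreated by one quadrant
        previous = current     # a step of 2 is ambiguous and is ignored
    return (rotation > 0) - (rotation < 0)

if __name__ == "__main__":
    print(frequency_error_sign(2.7e9 * 1.01, 2.7e9))   # VCO 1% fast
    print(frequency_error_sign(2.7e9 * 0.99, 2.7e9))   # VCO 1% slow
    print(frequency_error_sign(2.7e9 * 1.01, 675e6))   # subharmonic reference
```

The last call illustrates the multirate property mentioned earlier: a 675 MHz reference sampling the 2.7 GHz quadrature clocks observes the same beat rotation, only with fewer samples per unit time.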
Only one divider is required in the multiplexor, since the output is clocked from the 21.6 GHz VCO and 10.8 GHz is all that is needed in the first mux level. Care was taken to align the proper phase of the 10.8 GHz clock to the final 21.6 GHz retiming. The difficulty with high-speed multiplexor design is that the data and clocks flow in opposite directions. Clock centering needs to be done in SPICE simulations.
Each 2:1 mux is composed of three latches and a 2:1 selector. When the top latch is holding, i.e. when clock is high, its output is multiplexed out; meanwhile the master latch of the bottom pair is holding the other input channel. When the clock switches, to low in this example, the data is transferred to the third latch and its output selected.
Fig. 4. Basic 2:1 Multiplexor consists of three latches and a 2 to 1 selector. Data is selected when it is stable.
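The three-latch arrangement can be illustrated with a half-cycle behavioral model. The sketch assumes that the inputs change at the start of each clock-low phase, that the top and master latches are transparent while the clock is low, and that the third latch is transparent while the clock is high; these timing assumptions and the lane ordering in `mux4to1` are illustrative and are not taken from the schematic.

```python
def mux2to1(a_bits, b_bits):
    """Half-rate 2:1 mux: emits a[0], b[0], a[1], b[1], ..."""
    top = master = slave = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        # Clock low: top and master latches are transparent and track the
        # inputs; the third (slave) latch holds and its output is selected.
        top, master = a, b
        out.append(slave)
        # Clock high: top and master hold; the held top output is selected
        # while the slave latch copies the master for the next low phase.
        slave = master
        out.append(top)
    out.append(slave)   # final low phase flushes the last b bit
    return out[1:]      # drop the undefined start-up value of the slave

def mux4to1(d0, d1, d2, d3):
    """Two first-level muxes feeding a final mux, giving d0, d1, d2, d3 order."""
    return mux2to1(mux2to1(d0, d2), mux2to1(d1, d3))

if __name__ == "__main__":
    print(mux2to1([1, 0, 1], [0, 0, 1]))
    print(mux4to1([1, 0], [1, 1], [0, 0], [0, 1]))
```

In this model each output bit is taken from a latch that is holding, which corresponds to the caption's statement that data is selected when it is stable.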
We use a four-channel pseudo random bit sequence (PRBS) generator that works up to 12.5 Gb/s. The electrical delay between channels is approximately 1/4 of the PRBS length, which avoids the artifacts of simultaneously driving each mux input with the same pattern. To simplify testing, each mux input is driven with a 250 mVpp single-ended signal via a DC block. Figure 5 shows the PRBS generator being driven with a 10.8 GHz clock that is divided down to 2.7 GHz for the CMU reference input. The output of the mux is monitored using a high-speed sampling scope with a precision time base that enables low-jitter measurements and observation of waveform inter-symbol interference.
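The quarter-length channel offset can be illustrated with a short software model. The paper does not state the PRBS order, so the sketch assumes a PRBS7 pattern (recurrence out[i] = out[i-7] XOR out[i-6], period 127) purely for brevity; the function names are hypothetical.

```python
def prbs7_period(seed: int = 0x7F):
    """One full period (127 bits) of an assumed PRBS7 sequence."""
    state = seed & 0x7F
    bits = []
    for _ in range(127):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1   # taps at stages 7 and 6
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits

def four_channels(num_bits: int):
    """Four copies of the pattern, each delayed by about 1/4 of its length."""
    seq = prbs7_period()
    length = len(seq)
    offsets = [k * length // 4 for k in range(4)]     # 0, 31, 63, 95
    return [[seq[(off + n) % length] for n in range(num_bits)] for off in offsets]

if __name__ == "__main__":
    for k, lane in enumerate(four_channels(16)):
        print(f"channel {k}: {''.join(map(str, lane))}")
```

Because the four lanes are distinct cyclic shifts of one maximal-length sequence, no two mux inputs carry the same bits at the same time.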
In addition to measuring the quality of the generated eye, we developed a 1:4 demultiplexor that is used with a companion PRBS decoder/error detector. The demux converts the 43.2 Gb/s stream into four 10.8 Gb/s data streams, any one of which can be fed into the error detector. The error detector uses an open-loop design that avoids the need for a synchronization mechanism, requiring only a frequency counter to confirm error-free operation.
Fig. 5. Test Setup consists of a synthesizer, a PRBS source, an oscilloscope, a demux, a PRBS error detector and a frequency counter.
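One common way to build an error detector that needs no synchronization is a self-synchronizing checker, which predicts each received bit from earlier received bits and counts mismatches. The sketch below assumes that approach and a PRBS7 pattern; neither detail is confirmed by the paper, and the counter variable merely stands in for the frequency counter in the test setup.

```python
def prbs7_bits(num_bits: int, seed: int = 0x7F):
    """Assumed PRBS7 source: out[i] = out[i-7] XOR out[i-6]."""
    state = seed & 0x7F
    bits = []
    for _ in range(num_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits

def count_mismatches(received):
    """Open-loop check: predict each bit from the received stream itself."""
    mismatches = 0
    for i in range(7, len(received)):
        predicted = received[i - 7] ^ received[i - 6]
        mismatches += predicted ^ received[i]
    return mismatches

if __name__ == "__main__":
    clean = prbs7_bits(200)
    corrupted = list(clean)
    corrupted[50] ^= 1   # inject a single bit error
    print(count_mismatches(clean), count_mismatches(corrupted))
```

In such a checker a single bit error is counted once when it arrives and once for each feedback tap it later passes, so error-free operation corresponds to a count that stays at zero.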
Table 1 lists the main characteristics of the IC, which generates a clear 400 mV eye amplitude with 1/6 ps rms/peak-to-peak of jitter from a -3.3 V supply drawing 300 mA when operated at 43.2 Gb/s. Measurements on static (DC) input data patterns show 0.25/2.2 ps rms/peak-to-peak of jitter on the resulting output signal. Spectrum analyzer measurements at a 1 MHz offset on the divided-down VCO (2.7 GHz monitor) show the phase noise to be better than -105 dBc/Hz. Both measurements suggest that the random component of the jitter is very small. The majority of the jitter is attributed to inter-symbol interference (ISI), caused in part by the output stage and by digital noise entering the VCO. Undershoot and "tram lines" on the edges of the eye diagram are indications of the ISI. At high temperature (95 °C), the amplitude of the waveform decreased by about 10-15%, with little degradation of the jitter. The robust performance across temperature is due in part to the use of an LC tank; the variation in the oscillator's Kv was less than 10% over a 60 °C range.
Table 1: Basic characteristics of CMU-MUX
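The headline numbers can be cross-checked with a few lines of arithmetic. The sketch below assumes that "1/6" denotes 1 ps rms and 6 ps peak-to-peak, and expresses the jitter as a fraction of the unit interval (UI) at 43.2 Gb/s; the variable names are illustrative.

```python
SUPPLY_V = 3.3          # magnitude of the -3.3 V supply
SUPPLY_A = 0.300        # 300 mA supply current
LINE_RATE_BPS = 43.2e9  # operating bit rate

JITTER_RMS_PS = 1.0     # assumed reading of "1/6": 1 ps rms
JITTER_PP_PS = 6.0      # assumed reading of "1/6": 6 ps peak-to-peak

power_w = SUPPLY_V * SUPPLY_A          # DC power drawn from the supply
ui_ps = 1e12 / LINE_RATE_BPS           # one bit period in picoseconds
jitter_rms_ui = JITTER_RMS_PS / ui_ps
jitter_pp_ui = JITTER_PP_PS / ui_ps

if __name__ == "__main__":
    print(f"power:      {power_w:.2f} W")
    print(f"UI:         {ui_ps:.2f} ps")
    print(f"jitter rms: {jitter_rms_ui:.3f} UI")
    print(f"jitter p-p: {jitter_pp_ui:.3f} UI")
```

The computed power of about 1 W agrees with the figure quoted in the abstract.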
Fig. 6. Eye diagram of 46 Gb/s output taken with high bandwidth head and low jitter trigger module.
IX. Process description
The technology is a 0.25 µm BiCMOS process with only 25 lithographic steps, offering 4 levels of Al and a full menu of active and passive devices. The process uses an essentially industry-standard, qualified CMOS platform and offers two HBT devices with different speed/breakdown voltage trade-offs by adding only 5 masks to the underlying CMOS process. The fast bipolar transistors have ft/fmax of 120/140 GHz. Further key features are: 2.5 V VDD MOS transistors for digital applications, including an isolated NMOS device for improved signal isolation; high-Q MOS varactors; two polysilicon resistors; a 2 µm thick upper Al layer for high-Q inductor fabrication; and a 1 fF/µm² MIM capacitor.
The process requires no trench isolation to achieve very high-speed performance. This is beneficial not only for cost and yield; it also lowers the self-heating of the bipolar transistors. Measured yields are very high, and the ICs show very low sensitivity to electrostatic discharge (ESD) damage.
Fig. 7. Chip photograph of SiGe CMU-MUX
(1) P.D. Capofreddi et al., "A Clock and Data Recovery IC for Communications and Radar Applications," IEEE Int. Workshop on Design of Mixed-Signal Integrated Circuits and Applications, 1999, pp. 88-90.
(2) A. Pottbacker, U. Langmann and H. U. Schreiber, "A Si Bipolar Phase and Frequency Detector IC for Clock Extraction up to 8 Gb/s," IEEE J. Solid-State Circuits, Vol. 27, No. 12, December 1992, pp. 1747-1751.
(3) D. Knoll, B. Heinemann, K.E. Ehwald, H. Rücker, B. Tillack, W. Winkler, and P. Schley, "BiCMOS Integration of SiGe:C Heterojunction Bipolar Transistors," 2002 IEEE Bipolar/BiCMOS Circuits and Technology Meeting Proceedings, pp. 162-166.
(4) R. Malasani, C. Bourde, G. Gutierrez, "A SiGe 10-Gb/s multi-pattern bit error rate tester," 2003 IEEE Radio Frequency Integrated Circuits (RFIC) Symposium.