## VLSI TESTING AT MULTIGBPS RATES

Dragan Topisirović, Regional Centre for Talents, Niš

Testing at Gbps rates needs high transfer rates among channels and functional units, and requires readdressing of data format and communication within a serial mode. This implies that a physical phenomenon, jitter, is essential to tester operation. This establishes a design shift, which in turn dictates a shift in test and DFT methods. We consider here different approaches and discuss the tradeoffs in testing devices. Today's high-performance digital systems require VLSI testing at multiple gigabits per second (multiGbps), which also poses economic challenges. Particular solutions based on conventional ATE resources aim at the highest possible accuracy, and these systems can test ICs at 2.5 Gbps, with extensions planned that will allow testing at 5 Gbps. With this approach, the cost of materials for the driver and receiver modules is on the order of a hundred dollars per module, and even with a few thousand dollars more per channel the total does not exceed \$10,000 per channel for multigigahertz test capability.

## 1. INTRODUCTION

ATE manufacturers and suppliers today offer solutions with options for adding multigigahertz capability to their systems for testing. Current products are of exceptionally high performance and test VLSI circuits at rates of gigabits per second. We are witnessing significantly fast growth of demands for testing of VLSI circuits and systems [...] and fast testing times. High density [...] has gained significant popularity, although the complexity of [...] slows down development and increases cost [...] high performance and profit margins. Today's economy and the rising [...] costs for development of new products force the electronic industry to reexamine its approach to design and test. For a new product, the new technological environment promises [...] the fastest time to market while keeping costs under control. Testing of [...] devices is a very difficult problem, and modern industry recognizes that testing costs grow faster than other costs related to the [...]. Future designs targeting 5 Gbps and 10 Gbps will require even tighter control of timing and jitter.
At Gbps rates, it is necessary to overcome the gap between current test techniques, which rely extensively on ATE, and improvements in ICs and their high clock rates. This calls for radical changes in the organization of the test [...] and practical solutions for the support equipment. These changes have a profound impact on many aspects of existing test techniques. For example, allowing high transfer rates among channels and functional units, such as in the I/O definition of a SoC, requires readdressing the implications of data format and communication within a serial mode. In such a setting, physical phenomena such as jitter become very relevant to tester operation. It is the combination of all of these issues that makes multigigahertz testing a challenging problem in today's test technology.

## 1.1 Specifications of testing

We pay attention to two aspects of testing.

The first part, "Multiplexing ATE Channels for Production Testing at 2.5 Gbps", analyzes testing at multigigahertz rates using a different technique, namely multiplexing ATE channels for production testing. Several features of current-generation ATE (timing calibration, modularity, temperature effects on sampling logic, and the large number of high-speed channels) all show the need for multiplexing. There are two variants of testing that use a new multiplexer circuit to raise the speed up to 2.5 Gbps. The first variant uses differential-pair signals in an arrangement with embedded ATE circuitry to support accurate timing calibration, although jitter makes it prone to timing errors. The second variant reduces the negative influence of jitter on test operations. This type of design is expected to ensure high Gbps rates in future systems.

The second part, which includes "Testing Gbps Interfaces without a Gigahertz Tester", presents new approaches and frameworks that enable testing of multigigahertz digital devices with or without a modified ATE. It addresses a novel testing problem, the source-synchronous interface.
The proposed technique relies heavily on DFT (Design for Testability) and in particular uses a new methodology called AC I/O loopback. This technique is a significant improvement over a simple I/O loopback arrangement: it allows the measurement of multiple functional parameters, including AC timing specifications. The authors' own example applies AC I/O loopback and supporting DFT circuitry to the Intel Pentium 4 processor, showing that the technique can efficiently correlate different stress measurements at the physical layer within a self-test framework. A combination of timing stress and voltage stress generates diagrams with no need for a high-speed tester.

## 1.2 Automated test equipment, economics of test

The economics of test, and of test equipment in particular, has received significant attention from vendors and ATE manufacturers, ATE customers, and the research community at large. An ATE is shown in Figure 1. The increasing cost of ATE increases the price of the product. Features such as multisite organization, architecture modularization, and the increased presence of inexpensive testers such as those used with BIST (*Built-In Self-Test*) techniques are some of the significant developments of recent years. A combination of BIST and ATE is a possible way of reducing test application time.

Figure 1. Architecture of an electronic tester (ATE)

## 1.3 Equipment for testing

The general block scheme of *Automated Test Equipment* (ATE) is shown in Fig. 1 [1], [2]. The tester contains the following components:

- the computer system used for test programming,
- the electronic subsystem enabling synchronisation, waveform generation, timing, and formatting,
- the probe and companion electronics, and
- computer control of testing.

Today there are many producers of such testing devices. Depending on configuration, one high-performance tester may cost a couple of million dollars or more [3].
A high-quality prober and handler cost up to half a million dollars [3], [4]. If we also include the costs of working premises, electrical installation, and staff, it is easy to see why testing is an expensive business.

All ATE must provide the following:

*Conditioning and stimulus*: power supply and ground; output and input signals; adaptation of signals at the point of the reset pulse and the load.

*Measurements*: impedance at input pins; logic-level thresholds of digital input signals; voltage generation on input pins; rise and fall times of signal edges; propagation delay; operating speed; output signals.

*Extraction*: adequate DC features; adequate AC features; properly functioning logic; exact operating speed; correct signal characteristics.

The corresponding electronics, the board that interfaces with the DUT (*Device Interface Board*, DIB), is the electrical interface between the ATE and the DUT. DIBs come in various shapes and sizes, but their common function is to provide a reliable and easily separable electrical interface between the DUT and the tester's electrical instruments.

## 2. AUTOMATED TEST SYSTEM CONFIGURATION

A system for testing multi-gigahertz digital devices uses conventional automated test equipment (ATE), supplemented with multiplexing and sampling logic. The approach is similar to earlier work [1] that demonstrated feasibility. However, the present system solves many of the practical problems that limited application in production environments. Specifically, embedded logic is used for fast, reliable auto-calibration of signals to achieve improved accuracy (typically ±25 [...]). Output-level buffers are included in the multiplexing modules to provide a range of input levels to the device under test, and relays selectively switch between high-speed and DC testing.
Air and liquid cooling is used to control the electronics temperature, and thereby stabilize timing. The production version of the system is scalable up to 14 differential pairs, each operating at 2.5 Gbps. Overall timing accuracy (OTA) is about ±100 ps, and is typically much better. Timing errors are found to be dominated by the ATE [...] uncertainty, which is nevertheless improved through the embedded calibration logic [patent pending]. [...] includes peak-to-peak jitter (at a bit error rate of [...]). The system is demonstrated by applying it to an S2018 cross-point switch that supports data rates in the same Gbps range. Additional electronic modules under development will further extend the maximum data rate (initially to [...], then to 5 Gbps and above), while tightening the OTA [...].

## 2.1 Automated test system configuration

Figure 2 depicts a top-level view of the test system. This approach uses multiplexing driver modules mounted on the application load board to produce stimulus signals, and sampling receiver modules to capture the high-speed output response. This modular approach makes it possible to develop the driver and receiver electronics and characterize them before assembling them on the load board.

Figure 2. Multiplexing test configuration, including multiplexing and sampling modules mounted on the load board

High-pin-count, high-bandwidth, 50-ohm connectors between the modules and the load board make the modules user-replaceable and reusable. The same connectors provide a convenient electrical calibration interface. Time-domain reflectometry techniques are used to calibrate transmission-line time delays between the ATE electronics and [...]. Conventional signals from the ATE contact the load board on one side, via pogo pins, and go through the [...] to the DUT test socket. This normal routing is used for low-speed signals (below the ATE frequency limits); multigigahertz signals connect between the test socket and the driver and receiver modules.

## 2.2 Timing subsystem

One of the most important aspects of the tester is synchronization.
The duration of a signal edge is a hundred picoseconds (or less), and a deviation in this domain will probably be treated [...]. The term timing is used here not only for synchronization but also to express control of logic conditions [...]; extensions always work with the fastest clock.

A shared-resource architecture includes a master [...], a number of timing generators (generally fewer [...]) to distribute to waveform formatters, and pin-electronics comparators. [...] Thus each pin is supplied with a programmable timing generator, waveform formatter, and DC [...]; since there is no longer a need to switch signals, [...] accuracy is possible. Also, software is simpler to develop.

[...] testing has surfaced as a viable way of testing VLSI effectively and will have to be taken into account by [...]. Serial scan and level-sensitive scan design (LSSD) structure the logic so that its response is independent both of the order in which inputs change and of circuit [...] logic elements.

Multiplexed signals [...] to avoid interference. [...] signals are formed under computer control; for every input pin, pulse trains are formed [...], which in fact means determining the possible operation of every pin. This subsystem may also provide specific functions that enable effective operation [...] lower complexity of the electronic circuits [...] be used later on. To bring a signal to two pins, multiplexing signal electronics could be used; Fig. 3 shows an example of this.

[...] techniques for increasing the accuracy and [...] of logic diagnosis, as well as the ability of the tool to analyze complex and multiple defects, [...] using transition fault simulation for fault-list pruning. The first technique correlates the behavior of [...] pre[...]:

- stuck-at (S),
- transition (T),
- bridging (B), and
- net (N).

To classify a fault candidate as a stuck-at or transition fault, the original stuck-at or transition fault should explain some failing patterns and pass all passing patterns. Transition faults require a certain transition on the fault site for all failing patterns.
Classifying a fault as a bridging fault requires that the representative stuck-at fault explain a subset of the failing patterns and be a [...]. Classifying a fault candidate as a net fault requires that the final diagnosis report include at least one additional stuck-at fault candidate on a different fan-out branch of the same stem.

The second technique is based on the iterative nature of diagnosis and focuses on increasing accuracy for multiple defects. The diagnostic algorithm is a multiphase procedure that derives the high-confidence defects during the first pass. After this pass, the diagnostic algorithm updates the failing measures for all unexplained failing patterns based on the already-extracted defects. All passes after the first one use less restrictive constraints for fault-list pruning. The goal is to extract additional information from the unexplained failing patterns, which might explain some multiple defects or complex defects that don't behave as stuck-at faults. An analysis based on cones of logic within the circuit and backward path tracing is used to distinguish unrelated failing measures.

## 3. AC I/O LOOPBACK TEST

## 3.1 Implementation of source-synchronous (SS) interfaces

Examples of I/O performance changes include Intel's changing its processors' front-side bus from common-clock to source-synchronous (SS) signaling and increasing the bus transfer rate from less than 100 MHz to 800 megatransfers/second (1 MT/s = 1 Mbit/s/pin). On the chipset side, Intel has upgraded its universal serial bus from 12 Mbps to 480 Mbps and has transitioned to the Serial Advanced Technology Attachment (SATA) standard at a 1.5-Gbps data rate. We also show how the testing problem of the SS interface has been solved and how this self-test scheme is extendable to other high-speed I/O circuits, including high-speed serial (HSS) signaling.
Intel designed the Pentium 4 so that it has two strobes associated with each data signal. Strobe 1 captures even data bits; strobe 2 captures odd ones. The specific elements of the Pentium 4 AC I/O loopback implementation are:

- per-pad, two-bit pattern generation, programmable through a TAP (*Test Access Port*) controlled scan chain;
- a timing stress mechanism that can shift the position of the strobe generation, consisting of a delay chain programmable through a TAP-controlled test configuration chain;
- a comparator that compares expected values with results, stored using a sticky-bit mechanism accessible through boundary scan, as the pass/fail detection; and
- the ability to exercise this circuitry over thousands of cycles.

Because a unidirectional stress mechanism was implemented on the Pentium 4 (it can only delay the strobe generation with respect to its nominal position), the following two points were measured for each SS signal group:

1. First fail (FF). These are the first signals within a signal group that fail at least one cycle.
2. All fail (AF). All signals within a signal group fail for all cycles.

In an SS interface, the receiving agent captures data based on a strobe or clock provided by the driving agent along with the data. The front-side bus of Intel's Pentium 4 is an example of an SS interface. The critical timing parameters in an SS interface are the skews between the output or input signal and an associated strobe. In the data bus, the characteristic parameters are: $T_{vbd}$, data output valid before strobe; $T_{vad}$, data output valid after strobe; $T_{suss}$, input setup time to strobe; and $T_{hss}$, input hold time after strobe. One advantage of this signaling architecture is that common-mode jitter (variations that occur simultaneously in both the signal and the strobe) doesn't impact the interface's performance; only differential jitter (variations that affect the data or strobe differently in a given cycle) does.
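The first-fail and all-fail points defined above can be located by sweeping the strobe delay and inspecting the pass/fail results at each step. The following is a minimal sketch under stated assumptions, not the Pentium 4 implementation: the data layout (`fail_counts[step][signal]`, holding the number of failing cycles per signal) and the function name `find_ff_af` are hypothetical, and the sketch reports the first swept step at which each condition holds.

```python
from typing import List, Optional, Tuple


def find_ff_af(fail_counts: List[List[int]],
               cycles: int) -> Tuple[Optional[int], Optional[int]]:
    """Return (first_fail_step, all_fail_step) for one signal group.

    fail_counts[step][signal] is a hypothetical record of how many of
    `cycles` loopback cycles failed for that signal at that delay step.
    Steps are ordered by increasing strobe delay from nominal.
    """
    first_fail = None
    all_fail = None
    for step, counts in enumerate(fail_counts):
        # FF: at least one signal fails at least one cycle.
        if first_fail is None and any(c > 0 for c in counts):
            first_fail = step
        # AF: every signal in the group fails every cycle.
        if all_fail is None and all(c == cycles for c in counts):
            all_fail = step
    return first_fail, all_fail


# Made-up sweep: 3 signals, 1000 cycles per step, 6 delay steps.
sweep = [
    [0, 0, 0],
    [0, 0, 0],
    [0, 3, 0],
    [120, 1000, 40],
    [1000, 1000, 990],
    [1000, 1000, 1000],
]
ff, af = find_ff_af(sweep, cycles=1000)
print(ff, af)  # 2 5
```

Multiplying the step indices by the delay-chain resolution (a device-specific value not given here) would convert them into strobe delays.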
The advent of serial communication links in chip-to-chip and system-to-system applications has resulted in intense focus on jitter and BER testing techniques, including jitter generation and measurement methodologies. Long-term jitter measures the maximum change in a clock's output transition from its ideal position over a large number of cycles. The actual number of cycles depends on the application and the clock frequency. For PC motherboards and graphics applications, this is usually 10-20 microseconds. For other applications, this number will be different.

Jitter is generally divided into three components: random jitter (RJ), data-dependent jitter (DDJ), and periodic jitter (PJ) [8]. Each of these components is correlated with physical sources and impacts bit error rate (BER) differently.

The continued market demand for GHz processors and high-capacity communication systems has resulted in an increasing number of low-cost, high-volume ICs clocked at GHz rates and beyond and/or equipped with multi-Gb/s serial interfaces, e.g., PCI Express, InfiniBand, HyperTransport, and Serial ATA.

## 3.2 AC I/O loopback test

Plans to reduce tester capital spending motivated a move to lower-capability structural testers. This led to an I/O test methodology that requires only an accurate clock source and does not require probing individual signals [6]. Because the method relies on a loop in the I/O buffer and because it is used to guarantee the AC timing parameters, it is called AC I/O loopback. AC I/O loopback is a significant enhancement over a simple I/O loopback scheme targeted primarily at screening stuck-at (hard) failures.

Figure 4 is a simplified representation of an oscilloscope measurement showing an eye diagram for two consecutive data bits on three separate signals, synchronized in absolute time. The multiple waveforms forming the valid data eye represent variations across multiple cycles in the relative position of the data with respect to the strobe.
These variations are due to various sources of differential jitter, such as noise on the local $V_{\rm DD}$ grid, pattern dependencies, and even defects. The dotted vertical lines represent the strobe positions (nominal, shifted to FF, and shifted to AF).

First fail is the minimum delay of the strobe (from its nominal position) that causes the input latch to capture incorrect data for at least one signal of the signal group. This corresponds to the worst-case $T_{vad}$ and $T_{hss}$ of the signal group. For a centered-strobe interface like that in the Pentium 4, the expected delay $D1$ is calculated as

$$D1 = (0.5)T - (T_{vad} - T_{hss}),$$

where $T$ is the period. All fail is the maximum delay of the strobe, from its nominal position, that causes all cycles in all data signals of the signal group to fail. This corresponds to the worst-case $T_{vbd}$ and $T_{suss}$ of the next cycle:

$$D2 = T - (T_{vbd} - T_{suss}).$$

Figure 4. AC I/O loopback measurement

Because the product specifications for $T_{suss}$ and $T_{hss}$ account for some amount of shift in the position of the ($T_{suss}$ and $T_{hss}$) window corresponding to variations in individual devices' process skew, the accuracy was further improved by using the delta between FF and AF for individual devices. This delta corresponds to the width of the ($T_{suss}$ and $T_{hss}$) window across the specific group being tested. With these two formulas we come close to relating the AC I/O loopback measurement to the actual specification parameters. Some aliasing is possible because a [...] could compensate for a slower $T_{vbd}$. However, because [...] $T_{suss}$ and $T_{hss}$ can vary only within a small range, the possibility of $T_{suss}$ and $T_{hss}$ covering up delays with $T_{vbd}$ (for any particular pin) is unlikely.

Test engineers have extended the AC I/O loopback methodology to other areas, such as DC tests and advanced signaling technologies. It is possible to conduct DC tests using the same loopback configuration [...]. Simultaneous bidirectional (SBD) I/O interfaces can transmit and receive signals on the same signal line.
Thus, both transmitters, at either end of the interface pair, are driving at the same time. Even though normal I/O pins are supposedly I/O, in reality they drive or receive, not both at the same time.

Figure 5. Simultaneous bidirectional (SBD) waveform diagram (from oscilloscope) at 2.5 Gbps per [...]

Figure 5 shows [...]. To extract the polarity of the [...] from the threshold so that we can sample the level. By controlling the receiver's threshold and changing the delay elements in the AC I/O loopback mechanism, we extract the true dual-loop data eye.

## 4. DIAGNOSIS RESULTS

For each defect, the typical diagnosis report contains a list of fault candidates (pins), the corresponding [...], and the associated behavior explaining a set of [...]. [...] responsible for the failures has two main [...]:

- identification of the existing sources of design marginality;
- [...] the critical process steps for the design.

The key is to unscramble the essential [...] loss from the repetitive failure mechanisms caused by layout marginalities that are not easily [...].

[...] the power supply subsystem is measuring the [...] a decisive role. We also use $I_{DDQ}$ to help classify defects; $I_{DDQ}$ is often used for this purpose too. This test flow uses ATPG vectors to perform measurements on qualified strobe points. $I_{DDQ}$ testing is a DFT method intended to [...] (catastrophic, more exactly [...]) defects in digital circuits. It observes the behaviour of defects [...] of the power supply and grounding [8]. Any change of the $I_{DDQ}$ value from the [...] indicates a defect. It is a very sensitive technique, able to detect [...] at an early stage, even before they really harm [...]; as such it also offers a window to the future of the device.
It is also a proper alternative to replace expensive or more time-consuming tests needed to guarantee the quality and reliability of [...]. In combination with emission spectroscopy analysis, $I_{DDQ}$ is also a very powerful technique [...] material properties [...].

Figure 6. Leakage of [...]

The $I_{DDQ}$ test technique can be applied at wafer level, at [...] level, during incoming inspection, during life tests, or even during on-line testing. Making use of an $I_{DDQ}$ test approach supported by proper measurement instrumentation offers the following advantages: increased product quality, replacement (or reduction) of burn-in tests, elimination of early lifetime failures, increased product reliability, reduction of the overall test cost, and increased engineering and failure analysis productivity.

| $I_{DDQ}$ coverage | # of $I_{DDQ}$ tests | Test time (PMU) | Test time (Q-Star) |
|--------------------|----------------------|-----------------|--------------------|
| 50%                | 1                    | 100 ms          | 100 µs             |
| 80%                | 10-20                | 1 s             | 1-2 ms             |
| 98%                | ±500                 | 50 s            | 50 ms              |

Figure 7. Compared to the Q-Star Test solution, requiring only 100 µs per (off-chip) measurement, a standard PMU is slow

As an example, manufacturer A performs 10 $I_{DDQ}$ measurements as part of a test program, using the PMU available on the test machine. This solution typically requires about 100 ms per measurement. Compared to the Q-Star Test solution, requiring only 100 µs per (off-chip) measurement, a standard PMU solution is slow [11]. The Q-Star monitor completes the same 10 measurements in 1 ms, which is 100 times faster than a single PMU measurement and 1000 times faster than the 10 PMU measurements of the described situation. A Q-Star Test monitor makes it possible to apply a complete $I_{DDQ}$ vector set of about 500 vectors in 50 ms.
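The arithmetic behind these comparisons can be checked with a short sketch. The per-measurement times (100 ms for a PMU, 100 µs for the Q-Star monitor) are the figures quoted above; the function name `iddq_test_time` is illustrative, and the sketch assumes measurements run back to back with no other overhead.

```python
PMU_TIME_S = 100e-3    # per-measurement time quoted for a standard PMU
QSTAR_TIME_S = 100e-6  # per-measurement time quoted for the Q-Star monitor


def iddq_test_time(measurements: int, per_measurement_s: float) -> float:
    """Total I_DDQ test time in seconds.

    Assumes back-to-back measurements with no setup or settling
    overhead, which is a simplification.
    """
    return measurements * per_measurement_s


for n in (1, 10, 500):
    pmu = iddq_test_time(n, PMU_TIME_S)
    qstar = iddq_test_time(n, QSTAR_TIME_S)
    print(f"{n:4d} measurements: PMU {pmu:g} s, "
          f"Q-Star {qstar * 1e3:g} ms, ratio {pmu / qstar:.0f}x")
```

Under these assumptions, 10 measurements take 1 s on the PMU and 1 ms on the monitor, and 500 vectors take 50 s and 50 ms respectively, which matches the rows of the table.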
Taking into consideration the overlapping test coverage of a functional/scan test and a full $I_{DDQ}$ test, and the fact that an $I_{DDQ}$ test is also a good screen for quality and reliability problems, approximately 90% of the functional/scan vectors can be replaced by running a full $I_{DDQ}$ test set with a Q-Star Test monitor (Figure 7). That brings the overall test time to 50 ms, plus the time needed to run the remaining 10% of the functional/scan vectors [12].

## 4.1 Understanding test-mode functional marginalities

Yield improvement requires understanding failures and identifying potential sources of yield loss. We discuss yield losses determined by marginalities in the functionality of the chip under test. These factors often influence yield in various ways, and yield variation is typically associated with process variation: parameters outside an acceptable range affect yield. Catching these types of marginalities requires the ability to test chips under different conditions, exploring the operating margins. To create different conditions, we changed the operating conditions (supply voltage, timing, and temperature) of the DUT. We then examined test data coming from *corner lots*, batches of chips manufactured with process parameters intentionally varied from what is typical. The key to this analysis is to understand systematic marginalities that might unpredictably affect the yield.

Well-known techniques such as shmoo plots can be used to assess the behavior of a chip with respect to a given test pattern set when test conditions such as power supply voltage, temperature, and timing are varied. Usually, shmoo plots are represented using 2D or 3D charts. Each test result is reported with green and red boxes to identify passes and failures of the given pattern set [13].
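A shmoo plot of the kind described above can be sketched in a few lines. The example is illustrative only: `run_pattern_set` is a hypothetical stand-in for applying the pattern set on a tester, and its pass/fail model (the device passes when the clock period is long enough for the given supply voltage) is invented for the sketch. Passes and failures are printed as `P` and `.` instead of green and red boxes.

```python
from typing import Callable, List, Sequence


def run_pattern_set(vdd: float, period_ns: float) -> bool:
    """Hypothetical device model: the minimum passing period shrinks
    as the supply voltage rises. A real flow would call the tester."""
    min_period_ns = 2.2 / vdd
    return period_ns >= min_period_ns


def shmoo(vdds: Sequence[float], periods_ns: Sequence[float],
          test: Callable[[float, float], bool]) -> List[List[bool]]:
    """Run the test at every (vdd, period) point; rows follow vdds."""
    return [[test(v, p) for p in periods_ns] for v in vdds]


def render(grid: List[List[bool]], vdds: Sequence[float]) -> str:
    """Text rendering: 'P' for pass, '.' for fail, one row per voltage."""
    return "\n".join(
        f"{v:4.2f} V  " + "".join("P" if ok else "." for ok in row)
        for v, row in zip(vdds, grid)
    )


vdds = [1.4, 1.2, 1.0, 0.8]                 # supply voltages, highest first
periods_ns = [1.0, 1.5, 2.0, 2.5, 3.0]      # clock periods, shortest first
grid = shmoo(vdds, periods_ns, run_pattern_set)
print(render(grid, vdds))
```

In this toy model the boundary between the passing and failing regions moves to longer periods as the voltage drops; a real shmoo plot would reveal the same kind of boundary, along with any irregularities that point to marginalities.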
This methodology uses DFT (*Design For Testability*), in which we vary parameters determining test conditions according to the DFT solutions in place. The goal is to check the diagnostic tool's ability to locate the basic defect types and to minimize the number of initial fault candidates (potential locations) for consideration during diagnosis. The advantages of simulation over silicon-based experiments are numerous. Simulation's speed and lower cost make it possible to conduct many experiments to tune the algorithms. The study used 10 full-scan industrial circuits and ran 1,000 experiments for each defect type. The diagnosis algorithm's accuracy for simple defect types (single and multiple stuck-at faults and single transition faults) was about 98%. For more complex defect types such as bridge faults, the accuracy was about 90%. Thus, the algorithm initially satisfied the necessary conditions of having high accuracy for real physical defects when a good correlation existed between the selected fault model and the behavior of real physical defects.

## 5. CONCLUSION

Research results are presented related to the problem of testing and diagnosis of digital electronic circuits operating at very high frequencies. It is the combination of all of these issues that makes multigigahertz testing a challenging problem in today's test technology. The impact on testing technology was then considered, including ATE performance. Accordingly, new design architectures were discussed, including an I/O test methodology that requires only an accurate clock source and enables design for testability at GHz rates. Finally, specific problems related to diagnosis of digital circuits were discussed and experience presented.

## 6. REFERENCES

- [1] V. B. Litovski, "CAD of electronic circuits", DIGP "Nova Jugoslavija", Vranje, 2000 (in Serbian).
- [2] M. Baker, "Demystifying Mixed-Signal Test Methods", An imprint of Elsevier Science, USA, 2003.
- [3] M. Burns and G. W. Roberts, "An Introduction to Mixed-Signal IC Test and Measurement", Oxford University Press, New York, 2001.
- [4] B. Davis, "The Economics of Automatic Testing", McGraw-Hill Book Company, London, 1982.
- [5] D. C. Keezer, D. Minier, M. C. Caron, [...]
- [6] T. M. Mak, M. Tripp, and A. Meixner, "Testing Gbps Interfaces without a Gigahertz Tester", IEEE Design and Test of Computers, 2004, pp. 278-2[...]
- [7] C. Hora et al., "An Effective Diagnosis [...] Support Yield Improvement", Proc. Int'l Test [...], IEEE Press, 2002, pp. 260-269.
- [8] E. Isern, J. Figueras, "$I_{DDQ}$ Test and Diagnosis [...] Circuits", IEEE Design and Test of Computers, [...] No. 4, Winter 1995, pp. 60-67.
- [9] www.QStar.be, last update 02/28/2006.
- [10] http://www.gstar.be/html/your\_benefits.html
- [11] Test & Measurement World, 05/10/2006.
- [12] http://www.qstar.be/html/your\_benefits.html
- [13] K. Baker and J. van Beers, "Shmoo Plotting: The Black Art of IC Testing", IEEE Design & Test, vol. 14, July-Sept. 1997, pp. 90-97.

## TESTING VLSI CIRCUITS AT MULTI-GBPS RATES

Dragan Topisirović

Abstract - Testing at rates of Gb per second [...] high transfer rates between channels and functional [...] and requires readdressing of data and communication in a serial mode of operation. This involves the phenomenon of jitter, which becomes an essential element of testing and establishes a functional and design [...] that represents a turn in testing and DFT methods. [...] considered different approaches to testing. Manufacturers of digital systems today require [...] VLSI at rates of multiple gigabits per second. For today's industry, the requirements for testing [...] at the multigigahertz level are also a challenge from an economic standpoint. Particular solutions, based on conventional test equipment, aim at achieving the highest possible accuracy, allowing [...] integrated circuits at 2.5 Gbps with 144 channels, with a planned increase of the test rate to 5 Gbps. With this approach, the cost of materials and the corresponding drivers [...] amounts to approximately one hundred dollars per module, and the cost per channel is a few thousand dollars more, so that the final cost of the test system does not exceed 10,000 dollars for multigigahertz test capability.