To Open The Sky
The Front Pages of Christopher P. Winter
Adding a Second CPU to the HP Kayak
Although I'm not a gamer, I've long wanted a computer capable of heavy-duty number-crunching, with graphics performance to match. I'm looking to get into jobs that require industrial-strength processing power, the classic example being CFD (computational fluid dynamics), with its insatiable thirst for floating-point operations.1 I could also make good use of a fast machine to edit video streams and large images (using, e.g., Photoshop and something like Adobe Premiere).
Machines that can do those jobs have been unaffordable to most individuals — until recently. Speedy modern Pentium 4 or AMD processors offer great performance, and a computer that pairs up two of these chips provides even more impressive gains for applications that can take advantage of multi-threading. However, while the price of this performance is quite reasonable in ordinary terms, my current budget precludes buying any of these latest models. Still, I wanted the benefits of dual processors for Adobe Photoshop at least. With some research and a little luck, I obtained a computer that will give me these benefits for less than the proverbial arm and leg.
In late 2004 I acquired a used HP Kayak workstation for $100. Its mini-tower case held all the essential components (CPU, power supply, hard and floppy disk drives, video card, RAM), and they all worked. However, substantial upgrading was needed if I was to get the maximum performance from the machine. This document will describe the steps I followed in successfully adding a second CPU and more memory to my Kayak. For those who may not be familiar with the history of computers, I provide links to some background material here.
History of the Kayak Personal Workstation
The Kayak series of workstations can fairly be described as "venerable". They were introduced by Hewlett-Packard (HP) in August 1997. The first model was the XA, and it used Intel's then new Pentium II (or P2) processor. The XA was followed quickly by the XA-s, the first dual-processor model. Then came the XU and XW series (each with many variants), and finally the XM600 and XU800, which ended the Kayak line — an evolution incorporating the progressively faster and more capable CPUs from Intel.2 The general design of all these computers is a tower or mini-tower case containing the following components:
In most cases, a RAID adapter is available as an option. Also, HP provided numerous software tools for these computers. Alas, most of this software is no longer available. However, manuals and other documentation can be found on HP's IT Resource Center (ITRC) Web site; the ITRC still hosts searchable discussion forums from which, with some persistence, useful information can be gleaned; and HP's on-line parts databases are a treasure trove.
Intel introduced the Pentium II (or P2) central processing unit in May 1997. This processor was mounted on a Slot 1 card because the state of the integrated-circuit design art at the time made it prohibitive to place the required cache memory on the same chip as the CPU. The P2 used 32 KB of writeback Level 1 (L1) cache and 512 KB of Level 2 (L2) cache. A standard feature of Intel CPUs is that they derive their clock from the motherboard's clock signal. Initially, the P2 ran at 266 MHz, multiplying the 66 MHz motherboard or front-side bus (FSB) clock by 4. The P2 advanced to 450 MHz, using a 100 MHz FSB, by February 1999, when the P3 came in.
At its inception, the P3's design closely resembled that of the P2. Clock speeds, bus widths, cache sizes and voltages were the same. The main difference was that the P2's MMX instructions were supplemented by the new SSE instruction set. This first P3 was dubbed Katmai; it contains 9.5 million transistors (versus the P2's 7.5 million) and benefits from a smaller feature size. Katmai P3s run at up to 600 MHz. Typically, they use a 100 MHz FSB, but a few are designed for 133 MHz. All require a voltage regulator module (VRM) producing 2.0 volts.
The next P3 version is called Coppermine. It has 28.1 million transistors and is available in steppings that can speed along at up to 1100 MHz. (A still faster variant called Tualatin pushes this to 1400 MHz; but it is not available in a Slot 1 package.) Coppermine's L2 cache is halved, to 256 KB. The fact that it runs at full CPU speed somewhat makes up for this. As for front-side bus rate, Coppermines are split about equally between those requiring a motherboard with a front-side bus running at 133 MHz versus the older 100 MHz FSB. The much larger transistor count means that the chip's voltage has to be lower to reduce power dissipation. For Coppermine, voltage is 1.65 to 1.75V, and a different VRM is needed.
There is also a Xeon version of both the P2 and P3. This uses 0.5, 1 or 2 MB of L2 cache and comes in a substantially larger Slot 2 package. Xeon is intended primarily for servers, but its better performance makes it desirable in a workstation. (Do not confuse this with the P4 Xeon — it is a whole different animal.)
It is not only Kayaks that accept two Slot 1 processors. HP's Netservers of similar vintage do, and so do some Compaq products (and not just because of Compaq's acquisition by HP.) Other possibilities are the IBM Netfinity and IntelliStation, and the Dell Precision and PowerEdge. Asus and other motherboard makers also include dual-Slot-1 MBs in their product lines.
The bottom line here is that the many processor variations produced by Intel (56 for the P2, 106 for the P3, according to Mueller) — and the numerous models of the Kayak — make it imperative to thoroughly research your options before buying and assembling components into a used Kayak workstation.3
My choices in this workstation project were dictated primarily by a rather small budget. Obviously, a 500 MHz processor is far behind the state of the art in 2004.
In fact, random chance played a big part. I happened across my Kayak offered for $100 with a 30-day warranty at a local electronics surplus outlet. It contained one processor, the video and SCSI/LAN cards, 128 MB of SDRAM and a 9.1 GB SCSI disk. After looking it over carefully at the store, outside and inside, I decided it was worth the $100 asking price and took it home. Mechanically it was in good shape, with no discernible damage. The case was unmarred. I bought a keyboard for $7, a mouse for $13, and a cable for an external SCSI disk for $20. A friend let me borrow a monitor. One problem occurred when I tried to boot up the system: the hard disk was not recognized. I removed the disk and saw that a jumper was mispositioned on the header. Fixing that got me a computer that would pass the BIOS screens and boot from a Windows 2000 CD. I installed that newly purchased O/S and found that all parts functioned perfectly. I had a functioning workstation.
The next step was getting on-line so I could download information from HP and other Web sites, as well as post questions on Usenet. (That was a bit of a hassle, but not germane to this topic.) I downloaded and printed the manuals for my Kayak (to be specific, a Kayak XU 7/500, model D8432T) and found some useful discussions on the HP Business and IT Resource Center forums. Most of this discussion dates from 2000-2001, which was about the time the Kayak line was replaced by newer models.
I knew that, when it comes to speeding up software applications, the first choice is to add more memory. The documentation I found showed that my Kayak was capable of using up to 1 GB of SDRAM if, and only if, I installed four sticks of 256 MB ECC registered memory. Alas, HP wanted around $300 for each stick of such memory, and the memory dealers were asking $80-90. Here's where patience paid off: within a couple of weeks, I found a dealer on eBay asking $27.50 per stick. I purchased four of those and, when they finally arrived, found that they too worked perfectly. However, no speedup was immediately apparent. In fact, some of the video benchmarks, as measured by Passmark's PerformanceTest, indicated slower performance. But the hit was not great, and not at all noticeable in operation. I expect the real benefit of this 1 GB will show when I run apps like Photoshop, or electronic design automation (EDA) software.
Another well-known way of speeding things up is to add a second hard disk and transfer your Windows swap disk to it. (What does MS call that now?) That's for the future.
Matching electrical characteristics
My challenge in adding a second CPU card was to find an affordable one of the proper type. All my research told me that there are four characteristics of concern if multiple processors are to operate together in a system.
Notice the less stringent wording of that last item. Some sources say stepping doesn't matter at all, as long as all the other characteristics are matched. I chose to be conservative and hold out for the same stepping. The fact that the P3 is outmoded plays to my advantage here; there are many pulled, working units on offer.
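A minimal sketch of such a compatibility check, in Python. The field names (core clock, FSB clock, L2 cache size, stepping) are an illustrative assumption based on how sellers label Slot 1 cards, and the stepping codes in the usage note are placeholders; check them against the markings on the actual cartridges. Stepping is reported separately as a "soft" mismatch, reflecting the looser wording discussed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuCard:
    """Hypothetical record for one Slot 1 processor card."""
    core_mhz: int   # rated core clock
    fsb_mhz: int    # front-side bus clock the card was designed for
    l2_kb: int      # L2 cache size
    stepping: str   # stepping code as printed on the cartridge


def compare_cards(a: CpuCard, b: CpuCard):
    """Return (hard, soft) lists of mismatch descriptions.

    hard: differences assumed to rule out dual operation.
    soft: a stepping difference, which some sources say is tolerable.
    """
    hard = []
    for name in ("core_mhz", "fsb_mhz", "l2_kb"):
        va, vb = getattr(a, name), getattr(b, name)
        if va != vb:
            hard.append(f"{name}: {va} vs {vb}")
    soft = []
    if a.stepping != b.stepping:
        soft.append(f"stepping: {a.stepping} vs {b.stepping}")
    return hard, soft
```

For example, compare_cards(CpuCard(500, 100, 512, "kC0"), CpuCard(500, 100, 512, "kB0")) returns no hard mismatches and one soft (stepping) mismatch. A conservative buyer would reject that pair; a less conservative one might try it.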
A question of power
Here is a table of Pentium CPU power ratings. This table is illuminating. It reveals that, purely from a power-consumption standpoint, it is better to run a 500MHz Coppermine than a 500MHz Katmai. The latter uses 20 more Watts. Also, an 1100MHz Coppermine uses less power (if only slightly less) than a 600MHz Katmai.
This table summarizes the differences between Katmai and Coppermine P3s.
The primary consideration in this area is the size of the heat sinks. This is a restriction imposed by the air-ducting inside the Kayak. My original CPU has a heat sink of nearly the same length and width as the Slot 1 card itself; it extends about 2.5 inches in the third dimension, upwards from the CPU chip. There is no fan mounted to it. Many of the Slot 1 cards offered on eBay have much larger heat sinks. If I have the choice, I will go with one that has an HP-style heat sink already mounted. That saves having to separate the heat sink, clean the CPU surface, and buy and apply new thermal-interface compound.
At first, in order to test dual-processor operation, I used a processor card I bought locally for $18. It has a smaller heat sink designed for a fan. I removed the fan, and saw no problem. Later I found an identical processor with an HP-style heat sink for $10. That is what I now use.
Note also that if you have an older Kayak (one of the XA-s models), the latches that hold the processor cards in place are slightly different from those on newer models. This can be a problem for upgrading from P2 to P3 processors. The crux of the matter is that the P3 Slot 1 cartridge is designated "SECC2", an improved design over the original SECC ("Single Edge Contact Cartridge") the P2 uses.
Finally, if you are contemplating one of the so-called "slotket" adapters to permit use of a fast Celeron CPU, be advised that most add too much vertical height to fit in the Kayak.
Since the processor clock is derived from the motherboard's FSB clock signal, you'll get optimum results if you choose a Slot 1 card designed for the same clock rate as your motherboard. (See table on power levels for options.) Just as with SDRAM sticks, the speeds should match. However, a stick of PC-133 memory will probably work in a PC-100 system — it will just work more slowly than it's designed to. So it is with processors. For example, say you put an 800/133/256/1.65 unit in a system with a 100MHz FSB clock. Such a processor operates at 6x FSB, so in your system it will run at 600MHz. I can't see any harm coming from that; in fact, it should run cooler than normal. But why buy performance you aren't going to use?
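The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration: the function names are made up, the label is read as core MHz / FSB MHz / L2 KB / volts, and the multiplier is assumed to be locked in half-step increments (rounding is needed because the nominal "133" is really 133.33 MHz).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotOneSpec:
    """Fields of a label such as '800/133/256/1.65' (assumed layout)."""
    core_mhz: int   # rated core clock
    fsb_mhz: int    # FSB clock the part was designed for
    l2_kb: int      # L2 cache size
    volts: float    # core voltage


def parse_spec(label: str) -> SlotOneSpec:
    """Split a 'core/FSB/cache/voltage' label into typed fields."""
    core, fsb, l2, volts = label.split("/")
    return SlotOneSpec(int(core), int(fsb), int(l2), float(volts))


def multiplier(spec: SlotOneSpec) -> float:
    """Clock multiplier, rounded to the nearest half step (assumption)."""
    return round(2 * spec.core_mhz / spec.fsb_mhz) / 2


def effective_mhz(spec: SlotOneSpec, board_fsb_mhz: float) -> float:
    """Core clock the part would run at on a board with the given FSB."""
    return multiplier(spec) * board_fsb_mhz


if __name__ == "__main__":
    cpu = parse_spec("800/133/256/1.65")
    print(multiplier(cpu), effective_mhz(cpu, 100))
```

Run on the 800/133/256/1.65 part with a 100 MHz board, this gives a multiplier of 6 and an effective clock of 600 MHz, matching the figures above.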
But there is one potential show-stopper in such a situation: The system BIOS. Depending on how the BIOS recognizes its CPUs, it may simply refuse to boot up with a mismatched processor. This is an area in which I found very little guidance. What I did learn, from two separate user accounts, is that my Kayak — the XU 7/500 — can be flashed to a BIOS for the XW series, and then it will operate with processors up to 1GHz.
So my next mission, should I choose to accept it, is to get a matched pair of suitable processors and see if they will work in the Kayak with its existing BIOS and, if not, to flash the BIOS to KW1-02. Because a failed flash is so hard to recover from, that last step is something I won't begin without a whole lot of study.
Voltage Regulator Modules
Lately I've been getting some inquiries about voltage regulator modules. What I know is basically that each type of VRM is matched to a specific processor family. For example, the Pentium-3 Katmai family takes the 0950-2837 VRM, while the lower-power P3 Coppermine takes the 0950-3363. I've put together a VRM Identifier that gives a little more detail.
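As a small illustration, the two VRM part numbers above, together with the core voltages quoted earlier for each family (2.0 V for Katmai, 1.65 to 1.75 V for Coppermine), can be kept in a lookup table. This Python sketch contains only those figures; the structure and function names are hypothetical, and any other processor family would have to be added from HP's own documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Family:
    vrm_part: str     # HP part number of the matching VRM
    min_volts: float  # lowest core voltage quoted for the family
    max_volts: float  # highest core voltage quoted for the family


# Only the two families and part numbers mentioned in this article.
FAMILIES = {
    "katmai": Family("0950-2837", 2.0, 2.0),
    "coppermine": Family("0950-3363", 1.65, 1.75),
}


def vrm_for(family: str) -> str:
    """Return the VRM part number for a P3 family name (case-insensitive)."""
    key = family.strip().lower()
    if key not in FAMILIES:
        raise ValueError(f"no VRM on record for family {family!r}")
    return FAMILIES[key].vrm_part


def voltage_ok(family: str, volts: float) -> bool:
    """Check a CPU's core voltage against the range quoted for its family."""
    fam = FAMILIES[family.strip().lower()]
    return fam.min_volts <= volts <= fam.max_volts
```

For instance, vrm_for("Coppermine") returns "0950-3363", and voltage_ok("katmai", 1.65) is False, which flags a Coppermine-voltage part being paired with a Katmai VRM.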
1 CFD is not the most demanding computational task, of course. Nuclear weapons design, climate prediction, and many problems in astrophysics or bioinformatics far out-rank (or "out-crank") it. But it is among the most demanding problems I can hope to work on using a home system.
2 The evolution of HP Personal Workstations continues, of course. The Kayak line gave way to three "Visualize" models using top-of-the-line Pentium-III chips and HP's Visualize fx graphics cards, specially developed for speedy 3-D modeling work. Subsequent workstations incorporate the Pentium 4.
3 A cautionary note: While searching on eBay, I've seen P3 steppings offered that Mueller does not list. It may be that his remarkable success with the "PC Upgrade" series — long regarded as the bible of the topic — has led him to get sloppy.
Hot Tip #8 -- Performance Exposed
STREAM: Sustainable Bandwidth in High-performance Computers (John D. McCalpin)
Benchmaster and other tools (Robert G. Brown, Duke Univ.)