David Krauss, Senior Hardware Engineer, Yavapai Systems
Heat in our machines is one of my biggest worries and a major cause of that elevated forehead you see in the picture above. I make its reduction the number one issue around here. Because I feel that heat is such an important factor in reliability, I am writing this to show what we do to control it and why we fear it so, in the hope that you will be convinced and follow our example. If it inspires you to buy one of our systems... so much the better. I am the perfect guy for this job. I was born and grew up in the Sonoran Desert in southern Arizona, where we spend half the year figuring out how to keep things cool (mainly ourselves).
Most people figure that if the little rinky-dink 40 cfm fan in the power supply is still turning and there are no dust bunnies clogging up the vents, they have done due diligence. These same folks will often ignore that big wad of ribbon cable obstructing the air channels. To illustrate just what can be done, the server which you're connected to right now (which is sitting in an attic in San Francisco) is exhausting air at 75 degrees F. The ambient room temperature is 73.9 F, a rise of 1.1 degrees! The Athlon Super Server on the bench is showing a 6.3 degree rise over ambient. The workstation I am using (a Pentium II Klamath, not one of ours), for comparison, shows an internal case temperature of over 100 F, 28 degrees over ambient!
Why worry? Nothing degrades reliability like heat. Everybody has seen a computer operating so hot it blisters the paint on the case but it still runs. No arguments... but for how long? Since my job is to build machines that must operate reliably 24 hours a day, seven days a week, and only shut down when the motherboard is obsolete and must be replaced, I can't risk stuff like that. Heat breaks down insulating materials and degrades junction performance in ICs. It breaks down lubricants and bearings in disk drives and cooling fans (ironic, that). It causes the values of passive components (resistors and capacitors) to change over time. Everybody knows that overclocking shortens CPU life. Why? Because it raises junction temperature (makes them run hotter, Duh!).
OK, so what can I do about it? Well, for starters:
1.) Look inside the case. If you see a rat's nest of wires, you've found a good place to start. Wires are fine for conducting electrons but air doesn't go through them worth a hoot! Get yourself a big handful of wire ties and some adhesive hold downs and neaten that mess up. Get everything out of the airflow and away from obvious heat sources like memory, CPUs, complex logic (ICs) and vents in the case and power supply. Don't forget that big new graphics card you bought. The powerful chip that makes your games fly like the wind can generate more heat than your CPU in some cases. Leave the slot next to it empty if you can. Give it some room to breathe.
2.) Feel around. Let the system run for an hour, then stick your fingers into every nook and cranny. Don't worry, as long as you stay away from the inside of the power supply (and the power switch on older AT style units) there's nothing in there that will hurt you. Put your finger on every chip. If you find one that feels hot, glue a heat sink onto it. The same goes for drives. It may seem silly, but we often have to use Seagate Barracuda drives. When we do, we firmly attach a large heat sink to the top plate of the drive. We then drill holes in the bottom of the case and mount the drive in the airflow of the auxiliary fan. For our other machines we select drives for (among other things) the amount of heat that they generate. For this we have found that most Maxtor, Fujitsu and IBM drives do very well. Most high speed (7,200 or 10,000 RPM) drives cause problems. While we all hold the Seagate Barracuda family of SCSI UW drives in high regard for their performance, we usually find that we can get equivalent data rates from a pair (or more) of lesser IDE drives and a RAID interleaving controller like those from Promise Technology and others. The net thermal efficiency is much better. If you find a board (like the Riva TNT2 video cards) that is offensively hot, try to shuffle other boards around to leave a vacant slot next to its component side. Remember, air can't cool what it can't get to. If you can, glue an inexpensive CPU cooler to the main offending chip. Some manufacturers (Creative Labs, for instance) do this for you. Unfortunately, to meet the spacing requirements for the board, they often skimp on the assembly. A larger fan and sink will function with far greater efficiency but its size will block the adjacent slot. Use your own judgement here.
3.) Check your fans regularly. The fan on your CPU cooler and the one in your power supply are usually the cheapest that the vendor can find. Fortunately, both are easily replaced. Check the web. There are a number of vendors that offer high efficiency and high reliability substitutes and though they will set you back a few bucks they are well worth it. Make sure that the fan motor has ball bearings (rather than bushings) and get heat sinks with the largest fin surface area that you can find.
The heat sink on the left is typical of many found on consumer grade systems. The unit on the right is the one that we use.
Look at the heat sink. If the fins are cute little things (.25 to .325 in. fins) lose that sucker like a bad habit! The coolers we use are 1.75 in. deep with a solid copper core. If the fan should fail, the sink can still radiate passively at a rate sufficient to keep the CPU's junction temperature within safe limits for a short time.
Last but not least, make sure that your heat sink's contact surfaces are coated with heat conducting grease. Remove the cooling assembly from your CPU and look at the underside. If it has a white greasy goo smeared on it, put it back, all is well. If not, get a small tube of "heat sink grease" (about $2.00 for a .25 oz. tube) and coat the mating surfaces.
A high end CPU like the Athlon 2400+ can dissipate up to 56 Watts of heat. To get a better idea of what that means, turn on a lamp with a 60 Watt bulb. Give it 10 or 15 minutes to warm up, then grab the bulb with your bare hand and try to unscrew it. When you get back from the hospital you will have a very clear idea of what we are dealing with.
For more information on heat sinks check out http://www.heatsink-guide.com/ among others.
4.) If you have an auxiliary (booster) fan in your case, remove it and inspect the circumstances of its mounting. It is usually mounted behind a grill of small holes. These severely restrict the air flow. Using a "nibbling tool", a Dremel tool or a saber saw, cut out the opening to the diameter of the fan blades. This can easily double or triple the air flow from the fan. Unfortunately it will also compromise the RFI integrity of the case. To counter this we attach a formed wire grill over the opening with four screws fitted with nuts and conductive star washers. This has little effect on the airflow but effectively shields the opening.
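To see why doubling the air flow matters, here is a rough back-of-the-envelope sketch. It is not something we measured; it assumes a simple steady-state energy balance in which all of the heat leaves in the air stream, and it uses textbook values for air at about sea level and room temperature. The function name and constants are made up for illustration, and the 56 Watt and 40 cfm figures are just the ones quoted elsewhere in this article.

```python
# Rough estimate of how much the air warms up while carrying heat out of a case.
# Assumption: every watt ends up in the air stream (steady state, no other losses).

AIR_DENSITY = 1.2              # kg per cubic meter, roughly sea level, room temperature
AIR_SPECIFIC_HEAT = 1005.0     # joules per kg per kelvin
CFM_TO_M3_PER_S = 0.000471947  # one cubic foot per minute in cubic meters per second


def air_temp_rise_f(watts, cfm):
    """Return the estimated rise in air temperature, in degrees F."""
    flow = cfm * CFM_TO_M3_PER_S
    rise_kelvin = watts / (AIR_DENSITY * AIR_SPECIFIC_HEAT * flow)
    return rise_kelvin * 9.0 / 5.0  # a kelvin step is 1.8 Fahrenheit steps


# Illustration only: a 56 Watt heat load and a 40 cfm fan.
print(round(air_temp_rise_f(56, 40), 1))  # about 4.4 degrees F
print(round(air_temp_rise_f(56, 80), 1))  # double the air flow, half the rise
```

A real case holds more than one heat source and a fan rarely delivers its rated flow once it is mounted, so treat the numbers as a lower bound. The point is the relationship: for a given heat load, the rise goes down in direct proportion to the air you actually move.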
5.) Keep it clean! That thin layer of dust works just like the blanket that you sleep under. It retains heat! We use an air compressor and a cheap (but soft) paint brush. For a machine in a normal office environment we suggest a thorough cleaning every four months. In a home we extend that to six months. Yes, Virginia, your house is actually cleaner than your office (well, maybe your house). We do a lot of rebuilds and upgrades and cleaning is always the first step. The first time we turn the air hose on one it usually looks like somebody set off a pipe bomb in their Hoover!
6.) Pick your parts with heat in mind. If you are dealing with an existing system this is not always possible. If you are building a new system it should be a cardinal concern. CPUs built on a finer fab will generate less heat per junction but may well have far more junctions to add to the heat load (22 million transistors in the Athlon from AMD). Using multiple (slower) drives in a striped RAID array will often generate less heat than a single super fast drive. If you are building a server or even an office workstation, avoid those fancy, super gaming, VGA cards. Look for a simple S3 Virge or some such and don't load it with more Video RAM than you actually need. You can get 24 bit color at 1024x768 with 4 MB. Unless you are using at least a 19 inch monitor you probably do not have any need for 8 MB.
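If you want to check the Video RAM arithmetic for yourself, the sketch below works it out. It assumes a single frame buffer with tightly packed pixels and nothing else stored on the card, which is a simplification; the function name is made up for illustration, and the 1600x1200 mode is just a hypothetical example of what a larger monitor might run.

```python
# How much memory does one screen full of pixels need?
# Assumption: one frame buffer, packed pixels, no textures or extra buffers.

MB = 1024 * 1024  # bytes in one megabyte of video memory


def framebuffer_bytes(width, height, bits_per_pixel):
    """Return the bytes needed to hold a single frame at the given mode."""
    return width * height * bits_per_pixel // 8


office_mode = framebuffer_bytes(1024, 768, 24)
print(office_mode / MB)       # 2.25, fits comfortably in 4 MB

big_mode = framebuffer_bytes(1600, 1200, 24)  # hypothetical large-monitor mode
print(big_mode > 4 * MB)      # True, this one would need the 8 MB card
```

Cards that pad each pixel out to 32 bits or keep a second buffer will need more than this, so check the mode you actually plan to run before you buy.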
7.) Measure and monitor. You cannot react to what you cannot see. Get an inexpensive indoor/outdoor thermometer like the one pictured below and feed the remote sensor in through any convenient hole on your case. Mount the probe tip in any area of concern (or just anywhere in the general airflow of the system) and "Bob's your uncle" as our British friends might say. Sit the readout where you can see it and you are in business. If you subtract the temperature outside the case (the "indoor" reading) from the temperature inside it (the "outdoor" reading), you have your heat rise factor!
The unit pictured here is from Oregon Scientific (about 20 bucks) bought from a local electronic discount house that must remain nameless (rhymes with flies) but there are myriad alternatives available anywhere from your nearest Radio Shack to the neighborhood hardware store.
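If you jot your readings down, the heat rise arithmetic is easy to automate. The little sketch below is only an illustration: the function names are made up, and the 10 degree limit is an arbitrary number picked for the example, not a figure we publish. Set it to whatever you are comfortable with.

```python
# Heat rise: the temperature inside the case minus the ambient room temperature.

def heat_rise(inside_f, ambient_f):
    """Return the rise over ambient in degrees F, rounded to one decimal."""
    return round(inside_f - ambient_f, 1)


def needs_attention(rise_f, limit_f=10.0):
    """Flag a machine whose rise exceeds the limit (10.0 is an arbitrary example)."""
    return rise_f > limit_f


# Readings quoted at the top of this article.
print(heat_rise(75.0, 73.9))   # 1.1, the server in the attic
print(needs_attention(6.3))    # False, the Athlon on the bench
print(needs_attention(28.0))   # True, the Pentium II workstation
```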
Last of all, if you have any questions please give me a call and we can discuss it. I will try to help if I can.
Copyright Yavapai Systems, 2003