Spare Parts – It’s Hot AND Cold!
When you consider data mirroring through RAID, hot-pluggable PSUs, and hot-swappable hard drives, there is no question that Enterprise servers, with their mission-critical applications, are built with redundancy in mind.
To that end, we thought it would be helpful to share which spares we suggest keeping on hand, along with the basics of hot swapping vs. hot plugging and hot spares vs. cold spares.
After all, nobody wants to suffer data loss or a server crash because a spare part was unavailable. Here are TechMike’s suggestions to avoid getting caught with your digital pants down!
Hot Spare vs. Cold Spare
A Hot Spare is a component on constant standby should a primary part fail, go offline, or work less than optimally. In the world of Enterprise servers, hot spares almost always refer to a hard drive in a RAID virtual disk (although the term can technically include printers, switches, and power supply units; more on PSUs later!). Should one of the drives within the virtual disk fail, that hot spare is ready to go, seamlessly taking over the functions of the failed drive without any action or intervention by the server’s administrator.
A Cold Spare is a spare part that requires physical intervention by the user before it can be put to work. If we go back to our example of the RAID virtual disk, the server admin would need to manually insert the new drive to replace the failed drive. Once the new drive is in place, the system would automatically engage it to take over the functions of the failed drive.
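To make the difference concrete, here is a minimal sketch in Python. It is a toy model, not a vendor API: the VirtualDisk class, its method names, and the drive labels are all hypothetical and exist only to illustrate who has to act when a drive fails.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class VirtualDisk:
    """Toy model of a RAID virtual disk (hypothetical, for illustration only)."""

    active: list[str]                                    # drives currently in the array
    hot_spares: list[str] = field(default_factory=list)  # standby drives already installed
    degraded_slots: int = 0                              # failed drives still awaiting replacement

    def fail_drive(self, drive: str) -> str:
        """Simulate a drive failure and report what happens next."""
        self.active.remove(drive)
        if self.hot_spares:
            # Hot spare: takes over with no action from the admin.
            spare = self.hot_spares.pop(0)
            self.active.append(spare)
            return f"{drive} failed; hot spare {spare} took over automatically"
        # No hot spare: the array waits for someone to insert a cold spare.
        self.degraded_slots += 1
        return f"{drive} failed; waiting for an admin to insert a cold spare"

    def insert_cold_spare(self, drive: str) -> str:
        """Simulate the admin physically inserting a replacement drive."""
        if self.degraded_slots == 0:
            raise ValueError("no failed drive to replace")
        self.degraded_slots -= 1
        self.active.append(drive)
        return f"admin inserted {drive}; the system engaged it automatically"
```

In this sketch, a failure with a hot spare present never touches degraded_slots, while a failure without one leaves the array waiting until insert_cold_spare is called.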
MORE RESOURCES! The popular RAID configuration tutorials on our YouTube channel explain, step by step, how to set up a hot spare when building a RAID array. We have tutorials for both a Dell PowerEdge server and an HPE ProLiant server.
What about “Hot Swapping” vs. “Hot Plugging?”
While the terms are sometimes used interchangeably, Hot Swapping and Hot Plugging are technically different. “Hot Swapping,” as its name suggests, is the process of removing a component and replacing it without the need for the system to be shut down. “Hot Plugging” is the ability to attach a new component to the computer, rather than replace an existing one, without the need for a reboot.
Most Enterprise rack servers have hot-pluggable backplanes, meaning that you can install hard drives while the system is running, including drives that belong to a RAID configuration. If the drive is replacing one in an existing RAID virtual disk, it is not necessary to configure the BIOS or restart. This is a Hot Swap.
A “Cold Swap” is a component replacement that requires the system to be shut down.
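The three terms boil down to two questions: are you replacing an existing component, and does the system have to be shut down? A small Python sketch of that decision follows. The Operation enum and classify function are hypothetical names made up for this illustration.

```python
from enum import Enum
from typing import Optional


class Operation(Enum):
    HOT_SWAP = "hot swap"
    HOT_PLUG = "hot plug"
    COLD_SWAP = "cold swap"


def classify(replaces_existing: bool, needs_shutdown: bool) -> Optional[Operation]:
    """Map the two questions onto the terms defined in this article."""
    if replaces_existing:
        # Replacing a part: hot if the system stays up, cold if it must go down.
        return Operation.COLD_SWAP if needs_shutdown else Operation.HOT_SWAP
    if not needs_shutdown:
        # Attaching something new while the system keeps running.
        return Operation.HOT_PLUG
    # Adding a part with the system shut down is not one of the terms covered here.
    return None
```

For example, classify(replaces_existing=True, needs_shutdown=False) returns Operation.HOT_SWAP, which matches replacing a failed drive in a running RAID virtual disk.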
Should you keep a hot spare and a cold spare on hand?
If you utilize RAID virtual disks in your server, you definitely should! While the proper RAID configuration will avoid data loss in the event of a disk failure, you still need to replace that failed drive with a new drive, and that is where your cold spare comes in.
RAID NUANCES: If there is a bad drive in an existing RAID array/virtual disk, you can replace it "on the fly" (hot) while the server is running and in use, but ONLY if the new drive is the same capacity, format, and type (SATA vs. SAS) as the drive it is replacing. However, if you are adding a new drive to that RAID array or creating a new virtual disk, you will need to use the Dell iDRAC utility or reboot the server and use the built-in RAID utility.
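If you track your spares in a script or spreadsheet, the matching rule above is easy to check before you pull a drive off the shelf. The Python sketch below is one way to do it. The Drive class and its field names are hypothetical, and it assumes the three attributes can be modeled as capacity in GB, form factor, and interface.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Drive:
    capacity_gb: int   # drive size in GB (assumed unit)
    form_factor: str   # e.g. "2.5in" or "3.5in" (assumed reading of "format")
    interface: str     # "SAS" or "SATA"


def mismatches(failed: Drive, replacement: Drive) -> list:
    """Return the names of the attributes that differ between the two drives."""
    return [
        f.name
        for f in fields(Drive)
        if getattr(failed, f.name) != getattr(replacement, f.name)
    ]


def can_hot_replace(failed: Drive, replacement: Drive) -> bool:
    """True only when all three attributes match, per the rule above."""
    return not mismatches(failed, replacement)
```

With failed = Drive(600, "2.5in", "SAS"), a spare described as Drive(600, "2.5in", "SATA") gives mismatches(...) == ["interface"], so that spare would not qualify under the rule described here.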
What About Other Spare Parts?
We’ve talked mainly about hard drives, but other spare components make sense to keep on hand.
Power Supply Units
Most rack servers have two hot-pluggable PSU slots. Again, the idea is redundancy. Just as with hard drives, it makes sense to keep a cold spare PSU on hand.
VISIT OUR PSU PRODUCT PAGE! We keep a wide selection of Dell and HPE PSUs in stock that usually ship out within one to two business days.
RAM
While server RAM is not hot-swappable or hot-pluggable like the other components, keeping spare RAM modules is a smart idea. One bad DIMM can shut your entire server down.
With their hot-pluggable hard drive bays and PSU slots, Enterprise rack servers are built with redundancy in mind. It makes sense to take advantage of this by keeping cold spares on hand, so a failed part causes as little downtime as possible.
What is your experience with spare parts? Do you keep them on hand? Let us know in the comments! We love the feedback!