Our previous Quick Fix blog post focused on troubleshooting common startup issues associated with PSUs. In this post, we'll continue with the spirit of Quick Fixes and look at other issues that can cause problems right out of the gate, specifically, RAM and cabling.
It's TechMike's Quick Fix Troubleshooting Guide – Part 2! Let's go Under the Hood!
Cabling
While our server builders test every device before it leaves our warehouse, we have found that shipping carriers don't always show the proper TLC when handling packages. Consequently, cables can become disconnected during shipping, and a loose or disconnected cable is sometimes the cause of startup issues.
Power Ribbon Cable
If you go to power on your server and nothing happens (no blinking lights, no drives or fans coming to life), it could be a loose power cable within the server. This is especially likely if the PSUs are lit up and receiving power but the front panel is dead. (If the PSUs aren't receiving power, refer to our previous guide on PSU troubleshooting.)
Open the server's access panel and confirm that the power cable running to the front panel is securely connected to the front panel and the system board. Some server models have additional, smaller cables that run from the rack mount ear to the control panel. You'll want to confirm these are secure as well.
Backplane Cables
Another common cabling issue that comes up is a loose SAS Backplane cable. Fortunately, Dell PowerEdge servers will give an explicit error message stating, "Backplane X connector X is not connected." Reseat the backplane connectors, and consider blowing compressed air into the connectors to ensure they are free of dust.
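If you keep a copy of your server's startup or system event log as text, a quick scan can tell you which connectors to reseat. The snippet below is only a minimal sketch: it matches the message wording quoted above, while the function name and the sample values ("0" and "A0") are made up for illustration and may not match what your server prints.

```python
import re

# Pattern built from the message wording quoted in this post.
# The X placeholders are captured as whatever non-space text appears there.
BACKPLANE_ERROR = re.compile(
    r"Backplane (\S+) connector (\S+) is not connected", re.IGNORECASE
)


def find_loose_backplane_connectors(log_text):
    """Return (backplane, connector) pairs for every matching message."""
    return [(m.group(1), m.group(2)) for m in BACKPLANE_ERROR.finditer(log_text)]


# Hypothetical sample line, not copied from a real log.
sample = "Backplane 0 connector A0 is not connected."
for backplane, connector in find_loose_backplane_connectors(sample):
    print(f"Reseat connector {connector} on backplane {backplane}")
```

Each pair returned points to one connector worth reseating and checking for dust.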
RAM
One bad RAM module can throw a server into disarray. When diagnosing a RAM error, it's essential to determine if it is a bad module or a bad slot (or possibly even something else).
If you receive RAM errors on startup, swap the suspect RAM module with a module from another slot to see whether the error follows the module or stays with the slot.
- If the error moves with the RAM module, that module is defective. (And if you are within your one-year warranty period, TechMikeNY will send you a replacement module.)
- If the error stays with the same slot (confirmed by testing that slot with a module known to work in another slot), then the issue is likely the slot. This can often be corrected by thoroughly blowing compressed air into the RAM slot. Even the smallest dust particle between the contacts can cause errors.
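The swap test above boils down to a small decision rule. Here is a rough sketch of it in Python. The function name, argument names, and result labels are our own inventions for illustration, not part of any diagnostic tool.

```python
def diagnose_swap_test(original_slot, swapped_to_slot, error_slot_after_swap):
    """Interpret the result of moving a suspect module to another slot.

    original_slot: slot that first reported the error (e.g. "B2")
    swapped_to_slot: slot the suspect module was moved to
    error_slot_after_swap: slot reporting the error after the swap,
        or None if no error appeared
    """
    if error_slot_after_swap is None:
        # The error did not come back; the module may just have been poorly seated.
        return "not reproduced"
    if error_slot_after_swap == swapped_to_slot:
        # The error followed the module to its new slot.
        return "module"
    if error_slot_after_swap == original_slot:
        # The error stayed put even with a known-good module installed.
        return "slot"
    # Error showed up somewhere unexpected; look beyond the module and slot.
    return "something else"


print(diagnose_swap_test("B2", "A1", "A1"))
print(diagnose_swap_test("B2", "A1", "B2"))
```

A "slot" result is a starting point, not a verdict: clean the slot with compressed air and retest before drawing conclusions.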
RAM ERROR 'WILDCARD.' We once had a customer who was receiving memory error messages on the B2 RAM slot. The issue was not with that slot or RAM module but with a damaged pin in the system board's processor socket. Ah-ha! That specific pin goes to slot B2! Can you find the damaged pin below? P.S. We immediately sent the customer a new server, and now he is up and running.
A Word on RAM Upgrades
Upgrading or adding RAM modules is perfectly acceptable. However, you do want to keep a couple of pointers in mind.
- When mixing LV (low-voltage) and non-LV RAM, the RAM will automatically run at the higher voltage.
- The same principle applies to mixing RAM modules with different speeds: all modules will run at the speed of the slowest one.
- Most importantly, if you max out the RAM in a server, you must confirm that all the modules are of the same Rank.
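To make these three pointers concrete, here is a small sketch that works out how a mixed set of modules would behave. The `Dimm` class, the field names, and the example values are assumptions for illustration only (the 1.35 V and 1.5 V figures are the usual DDR3L and standard DDR3 voltages); always check your server's own documentation for supported configurations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimm:
    speed_mts: int   # rated speed in MT/s, e.g. 1333 or 1600
    voltage: float   # e.g. 1.35 for LV, 1.5 for standard (DDR3 figures)
    rank: int        # 1 = single rank, 2 = dual rank, 4 = quad rank


def effective_settings(dimms):
    """Return (speed, voltage, ranks_match) for a set of installed modules."""
    speed = min(d.speed_mts for d in dimms)      # slowest module sets the speed
    voltage = max(d.voltage for d in dimms)      # mixed LV/non-LV runs at the higher voltage
    ranks_match = len({d.rank for d in dimms}) == 1
    return speed, voltage, ranks_match


# Hypothetical mix: one standard 1600 module and one LV 1333 module, both dual rank.
mix = [Dimm(1600, 1.5, 2), Dimm(1333, 1.35, 2)]
print(effective_settings(mix))  # (1333, 1.5, True)
```

If `ranks_match` comes back `False` for a fully populated server, that is the configuration the last pointer warns against.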
You can read more about upgrading server RAM in our blog post here.
Final Thoughts
At TechMikeNY, we know how frustrating malfunctioning servers and parts can be. The good news: we have decades of experience in troubleshooting server issues. There probably isn't an error we haven't seen.
Do you have a server issue that you would like addressed? Leave it in the comments! We will answer it!