We have HS21 (5583) Blades running ESX 3.5. About six months ago we replaced the original 2GB DIMMs with 4GB DIMMs that we purchased online. The new memory worked very well at first, but over the last few months nearly every new DIMM has gone bad on us and we cannot figure out why. I contacted the company we purchased the memory from; they verified that we bought the correct RAM for our blades and said that, since the memory has a lifetime warranty, we should just send it back and they would send replacements. We have now replaced almost every DIMM with a new one, and the DIMMs we just replaced are going bad as well. We get Single Bit Memory errors, which are correctable. We also get Double Bit Memory errors, which result in a memory fault and cause the server BIOS to disable the RAM slots where those DIMMs are located.
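For readers less familiar with why a single-bit error is correctable while a double-bit error is not, here is a toy sketch. It assumes a SECDED-style ECC (single-error-correct, double-error-detect), which is common for server memory but is an assumption here, not something confirmed for these particular DIMMs. It uses an extended Hamming(8,4) code on a 4-bit value; real ECC DIMMs typically protect 64-bit words with 8 check bits. The function names `encode` and `decode` are made up for illustration.

```python
def encode(nibble):
    """Encode a 4-bit value as an 8-bit extended Hamming codeword (list of bits)."""
    data = [(nibble >> i) & 1 for i in range(4)]
    word = [0] * 8
    # Data bits live at positions 3, 5, 6, 7; positions 1, 2, 4 are check bits.
    word[3], word[5], word[6], word[7] = data
    word[1] = word[3] ^ word[5] ^ word[7]
    word[2] = word[3] ^ word[6] ^ word[7]
    word[4] = word[5] ^ word[6] ^ word[7]
    # Position 0 holds overall parity across positions 1..7.
    word[0] = word[1] ^ word[2] ^ word[3] ^ word[4] ^ word[5] ^ word[6] ^ word[7]
    return word


def decode(word):
    """Return (status, nibble); status is 'ok', 'corrected', or 'uncorrectable'."""
    word = list(word)
    # Syndrome: XOR of the positions (1..7) of all set bits; 0 for a valid codeword.
    syndrome = 0
    for pos in range(1, 8):
        if word[pos]:
            syndrome ^= pos
    overall = 0
    for bit in word:
        overall ^= bit

    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd parity: treated as a single flipped bit. The syndrome is its
        # position (0 means the overall parity bit itself), so flip it back.
        word[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: two bits flipped. The error is
        # detected, but its location cannot be determined.
        return "uncorrectable", None

    nibble = word[3] | (word[5] << 1) | (word[6] << 2) | (word[7] << 3)
    return status, nibble


if __name__ == "__main__":
    good = encode(0b1011)
    one_flip = list(good)
    one_flip[5] ^= 1
    two_flips = list(one_flip)
    two_flips[6] ^= 1
    print(decode(good))
    print(decode(one_flip))
    print(decode(two_flips))
```

In this sketch a single flipped bit is located and repaired, while two flipped bits can only be flagged. That matches the behavior described above: the single-bit errors are corrected and logged, and the double-bit errors surface as a memory fault.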
I have now updated the BIOS, Diagnostics, and Blade System Management Processor firmware on all of our IBM Blades. I've also upgraded the BladeCenter Chassis AMM firmware and brought ESX 3.5 up to Update 5. Along the way I resolved a few other bugs we were having with our VMware environment, yet we still have RAM going bad on us almost every week despite all of these updates.
I've verified that the Blades are in a cool, ventilated environment and that they are getting sufficient power. However, one other warning we are getting for each of the blades is "Blade power meter monitoring off line". I'm not sure whether that issue is related, and I couldn't find any good documentation on what that warning means or how to resolve it.
2 replies Latest Post - 2012-11-02T19:14:15Z by jaquice
Pinned topic Memory keeps failing
DynamicMike
Re: Memory keeps failing, 2010-10-25T21:12:00Z, in response to DynamicMike
Can anyone please help me with this issue and suggest a solution? I can send logs or any other troubleshooting information that might be required. I am at a loss for what else to try, and it is getting very frustrating having to swap out the RAM and get warranty replacements several times a month.