Server Weirdness

Mark Lane lmlane-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org
Sat Oct 10 22:22:19 UTC 2009


Yeah, it's not the fans. The system runs at about 47C even under heavy
load, and the hottest I could get the processors was only 50C, so well
within tolerances.

Also, I don't think it's the power supply, because it doesn't go down under
heavy power usage.
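
Since the reboots only seem to happen when the box is idle, a temperature
log that survives the reboot would help rule the fans in or out for good.
Below is a rough sketch only, not something that has been run on this
server. It assumes Python 3 and a kernel that exposes a thermal zone at
/sys/class/thermal/thermal_zone0/temp in millidegrees; a stock CentOS 5.3
install may have neither, in which case the same idea applies to whatever
the sensors tool reports. The path, the 60C limit, the function names and
the log file name are all made up for illustration.

```python
import os
import time
from pathlib import Path

# Assumed sensor location (hypothetical for this machine); adjust as needed.
SENSOR = Path("/sys/class/thermal/thermal_zone0/temp")


def parse_millidegrees(raw: str) -> float:
    """Convert a sysfs reading such as '47000\\n' to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def format_entry(timestamp: float, celsius: float, limit: float = 60.0) -> str:
    """Build one log line; readings at or above `limit` are tagged HOT."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(timestamp))
    flag = " HOT" if celsius >= limit else ""
    return f"{stamp} UTC {celsius:.1f}C{flag}"


def log_forever(logfile: str, interval: float = 30.0) -> None:
    """Append a reading every `interval` seconds until the process dies."""
    while True:
        celsius = parse_millidegrees(SENSOR.read_text())
        with open(logfile, "a") as fh:
            fh.write(format_entry(time.time(), celsius) + "\n")
            fh.flush()
            # Force the line to disk so the last reading before an abrupt
            # reboot is not lost in the page cache.
            os.fsync(fh.fileno())
        time.sleep(interval)
```

Calling log_forever("/var/log/cputemp.log") in the background and reading
the last few lines after the next reboot would show whether the
temperature was climbing beforehand or sitting flat.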

On Sat, Oct 10, 2009 at 3:15 PM, Madison Kelly <linux-5ZoueyuiTZhBDgjK7y7TUQ at public.gmane.org> wrote:

> Mark Lane wrote:
>
>> I have been having problems with a CentOS 5.3 fileserver (64-bit) lately
>> that wants to reboot all of a sudden. It was up for over 200 days
>> without an issue. I have run a burn-in on the server and it was fine. I
>> checked the memory with memtest86+ and it was fine. I ran a CPU burn-in
>> along with bonnie++ looping for 5 hours and I couldn't get the CPU to
>> overheat or the power supply to choke on heavy load. It seems to be a
>> power management problem, yet even running the same kernel that it ran
>> for 200 days hasn't made the system stable. I did have to replace the
>> motherboard battery, but I have restored the BIOS settings and the
>> problems existed before the battery went. The system just stops working
>> without warning and it's getting worse. It only seems to happen when
>> it's somewhat idle for an extended period.
>>
>> It's an Athlon 64 X2 3800+ running on an MSI K9N Platinum with Linux
>> software RAID 5 across 4 WD 250 SATAs. I have checked the drives and they
>> seem fine. I am currently running fsck to see if it finds any problems.
>>
>> Anyone else experiencing problems with CentOS 5.3 lately? I am wondering
>> if it's a package that might be causing the instability. And yes I have
>> checked to see if the system was compromised but I haven't found anything.
>>
>
> What daemons/services are you running? I've run into bad openais and cman
> RPMs that messed things up, but not reboots (unless you have fence devices
> in which case they could be in a fence loop, but not likely).
>
> As for possible simple problems, check the fans. If they're sleeve-bearing
> fans, they could have "spun out". They'll work sometimes (sometimes with
> noise, other times quiet), and occasionally stop. If they were running
> during your burn-in you would not reproduce the reboots. However, if they
> stop, particularly the CPU fan, it could over-heat and trigger a thermal
> shutdown/reboot.
>
> A bad power supply could also do this, but that is less likely.
>
> Madi
> --
> The Toronto Linux Users Group.      Meetings: http://gtalug.org/
> TLUG requests: Linux topics, No HTML, wrap text below 80 columns
> How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists
>



-- 
Mark Lane <lmlane-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org>