[GTALUG] server questions - - help needed

o1bigtenor o1bigtenor at gmail.com
Tue Jun 5 17:16:51 EDT 2018


On Sun, Jun 3, 2018 at 7:18 PM, D. Hugh Redelmeier via talk
<talk at gtalug.org> wrote:
> | From: o1bigtenor via talk <talk at gtalug.org>
>
> | My server has been operational for about a year and I am working on a
> | number of different projects on it. Twice now (this last Friday and 5
> | weeks earlier) I came into the office to find that the server has somehow
> | been taken down and has rebooted itself (process set up in the BIOS)
> | but as it doesn't quite complete the boot process, I have to hit a key
> | to tell it to continue and then finally to log in to read Debian
> | (stable).
> |
> | So I am trying to determine what may have caused the system to do a
> | reboot,
>
> Often a crash prevents logging.  Clearly logging would have to happen
> after the crash, something that isn't easy when the system has
> crashed.  But there is some hope.

Using the suggestions offered, I think I have been able to pinpoint the issue.
>
> Do you have a working UPS?  I don't, and I lose power a few times a
> year.  That knocks out my computers (and clocks everywhere).
>
> Aside: all device classes evolve to have enough intelligence to have
> clocks that need setting, and then evolve to be networked to set their
> own clocks.  The timing of these steps is not fixed.
>
> Can you believe that I grew up with phones that had no clock?
>
> The first small computers I used had no clocks.  The big ones did so
> that IBM could charge for the time that they were used (e.g. one used
> to rent machines and have to pay overtime if they worked more than one
> shift).  CP/M's file system didn't have timestamps (they were added
> long after I moved on).  MS-DOS stupidly used local time for
> timestamps, even though UNIX got it right (used UTC) before MS-DOS.
>
> | AIUI servers should be
> | able to run happily for years without issues (barring hardware
> | problems) so I want that kind of reliability. Where in /var/log will I
> | be finding the most clues as to the events that lead up to this
> | 'reboot'?
>
> Not being a debian user, I don't know which files are most useful.  If
> you are using systemd you might find that journalctl is the command
> you need.
>
> You could look at them all (you can skip the ones which haven't changed
> recently).
>
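
A minimal sketch of the "skip the ones which haven't changed recently"
idea, in Python. The function name recently_changed and the 7-day
window are assumptions made for illustration; nothing here comes from
the thread itself. It only lists candidate files, newest first, and
does not try to interpret their contents.

```python
import os
import time
from pathlib import Path


def recently_changed(log_dir, max_age_days=7.0, now=None):
    """Return (path, mtime) pairs for files under log_dir that were
    modified within the last max_age_days, newest first."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    found = []
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            path = Path(root) / name
            try:
                mtime = path.stat().st_mtime
            except OSError:
                continue  # unreadable or vanished (e.g. rotated) file
            if mtime >= cutoff:
                found.append((path, mtime))
    # Newest first, so entries written just before the reboot come up early.
    found.sort(key=lambda item: item[1], reverse=True)
    return found


if __name__ == "__main__":
    # /var/log is the directory asked about above; reading everything in
    # it may need root.
    for path, mtime in recently_changed("/var/log"):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
        print(stamp, path)
```

Files near the top of that list with timestamps just before the
unexpected reboot are probably the ones worth reading first.
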
>
> I don't know why your system stops at the POST page.  Could it be that
> your HDD doesn't spin up quickly enough for the normal boot logic?

Dell has some kind of goofy BIOS stuff, so one needs to choose one of
2 options; then the UEFI stuff happens and the reboot works. The wait
for input is not the issue here (the system has always been this way
- - - grin!).
>
> I have one server that hangs because the EFI System Partition's
> filesystem gets corrupted during a crash (oops).  I think that the
> problem is that the OS leaves /boot/efi mounted most of the time
> (that's dumb) so the filesystem gets marked as "dirty" and the
> firmware doesn't like that.

Thanks for the ideas!

Dee

