[GTALUG] war story: power failure makes CentOS system unbootable

Lennart Sorensen lsorense at csclub.uwaterloo.ca
Thu Sep 29 18:11:29 EDT 2022


On Thu, Sep 29, 2022 at 12:22:08PM -0400, D. Hugh Redelmeier via talk wrote:
> I'm continuing a very old thread of mine.
> Because of the age, I'm top-posting.
> 
> Summary up until now:
> 
> - system fails to reboot after power failure
> 
> - problem seems to be firmware distaste for dirty bit on ESP (/boot/efi)
> 
> - dirty bit always happens because /boot/efi is always mounted, even 
>   though it isn't needed 99% of the time
> 
> - mounting-on-demand doesn't fix this because packagekit accesses the ESP 
>   almost immediately.  This feels like a bug.
> 
> - I disabled packagekit to prevent /boot/efi from being automounted.
>   Well, it grew back, somehow.  Updates, I guess.
> 
> Now:
> 
> So now I'll use a different approach.  I'll get automount to also 
> autodismount.  Oh, the wonders and mysteries provided by systemd!
> 
> 1. Modify the /etc/fstab entry for /boot/efi to include 
> 	"x-systemd.idle-timeout=600"
>    This asks for dismounting after 600 seconds of inactivity.
> 
> 2. Get the relevant daemon(s) to pay attention:
> 	systemctl daemon-reload
> 	systemctl restart boot-efi.automount
>    Note that boot-efi.automount's name is synthesized automatically from 
>    the mount point.  So is the target script itself.
> 
> 3. Hope this works
> 
> See systemd.mount(5).
> You can ignore or be confused by systemd.automount(5).

It seems the correct thing to do is get the kernel fixed.

https://github.com/torvalds/linux/commit/b88a105802e9aeb6e234e8106659f5d1271081bb
clearly states that Windows only sets the dirty bit on write, yet for
some reason it was chosen to make Linux set it whenever you mount a
vfat filesystem read-write.  That is dumb.  Of course the UEFI firmware
that only works when the dirty bit is cleared is also dumb, but what can
you do with firmware developers?

Obviously these defective UEFI firmwares should ignore the bit, but since
they don't, and Windows doesn't break them (because it only sets the bit
on the first write to the filesystem), it would make sense to fix Linux.
Of course this means that instead of doing all the logic at mount time,
the driver has to keep state that is checked whenever a write happens to
the filesystem, and set the bit at that time instead.
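
A rough sketch of that lazy approach, as a toy model in Python rather
than kernel C.  The class and method names are made up for illustration
and are not the kernel's vfat code; the only point is that the on-disk
dirty bit gets set on the first write after mount, not by the mount
itself.

```python
class FatVolume:
    """Toy model of a FAT volume's dirty bit (hypothetical, not kernel code)."""

    def __init__(self):
        self.dirty_on_disk = False       # the bit the firmware would see
        self.mounted_rw = False
        self.wrote_since_mount = False   # in-memory state checked on each write

    def mount(self, read_write=True):
        # Lazy policy: a read-write mount does NOT touch the on-disk bit.
        self.mounted_rw = read_write
        self.wrote_since_mount = False

    def write(self, data):
        if not self.mounted_rw:
            raise PermissionError("volume is mounted read-only")
        if not self.wrote_since_mount:
            # First write since mount: mark the volume dirty before
            # anything else on it changes.
            self.dirty_on_disk = True
            self.wrote_since_mount = True
        # ... the actual data write would happen here ...

    def unmount(self):
        # Clean unmount: everything is flushed, so clear the bit again.
        self.dirty_on_disk = False
        self.mounted_rw = False
        self.wrote_since_mount = False
```

With this policy, a power failure while the ESP is mounted but has not
been written to leaves the bit clear, which is the common case for
/boot/efi.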

-- 
Len Sorensen

