[GTALUG] War Story: UEFI Boot Failure and Fix

Scott Sullivan scott at ss.org
Wed Jan 18 23:57:14 EST 2017


I was recently issued a new laptop from work. Of course, being in a 
sysadmin position, and given the latitude to use the tools I deem 
necessary, I replaced the disk and installed Fedora 25.

I felt it was also time to take the plunge with UEFI booting.

The Problem
===========

After installation, the first boot was successful. After applying 
updates, boot failed.

This was repeatable: a second re-install, followed by updates, failed 
the same way.

Symptoms
========

System boots, and goes to a blue and white UI called 'MokManager.efi'.

Research revealed this to be a UI for enrolling signing keys for 
Secure Boot.

http://www.rodsbooks.com/efi-bootloaders/secureboot.html
MOKs—A Machine Owner Key (MOK) is a type of key that a user generates 
and uses to sign an EFI binary. The point of a MOK is to give users the 
ability to run locally-compiled kernels, boot loaders not delivered by 
the distribution maintainer, and so on.

Troubleshooting
===============

First, I confirmed that Secure Boot was disabled in the laptop's firmware.
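
The firmware setting can also be cross-checked from a running system. 
The following is a minimal sketch, not part of the original procedure: 
the boot_mode function and its optional argument are illustrative names, 
and mokutil may not be installed on every rescue image.

```shell
# Report whether the running system was started via UEFI or legacy BIOS.
# The optional argument (a sysfs root) exists only to make the check testable.
boot_mode() {
    sysfs_root=${1:-/sys}
    if [ -d "$sysfs_root/firmware/efi" ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}

boot_mode

# If mokutil is available, ask it for the Secure Boot state as the OS sees it.
if command -v mokutil >/dev/null 2>&1; then
    mokutil --sb-state || true
fi
```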

Next, I booted a rescue image, mounted the EFI partition, and investigated.

mkdir /mnt/boot
mount /dev/sda2 /mnt/boot     # grub2 boot partition
mount /dev/sda1 /mnt/boot/efi # EFI partition

In the rescue environment, MokManager.efi is found under the following 
path:
/mnt/boot/efi/EFI/fedora/MokManager.efi
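
While the partition is mounted, a quick look for obviously damaged 
executables may save a reboot cycle. This is only a sketch: the function 
name is made up for illustration, and an empty file is just one crude 
symptom; a damaged FAT can corrupt files without truncating them.

```shell
# Print any zero-length *.efi files under an ESP mount point.
# -iname is used because FAT filenames are case-insensitive.
empty_efi_binaries() {
    esp=$1
    find "$esp" -type f -iname '*.efi' -size 0
}

# Hypothetical usage in the rescue environment:
#   empty_efi_binaries /mnt/boot/efi
#   ls -l /mnt/boot/efi/EFI/fedora/
```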

I removed this executable to see what I could discover from the boot 
process.

Upon reboot, I was greeted with a message reporting grub2-x64.efi as 
corrupt, followed by a fall-through to MokManager.efi, which was now 
missing, and then to another optional EFI executable that was also 
missing.


Repair
======

fsck.vfat /dev/sda1
- performed removal of the dirty bit
- FATs didn't agree; selected the first one at random, since I was going 
to restore the firmware anyway.
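
Since picking a FAT at random is a one-way choice, imaging the partition 
first leaves a way back. A minimal sketch, assuming the partition is 
unmounted; backup_partition and the image path are hypothetical names, 
not something from the original repair.

```shell
# Copy a partition (or any file) to an image and confirm the copy matches.
backup_partition() {
    src=$1
    dest=$2
    dd if="$src" of="$dest" bs=4096 2>/dev/null && cmp -s "$src" "$dest"
}

# Hypothetical usage:
#   backup_partition /dev/sda1 /root/esp-backup.img
#   fsck.vfat -n /dev/sda1   # read-only pass: report problems, change nothing
```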

mkdir /mnt/chroot
mount /dev/sda3 /mnt/chroot          # root filesystem
mount /dev/sda2 /mnt/chroot/boot     # grub2 boot partition
mount /dev/sda1 /mnt/chroot/boot/efi # EFI partition

# Make devices and process table available
for j in /dev /dev/pts /sys /proc; do mount -B "$j" "/mnt/chroot$j"; done

# Make DNS available inside the chroot (see note 2)
cp /etc/resolv.conf /mnt/chroot/etc/resolv.conf

chroot /mnt/chroot

# re-install grub2 EFI executables
yum reinstall grub2-efi shim

# leave the chroot
exit

umount /mnt/chroot/boot/efi

fsck.vfat /dev/sda1 # just to sanity check that it's still clean.

reboot
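
The listing above unmounts only the EFI partition before rebooting, 
which was enough here since reboot normally unmounts the rest. If the 
rescue session continues instead, a full teardown in reverse mount order 
could look like this dry-run sketch; chroot_umount_plan is an 
illustrative name, and the mount list mirrors the one used above.

```shell
# Print umount commands for the chroot, in the reverse of the mount order.
# This only echoes the commands; pipe the output to sh to execute them.
chroot_umount_plan() {
    root=$1
    for m in /proc /sys /dev/pts /dev /boot/efi /boot ""; do
        echo "umount $root$m"
    done
}

chroot_umount_plan /mnt/chroot
```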


Success
=======

Machine booted normally with the latest kernel from the updates.


Notes
=====

1) I've simplified my device names for convenience in conveying the 
process followed. Adjust appropriately for your circumstances and disk 
layout.

2) In Fedora 25, /etc/resolv.conf is a symlink to 
/var/run/NetworkManager/resolv.conf, so I had to copy that file instead. 
Your distro may vary.

3) There was no left over EFI partition from a previous system. This was 
all done on a blank hard drive.

4) Frankly, I found all of this far simpler to manage than a classical 
MBR-based boot loader. I was able to use standard tools to investigate 
and repair the boot chain. No special grub-install commands or dinking 
with config files.

5) None of this explains why a normal update left the boot partition in 
an unclean state to begin with. But if it happens again, it's an easy 
enough repair, just a little time consuming.
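
For note 2, the symlink handling can be sketched as a small helper. This 
is an illustration under assumptions, not the exact commands I ran: 
copy_resolv_conf is a made-up name, it follows only one level of symlink, 
and it assumes an absolute link target should be interpreted relative to 
the chroot.

```shell
# Copy the host's resolver config into a chroot. If the chroot's
# /etc/resolv.conf is a symlink, write to the link target inside the
# chroot rather than through the (dangling, from outside) link itself.
copy_resolv_conf() {
    chroot_dir=$1
    src=${2:-/etc/resolv.conf}
    dest="$chroot_dir/etc/resolv.conf"
    if [ -L "$dest" ]; then
        target=$(readlink "$dest")
        case "$target" in
            /*) dest="$chroot_dir$target" ;;      # absolute link target
            *)  dest="$chroot_dir/etc/$target" ;; # relative link target
        esac
        mkdir -p "$(dirname "$dest")"
    fi
    # -L copies the file contents even if the host's copy is a symlink too.
    cp -L "$src" "$dest"
}

# Hypothetical usage:
#   copy_resolv_conf /mnt/chroot
```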

-- 
Scott Sullivan

