[GTALUG] war story: python broken by Fedora update

D. Hugh Redelmeier hugh at mimosa.com
Wed Jun 9 11:17:41 EDT 2021


I updated Fedora 32 on my wife's netbook.
Just the normal "sudo dnf update".

After the update, dnf no longer worked.  Nor did many other things
written in Python.

This is politically very bad because my wife is deathly afraid of updates 
and I'm deathly afraid of software that isn't updated.  She was right!

With dnf broken, the main tool to fix broken packages was unavailable.
How does one recover?
1) diagnose
2) attempt a fix
3) repeat until things work

Here's the message I got with any invocation of dnf:
================
Traceback (most recent call last):
  File "/usr/bin/register-python-argcomplete", line 26, in <module>
    import argparse
  File "/usr/lib64/python3.8/argparse.py", line 89, in <module>
    import shutil as _shutil
  File "/usr/lib64/python3.8/shutil.py", line 29, in <module>
    import lzma
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 844, in exec_module
  File "<frozen importlib._bootstrap_external>", line 976, in get_code
  File "<frozen importlib._bootstrap_external>", line 640, in _compile_bytecode
ValueError: bad marshal data (unknown type code)
================

I guessed that some compiled Python file (.pyc) was corrupt, mostly
because of the word "frozen" in the traceback.  The .pyc files all
seem to be in directories called __pycache__.  The name "cache"
suggested to me that if I deleted the .pyc file in question, it would
be recreated.

Even safer would be to hide all the contents of the __pycache__
directory.  That way I could restore the contents if my guess was wrong.
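
One way to do the hiding, as a minimal sketch: rename the whole
directory out of the way and rename it back if the guess turns out to
be wrong.  This assumes the interpreter itself still starts (it uses
only the os module, to stay clear of the broken import chain).  The
helper names and the ".hidden" and ".regenerated" suffixes are made up
for illustration; a plain mv from the shell does the same job.

```python
import os

HIDDEN_SUFFIX = ".hidden"  # hypothetical naming convention, not from the post


def hide_cache(cache_dir):
    """Rename cache_dir out of the way and return the new path."""
    cache_dir = cache_dir.rstrip("/")
    hidden = cache_dir + HIDDEN_SUFFIX
    if os.path.exists(hidden):
        raise FileExistsError(hidden + " already exists; restore or remove it first")
    os.rename(cache_dir, hidden)
    return hidden


def restore_cache(cache_dir):
    """Undo hide_cache: put the saved directory back under its old name."""
    cache_dir = cache_dir.rstrip("/")
    hidden = cache_dir + HIDDEN_SUFFIX
    if os.path.isdir(cache_dir):
        # Python may have recreated the cache in the meantime; move that
        # copy aside rather than deleting anything.
        os.rename(cache_dir, cache_dir + ".regenerated")
    os.rename(hidden, cache_dir)
```

Both calls need to run as root for directories under /usr/lib64.
Python only writes a fresh __pycache__ when the user running it can
write to the parent directory, so regeneration should show up when
the failing command is run under sudo.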

There are a lot of __pycache__ directories.  The diagnostics gave no
hint of which was bad.  I guessed /usr/lib64/python3.8/importlib/__pycache__ 
because the diagnostic said "importlib".  I hid this cache and ran
dnf.  This did not fix things, but some of the contents were
regenerated, as I had guessed.

I decided to run strace on dnf to find out what files were accessed.

Oops: the system didn't have strace installed, and dnf was broken.
How do I install strace?  It turns out that when you type "strace" to
Fedora's bash, it notices that the command isn't installed and offers
to install it -- without needing python!  (The rpm command would have
worked, but with a lot more fuss.)

strace produces a lot of output -- there are a lot of system calls.  I
just grepped for '"/' because I only cared about system calls with
absolute pathnames.
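
The same filtering can be sketched in Python, going one step further
than the grep by pulling out the pathnames themselves and listing the
.pyc files among them.  This is only an illustration: the function
names are made up, it assumes the trace was saved to a file with
strace's default line format (no -f pid prefixes), and it ignores
pathnames that contain a double quote.

```python
import re

# An absolute pathname as strace prints it: a double-quoted string
# that starts with "/".
ABS_PATH = re.compile(r'"(/[^"]*)"')


def absolute_paths(strace_lines):
    """Yield (syscall, path) for every absolute pathname in the trace."""
    for line in strace_lines:
        for path in ABS_PATH.findall(line):
            # The syscall name is whatever precedes the first "(".
            syscall = line.split("(", 1)[0].strip()
            yield syscall, path


def pyc_files(strace_lines):
    """Return the .pyc files mentioned in the trace, in order of first use."""
    seen = []
    for _syscall, path in absolute_paths(strace_lines):
        if path.endswith(".pyc") and path not in seen:
            seen.append(path)
    return seen
```

With the trace saved in a file (the name dnf.trace is hypothetical),
pyc_files(open("dnf.trace")) gives the candidates.  The ones opened
shortly before the traceback was written are the likeliest suspects,
though that is a hint rather than proof.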

It turned out that the damaged .pyc file was in
/usr/lib64/python3.8/__pycache__
The strace output hinted at this but didn't prove it.
I renamed the file, adding BAD to its name.
dnf now worked!
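
A more direct test than reading strace output, and not what was done
here, would be to try to unmarshal every .pyc in a suspect directory
and report the ones that fail.  The sketch below assumes it is run by
the same Python version that wrote the files (the marshal format is
version-specific) and that the files have the 16-byte header used
since Python 3.7.  It does not check the magic number or timestamps,
and the function names are made up.

```python
import marshal
import os

PYC_HEADER_SIZE = 16  # Python 3.7 and later; earlier 3.x versions used 12


def check_pyc(path):
    """Return None if the file unmarshals to a code object, else a reason."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) <= PYC_HEADER_SIZE:
        return "file too short"
    try:
        code = marshal.loads(data[PYC_HEADER_SIZE:])
    except (ValueError, EOFError, TypeError) as exc:
        return "%s: %s" % (type(exc).__name__, exc)
    if not hasattr(code, "co_code"):
        return "not a code object"
    return None


def find_bad_pycs(cache_dir):
    """Return (filename, reason) for each suspect .pyc directly in cache_dir."""
    bad = []
    for name in sorted(os.listdir(cache_dir)):
        if name.endswith(".pyc"):
            reason = check_pyc(os.path.join(cache_dir, name))
            if reason is not None:
                bad.append((name, reason))
    return bad
```

On this machine the call would have been
find_bad_pycs("/usr/lib64/python3.8/__pycache__"), run with the
system's own python3.  Only marshal and os are imported, on the
assumption that those still load on the broken system.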

I have no idea why a .pyc got corrupted.

I have no idea how an ordinary user could recover from this problem.
Even the diagnostic is unlikely to help an ordinary user.

