[GTALUG] war story: read your kernel log (dmesg) once in a while

D. Hugh Redelmeier hugh at mimosa.com
Fri Mar 19 11:24:11 EDT 2021


[One reason for this message is to test if the mailing list is working.  I 
haven't seen a new message in 10 days.]

dmesg command
=============

The dmesg command shows you the kernel log.  It reads the log from the 
kernel itself, where it is stored in a circular RAM buffer, so you can 
still read it if the normal logging system isn't working.  This buffer is 
a fixed size, so older messages get pushed out by newer ones if there is 
enough logging going on.

On Fedora you can get more information with
	journalctl -b
but the output isn't limited to kernel messages.  It does colour-code 
messages based on severity, so that's a nice plus.  Since this log 
typically goes to disk, it tends to be complete.  Oh: the -b flag means: 
show only messages from the most recent boot.  Without it, the logs can go 
back months or years.
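
To narrow journalctl down to roughly what dmesg shows, its -k flag selects 
kernel messages only and -p filters by priority.  A minimal sketch; the 
function names kernel_journal and kernel_warnings are made up for 
illustration:

```shell
# Kernel messages only, from the current boot.
# (-k already implies -b; it is spelled out here for clarity.)
kernel_journal() {
    journalctl -k -b "$@"
}

# The same, limited to priority "warning" and anything more severe.
kernel_warnings() {
    kernel_journal -p warning "$@"
}
```

Running kernel_warnings on its own should give a short, colour-coded list 
to start from; any extra arguments are passed through to journalctl.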

As an old timer, my first instinct is to use dmesg.

looking at kernel messages
==========================

dmesg | less -i

dmesg pours out a lot of lines.  less is a good way of navigating this 
log.  The -i makes searches within less case-insensitive.

The Linux kernel is meant to log problems and move on.  This means that 
there can be problems that you don't even know about because all looks 
well.  I think it pays to look for problems reported in the log once in a 
while.
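
One way to do such a scan, as a sketch: filter the log for words that 
often mark trouble.  The function name kernel_problems and the word list 
are illustrative assumptions, not an exhaustive set, so expect both false 
hits and misses.

```shell
# Read kernel log text on stdin and keep only lines containing words
# that often indicate a problem.  Case-insensitive; the word list is
# a guess and can be extended to taste.
kernel_problems() {
    grep -i -E 'fail|error|warn|disabl|corrupt|timed out'
}
```

Typical use would be
	dmesg | kernel_problems | less
The dmesg from util-linux can also filter by severity itself:
	dmesg --level=err,warn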

A lot of messages will be inscrutable.  If they intrigue you, investigate 
them.

Here's one I noticed recently on one of my systems.  It's been there for 
the whole life of the system, but I never noticed.  

[    2.545003] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    2.545482] ata1.00: supports DRM functions and may not be fully accessible
[    2.545554] ata1.00: READ LOG DMA EXT failed, trying unqueued
[    2.548363] ata1.00: disabling queued TRIM support
[    2.548370] ata1.00: ATA-9: Crucial_CT240M500SSD1, MU05, max UDMA/133
[    2.548376] ata1.00: 468862128 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
[    2.551973] ata1.00: supports DRM functions and may not be fully accessible
[    2.554812] ata1.00: disabling queued TRIM support
[    2.558006] ata1.00: configured for UDMA/133

Look at "disabling queued TRIM support".
What's that about?

- TRIM is a useful feature in SSDs.  It allows the OS to advise the SSD 
  that chunks of the filesystem are no longer needed (eg. deleted files).  
  This helps the SSD's wear-levelling firmware's garbage collector.  It 
  should help speed up the SSD and add to its lifetime.

- Even without queued TRIM, TRIM can still be accomplished.  Queued TRIM 
  is some higher-performance variant (I haven't investigated).
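
To check whether plain TRIM (the kernel calls it "discard") is still on 
offer for a drive, one option is to look at the discard_granularity 
attribute in sysfs; the kernel documents a value of 0 as meaning no 
discard support.  A minimal sketch: the function name supports_discard is 
made up, and the device name sda in the usage line is an assumption about 
your system.

```shell
# Succeed if block device $1 advertises discard (TRIM) support.
# $2 is the sysfs root; it defaults to /sys and exists only so the
# function can be pointed at a test directory.
supports_discard() {
    dev=$1
    sysfs=${2:-/sys}
    gran=$(cat "$sysfs/block/$dev/queue/discard_granularity" 2>/dev/null) || return 1
    [ "${gran:-0}" -gt 0 ]
}
```

Usage, assuming the SSD is sda:
	supports_discard sda && echo "sda supports TRIM"
The same information is shown for all drives by
	lsblk --discard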

But what's up with this message?  Googling got me to
	<https://bugzilla.kernel.org/show_bug.cgi?id=71371>
Very useful.

- apparently my SSD (a Crucial M500) had buggy firmware, leading to 
  corruption in some cases, including with queued TRIM

- Crucial released new firmware (in 2015)
	<https://www.crucial.com/support/ssd-support/m500-support>

- my SSD has this firmware (MU05), as reported in the dmesg output

- even after the update, M500s screw up queued TRIM

- the Linux kernel embeds all this wisdom and it blacklists queued TRIM on 
  my box
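
The model and firmware revision can be pulled out of the log 
mechanically.  A sketch that assumes the line layout seen in the log 
above ("ataN.NN: ATA-V: MODEL, FIRMWARE, max MODE"); the function name 
ata_firmware is made up for illustration:

```shell
# Read kernel log text on stdin and print "port model firmware" for
# each ATA identification line.  Lines in any other layout are
# silently skipped.
ata_firmware() {
    sed -n -E 's/^.*(ata[0-9]+\.[0-9]+): ATA-[0-9]+: ([^,]+), ([^,]+), max .*$/\1 \2 \3/p'
}
```

Usage:
	dmesg | ata_firmware
For the log above this prints
	ata1.00 Crucial_CT240M500SSD1 MU05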

I spent an hour investigating this.  There was no effect, except that I 
learned a few things.  Linux just does the right thing.

I do recommend also looking at journalctl output because it highlights 
things that it thinks are of particular interest.

