[GTALUG] nvme SSD: critical medium error, dev nvme0n1
Giles Orr
gilesorr at gmail.com
Mon Jul 29 15:58:24 EDT 2019
On Mon, 29 Jul 2019 at 13:28, Stewart C. Russell via talk <talk at gtalug.org>
wrote:
> I'm guessing this is bad, right?
>
> [Mon Jul 29 12:59:48 2019] print_req_error: critical medium error,
> dev nvme0n1, sector 296089600 flags 80700
> [Mon Jul 29 12:59:48 2019] print_req_error: critical medium error,
> dev nvme0n1, sector 296089744 flags 0
>
> Is it an oh-shit-get-yerself-a-new-drive-NOW thing, or …?
>
> Drive is a 2+ year old Intel 512 GB SSD. Not entirely sure what the
> right diagnostics are for SSDs. Filesystem is showing clean but touching
> certain known-bad files triggers the error in the system log.
>
> Dunno if these nvme stats are useful:
>
> Smart Log for NVME device:nvme0 namespace-id:ffffffff
> critical_warning : 0
> temperature : 25 C
> available_spare : 85%
> available_spare_threshold : 10%
> percentage_used : 1%
> data_units_read : 10,349,479
> data_units_written : 10,098,299
> host_read_commands : 183,018,841
> host_write_commands : 136,702,227
> controller_busy_time : 1,342
> power_cycles : 201
> power_on_hours : 15,722
> unsafe_shutdowns : 10
> media_errors : 803
> num_err_log_entries : 844
> Warning Temperature Time : 0
> Critical Composite Temperature Time : 0
> Thermal Management T1 Trans Count : 0
> Thermal Management T2 Trans Count : 0
> Thermal Management T1 Total Time : 0
> Thermal Management T2 Total Time : 0
>
> Any suggestions, please, for:
>
> * what I should be looking for in stats (nvme smart-log-add doesn't give
> me anything at all, so no wear-levelling stats)
>
> * a decent brand to replace it with. I'm likely okay with a SATA SSD.
>
> cheers,
> Stewart
>
The log doesn't suggest heavy use (percentage_used is only 1%), and yet
that sounds like an "oh-shit-get-yerself-a-new-drive-NOW" error to me. At
the very least, stay on top of your backups. As I understand it, when
blocks go bad on a solid state drive (hell, even on a spinning disk these
days), the drive firmware should silently remap them and you'd never even
know it happened. That you're seeing the errors at all is alarming and
suggests a fairly serious malfunction.
But ... I have no real expertise with SSDs (NVMe or otherwise). I have a
few, but none have failed, so I haven't had to learn. Ignore this
suggestion if you get advice from someone with more knowledge of those
drives.
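
For what it's worth, here is a rough sketch of how the counters in the quoted smart log could be checked mechanically. It assumes the plain "name : value" text layout shown above; the function names parse_smart_log and assess are made up for illustration, and the "at or below threshold" test is a deliberately cautious choice of mine, not something taken from a vendor document.

```python
import re

# Excerpt of the smart log quoted above (plain "name : value" lines).
SAMPLE = """\
critical_warning : 0
temperature : 25 C
available_spare : 85%
available_spare_threshold : 10%
percentage_used : 1%
data_units_read : 10,349,479
unsafe_shutdowns : 10
media_errors : 803
num_err_log_entries : 844
"""


def parse_smart_log(text):
    """Return {field name: integer value} for every 'name : number' line."""
    stats = {}
    for line in text.splitlines():
        # Name, a colon, then a number that may contain thousands separators.
        match = re.match(r"\s*([\w ]+?)\s*:\s*(\d[\d,]*)", line)
        if match:
            stats[match.group(1)] = int(match.group(2).replace(",", ""))
    return stats


def assess(stats):
    """List the fields that look worrying (hypothetical, cautious rules)."""
    findings = []
    if stats.get("critical_warning", 0) != 0:
        findings.append("critical_warning is set")
    if stats.get("media_errors", 0) > 0:
        findings.append(f"{stats['media_errors']} media errors recorded")
    spare = stats.get("available_spare")
    threshold = stats.get("available_spare_threshold")
    if spare is not None and threshold is not None and spare <= threshold:
        findings.append("available spare at or below threshold")
    return findings


if __name__ == "__main__":
    for finding in assess(parse_smart_log(SAMPLE)):
        print(finding)
```

On the numbers above, only the media_errors rule fires: the spare capacity (85%) is still far above its 10% threshold and critical_warning is 0, which is what makes a non-zero media error count stand out.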
--
Giles
https://www.gilesorr.com/
gilesorr at gmail.com