RAID without TLER

Thu Feb 21 17:51:35 UTC 2013

| From: Lennart Sorensen <lsorense-1wCw9BSqJbv44Nm34jS7GywD8/FfD2ys at public.gmane.org>

| On Wed, Feb 20, 2013 at 02:31:49PM -0500, D. Hugh Redelmeier wrote:
| > Sale price for 3T Red: 140; 3T Seagate $90; both 7200RPM; Seagate has no 
| > ERC.
| 
| I wouldn't put any data of mine on a Seagate drive.  No comparison at all.
| Compare it against something reliable.

Based on anecdotes?  Based on long-obsolete models?  Of course you
might be right.

I personally hold a grudge against Seagate for how they handled the
7200.11 firmware bug.  That's not anecdote, that's observed behaviour
of the company.

| > Just what does that mean?  Drives don't know what time of day they are
| > on.  Which spec realistically reflects this, MTBF?  Or is this another
| > market segmentation trick.
| 
| Many drives sold for desktop use are only expected to run about 8 hours
| a day 5 days a week.  They expect they will be off the rest of the time.
| 
| They claim the MTBF of the Red is 35% higher than the standard desktop
| drive (I would think that means the Blue model).
| 
| Of course my experience is that leaving a drive always on makes it
| last longer.

Thanks for trying, but you didn't answer my question.  I, cynically,
think that the 8x5 claim is just marketing FUD.

| WD did put some of the WD green features in too, to help control heat
| in small NAS enclosures, but without the annoying constant park that
| the green drives like to do.

Sometimes constant parking is timed just right to be worst-case for
Linux.  In earlier posts I've noted how some laptop disks seem to
be murdered by this.

| > Doesn't matter too much: if you need an absolute bound on latency,
| > maybe.  This has little effect on average latency since these errors
| > are very rare (or something is very wrong and needs to be fixed).
| 
| What seagate calls ERC, western digital calls TLER, and it's not something
| a RAID uses, it is something it requires.

I thought so too.  But learned otherwise.  That's part of my original
post.

I would like the RAID system to be able to use ERC to reduce the time
exactly when there is redundancy.  For example, if the RAID is
degraded, I'd like the disk hardware to try really hard to read.

But ERC isn't needed on systems without requirements on
worst-case-latency UNLESS the RAID controller cannot be told to be
patient.  I have read claims that Linux software RAID is patient.
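
For anyone who wants to experiment with "patience" on the Linux side:
the kernel exposes a per-device command timeout, in seconds, at
/sys/block/<disk>/device/timeout.  Whether raising it is all that is
needed for a drive without ERC is an assumption here, not something
verified in this thread.  The helper name set_disk_timeout and its
optional third argument are hypothetical; the third argument exists only
so the function can be tried against a scratch directory instead of the
real /sys.

```shell
# set_disk_timeout DISK SECONDS [SYSFS_ROOT]
# Write SECONDS to the kernel's command timeout for DISK (e.g. sda).
# SYSFS_ROOT defaults to /sys.  Hypothetical helper, not a standard tool.
set_disk_timeout() {
    disk=$1
    seconds=$2
    root=${3:-/sys}
    file="$root/block/$disk/device/timeout"

    # Refuse to create the file: it must already exist and be writable.
    if [ ! -w "$file" ]; then
        echo "cannot write $file" >&2
        return 1
    fi
    echo "$seconds" > "$file"
}

# Example (as root); 180 is an illustrative value, not a recommendation:
#   set_disk_timeout sda 180
```

A value written this way does not survive a reboot, so it would have to
be reapplied at boot time.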

|  If a disk decides to spend 2
| minutes trying to complete a read that is failing in the hopes that it
| just might eventually read it and then be able to remap it right away,
| then the RAID will usually drop that disk as being dead.  That's not good.
| Since you have raid (raid0 isn't raid), you would rather have the read
| fail quickly, keep the disk in the raid with one unreadable area, have
| the raid controller rewrite the bad area which lets the disk remap it.
| No RAID rebuild needed and no slowdown.

That was my understanding until I read the SmallNetBuilder article.

| > This terminology is a mess.  TLER is a WD marketing term (a good one,
| > and apparently not trademarked).  In the interest of being neutral, I
| > switched to ERC which seems to be the generic term.
| 
| ERC is a seagate marketing term and much less accurate than TLER.
| ERC sounds like something that the controller needs to talk to.
| That isn't the case.  TLER says exactly what it does.

And the ATA standard calls it SCT ERC.  So does smartmontools.
SCT == SMART Command Transport
SMART == Self-Monitoring, Analysis and Reporting Technology
ERC == Error Recovery Control

Some people claim that ERC is a non-optional part of ATA-8.  I'm too
lazy to check.

|   I think I have been
| lucky never to have a problem with the Blacks so far, probably because
| none of them have ever had a read error that took long to deal with.

Who knows?  I had a Hitachi 2.5" drive go bad recently.  I timed an
attempted read of a bad sector:

    silly@reddot:~$ sudo time dd if=/dev/sda of=/dev/null bs=512 count=1 skip=104674232
    dd: reading `/dev/sda': Input/output error
    0+0 records in
    0+0 records out
    0 bytes (0 B) copied, 19.6043 s, 0.0 kB/s
    Command exited with non-zero status 1
    0.00user 0.00system 0:19.60elapsed 0%CPU (0avgtext+0avgdata 3696maxresident)k
    8inputs+0outputs (0major+281minor)pagefaults 0swaps

It took 19 seconds.  I don't have a clue what evasive actions Linux
took: multiple reads, bus resets, or whatever.  So I don't know how
long the disk itself spent in limbo on any single attempt.

I wish that the system could query the disk "Just what are you up to?
Should I be patient?  Are we nearly there?"  Then the driver could
make more informed decisions.

"smartctl -x" shows that the drive does have SCT ERC.  Who knows if
the disk is lying (some disks do lie about some of their
capabilities).  This is what smartctl reported:

    SCT capabilities:              (0x003d) SCT Status supported.
                                            SCT Error Recovery Control supported.
                                            SCT Feature Control supported.
                                            SCT Data Table supported.

The ERC feature was not actually being used:

    SCT Error Recovery Control:
           Read: Disabled
          Write: Disabled
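
For reference, smartctl can also set these limits with its
"-l scterc,READTIME,WRITETIME" option, where both times are given in
tenths of a second.  The sketch below only prints the command rather
than running it.  The helper name erc_command is hypothetical, and the
7 second figure in the example is an assumption (a value often quoted
for TLER), not something measured on this drive.

```shell
# erc_command SECONDS DEVICE
# Print (do not run) the smartctl command that would set the SCT ERC
# read and write limits to SECONDS on DEVICE.  smartctl expects the
# limits in tenths of a second, hence the multiplication by 10.
# Hypothetical helper, not part of smartmontools.
erc_command() {
    seconds=$1
    device=$2
    deciseconds=$((seconds * 10))
    printf 'smartctl -l scterc,%d,%d %s\n' \
        "$deciseconds" "$deciseconds" "$device"
}

# Example: show the command for a 7 second limit on /dev/sda.
#   erc_command 7 /dev/sda
# Running the printed command requires root.  "smartctl -l scterc DEVICE"
# with no times reports the current setting.
```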