From: Chris Murphy <lists@colorremedies.com>
To: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
Cc: Chris Murphy <lists@colorremedies.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>,
	linux-lvm@redhat.com, Linux-RAID <linux-raid@vger.kernel.org>,
	linux-scsi@vger.kernel.org, linux-usb@vger.kernel.org
Subject: Re: Add udev-md-raid-safe-timeouts.rules
Date: Mon, 16 Apr 2018 11:10:16 -0600
Message-ID: <CAJCQCtRzmBys+eYsd=zsAK1deYQt47nysHQBdn3CreOmObz59g@mail.gmail.com>
In-Reply-To: <5425366f-f339-d6f3-26d1-d02c3ba80671@gmail.com>

Adding linux-usb@ and linux-scsi@
(This email quotes the email that started the thread, but some replies
are on the other lists.)

On Mon, Apr 16, 2018 at 5:43 AM, Austin S. Hemmelgarn
<ahferroin7@gmail.com> wrote:
> On 2018-04-15 21:04, Chris Murphy wrote:
>>
>> I just ran into this:
>>
>> https://github.com/neilbrown/mdadm/pull/32/commits/af1ddca7d5311dfc9ed60a5eb6497db1296f1bec
>>
>> This solution is inadequate. Can it be made more generic? This isn't
>> an md-specific problem; it affects Btrfs and LVM as well, and in fact
>> raid0 and even non-raid setups.
>>
>> There is no good reason to prevent deep recovery, which is what
>> happens with the default command timer of 30 seconds, with this class
>> of drive. Basically that value is going to cause data loss for the
>> single device and also raid0 case, where the reset happens before deep
>> recovery has a chance. And even if deep recovery fails to return user
>> data, what we need to see is the proper error message: read error UNC,
>> rather than a link reset message which just obfuscates the problem.
>
>
> This has been discussed at least once here before (probably more times, hard
> to be sure since it usually comes up as a side discussion in an only
> marginally related thread).  Last I knew, the consensus here was that it
> needs to be changed upstream in the kernel, not by adding a udev rule
> because while the value is technically system policy, the default policy is
> brain-dead for anything but the original disks it was intended for (30
> seconds works perfectly fine for actual SCSI devices because they behave
> sanely in the face of media errors, but it's horribly inadequate for ATA
> devices).
>
> To re-iterate what I've said before on the subject:
>
> For ATA drives it should probably be 150 seconds.  That's 30 seconds beyond
> the typical amount of time most consumer drives will keep retrying a sector,
> so even if it goes the full time to try and recover a sector this shouldn't
> trigger.  The only people this change should negatively impact are those who
> have failing drives which support SCT ERC and have it enabled, but aren't
> already adjusting this timeout.
>
> For physical SCSI devices, it should continue to be 30 seconds.  SCSI disks
> are sensible here and don't waste your time trying to recover a sector.  For
> PV-SCSI devices, it should probably be adjusted too, but I don't know what a
> reasonable value is.
>
> For USB devices it should probably be higher than 30 seconds, but again I
> have no idea what a reasonable value is.

I don't know how all of this is designed, but it seems like there's
only one location for the command timer, and the SCSI driver owns it.
Everything else (ATA, USB, and for all I know SAN) sits on top of that
and lacks any ability to have a separate timeout.
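
For reference, a minimal sketch of where that single timer is exposed
to userspace. On Linux it appears per device as
/sys/block/<dev>/device/timeout, in seconds. The function names and the
sysfs_root parameter are hypothetical, added only so the sketch can be
tried against a scratch directory; writing the real file needs root.

```python
from pathlib import Path


def timeout_path(device: str, sysfs_root: str = "/sys") -> Path:
    """Return the sysfs file holding the SCSI command timer for a block device."""
    return Path(sysfs_root) / "block" / device / "device" / "timeout"


def read_command_timeout(device: str, sysfs_root: str = "/sys") -> int:
    """Read the current command timer, in seconds."""
    return int(timeout_path(device, sysfs_root).read_text().strip())


def write_command_timeout(device: str, seconds: int, sysfs_root: str = "/sys") -> None:
    """Set the command timer, in seconds (requires root on a real system)."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    timeout_path(device, sysfs_root).write_text(f"{seconds}\n")
```

Whatever transport sits underneath (ATA, USB), this sketch only ever
touches that one per-device value, which is the limitation described
above.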

The nice thing about the udev rule is that it tests for SCT ERC before
making a change. There certainly are enterprise and near-enterprise
"NAS" SATA drives that have short SCT ERC times enabled out of the box,
and the udev method leaves them untouched.
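
The test-before-change idea can be sketched roughly as below. This is
not the rule from the mdadm pull request, only an illustration. It
assumes the text report printed by `smartctl -l scterc <device>`, where
enabled limits show up as lines like "Read:     70 (7.0 seconds)" with
the value in tenths of a second; that layout is an assumption about
smartctl's output, and the function name is hypothetical. The 150
second figure is the one Austin suggests in the quoted text.

```python
import re

# Matches lines such as "Read:     70 (7.0 seconds)"; the number is in
# deciseconds. "Read: Disabled" deliberately does not match.
_ERC_LINE = re.compile(r"^\s*(Read|Write):\s+(\d+)\s+\(", re.MULTILINE)


def choose_command_timeout(scterc_report: str,
                           default_s: int = 30,
                           long_s: int = 150) -> int:
    """Pick a command timer (seconds) from an SCT ERC report.

    If the drive reports both a read and a write recovery limit and both
    are shorter than the default timer, the default is kept. Otherwise
    (ERC disabled, unsupported, or longer than the timer) the long value
    is returned so that deep recovery can finish before a link reset.
    """
    limits = {kind: int(value) / 10
              for kind, value in _ERC_LINE.findall(scterc_report)}
    if {"Read", "Write"} <= set(limits) and max(limits.values()) < default_s:
        return default_s
    return long_s
```

A drive shipping with 7 second limits keeps the 30 second timer, while
a drive with SCT ERC disabled or unsupported gets the longer one.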


-- 
Chris Murphy
