All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Revert "dm mpath: fix stalls when handling invalid ioctls"
@ 2015-10-29 12:24 Mauricio Faria de Oliveira
  2015-10-29 12:33 ` Mauricio Faria de Oliveira
  2015-10-29 13:18 ` IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"] Mike Snitzer
  0 siblings, 2 replies; 23+ messages in thread
From: Mauricio Faria de Oliveira @ 2015-10-29 12:24 UTC (permalink / raw)
  To: dm-devel; +Cc: snitzer

This reverts commit a1989b330093578ea5470bea0a00f940c444c466.

That commit introduced a regression at least for the case of the SG_IO ioctl()
running without CAP_SYS_RAWIO capability (e.g., unprivileged users) when there
are no active paths: the ioctl() fails with the ENOTTY errno immediately rather
than blocking due to queue_if_no_path until a path becomes active, for example.

That case happens to be exercised by QEMU KVM guests with 'scsi-block' devices
(qemu "-device scsi-block" [1], libvirt "<disk type='block' device='lun'>" [2])
from multipath devices; which leads to SCSI/filesystem errors in such a guest.

More general scenarios can hit that regression too. The following demonstration
employs a SG_IO ioctl() with a standard SCSI INQUIRY command for this objective
(some output & user changes omitted for brevity and comments added for clarity).

Reverting that commit restores normal operation (queueing) in failing scenarios;
tested on linux-next (next-20151022).

1) Test-case is based on sg_simple0 [3] (just SG_IO; remove SG_GET_VERSION_NUM)

    $ cat sg_simple0.c
    ... see [3] ...
    $ sed '/SG_GET_VERSION_NUM/,/}/d' sg_simple0.c > sgio_inquiry.c
    $ gcc sgio_inquiry.c -o sgio_inquiry

2) The ioctl() works fine with active paths present.

    # multipath -l 85ag56
    85ag56 (...) dm-19 IBM     ,2145
    size=60G features='1 queue_if_no_path' hwhandler='0' wp=rw
    |-+- policy='service-time 0' prio=0 status=active
    | |- 8:0:11:0  sdz  65:144  active undef running
    | `- 9:0:9:0   sdbf 67:144  active undef running
    `-+- policy='service-time 0' prio=0 status=enabled
      |- 8:0:12:0  sdae 65:224  active undef running
      `- 9:0:12:0  sdbo 68:32   active undef running

    $ ./sgio_inquiry /dev/mapper/85ag56
    Some of the INQUIRY command's response:
        IBM       2145              0000
    INQUIRY duration=0 millisecs, resid=0

3) The ioctl() fails with ENOTTY errno with _no_ active paths present,
   for unprivileged users (rather than blocking due to queue_if_no_path).

    # for path in $(multipath -l 85ag56 | grep -o 'sd[a-z]\+'); \
          do multipathd -k"fail path $path"; done

    # multipath -l 85ag56
    85ag56 (...) dm-19 IBM     ,2145
    size=60G features='1 queue_if_no_path' hwhandler='0' wp=rw
    |-+- policy='service-time 0' prio=0 status=enabled
    | |- 8:0:11:0  sdz  65:144  failed undef running
    | `- 9:0:9:0   sdbf 67:144  failed undef running
    `-+- policy='service-time 0' prio=0 status=enabled
      |- 8:0:12:0  sdae 65:224  failed undef running
      `- 9:0:12:0  sdbo 68:32   failed undef running

    $ ./sgio_inquiry /dev/mapper/85ag56
    sg_simple0: Inquiry SG_IO ioctl error: Inappropriate ioctl for device

4) dmesg shows that scsi_verify_blk_ioctl() failed for SG_IO (0x2285);
   it returns -ENOIOCTLCMD, later replaced with -ENOTTY in vfs_ioctl().

    $ dmesg
    <...>
    [] device-mapper: multipath: Failing path 65:144.
    [] device-mapper: multipath: Failing path 67:144.
    [] device-mapper: multipath: Failing path 65:224.
    [] device-mapper: multipath: Failing path 68:32.
    [] sgio_inquiry: sending ioctl 2285 to a partition!

5) The ioctl() only works if the SYS_CAP_RAWIO capability is present
   (then queueing happens -- in this example, queue_if_no_path is set);
   this is due to a conditional check in scsi_verify_blk_ioctl().

    # capsh --drop=cap_sys_rawio -- -c './sgio_inquiry /dev/mapper/85ag56'
    sg_simple0: Inquiry SG_IO ioctl error: Inappropriate ioctl for device

    # ./sgio_inquiry /dev/mapper/85ag56 &
    [1] 72830

    # cat /proc/72830/stack
    [<c00000171c0df700>] 0xc00000171c0df700
    [<c000000000015934>] __switch_to+0x204/0x350
    [<c000000000152d4c>] msleep+0x5c/0x80
    [<c00000000077dfb0>] dm_blk_ioctl+0x70/0x170
    [<c000000000487c40>] blkdev_ioctl+0x2b0/0x9b0
    [<c0000000003128e4>] block_ioctl+0x64/0xd0
    [<c0000000002dd3b0>] do_vfs_ioctl+0x490/0x780
    [<c0000000002dd774>] SyS_ioctl+0xd4/0xf0
    [<c000000000009358>] system_call+0x38/0xd0

6) This is the function call chain exercised in this analysis:

SYSCALL_DEFINE3(ioctl, <...>) @ fs/ioctl.c
    -> do_vfs_ioctl()
        -> vfs_ioctl()
            ...
            error = filp->f_op->unlocked_ioctl(filp, cmd, arg);
            ...
                -> dm_blk_ioctl() @ drivers/md/dm.c
                    -> multipath_ioctl() @ drivers/md/dm-mpath.c
                        ...
                        (bdev = NULL, due to no active paths)
                        ...
                        if (!bdev || <...>) {
                            int err = scsi_verify_blk_ioctl(NULL, cmd);
                            if (err)
                                r = err;
                        }
                        ...
                            -> scsi_verify_blk_ioctl() @ block/scsi_ioctl.c
                                ...
                                if (bd && bd == bd->bd_contains) // not taken (bd = NULL)
                                    return 0;
                                ...
                                if (capable(CAP_SYS_RAWIO)) // not taken (unprivileged user)
                                    return 0;
                                ...
                                printk_ratelimited(KERN_WARNING
                                           "%s: sending ioctl %x to a partition!\n" <...>);

                                return -ENOIOCTLCMD;
                            <-
                        ...
                        return r ? : <...>
                    <-
            ...
            if (error == -ENOIOCTLCMD)
                error = -ENOTTY;
             out:
                return error;
            ...

Links:
[1] http://git.qemu.org/?p=qemu.git;a=commit;h=336a6915bc7089fb20fea4ba99972ad9a97c5f52
[2] https://libvirt.org/formatdomain.html#elementsDisks (see 'disk' -> 'device')
[3] http://tldp.org/HOWTO/SCSI-Generic-HOWTO/pexample.html (Revision 1.2, 2002-05-03)

Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
---
 drivers/md/dm-mpath.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 5a67671..bdc96cd 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1569,11 +1569,8 @@ static int multipath_ioctl(struct dm_target *ti, unsigned int cmd,
 	/*
 	 * Only pass ioctls through if the device sizes match exactly.
 	 */
-	if (!bdev || ti->len != i_size_read(bdev->bd_inode) >> SECTOR_SHIFT) {
-		int err = scsi_verify_blk_ioctl(NULL, cmd);
-		if (err)
-			r = err;
-	}
+	if (!r && ti->len != i_size_read(bdev->bd_inode) >> SECTOR_SHIFT)
+		r = scsi_verify_blk_ioctl(NULL, cmd);
 
 	if (r == -ENOTCONN && !fatal_signal_pending(current)) {
 		spin_lock_irqsave(&m->lock, flags);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] Revert "dm mpath: fix stalls when handling invalid ioctls"
  2015-10-29 12:24 [PATCH] Revert "dm mpath: fix stalls when handling invalid ioctls" Mauricio Faria de Oliveira
@ 2015-10-29 12:33 ` Mauricio Faria de Oliveira
  2015-10-29 13:18 ` IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"] Mike Snitzer
  1 sibling, 0 replies; 23+ messages in thread
From: Mauricio Faria de Oliveira @ 2015-10-29 12:33 UTC (permalink / raw)
  To: dm-devel; +Cc: snitzer

On 10/29/2015 10:24 AM, Mauricio Faria de Oliveira wrote:
> This reverts commit a1989b330093578ea5470bea0a00f940c444c466.

Cc'ing Hannes manually (-cc failed).


-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 23+ messages in thread

* IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-10-29 12:24 [PATCH] Revert "dm mpath: fix stalls when handling invalid ioctls" Mauricio Faria de Oliveira
  2015-10-29 12:33 ` Mauricio Faria de Oliveira
@ 2015-10-29 13:18 ` Mike Snitzer
  2015-10-29 14:47   ` [dm-devel] " Mauricio Faria de Oliveira
  2015-10-31 15:33   ` Paolo Bonzini
  1 sibling, 2 replies; 23+ messages in thread
From: Mike Snitzer @ 2015-10-29 13:18 UTC (permalink / raw)
  To: Mauricio Faria de Oliveira; +Cc: dm-devel, hare, linux-scsi

On Thu, Oct 29 2015 at  8:24am -0400,
Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> wrote:

> This reverts commit a1989b330093578ea5470bea0a00f940c444c466.
> 
> That commit introduced a regression at least for the case of the SG_IO ioctl()
> running without CAP_SYS_RAWIO capability (e.g., unprivileged users) when there
> are no active paths: the ioctl() fails with the ENOTTY errno immediately rather
> than blocking due to queue_if_no_path until a path becomes active, for example.
> 
> That case happens to be exercised by QEMU KVM guests with 'scsi-block' devices
> (qemu "-device scsi-block" [1], libvirt "<disk type='block' device='lun'>" [2])
> from multipath devices; which leads to SCSI/filesystem errors in such a guest.
> 
> More general scenarios can hit that regression too. The following demonstration
> employs a SG_IO ioctl() with a standard SCSI INQUIRY command for this objective
> (some output & user changes omitted for brevity and comments added for clarity).
> 
> Reverting that commit restores normal operation (queueing) in failing scenarios;
> tested on linux-next (next-20151022).

Obviously !bdev is causing you to take the branch you'd like to change
via revert.  Once in that branch scsi_verify_blk_ioctl() is called.  If
the ioctl is valid the ioctl is allowed to carry on (regardless of
whether there are paths or not).
 
> 1) Test-case is based on sg_simple0 [3] (just SG_IO; remove SG_GET_VERSION_NUM)
> 
>     $ cat sg_simple0.c
>     ... see [3] ...
>     $ sed '/SG_GET_VERSION_NUM/,/}/d' sg_simple0.c > sgio_inquiry.c
>     $ gcc sgio_inquiry.c -o sgio_inquiry
> 
> 2) The ioctl() works fine with active paths present.
> 
>     # multipath -l 85ag56
>     85ag56 (...) dm-19 IBM     ,2145
>     size=60G features='1 queue_if_no_path' hwhandler='0' wp=rw
>     |-+- policy='service-time 0' prio=0 status=active
>     | |- 8:0:11:0  sdz  65:144  active undef running
>     | `- 9:0:9:0   sdbf 67:144  active undef running
>     `-+- policy='service-time 0' prio=0 status=enabled
>       |- 8:0:12:0  sdae 65:224  active undef running
>       `- 9:0:12:0  sdbo 68:32   active undef running
> 
>     $ ./sgio_inquiry /dev/mapper/85ag56
>     Some of the INQUIRY command's response:
>         IBM       2145              0000
>     INQUIRY duration=0 millisecs, resid=0
> 
> 3) The ioctl() fails with ENOTTY errno with _no_ active paths present,
>    for unprivileged users (rather than blocking due to queue_if_no_path).
> 
>     # for path in $(multipath -l 85ag56 | grep -o 'sd[a-z]\+'); \
>           do multipathd -k"fail path $path"; done
> 
>     # multipath -l 85ag56
>     85ag56 (...) dm-19 IBM     ,2145
>     size=60G features='1 queue_if_no_path' hwhandler='0' wp=rw
>     |-+- policy='service-time 0' prio=0 status=enabled
>     | |- 8:0:11:0  sdz  65:144  failed undef running
>     | `- 9:0:9:0   sdbf 67:144  failed undef running
>     `-+- policy='service-time 0' prio=0 status=enabled
>       |- 8:0:12:0  sdae 65:224  failed undef running
>       `- 9:0:12:0  sdbo 68:32   failed undef running
> 
>     $ ./sgio_inquiry /dev/mapper/85ag56
>     sg_simple0: Inquiry SG_IO ioctl error: Inappropriate ioctl for device
> 
> 4) dmesg shows that scsi_verify_blk_ioctl() failed for SG_IO (0x2285);
>    it returns -ENOIOCTLCMD, later replaced with -ENOTTY in vfs_ioctl().
> 
>     $ dmesg
>     <...>
>     [] device-mapper: multipath: Failing path 65:144.
>     [] device-mapper: multipath: Failing path 67:144.
>     [] device-mapper: multipath: Failing path 65:224.
>     [] device-mapper: multipath: Failing path 68:32.
>     [] sgio_inquiry: sending ioctl 2285 to a partition!

So scsi_verify_blk_ioctl() considers the ioctl invalid.

> 5) The ioctl() only works if the SYS_CAP_RAWIO capability is present
>    (then queueing happens -- in this example, queue_if_no_path is set);
>    this is due to a conditional check in scsi_verify_blk_ioctl().
> 
>     # capsh --drop=cap_sys_rawio -- -c './sgio_inquiry /dev/mapper/85ag56'
>     sg_simple0: Inquiry SG_IO ioctl error: Inappropriate ioctl for device
> 
>     # ./sgio_inquiry /dev/mapper/85ag56 &
>     [1] 72830

Sorry but I fail to see why your request to revert is viable.
Commit a1989b330093578ea5470bea0a00f940c444c466 fixes issues seen with
udevd (namely "udevd[]: worker [] unexpectedly returned with status 0x0100")

Wouldn't the correct fix be to train scsi_verify_blk_ioctl() to work
even without SYS_CAP_RAWIO?

I'm not doubting that the commit caused problems for the case you care
about (unprivledged users issueing ioctls) but that is an entirely
different issue that needs to be discussed more directly (with the
broader linux-scsi community, now cc'd).

Mike


p.s. leaving full context for linux-scsi so they don't need to track
down other mails (btw, thanks for the detailed patch header but it
enabled me to be skeptical of your request to revert):

> 6) This is the function call chain exercised in this analysis:
> 
> SYSCALL_DEFINE3(ioctl, <...>) @ fs/ioctl.c
>     -> do_vfs_ioctl()
>         -> vfs_ioctl()
>             ...
>             error = filp->f_op->unlocked_ioctl(filp, cmd, arg);
>             ...
>                 -> dm_blk_ioctl() @ drivers/md/dm.c
>                     -> multipath_ioctl() @ drivers/md/dm-mpath.c
>                         ...
>                         (bdev = NULL, due to no active paths)
>                         ...
>                         if (!bdev || <...>) {
>                             int err = scsi_verify_blk_ioctl(NULL, cmd);
>                             if (err)
>                                 r = err;
>                         }
>                         ...
>                             -> scsi_verify_blk_ioctl() @ block/scsi_ioctl.c
>                                 ...
>                                 if (bd && bd == bd->bd_contains) // not taken (bd = NULL)
>                                     return 0;
>                                 ...
>                                 if (capable(CAP_SYS_RAWIO)) // not taken (unprivileged user)
>                                     return 0;
>                                 ...
>                                 printk_ratelimited(KERN_WARNING
>                                            "%s: sending ioctl %x to a partition!\n" <...>);
> 
>                                 return -ENOIOCTLCMD;
>                             <-
>                         ...
>                         return r ? : <...>
>                     <-
>             ...
>             if (error == -ENOIOCTLCMD)
>                 error = -ENOTTY;
>              out:
>                 return error;
>             ...
> 
> Links:
> [1] http://git.qemu.org/?p=qemu.git;a=commit;h=336a6915bc7089fb20fea4ba99972ad9a97c5f52
> [2] https://libvirt.org/formatdomain.html#elementsDisks (see 'disk' -> 'device')
> [3] http://tldp.org/HOWTO/SCSI-Generic-HOWTO/pexample.html (Revision 1.2, 2002-05-03)
> 
> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
> ---
>  drivers/md/dm-mpath.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index 5a67671..bdc96cd 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -1569,11 +1569,8 @@ static int multipath_ioctl(struct dm_target *ti, unsigned int cmd,
>  	/*
>  	 * Only pass ioctls through if the device sizes match exactly.
>  	 */
> -	if (!bdev || ti->len != i_size_read(bdev->bd_inode) >> SECTOR_SHIFT) {
> -		int err = scsi_verify_blk_ioctl(NULL, cmd);
> -		if (err)
> -			r = err;
> -	}
> +	if (!r && ti->len != i_size_read(bdev->bd_inode) >> SECTOR_SHIFT)
> +		r = scsi_verify_blk_ioctl(NULL, cmd);
>  
>  	if (r == -ENOTCONN && !fatal_signal_pending(current)) {
>  		spin_lock_irqsave(&m->lock, flags);
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dm-devel] IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-10-29 13:18 ` IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"] Mike Snitzer
@ 2015-10-29 14:47   ` Mauricio Faria de Oliveira
  2015-10-31 15:33   ` Paolo Bonzini
  1 sibling, 0 replies; 23+ messages in thread
From: Mauricio Faria de Oliveira @ 2015-10-29 14:47 UTC (permalink / raw)
  To: snitzer; +Cc: device-mapper development, linux-scsi

Hi Mike,

On 10/29/2015 11:18 AM, Mike Snitzer wrote:
> Sorry but I fail to see why your request to revert is viable.

No problem. Thanks for moving this for a discussion on a proper fix.

I'm somewhat new to kernel and SCSI workings and its community process,
so that's certainly appreciated.

> Wouldn't the correct fix be to train scsi_verify_blk_ioctl() to work
> even without SYS_CAP_RAWIO?

I see it would fix the problem as well, but I don't happen to know
all of its usages, so I'd have to defer to question to someone who
knows more of it.

> I'm not doubting that the commit caused problems for the case you care
> about (unprivledged users issueing ioctls) but that is an entirely
> different issue that needs to be discussed more directly (with the
> broader linux-scsi community, now cc'd).

Sure. Thanks for opening the discussion.

> p.s. leaving full context for linux-scsi so they don't need to track
> down other mails (btw, thanks for the detailed patch header but it
> enabled me to be skeptical of your request to revert):

You're welcome. If it's been useful for rejecting this patch and
getting a better one later, it's worth it. :)

Kind regards,

-- 
Mauricio Faria de Oliveira
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-10-29 13:18 ` IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"] Mike Snitzer
  2015-10-29 14:47   ` [dm-devel] " Mauricio Faria de Oliveira
@ 2015-10-31 15:33   ` Paolo Bonzini
  2015-10-31 18:13     ` Mike Snitzer
  1 sibling, 1 reply; 23+ messages in thread
From: Paolo Bonzini @ 2015-10-31 15:33 UTC (permalink / raw)
  To: Mike Snitzer, Mauricio Faria de Oliveira; +Cc: dm-devel, hare, linux-scsi



On 29/10/2015 14:18, Mike Snitzer wrote:
> > 4) dmesg shows that scsi_verify_blk_ioctl() failed for SG_IO (0x2285);
> >    it returns -ENOIOCTLCMD, later replaced with -ENOTTY in vfs_ioctl().
> > 
> >     $ dmesg
> >     <...>
> >     [] device-mapper: multipath: Failing path 65:144.
> >     [] device-mapper: multipath: Failing path 67:144.
> >     [] device-mapper: multipath: Failing path 65:224.
> >     [] device-mapper: multipath: Failing path 68:32.
> >     [] sgio_inquiry: sending ioctl 2285 to a partition!
> 
> So scsi_verify_blk_ioctl() considers the ioctl invalid.

But that's wrong, I think.  It's a false positive in
scsi_verify_blk_ioctl().

If the ioctl is valid when bdev becomes non-NULL (and it will be if
ti->len becomes equal to i_size_read(bdev->bd_inode) >> SECTOR_SHIFT),
you should not return -ENOIOCTLCMD aka ENOTTY, because userspace doesn't
think the ioctls can go away and come back.  So Hannes's patch broke the
userspace ABI. :(

Paolo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-10-31 15:33   ` Paolo Bonzini
@ 2015-10-31 18:13     ` Mike Snitzer
  2015-10-31 18:36       ` Mike Snitzer
  2015-10-31 19:07       ` Paolo Bonzini
  0 siblings, 2 replies; 23+ messages in thread
From: Mike Snitzer @ 2015-10-31 18:13 UTC (permalink / raw)
  To: Paolo Bonzini, hch; +Cc: Mauricio Faria de Oliveira, dm-devel, hare, linux-scsi

On Sat, Oct 31 2015 at 11:33am -0400,
Paolo Bonzini <pbonzini@redhat.com> wrote:

> 
> 
> On 29/10/2015 14:18, Mike Snitzer wrote:
> > > 4) dmesg shows that scsi_verify_blk_ioctl() failed for SG_IO (0x2285);
> > >    it returns -ENOIOCTLCMD, later replaced with -ENOTTY in vfs_ioctl().
> > > 
> > >     $ dmesg
> > >     <...>
> > >     [] device-mapper: multipath: Failing path 65:144.
> > >     [] device-mapper: multipath: Failing path 67:144.
> > >     [] device-mapper: multipath: Failing path 65:224.
> > >     [] device-mapper: multipath: Failing path 68:32.
> > >     [] sgio_inquiry: sending ioctl 2285 to a partition!
> > 
> > So scsi_verify_blk_ioctl() considers the ioctl invalid.
> 
> But that's wrong, I think.  It's a false positive in
> scsi_verify_blk_ioctl().
> 
> If the ioctl is valid when bdev becomes non-NULL (and it will be if
> ti->len becomes equal to i_size_read(bdev->bd_inode) >> SECTOR_SHIFT),
> you should not return -ENOIOCTLCMD aka ENOTTY, because userspace doesn't
> think the ioctls can go away and come back.  So Hannes's patch broke the
> userspace ABI. :(

Huh?  All that Hannes' patch did was add early verification of the ioctl
if there are no paths, since: there is no point queueing an ioctl that
is invalid.

But looking just now, Christoph's recent ioctl refactoring that I staged
for 4.4 does seem to subtley revert Hannes' change:
https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.4&id=40cf639be1db8cc2b8183fe2ccd390ca77b90396
With hch's change multipath_prepare_ioctl() will _not_ do early
verification of the ioctl if no paths are available (and
queue_if_no_path is configured).  Because the call to
scsi_verify_blk_ioctl() was moved to dm_blk_ioctl() and is only called
if the return is > 0 (again -ENOTCONN is being returned).

Not to mention hch's lifting of scsi_verify_blk_ioctl() into DM core's
dm_blk_ioctl() -- likely motivated by not requiring all targets to do
the call like they were doing -- should really be done as part of the
new DM target .prepare_ioctl hook.

Christoph, I think your DM ioctl changes need more review/work.. which
implies they'll very likely miss 4.4.. sorry.

Anyway, all DM specifics aside:

The point is scsi_verify_blk_ioctl() is saying the ioctl isn't valid.
It has nothing to do with the existance of a bdev or not; but everything
to do with the unprivledged user's request to issue an ioctl.

Paolo, AFAIK unprivledged ioctls is one of your pet-projects so your
insight on what, if anything, needs changing to support them is the
insight I think we need.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-10-31 18:13     ` Mike Snitzer
@ 2015-10-31 18:36       ` Mike Snitzer
  2015-10-31 19:07       ` Paolo Bonzini
  1 sibling, 0 replies; 23+ messages in thread
From: Mike Snitzer @ 2015-10-31 18:36 UTC (permalink / raw)
  To: hch; +Cc: Mauricio Faria de Oliveira, dm-devel, hare, linux-scsi, Paolo Bonzini

On Sat, Oct 31 2015 at  2:13P -0400,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Sat, Oct 31 2015 at 11:33am -0400,
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> > 
> > 
> > On 29/10/2015 14:18, Mike Snitzer wrote:
> > > > 4) dmesg shows that scsi_verify_blk_ioctl() failed for SG_IO (0x2285);
> > > >    it returns -ENOIOCTLCMD, later replaced with -ENOTTY in vfs_ioctl().
> > > > 
> > > >     $ dmesg
> > > >     <...>
> > > >     [] device-mapper: multipath: Failing path 65:144.
> > > >     [] device-mapper: multipath: Failing path 67:144.
> > > >     [] device-mapper: multipath: Failing path 65:224.
> > > >     [] device-mapper: multipath: Failing path 68:32.
> > > >     [] sgio_inquiry: sending ioctl 2285 to a partition!
> > > 
> > > So scsi_verify_blk_ioctl() considers the ioctl invalid.
> > 
> > But that's wrong, I think.  It's a false positive in
> > scsi_verify_blk_ioctl().
> > 
> > If the ioctl is valid when bdev becomes non-NULL (and it will be if
> > ti->len becomes equal to i_size_read(bdev->bd_inode) >> SECTOR_SHIFT),
> > you should not return -ENOIOCTLCMD aka ENOTTY, because userspace doesn't
> > think the ioctls can go away and come back.  So Hannes's patch broke the
> > userspace ABI. :(
> 
> Huh?  All that Hannes' patch did was add early verification of the ioctl
> if there are no paths, since: there is no point queueing an ioctl that
> is invalid.
> 
> But looking just now, Christoph's recent ioctl refactoring that I staged
> for 4.4 does seem to subtley revert Hannes' change:
> https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.4&id=40cf639be1db8cc2b8183fe2ccd390ca77b90396
> With hch's change multipath_prepare_ioctl() will _not_ do early
> verification of the ioctl if no paths are available (and
> queue_if_no_path is configured).  Because the call to
> scsi_verify_blk_ioctl() was moved to dm_blk_ioctl() and is only called
> if the return is > 0 (again -ENOTCONN is being returned).
> 
> Not to mention hch's lifting of scsi_verify_blk_ioctl() into DM core's
> dm_blk_ioctl() -- likely motivated by not requiring all targets to do
> the call like they were doing -- should really be done as part of the
> new DM target .prepare_ioctl hook.
> 
> Christoph, I think your DM ioctl changes need more review/work.. which
> implies they'll very likely miss 4.4.. sorry.

This patch will maintain Hannes' commit a1989b3300935 (which I think is
correct!):

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index aaa6caa..ffea28f 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1562,6 +1562,16 @@ static int multipath_prepare_ioctl(struct dm_target *ti,
 
 	spin_unlock_irqrestore(&m->lock, flags);
 
+	/*
+	 * Only pass ioctls through if the device sizes match exactly.
+	 */
+	if (!*bdev || ti->len != i_size_read((*bdev)->bd_inode) >> SECTOR_SHIFT) {
+		/* not deferring to DM core to verify the ioctl */
+		int err = scsi_verify_blk_ioctl(NULL, cmd);
+		if (err)
+			r = err;
+	}
+
 	if (r == -ENOTCONN && !fatal_signal_pending(current)) {
 		spin_lock_irqsave(&m->lock, flags);
 		if (!m->current_pg) {
@@ -1574,11 +1584,6 @@ static int multipath_prepare_ioctl(struct dm_target *ti,
 		dm_table_run_md_queue_async(m->ti->table);
 	}
 
-	/*
-	 * Only pass ioctls through if the device sizes match exactly.
-	 */
-	if (!r && ti->len != i_size_read((*bdev)->bd_inode) >> SECTOR_SHIFT)
-		return 1;
 	return r;
 }
 
Christoph, I've folded this into your original commit 40cf639be1d I
referenced above, new commit is here:
https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.4&id=21a2807bc3ff0eec3e2ec35357a4c37d4bcbfd5b

But if you, Hannes or others disagree with this change I'll drop it for
4.4 and we'll have to revisit this later.

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-10-31 18:13     ` Mike Snitzer
  2015-10-31 18:36       ` Mike Snitzer
@ 2015-10-31 19:07       ` Paolo Bonzini
  2015-10-31 22:47         ` Mike Snitzer
  1 sibling, 1 reply; 23+ messages in thread
From: Paolo Bonzini @ 2015-10-31 19:07 UTC (permalink / raw)
  To: Mike Snitzer, hch; +Cc: Mauricio Faria de Oliveira, dm-devel, hare, linux-scsi



On 31/10/2015 19:13, Mike Snitzer wrote:
> > But that's wrong, I think.  It's a false positive in
> > scsi_verify_blk_ioctl().
> > 
> > If the ioctl is valid when bdev becomes non-NULL (and it will be if
> > ti->len becomes equal to i_size_read(bdev->bd_inode) >> SECTOR_SHIFT),
> > you should not return -ENOIOCTLCMD aka ENOTTY, because userspace doesn't
> > think the ioctls can go away and come back.  So Hannes's patch broke the
> > userspace ABI. :(
> 
> Huh?  All that Hannes' patch did was add early verification of the ioctl
> if there are no paths, since: there is no point queueing an ioctl that
> is invalid.
> 
> [snip discussion of Christoph's patches]
> 
> The point is scsi_verify_blk_ioctl() is saying the ioctl isn't valid.
> It has nothing to do with the existance of a bdev or not; but everything
> to do with the unprivledged user's request to issue an ioctl.

... but the call is skipped (and all ioctls are valid) if ti->len ==
i_size_read(bdev->bd_inode) >> SECTOR_SHIFT.  Therefore, until you have
the bdev you don't know which ioctls are valid, and you must assume all
of them are.  You can't do anything unsafe anyway until you have the
bdev.  This is the reasoning prior to Hannes's change.

Afterwards, you end up calling scsi_verify_blk_ioctl() when bdev ==
NULL.  If the future bdev satisfies the above condition on ti->len, this
means that ioctl(SG_IO) switches from ENOTTY to available.  Userspace is
clearly not expecting that.

> Paolo, AFAIK unprivledged ioctls is one of your pet-projects so your
> insight on what, if anything, needs changing to support them is the
> insight I think we need.

I hope the above provides some extra information.

Paolo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-10-31 19:07       ` Paolo Bonzini
@ 2015-10-31 22:47         ` Mike Snitzer
  2015-11-02  7:28           ` Hannes Reinecke
  2015-11-02  9:55           ` Paolo Bonzini
  0 siblings, 2 replies; 23+ messages in thread
From: Mike Snitzer @ 2015-10-31 22:47 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: hch, Mauricio Faria de Oliveira, dm-devel, hare, linux-scsi

On Sat, Oct 31 2015 at  3:07pm -0400,
Paolo Bonzini <pbonzini@redhat.com> wrote:

> 
> 
> On 31/10/2015 19:13, Mike Snitzer wrote:
> > > But that's wrong, I think.  It's a false positive in
> > > scsi_verify_blk_ioctl().
> > > 
> > > If the ioctl is valid when bdev becomes non-NULL (and it will be if
> > > ti->len becomes equal to i_size_read(bdev->bd_inode) >> SECTOR_SHIFT),
> > > you should not return -ENOIOCTLCMD aka ENOTTY, because userspace doesn't
> > > think the ioctls can go away and come back.  So Hannes's patch broke the
> > > userspace ABI. :(
> > 
> > Huh?  All that Hannes' patch did was add early verification of the ioctl
> > if there are no paths, since: there is no point queueing an ioctl that
> > is invalid.
> > 
> > [snip discussion of Christoph's patches]
> > 
> > The point is scsi_verify_blk_ioctl() is saying the ioctl isn't valid.
> > It has nothing to do with the existance of a bdev or not; but everything
> > to do with the unprivledged user's request to issue an ioctl.
> 
> ... but the call is skipped (and all ioctls are valid) if ti->len ==
> i_size_read(bdev->bd_inode) >> SECTOR_SHIFT.  Therefore, until you have
> the bdev you don't know which ioctls are valid, and you must assume all
> of them are.  You can't do anything unsafe anyway until you have the
> bdev.  This is the reasoning prior to Hannes's change.

Yes, with your commit ec8013be ("dm: do not forward ioctls from logical
volumes to the underlying device") you added protections to disallow
issuing ioctls to a partition that could impact the rest of the device.

Given that I can see why you're seizing on the "ti->len !=
i_size_read(bdev->bd_inode) >> SECTOR_SHIFT" negative checks that gate
the call to scsi_verify_blk_ioctl().

> Afterwards, you end up calling scsi_verify_blk_ioctl() when bdev ==
> NULL.  If the future bdev satisfies the above condition on ti->len, this
> means that ioctl(SG_IO) switches from ENOTTY to available.  Userspace is
> clearly not expecting that.

For Hannes, and in my head, it didn't matter if a future bdev satisfies
the length condition.  I don't think Hannes was trying to guard against
dangerous partition ioctls being issues by udev... but now I _do_
question what exactly _is_ the point of his commit 21a2807bc3f.  Which
invalid ioctls, from udev, did Hannes' change actually disallow?

I obviously should've asked more questions before now!

At the time, I thought scsi_verify_blk_ioctl() was generally useful.
But it clearly only makes sense in the narrow context of guarding
against partition ioctls expanding their target to the parent block
device.

(scsi_verify_blk_ioctl should be renamed to something like
'scsi_verify_blk_ioctl_partition_safe' but I digress)
 
> > Paolo, AFAIK unprivledged ioctls is one of your pet-projects so your
> > insight on what, if anything, needs changing to support them is the
> > insight I think we need.
> 
> I hope the above provides some extra information.

It did, but you made me work for it ;)

I could've sworn that unprivledged users (without CAP_SYS_RAWIO)
wouldn't be allowed to issue ioctls.  Am I completely mistaken?  Or is
it still contentious and DM-mpath removing the ability to allow these
unprivledged ioctls (as a side-effect of Hannes' commit ec8013be) makes
your life, and other virt users' lives, harder?

Apologies for being out of touch on what the latest is on this
unprivledged ioctl issue.

Given Hannes' commit ec8013be was marked for stable and went in some
time ago we really need to get to resolve this ASAP.

To that end, I'm going to prep a tree that first applies Mauricio's
patch: https://patchwork.kernel.org/patch/7518601/

And I'll then rebase Christoph's DM ioctl changes ontop of it.

Hannes will need to revalidate his desire to verify ioctls in the
context of a dm-multipath device without paths. (sorry Hannes)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-10-31 22:47         ` Mike Snitzer
@ 2015-11-02  7:28           ` Hannes Reinecke
  2015-11-02  9:57             ` Paolo Bonzini
  2015-11-02 13:31             ` Mike Snitzer
  2015-11-02  9:55           ` Paolo Bonzini
  1 sibling, 2 replies; 23+ messages in thread
From: Hannes Reinecke @ 2015-11-02  7:28 UTC (permalink / raw)
  To: Mike Snitzer, Paolo Bonzini
  Cc: hch, Mauricio Faria de Oliveira, dm-devel, linux-scsi

On 10/31/2015 11:47 PM, Mike Snitzer wrote:
> On Sat, Oct 31 2015 at  3:07pm -0400,
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
>>
>>
>> On 31/10/2015 19:13, Mike Snitzer wrote:
>>>> But that's wrong, I think.  It's a false positive in
>>>> scsi_verify_blk_ioctl().
>>>>
>>>> If the ioctl is valid when bdev becomes non-NULL (and it will be if
>>>> ti->len becomes equal to i_size_read(bdev->bd_inode) >> SECTOR_SHIFT),
>>>> you should not return -ENOIOCTLCMD aka ENOTTY, because userspace doesn't
>>>> think the ioctls can go away and come back.  So Hannes's patch broke the
>>>> userspace ABI. :(
>>>
>>> Huh?  All that Hannes' patch did was add early verification of the ioctl
>>> if there are no paths, since: there is no point queueing an ioctl that
>>> is invalid.
>>>
>>> [snip discussion of Christoph's patches]
>>>
>>> The point is scsi_verify_blk_ioctl() is saying the ioctl isn't valid.
>>> It has nothing to do with the existance of a bdev or not; but everything
>>> to do with the unprivledged user's request to issue an ioctl.
>>
>> ... but the call is skipped (and all ioctls are valid) if ti->len ==
>> i_size_read(bdev->bd_inode) >> SECTOR_SHIFT.  Therefore, until you have
>> the bdev you don't know which ioctls are valid, and you must assume all
>> of them are.  You can't do anything unsafe anyway until you have the
>> bdev.  This is the reasoning prior to Hannes's change.
> 
> Yes, with your commit ec8013be ("dm: do not forward ioctls from logical
> volumes to the underlying device") you added protections to disallow
> issuing ioctls to a partition that could impact the rest of the device.
> 
> Given that I can see why you're seizing on the "ti->len !=
> i_size_read(bdev->bd_inode) >> SECTOR_SHIFT" negative checks that gate
> the call to scsi_verify_blk_ioctl().
> 
>> Afterwards, you end up calling scsi_verify_blk_ioctl() when bdev ==
>> NULL.  If the future bdev satisfies the above condition on ti->len, this
>> means that ioctl(SG_IO) switches from ENOTTY to available.  Userspace is
>> clearly not expecting that.
> 
> For Hannes, and in my head, it didn't matter if a future bdev satisfies
> the length condition.  I don't think Hannes was trying to guard against
> dangerous partition ioctls being issues by udev... but now I _do_
> question what exactly _is_ the point of his commit 21a2807bc3f.  Which
> invalid ioctls, from udev, did Hannes' change actually disallow?
> 
The reasoning is thus:

With the original code we would need to wait for path activation
before we would be able to figure out if the ioctl is valid.
However, the callback to verify the ioctl is sd_ioctl(), which as
a first step calls scsi_verify_ioctl().
So my reasoning was that we can as well call scsi_verify_ioctl()
early, and allow it to filter out known invalid ioctls.
Then we wouldn't need to wait for path activation and return
immediately.

Incidentally, in the 'r == -ENOTCONN' case, we're waiting
for path activation, but then just bail out with -ENOTCONN.
As we're not resetting -ENOTCONN, where's the point in activate the
paths here?
Shouldn't we retry to figure out if -ENOTCONN is still true?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-10-31 22:47         ` Mike Snitzer
  2015-11-02  7:28           ` Hannes Reinecke
@ 2015-11-02  9:55           ` Paolo Bonzini
  1 sibling, 0 replies; 23+ messages in thread
From: Paolo Bonzini @ 2015-11-02  9:55 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: hch, Mauricio Faria de Oliveira, dm-devel, hare, linux-scsi



On 31/10/2015 23:47, Mike Snitzer wrote:
> Yes, with your commit ec8013be ("dm: do not forward ioctls from logical
> volumes to the underlying device") you added protections to disallow
> issuing ioctls to a partition that could impact the rest of the device.
> 
> Given that I can see why you're seizing on the "ti->len !=
> i_size_read(bdev->bd_inode) >> SECTOR_SHIFT" negative checks that gate
> the call to scsi_verify_blk_ioctl().

Right.

> For Hannes, and in my head, it didn't matter if a future bdev satisfies
> the length condition.

I agree actually.  The only problem is that the returned errno value is
ENOTTY, and to userspace that "sounds like" a future bdev will not make
the ioctl valid.

> I could've sworn that unprivledged users (without CAP_SYS_RAWIO)
> wouldn't be allowed to issue ioctls.  Am I completely mistaken?

They are allowed to issue ioctls.

CAP_SYS_RAWIO changes that to also allow issuing of ioctls to
partitions.  That was required by Linus for backwards compatibility.

> Or is
> it still contentious and DM-mpath removing the ability to allow these
> unprivledged ioctls (as a side-effect of Hannes' commit ec8013be) makes
> your life, and other virt users' lives, harder?

Yes, it would.  virt runs as an unprivileged user (so does CD burning,
which was the original reason to let SG_IO run by unprivileged users;
there are probably other use cases).

Paolo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-11-02  7:28           ` Hannes Reinecke
@ 2015-11-02  9:57             ` Paolo Bonzini
  2015-11-02 13:31             ` Mike Snitzer
  1 sibling, 0 replies; 23+ messages in thread
From: Paolo Bonzini @ 2015-11-02  9:57 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer
  Cc: hch, Mauricio Faria de Oliveira, dm-devel, linux-scsi



On 02/11/2015 08:28, Hannes Reinecke wrote:
> 
> With the original code we would need to wait for path activation
> before we would be able to figure out if the ioctl is valid.
> However, the callback to verify the ioctl is sd_ioctl(), which as
> a first step calls scsi_verify_ioctl().
> So my reasoning was that we can as well call scsi_verify_ioctl()
> early, and allow it to filter out known invalid ioctls.
> Then we wouldn't need to wait for path activation and return
> immediately.

That in principle makes sense.  Unfortunately, before path activation
you can only find out if a ioctl is valid.  You cannot find out which
ioctls are _in_valid, because path activation sets the bdev and that
might make all ioctls valid.

> Incidentally, in the 'r == -ENOTCONN' case, we're waiting
> for path activation, but then just bail out with -ENOTCONN.
> As we're not resetting -ENOTCONN, where's the point in activate the
> paths here?
> Shouldn't we retry to figure out if -ENOTCONN is still true?

That makes sense too.  You've done the work, might as well reap the
benefits...

Paolo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-11-02  7:28           ` Hannes Reinecke
  2015-11-02  9:57             ` Paolo Bonzini
@ 2015-11-02 13:31             ` Mike Snitzer
  2015-11-02 13:56               ` Hannes Reinecke
  1 sibling, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2015-11-02 13:31 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Paolo Bonzini, hch, Mauricio Faria de Oliveira, dm-devel, linux-scsi

On Mon, Nov 02 2015 at  2:28am -0500,
Hannes Reinecke <hare@suse.de> wrote:

> On 10/31/2015 11:47 PM, Mike Snitzer wrote:
> > 
> > For Hannes, and in my head, it didn't matter if a future bdev satisfies
> > the length condition.  I don't think Hannes was trying to guard against
> > dangerous partition ioctls being issued by udev... but now I _do_
> > question what exactly _is_ the point of his commit 21a2807bc3f.  Which
> > invalid ioctls, from udev, did Hannes' change actually disallow?
> > 

FYI, I meant s/21a2807bc3f/a1989b33/

> The reasoning is thus:
> 
> With the original code we would need to wait for path activation
> before we would be able to figure out if the ioctl is valid.
> However, the callback to verify the ioctl is sd_ioctl(), which as
> a first step calls scsi_verify_ioctl().
> So my reasoning was that we can as well call scsi_verify_ioctl()
> early, and allow it to filter out known invalid ioctls.
> Then we wouldn't need to wait for path activation and return
> immediately.

Right, I understood that to be your intent.  And it seemed reasonible.
Unfortuntely, as we now know, scsi_verify_blk_ioctl() only really cares
to filter out ioctls that are invalid for partitions.

I'm still curious: which ioctls were being issued by udev that your
commit a1989b33 "fixed" (so udev workers didn't hang)?  Is the right fix
to modify udev to not issue such ioctls?  What was invalid about the
udev issued ioctls?

> Incidentally, in the 'r == -ENOTCONN' case, we're waiting
> for path activation, but then just bail out with -ENOTCONN.
> As we're not resetting -ENOTCONN, where's the point in activate the
> paths here?
> Shouldn't we retry to figure out if -ENOTCONN is still true?

We do, in DM core, see _your_ commit that added this ;)
6c182cd88d ("dm mpath: fix ioctl deadlock when no paths")

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-11-02 13:31             ` Mike Snitzer
@ 2015-11-02 13:56               ` Hannes Reinecke
  2015-11-02 14:12                 ` Mike Snitzer
  2015-11-02 14:52                 ` Paolo Bonzini
  0 siblings, 2 replies; 23+ messages in thread
From: Hannes Reinecke @ 2015-11-02 13:56 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Paolo Bonzini, hch, Mauricio Faria de Oliveira, dm-devel, linux-scsi

On 11/02/2015 02:31 PM, Mike Snitzer wrote:
> On Mon, Nov 02 2015 at  2:28am -0500,
> Hannes Reinecke <hare@suse.de> wrote:
> 
>> On 10/31/2015 11:47 PM, Mike Snitzer wrote:
>>>
>>> For Hannes, and in my head, it didn't matter if a future bdev satisfies
>>> the length condition.  I don't think Hannes was trying to guard against
>>> dangerous partition ioctls being issued by udev... but now I _do_
>>> question what exactly _is_ the point of his commit 21a2807bc3f.  Which
>>> invalid ioctls, from udev, did Hannes' change actually disallow?
>>>
> 
> FYI, I meant s/21a2807bc3f/a1989b33/
> 
>> The reasoning is thus:
>>
>> With the original code we would need to wait for path activation
>> before we would be able to figure out if the ioctl is valid.
>> However, the callback to verify the ioctl is sd_ioctl(), which as
>> a first step calls scsi_verify_ioctl().
>> So my reasoning was that we can as well call scsi_verify_ioctl()
>> early, and allow it to filter out known invalid ioctls.
>> Then we wouldn't need to wait for path activation and return
>> immediately.
> 
> Right, I understood that to be your intent.  And it seemed reasonible.
> Unfortuntely, as we now know, scsi_verify_blk_ioctl() only really cares
> to filter out ioctls that are invalid for partitions.
> 
> I'm still curious: which ioctls were being issued by udev that your
> commit a1989b33 "fixed" (so udev workers didn't hang)?  Is the right fix
> to modify udev to not issue such ioctls?  What was invalid about the
> udev issued ioctls?
> 
Thing is, it's not udev itself which issues ioctl. It's the programs
started by udev via the various rules (PROGRAM or IMPORT keywords).
They might (and occasionally, will) issue ioctls, and we have no
means of controlling them.

>> Incidentally, in the 'r == -ENOTCONN' case, we're waiting
>> for path activation, but then just bail out with -ENOTCONN.
>> As we're not resetting -ENOTCONN, where's the point in activate the
>> paths here?
>> Shouldn't we retry to figure out if -ENOTCONN is still true?
> 
> We do, in DM core, see _your_ commit that added this ;)
> 6c182cd88d ("dm mpath: fix ioctl deadlock when no paths")
> 
Indeed.

But then the real question remains:

What is the 'correct' behaviour for ioctls when no path retry
is active (or when no paths are present)?

Should we start path activation?
If so, should we wait for path activation to finish, risking udev
killing the worker for that event (udev has a built-in timeout of
120 seconds, which we might easily exceed when we need to activate
paths for large installations or slow path activation ... just
thinking of NetApp takeover/giveback cycle)?
If we're not waiting for path activation, where would be the point
in starting it, seeing that we're not actually interested in the result?
And if we shouldn't start a path activation, what is the point of
having code for it in the first place?
Couldn't we just rip it out altogether, and avoid much of the
current unpleasantness?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-11-02 13:56               ` Hannes Reinecke
@ 2015-11-02 14:12                 ` Mike Snitzer
  2015-11-02 14:36                   ` Hannes Reinecke
  2015-11-02 14:52                 ` Paolo Bonzini
  1 sibling, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2015-11-02 14:12 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Paolo Bonzini, hch, Mauricio Faria de Oliveira, dm-devel, linux-scsi

On Mon, Nov 02 2015 at  8:56am -0500,
Hannes Reinecke <hare@suse.de> wrote:

> On 11/02/2015 02:31 PM, Mike Snitzer wrote:
> > On Mon, Nov 02 2015 at  2:28am -0500,
> > Hannes Reinecke <hare@suse.de> wrote:
> > 
> >> On 10/31/2015 11:47 PM, Mike Snitzer wrote:
> >>>
> >>> For Hannes, and in my head, it didn't matter if a future bdev satisfies
> >>> the length condition.  I don't think Hannes was trying to guard against
> >>> dangerous partition ioctls being issued by udev... but now I _do_
> >>> question what exactly _is_ the point of his commit 21a2807bc3f.  Which
> >>> invalid ioctls, from udev, did Hannes' change actually disallow?
> >>>
> > 
> > FYI, I meant s/21a2807bc3f/a1989b33/
> > 
> >> The reasoning is thus:
> >>
> >> With the original code we would need to wait for path activation
> >> before we would be able to figure out if the ioctl is valid.
> >> However, the callback to verify the ioctl is sd_ioctl(), which as
> >> a first step calls scsi_verify_ioctl().
> >> So my reasoning was that we can as well call scsi_verify_ioctl()
> >> early, and allow it to filter out known invalid ioctls.
> >> Then we wouldn't need to wait for path activation and return
> >> immediately.
> > 
> > Right, I understood that to be your intent.  And it seemed reasonible.
> > Unfortuntely, as we now know, scsi_verify_blk_ioctl() only really cares
> > to filter out ioctls that are invalid for partitions.
> > 
> > I'm still curious: which ioctls were being issued by udev that your
> > commit a1989b33 "fixed" (so udev workers didn't hang)?  Is the right fix
> > to modify udev to not issue such ioctls?  What was invalid about the
> > udev issued ioctls?
> > 
> Thing is, it's not udev itself which issues ioctl. It's the programs
> started by udev via the various rules (PROGRAM or IMPORT keywords).
> They might (and occasionally, will) issue ioctls, and we have no
> means of controlling them.
> 
> >> Incidentally, in the 'r == -ENOTCONN' case, we're waiting
> >> for path activation, but then just bail out with -ENOTCONN.
> >> As we're not resetting -ENOTCONN, where's the point in activate the
> >> paths here?
> >> Shouldn't we retry to figure out if -ENOTCONN is still true?
> > 
> > We do, in DM core, see _your_ commit that added this ;)
> > 6c182cd88d ("dm mpath: fix ioctl deadlock when no paths")
> > 
> Indeed.
> 
> But then the real question remains:
> 
> What is the 'correct' behaviour for ioctls when no path retry
> is active (or when no paths are present)?
> 
> Should we start path activation?
> If so, should we wait for path activation to finish, risking udev
> killing the worker for that event (udev has a built-in timeout of
> 120 seconds, which we might easily exceed when we need to activate
> paths for large installations or slow path activation ... just
> thinking of NetApp takeover/giveback cycle)?
> If we're not waiting for path activation, where would be the point
> in starting it, seeing that we're not actually interested in the result?

We only start it: if (r == -ENOTCONN && !fatal_signal_pending(current))

-ENOTCONN is set with:
if ((pgpath && m->queue_io) || (!pgpath && m->queue_if_no_path))

> And if we shouldn't start a path activation, what is the point of
> having code for it in the first place?

The point is we have reason to believe paths will be coming back or that
the user wants to queue_if_no_path.

> Couldn't we just rip it out altogether, and avoid much of the
> current unpleasantness?

Sorry, not seeing how you're getting there.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-11-02 14:12                 ` Mike Snitzer
@ 2015-11-02 14:36                   ` Hannes Reinecke
  2015-11-02 15:14                     ` Mike Snitzer
  0 siblings, 1 reply; 23+ messages in thread
From: Hannes Reinecke @ 2015-11-02 14:36 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Paolo Bonzini, hch, Mauricio Faria de Oliveira, dm-devel, linux-scsi

[-- Attachment #1: Type: text/plain, Size: 4732 bytes --]

On 11/02/2015 03:12 PM, Mike Snitzer wrote:
> On Mon, Nov 02 2015 at  8:56am -0500,
> Hannes Reinecke <hare@suse.de> wrote:
> 
>> On 11/02/2015 02:31 PM, Mike Snitzer wrote:
>>> On Mon, Nov 02 2015 at  2:28am -0500,
>>> Hannes Reinecke <hare@suse.de> wrote:
>>>
>>>> On 10/31/2015 11:47 PM, Mike Snitzer wrote:
>>>>>
>>>>> For Hannes, and in my head, it didn't matter if a future bdev satisfies
>>>>> the length condition.  I don't think Hannes was trying to guard against
>>>>> dangerous partition ioctls being issued by udev... but now I _do_
>>>>> question what exactly _is_ the point of his commit 21a2807bc3f.  Which
>>>>> invalid ioctls, from udev, did Hannes' change actually disallow?
>>>>>
>>>
>>> FYI, I meant s/21a2807bc3f/a1989b33/
>>>
>>>> The reasoning is thus:
>>>>
>>>> With the original code we would need to wait for path activation
>>>> before we would be able to figure out if the ioctl is valid.
>>>> However, the callback to verify the ioctl is sd_ioctl(), which as
>>>> a first step calls scsi_verify_ioctl().
>>>> So my reasoning was that we can as well call scsi_verify_ioctl()
>>>> early, and allow it to filter out known invalid ioctls.
>>>> Then we wouldn't need to wait for path activation and return
>>>> immediately.
>>>
>>> Right, I understood that to be your intent.  And it seemed reasonible.
>>> Unfortuntely, as we now know, scsi_verify_blk_ioctl() only really cares
>>> to filter out ioctls that are invalid for partitions.
>>>
>>> I'm still curious: which ioctls were being issued by udev that your
>>> commit a1989b33 "fixed" (so udev workers didn't hang)?  Is the right fix
>>> to modify udev to not issue such ioctls?  What was invalid about the
>>> udev issued ioctls?
>>>
>> Thing is, it's not udev itself which issues ioctl. It's the programs
>> started by udev via the various rules (PROGRAM or IMPORT keywords).
>> They might (and occasionally, will) issue ioctls, and we have no
>> means of controlling them.
>>
>>>> Incidentally, in the 'r == -ENOTCONN' case, we're waiting
>>>> for path activation, but then just bail out with -ENOTCONN.
>>>> As we're not resetting -ENOTCONN, where's the point in activate the
>>>> paths here?
>>>> Shouldn't we retry to figure out if -ENOTCONN is still true?
>>>
>>> We do, in DM core, see _your_ commit that added this ;)
>>> 6c182cd88d ("dm mpath: fix ioctl deadlock when no paths")
>>>
>> Indeed.
>>
>> But then the real question remains:
>>
>> What is the 'correct' behaviour for ioctls when no path retry
>> is active (or when no paths are present)?
>>
>> Should we start path activation?
>> If so, should we wait for path activation to finish, risking udev
>> killing the worker for that event (udev has a built-in timeout of
>> 120 seconds, which we might easily exceed when we need to activate
>> paths for large installations or slow path activation ... just
>> thinking of NetApp takeover/giveback cycle)?
>> If we're not waiting for path activation, where would be the point
>> in starting it, seeing that we're not actually interested in the result?
> 
> We only start it: if (r == -ENOTCONN && !fatal_signal_pending(current))
> 
> -ENOTCONN is set with:
> if ((pgpath && m->queue_io) || (!pgpath && m->queue_if_no_path))
> 
I know.

>> And if we shouldn't start a path activation, what is the point of
>> having code for it in the first place?
> 
> The point is we have reason to believe paths will be coming back or that
> the user wants to queue_if_no_path.
> 
Yes. But taken at face value, it means that any I/O will be stuck in
the queue for an indefinite amount of time.
Which doesn't work for things like udev, which _requires_ an answer
after a certain time.
And if that answer isn't forthcoming, the entire udev worker is
killed, rendering the device useless as the event is never delivered.

So if the user has set queue_if_no_path, we might never get a path
down event delivered as udev is stuck.

What do we do then?
Tell the customer "Hey, you screwed up" is possibly not the best of
ideas, especially not as multipath sets an unconditional
queue_if_no_path per default for quite some arrays.

I can see a point for waiting the path initialisation to complete
(ie if queue_io is active). But for queue_if_no_path I'd rather
return an error directly (like -EAGAIN).

With that change we could retry path selection until queue_io is
done, and would be able to retrieve a valid status (as we'd always
have a bdev).

See the attached patch for a detailed explanation.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-dm-mpath-Retry-ioctl-if-paths-need-to-be-initialized.patch --]
[-- Type: text/x-patch; name="0001-dm-mpath-Retry-ioctl-if-paths-need-to-be-initialized.patch", Size: 2052 bytes --]

From b0d5848e91cfc3b97adb49121085c994b707eac3 Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@suse.de>
Date: Mon, 2 Nov 2015 15:33:58 +0100
Subject: [PATCH] dm-mpath: Retry ioctl if paths need to be initialized

If paths need to be initialized (m->queue_io is set) we should
be starting the initialization and retry the ioctl to retrieve
the final status.
At the same time, we should _not_ wait for queue_if_no_path
to go away as this might take forever, resulting in the
ioctl to be stuck.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/dm-mpath.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 5a67671..7352397 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1543,6 +1543,7 @@ static int multipath_ioctl(struct dm_target *ti, unsigned int cmd,
 	unsigned long flags;
 	int r;
 
+again:
 	bdev = NULL;
 	mode = 0;
 	r = 0;
@@ -1559,8 +1560,10 @@ static int multipath_ioctl(struct dm_target *ti, unsigned int cmd,
 		mode = pgpath->path.dev->mode;
 	}
 
-	if ((pgpath && m->queue_io) || (!pgpath && m->queue_if_no_path))
+	if (pgpath && m->queue_io)
 		r = -ENOTCONN;
+	else if (!pgpath && m->queue_if_no_path)
+		r = -EAGAIN;
 	else if (!bdev)
 		r = -EIO;
 
@@ -1569,7 +1572,7 @@ static int multipath_ioctl(struct dm_target *ti, unsigned int cmd,
 	/*
 	 * Only pass ioctls through if the device sizes match exactly.
 	 */
-	if (!bdev || ti->len != i_size_read(bdev->bd_inode) >> SECTOR_SHIFT) {
+	if (r == -ENOTCONN && ti->len != i_size_read(bdev->bd_inode) >> SECTOR_SHIFT) {
 		int err = scsi_verify_blk_ioctl(NULL, cmd);
 		if (err)
 			r = err;
@@ -1585,6 +1588,8 @@ static int multipath_ioctl(struct dm_target *ti, unsigned int cmd,
 			__pg_init_all_paths(m);
 		spin_unlock_irqrestore(&m->lock, flags);
 		dm_table_run_md_queue_async(m->ti->table);
+		msleep(10);
+		goto again;
 	}
 
 	return r ? : __blkdev_driver_ioctl(bdev, mode, cmd, arg);
-- 
1.8.5.6


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-11-02 13:56               ` Hannes Reinecke
  2015-11-02 14:12                 ` Mike Snitzer
@ 2015-11-02 14:52                 ` Paolo Bonzini
  2015-11-02 15:05                   ` Mike Snitzer
  2015-11-02 15:32                   ` Hannes Reinecke
  1 sibling, 2 replies; 23+ messages in thread
From: Paolo Bonzini @ 2015-11-02 14:52 UTC (permalink / raw)
  To: Hannes Reinecke, Mike Snitzer
  Cc: hch, Mauricio Faria de Oliveira, dm-devel, linux-scsi



On 02/11/2015 14:56, Hannes Reinecke wrote:
> But then the real question remains:
> 
> What is the 'correct' behaviour for ioctls when no path retry
> is active (or when no paths are present)?
> 
> Should we start path activation?
> If so, should we wait for path activation to finish, risking udev
> killing the worker for that event (udev has a built-in timeout of
> 120 seconds, which we might easily exceed when we need to activate
> paths for large installations or slow path activation ... just
> thinking of NetApp takeover/giveback cycle)?
> If we're not waiting for path activation, where would be the point
> in starting it, seeing that we're not actually interested in the result?
> And if we shouldn't start a path activation, what is the point of
> having code for it in the first place?

That's a fair question, and it depends on what said udev worker actually
does.

In any case, if we don't start path activation we should return
ENOTCONN, not ENOTTY.

Paolo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-11-02 14:52                 ` Paolo Bonzini
@ 2015-11-02 15:05                   ` Mike Snitzer
  2015-11-02 15:45                     ` Paolo Bonzini
  2015-11-02 15:32                   ` Hannes Reinecke
  1 sibling, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2015-11-02 15:05 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Hannes Reinecke, hch, Mauricio Faria de Oliveira, dm-devel, linux-scsi

On Mon, Nov 02 2015 at  9:52am -0500,
Paolo Bonzini <pbonzini@redhat.com> wrote:

> 
> 
> On 02/11/2015 14:56, Hannes Reinecke wrote:
> > But then the real question remains:
> > 
> > What is the 'correct' behaviour for ioctls when no path retry
> > is active (or when no paths are present)?
> > 
> > Should we start path activation?
> > If so, should we wait for path activation to finish, risking udev
> > killing the worker for that event (udev has a built-in timeout of
> > 120 seconds, which we might easily exceed when we need to activate
> > paths for large installations or slow path activation ... just
> > thinking of NetApp takeover/giveback cycle)?
> > If we're not waiting for path activation, where would be the point
> > in starting it, seeing that we're not actually interested in the result?
> > And if we shouldn't start a path activation, what is the point of
> > having code for it in the first place?
> 
> That's a fair question, and it depends on what said udev worker actually
> does.
> 
> In any case, if we don't start path activation we should return
> ENOTCONN, not ENOTTY.

Currently, if we don't start path activation we're returning EIO.
ENOTCONN is used for when we do start path activation (and ENOTCONN is
the means for DM core to retry)

We _could_ change the ENOTCONN to be EAGAIN and EIO to ENOTCONN...

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-11-02 14:36                   ` Hannes Reinecke
@ 2015-11-02 15:14                     ` Mike Snitzer
  2015-11-02 15:29                       ` Hannes Reinecke
  0 siblings, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2015-11-02 15:14 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Paolo Bonzini, hch, Mauricio Faria de Oliveira, dm-devel, linux-scsi

On Mon, Nov 02 2015 at  9:36am -0500,
Hannes Reinecke <hare@suse.de> wrote:

> On 11/02/2015 03:12 PM, Mike Snitzer wrote:
> > On Mon, Nov 02 2015 at  8:56am -0500,
> > Hannes Reinecke <hare@suse.de> wrote:
> > 
> >> On 11/02/2015 02:31 PM, Mike Snitzer wrote:
> >>> On Mon, Nov 02 2015 at  2:28am -0500,
> >>> Hannes Reinecke <hare@suse.de> wrote:
> >>>
> >>>> On 10/31/2015 11:47 PM, Mike Snitzer wrote:
> >>>>>
> >>>>> For Hannes, and in my head, it didn't matter if a future bdev satisfies
> >>>>> the length condition.  I don't think Hannes was trying to guard against
> >>>>> dangerous partition ioctls being issued by udev... but now I _do_
> >>>>> question what exactly _is_ the point of his commit 21a2807bc3f.  Which
> >>>>> invalid ioctls, from udev, did Hannes' change actually disallow?
> >>>>>
> >>>
> >>> FYI, I meant s/21a2807bc3f/a1989b33/
> >>>
> >>>> The reasoning is thus:
> >>>>
> >>>> With the original code we would need to wait for path activation
> >>>> before we would be able to figure out if the ioctl is valid.
> >>>> However, the callback to verify the ioctl is sd_ioctl(), which as
> >>>> a first step calls scsi_verify_ioctl().
> >>>> So my reasoning was that we can as well call scsi_verify_ioctl()
> >>>> early, and allow it to filter out known invalid ioctls.
> >>>> Then we wouldn't need to wait for path activation and return
> >>>> immediately.
> >>>
> >>> Right, I understood that to be your intent.  And it seemed reasonible.
> >>> Unfortuntely, as we now know, scsi_verify_blk_ioctl() only really cares
> >>> to filter out ioctls that are invalid for partitions.
> >>>
> >>> I'm still curious: which ioctls were being issued by udev that your
> >>> commit a1989b33 "fixed" (so udev workers didn't hang)?  Is the right fix
> >>> to modify udev to not issue such ioctls?  What was invalid about the
> >>> udev issued ioctls?
> >>>
> >> Thing is, it's not udev itself which issues ioctl. It's the programs
> >> started by udev via the various rules (PROGRAM or IMPORT keywords).
> >> They might (and occasionally, will) issue ioctls, and we have no
> >> means of controlling them.
> >>
> >>>> Incidentally, in the 'r == -ENOTCONN' case, we're waiting
> >>>> for path activation, but then just bail out with -ENOTCONN.
> >>>> As we're not resetting -ENOTCONN, where's the point in activate the
> >>>> paths here?
> >>>> Shouldn't we retry to figure out if -ENOTCONN is still true?
> >>>
> >>> We do, in DM core, see _your_ commit that added this ;)
> >>> 6c182cd88d ("dm mpath: fix ioctl deadlock when no paths")
> >>>
> >> Indeed.
> >>
> >> But then the real question remains:
> >>
> >> What is the 'correct' behaviour for ioctls when no path retry
> >> is active (or when no paths are present)?
> >>
> >> Should we start path activation?
> >> If so, should we wait for path activation to finish, risking udev
> >> killing the worker for that event (udev has a built-in timeout of
> >> 120 seconds, which we might easily exceed when we need to activate
> >> paths for large installations or slow path activation ... just
> >> thinking of NetApp takeover/giveback cycle)?
> >> If we're not waiting for path activation, where would be the point
> >> in starting it, seeing that we're not actually interested in the result?
> > 
> > We only start it: if (r == -ENOTCONN && !fatal_signal_pending(current))
> > 
> > -ENOTCONN is set with:
> > if ((pgpath && m->queue_io) || (!pgpath && m->queue_if_no_path))
> > 
> I know.
> 
> >> And if we shouldn't start a path activation, what is the point of
> >> having code for it in the first place?
> > 
> > The point is we have reason to believe paths will be coming back or that
> > the user wants to queue_if_no_path.
> > 
> Yes. But taken at face value, it means that any I/O will be stuck in
> the queue for an indefinite amount of time.
> Which doesn't work for things like udev, which _requires_ an answer
> after a certain time.
> And if that answer isn't forthcoming, the entire udev worker is
> killed, rendering the device useless as the event is never delivered.
> 
> So if the user has set queue_if_no_path, we might never get a path
> down event delivered as udev is stuck.
> 
> What do we do then?
> Tell the customer "Hey, you screwed up" is possibly not the best of
> ideas, especially not as multipath sets an unconditional
> queue_if_no_path per default for quite some arrays.
> 
> I can see a point for waiting the path initialisation to complete
> (ie if queue_io is active). But for queue_if_no_path I'd rather
> return an error directly (like -EAGAIN).
> 
> With that change we could retry path selection until queue_io is
> done, and would be able to retrieve a valid status (as we'd always
> have a bdev).
> 
> See the attached patch for a detailed explanation.
> 
> Cheers,
> 
> Hannes
> -- 
> Dr. Hannes Reinecke		               zSeries & Storage
> hare@suse.de			               +49 911 74053 688
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
> HRB 21284 (AG Nürnberg)

> From b0d5848e91cfc3b97adb49121085c994b707eac3 Mon Sep 17 00:00:00 2001
> From: Hannes Reinecke <hare@suse.de>
> Date: Mon, 2 Nov 2015 15:33:58 +0100
> Subject: [PATCH] dm-mpath: Retry ioctl if paths need to be initialized
> 
> If paths need to be initialized (m->queue_io is set) we should
> be starting the initialization and retry the ioctl to retrieve
> the final status.
> At the same time, we should _not_ wait for queue_if_no_path
> to go away as this might take forever, resulting in the
> ioctl to be stuck.

I understand your goal.. but the patch is a bit incomplete.

In your patch the activation is still in progress; so there is no
bdev.. so it'll yield false positives still.

And DM core has retry via -ENOTCONN.. yet you're adding it to
multipath_ioctl (meaning nothing will use DM core's retry logic now).

Also, you still haven't established what it is about
scsi_verify_blk_ioctl() that you see as beneficial.  If we return EAGAIN
to userspace that should "fix" the udev worker issue (so no need for
extra ioctl verification.. which you're clearly still implicitly
overloading with the hopes that it is more generally applicable than
just being concerned about whether an ioctl is valid against a
partition).


> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index 5a67671..7352397 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -1543,6 +1543,7 @@ static int multipath_ioctl(struct dm_target *ti, unsigned int cmd,
>  	unsigned long flags;
>  	int r;
>  
> +again:
>  	bdev = NULL;
>  	mode = 0;
>  	r = 0;
> @@ -1559,8 +1560,10 @@ static int multipath_ioctl(struct dm_target *ti, unsigned int cmd,
>  		mode = pgpath->path.dev->mode;
>  	}
>  
> -	if ((pgpath && m->queue_io) || (!pgpath && m->queue_if_no_path))
> +	if (pgpath && m->queue_io)
>  		r = -ENOTCONN;
> +	else if (!pgpath && m->queue_if_no_path)
> +		r = -EAGAIN;
>  	else if (!bdev)
>  		r = -EIO;
>  
> @@ -1569,7 +1572,7 @@ static int multipath_ioctl(struct dm_target *ti, unsigned int cmd,
>  	/*
>  	 * Only pass ioctls through if the device sizes match exactly.
>  	 */
> -	if (!bdev || ti->len != i_size_read(bdev->bd_inode) >> SECTOR_SHIFT) {
> +	if (r == -ENOTCONN && ti->len != i_size_read(bdev->bd_inode) >> SECTOR_SHIFT) {
>  		int err = scsi_verify_blk_ioctl(NULL, cmd);
>  		if (err)
>  			r = err;
> @@ -1585,6 +1588,8 @@ static int multipath_ioctl(struct dm_target *ti, unsigned int cmd,
>  			__pg_init_all_paths(m);
>  		spin_unlock_irqrestore(&m->lock, flags);
>  		dm_table_run_md_queue_async(m->ti->table);
> +		msleep(10);
> +		goto again;
>  	}
>  
>  	return r ? : __blkdev_driver_ioctl(bdev, mode, cmd, arg);
> -- 
> 1.8.5.6
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-11-02 15:14                     ` Mike Snitzer
@ 2015-11-02 15:29                       ` Hannes Reinecke
  0 siblings, 0 replies; 23+ messages in thread
From: Hannes Reinecke @ 2015-11-02 15:29 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Paolo Bonzini, hch, Mauricio Faria de Oliveira, dm-devel, linux-scsi

On 11/02/2015 04:14 PM, Mike Snitzer wrote:
> On Mon, Nov 02 2015 at  9:36am -0500,
> Hannes Reinecke <hare@suse.de> wrote:
[ .. ]
>> From b0d5848e91cfc3b97adb49121085c994b707eac3 Mon Sep 17 00:00:00 2001
>> From: Hannes Reinecke <hare@suse.de>
>> Date: Mon, 2 Nov 2015 15:33:58 +0100
>> Subject: [PATCH] dm-mpath: Retry ioctl if paths need to be initialized
>>
>> If paths need to be initialized (m->queue_io is set) we should
>> be starting the initialization and retry the ioctl to retrieve
>> the final status.
>> At the same time, we should _not_ wait for queue_if_no_path
>> to go away as this might take forever, resulting in the
>> ioctl to be stuck.
> 
> I understand your goal.. but the patch is a bit incomplete.
> 
> In your patch the activation is still in progress; so there is no
> bdev.. so it'll yield false positives still.
> 
Right.

> And DM core has retry via -ENOTCONN.. yet you're adding it to
> multipath_ioctl (meaning nothing will use DM core's retry logic now).
> 
Yeah, I've just noticed.

> Also, you still haven't established what it is about
> scsi_verify_blk_ioctl() that you see as beneficial.  If we return EAGAIN
> to userspace that should "fix" the udev worker issue (so no need for
> extra ioctl verification.. which you're clearly still implicitly
> overloading with the hopes that it is more generally applicable than
> just being concerned about whether an ioctl is valid against a
> partition).
> 
Personally, I would happily drop the call to scsi_verify_blk_ioctl
altogether; it'll be called via sd_ioctl() later on anyway.
And if we don't have valid/accessible paths I'd rather return an
error code indicating that instead of trying to figure out what the
error would be if we would have had a valid path.

If we can agree on that everything would become _so much_ easier...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-11-02 14:52                 ` Paolo Bonzini
  2015-11-02 15:05                   ` Mike Snitzer
@ 2015-11-02 15:32                   ` Hannes Reinecke
  1 sibling, 0 replies; 23+ messages in thread
From: Hannes Reinecke @ 2015-11-02 15:32 UTC (permalink / raw)
  To: Paolo Bonzini, Mike Snitzer
  Cc: hch, Mauricio Faria de Oliveira, dm-devel, linux-scsi

On 11/02/2015 03:52 PM, Paolo Bonzini wrote:
> 
> 
> On 02/11/2015 14:56, Hannes Reinecke wrote:
>> But then the real question remains:
>>
>> What is the 'correct' behaviour for ioctls when no path retry
>> is active (or when no paths are present)?
>>
>> Should we start path activation?
>> If so, should we wait for path activation to finish, risking udev
>> killing the worker for that event (udev has a built-in timeout of
>> 120 seconds, which we might easily exceed when we need to activate
>> paths for large installations or slow path activation ... just
>> thinking of NetApp takeover/giveback cycle)?
>> If we're not waiting for path activation, where would be the point
>> in starting it, seeing that we're not actually interested in the result?
>> And if we shouldn't start a path activation, what is the point of
>> having code for it in the first place?
> 
> That's a fair question, and it depends on what said udev worker actually
> does.
> 
Point is, we have no idea whatsoever what udev might be doing.

And experience shows that while most admins/distros etc have a fair
knowledge of writing valid udev rules, more often than not the
'queue_if_no_path' feature from multipath totally escapes them.
So we will be finding ourselves in positions where programs from
udev rules issue ioctls to devices where queue_if_no_path is active.

Plus it's a really daunting task to validate all udev rules to check
if any of those programs might issue ioctls and blank them out for
the queue_if_no_path case.
I did a review once for SUSE, but even that might be obsoleted now
as rules are being changed all the time.

> In any case, if we don't start path activation we should return
> ENOTCONN, not ENOTTY.
> 
No problem with that.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-11-02 15:05                   ` Mike Snitzer
@ 2015-11-02 15:45                     ` Paolo Bonzini
  2015-11-02 15:49                       ` Mike Snitzer
  0 siblings, 1 reply; 23+ messages in thread
From: Paolo Bonzini @ 2015-11-02 15:45 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Hannes Reinecke, hch, Mauricio Faria de Oliveira, dm-devel, linux-scsi



On 02/11/2015 16:05, Mike Snitzer wrote:
> > In any case, if we don't start path activation we should return
> > ENOTCONN, not ENOTTY.
> 
> Currently, if we don't start path activation we're returning EIO.
> ENOTCONN is used for when we do start path activation (and ENOTCONN is
> the means for DM core to retry)
> 
> We _could_ change the ENOTCONN to be EAGAIN and EIO to ENOTCONN...

This makes sense... though of course testing the impact of this on
userspace is going to be hard. :(  Chances are that userspace is not
expecting EAGAIN either.

Even if they did, how would someone know that they can now retry the
ioctl after getting EAGAIN?  Should they just do it in a loop?

Paolo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"]
  2015-11-02 15:45                     ` Paolo Bonzini
@ 2015-11-02 15:49                       ` Mike Snitzer
  0 siblings, 0 replies; 23+ messages in thread
From: Mike Snitzer @ 2015-11-02 15:49 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Hannes Reinecke, hch, Mauricio Faria de Oliveira, dm-devel,
	linux-scsi, Peter Rajnoha

On Mon, Nov 02 2015 at 10:45am -0500,
Paolo Bonzini <pbonzini@redhat.com> wrote:

> 
> 
> On 02/11/2015 16:05, Mike Snitzer wrote:
> > > In any case, if we don't start path activation we should return
> > > ENOTCONN, not ENOTTY.
> > 
> > Currently, if we don't start path activation we're returning EIO.
> > ENOTCONN is used for when we do start path activation (and ENOTCONN is
> > the means for DM core to retry)
> > 
> > We _could_ change the ENOTCONN to be EAGAIN and EIO to ENOTCONN...
> 
> This makes sense... though of course testing the impact of this on
> userspace is going to be hard. :(  Chances are that userspace is not
> expecting EAGAIN either.
> 
> Even if they did, how would someone know that they can now retry the
> ioctl after getting EAGAIN?  Should they just do it in a loop?

Turns out multipath (userspace) has a udev rule for this now (prajnoha
pointed this out):
http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=blob;f=multipath/11-dm-mpath.rules

So now I'm wondering if we _need_ to do any retries in kernel (aside
from while activation is active)?

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2015-11-02 15:49 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-29 12:24 [PATCH] Revert "dm mpath: fix stalls when handling invalid ioctls" Mauricio Faria de Oliveira
2015-10-29 12:33 ` Mauricio Faria de Oliveira
2015-10-29 13:18 ` IBM request to allow unprivledged ioctls [Was: Revert "dm mpath: fix stalls when handling invalid ioctls"] Mike Snitzer
2015-10-29 14:47   ` [dm-devel] " Mauricio Faria de Oliveira
2015-10-31 15:33   ` Paolo Bonzini
2015-10-31 18:13     ` Mike Snitzer
2015-10-31 18:36       ` Mike Snitzer
2015-10-31 19:07       ` Paolo Bonzini
2015-10-31 22:47         ` Mike Snitzer
2015-11-02  7:28           ` Hannes Reinecke
2015-11-02  9:57             ` Paolo Bonzini
2015-11-02 13:31             ` Mike Snitzer
2015-11-02 13:56               ` Hannes Reinecke
2015-11-02 14:12                 ` Mike Snitzer
2015-11-02 14:36                   ` Hannes Reinecke
2015-11-02 15:14                     ` Mike Snitzer
2015-11-02 15:29                       ` Hannes Reinecke
2015-11-02 14:52                 ` Paolo Bonzini
2015-11-02 15:05                   ` Mike Snitzer
2015-11-02 15:45                     ` Paolo Bonzini
2015-11-02 15:49                       ` Mike Snitzer
2015-11-02 15:32                   ` Hannes Reinecke
2015-11-02  9:55           ` Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.