From: Paolo Bonzini <pbonzini@redhat.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Martin Wilck <mwilck@suse.com>, Mike Snitzer <snitzer@redhat.com>,
	Alasdair G Kergon <agk@redhat.com>,
	Bart Van Assche <Bart.VanAssche@sandisk.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	linux-scsi <linux-scsi@vger.kernel.org>,
	dm-devel@redhat.com, Hannes Reinecke <hare@suse.de>,
	Daniel Wagner <dwagner@suse.de>,
	linux-block <linux-block@vger.kernel.org>,
	Benjamin Marzinski <bmarzins@redhat.com>,
	Nils Koenig <nkoenig@redhat.com>, Ewan Milne <emilne@redhat.com>
Subject: Re: [PATCH v5 3/3] dm mpath: add CONFIG_DM_MULTIPATH_SG_IO - failover for SG_IO
Date: Thu, 1 Jul 2021 13:06:50 +0200
Message-ID: <CABgObfYi6TooJM1cCCQrj2pdzz+VHtC+-w1KTycvsSiC+koNVQ@mail.gmail.com>
In-Reply-To: <20210701075629.GA25768@lst.de>

On Thu, Jul 1, 2021 at 9:56 AM Christoph Hellwig <hch@lst.de> wrote:
> On Mon, Jun 28, 2021 at 05:15:58PM +0200, mwilck@suse.com wrote:
> > The qemu "pr-helper" was specifically invented for it. I
> > believe that this is the most important real-world scenario for sending
> > SG_IO ioctls to device-mapper devices.
>
> pr-helper obviously does not do SG_IO on dm-multipath as that simply does
> not work.

Right, for the specific case of persistent reservation ioctls, SG_IO is
sent manually to each path via libmpathpersist.

Failover for SG_IO is needed for general-purpose commands (ranging
from INQUIRY/READ CAPACITY to READ/WRITE). The reason to use
SG_IO instead of the regular read/write syscalls is mostly to preserve
sense data; QEMU does have code to convert errno to sense data, but it's
fickle. If QEMU can use sense data directly, it's easier to forward
conditions that the guest can resolve itself (for example unit attentions)
and to let the guest operate at a lower level (e.g. host-managed ZBC
devices can be forwarded and they just work).
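
To illustrate why raw sense data is easier to forward than an errno, here
is a minimal sketch of decoding the sense buffer that SG_IO hands back.
The names decode_sense and is_unit_attention are hypothetical and this is
not QEMU code; the byte offsets assume the fixed (0x70/0x71) and
descriptor (0x72/0x73) sense data formats defined by SPC.

```python
from typing import NamedTuple, Optional

UNIT_ATTENTION = 0x06  # SCSI sense key value for UNIT ATTENTION


class Sense(NamedTuple):
    key: int   # sense key
    asc: int   # additional sense code
    ascq: int  # additional sense code qualifier


def decode_sense(buf: bytes) -> Optional[Sense]:
    """Extract (sense key, ASC, ASCQ) from a SCSI sense buffer.

    Returns None if the buffer is too short or the response code is unknown.
    """
    if not buf:
        return None
    response_code = buf[0] & 0x7F
    if response_code in (0x70, 0x71):  # fixed format
        if len(buf) < 14:
            return None
        return Sense(buf[2] & 0x0F, buf[12], buf[13])
    if response_code in (0x72, 0x73):  # descriptor format
        if len(buf) < 4:
            return None
        return Sense(buf[1] & 0x0F, buf[2], buf[3])
    return None


def is_unit_attention(buf: bytes) -> bool:
    """True if the sense buffer reports a UNIT ATTENTION condition."""
    sense = decode_sense(buf)
    return sense is not None and sense.key == UNIT_ATTENTION
```

With the full buffer available, a condition such as 06h/29h/00h can be
passed to the guest unchanged, whereas a bare errno has no room for the
ASC/ASCQ pair.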

Of course, all this works only for SCSI. As NVMe becomes more common,
and Linux exposes more functionality to userspace with a fabric-neutral
API, QEMU's SBC emulation can start using that functionality and provide
low-level passthrough functionality no matter if the host is using SCSI
or NVMe. Again, the main obstacle for this is sense data; for example,
the SCSI subsystem rightfully eats unit attentions and converts them to
uevents if you go through read/write requests instead of SG_IO.

> More importantly - if you want to use persistent reservations use the
> kernel ioctls for that.  These work on SCSI, NVMe and device mapper
> without any extra magic.

If they provide functionality equivalent to libmpathpersist without having
to issue DM_TABLE_STATUS, I will certainly consider switching! The
only possible issue could be the lost unit attentions.
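
For reference, the kernel persistent reservation ioctls mentioned above
are declared in <linux/pr.h>. What follows is a minimal sketch of
registering a key through IOC_PR_REGISTER from userspace. The helper names
are hypothetical; the ioctl number is computed with the asm-generic _IOW
encoding (an assumption that holds on e.g. x86-64 and arm64 but not on
every architecture), and the struct layout is assumed to mirror
struct pr_registration.

```python
import fcntl
import struct

# asm-generic ioctl encoding (assumed; differs on some architectures).
_IOC_WRITE = 1
_IOC_NRSHIFT, _IOC_TYPESHIFT, _IOC_SIZESHIFT, _IOC_DIRSHIFT = 0, 8, 16, 30


def _IOW(type_char: str, nr: int, size: int) -> int:
    """Build a write-direction ioctl request number."""
    return ((_IOC_WRITE << _IOC_DIRSHIFT) | (size << _IOC_SIZESHIFT)
            | (ord(type_char) << _IOC_TYPESHIFT) | (nr << _IOC_NRSHIFT))


# struct pr_registration {
#     __u64 old_key; __u64 new_key; __u32 flags; __u32 __pad;
# };
PR_REGISTRATION = struct.Struct("=QQII")
IOC_PR_REGISTER = _IOW("p", 200, PR_REGISTRATION.size)


def pack_registration(old_key: int, new_key: int, flags: int = 0) -> bytes:
    """Serialize the ioctl argument; the trailing pad field is zero."""
    return PR_REGISTRATION.pack(old_key, new_key, flags, 0)


def pr_register(fd: int, old_key: int, new_key: int, flags: int = 0) -> None:
    """Register new_key on the block device behind fd.

    Raises OSError if the ioctl fails.
    """
    fcntl.ioctl(fd, IOC_PR_REGISTER,
                pack_registration(old_key, new_key, flags))
```

The caller opens the block device itself and passes the file descriptor,
so the same helper applies whether the node is a SCSI disk, an NVMe
namespace or a device-mapper device.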

Paolo

> Failing over SG_IO does not make sense.  It is an interface specifically
> designed to leave all error handling to the userspace program using it,
> and we should not change that for one specific error case.  If you
> want the kernel to handle errors for you, use the proper interfaces.
> In this case this is the persistent reservation ioctls.  If they miss
> some features that qemu needs we should add those.

