From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>,
	Christoph Hellwig <hch@infradead.org>,
	dm-devel@redhat.com, Christoph Hellwig <hch@lst.de>,
	linux-scsi@vger.kernel.org, Hannes Reinecke <hare@suse.com>
Subject: Re: dm-mpath: always return reservation conflict
Date: Fri, 30 Sep 2016 08:55:44 +0800
Message-ID: <1475196944.3384.8.camel@HansenPartnership.com>
In-Reply-To: <20160929150132.GA31965@redhat.com>

On Thu, 2016-09-29 at 11:01 -0400, Mike Snitzer wrote:
> On Tue, Sep 27 2016 at  2:50pm -0400,
> James Bottomley <James.Bottomley@hansenpartnership.com> wrote:
> 
> > On Tue, 2016-09-27 at 08:34 +0200, Hannes Reinecke wrote:
> > > On 09/26/2016 09:06 PM, James Bottomley wrote:
> > > > On Mon, 2016-09-26 at 09:52 -0700, Christoph Hellwig wrote:
> > > > > Getting back to this after Hannes recovered from his vacation
> > > > > and I had a chat with him..
> > > > > 
> > > > > On Mon, Aug 15, 2016 at 09:40:39AM -0400, Mike Snitzer wrote:
> > > > > > Seems we still need a more sophisticated approach.  But I'm
> > > > > > > left wondering: if we didn't do it would anything notice?
> > > > > > > Sadly, the same big question from the original thread from a
> > > > > > > year ago:
> > > > > 
> > > > > > Yes.  I have a customer looking to push the pNFS SCSI layout
> > > > > > into a product, and the major show stopper right now is that
> > > > > > we can trivially get into failover loops without this (or an
> > > > > > equivalent) fix.
> > > > > 
> > > > > > A year ago the SCSI layout was still a work in progress in the
> > > > > > IETF, people used the similar block layout instead, which
> > > > > > doesn't use PRs, and we also didn't have the in-kernel PR API,
> > > > > > so you effectively couldn't use PRs with multipathing.
> > > > > 
> > > > > > https://patchwork.kernel.org/patch/6797111/
> > > > > > 
> > > > > > > So this is throw-away for now (and I'll get Hannes' patch
> > > > > > > applied for 4.8-rc3, with the tweak of returning -EBADE
> > > > > > > immediately):
> > > > > > 
> > > > > > > Unfortunately, I'm _not_ staging Hannes' patch until I have
> > > > > > > James Bottomley's Ack (given his original issues with the
> > > > > > > patch haven't been explained away AFAICT).
> > > > > 
> > > > > > I've added James to the Cc.  His argument was that the old
> > > > > > behavior could be implemented to use some non-standard use of
> > > > > > reservations without a specific example.  I don't really think
> > > > > > his example even is practical - once we use dm-mpath it
> > > > > > exclusively claims the underlying block devices, so any sort of
> > > > > > selective reservations would have had to happen before even
> > > > > > starting dm-multipath.
> > > > 
> > > > > Well, now that you've made me reread the thread from 14 months
> > > > > ago, that wasn't quite my objection.  The objection hinged on the
> > > > > fact that anything that uses path-specific reservations would now
> > > > > fail instead of retrying on a different path.  I thought the IBM
> > > > > SVC did this and Hannes implied he'd be able to check this ... did
> > > > > anyone check?  If we've checked and there's no issue with the SVC,
> > > > > then I don't have any other objections.
> > > > 
> > > > > >   So a dynamic SAN controller would have to tear down and
> > > > > > rebuild the dm-multipath setup all the time.
> > > > 
> > > > > That was the job of the SVC: it sat in the middle of the SAN and
> > > > > controlled which node saw what storage.
> > > > 
> > > > > https://www.ibm.com/support/knowledgecenter/STPVGU/com.ibm.storage.svc.console.720.doc/svc_svcovr_1bcfiq.html
> > > > 
> > > > > The SVC can issue its own reservations in those circumstances.
> > > > > What I'm not at all clear on is whether they'll interact badly
> > > > > with the dm-mp reservations.
> > > > 
> > > In the end SVC is (for us) just another storage array.
> > > If and what SVC does in the background is of no interest to us.
> > 
> > > How can that be true?  It sits *on* the SAN and manages devices, it
> > > doesn't sit between the initiators and the devices.  It applies
> > > reservations to devices under management, but every node usually sees
> > > everything else, so devices under SVC management are visible to all
> > > initiators unless you zone them off.
> > 
> > > The last SVC manual I saw included a procedure for manually releasing
> > > stuck SVC reservations from an initiator, which illustrates the
> > > expectation.
> > 
> > > > OTOH I'd be very surprised if the SVC would be allowing us to see
> > > > remnants of its internal working (like persistent reservation
> > > > errors); in doing so third-party applications would be able to see
> > > > and possibly modify these persistent reservations and the SVC would
> > > > find itself in a very fragile operating scenario.
> > 
> > > Because unless you zone the fibre, that's precisely what you do see.
> > 
> > > > Also interactions with GPFS (which uses its own set of
> > > > reservations) will become very tricky.
> > > 
> > > > So I sincerely doubt we'll ever see SVC-originated persistent
> > > > reservation errors.
> > > 
> > > > And as a side-note, this particular patch has been included in SLES
> > > > since 2011, with no noticeable side effects.
> > 
> > > OK, so can you actually say that someone has tested this scenario?
> > > If not, do you have the capacity to test it?
> 
> I've elected to just take this change for 4.9.  Please see:
> https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.9&id=8ff232c1a819c2e98d85974a3bff0b7b8e2970ed

That's fine.  I think the answer is that SVC technology is not around
much so it can't be tested, so I was going to dump it on you to make
the call anyway ...

James
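
For readers skimming the thread, the behavior being argued over can be
sketched roughly as below.  This is a userspace illustration only, not
the dm-mpath code from the commit referenced above: the names
error_is_path_independent(), mpath_handle_error() and enum mpath_action
are hypothetical, and the only detail taken from the thread is that a
reservation conflict is reported as -EBADE and should be returned
immediately rather than triggering a failover to another path.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical outcome of handling a failed I/O on one path. */
enum mpath_action {
	MPATH_RETURN_ERROR,        /* hand the error back to the caller */
	MPATH_FAIL_PATH_AND_RETRY  /* mark the path failed, retry elsewhere */
};

/*
 * Hypothetical helper: true if retrying on another path cannot help.
 * Only -EBADE (reservation conflict, as named in the thread) is listed
 * here; a real implementation may treat further errors the same way.
 */
static bool error_is_path_independent(int error)
{
	switch (error) {
	case -EBADE:
		return true;
	default:
		return false;
	}
}

/*
 * Decide what to do with a non-zero I/O error.  With the change under
 * discussion, a reservation conflict is returned as-is instead of
 * failing the path, which is what avoids the failover loops mentioned
 * earlier in the thread.
 */
static enum mpath_action mpath_handle_error(int error)
{
	if (error_is_path_independent(error))
		return MPATH_RETURN_ERROR;
	return MPATH_FAIL_PATH_AND_RETRY;
}
```

The objection raised earlier in the thread maps onto the first branch: a
setup that relied on path-specific reservations would now see the
conflict returned instead of a retry on a different path.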

