From: Keith Busch <kbusch@kernel.org>
To: Hannes Reinecke <hare@suse.de>
Cc: Daniel Wagner <dwagner@suse.de>, Sagi Grimberg <sagi@grimberg.me>,
	Jens Axboe <axboe@fb.com>, Christoph Hellwig <hch@lst.de>,
	linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] nvme-tcp: Check if request has started before processing it
Date: Wed, 3 Mar 2021 03:45:29 +0900	[thread overview]
Message-ID: <20210302184529.GB22346@redsun51.ssa.fujisawa.hgst.com> (raw)
In-Reply-To: <8eb26510-03e1-7923-0f47-2a8d3d539963@suse.de>

On Tue, Mar 02, 2021 at 08:18:40AM +0100, Hannes Reinecke wrote:
> On 3/1/21 9:59 PM, Keith Busch wrote:
> > On Mon, Mar 01, 2021 at 05:53:25PM +0100, Hannes Reinecke wrote:
> >> On 3/1/21 5:05 PM, Keith Busch wrote:
> >>> On Mon, Mar 01, 2021 at 02:55:30PM +0100, Hannes Reinecke wrote:
> >>>> On 3/1/21 2:26 PM, Daniel Wagner wrote:
> >>>>> On Sat, Feb 27, 2021 at 02:19:01AM +0900, Keith Busch wrote:
> >>>>>> Crashing is bad, silent data corruption is worse. Is there truly no
> >>>>>> defense against that? If not, why should anyone rely on this?
> >>>>>
> >>>>> If we receive a response for which we don't have a started request, we
> >>>>> know that something is wrong. Couldn't we just reset the connection
> >>>>> in this case? We don't have to pretend nothing has happened and
> >>>>> continue normally. This would avoid a host crash and would not create
> >>>>> (more) data corruption. Or am I just too naive?
> >>>>>
> >>>> This is actually a sensible solution.
> >>>> Please send a patch for that.
> >>>
> >>> Is a bad frame a problem that can be resolved with a reset?
> >>>
> >>> Even if so, the reset doesn't indicate to the user if previous commands
> >>> completed with bad data, so it still seems unreliable.
> >>>
> >> We need to distinguish two cases here.
> >> The first is us receiving a frame with an invalid tag, leading to a crash.
> >> This can be easily resolved by issuing a reset, as clearly the command was
> >> garbage and we need to invoke error handling (which is reset).
> >>
> >> The other case is us receiving a frame with a _duplicate_ tag, i.e. a tag
> >> which is _currently_ valid. This case will fail _even now_, as we simply
> >> have no way of detecting it.
> >>
> >> So what again do we miss by fixing the first case?
> >> Apart from a system which does _not_ crash?
> > 
> > I'm just saying each case is a symptom of the same problem. The only
> > difference between observing one or the other is a race with the host's
> > dispatch. And since you're proposing this patch, it sounds like this
> > condition does happen on TCP compared to other transports where we don't
> > observe it. I just thought the implication that data corruption happens
> > is alarming.
> > 
> Oh yes, it is.
> But sadly TCP inherently suffers from this, as literally anyone can
> spoof frames on the network.
> Other transports like RDMA or FC do not suffer to that extent, as
> spoofing frames there is far more elaborate, and not really possible
> without dedicated hardware.
> 
> That's why there are header and data digests; they will protect you
> against accidental frame corruption (as this case clearly is; the
> remainder of the frame is filled with zeroes).
> They would not protect you against deliberate frame corruption; that's why
> there is TPAR 8010 (TLS encryption for NVMe-TCP).
> 
> Be that as it may, none of these methods are in use here, and none of
> these methods can be made mandatory. So we need to deal with the case at
> hand.
> 
> And in my opinion crashing is the _worst_ option of all.
> Tear the connection down, reset the thing, whatever.
> 
> But do not crash.
> Customers tend to take a very dim view of crashing machines, and have
> very limited patience for our reasoning in these cases.

I was pushing the data corruption angle because fixing that should
address both cases. If there's really nothing you can do about
corruption, your approach here makes sense, and I defer to Sagi and
Christoph for inclusion.

I still wouldn't trust my data to storage behaving this way, though. :)
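
[Editor's note: for readers following the thread, below is a minimal sketch of
the kind of check and reset being discussed, modelled on the nvme-tcp
completion path. The helper names (nvme_tcp_tagset(), nvme_tcp_queue_id(),
nvme_tcp_error_recovery()) follow the driver's conventions of that era, but
the snippet is an illustration only, not the posted patch.]

/*
 * Illustrative sketch only: validate the command id of an incoming
 * completion before touching the request, and escalate to error
 * recovery instead of crashing on a bogus tag.
 */
static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
		struct nvme_completion *cqe)
{
	struct request *rq;

	rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue), cqe->command_id);
	if (unlikely(!rq || !blk_mq_request_started(rq))) {
		dev_err(queue->ctrl->ctrl.device,
			"queue %d received completion for invalid tag %#x\n",
			nvme_tcp_queue_id(queue), cqe->command_id);
		/*
		 * The frame is garbage: tear the connection down and let
		 * error recovery reset the controller rather than
		 * dereferencing a request we do not own.
		 */
		nvme_tcp_error_recovery(&queue->ctrl->ctrl);
		return -EINVAL;
	}

	if (!nvme_try_complete_req(rq, cqe->status, cqe->result))
		nvme_complete_rq(rq);

	return 0;
}

[As the discussion above notes, this only covers the invalid-tag case; a
corrupted frame that happens to carry a currently valid tag would still be
accepted, which is what the digests and TLS (TPAR 8010) are meant to catch.]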

Thread overview: 44+ messages
2021-02-12 18:17 [PATCH] nvme-tcp: Check if request has started before processing it Daniel Wagner
2021-02-12 20:58 ` Sagi Grimberg
2021-02-12 21:09   ` Keith Busch
2021-02-12 21:49     ` Sagi Grimberg
2021-02-13  8:46       ` Hannes Reinecke
2021-02-15 10:40         ` Daniel Wagner
2021-02-15 21:29           ` Sagi Grimberg
2021-02-26 12:35             ` Daniel Wagner
2021-02-26 12:54               ` Hannes Reinecke
2021-02-26 16:13                 ` Keith Busch
2021-02-26 16:42                   ` Hannes Reinecke
2021-02-26 17:19                     ` Keith Busch
2021-03-01 13:26                       ` Daniel Wagner
2021-03-01 13:55                         ` Hannes Reinecke
2021-03-01 16:05                           ` Keith Busch
2021-03-01 16:53                             ` Hannes Reinecke
2021-03-01 20:59                               ` Keith Busch
2021-03-02  7:18                                 ` Hannes Reinecke
2021-03-02 18:45                                   ` Keith Busch [this message]
2021-02-15 21:23         ` Sagi Grimberg
2021-02-16  8:51           ` Hannes Reinecke
2021-02-13  8:42 ` Hannes Reinecke
