From: Keith Busch <kbusch@kernel.org>
To: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-nvme@lists.infradead.org, Christoph Hellwig <hch@lst.de>,
	Chaitanya Kulkarni <Chaitanya.Kulkarni@wdc.com>
Subject: Re: [PATCH] nvme-tcp: fix incorrect h2cdata pdu offset accounting
Date: Thu, 30 Sep 2021 14:15:10 -0600	[thread overview]
Message-ID: <20210930201510.GA94540@C02WT3WMHTD6> (raw)
In-Reply-To: <d969a4b7-4d3c-9df7-5a7d-9700bff37d50@grimberg.me>

On Wed, Sep 29, 2021 at 12:00:20AM +0300, Sagi Grimberg wrote:
> 
> 
> On 9/28/21 11:40 PM, Keith Busch wrote:
> > On Tue, Sep 14, 2021 at 06:38:55PM +0300, Sagi Grimberg wrote:
> > > When the controller sends us multiple r2t PDUs for a single
> > > request we need to account for them correctly, as our send and
> > > recv contexts run concurrently (i.e. we may get a new r2t with
> > > r2t_offset before we have updated our iterator and the
> > > req->data_sent marker). This can cause wrong offsets to be sent
> > > to the controller.
> > > 
> > > To fix that, note that this can happen only in the send
> > > sequence of the last page, hence we take the r2t_offset for
> > > the h2c PDU data_offset, and in the nvme_tcp_try_send_data
> > > loop we make sure to increment the request markers also when
> > > we have completed a PDU but still expect more r2t PDUs, since
> > > we have not yet sent the entire data of the request.
> > > 
> > > Fixes: 825619b09ad3 ("nvme-tcp: fix possible use-after-completion")
> > > Reported-by: Nowak, Lukasz <Lukasz.Nowak@Dell.com>
> > > Tested-by: Nowak, Lukasz <Lukasz.Nowak@Dell.com>
> > > Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> > > ---
> > > - Keith, can you ask the WD team to test this as well?
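
For reference, a minimal, self-contained sketch of the accounting the
patch description above calls for (illustrative names and simplified
types, not the kernel's nvme_tcp structures or the posted diff): the h2c
data_offset is taken from the controller's r2t_offset rather than our own
send marker, and the request markers keep advancing when an h2c payload
completes while the request still has data outstanding.

#include <stdint.h>

/* Illustrative, simplified types; not the kernel's nvme_tcp structures. */
struct r2t {
	uint32_t r2t_offset;	/* where the controller wants the data from */
	uint32_t r2t_length;	/* how many bytes it is asking for */
};

struct h2c_hdr {
	uint32_t data_offset;
	uint32_t data_length;
};

struct tcp_req {
	uint32_t data_len;	/* total request payload */
	uint32_t data_sent;	/* bytes sent so far */
	uint32_t pdu_len;	/* length granted by the current r2t */
	uint32_t pdu_sent;	/* bytes of the current h2c payload sent */
};

/* Build the h2c header from the r2t itself rather than our own marker. */
static void setup_h2c(struct h2c_hdr *hdr, struct tcp_req *req,
		      const struct r2t *r2t)
{
	hdr->data_offset = r2t->r2t_offset;	/* not req->data_sent */
	hdr->data_length = r2t->r2t_length;
	req->pdu_len = r2t->r2t_length;
	req->pdu_sent = 0;
}

/*
 * Called after 'sent' bytes were pushed to the socket: the markers keep
 * advancing when the current h2c payload completes but the request still
 * has data outstanding, i.e. another r2t is expected for the remainder.
 */
static void account_send(struct tcp_req *req, uint32_t sent)
{
	req->pdu_sent += sent;
	if (req->data_sent + sent < req->data_len)
		req->data_sent += sent;
}
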
> > 
> > I finally have some feedback on testing, and it looks like this patch
> > has created a regression. The testers are observing:
> > 
> >    [Tue Sep 28 02:12:55 2021] nvme nvme6: req 3 r2t len 8192 exceeded data len 8192 (4096 sent)
> >    [Tue Sep 28 02:12:55 2021] nvme nvme6: receive failed:  -71
> >    [Tue Sep 28 02:12:55 2021] nvme nvme6: starting error recovery
> > 
> > Which we had seen before, and you fixed with commit:
> > 
> >    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=825619b09ad351894d2c6fb6705f5b3711d145c7
> > 
> > I only just recently heard about this, so I haven't looked into any
> > additional possible cause for this regression yet, but just wanted to
> > let you know earlier.
> 
> Yes I definitely see the bug... Damn it..
> 
> The issue here is that we peek at req->data_sent and req->data_len
> after the last send, but by then the request may have already
> completed and these members may have been overwritten by a new
> execution of the request.
> 
> We really should make sure that we record these before we perform
> the network send to avoid this race.
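
A rough sketch of that record-before-send idea (hypothetical names and a
placeholder socket send, not the patch Sagi actually posted): take copies
of the request markers before the send, and base every post-send decision
on the copies, since the live fields may already belong to a new execution
of the request.

#include <stddef.h>
#include <stdint.h>

struct tcp_req {
	uint32_t data_len;	/* total payload of this request */
	uint32_t data_sent;	/* bytes pushed to the socket so far */
};

/* Placeholder for the real socket send; pretends everything was accepted. */
static int send_chunk(struct tcp_req *req, size_t len)
{
	(void)req;
	return (int)len;
}

static int try_send_data(struct tcp_req *req, size_t chunk)
{
	for (;;) {
		/* Record the markers *before* handing data to the socket... */
		uint32_t data_sent = req->data_sent;
		uint32_t data_len = req->data_len;
		int ret = send_chunk(req, chunk);

		if (ret <= 0)
			return ret;

		/*
		 * ...and consult only those copies afterwards: once the last
		 * byte is on the wire, the receive path may complete the
		 * command and recycle the request, overwriting the live
		 * fields for a new execution.
		 */
		if (data_sent + (uint32_t)ret >= data_len)
			return 1;	/* last send: the request is hands-off */

		req->data_sent += (uint32_t)ret;	/* not the last send */
	}
}
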
> 
> Can you guys check that this restores the correct behavior?

Unfortunately this was unsuccessful. The same issue is still occurring. Please
let me know if you have another patch to try. I'll keep looking for a
solution as well.

