From: Sagi Grimberg <sagi@grimberg.me>
To: Chaitanya Kulkarni <Chaitanya.Kulkarni@wdc.com>,
	Keith Busch <kbusch@kernel.org>,
	Max Gurtovoy <mgurtovoy@nvidia.com>
Cc: "linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>
Subject: Re: [PATCH v2 4/4] nvme: code command_id with a genctr for use-after-free validation
Date: Wed, 26 May 2021 01:41:17 -0700	[thread overview]
Message-ID: <2426b42f-b345-2689-3bd8-8a24d6670ac2@grimberg.me> (raw)
In-Reply-To: <BYAPR04MB4965873E107F598F6BD95B1386249@BYAPR04MB4965.namprd04.prod.outlook.com>


>>> The bad controller should be fixed.
>>>
>>> In the past, I've sent patches that check that the sqid in the nvme cqe
>>> matches, to protect against faulty drives that might send the
>>> completion on a wrong msix.
>>>
>>> My patch wasn't accepted since it added an additional "if" in the fast path.
>>>
>>> Now we're adding many more operations in the fast path because of a
>>> buggy ctrl?
>> I shared the same performance concern on v1 on this series. I haven't
>> been able to test this one yet (only have emulation for two more weeks).
>>
>> Hannes says the bug this catches happens frequently enough on TCP. If we
>> don't catch it, we get a kernel panic or corruption, so we definitely need
>> to do something. Sagi correctly noted this type of bug is not unique to TCP
>> (or even NVMe, for that matter), but if there is a performance impact on
>> PCI, and no one so far has reported such an issue, I would still recommend
>> this type of mitigation be isolated to the transports that actually observe
>> invalid CQEs.
>>   
> 
> Please do that if possible.

I disagree with that. The original report happened to be about TCP, but
as far as I'm concerned that is pure coincidence; as I indicated before,
there is nothing TCP-specific about this.

Today we don't protect against such a scenario at all. I originally
dismissed this issue on the grounds that a buggy controller should
simply be fixed, but an argument was made that a buggy controller
should not be able to crash the kernel and that we should have a
safeguard against it. That is a valid argument, and a panic is not even
the worst that can happen: with slightly different timing this could
have caused silent data corruption. So the mitigation seems absolutely
appropriate to me considering the possible consequences.

The way I see it, we should decide where we stand here: either we
continue to ignore this possible panic/data-corruption (which I
personally think would be wrong), or we address it, and if we do,
we should do it in the correct place, not limit it to the component
where it happened to be observed.

As for the performance concerns, I'd be surprised if performance were
noticeably impacted by 2 assignments and 2 optimized branches.
Also, the overall nvme_request did not grow (still 32 bytes, half a
cacheline). But let's see if measurements prove otherwise...

Before patch:
struct nvme_request {
         struct nvme_command *      cmd;                  /*     0     8 */
         union nvme_result          result;               /*     8     8 */
         u8                         retries;              /*    16     1 */
         u8                         flags;                /*    17     1 */
         u16                        status;               /*    18     2 */

         /* XXX 4 bytes hole, try to pack */

         struct nvme_ctrl *         ctrl;                 /*    24     8 */

         /* size: 32, cachelines: 1, members: 6 */
         /* sum members: 28, holes: 1, sum holes: 4 */
         /* last cacheline: 32 bytes */
};

After patch:
struct nvme_request {
         struct nvme_command *      cmd;                  /*     0     8 */
         union nvme_result          result;               /*     8     8 */
         u8                         genctr;               /*    16     1 */
         u8                         retries;              /*    17     1 */
         u8                         flags;                /*    18     1 */

         /* XXX 1 byte hole, try to pack */

         u16                        status;               /*    20     2 */

         /* XXX 2 bytes hole, try to pack */

         struct nvme_ctrl *         ctrl;                 /*    24     8 */

         /* size: 32, cachelines: 1, members: 7 */
         /* sum members: 29, holes: 2, sum holes: 3 */
         /* last cacheline: 32 bytes */
};



Thread overview: 25+ messages
2021-05-19 17:43 [PATCH v2 0/4] nvme: protect against possible request reference after completion Sagi Grimberg
2021-05-19 17:43 ` [PATCH v2 1/4] params: lift param_set_uint_minmax to common code Sagi Grimberg
2021-05-19 21:10   ` Chaitanya Kulkarni
2021-05-20  6:01   ` Christoph Hellwig
2021-05-19 17:43 ` [PATCH v2 2/4] nvme-pci: limit maximum queue depth to 4095 Sagi Grimberg
2021-05-19 21:12   ` Chaitanya Kulkarni
2021-05-26  8:55   ` Hannes Reinecke
2021-05-19 17:43 ` [PATCH v2 3/4] nvme-tcp: don't check blk_mq_tag_to_rq when receiving pdu data Sagi Grimberg
2021-05-26  8:56   ` Hannes Reinecke
2021-05-19 17:43 ` [PATCH v2 4/4] nvme: code command_id with a genctr for use-after-free validation Sagi Grimberg
2021-05-20  6:49   ` Daniel Wagner
2021-05-25 22:50   ` Max Gurtovoy
2021-05-26  0:39     ` Keith Busch
2021-05-26  1:47       ` Chaitanya Kulkarni
2021-05-26  8:41         ` Sagi Grimberg [this message]
2021-05-26  8:48           ` Hannes Reinecke
2021-05-26  9:26           ` Max Gurtovoy
2021-05-26  8:59   ` Hannes Reinecke
2021-06-16 16:28 ` [PATCH v2 0/4] nvme: protect against possible request reference after completion Sagi Grimberg
2021-06-16 17:04   ` Keith Busch
2021-06-16 21:05     ` Sagi Grimberg
2021-07-01 11:51       ` Daniel Wagner
2021-07-01 11:52         ` Christoph Hellwig
2021-07-08 10:02           ` Daniel Wagner
2021-07-09  6:37             ` Christoph Hellwig
