From: Max Gurtovoy <mgurtovoy@nvidia.com>
To: Sagi Grimberg <sagi@grimberg.me>, Keith Busch <kbusch@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>, <linux-nvme@lists.infradead.org>,
	<chaitanyak@nvidia.com>, <oren@nvidia.com>, <benishay@nvidia.com>,
	<borisp@nvidia.com>, <aviadye@nvidia.com>, <idanb@nvidia.com>,
	<jsmart2021@gmail.com>
Subject: Re: [PATCH v1 0/4] Add command id quirk for fabrics
Date: Thu, 11 Nov 2021 11:29:11 +0200
Message-ID: <fc27faf4-c876-ced5-c9cb-2d6730e07f2d@nvidia.com>
In-Reply-To: <e2d589ab-2457-a3c8-ba2a-9e4f813d9a72@grimberg.me>


On 11/10/2021 9:45 PM, Sagi Grimberg wrote:
> Hey, sorry for the late chime here, ramping up on some emails.
>
>>>>>>> Max, if you can't point us to a broken target (and yes, it is
>>>>>>> broken) this will not go anywhere.
>>>>>> Any target that uses Apple device as backend can be harmed.
>>>>>>
>>>>>> The simplest example is the Linux PT target, which copies the sqe
>>>>>> as-is and passes it to the NVMe Apple drive.
>>>>> Take another close look at how command_ids are assigned by the Linux
>>>>> driver. We obviously do not pass it through as that would be
>>>>> completely broken.
>>>> Also worth noting this driver has always defined the command id as a
>>>> __u16, not __le16, yet we don't have any bug reports from big-endian
>>>> hosts.
>>>
>>> Right, my bad. I thought that the pass-through target uses the same id.
>>>
>>> Linux PT target works fine.
>>>
>>> Bad example.
>>>
>>> Linux kernel world is covered but I still think we need to add this
>>> ability for fabrics controllers as we did for pci controllers.
>>>
>>> There are a lot of vendors out there with their own optimizations and
>>> solutions, and adding code by default to cover a broken TCP target (no
>>> one has said which target this is or why nobody fixed it) in a way that
>>> hurts others, even if it's spec compliant, is not good practice.
>
> Completely disagree here. The TCP original report was just an example of
> lack of protection we have against spurious completions. Nothing
> specific about nvme-tcp here, this was discussed and agreed on in
> the original report.
>
You are ignoring the facts:

1. The device that broke the spec in the first place was the one that 
caused you to add the gen bits to the CID.

2. These gen bits are what limit the queue depth to 4K.

3. It's not mentioned anywhere in the spec, and if it was intended to be 
implemented the way it is now, it would have been mentioned in the spec.

4. Since gen bits were introduced, other devices got broken (such as 
Apple), hence the quirk for PCI.

5. The gen bits add "if" conditions and logic to the fast path for 
"innocent" transports.

6. This series just extends this quirk for fabrics.

7. Even if not broken, some devices may suffer from reduced performance 
when the CID space spans all 16 possible bits - a fact that we ignore.

8. This series provides a flag to disable default behavior per connection.

9. This series doesn't add any logic to fast path.

10. My patch from last year for resiliency for nvme_pci was rejected 
because it added one if condition to the fast path - no consistency.
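
To make points 2, 5 and 8 concrete, here is a rough sketch of the kind of
encoding under discussion. It is not the driver code: the names (cid_encode,
cid_tag, cid_gen, skip_gen) are made up for illustration, and the 4/12 bit
split is an assumption that matches the 4K limit in point 2 (12 tag bits give
4096 tags, leaving 4 bits of a 16-bit CID for the generation counter).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 4 generation bits on top of a 12-bit tag. */
#define CID_GEN_BITS   4u
#define CID_TAG_BITS   (16u - CID_GEN_BITS)
#define CID_TAG_COUNT  (1u << CID_TAG_BITS)      /* 4096 tags per queue */
#define CID_TAG_MASK   (CID_TAG_COUNT - 1u)
#define CID_GEN_MASK   ((1u << CID_GEN_BITS) - 1u)

/*
 * Build a 16-bit command id from a tag and a generation counter.
 * skip_gen is a hypothetical per-connection flag: when set, the
 * generation bits are left at zero and the id is the bare tag.
 */
static uint16_t cid_encode(uint16_t tag, uint8_t gen, int skip_gen)
{
	assert(tag < CID_TAG_COUNT);
	if (skip_gen)
		return tag;
	return (uint16_t)(((gen & CID_GEN_MASK) << CID_TAG_BITS) | tag);
}

/* Recover the tag (low 12 bits) from a command id. */
static uint16_t cid_tag(uint16_t cid)
{
	return (uint16_t)(cid & CID_TAG_MASK);
}

/* Recover the generation counter (high 4 bits) from a command id. */
static uint8_t cid_gen(uint16_t cid)
{
	return (uint8_t)(cid >> CID_TAG_BITS);
}
```

With this layout, tag 0x123 at generation 5 goes on the wire as 0x5123, while
the same tag with skip_gen set goes out as 0x0123, which is what a device
that treats the CID as a small index would expect to see.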


>> Could you qualify the harm this caused? The command id is just an opaque
>> cookie; the target should not do any interpretation on it, so this
>> encoding should be inconsequential from the target's perspective.
>
> Exactly, the command id is an opaque value that is solely up to the
> host's discretion in terms of how to use it. It's pure coincidence that
> Linux uses it for command indexes.
>
> Any implementation that interprets command ids to _anything_ needs
> a quirk, not the other way around.
>
>> There are more hosts than just Linux that may encode id's with flags for
>> driver use, so non-compliance here is just asking for trouble.
>
> I know of at least one significant host implementation where command
> ids are not indexes.
>
>> If a vendor wants to constrain the command id for some vendor specific
>> optimization, they should bring forth a TPar and fight it out in the
>> workgroup.
>>
>> We did get bug reports that not validating command id's will crash the
>> kernel or corrupt data if an unexpected response is observed. Even
>> though the incorrect id is not the kernel's fault, we generally strive
>> for resilience against those types of observations in spite of
>> potentially flaky hardware.
>
> Agreed.
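
For reference, the resilience argument quoted above boils down to something
like the following check on the completion path. This is only a sketch under
the same assumed 4/12 bit split; struct slot, slot_submit and slot_complete
are hypothetical names, not the kernel's.

```c
#include <stdbool.h>
#include <stdint.h>

#define TAG_BITS    12u
#define SLOT_COUNT  (1u << TAG_BITS)
#define TAG_MASK    (SLOT_COUNT - 1u)
#define GEN_MASK    0xfu

/* Host-side state for one tag; the slots array must hold SLOT_COUNT entries. */
struct slot {
	bool in_flight;
	uint8_t gen;    /* generation stamped at the last submit */
};

/* Bump the slot's generation, mark it in flight, and return the command id. */
static uint16_t slot_submit(struct slot *slots, uint16_t tag)
{
	struct slot *s = &slots[tag & TAG_MASK];

	s->gen = (uint8_t)((s->gen + 1u) & GEN_MASK);
	s->in_flight = true;
	return (uint16_t)(((unsigned)s->gen << TAG_BITS) | (tag & TAG_MASK));
}

/*
 * Accept a completion only if its tag is in flight and its generation
 * matches the one stamped at submit; otherwise report it as spurious.
 */
static bool slot_complete(struct slot *slots, uint16_t cid)
{
	struct slot *s = &slots[cid & TAG_MASK];

	if (!s->in_flight || s->gen != (cid >> TAG_BITS))
		return false;
	s->in_flight = false;
	return true;
}
```

A duplicate or late completion for a tag that has since been reused carries
the old generation, so it is rejected instead of completing the new command.
The "if" in slot_complete is the kind of fast-path branch point 5 refers to.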



Thread overview: 26+ messages
2021-11-08 14:46 [PATCH v1 0/4] Add command id quirk for fabrics Max Gurtovoy
2021-11-08 14:46 ` [PATCH 1/1 nvmecli] fabrics: add new --skip-cid-gen flag to connect cmd Max Gurtovoy
2021-11-08 14:46 ` [PATCH 1/1 libnvme] fabrics: add support for new cli --skip-cid-gen flag Max Gurtovoy
2021-11-08 14:47 ` [PATCH 1/4] nvme-fabrics: add command id quirk for fabrics controllers Max Gurtovoy
2021-11-08 14:47 ` [PATCH 2/4] nvme-rdma: add command id quirk for RDMA controllers Max Gurtovoy
2021-11-08 14:47 ` [PATCH 3/4] nvme-tcp: add command id quirk for TCP controllers Max Gurtovoy
2021-11-08 14:47 ` [PATCH 4/4] nvme-fc: add command id quirk for FC controllers Max Gurtovoy
2021-11-08 16:45 ` [PATCH v1 0/4] Add command id quirk for fabrics Keith Busch
2021-11-09  8:09   ` Christoph Hellwig
2021-11-09 12:08     ` Max Gurtovoy
2021-11-09 13:15       ` Christoph Hellwig
2021-11-09 14:23         ` Max Gurtovoy
2021-11-09 14:31           ` Christoph Hellwig
2021-11-09 16:15             ` Keith Busch
2021-11-09 16:59               ` Max Gurtovoy
2021-11-09 19:04                 ` Keith Busch
2021-11-10 19:45                   ` Sagi Grimberg
2021-11-11  9:29                     ` Max Gurtovoy [this message]
2021-11-11 17:36                       ` Keith Busch
2021-11-12 16:07                       ` Sagi Grimberg
2021-11-12 21:37                         ` Keith Busch
2021-11-18 11:19                         ` Max Gurtovoy
2021-11-21 10:05                           ` Sagi Grimberg
2021-11-10 10:32       ` Daniel Wagner
2021-11-10 10:56         ` Max Gurtovoy
2021-11-10 11:18           ` Daniel Wagner
