* Re: [RFC PATCH 0/3] nvme sq associations
@ 2021-09-29  0:48 Nikitin, Andrey
  2021-09-29  1:35 ` Keith Busch
  0 siblings, 1 reply; 8+ messages in thread
From: Nikitin, Andrey @ 2021-09-29  0:48 UTC (permalink / raw)
  To: Christoph Hellwig, Benjamin Herrenschmidt
  Cc: Keith Busch, linux-nvme, Buches, Dave

On 9/25/21, 01:38, "Christoph Hellwig" <hch@infradead.org> wrote:
>
> Honestly I'd rather not merge this whole patchset at all.  It is a
> completely fringe feature for a totally misdesigned part of the NVMe
> spec.  Until actual controllers in the hands of prosumers support
> anything like this, I'm very reluctant to bloat the driver fast path
> for it.

Thank you for the feedback.
While I agree with your remarks regarding the design of the feature in the
NVMe spec, the minimal implementation proposed in this patchset would help
resolve the problems outlined in the original post (undesired queue sharing
and noisy neighbor).
For controllers that do not support NVM sets and SQ associations, the
configuration stays exactly as it is today, so I would be interested to
hear more about your concerns for the driver fast path.

Best regards,
Andrey



* Re: [RFC PATCH 0/3] nvme sq associations
  2021-09-29  0:48 [RFC PATCH 0/3] nvme sq associations Nikitin, Andrey
@ 2021-09-29  1:35 ` Keith Busch
  0 siblings, 0 replies; 8+ messages in thread
From: Keith Busch @ 2021-09-29  1:35 UTC (permalink / raw)
  To: Nikitin, Andrey
  Cc: Christoph Hellwig, Benjamin Herrenschmidt, linux-nvme, Buches, Dave

On Wed, Sep 29, 2021 at 12:48:49AM +0000, Nikitin, Andrey wrote:
> On 9/25/21, 01:38, "Christoph Hellwig" <hch@infradead.org> wrote:
> >
> > Honestly I'd rather not merge this whole patchset at all.  It is a
> > completely fringe feature for a totally misdesigned part of the NVMe
> > spec.  Until actual controllers in the hands of prosumers support
> > anything like this, I'm very reluctant to bloat the driver fast path
> > for it.
> 
> Thank you for the feedback.
> While I agree with your remarks regarding the design of the feature in the
> NVMe spec, the minimal implementation proposed in this patchset would help
> resolve the problems outlined in the original post (undesired queue sharing
> and noisy neighbor).

You still get noisy-neighbor behavior with this approach, though:
completions from one set can stall the driver from reaping another set's
completions that were posted earlier.
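
A toy user-space model of that ordering effect (assuming the per-set CQs
that end up sharing a handler get polled in a fixed order; the names and
numbers are made up, this is not driver code):

#include <stdio.h>

/* Completions are reaped in CQ-iteration order, not in the order they
 * were posted, so a burst on one set's CQ delays an older entry sitting
 * on another set's CQ serviced by the same handler. */
struct cqe { unsigned int set; unsigned int posted_at; };

int main(void)
{
	struct cqe cq1[] = { {1, 5}, {1, 6}, {1, 7}, {1, 8} };	/* set 1 burst */
	struct cqe cq2[] = { {2, 1} };				/* posted first */
	struct cqe *cqs[] = { cq1, cq2 };
	unsigned int len[] = { 4, 1 };
	unsigned int t = 10;	/* time the interrupt fires */

	for (unsigned int c = 0; c < 2; c++)
		for (unsigned int i = 0; i < len[c]; i++, t++)
			printf("t=%u: reap set %u entry posted at t=%u\n",
			       t, cqs[c][i].set, cqs[c][i].posted_at);
	return 0;
}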


* Re: [RFC PATCH 0/3] nvme sq associations
  2021-09-29  6:07 ` Chaitanya Kulkarni
@ 2021-09-29 13:17   ` Sagi Grimberg
  0 siblings, 0 replies; 8+ messages in thread
From: Sagi Grimberg @ 2021-09-29 13:17 UTC (permalink / raw)
  To: Chaitanya Kulkarni, Andrey Nikitin, linux-nvme; +Cc: benh, davebuch


>> The NVMe specification allows for namespaces with different performance
>> characteristics, and it allows IO submissions to any namespace via any
>> non-empty submission queue. However, sharing queue resources between
>> namespaces with different performance characteristics can cause undesired
>> behavior (e.g. head-of-line blocking when IOs that target a high-performance
>> namespace sit behind IOs that target a low-performance namespace in the same
>> queue). In addition, the lack of hardware queue isolation can cause
>> “noisy neighbor” problems for applications issuing IOs to different
>> namespaces of the same controller. This problem may be especially pronounced
>> in multi-tenant environments such as those provided by cloud services.
>>
> 
> For an RFC such as this you need to provide quantitative data for
> different use cases, both targeted (e.g. the noisy neighbor problem) and
> non-targeted workloads (e.g. standard fio perf), and also the performance
> impact of this change on the hot path.

Agreed.


* Re: [RFC PATCH 0/3] nvme sq associations
  2021-09-24 21:08 Andrey Nikitin
  2021-09-25  3:02 ` Keith Busch
@ 2021-09-29  6:07 ` Chaitanya Kulkarni
  2021-09-29 13:17   ` Sagi Grimberg
  1 sibling, 1 reply; 8+ messages in thread
From: Chaitanya Kulkarni @ 2021-09-29  6:07 UTC (permalink / raw)
  To: Andrey Nikitin, linux-nvme; +Cc: benh, davebuch

On 9/24/21 2:08 PM, Andrey Nikitin wrote:
> The NVMe specification allows for namespaces with different performance
> characteristics, and it allows IO submissions to any namespace via any
> non-empty submission queue. However, sharing queue resources between
> namespaces with different performance characteristics can cause undesired
> behavior (e.g. head-of-line blocking when IOs that target a high-performance
> namespace sit behind IOs that target a low-performance namespace in the same
> queue). In addition, the lack of hardware queue isolation can cause
> “noisy neighbor” problems for applications issuing IOs to different
> namespaces of the same controller. This problem may be especially pronounced
> in multi-tenant environments such as those provided by cloud services.
> 

For an RFC such as this you need to provide quantitative data for
different use cases, both targeted (e.g. the noisy neighbor problem) and
non-targeted workloads (e.g. standard fio perf), and also the performance
impact of this change on the hot path.



* Re: [RFC PATCH 0/3] nvme sq associations
  2021-09-25  8:31   ` Benjamin Herrenschmidt
@ 2021-09-25  8:36     ` Christoph Hellwig
  0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2021-09-25  8:36 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Keith Busch, Andrey Nikitin, linux-nvme, davebuch

On Sat, Sep 25, 2021 at 06:31:58PM +1000, Benjamin Herrenschmidt wrote:
> On Sat, 2021-09-25 at 12:02 +0900, Keith Busch wrote:
> > 
> > Different submission queue groups per NVM Set sounds right for this
> > feature, but I'm not sure it makes sense for these to have their own
> > completion queues: completions from different sets would try to
> > schedule on the same CPU. I think it should be more efficient to
> > break the 1:1
> > SQ:CQ pairing, and instead have all the SQs with the same CPU
> > affinity share a single CQ so that completions from different
> > namespaces could be handled in a single interrupt.
> 
> Can this be an incremental improvement?

Honestly I'd rather not merge this whole patchset at all.  It is a
completely fringe feature for a totally misdesigned part of the NVMe
spec.  Until actual controllers in the hands of prosumers support
anything like this, I'm very reluctant to bloat the driver fast path
for it.


* Re: [RFC PATCH 0/3] nvme sq associations
  2021-09-25  3:02 ` Keith Busch
@ 2021-09-25  8:31   ` Benjamin Herrenschmidt
  2021-09-25  8:36     ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2021-09-25  8:31 UTC (permalink / raw)
  To: Keith Busch, Andrey Nikitin; +Cc: linux-nvme, davebuch

On Sat, 2021-09-25 at 12:02 +0900, Keith Busch wrote:
> 
> Different submission queue groups per NVM Set sounds right for this
> feature, but I'm not sure it makes sense for these to have their own
> completion queues: completions from different sets would try to
> schedule on the same CPU. I think it should be more efficient to
> break the 1:1
> SQ:CQ pairing, and instead have all the SQs with the same CPU
> affinity share a single CQ so that completions from different
> namespaces could be handled in a single interrupt.

Can this be an incremental improvement?

Cheers,
Ben.




* Re: [RFC PATCH 0/3] nvme sq associations
  2021-09-24 21:08 Andrey Nikitin
@ 2021-09-25  3:02 ` Keith Busch
  2021-09-25  8:31   ` Benjamin Herrenschmidt
  2021-09-29  6:07 ` Chaitanya Kulkarni
  1 sibling, 1 reply; 8+ messages in thread
From: Keith Busch @ 2021-09-25  3:02 UTC (permalink / raw)
  To: Andrey Nikitin; +Cc: linux-nvme, benh, davebuch

On Fri, Sep 24, 2021 at 09:08:06PM +0000, Andrey Nikitin wrote:
> The NVMe specification allows for namespaces with different performance
> characteristics, and it allows IO submissions to any namespace via any
> non-empty submission queue. However, sharing queue resources between
> namespaces with different performance characteristics can cause undesired
> behavior (e.g. head-of-line blocking when IOs that target a high-performance
> namespace sit behind IOs that target a low-performance namespace in the same
> queue). In addition, the lack of hardware queue isolation can cause
> “noisy neighbor” problems for applications issuing IOs to different
> namespaces of the same controller. This problem may be especially pronounced
> in multi-tenant environments such as those provided by cloud services.
> 
> The NVMe 1.4 specification introduced optional features (NVM sets and
> SQ associations) that can be used to improve this situation, provided the
> features are supported by both the controller and the host driver. Namespaces
> can be assigned to NVM sets (grouped by performance characteristics, for
> example), with each NVM set having its own set of associated queues.
> 
> This patch series proposes a simple implementation of NVM sets and
> SQ associations for the NVMe host PCI module.  On a controller that supports
> these features and exposes a sufficient number of queue pairs (at least
> one per NVM set), the available queue pairs are distributed uniformly
> across the NVM sets. IO requests directed at the controller honor
> the namespace/NVM set/queue association because each NVM set gets
> its own blk-mq tagset.

Different submission queue groups per NVM Set sounds right for this
feature, but I'm not sure it makes sense for these to have their own
completion queues: completions from different sets would try to schedule
on the same CPU. I think it should be more efficient to break the 1:1
SQ:CQ pairing, and instead have all the SQs with the same CPU affinity
share a single CQ so that completions from different namespaces could be
handled in a single interrupt.
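
A rough stand-alone sketch of that topology (hypothetical names, a
user-space model only, not a proposal for the actual driver code): one
CQ per interrupt vector, and one SQ per NVM set posting to it, i.e. N:1
SQ:CQ instead of 1:1.

#include <stdio.h>

#define NR_VECTORS	4	/* assumed: one vector per CPU in the affinity mask */
#define NR_SETS		2

struct cq { unsigned int cqid; };
struct sq { unsigned int sqid; unsigned int cqid; unsigned int set; };

int main(void)
{
	struct cq cqs[NR_VECTORS];
	struct sq sqs[NR_VECTORS * NR_SETS];
	unsigned int v, s, sqid = 1;

	for (v = 0; v < NR_VECTORS; v++) {
		cqs[v].cqid = v + 1;		/* one shared CQ per vector/CPU */
		for (s = 0; s < NR_SETS; s++) {
			struct sq *q = &sqs[v * NR_SETS + s];

			q->sqid = sqid++;
			q->set = s + 1;
			q->cqid = cqs[v].cqid;	/* every SQ on this vector shares it */
			printf("vector %u: SQ %u (set %u) -> CQ %u\n",
			       v, q->sqid, q->set, q->cqid);
		}
	}
	return 0;
}

With that layout, completions for every namespace mapped to a given CPU
arrive through one CQ and can be handled in a single interrupt.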


* [RFC PATCH 0/3] nvme sq associations
@ 2021-09-24 21:08 Andrey Nikitin
  2021-09-25  3:02 ` Keith Busch
  2021-09-29  6:07 ` Chaitanya Kulkarni
  0 siblings, 2 replies; 8+ messages in thread
From: Andrey Nikitin @ 2021-09-24 21:08 UTC (permalink / raw)
  To: linux-nvme; +Cc: benh, davebuch, Andrey Nikitin

The NVMe specification allows for namespaces with different performance
characteristics, and it allows IO submissions to any namespace via any
non-empty submission queue. However, sharing queue resources between
namespaces with different performance characteristics can cause undesired
behavior (e.g. head-of-line blocking when IOs that target a high-performance
namespace sit behind IOs that target a low-performance namespace in the same
queue). In addition, the lack of hardware queue isolation can cause
“noisy neighbor” problems for applications issuing IOs to different
namespaces of the same controller. This problem may be especially pronounced
in multi-tenant environments such as those provided by cloud services.

The NVMe 1.4 specification introduced optional features (NVM sets and
SQ associations) that can be used to improve this situation, provided the
features are supported by both the controller and the host driver. Namespaces
can be assigned to NVM sets (grouped by performance characteristics, for
example), with each NVM set having its own set of associated queues.
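
To make the association concrete, a minimal user-space sketch of the
host-side rule (struct and field names are illustrative only; the spec
fields involved are the NVM Set Identifier reported by Identify Namespace
and the one supplied when an I/O submission queue is created, so please
check NVMe 1.4 rather than relying on this sketch):

#include <stdint.h>
#include <stdio.h>

/* Each namespace reports which NVM set it belongs to. */
struct ns_info {
	uint32_t nsid;
	uint16_t nvmsetid;
};

/* Each I/O submission queue is created with an NVM set association. */
struct sq_info {
	uint16_t sqid;
	uint16_t nvmsetid;
};

/* A host honoring SQ associations only submits I/O for a namespace on
 * SQs carrying that namespace's NVM set identifier. */
static int sq_usable_for_ns(const struct sq_info *sq, const struct ns_info *ns)
{
	return sq->nvmsetid == ns->nvmsetid;
}

int main(void)
{
	struct ns_info ns = { .nsid = 1, .nvmsetid = 2 };
	struct sq_info sqs[] = {
		{ .sqid = 1, .nvmsetid = 1 },
		{ .sqid = 2, .nvmsetid = 2 },
	};

	for (unsigned int i = 0; i < 2; i++)
		printf("SQ %u %s carry I/O for NSID %u\n", sqs[i].sqid,
		       sq_usable_for_ns(&sqs[i], &ns) ? "may" : "may not",
		       ns.nsid);
	return 0;
}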

This patch series proposes a simple implementation of NVM sets and
SQ associations for the NVMe host PCI module.  On a controller that supports
these features and exposes a sufficient number of queue pairs (at least
one per NVM set), the available queue pairs are distributed uniformly
across the NVM sets. IO requests directed at the controller honor
the namespace/NVM set/queue association because each NVM set gets
its own blk-mq tagset.
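
As a stand-alone illustration of that distribution (made-up names,
ordinary user-space C rather than the patch code):

#include <stdio.h>

/* Split nr_io_queues queue pairs across nr_sets NVM sets as evenly as
 * possible; each set would then get its own blk-mq tagset covering only
 * its share of the queues. */
static void distribute_queues(unsigned int nr_io_queues, unsigned int nr_sets)
{
	unsigned int base = nr_io_queues / nr_sets;
	unsigned int rem = nr_io_queues % nr_sets;

	for (unsigned int set = 0; set < nr_sets; set++) {
		/* lower-numbered sets absorb the remainder first */
		unsigned int q = base + (set < rem ? 1 : 0);

		printf("NVM set %u: %u queue pair(s), own tagset\n",
		       set + 1, q);
	}
}

int main(void)
{
	distribute_queues(10, 3);	/* e.g. 10 queue pairs, 3 NVM sets */
	return 0;
}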

Andrey Nikitin (3):
  nvme: split admin queue in pci
  nvme: add NVM set structures
  nvme: implement SQ associations

 drivers/nvme/host/core.c   |  18 +-
 drivers/nvme/host/fc.c     |   1 +
 drivers/nvme/host/nvme.h   |  10 +
 drivers/nvme/host/pci.c    | 363 +++++++++++++++++++++++++------------
 drivers/nvme/host/rdma.c   |   1 +
 drivers/nvme/host/tcp.c    |   1 +
 drivers/nvme/target/loop.c |   1 +
 include/linux/nvme.h       |  11 +-
 8 files changed, 286 insertions(+), 120 deletions(-)

-- 
2.32.0



