linux-kernel.vger.kernel.org archive mirror
* [PATCH] nvme: use the correct msix vector for each queue
@ 2016-12-07 22:03 Dan Streetman
  2016-12-07 22:44 ` Keith Busch
  0 siblings, 1 reply; 8+ messages in thread
From: Dan Streetman @ 2016-12-07 22:03 UTC (permalink / raw)
  To: Keith Busch, Jens Axboe
  Cc: Dan Streetman, linux-nvme, linux-kernel, Dan Streetman

Change each queue's cq_vector to match its qid, instead of qid - 1.

The first queue is always the admin queue, and the remaining queues are
I/O queues.  The interrupt vectors they use are all in the same array,
however, the vector indexes for the admin and I/O queues are set up
differently; the admin queue's cq_vector is manually set to 0, while
each I/O queue's cq_vector is set to qid - 1.  Since the admin queue
is qid 0, and the I/O queues start at qid 1, using qid - 1 is wrong for the
I/O queues, as it makes the first I/O queue (qid 1) share the vector from
the admin queue (qid 0), and no queue uses the last interrupt vector.
Instead, each I/O queue should set its cq_vector to qid.

Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
---
 drivers/nvme/host/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 5e52034..def2285 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1117,7 +1117,7 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
 	struct nvme_dev *dev = nvmeq->dev;
 	int result;
 
-	nvmeq->cq_vector = qid - 1;
+	nvmeq->cq_vector = qid;
 	result = adapter_alloc_cq(dev, qid, nvmeq);
 	if (result < 0)
 		return result;
-- 
2.9.3


* Re: [PATCH] nvme: use the correct msix vector for each queue
  2016-12-07 22:44 ` Keith Busch
@ 2016-12-07 22:36   ` Dan Streetman
  2016-12-07 22:49     ` Keith Busch
  0 siblings, 1 reply; 8+ messages in thread
From: Dan Streetman @ 2016-12-07 22:36 UTC (permalink / raw)
  To: Keith Busch; +Cc: Jens Axboe, linux-nvme, linux-kernel, Dan Streetman

On Wed, Dec 7, 2016 at 5:44 PM, Keith Busch <keith.busch@intel.com> wrote:
> On Wed, Dec 07, 2016 at 05:03:48PM -0500, Dan Streetman wrote:
>> Change each queue's cq_vector to match its qid, instead of qid - 1.
>>
>> The first queue is always the admin queue, and the remaining queues are
>> I/O queues.  The interrupt vectors they use are all in the same array,
>> however, the vector indexes for the admin and I/O queues are set up
>> differently; the admin queue's cq_vector is manually set to 0, while
>> each I/O queue's cq_vector is set to qid - 1.  Since the admin queue
>> is qid 0, and the I/O queues start at qid 1, using qid - 1 is wrong for the
>> I/O queues, as it makes the first I/O queue (qid 1) share the vector from
>> the admin queue (qid 0), and no queue uses the last interrupt vector.
>> Instead, each I/O queue should set its cq_vector to qid.
>
> pci_alloc_irq_vectors doesn't know you intend to make the first
> vector special, so it's going to come up with a CPU affinity from
> blk_mq_pci_map_queues that clashes with what you've programmed in the
> IO completion queues.

I don't follow.  You're saying you mean to share cq_vector 0 between
the admin queue and io queue 1?


* Re: [PATCH] nvme: use the correct msix vector for each queue
  2016-12-07 22:03 [PATCH] nvme: use the correct msix vector for each queue Dan Streetman
@ 2016-12-07 22:44 ` Keith Busch
  2016-12-07 22:36   ` Dan Streetman
  0 siblings, 1 reply; 8+ messages in thread
From: Keith Busch @ 2016-12-07 22:44 UTC (permalink / raw)
  To: Dan Streetman; +Cc: Jens Axboe, linux-nvme, linux-kernel, Dan Streetman

On Wed, Dec 07, 2016 at 05:03:48PM -0500, Dan Streetman wrote:
> Change each queue's cq_vector to match its qid, instead of qid - 1.
> 
> The first queue is always the admin queue, and the remaining queues are
> I/O queues.  The interrupt vectors they use are all in the same array,
> however, the vector indexes for the admin and I/O queues are set up
> differently; the admin queue's cq_vector is manually set to 0, while
> each I/O queue's cq_vector is set to qid - 1.  Since the admin queue
> is qid 0, and the I/O queues start at qid 1, using qid - 1 is wrong for the
> I/O queues, as it makes the first I/O queue (qid 1) share the vector from
> the admin queue (qid 0), and no queue uses the last interrupt vector.
> Instead, each I/O queue should set its cq_vector to qid.

pci_alloc_irq_vectors doesn't know you intend to make the first
vector special, so it's going to come up with a CPU affinity from
blk_mq_pci_map_queues that clashes with what you've programmed in the
IO completion queues.


* Re: [PATCH] nvme: use the correct msix vector for each queue
  2016-12-07 22:49     ` Keith Busch
@ 2016-12-07 22:46       ` Christoph Hellwig
  2016-12-07 22:49         ` Dan Streetman
  2016-12-07 22:46       ` Dan Streetman
  1 sibling, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2016-12-07 22:46 UTC (permalink / raw)
  To: Keith Busch
  Cc: Dan Streetman, Jens Axboe, Dan Streetman, linux-kernel, linux-nvme

On Wed, Dec 07, 2016 at 05:49:42PM -0500, Keith Busch wrote:
> I'm just saying that blk-mq's hctx mapping will end up choosing a queue
> whose vector is mapped to a different CPU, and we don't want that.

Right.  For 4.10 we could use the pci_alloc_irq_vectors_affinity helper
to set away a pre_vector IFF we want a separate vector for the admin
queue.

> We are currently sharing the first IO queue's interrupt vector with
> the admin queue's on purpose. Are you saying there's something wrong
> with that?

But given that the sharing was done intentionally and we had a long
discussion on it back then, there should be no real reason to change
the assignment in NVMe.
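The helper Christoph mentions was added for exactly this pattern: struct
irq_affinity's pre_vectors field excludes leading vectors from the automatic
CPU spreading.  A kernel-style pseudocode sketch of such a conversion
(illustrative only; pdev, nr_io_queues, and the min/max vector counts are
placeholders, not the driver's actual code):

```
/* Sketch: reserve vector 0 for the admin queue and let the remaining
 * vectors be spread across CPUs for the I/O queues. */
struct irq_affinity affd = {
	.pre_vectors = 1,	/* vector 0: admin queue, no spreading */
};
int nr_vecs;

nr_vecs = pci_alloc_irq_vectors_affinity(pdev, 2, nr_io_queues + 1,
		PCI_IRQ_MSIX | PCI_IRQ_AFFINITY, &affd);
if (nr_vecs < 0)
	return nr_vecs;
/* I/O queue qid would then use vector qid (1..nr_vecs - 1). */
```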


* Re: [PATCH] nvme: use the correct msix vector for each queue
  2016-12-07 22:49     ` Keith Busch
  2016-12-07 22:46       ` Christoph Hellwig
@ 2016-12-07 22:46       ` Dan Streetman
  2016-12-07 23:05         ` Keith Busch
  1 sibling, 1 reply; 8+ messages in thread
From: Dan Streetman @ 2016-12-07 22:46 UTC (permalink / raw)
  To: Keith Busch; +Cc: Jens Axboe, linux-nvme, linux-kernel, Dan Streetman

On Wed, Dec 7, 2016 at 5:49 PM, Keith Busch <keith.busch@intel.com> wrote:
> On Wed, Dec 07, 2016 at 05:36:00PM -0500, Dan Streetman wrote:
>> On Wed, Dec 7, 2016 at 5:44 PM, Keith Busch <keith.busch@intel.com> wrote:
>> > pci_alloc_irq_vectors doesn't know you intend to make the first
>> > vector special, so it's going to come up with a CPU affinity from
>> > blk_mq_pci_map_queues that clashes with what you've programmed in the
>> > IO completion queues.
>>
>> I don't follow.  You're saying you mean to share cq_vector 0 between
>> the admin queue and io queue 1?
>
> I'm just saying that blk-mq's hctx mapping will end up choosing a queue
> whose vector is mapped to a different CPU, and we don't want that.
>
> We are currently sharing the first IO queue's interrupt vector with
> the admin queue's on purpose. Are you saying there's something wrong
> with that?

That's intentional?  OK then.  That's extremely non-obvious.

Is there a reason you want to share the interrupt between the queues?


* Re: [PATCH] nvme: use the correct msix vector for each queue
  2016-12-07 22:46       ` Christoph Hellwig
@ 2016-12-07 22:49         ` Dan Streetman
  0 siblings, 0 replies; 8+ messages in thread
From: Dan Streetman @ 2016-12-07 22:49 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, Jens Axboe, Dan Streetman, linux-kernel, linux-nvme

On Wed, Dec 7, 2016 at 5:46 PM, Christoph Hellwig <hch@infradead.org> wrote:
> On Wed, Dec 07, 2016 at 05:49:42PM -0500, Keith Busch wrote:
>> I'm just saying that blk-mq's hctx mapping will end up choosing a queue
>> whose vector is mapped to a different CPU, and we don't want that.
>
> Right.  For 4.10 we could use the pci_alloc_irq_vectors_affinity helper
> to set away a pre_vector IFF we want a separate vector for the admin
> queue.
>
>> We are currently sharing the first IO queue's interrupt vector with
>> the admin queue's on purpose. Are you saying there's something wrong
>> with that?
>
> But given that the sharing was done intentionally and we had a long
> discussion on it back then, there should be no real reason to change
> the assignment in NVMe.

Sorry, I missed the past discussion.  It still seems strange and
obscure that it's intentional, from reading the code at least.


* Re: [PATCH] nvme: use the correct msix vector for each queue
  2016-12-07 22:36   ` Dan Streetman
@ 2016-12-07 22:49     ` Keith Busch
  2016-12-07 22:46       ` Christoph Hellwig
  2016-12-07 22:46       ` Dan Streetman
  0 siblings, 2 replies; 8+ messages in thread
From: Keith Busch @ 2016-12-07 22:49 UTC (permalink / raw)
  To: Dan Streetman; +Cc: Jens Axboe, linux-nvme, linux-kernel, Dan Streetman

On Wed, Dec 07, 2016 at 05:36:00PM -0500, Dan Streetman wrote:
> On Wed, Dec 7, 2016 at 5:44 PM, Keith Busch <keith.busch@intel.com> wrote:
> > pci_alloc_irq_vectors doesn't know you intend to make the first
> > vector special, so it's going to come up with a CPU affinity from
> > blk_mq_pci_map_queues that clashes with what you've programmed in the
> > IO completion queues.
> 
> I don't follow.  You're saying you mean to share cq_vector 0 between
> the admin queue and io queue 1?

I'm just saying that blk-mq's hctx mapping will end up choosing a queue
whose vector is mapped to a different CPU, and we don't want that.

We are currently sharing the first IO queue's interrupt vector with
the admin queue's on purpose. Are you saying there's something wrong
with that?


* Re: [PATCH] nvme: use the correct msix vector for each queue
  2016-12-07 22:46       ` Dan Streetman
@ 2016-12-07 23:05         ` Keith Busch
  0 siblings, 0 replies; 8+ messages in thread
From: Keith Busch @ 2016-12-07 23:05 UTC (permalink / raw)
  To: Dan Streetman; +Cc: Jens Axboe, linux-nvme, linux-kernel, Dan Streetman

On Wed, Dec 07, 2016 at 05:46:34PM -0500, Dan Streetman wrote:
> 
> Is there a reason you want to share the interrupt between the queues?

The admin queue is hardly ever used (if at all) compared to an IO queue's
usage, so why waste the resource? I bet you can't measure a performance
difference on queues that don't share, and it's reasonable to expect an
NVMe controller's MSI-X count matches the IO queue count, which forces
us to share a vector.

