From: Sagi Grimberg <sagi@grimberg.me>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-nvme@lists.infradead.org,
	Steve Wise <swise@opengridcomputing.com>,
	Max Gurtovoy <maxg@mellanox.com>
Subject: Re: [PATCH v2] block: fix rdma queue mapping
Date: Fri, 24 Aug 2018 19:17:38 -0700	[thread overview]
Message-ID: <83dd169f-034b-3460-7496-ef2e6766ea55@grimberg.me> (raw)
In-Reply-To: <20180822131130.GC28149@lst.de>


>> nvme-rdma attempts to map queues based on irq vector affinity.
>> However, for some devices, completion vector irq affinity is
>> configurable by the user which can break the existing assumption
>> that irq vectors are optimally arranged over the host cpu cores.
> 
> IFF affinity is configurable we should never use this code,
> as it breaks the model entirely.  ib_get_vector_affinity should
> never return a valid mask if affinity is configurable.

I agree that the model as initially intended doesn't fit. But it seems
that some users like to write into their nic's
/proc/irq/$IRQ/smp_affinity and get mad at us when managed affinity
doesn't let them.

So instead of falling back to the block mapping function we try
to do a little better first:
1. map according to the device vector affinity
2. map vectors that end up without a mapping to cpus that belong
    to the same numa-node
3. map all the rest of the unmapped cpus like the block layer
    would do.

We could have device drivers that don't use managed affinity never
return a valid mask, but that would rule out affinity-based mapping
entirely, which is optimal at least for users who do not meddle with
device irq affinity (probably the majority of users).

Thoughts?


Thread overview: 42+ messages
2018-08-20 20:54 [PATCH v2] block: fix rdma queue mapping Sagi Grimberg
2018-08-21  2:04 ` Ming Lei
2018-08-25  2:06   ` Sagi Grimberg
2018-08-25 12:18     ` Steve Wise
2018-08-27  3:50       ` Ming Lei
2018-08-22 13:11 ` Christoph Hellwig
2018-08-25  2:17   ` Sagi Grimberg [this message]
2018-10-03 19:05     ` Steve Wise
2018-10-03 21:14       ` Sagi Grimberg
2018-10-03 21:21         ` Steve Wise
2018-10-16  1:04         ` Sagi Grimberg
2018-10-17 16:37           ` Christoph Hellwig
2018-10-17 16:37       ` Christoph Hellwig
2018-10-23  6:02         ` Sagi Grimberg
2018-10-23 13:00           ` Steve Wise
2018-10-23 21:25             ` Sagi Grimberg
2018-10-23 21:31               ` Steve Wise
2018-10-24  0:09               ` Shiraz Saleem
2018-10-24  0:37                 ` Sagi Grimberg
2018-10-29 23:58                   ` Saleem, Shiraz
2018-10-30 18:26                     ` Sagi Grimberg
