From: Steve Wise <swise@opengridcomputing.com>
To: Sagi Grimberg <sagi@grimberg.me>, Christoph Hellwig <hch@lst.de>
Cc: linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-nvme@lists.infradead.org, Max Gurtovoy <maxg@mellanox.com>
Subject: Re: [PATCH v2] block: fix rdma queue mapping
Date: Wed, 3 Oct 2018 14:05:16 -0500
Message-ID: <33192971-7edd-a3b6-f2fa-abdcbef375de@opengridcomputing.com>
In-Reply-To: <83dd169f-034b-3460-7496-ef2e6766ea55@grimberg.me>

On 8/24/2018 9:17 PM, Sagi Grimberg wrote:
> 
>>> nvme-rdma attempts to map queues based on irq vector affinity.
>>> However, for some devices, completion vector irq affinity is
>>> configurable by the user which can break the existing assumption
>>> that irq vectors are optimally arranged over the host cpu cores.
>>
>> IFF affinity is configurable we should never use this code,
>> as it breaks the model entirely.  ib_get_vector_affinity should
>> never return a valid mask if affinity is configurable.
> 
> I agree that the model intended initially doesn't fit. But it seems
> that some users like to write into their nic's
> /proc/irq/$IRQ/smp_affinity and get mad at us for not letting them
> when using managed affinity.
> 
> So instead of falling back to the block mapping function we try
> to do a little better first:
> 1. map according to the device vector affinity
> 2. map vectors that end up without a mapping to cpus that belong
>    to the same numa-node
> 3. map all the rest of the unmapped cpus like the block layer
>    would do.
> 
> We could have device drivers that don't use managed affinity to never
> return a valid mask but that would never allow affinity based mapping
> which is optimal at least for users that do not meddle with device
> irq affinity (which is probably the majority of users).
> 
> Thoughts?
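
The three-step fallback listed in the quoted text can be sketched in
userspace Python. This is only an illustration under stated assumptions,
not the code from the patch: map_queues(), vector_affinity and cpu_node
are hypothetical names; "same numa-node" is read here as the node of any
CPU in the vector's affinity mask; and step 3 uses a plain round-robin as
a stand-in for whatever the block layer's default mapping does.

```python
from typing import Dict, Set


def map_queues(vector_affinity: Dict[int, Set[int]],
               cpu_node: Dict[int, int]) -> Dict[int, int]:
    """Return a cpu -> queue map built with a three-step fallback.

    vector_affinity: queue index -> CPUs in that vector's irq affinity mask
    cpu_node:        cpu index   -> NUMA node of that CPU
    (Both inputs are hypothetical stand-ins for kernel data.)
    """
    queues = sorted(vector_affinity)
    cpus = sorted(cpu_node)
    cpu_to_queue: Dict[int, int] = {}
    if not queues:
        return cpu_to_queue

    # Step 1: map according to the device vector affinity.
    # If two vectors claim the same CPU, the first claim wins.
    for q in queues:
        for cpu in sorted(vector_affinity[q]):
            if cpu in cpu_node:
                cpu_to_queue.setdefault(cpu, q)

    # Step 2: a queue that got no CPU in step 1 takes the first unmapped
    # CPU on a NUMA node that its affinity mask touches (assumption).
    for q in queues:
        if q in cpu_to_queue.values():
            continue
        nodes = {cpu_node[c] for c in vector_affinity[q] if c in cpu_node}
        for cpu in cpus:
            if cpu not in cpu_to_queue and cpu_node[cpu] in nodes:
                cpu_to_queue[cpu] = q
                break

    # Step 3: spread every CPU that is still unmapped round-robin
    # (a simplification, not the block layer's actual mapping function).
    for cpu in cpus:
        if cpu not in cpu_to_queue:
            cpu_to_queue[cpu] = queues[cpu % len(queues)]

    return cpu_to_queue


# Example: 8 CPUs on 2 nodes, 4 queues. The admin has pointed the vectors
# for queues 0 and 1 at the same CPU, which is the case that defeats a
# purely affinity-based mapping.
cpu_node = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1}
vector_affinity = {0: {0}, 1: {0}, 2: {4}, 3: {5}}
result = map_queues(vector_affinity, cpu_node)
```

In this example queue 1 loses cpu 0 to queue 0 in step 1 and is rescued
in step 2 by another cpu on node 0, so every queue still ends up with at
least one cpu and every cpu ends up with a queue.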

Can we please make forward progress on this?

Christoph, Sagi:  it seems you think /proc/irq/$IRQ/smp_affinity
shouldn't be allowed if drivers support managed affinity. Is that correct?

Perhaps that can be codified and be a way forward?  IE: Somehow allow
the admin to choose either "managed by the driver/ulps" or "managed by
the system admin directly"?

Or just use Sagi's patch.  Perhaps a WARN_ONCE() if the affinity looks
wonked when set via procfs?  Just thinking out loud...

But as it stands, things are just plain borked if an rdma driver
supports ib_get_vector_affinity() yet the admin changes the affinity via
/proc...

Thanks,

Steve.
