linux-nvme.lists.infradead.org archive mirror
* [PATCH blk-next 0/2] Delete the get_vector_affinity leftovers
@ 2020-09-29  9:13 Leon Romanovsky
  2020-09-29  9:13 ` [PATCH blk-next 1/2] blk-mq-rdma: Delete not-used multi-queue RDMA map queue code Leon Romanovsky
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Leon Romanovsky @ 2020-09-29  9:13 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe, Jens Axboe, Keith Busch
  Cc: Sagi Grimberg, linux-rdma, linux-kernel, linux-nvme, linux-block,
	Leon Romanovsky, Christoph Hellwig

From: Leon Romanovsky <leonro@nvidia.com>

There are no drivers that implement .get_vector_affinity(), so delete
the RDMA function and simplify block code.
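
The only in-tree consumer of this path is nvme-rdma, so the net effect of the
series on its queue mapping is small. Roughly (a sketch condensed from the
drivers/nvme/host/rdma.c hunk in patch 1/2, not a complete function):

	/* nvme_rdma_map_queues(): with the RDMA affinity helper gone there is
	 * nothing RDMA-specific left, so both maps simply use the generic
	 * block-layer spreading. */
	blk_mq_map_queues(&set->map[HCTX_TYPE_DEFAULT]);
	blk_mq_map_queues(&set->map[HCTX_TYPE_READ]);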

Thanks

P.S. Probably it should go through the block tree.

Leon Romanovsky (2):
  blk-mq-rdma: Delete not-used multi-queue RDMA map queue code
  RDMA/core: Delete not-implemented get_vector_affinity

 block/Kconfig                    |  5 ----
 block/Makefile                   |  1 -
 block/blk-mq-rdma.c              | 44 --------------------------------
 drivers/infiniband/core/device.c |  1 -
 drivers/nvme/host/rdma.c         |  7 ++---
 include/linux/blk-mq-rdma.h      | 11 --------
 include/rdma/ib_verbs.h          | 23 -----------------
 7 files changed, 2 insertions(+), 90 deletions(-)
 delete mode 100644 block/blk-mq-rdma.c
 delete mode 100644 include/linux/blk-mq-rdma.h

--
2.26.2



* [PATCH blk-next 1/2] blk-mq-rdma: Delete not-used multi-queue RDMA map queue code
  2020-09-29  9:13 [PATCH blk-next 0/2] Delete the get_vector_affinity leftovers Leon Romanovsky
@ 2020-09-29  9:13 ` Leon Romanovsky
  2020-09-29 10:20   ` Christoph Hellwig
  2020-09-29  9:13 ` [PATCH blk-next 2/2] RDMA/core: Delete not-implemented get_vector_affinity Leon Romanovsky
  2020-10-01  5:01 ` [PATCH blk-next 0/2] Delete the get_vector_affinity leftovers Leon Romanovsky
  2 siblings, 1 reply; 12+ messages in thread
From: Leon Romanovsky @ 2020-09-29  9:13 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe, Jens Axboe, Keith Busch
  Cc: Sagi Grimberg, linux-rdma, linux-kernel, linux-nvme, linux-block,
	Leon Romanovsky, Christoph Hellwig

From: Leon Romanovsky <leonro@nvidia.com>

The RDMA vector affinity code is not backed up by any driver and always
returns NULL to every ib_get_vector_affinity() call.

This means that blk_mq_rdma_map_queues() always takes the fallback path.
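
In other words, ib_get_vector_affinity() returns NULL on the very first queue
and the helper degenerates into the fallback; condensed, the function this
patch removes behaves as (annotated sketch, not the full body):

	for (queue = 0; queue < map->nr_queues; queue++) {
		mask = ib_get_vector_affinity(dev, first_vec + queue);
		if (!mask)		/* always NULL: no driver sets the op */
			goto fallback;	/* taken on the first iteration */
		/* the per-CPU assignment here is never reached */
	}
	return 0;			/* never reached either */

fallback:
	return blk_mq_map_queues(map);	/* what callers can now call directly */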

Fixes: 9afc97c29b03 ("mlx5: remove support for ib_get_vector_affinity")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 block/Kconfig               |  5 -----
 block/Makefile              |  1 -
 block/blk-mq-rdma.c         | 44 -------------------------------------
 drivers/nvme/host/rdma.c    |  7 ++----
 include/linux/blk-mq-rdma.h | 11 ----------
 5 files changed, 2 insertions(+), 66 deletions(-)
 delete mode 100644 block/blk-mq-rdma.c
 delete mode 100644 include/linux/blk-mq-rdma.h

diff --git a/block/Kconfig b/block/Kconfig
index bbad5e8bbffe..8ede308a1343 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -227,11 +227,6 @@ config BLK_MQ_VIRTIO
 	depends on BLOCK && VIRTIO
 	default y

-config BLK_MQ_RDMA
-	bool
-	depends on BLOCK && INFINIBAND
-	default y
-
 config BLK_PM
 	def_bool BLOCK && PM

diff --git a/block/Makefile b/block/Makefile
index 8d841f5f986f..bbdc3e82308a 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -29,7 +29,6 @@ obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o blk-integrity.o
 obj-$(CONFIG_BLK_DEV_INTEGRITY_T10)	+= t10-pi.o
 obj-$(CONFIG_BLK_MQ_PCI)	+= blk-mq-pci.o
 obj-$(CONFIG_BLK_MQ_VIRTIO)	+= blk-mq-virtio.o
-obj-$(CONFIG_BLK_MQ_RDMA)	+= blk-mq-rdma.o
 obj-$(CONFIG_BLK_DEV_ZONED)	+= blk-zoned.o
 obj-$(CONFIG_BLK_WBT)		+= blk-wbt.o
 obj-$(CONFIG_BLK_DEBUG_FS)	+= blk-mq-debugfs.o
diff --git a/block/blk-mq-rdma.c b/block/blk-mq-rdma.c
deleted file mode 100644
index 14f968e58b8f..000000000000
--- a/block/blk-mq-rdma.c
+++ /dev/null
@@ -1,44 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (c) 2017 Sagi Grimberg.
- */
-#include <linux/blk-mq.h>
-#include <linux/blk-mq-rdma.h>
-#include <rdma/ib_verbs.h>
-
-/**
- * blk_mq_rdma_map_queues - provide a default queue mapping for rdma device
- * @map:	CPU to hardware queue map.
- * @dev:	rdma device to provide a mapping for.
- * @first_vec:	first interrupt vectors to use for queues (usually 0)
- *
- * This function assumes the rdma device @dev has at least as many available
- * interrupt vetors as @set has queues.  It will then query it's affinity mask
- * and built queue mapping that maps a queue to the CPUs that have irq affinity
- * for the corresponding vector.
- *
- * In case either the driver passed a @dev with less vectors than
- * @set->nr_hw_queues, or @dev does not provide an affinity mask for a
- * vector, we fallback to the naive mapping.
- */
-int blk_mq_rdma_map_queues(struct blk_mq_queue_map *map,
-		struct ib_device *dev, int first_vec)
-{
-	const struct cpumask *mask;
-	unsigned int queue, cpu;
-
-	for (queue = 0; queue < map->nr_queues; queue++) {
-		mask = ib_get_vector_affinity(dev, first_vec + queue);
-		if (!mask)
-			goto fallback;
-
-		for_each_cpu(cpu, mask)
-			map->mq_map[cpu] = map->queue_offset + queue;
-	}
-
-	return 0;
-
-fallback:
-	return blk_mq_map_queues(map);
-}
-EXPORT_SYMBOL_GPL(blk_mq_rdma_map_queues);
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 9e378d0a0c01..5989d4e35ef3 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -12,7 +12,6 @@
 #include <linux/string.h>
 #include <linux/atomic.h>
 #include <linux/blk-mq.h>
-#include <linux/blk-mq-rdma.h>
 #include <linux/types.h>
 #include <linux/list.h>
 #include <linux/mutex.h>
@@ -2171,10 +2170,8 @@ static int nvme_rdma_map_queues(struct blk_mq_tag_set *set)
 			ctrl->io_queues[HCTX_TYPE_DEFAULT];
 		set->map[HCTX_TYPE_READ].queue_offset = 0;
 	}
-	blk_mq_rdma_map_queues(&set->map[HCTX_TYPE_DEFAULT],
-			ctrl->device->dev, 0);
-	blk_mq_rdma_map_queues(&set->map[HCTX_TYPE_READ],
-			ctrl->device->dev, 0);
+	blk_mq_map_queues(&set->map[HCTX_TYPE_DEFAULT]);
+	blk_mq_map_queues(&set->map[HCTX_TYPE_READ]);

 	if (opts->nr_poll_queues && ctrl->io_queues[HCTX_TYPE_POLL]) {
 		/* map dedicated poll queues only if we have queues left */
diff --git a/include/linux/blk-mq-rdma.h b/include/linux/blk-mq-rdma.h
deleted file mode 100644
index 5cc5f0f36218..000000000000
--- a/include/linux/blk-mq-rdma.h
+++ /dev/null
@@ -1,11 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _LINUX_BLK_MQ_RDMA_H
-#define _LINUX_BLK_MQ_RDMA_H
-
-struct blk_mq_tag_set;
-struct ib_device;
-
-int blk_mq_rdma_map_queues(struct blk_mq_queue_map *map,
-		struct ib_device *dev, int first_vec);
-
-#endif /* _LINUX_BLK_MQ_RDMA_H */
--
2.26.2



* [PATCH blk-next 2/2] RDMA/core: Delete not-implemented get_vector_affinity
  2020-09-29  9:13 [PATCH blk-next 0/2] Delete the get_vector_affinity leftovers Leon Romanovsky
  2020-09-29  9:13 ` [PATCH blk-next 1/2] blk-mq-rdma: Delete not-used multi-queue RDMA map queue code Leon Romanovsky
@ 2020-09-29  9:13 ` Leon Romanovsky
  2020-10-01  5:01 ` [PATCH blk-next 0/2] Delete the get_vector_affinity leftovers Leon Romanovsky
  2 siblings, 0 replies; 12+ messages in thread
From: Leon Romanovsky @ 2020-09-29  9:13 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe, Jens Axboe, Keith Busch
  Cc: Sagi Grimberg, linux-rdma, linux-kernel, linux-nvme, linux-block,
	Leon Romanovsky, Christoph Hellwig

From: Leon Romanovsky <leonro@nvidia.com>

There are no drivers that support .get_vector_affinity(), so delete it.

Fixes: 9afc97c29b03 ("mlx5: remove support for ib_get_vector_affinity")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/core/device.c |  1 -
 include/rdma/ib_verbs.h          | 23 -----------------------
 2 files changed, 24 deletions(-)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 417d93bbdaca..e00ce044555d 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -2619,7 +2619,6 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
 	SET_DEVICE_OP(dev_ops, get_link_layer);
 	SET_DEVICE_OP(dev_ops, get_netdev);
 	SET_DEVICE_OP(dev_ops, get_port_immutable);
-	SET_DEVICE_OP(dev_ops, get_vector_affinity);
 	SET_DEVICE_OP(dev_ops, get_vf_config);
 	SET_DEVICE_OP(dev_ops, get_vf_guid);
 	SET_DEVICE_OP(dev_ops, get_vf_stats);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 7fb09a36b654..b1b279d888f0 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2319,8 +2319,6 @@ struct ib_device_ops {
 	int (*modify_device)(struct ib_device *device, int device_modify_mask,
 			     struct ib_device_modify *device_modify);
 	void (*get_dev_fw_str)(struct ib_device *device, char *str);
-	const struct cpumask *(*get_vector_affinity)(struct ib_device *ibdev,
-						     int comp_vector);
 	int (*query_port)(struct ib_device *device, u8 port_num,
 			  struct ib_port_attr *port_attr);
 	int (*modify_port)(struct ib_device *device, u8 port_num,
@@ -4545,27 +4543,6 @@ static inline __be16 ib_lid_be16(u32 lid)
 	return cpu_to_be16((u16)lid);
 }

-/**
- * ib_get_vector_affinity - Get the affinity mappings of a given completion
- *   vector
- * @device:         the rdma device
- * @comp_vector:    index of completion vector
- *
- * Returns NULL on failure, otherwise a corresponding cpu map of the
- * completion vector (returns all-cpus map if the device driver doesn't
- * implement get_vector_affinity).
- */
-static inline const struct cpumask *
-ib_get_vector_affinity(struct ib_device *device, int comp_vector)
-{
-	if (comp_vector < 0 || comp_vector >= device->num_comp_vectors ||
-	    !device->ops.get_vector_affinity)
-		return NULL;
-
-	return device->ops.get_vector_affinity(device, comp_vector);
-
-}
-
 /**
  * rdma_roce_rescan_device - Rescan all of the network devices in the system
  * and add their gids, as needed, to the relevant RoCE devices.
--
2.26.2



* Re: [PATCH blk-next 1/2] blk-mq-rdma: Delete not-used multi-queue RDMA map queue code
  2020-09-29  9:13 ` [PATCH blk-next 1/2] blk-mq-rdma: Delete not-used multi-queue RDMA map queue code Leon Romanovsky
@ 2020-09-29 10:20   ` Christoph Hellwig
  2020-09-29 10:35     ` Leon Romanovsky
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2020-09-29 10:20 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jens Axboe, Sagi Grimberg, linux-rdma, linux-kernel, linux-nvme,
	linux-block, Doug Ledford, Jason Gunthorpe, Keith Busch,
	Leon Romanovsky, Christoph Hellwig

On Tue, Sep 29, 2020 at 12:13:57PM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
> 
> The RDMA vector affinity code is not backed up by any driver and always
> returns NULL to every ib_get_vector_affinity() call.
> 
> This means that blk_mq_rdma_map_queues() always takes the fallback path.
> 
> Fixes: 9afc97c29b03 ("mlx5: remove support for ib_get_vector_affinity")
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>

So you guys totally broke the nvme queue assignment without even
telling anyone?  Great job!


* Re: [PATCH blk-next 1/2] blk-mq-rdma: Delete not-used multi-queue RDMA map queue code
  2020-09-29 10:20   ` Christoph Hellwig
@ 2020-09-29 10:35     ` Leon Romanovsky
  2020-09-29 18:24       ` Sagi Grimberg
  0 siblings, 1 reply; 12+ messages in thread
From: Leon Romanovsky @ 2020-09-29 10:35 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Sagi Grimberg, linux-rdma, linux-kernel, linux-nvme,
	linux-block, Doug Ledford, Jason Gunthorpe, Keith Busch

On Tue, Sep 29, 2020 at 12:20:46PM +0200, Christoph Hellwig wrote:
> On Tue, Sep 29, 2020 at 12:13:57PM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> >
> > The RDMA vector affinity code is not backed up by any driver and always
> > returns NULL to every ib_get_vector_affinity() call.
> >
> > This means that blk_mq_rdma_map_queues() always takes the fallback path.
> >
> > Fixes: 9afc97c29b03 ("mlx5: remove support for ib_get_vector_affinity")
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
>
> So you guys totally broke the nvme queue assignment without even
> telling anyone?  Great job!

Who is "you guys"? And it wasn't done silently either; I'm sure that Sagi knows the craft.
https://lore.kernel.org/linux-rdma/20181224221606.GA25780@ziepe.ca/

commit 759ace7832802eaefbca821b2b43a44ab896b449
Author: Sagi Grimberg <sagi@grimberg.me>
Date:   Thu Nov 1 13:08:07 2018 -0700

    i40iw: remove support for ib_get_vector_affinity

....

commit 9afc97c29b032af9a4112c2f4a02d5313b4dc71f
Author: Sagi Grimberg <sagi@grimberg.me>
Date:   Thu Nov 1 09:13:12 2018 -0700

    mlx5: remove support for ib_get_vector_affinity

Thanks


* Re: [PATCH blk-next 1/2] blk-mq-rdma: Delete not-used multi-queue RDMA map queue code
  2020-09-29 10:35     ` Leon Romanovsky
@ 2020-09-29 18:24       ` Sagi Grimberg
  2020-10-02  6:45         ` Christoph Hellwig
  0 siblings, 1 reply; 12+ messages in thread
From: Sagi Grimberg @ 2020-09-29 18:24 UTC (permalink / raw)
  To: Leon Romanovsky, Christoph Hellwig
  Cc: Jens Axboe, linux-rdma, linux-kernel, linux-nvme, linux-block,
	Doug Ledford, Jason Gunthorpe, Keith Busch


>>> From: Leon Romanovsky <leonro@nvidia.com>
>>>
>>> The RDMA vector affinity code is not backed up by any driver and always
>>> returns NULL to every ib_get_vector_affinity() call.
>>>
>>> This means that blk_mq_rdma_map_queues() always takes the fallback path.
>>>
>>> Fixes: 9afc97c29b03 ("mlx5: remove support for ib_get_vector_affinity")
>>> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
>>
>> So you guys totally broke the nvme queue assignment without even
>> telling anyone?  Great job!
> 
> Who is "you guys"? And it wasn't done silently either; I'm sure that Sagi knows the craft.
> https://lore.kernel.org/linux-rdma/20181224221606.GA25780@ziepe.ca/
> 
> commit 759ace7832802eaefbca821b2b43a44ab896b449
> Author: Sagi Grimberg <sagi@grimberg.me>
> Date:   Thu Nov 1 13:08:07 2018 -0700
> 
>      i40iw: remove support for ib_get_vector_affinity
> 
> ....
> 
> commit 9afc97c29b032af9a4112c2f4a02d5313b4dc71f
> Author: Sagi Grimberg <sagi@grimberg.me>
> Date:   Thu Nov 1 09:13:12 2018 -0700
> 
>      mlx5: remove support for ib_get_vector_affinity
> 
> Thanks

Yes, basically usage of managed affinity caused people to report
regressions not being able to change irq affinity from procfs.

Back then I started a discussion with Thomas about making managed
affinity still allow userspace to modify this, but it was dropped at
some point. So currently rdma cannot do automatic irq affinitization
out of the box.
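
For context, "managed affinity" means the IRQ core spreads and pins the
vectors itself when the driver asks for that at allocation time, and that
pinning is exactly what makes /proc/irq/<n>/smp_affinity read-only. A minimal
sketch of how a PCI driver opts in (hypothetical example, not code from this
series; example_setup_irqs and nr_io_queues are made-up names):

	#include <linux/pci.h>
	#include <linux/interrupt.h>

	static int example_setup_irqs(struct pci_dev *pdev, unsigned int nr_io_queues)
	{
		/* one pre-vector that stays unmanaged, e.g. for an admin queue */
		struct irq_affinity affd = { .pre_vectors = 1 };
		int nvecs;

		/* The IRQ core spreads the remaining vectors across the CPUs and
		 * marks them managed; userspace cannot change their affinity. */
		nvecs = pci_alloc_irq_vectors_affinity(pdev, 2, nr_io_queues + 1,
				PCI_IRQ_MSIX | PCI_IRQ_AFFINITY, &affd);
		if (nvecs < 0)
			return nvecs;

		/* The mask chosen for a given vector, e.g. to build a queue map:
		 * const struct cpumask *mask = pci_irq_get_affinity(pdev, 1); */
		return nvecs;
	}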


* Re: [PATCH blk-next 0/2] Delete the get_vector_affinity leftovers
  2020-09-29  9:13 [PATCH blk-next 0/2] Delete the get_vector_affinity leftovers Leon Romanovsky
  2020-09-29  9:13 ` [PATCH blk-next 1/2] blk-mq-rdma: Delete not-used multi-queue RDMA map queue code Leon Romanovsky
  2020-09-29  9:13 ` [PATCH blk-next 2/2] RDMA/core: Delete not-implemented get_vector_affinity Leon Romanovsky
@ 2020-10-01  5:01 ` Leon Romanovsky
  2020-10-02  1:28   ` Jens Axboe
  2 siblings, 1 reply; 12+ messages in thread
From: Leon Romanovsky @ 2020-10-01  5:01 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe, Jens Axboe, Keith Busch
  Cc: Sagi Grimberg, linux-rdma, linux-kernel, linux-nvme, linux-block,
	Christoph Hellwig

On Tue, Sep 29, 2020 at 12:13:56PM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
>
> There are no drivers that implement .get_vector_affinity(), so delete
> the RDMA function and simplify block code.
>
> Thanks
>
> P.S. Probably it should go through the block tree.
>
> Leon Romanovsky (2):
>   blk-mq-rdma: Delete not-used multi-queue RDMA map queue code
>   RDMA/core: Delete not-implemented get_vector_affinity

Jens, Keith

How can we progress here?

Thanks

>
>  block/Kconfig                    |  5 ----
>  block/Makefile                   |  1 -
>  block/blk-mq-rdma.c              | 44 --------------------------------
>  drivers/infiniband/core/device.c |  1 -
>  drivers/nvme/host/rdma.c         |  7 ++---
>  include/linux/blk-mq-rdma.h      | 11 --------
>  include/rdma/ib_verbs.h          | 23 -----------------
>  7 files changed, 2 insertions(+), 90 deletions(-)
>  delete mode 100644 block/blk-mq-rdma.c
>  delete mode 100644 include/linux/blk-mq-rdma.h
>
> --
> 2.26.2
>


* Re: [PATCH blk-next 0/2] Delete the get_vector_affinity leftovers
  2020-10-01  5:01 ` [PATCH blk-next 0/2] Delete the get_vector_affinity leftovers Leon Romanovsky
@ 2020-10-02  1:28   ` Jens Axboe
  0 siblings, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2020-10-02  1:28 UTC (permalink / raw)
  To: Leon Romanovsky, Doug Ledford, Jason Gunthorpe, Keith Busch
  Cc: Sagi Grimberg, linux-rdma, linux-kernel, linux-nvme, linux-block,
	Christoph Hellwig

On 9/30/20 11:01 PM, Leon Romanovsky wrote:
> On Tue, Sep 29, 2020 at 12:13:56PM +0300, Leon Romanovsky wrote:
>> From: Leon Romanovsky <leonro@nvidia.com>
>>
>> There are no drivers that implement .get_vector_affinity(), so delete
>> the RDMA function and simplify block code.
>>
>> Thanks
>>
>> P.S. Probably it should go through the block tree.
>>
>> Leon Romanovsky (2):
>>   blk-mq-rdma: Delete not-used multi-queue RDMA map queue code
>>   RDMA/core: Delete not-implemented get_vector_affinity
> 
> Jens, Keith
> 
> How can we progress here?

I'd really like for the nvme side to sign off on this first.

-- 
Jens Axboe



* Re: [PATCH blk-next 1/2] blk-mq-rdma: Delete not-used multi-queue RDMA map queue code
  2020-09-29 18:24       ` Sagi Grimberg
@ 2020-10-02  6:45         ` Christoph Hellwig
  2020-10-02 20:20           ` Sagi Grimberg
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2020-10-02  6:45 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Jens Axboe, Leon Romanovsky, linux-rdma, linux-kernel,
	linux-nvme, linux-block, Doug Ledford, Jason Gunthorpe,
	Keith Busch, Christoph Hellwig

On Tue, Sep 29, 2020 at 11:24:49AM -0700, Sagi Grimberg wrote:
> Yes, basically usage of managed affinity caused people to report
> regressions not being able to change irq affinity from procfs.

Well, why would they change it?  The whole point of the infrastructure
is that there is a single sane affinity setting for a given setup. Now
that setting needed some refinement from the original series (e.g. the
current series about only using housekeeping cpus if cpu isolation is
in use).  But allowing random users to modify affinity is just a recipe
for a trainwreck.

So I think we need to bring this back ASAP, as doing affinity right
out of the box is an absolute requirement for sane performance without
all the benchmarketing deep magic.


* Re: [PATCH blk-next 1/2] blk-mq-rdma: Delete not-used multi-queue RDMA map queue code
  2020-10-02  6:45         ` Christoph Hellwig
@ 2020-10-02 20:20           ` Sagi Grimberg
  2020-10-05  8:38             ` Christoph Hellwig
  0 siblings, 1 reply; 12+ messages in thread
From: Sagi Grimberg @ 2020-10-02 20:20 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Leon Romanovsky, linux-rdma, linux-kernel,
	linux-nvme, linux-block, Doug Ledford, Jason Gunthorpe,
	Keith Busch


>> Yes, basically usage of managed affinity caused people to report
>> regressions not being able to change irq affinity from procfs.
> 
> Well, why would they change it?  The whole point of the infrastructure
> is that there is a single sane affinity setting for a given setup. Now
> that setting needed some refinement from the original series (e.g. the
> current series about only using housekeeping cpus if cpu isolation is
> in use).  But allowing random users to modify affinity is just a recipe
> for a trainwreck.

Well, allowing people to mangle irq affinity settings seems to be a hard
requirement from the discussions in the past.

> So I think we need to bring this back ASAP, as doing affinity right
> out of the box is an absolute requirement for sane performance without
> all the benchmarketing deep magic.

Well, it's hard to say that setting custom irq affinity settings is
deemed non-useful to anyone and hence should be prevented. I'd expect
that irq settings have a sane default that works and if someone wants to
change it, they can, but there should be no guarantees on optimal
performance. But IIRC this had some dependencies on drivers and some
more infrastructure to handle dynamic changes...


* Re: [PATCH blk-next 1/2] blk-mq-rdma: Delete not-used multi-queue RDMA map queue code
  2020-10-02 20:20           ` Sagi Grimberg
@ 2020-10-05  8:38             ` Christoph Hellwig
  2020-10-06  4:58               ` Leon Romanovsky
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2020-10-05  8:38 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Jens Axboe, Leon Romanovsky, linux-rdma, linux-kernel,
	linux-nvme, linux-block, Doug Ledford, Jason Gunthorpe,
	Keith Busch, Christoph Hellwig

On Fri, Oct 02, 2020 at 01:20:35PM -0700, Sagi Grimberg wrote:
>> Well, why would they change it?  The whole point of the infrastructure
>> is that there is a single sane affinity setting for a given setup. Now
>> that setting needed some refinement from the original series (e.g. the
>> current series about only using housekeeping cpus if cpu isolation is
>> in use).  But allowing random users to modify affinity is just a recipe
>> for a trainwreck.
>
> Well, allowing people to mangle irq affinity settings seems to be a hard
> requirement from the discussions in the past.
>
>> So I think we need to bring this back ASAP, as doing affinity right
>> out of the box is an absolute requirement for sane performance without
>> all the benchmarketing deep magic.
>
> Well, it's hard to say that setting custom irq affinity settings is
> deemed non-useful to anyone and hence should be prevented. I'd expect
> that irq settings have a sane default that works and if someone wants to
> change it, they can, but there should be no guarantees on optimal
> performance. But IIRC this had some dependencies on drivers and some
> more infrastructure to handle dynamic changes...

The problem is that people change random settings.  We need to generalize
it into a sane API (e.g. the housekeeping CPUs thing which totally makes
sense).
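
One possible shape for that, sketched purely as an illustration (this is not
code from this thread; it assumes the current isolation interface,
housekeeping_cpumask() with HK_FLAG_MANAGED_IRQ, and restrict_to_housekeeping
is a made-up name):

	#include <linux/cpumask.h>
	#include <linux/sched/isolation.h>

	/* Clamp a per-vector CPU mask to housekeeping CPUs so isolated CPUs do
	 * not service queue interrupts, without ever leaving a vector with an
	 * empty mask. */
	static const struct cpumask *
	restrict_to_housekeeping(const struct cpumask *vec_mask,
				 struct cpumask *scratch)
	{
		const struct cpumask *hk = housekeeping_cpumask(HK_FLAG_MANAGED_IRQ);

		if (cpumask_and(scratch, vec_mask, hk))
			return scratch;	/* overlap exists: honour isolation */
		return vec_mask;	/* fall back rather than break the vector */
	}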


* Re: [PATCH blk-next 1/2] blk-mq-rdma: Delete not-used multi-queue RDMA map queue code
  2020-10-05  8:38             ` Christoph Hellwig
@ 2020-10-06  4:58               ` Leon Romanovsky
  0 siblings, 0 replies; 12+ messages in thread
From: Leon Romanovsky @ 2020-10-06  4:58 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Sagi Grimberg, linux-rdma, linux-kernel, linux-nvme,
	linux-block, Doug Ledford, Jason Gunthorpe, Keith Busch

On Mon, Oct 05, 2020 at 10:38:17AM +0200, Christoph Hellwig wrote:
> On Fri, Oct 02, 2020 at 01:20:35PM -0700, Sagi Grimberg wrote:
> >> Well, why would they change it?  The whole point of the infrastructure
> >> is that there is a single sane affinity setting for a given setup. Now
> >> that setting needed some refinement from the original series (e.g. the
> >> current series about only using housekeeping cpus if cpu isolation is
> >> in use).  But allowing random users to modify affinity is just a recipe
> >> for a trainwreck.
> >
> > Well, allowing people to mangle irq affinity settings seems to be a hard
> > requirement from the discussions in the past.
> >
> >> So I think we need to bring this back ASAP, as doing affinity right
> >> out of the box is an absolute requirement for sane performance without
> >> all the benchmarketing deep magic.
> >
> > Well, it's hard to say that setting custom irq affinity settings is
> > deemed non-useful to anyone and hence should be prevented. I'd expect
> > that irq settings have a sane default that works and if someone wants to
> > change it, they can, but there should be no guarantees on optimal
> > performance. But IIRC this had some dependencies on drivers and some
> > more infrastructure to handle dynamic changes...
>
> The problem is that people change random settings.  We need to generalize
> it into a sane API (e.g. the housekeeping CPUs thing which totally makes
> sense).

I don't see many people jumping on the bandwagon; someone should do it,
but who will? I personally don't have the knowledge in that area to do
anything meaningful.

Thanks


end of thread, other threads:[~2020-10-06  4:58 UTC | newest]

Thread overview: 12+ messages
2020-09-29  9:13 [PATCH blk-next 0/2] Delete the get_vector_affinity leftovers Leon Romanovsky
2020-09-29  9:13 ` [PATCH blk-next 1/2] blk-mq-rdma: Delete not-used multi-queue RDMA map queue code Leon Romanovsky
2020-09-29 10:20   ` Christoph Hellwig
2020-09-29 10:35     ` Leon Romanovsky
2020-09-29 18:24       ` Sagi Grimberg
2020-10-02  6:45         ` Christoph Hellwig
2020-10-02 20:20           ` Sagi Grimberg
2020-10-05  8:38             ` Christoph Hellwig
2020-10-06  4:58               ` Leon Romanovsky
2020-09-29  9:13 ` [PATCH blk-next 2/2] RDMA/core: Delete not-implemented get_vector_affinity Leon Romanovsky
2020-10-01  5:01 ` [PATCH blk-next 0/2] Delete the get_vector_affinity leftovers Leon Romanovsky
2020-10-02  1:28   ` Jens Axboe
