From: Ming Lei <ming.lei@redhat.com>
To: Keith Busch <kbusch@meta.com>
Cc: linux-nvme@lists.infradead.org, hch@lst.de,
	Keith Busch <kbusch@kernel.org>
Subject: Re: [PATCHv2] nvme-pci: allow unmanaged interrupts
Date: Sat, 11 May 2024 07:47:26 +0800
Message-ID: <Zj6yDtRzl68zspQY@fedora>
In-Reply-To: <20240510174645.3987951-1-kbusch@meta.com>

On Fri, May 10, 2024 at 10:46:45AM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> Some people _really_ want to control their interrupt affinity,
> preferring to sacrifice storage performance for scheduling
> predictability on some other subset of CPUs.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
> Sorry for the rapid-fire v2, and I know some are still against this; I'm
> just getting v2 out because v1 breaks a different use case.
> 
> And as far as acceptance goes, this doesn't look like it carries any
> long-term maintenance overhead. It's an opt-in feature, and you're on
> your own if you turn it on.
> 
> v1->v2: skip the AFFINITY vector allocation if the parameter is
> provided, instead of trying to make the vector code handle all post_vectors.
> 
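The vector-allocation hunk that the v1->v2 note refers to is not quoted
here; presumably it clears PCI_IRQ_AFFINITY (and drops the irq_affinity
descriptor) when managed_irqs is false, so the kernel never creates
managed vectors in the first place. A minimal sketch of that idea, where
the function name and placement are illustrative rather than taken from
the patch:

	/*
	 * Illustrative sketch only, not part of the quoted diff: with
	 * managed_irqs off, allocate plain (unmanaged) vectors so that
	 * userspace keeps control of /proc/irq/<N>/smp_affinity.
	 */
	static int nvme_alloc_io_irqs(struct nvme_dev *dev,
				      unsigned int irq_queues,
				      struct irq_affinity *affd)
	{
		struct pci_dev *pdev = to_pci_dev(dev->dev);
		unsigned int flags = PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY;

		if (!managed_irqs) {
			flags &= ~PCI_IRQ_AFFINITY;
			affd = NULL;	/* affd is only valid with PCI_IRQ_AFFINITY */
		}

		return pci_alloc_irq_vectors_affinity(pdev, 1, irq_queues,
						      flags, affd);
	}
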
>  drivers/nvme/host/pci.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 8e0bb9692685d..def1a295284bb 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -63,6 +63,11 @@ MODULE_PARM_DESC(sgl_threshold,
>  		"Use SGLs when average request segment size is larger or equal to "
>  		"this size. Use 0 to disable SGLs.");
>  
> +static bool managed_irqs = true;
> +module_param(managed_irqs, bool, 0444);
> +MODULE_PARM_DESC(managed_irqs,
> +		 "set to false for user controlled irq affinity");
> +
>  #define NVME_PCI_MIN_QUEUE_SIZE 2
>  #define NVME_PCI_MAX_QUEUE_SIZE 4095
>  static int io_queue_depth_set(const char *val, const struct kernel_param *kp);
> @@ -456,7 +461,7 @@ static void nvme_pci_map_queues(struct blk_mq_tag_set *set)
>  		 * affinity), so use the regular blk-mq cpu mapping
>  		 */
>  		map->queue_offset = qoff;
> -		if (i != HCTX_TYPE_POLL && offset)
> +		if (managed_irqs && i != HCTX_TYPE_POLL && offset)
>  			blk_mq_pci_map_queues(map, to_pci_dev(dev->dev), offset);
>  		else
>  			blk_mq_map_queues(map);

Now the queue mapping is built without any input from the irq affinity
that is set up from userspace, so performance could be pretty bad.
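
For reference, the managed path builds the cpu-to-hctx map from the
affinity masks the kernel assigned to each vector; paraphrased from memory
(error handling trimmed), blk_mq_pci_map_queues() does roughly:

	unsigned int queue, cpu;

	for (queue = 0; queue < qmap->nr_queues; queue++) {
		/* affinity mask of this queue's kernel-managed vector */
		const struct cpumask *mask =
			pci_irq_get_affinity(pdev, queue + offset);

		if (!mask)
			break;
		for_each_cpu(cpu, mask)
			qmap->mq_map[cpu] = qmap->queue_offset + queue;
	}

With managed_irqs=0 the driver falls back to blk_mq_map_queues(), which
spreads the present CPUs over the queues without looking at the actual
vector affinity, so whatever the admin later writes to
/proc/irq/<N>/smp_affinity (after loading with nvme.managed_irqs=0, since
the parameter is 0444 and therefore load-time only) is invisible to the
block layer's mapping, hence the performance worry.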

Is there any benefit to using unmanaged irqs in this way?


Thanks,
Ming



Thread overview: 8+ messages
2024-05-10 17:46 [PATCHv2] nvme-pci: allow unmanaged interrupts Keith Busch
2024-05-10 23:47 ` Ming Lei [this message]
2024-05-11  0:29   ` Keith Busch
2024-05-11  0:44     ` Ming Lei
2024-05-12 14:16       ` Sagi Grimberg
2024-05-12 22:05         ` Keith Busch
2024-05-13  1:12         ` Ming Lei
2024-05-13  4:09           ` Keith Busch
