From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH 13/16] irq: add support for allocating (and affinitizing) sets of IRQs
Date: Fri, 2 Nov 2018 22:37:07 +0800
Message-ID: <20181102143707.GA31121@ming.t460p>
In-Reply-To: <20181030183252.17857-14-axboe@kernel.dk>

On Tue, Oct 30, 2018 at 12:32:49PM -0600, Jens Axboe wrote:
> A driver may have a need to allocate multiple sets of MSI/MSI-X
> interrupts, and have them appropriately affinitized. Add support for
> defining a number of sets in the irq_affinity structure, of varying
> sizes, and get each set affinitized correctly across the machine.
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: linux-kernel@vger.kernel.org
> Reviewed-by: Hannes Reinecke <hare@suse.com>
> Reviewed-by: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  drivers/pci/msi.c         | 14 ++++++++++++++
>  include/linux/interrupt.h |  4 ++++
>  kernel/irq/affinity.c     | 40 ++++++++++++++++++++++++++++++---------
>  3 files changed, 49 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index af24ed50a245..e6c6e10b9ceb 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -1036,6 +1036,13 @@ static int __pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec,
>  	if (maxvec < minvec)
>  		return -ERANGE;
>  
> +	/*
> +	 * If the caller is passing in sets, we can't support a range of
> +	 * vectors. The caller needs to handle that.
> +	 */
> +	if (affd->nr_sets && minvec != maxvec)
> +		return -EINVAL;
> +
>  	if (WARN_ON_ONCE(dev->msi_enabled))
>  		return -EINVAL;
>  
> @@ -1087,6 +1094,13 @@ static int __pci_enable_msix_range(struct pci_dev *dev,
>  	if (maxvec < minvec)
>  		return -ERANGE;
>  
> +	/*
> +	 * If the caller is passing in sets, we can't support a range of
> +	 * supported vectors. The caller needs to handle that.
> +	 */
> +	if (affd->nr_sets && minvec != maxvec)
> +		return -EINVAL;
> +
>  	if (WARN_ON_ONCE(dev->msix_enabled))
>  		return -EINVAL;
>  
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index 1d6711c28271..ca397ff40836 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -247,10 +247,14 @@ struct irq_affinity_notify {
>   *			the MSI(-X) vector space
>   * @post_vectors:	Don't apply affinity to @post_vectors at end of
>   *			the MSI(-X) vector space
> + * @nr_sets:		Length of passed in *sets array
> + * @sets:		Number of affinitized sets
>   */
>  struct irq_affinity {
>  	int	pre_vectors;
>  	int	post_vectors;
> +	int	nr_sets;
> +	int	*sets;
>  };
>  
>  #if defined(CONFIG_SMP)
> diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
> index f4f29b9d90ee..2046a0f0f0f1 100644
> --- a/kernel/irq/affinity.c
> +++ b/kernel/irq/affinity.c
> @@ -180,6 +180,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
>  	int curvec, usedvecs;
>  	cpumask_var_t nmsk, npresmsk, *node_to_cpumask;
>  	struct cpumask *masks = NULL;
> +	int i, nr_sets;
>  
>  	/*
>  	 * If there aren't any vectors left after applying the pre/post
> @@ -210,10 +211,23 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
>  	get_online_cpus();
>  	build_node_to_cpumask(node_to_cpumask);
>  
> -	/* Spread on present CPUs starting from affd->pre_vectors */
> -	usedvecs = irq_build_affinity_masks(affd, curvec, affvecs,
> -					    node_to_cpumask, cpu_present_mask,
> -					    nmsk, masks);
> +	/*
> +	 * Spread on present CPUs starting from affd->pre_vectors. If we
> +	 * have multiple sets, build each sets affinity mask separately.
> +	 */
> +	nr_sets = affd->nr_sets;
> +	if (!nr_sets)
> +		nr_sets = 1;
> +
> +	for (i = 0, usedvecs = 0; i < nr_sets; i++) {
> +		int this_vecs = affd->sets ? affd->sets[i] : affvecs;
> +		int nr;
> +
> +		nr = irq_build_affinity_masks(affd, curvec, this_vecs,
> +					      node_to_cpumask, cpu_present_mask,
> +					      nmsk, masks + usedvecs);

The last argument to irq_build_affinity_masks() above should be 'masks',
not 'masks + usedvecs', because irq_build_affinity_masks() always treats
'masks' as the base address of the array.
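
To illustrate the convention, here is a minimal userspace sketch. It is
not the kernel code: fill_set() and spread_sets() are hypothetical
stand-ins, and plain ints replace cpumasks. The helper indexes from the
array base using an absolute start vector, so the caller keeps passing
the base and advances the start index per set instead of offsetting the
pointer.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for a helper that treats 'slots' as the base of
 * the whole array and writes numvecs entries starting at the absolute
 * index startvec. Returns the number of entries written.
 */
static int fill_set(int *slots, int nvecs, int startvec, int numvecs,
		    int set_id)
{
	int done;

	assert(startvec >= 0 && startvec + numvecs <= nvecs);
	for (done = 0; done < numvecs; done++)
		slots[startvec + done] = set_id;
	return done;
}

/*
 * Caller side: always pass the array base, and advance the absolute
 * start index after each set. Offsetting the pointer as well as the
 * index would shift later sets twice.
 */
int spread_sets(int *slots, int nvecs, int pre_vectors,
		const int *sets, int nr_sets)
{
	int curvec = pre_vectors, usedvecs = 0, i;

	for (i = 0; i < nr_sets; i++) {
		int nr = fill_set(slots, nvecs, curvec, sets[i], i + 1);

		curvec += nr;
		usedvecs += nr;
	}
	return usedvecs;
}
```

With one pre-vector and two sets of sizes 2 and 3, the first set lands
in slots 1-2 and the second in slots 3-5, with no overlap.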

> +		usedvecs += nr;
> +	}

Thinking about this further, there is a bigger problem with this patch:
each set of IRQs should be spread across all possible CPUs, which is
now done via a two-stage spread.

However, this patch spreads each set only across the present CPUs, which
may not work with physical CPU hotplug.
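
As a rough illustration of what a two-stage spread means, here is a toy
userspace sketch. It assumes a simple round-robin over plain bitmasks,
and spread_cpus() and two_stage_spread() are hypothetical names; the
kernel's actual spreading is NUMA-aware and works on cpumasks. Stage one
covers the present CPUs, stage two covers the CPUs that are possible but
not present, so a CPU that is hot-added later is already assigned to a
vector.

```c
#include <assert.h>

#define MAX_CPUS 32

/*
 * Toy round-robin: assign every CPU set in cpu_mask to one of numvecs
 * vectors, starting at startvec. Returns the number of CPUs assigned.
 */
static int spread_cpus(unsigned int cpu_mask, int numvecs, int startvec,
		       unsigned int *masks)
{
	int vec = startvec, assigned = 0, cpu;

	for (cpu = 0; cpu < MAX_CPUS; cpu++) {
		if (!(cpu_mask & (1u << cpu)))
			continue;
		masks[vec] |= 1u << cpu;
		vec = (vec + 1) % numvecs;
		assigned++;
	}
	return assigned;
}

/*
 * Stage 1: spread over present CPUs.
 * Stage 2: spread over possible-but-not-present CPUs, continuing from
 * the vector where stage 1 stopped.
 * 'masks' must hold numvecs zero-initialized entries.
 */
void two_stage_spread(unsigned int possible, unsigned int present,
		      int numvecs, unsigned int *masks)
{
	int n;

	assert(numvecs > 0);
	n = spread_cpus(present, numvecs, 0, masks);
	spread_cpus(possible & ~present, numvecs, n % numvecs, masks);
}
```

With eight possible CPUs, four of them present, and four vectors, each
vector ends up with one present CPU and one not-yet-present CPU. A
present-only spread would leave CPUs 4-7 without any vector.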

Thanks,
Ming
