Re: [PATCH 06/13] irq: add a helper spread an affinity mask for MSI/MSI-X vectors

From: Christoph Hellwig <hch@lst.de>
To: Alexander Gordeev <agordeev@redhat.com>
Cc: linux-block@vger.kernel.org, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	axboe@fb.com, tglx@linutronix.de
Subject: Re: [PATCH 06/13] irq: add a helper spread an affinity mask for MSI/MSI-X vectors
Date: Thu, 30 Jun 2016 19:48:54 +0200
Message-ID: <20160630174854.GA23578@lst.de>
In-Reply-To: <20160625200518.GA29251@dhcp-27-118.brq.redhat.com>

On Sat, Jun 25, 2016 at 10:05:19PM +0200, Alexander Gordeev wrote:
> > + * and generate an output cpumask suitable for spreading MSI/MSI-X vectors
> > + * so that they are distributed as good as possible around the CPUs.  If
> > + * more vectors than CPUs are available we'll map one to each CPU,
> 
> Unless I do not misinterpret a loop from msix_setup_entries() (patch 08/13),
> the above is incorrect:

What part do you think is incorrect?

> > + * otherwise we map one to the first sibling of each socket.
> 
> (*) I guess, in some topology configurations a total number of all
> first siblings may be less than the number of vectors.

Yes, in that case we'll assign incompletely.  I've already heard people
complaining about that at LSF/MM, but no one volunteered patches.
I only have devices with either 1 vector or enough vectors to test, so I
don't really dare to touch the algorithm.  Either way, the algorithm
change should probably be a separate patch from the refactoring and
code movement.
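
To make the shortfall concrete, here is a minimal user-space sketch.  It
is not kernel code: the first_sibling[] table and both function names
are hypothetical stand-ins for the topology lookup, and it only models
the branch taken when there are fewer vectors than CPUs.

```c
#include <assert.h>

/*
 * User-space model only; none of these names are kernel APIs.
 * first_sibling[cpu] holds the lowest-numbered CPU that shares a
 * sibling group with cpu (hypothetical table supplied by the caller).
 */
static unsigned int count_first_siblings(const unsigned int *first_sibling,
					 unsigned int nr_cpus)
{
	unsigned int cpu, count = 0;

	for (cpu = 0; cpu < nr_cpus; cpu++)
		if (first_sibling[cpu] == cpu)
			count++;
	return count;
}

/*
 * Number of requested vectors that get no mask bit when only first
 * siblings are eligible.  Models the "fewer vectors than CPUs" case.
 */
static unsigned int sibling_shortfall(const unsigned int *first_sibling,
				      unsigned int nr_cpus,
				      unsigned int nr_vecs)
{
	unsigned int usable;

	assert(nr_vecs < nr_cpus);
	usable = count_first_siblings(first_sibling, nr_cpus);
	return nr_vecs > usable ? nr_vecs - usable : 0;
}
```

With a table of { 0, 1, 2, 3, 0, 1, 2, 3 } (8 CPUs, 4 first siblings), a
device asking for 6 vectors would be left with a shortfall of 2, while
one asking for 4 would have none.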

> > + * If there are more vectors than CPUs we will still only have one bit
> > + * set per CPU, but interrupt code will keep on assining the vectors from
> > + * the start of the bitmap until we run out of vectors.
> > + */
> > +int irq_create_affinity_mask(struct cpumask **affinity_mask,
> > +		unsigned int *nr_vecs)
> 
> Both the callers of this function and the function itself IMHO would
> read better if it simply returned the affinity mask. Or passed the 
> affinity mask pointer.

We can't just return the pointer as NULL is a valid and common return
value.  If we pass the pointer we'd then also need to allocate one for
the (common) nvec = 1 case.

> 
> > +{
> > +	unsigned int vecs = 0;
> 
> In case (*nr_vecs >= num_online_cpus()) the contents of *nr_vecs
> will be overwritten with 0.

Thanks, fixed.
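
For reference, the problem and one possible fix can be modelled in user
space as below.  This is a sketch under assumptions, not the patch: a
64-bit word stands in for struct cpumask, first_sibling[] is a
hypothetical topology table, and leaving *nr_vecs untouched in the
"enough vectors" branch is just one way to avoid the zeroing.

```c
#include <assert.h>
#include <stdint.h>

/*
 * User-space model, not the patch itself.  All names are hypothetical.
 * Returns the spread mask and, only when vectors are scarce, shrinks
 * *nr_vecs to the number of bits actually set.
 */
static uint64_t spread_mask(const unsigned int *first_sibling,
			    unsigned int nr_cpus, unsigned int *nr_vecs)
{
	uint64_t mask = 0;
	unsigned int cpu, vecs = 0;

	assert(nr_cpus > 0 && nr_cpus <= 64);

	if (*nr_vecs >= nr_cpus) {
		/*
		 * Enough vectors: one bit per CPU.  Writing vecs (still 0)
		 * back to *nr_vecs here is the reported bug, so this
		 * branch returns without touching *nr_vecs.
		 */
		return nr_cpus == 64 ? UINT64_MAX
				     : (UINT64_C(1) << nr_cpus) - 1;
	}

	/* Too few vectors: one bit per first sibling, at most *nr_vecs. */
	for (cpu = 0; cpu < nr_cpus && vecs < *nr_vecs; cpu++) {
		if (first_sibling[cpu] == cpu) {
			mask |= UINT64_C(1) << cpu;
			vecs++;
		}
	}

	*nr_vecs = vecs;
	return mask;
}
```

With the table { 0, 1, 2, 3, 0, 1, 2, 3 } and 8 CPUs, a request for 16
vectors keeps its count and gets all 8 bits, while a request for 6 is
shrunk to 4.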

> So considering (*) comment above the number of available vectors
> might be unnecessarily shrunken here.
> 
> I think nr_vecs need not be an out-parameter since we always can
> assign multiple vectors to a CPU. It is better than limiting number
> of available vectors AFAIKT. Or you could pass one-per-cpu flag
> explicitly.

The function is intended to replicate the blk-mq algorithm.  I don't
think it's optimal, but I really want to avoid dragging the discussion
about the optimal algorithm into this patchset.  We should at least
move to a vector per node/socket model instead of just the siblings,
and be able to use all vectors (at least optionally).
