linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, Sagi Grimberg <sagi@grimberg.me>,
	linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org,
	linux-pci@vger.kernel.org
Subject: Re: [PATCH 2/5] genirq/affinity: allow driver to setup managed IRQ's affinity
Date: Sun, 10 Feb 2019 17:22:09 +0800	[thread overview]
Message-ID: <20190210092207.GA4832@ming.t460p> (raw)
In-Reply-To: <20190207222130.GQ7268@google.com>

On Thu, Feb 07, 2019 at 04:21:30PM -0600, Bjorn Helgaas wrote:
> On Fri, Jan 25, 2019 at 05:53:44PM +0800, Ming Lei wrote:
> > This patch introduces callback of .setup_affinity into 'struct
> > irq_affinity', so that:
> > 
> > 1) allow drivers to customize the affinity for managed IRQ, for
> > example, now NVMe has special requirement for read queues & poll
> > queues
> > 
> > 2) 6da4b3ab9a6e9 ("genirq/affinity: Add support for allocating interrupt sets")
> > makes pci_alloc_irq_vectors_affinity() a bit difficult to use for
> > allocating interrupt sets: 'max_vecs' is required to same with 'min_vecs'.
> 
> s/is required to same with/is required to be equal to/
> 
> > With this patch, driver can implement their own .setup_affinity to
> > customize the affinity, then the above thing can be solved easily.
> 
> s/driver/drivers/
> 
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  include/linux/interrupt.h | 26 +++++++++++++++++---------
> >  kernel/irq/affinity.c     |  6 ++++++
> >  2 files changed, 23 insertions(+), 9 deletions(-)
> > 
> > diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> > index c672f34235e7..f6cea778cf50 100644
> > --- a/include/linux/interrupt.h
> > +++ b/include/linux/interrupt.h
> > @@ -242,30 +242,38 @@ struct irq_affinity_notify {
> >  };
> >  
> >  /**
> > + * struct irq_affinity_desc - Interrupt affinity descriptor
> > + * @mask:	cpumask to hold the affinity assignment
> > + */
> > +struct irq_affinity_desc {
> > +	struct cpumask	mask;
> > +	unsigned int	is_managed : 1;
> > +};
> 
> I was going to comment that "managed" already has a common usage
> related to the devm_*() functions, but I don't think that's what you
> mean here.  But then I noticed that you're only *moving* this code,
> so you couldn't change it anyway.
> 
> But I still wonder what "managed" means here.

'managed' irq's affinity is managed by kernel, and user space can't change
it any more via /proc/irq/...

> 
> > +
> > +/**
> >   * struct irq_affinity - Description for automatic irq affinity assignements
> >   * @pre_vectors:	Don't apply affinity to @pre_vectors at beginning of
> >   *			the MSI(-X) vector space
> >   * @post_vectors:	Don't apply affinity to @post_vectors at end of
> >   *			the MSI(-X) vector space
> > + * @setup_affinity:	Use driver's method to setup irq vectors affinity,
> > + * 			and driver has to handle pre_vectors & post_vectors
> > + * 			correctly, set 'is_managed' flag correct too
> 
> s/irq vectors/irq vector/
> s/correct/correctly/
> 
> In general I don't think "correctly" is very useful in changelogs
> and comments.  Usually it just means "how *I* think it should be
> done", but it doesn't tell anybody else exactly *how* it should be
> done.

That is one nice question.

So far there are at least two rules related with correctness:

1) 'is_managed' flag needs to be set

2) for all managed irq vectors in one interrupt set of one device,
all possible CPUs should be covered in the setup affinities, meantime
one CPU can't be allocated to two irq vectors. The interrupt set is
unique for NVMe actually now, such as, all READ queues' interrupts
belong to one set, and other queues belong to another set. For other
device, we can think there is only one set for one device.

> 
> What does it mean for a driver to handle pre_vectors & post_vectors
> "correctly"?  The driver's .setup_affinity() method receives an array
> of struct irq_affinity; maybe it means that method should set the
> cpumask for each element as it desires.  For @pre_vectors and
> @post_vectors, I suppose that means their cpumask would be
> irq_default_affinity?

So far the default affinity for pre_vectors & post_vectors is to use
irq_default_affinity, and this patch changes it to cpu_possible_mask,
and this change won't be one big deal from driver's view.

> 
> But I guess the .setup_affinity() method means the driver would have
> complete flexibility for each vector, and it could use

Yeah, together with simplification in both driver and genirq/affinity
code.

> irq_default_affinity for arbitrary vectors, not just the first few
> (@pre_vectors) and the last few (@post_vectors)?
> 
> What's the rule for how a driver sets "is_managed"?

Please see above, and I plan to enhance the rule in genirq/affinity code
if this approach may be accepted.

Thanks,
Ming

  reply	other threads:[~2019-02-10  9:22 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-25  9:53 [PATCH 0/5] genirq/affinity: introduce .setup_affinity to support allocating interrupt sets Ming Lei
2019-01-25  9:53 ` [PATCH 1/5] genirq/affinity: move allocation of 'node_to_cpumask' to irq_build_affinity_masks Ming Lei
2019-02-07 22:02   ` Bjorn Helgaas
2019-02-10 19:01   ` [tip:irq/core] genirq/affinity: Move allocation of 'node_to_cpumask' to irq_build_affinity_masks() tip-bot for Ming Lei
2019-01-25  9:53 ` [PATCH 2/5] genirq/affinity: allow driver to setup managed IRQ's affinity Ming Lei
2019-02-07 22:21   ` Bjorn Helgaas
2019-02-10  9:22     ` Ming Lei [this message]
2019-02-10 16:30   ` Thomas Gleixner
2019-02-11  3:54     ` Ming Lei
2019-02-11 14:39       ` Bjorn Helgaas
2019-02-11 22:38         ` Thomas Gleixner
2019-02-12 11:17           ` Ming Lei
2019-01-25  9:53 ` [PATCH 3/5] genirq/affinity: introduce irq_build_affinity() Ming Lei
2019-01-25  9:53 ` [PATCH 4/5] nvme-pci: simplify nvme_setup_irqs() via .setup_affinity callback Ming Lei
2019-02-10 16:39   ` Thomas Gleixner
2019-02-11  3:58     ` Ming Lei
2019-02-10 18:49   ` Thomas Gleixner
2019-02-11  4:09     ` Ming Lei
2019-01-25  9:53 ` [PATCH 5/5] genirq/affinity: remove support for allocating interrupt sets Ming Lei
2019-02-07 22:22   ` Bjorn Helgaas
2019-01-25  9:56 ` [PATCH 0/5] genirq/affinity: introduce .setup_affinity to support " Ming Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190210092207.GA4832@ming.t460p \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=hch@lst.de \
    --cc=helgaas@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=sagi@grimberg.me \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).