From: Thomas Gleixner <tglx@linutronix.de>
To: Keith Busch <keith.busch@intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Subject: Re: [BUG 4.15-rc7] IRQ matrix management errors
Date: Wed, 17 Jan 2018 10:32:12 +0100 (CET)
Message-ID: <alpine.DEB.2.20.1801171030150.1777@nanos>
In-Reply-To: <alpine.DEB.2.20.1801171020440.1777@nanos>

On Wed, 17 Jan 2018, Thomas Gleixner wrote:
> On Wed, 17 Jan 2018, Keith Busch wrote:
> > On Wed, Jan 17, 2018 at 08:34:22AM +0100, Thomas Gleixner wrote:
> > > Can you trace the matrix allocations from the very beginning or tell me how
> > > to reproduce. I'd like to figure out why this is happening.
> >
> > Sure, I'll get the irq_matrix events.
> >
> > I reproduce this on a machine with 112 CPUs and 3 NVMe controllers. The
> > first two NVMe want 112 MSI-x vectors, and the last only 31 vectors. The
> > test runs 'modprobe nvme' and 'modprobe -r nvme' in a loop with 10
> > second delay between each step. Repro occurs within a few iterations,
> > sometimes already broken after the initial boot.
>
> That doesn't sound right. The vectors should be spread evenly across the
> CPUs. So ENOSPC should never happen.
>
> Can you please take snapshots of /sys/kernel/debug/irq/ between the
> modprobe and modprobe -r steps?
The allocation fails because CPU1 has exhausted its vector space here:
[002] d... 333.028216: irq_matrix_alloc_managed: bit=34 cpu=1 online=1 avl=0 alloc=202 managed=2 online_maps=112 global_avl=22085, global_rsvd=158, total_alloc=460
Now the interesting question is how that happens.
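
To check whether other CPUs hit the same condition in a longer trace, something
along these lines may help. It is only a rough sketch: the function names are
made up for illustration, the sample line is the one quoted above, and it
assumes that avl is the per-CPU count of still available vectors, so avl=0
means the CPU has nothing left to hand out.

```python
import re

# Sample line copied from the trace quoted above.
SAMPLE = ("[002] d... 333.028216: irq_matrix_alloc_managed: bit=34 cpu=1 "
          "online=1 avl=0 alloc=202 managed=2 online_maps=112 "
          "global_avl=22085, global_rsvd=158, total_alloc=460")


def parse_matrix_event(line):
    """Return (event_name, fields) for one irq_matrix trace line, else None.

    Hypothetical helper: it only relies on the 'key=value' layout visible
    in the sample line, not on any documented trace format.
    """
    m = re.search(r"(irq_matrix_\w+):\s*(.*)", line)
    if not m:
        return None
    fields = {k: int(v) for k, v in re.findall(r"(\w+)=(-?\d+)", m.group(2))}
    return m.group(1), fields


def exhausted_cpus(lines):
    """Return the sorted CPU numbers that appear with avl=0 in the trace.

    Assumption: avl=0 means no vector is left on that CPU.
    """
    cpus = set()
    for line in lines:
        parsed = parse_matrix_event(line)
        if parsed is None:
            continue
        _, fields = parsed
        if "cpu" in fields and fields.get("avl") == 0:
            cpus.add(fields["cpu"])
    return sorted(cpus)


if __name__ == "__main__":
    name, fields = parse_matrix_event(SAMPLE)
    print(name, fields["cpu"], fields["avl"], fields["alloc"])
    print(exhausted_cpus([SAMPLE]))
```

Feeding it the complete irq_matrix event log instead of the single sample line
would list every CPU that ran out of vectors at some point.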
Thanks,
tglx