All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mark Rutland <mark.rutland@arm.com>
To: Sven Schnelle <svens@linux.ibm.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>,
	Pingfan Liu <kernelfans@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-arm-kernel@lists.infradead.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Marc Zyngier <maz@kernel.org>,
	Joey Gouly <joey.gouly@arm.com>,
	Sami Tolvanen <samitolvanen@google.com>,
	Julien Thierry <julien.thierry@arm.com>,
	Yuichi Ito <ito-yuichi@fujitsu.com>,
	linux-kernel@vger.kernel.org, Vasily Gorbik <gor@linux.ibm.com>,
	Heiko Carstens <hca@linux.ibm.com>
Subject: Re: [PATCHv2 0/5] arm64/irqentry: remove duplicate housekeeping of
Date: Tue, 28 Sep 2021 11:26:04 +0100	[thread overview]
Message-ID: <20210928102604.GE1924@C02TD0UTHF1T.local> (raw)
In-Reply-To: <yt9dtui53u30.fsf@linux.ibm.com>

On Tue, Sep 28, 2021 at 11:52:51AM +0200, Sven Schnelle wrote:
> Mark Rutland <mark.rutland@arm.com> writes:
> 
> > On Mon, Sep 27, 2021 at 05:09:22PM -0700, Paul E. McKenney wrote:
> >> On Mon, Sep 27, 2021 at 10:23:18AM +0100, Mark Rutland wrote:
> >> > Sure; I didn't mean to suggest those weren't balanced! The problem here
> >> > is *nesting*. Due to the structure of our entry code and the core IRQ
> >> > code, when handling an IRQ we have a sequence:
> >> > 
> >> > 	irq_enter() // arch code
> >> > 	irq_enter() // irq code
> >> > 
> >> > 	< irq handler here >
> >> > 
> >> > 	irq_exit() // irq code
> >> > 	irq_exit() // arch code
> >> > 
> >> > ... and if we use something like rcu_is_cpu_rrupt_from_idle() in the
> >> > middle (e.g. as part of rcu_sched_clock_irq()), this will not give the
> >> > expected result because of the additional nesting, since
> >> > rcu_is_cpu_rrupt_from_idle() seems to expect that dynticks_nmi_nesting
> >> > is only incremented once per exception entry, when it does:
> >> > 
> >> > 	/* Are we at first interrupt nesting level? */
> >> > 	nesting = __this_cpu_read(rcu_data.dynticks_nmi_nesting);
> >> > 	if (nesting > 1)
> >> > 		return false;
> >> > 
> >> > What I'm trying to figure out is whether that expectation is legitimate,
> >> > and assuming so, where the entry/exit should happen.
> >> 
> >> Oooh...
> >> 
> >> The penalty for fooling rcu_is_cpu_rrupt_from_idle() is that RCU will
> >> be unable to detect a userspace quiescent state for a non-nohz_full
> >> CPU.  That could result in RCU CPU stall warnings if a user task runs
> >> continuously on a given CPU for more than 21 seconds (60 seconds in
> >> some distros).  And this can easily happen if the user has a CPU-bound
> >> thread that is the only runnable task on that CPU.
> >> 
> >> So, yes, this does need some sort of resolution.
> >> 
> >> The traditional approach is (as you surmise) to have only a single call
> >> to irq_enter() on exception entry and only a single call to irq_exit()
> >> on exception exit.  If this is feasible, it is highly recommended.
> >
> > Cool; that's roughly what I was expecting / hoping to hear!
> >
> >> In theory, we could have that "1" in "nesting > 1" be a constant supplied
> >> by the architecture (you would want "3" if I remember correctly) but
> >> in practice could we please avoid this?  For one thing, if there is
> >> some other path into the kernel for your architecture that does only a
> >> single irq_enter(), then rcu_is_cpu_rrupt_from_idle() just doesn't stand
> >> a chance.  It would need to compare against a different value depending
> >> on what exception showed up.  Even if that cannot happen, it would be
> >> better if your architecture could remain in blissful ignorance of the
> >> colorful details of ->dynticks_nmi_nesting manipulations.
> >
> > I completely agree. I think it's much harder to keep that in check than
> > to enforce a "once per architectural exception" policy in the arch code.
> >
> >> Another approach would be for the arch code to supply RCU a function that
> >> it calls.  If there is such a function (or perhaps better, if some new
> >> Kconfig option is enabled), RCU invokes it.  Otherwise, it compares to
> >> "1" as it does now.  But you break it, you buy it!  ;-)
> >
> > I guess we could look at the exception regs and inspect the original
> > context, but it sounds overkill...
> >
> > I think the cleanest thing is to leave this to arch code, and have the
> > common IRQ code stay well clear. Unfortunately most architectures
> > (including arch/arm) still need the common IRQ code to handle this, so
> > we'll have to make that conditional on Kconfig, something like the below
> > (build+boot tested only).
> >
> > If there are no objections, I'll go check who else needs the same
> > treatment (IIUC at least s390 will), and spin that as a real
> > patch/series.
> 
> Hmm, s390 doesn't use handle_domain_irq() and doesn't have
> HANDLE_DOMAIN_IRQ set. So i don't think the patch below applies to s390.
> However, i'll follow the code to make sure we're not calling
> irq_enter/irq_exit twice.

I wasn't clear, but for s390, my concern was that in do_io_irq() and
do_ext_irq() you have the sequence:

	irqentry_enter()	// calls rcu_irq_enter()
	irq_enter();		// calls rcu_irq_enter() then irq_enter_rcu();

	< handler>

	irq_exit();		// calls __irq_exit_rcu then rcu_irq_exit();
	irqentry_exit();	// calls rcu_irq_exit()

... and so IIUC you call rcu_irq_enter() and rcu_irq_exit() twice,
getting the same double-increment of `dynticks_nmi_nesting` per
interrupt, and the same potential problem with
rcu_is_cpu_rrupt_from_idle().

Thanks,
Mark.

WARNING: multiple messages have this Message-ID (diff)
From: Mark Rutland <mark.rutland@arm.com>
To: Sven Schnelle <svens@linux.ibm.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>,
	Pingfan Liu <kernelfans@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-arm-kernel@lists.infradead.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Marc Zyngier <maz@kernel.org>,
	Joey Gouly <joey.gouly@arm.com>,
	Sami Tolvanen <samitolvanen@google.com>,
	Julien Thierry <julien.thierry@arm.com>,
	Yuichi Ito <ito-yuichi@fujitsu.com>,
	linux-kernel@vger.kernel.org, Vasily Gorbik <gor@linux.ibm.com>,
	Heiko Carstens <hca@linux.ibm.com>
Subject: Re: [PATCHv2 0/5] arm64/irqentry: remove duplicate housekeeping of
Date: Tue, 28 Sep 2021 11:26:04 +0100	[thread overview]
Message-ID: <20210928102604.GE1924@C02TD0UTHF1T.local> (raw)
In-Reply-To: <yt9dtui53u30.fsf@linux.ibm.com>

On Tue, Sep 28, 2021 at 11:52:51AM +0200, Sven Schnelle wrote:
> Mark Rutland <mark.rutland@arm.com> writes:
> 
> > On Mon, Sep 27, 2021 at 05:09:22PM -0700, Paul E. McKenney wrote:
> >> On Mon, Sep 27, 2021 at 10:23:18AM +0100, Mark Rutland wrote:
> >> > Sure; I didn't mean to suggest those weren't balanced! The problem here
> >> > is *nesting*. Due to the structure of our entry code and the core IRQ
> >> > code, when handling an IRQ we have a sequence:
> >> > 
> >> > 	irq_enter() // arch code
> >> > 	irq_enter() // irq code
> >> > 
> >> > 	< irq handler here >
> >> > 
> >> > 	irq_exit() // irq code
> >> > 	irq_exit() // arch code
> >> > 
> >> > ... and if we use something like rcu_is_cpu_rrupt_from_idle() in the
> >> > middle (e.g. as part of rcu_sched_clock_irq()), this will not give the
> >> > expected result because of the additional nesting, since
> >> > rcu_is_cpu_rrupt_from_idle() seems to expect that dynticks_nmi_nesting
> >> > is only incremented once per exception entry, when it does:
> >> > 
> >> > 	/* Are we at first interrupt nesting level? */
> >> > 	nesting = __this_cpu_read(rcu_data.dynticks_nmi_nesting);
> >> > 	if (nesting > 1)
> >> > 		return false;
> >> > 
> >> > What I'm trying to figure out is whether that expectation is legitimate,
> >> > and assuming so, where the entry/exit should happen.
> >> 
> >> Oooh...
> >> 
> >> The penalty for fooling rcu_is_cpu_rrupt_from_idle() is that RCU will
> >> be unable to detect a userspace quiescent state for a non-nohz_full
> >> CPU.  That could result in RCU CPU stall warnings if a user task runs
> >> continuously on a given CPU for more than 21 seconds (60 seconds in
> >> some distros).  And this can easily happen if the user has a CPU-bound
> >> thread that is the only runnable task on that CPU.
> >> 
> >> So, yes, this does need some sort of resolution.
> >> 
> >> The traditional approach is (as you surmise) to have only a single call
> >> to irq_enter() on exception entry and only a single call to irq_exit()
> >> on exception exit.  If this is feasible, it is highly recommended.
> >
> > Cool; that's roughly what I was expecting / hoping to hear!
> >
> >> In theory, we could have that "1" in "nesting > 1" be a constant supplied
> >> by the architecture (you would want "3" if I remember correctly) but
> >> in practice could we please avoid this?  For one thing, if there is
> >> some other path into the kernel for your architecture that does only a
> >> single irq_enter(), then rcu_is_cpu_rrupt_from_idle() just doesn't stand
> >> a chance.  It would need to compare against a different value depending
> >> on what exception showed up.  Even if that cannot happen, it would be
> >> better if your architecture could remain in blissful ignorance of the
> >> colorful details of ->dynticks_nmi_nesting manipulations.
> >
> > I completely agree. I think it's much harder to keep that in check than
> > to enforce a "once per architectural exception" policy in the arch code.
> >
> >> Another approach would be for the arch code to supply RCU a function that
> >> it calls.  If there is such a function (or perhaps better, if some new
> >> Kconfig option is enabled), RCU invokes it.  Otherwise, it compares to
> >> "1" as it does now.  But you break it, you buy it!  ;-)
> >
> > I guess we could look at the exception regs and inspect the original
> > context, but it sounds overkill...
> >
> > I think the cleanest thing is to leave this to arch code, and have the
> > common IRQ code stay well clear. Unfortunately most architectures
> > (including arch/arm) still need the common IRQ code to handle this, so
> > we'll have to make that conditional on Kconfig, something like the below
> > (build+boot tested only).
> >
> > If there are no objections, I'll go check who else needs the same
> > treatment (IIUC at least s390 will), and spin that as a real
> > patch/series.
> 
> Hmm, s390 doesn't use handle_domain_irq() and doesn't have
> HANDLE_DOMAIN_IRQ set. So i don't think the patch below applies to s390.
> However, i'll follow the code to make sure we're not calling
> irq_enter/irq_exit twice.

I wasn't clear, but for s390, my concern was that in do_io_irq() and
do_ext_irq() you have the sequence:

	irqentry_enter()	// calls rcu_irq_enter()
	irq_enter();		// calls rcu_irq_enter() then irq_enter_rcu();

	< handler>

	irq_exit();		// calls __irq_exit_rcu then rcu_irq_exit();
	irqentry_exit();	// calls rcu_irq_exit()

... and so IIUC you call rcu_irq_enter() and rcu_irq_exit() twice,
getting the same double-increment of `dynticks_nmi_nesting` per
interrupt, and the same potential problem with
rcu_is_cpu_rrupt_from_idle().

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-09-28 10:26 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-24 13:28 [PATCHv2 0/5] arm64/irqentry: remove duplicate housekeeping of Pingfan Liu
2021-09-24 13:28 ` Pingfan Liu
2021-09-24 13:28 ` [PATCHv2 1/5] arm64/entry-common: push the judgement of nmi ahead Pingfan Liu
2021-09-24 13:28   ` Pingfan Liu
2021-09-24 17:53   ` Mark Rutland
2021-09-24 17:53     ` Mark Rutland
2021-09-25 15:39     ` Pingfan Liu
2021-09-25 15:39       ` Pingfan Liu
2021-09-30 13:32       ` Mark Rutland
2021-09-30 13:32         ` Mark Rutland
2021-10-08  4:01         ` Pingfan Liu
2021-10-08  4:01           ` Pingfan Liu
2021-10-08 14:55           ` Pingfan Liu
2021-10-08 14:55             ` Pingfan Liu
2021-10-08 17:25             ` Mark Rutland
2021-10-08 17:25               ` Mark Rutland
2021-10-09  3:49               ` Pingfan Liu
2021-10-09  3:49                 ` Pingfan Liu
2021-10-08 15:45           ` Paul E. McKenney
2021-10-08 15:45             ` Paul E. McKenney
2021-10-09  4:14             ` Pingfan Liu
2021-10-09  4:14               ` Pingfan Liu
2021-09-24 13:28 ` [PATCHv2 2/5] irqchip/GICv3: expose handle_nmi() directly Pingfan Liu
2021-09-24 13:28   ` Pingfan Liu
2021-09-24 13:28 ` [PATCHv2 3/5] kernel/irq: make irq_{enter,exit}() in handle_domain_irq() arch optional Pingfan Liu
2021-09-24 13:28   ` [PATCHv2 3/5] kernel/irq: make irq_{enter, exit}() " Pingfan Liu
2021-09-28  8:55   ` [PATCHv2 3/5] kernel/irq: make irq_{enter,exit}() " Mark Rutland
2021-09-28  8:55     ` Mark Rutland
2021-09-29  3:15     ` Pingfan Liu
2021-09-29  3:15       ` Pingfan Liu
2021-09-24 13:28 ` [PATCHv2 4/5] irqchip/GICv3: let gic_handle_irq() utilize irqentry on arm64 Pingfan Liu
2021-09-24 13:28   ` Pingfan Liu
2021-09-28  9:10   ` Mark Rutland
2021-09-28  9:10     ` Mark Rutland
2021-09-29  3:10     ` Pingfan Liu
2021-09-29  3:10       ` Pingfan Liu
2021-09-29  7:20       ` Marc Zyngier
2021-09-29  7:20         ` Marc Zyngier
2021-09-29  8:27         ` Pingfan Liu
2021-09-29  8:27           ` Pingfan Liu
2021-09-29  9:23           ` Mark Rutland
2021-09-29  9:23             ` Mark Rutland
2021-09-29 11:40             ` Pingfan Liu
2021-09-29 11:40               ` Pingfan Liu
2021-09-29 14:29             ` Pingfan Liu
2021-09-29 14:29               ` Pingfan Liu
2021-09-29 17:41               ` Mark Rutland
2021-09-29 17:41                 ` Mark Rutland
2021-09-24 13:28 ` [PATCHv2 5/5] irqchip/GICv3: make reschedule-ipi light weight Pingfan Liu
2021-09-24 13:28   ` Pingfan Liu
2021-09-29  7:24   ` Marc Zyngier
2021-09-29  7:24     ` Marc Zyngier
2021-09-29  8:32     ` Pingfan Liu
2021-09-29  8:32       ` Pingfan Liu
2021-09-24 17:36 ` [PATCHv2 0/5] arm64/irqentry: remove duplicate housekeeping of Mark Rutland
2021-09-24 17:36   ` Mark Rutland
2021-09-24 22:59   ` Paul E. McKenney
2021-09-24 22:59     ` Paul E. McKenney
2021-09-27  9:23     ` Mark Rutland
2021-09-27  9:23       ` Mark Rutland
2021-09-28  0:09       ` Paul E. McKenney
2021-09-28  0:09         ` Paul E. McKenney
2021-09-28  8:32         ` Mark Rutland
2021-09-28  8:32           ` Mark Rutland
2021-09-28  8:35           ` Mark Rutland
2021-09-28  8:35             ` Mark Rutland
2021-09-28  9:52           ` Sven Schnelle
2021-09-28  9:52             ` Sven Schnelle
2021-09-28 10:26             ` Mark Rutland [this message]
2021-09-28 10:26               ` Mark Rutland
2021-09-28 13:55           ` Paul E. McKenney
2021-09-28 13:55             ` Paul E. McKenney
2021-09-25 15:12   ` Pingfan Liu
2021-09-25 15:12     ` Pingfan Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210928102604.GE1924@C02TD0UTHF1T.local \
    --to=mark.rutland@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=ito-yuichi@fujitsu.com \
    --cc=joey.gouly@arm.com \
    --cc=julien.thierry@arm.com \
    --cc=kernelfans@gmail.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=paulmck@kernel.org \
    --cc=samitolvanen@google.com \
    --cc=svens@linux.ibm.com \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.