From: Russell King - ARM Linux <linux@arm.linux.org.uk>
To: Nicolas Pitre <nico@fluxnic.net>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Krzysztof Kozlowski <k.kozlowski@samsung.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Stephen Boyd <sboyd@codeaurora.org>,
	linux-kernel@vger.kernel.org, Will Deacon <will.deacon@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	Marek Szyprowski <m.szyprowski@samsung.com>
Subject: Re: [PATCH v2] ARM: Don't use complete() during __cpu_die
Date: Wed, 25 Feb 2015 17:00:11 +0000
Message-ID: <20150225170011.GC8656@n2100.arm.linux.org.uk>
In-Reply-To: <alpine.LFD.2.11.1502250941210.25484@knanqh.ubzr>

On Wed, Feb 25, 2015 at 11:47:48AM -0500, Nicolas Pitre wrote:
> I completely agree with the r/w spinlock. Something like this ought to 
> be sufficient to make gic_raise_softirq() reentrant which is the issue 
> here, right?  I've been stress-testing it for a while with no problems 
> so far.

No.  The issue is that we need a totally lockless way to raise an IPI
during CPU hot-unplug, so we can raise an IPI in __cpu_die() to tell
the __cpu_kill() code that it's safe to proceed to platform code.

As soon as that IPI has been received, the receiving CPU can decide
to cut power to the dying CPU.  So, it's entirely possible that power
could be lost on the dying CPU before the unlock has become visible.
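
To make the race above concrete, here is a minimal userspace sketch using
C11 atomics.  It is not kernel code: ipi_lock, ipi_pending and every
function name are hypothetical stand-ins, and the power cut is modelled
by an early return.  It only illustrates why a send path that takes a
lock cannot be the dying CPU's last action, while a single store can.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins: a lock guarding the IPI send path, and a flag
 * that plays the role of the IPI itself. */
static _Atomic bool ipi_lock = false;
static _Atomic bool ipi_pending = false;

/* Try to take the send-path lock; returns false if it is already held. */
static bool ipi_trylock(void)
{
	return !atomic_exchange_explicit(&ipi_lock, true, memory_order_acquire);
}

static void ipi_unlock(void)
{
	/* Unlocking a lock that is not held would be a bug in this sketch. */
	assert(atomic_load(&ipi_lock));
	atomic_store_explicit(&ipi_lock, false, memory_order_release);
}

/*
 * Locked send from the dying CPU.  power_lost_after_send models the
 * receiver cutting power as soon as it sees the IPI, before the unlock
 * has happened.  Returns true if the IPI was sent.
 */
static bool dying_cpu_raise_ipi_locked(bool power_lost_after_send)
{
	if (!ipi_trylock())
		return false;
	atomic_store_explicit(&ipi_pending, true, memory_order_release);
	if (power_lost_after_send)
		return true;	/* the unlock below never runs */
	ipi_unlock();
	return true;
}

/* Lockless send: one store, and nothing left to do afterwards. */
static void dying_cpu_raise_ipi_lockless(void)
{
	atomic_store_explicit(&ipi_pending, true, memory_order_release);
}
```

In this model, a call to dying_cpu_raise_ipi_locked(true) leaves ipi_lock
held for good, so every later ipi_trylock() fails.  The lockless variant
has no such trailing step to lose.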

It's a catch-22 - the reason we're sending the IPI is for synchronisation,
but right now we need another form of synchronisation because we're
using a form of synchronisation...

We could just use the spin-and-poll solution instead of an IPI, but
I really don't like that - when you see the complexity needed to
re-initialise it each time, it quickly becomes very yucky because
there is no well defined order between __cpu_die() and __cpu_kill()
being called by the two respective CPUs.

The last patch I saw doing that had multiple bits to indicate success
and timeout, and rather a lot of complexity to recover from failures,
and reinitialise state for a second CPU going down.
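
For reference, the bare spin-and-poll handshake can be sketched in
userspace C11 as below.  This is an assumption-laden illustration, not
any of the patches discussed in this thread: cpu_state_flag, the enum
and the function names are all hypothetical, and a real version would
need per-CPU state and a time-based timeout.  The part that gets ugly is
handshake_reset(): with no defined order between the two CPUs, it is not
obvious when it is safe to call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical handshake state for a single CPU. */
enum cpu_state { CPU_ALIVE = 0, CPU_DEAD = 1 };

static _Atomic int cpu_state_flag = CPU_ALIVE;

/* Dying side: a single release store, with nothing to undo afterwards. */
static void dying_cpu_signal(void)
{
	atomic_store_explicit(&cpu_state_flag, CPU_DEAD, memory_order_release);
}

/*
 * Killing side: poll the flag up to max_spins times.  Returns true if
 * the dying CPU has signalled, false on timeout.
 */
static bool killer_wait(unsigned long max_spins)
{
	/* A zero budget could never observe the flag. */
	assert(max_spins > 0);
	for (unsigned long i = 0; i < max_spins; i++) {
		if (atomic_load_explicit(&cpu_state_flag,
					 memory_order_acquire) == CPU_DEAD)
			return true;
	}
	return false;
}

/*
 * Must run before this CPU can be hot-unplugged again.  Deciding where
 * this call can safely go is the complexity referred to above.
 */
static void handshake_reset(void)
{
	atomic_store_explicit(&cpu_state_flag, CPU_ALIVE, memory_order_relaxed);
}
```

If the reset is skipped, or runs at the wrong moment, the next
killer_wait() for the same CPU either succeeds immediately on stale
state or misses the signal and times out.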

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
