From: Nicholas Piggin <npiggin@gmail.com>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: fweisbec@gmail.com, hpa@zytor.com, linux-kernel@vger.kernel.org,
	linux-tip-commits@vger.kernel.org, mingo@kernel.org,
	peterz@infradead.org, rafael.j.wysocki@intel.com,
	tglx@linutronix.de, torvalds@linux-foundation.org
Subject: Re: [tip:sched/core] sched/isolation: Require a present CPU in housekeeping mask
Date: Tue, 07 May 2019 09:50:24 +1000
Message-ID: <1557186148.ocs72ssdjc.astroid@bobo.none>
In-Reply-To: <20190506151615.GA14529@lenoir>

Frederic Weisbecker's on May 7, 2019 1:16 am:
> On Sat, May 04, 2019 at 04:59:12PM +1000, Nicholas Piggin wrote:
>> Frederic Weisbecker's on May 4, 2019 10:27 am:
>> > On Fri, May 03, 2019 at 10:47:37AM -0700, tip-bot for Nicholas Piggin wrote:
>> >> Commit-ID:  9219565aa89033a9cfdae788c1940473a1253d6c
>> >> Gitweb:     https://git.kernel.org/tip/9219565aa89033a9cfdae788c1940473a1253d6c
>> >> Author:     Nicholas Piggin <npiggin@gmail.com>
>> >> AuthorDate: Thu, 11 Apr 2019 13:34:47 +1000
>> >> Committer:  Ingo Molnar <mingo@kernel.org>
>> >> CommitDate: Fri, 3 May 2019 19:42:58 +0200
>> >> 
>> >> sched/isolation: Require a present CPU in housekeeping mask
>> >> 
>> >> During housekeeping mask setup, currently a possible CPU is required.
>> >> That does not guarantee the CPU would be available at boot time, so
>> >> check to ensure that at least one present CPU is in the mask.
>> > 
>> > I have a doubt about the requirements and semantics of cpu_present_mask.
>> > IIUC a present CPU means that it is physically plugged in (from ACPI
>> > perspective) but might not be logically plugged in (set on cpu_online_mask).
>> 
>> Right, a subset of cpu_possible_mask, superset of cpu_online_mask. It
>> means that CPU can be brought online at any time.
>> 
>> > But do we have the guarantee that a present CPU _will_ be online at least once
>> > right after the boot? After all, kernel parameters such as "maxcpus=" can prevent
>> > some CPUs from being turned on. I guess there are even more creative ways to achieve
>> > that.
>> > 
>> > In any case we really require the housekeeper to be forced online. Perhaps
>> > I missed that enforcement somewhere in the patchset?
>> 
>> No, I think you're right: it may be able to boot without any online CPU
>> in the housekeeping mask. Maybe we can just cpu_up() a CPU in the
>> housekeeping mask with a warning that it has overridden their SMP
>> command line option. I'll take a look at it.
> 
> But then what if cpu_up() fails? In this case I can think of only two
> answers:
> 
> * Force the boot CPU as the housekeeper.
> * Rollback the whole thing: nohz and all isolation.

If cpu_up() fails even though the CPU is in the present map and we
explicitly selected it as the housekeeper? I think it would be okay to
print a message telling the admin to correct the config, and panic.

We make a best effort to let the system boot and limp along, but if
you misconfigure it, crashing is not unreasonable. There are lots of
command line misconfigurations that will cause the same thing.

The primary problem with my patch that needs to be addressed is that
the error is not explicitly caught and printed if the housekeeper
does not come up, so the system might die in non-obvious ways.
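
To make the behaviour I have in mind concrete, here is a rough userspace
sketch, not kernel code and not what the patch does today. All names
(toy_cpumask_t, toy_cpu_up(), toy_housekeeping_check()) are made up for
illustration, the masks are plain 64-bit bitmaps, and the "can_boot"
argument is just an assumption standing in for whether cpu_up() would
succeed on a given CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Toy stand-in for a cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t toy_cpumask_t;

enum hk_result { HK_OK, HK_NO_PRESENT_CPU, HK_BRINGUP_FAILED };

/* Hypothetical stand-in for cpu_up(): succeeds only for CPUs in can_boot. */
static bool toy_cpu_up(unsigned int cpu, toy_cpumask_t can_boot)
{
	return (can_boot >> cpu) & 1;
}

/*
 * Check that the housekeeping mask contains a present CPU, and if none of
 * the housekeeping CPUs is online yet, try to bring one up. Every failure
 * is reported explicitly instead of being left to show up later.
 */
static enum hk_result toy_housekeeping_check(toy_cpumask_t housekeeping,
					     toy_cpumask_t present,
					     toy_cpumask_t *online,
					     toy_cpumask_t can_boot)
{
	toy_cpumask_t candidates = housekeeping & present;
	unsigned int cpu;

	assert(online != NULL);

	if (!candidates) {
		fprintf(stderr, "housekeeping: no present CPU in mask\n");
		return HK_NO_PRESENT_CPU;
	}

	/* A housekeeping CPU is already online: nothing to do. */
	if (housekeeping & *online)
		return HK_OK;

	for (cpu = 0; cpu < 64; cpu++) {
		if (!((candidates >> cpu) & 1))
			continue;
		if (toy_cpu_up(cpu, can_boot)) {
			*online |= (toy_cpumask_t)1 << cpu;
			fprintf(stderr,
				"housekeeping: forced CPU %u online, overriding SMP command line\n",
				cpu);
			return HK_OK;
		}
	}

	/* This is where the real thing would tell the admin and panic. */
	fprintf(stderr,
		"housekeeping: no housekeeping CPU came up, fix the configuration\n");
	return HK_BRINGUP_FAILED;
}
```

The point is only that each outcome gets its own message; whether the
last case should panic or fall back to something else is the open
question above.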

> 
> The second solution looks sane to me. After all if the user doesn't
> include CPU 0 in the housekeeping set, forcing it isn't going to
> help much.
> 
> But that means we must enhance the isolation code (nohz included)
> to be able to dynamically add/del CPUs to the housekeeping/isolation
> set. That's not going to be easy but it's a necessary evolution
> of that subsystem since we want to drive it through cpusets.
> 
> I should start working on that.

I considered that when looking at the series, but couldn't justify
the complexity based on my usage (which is static boot time).

If you have other uses for it, then that would solve all these boot
time issues as well, which would be nice.

Thanks,
Nick