* 4.2: CONFIG_NO_HZ_FULL_ALL effectively disabling non-boot CPUs
@ 2015-10-10 19:14 Meelis Roos
  2015-10-10 19:24 ` Paul E. McKenney
  0 siblings, 1 reply; 4+ messages in thread
From: Meelis Roos @ 2015-10-10 19:14 UTC (permalink / raw)
  To: Linux Kernel list; +Cc: Thomas Gleixner, Paul E. McKenney, Frederic Weisbecker

Short summary: turning on CONFIG_NO_HZ_FULL_ALL seems to disable all
non-boot CPUs for the scheduler.

A couple of days ago I noticed that make -j8 on a 4-core i5 is very slow
(with 4.3.0-rc4+git). Looking at top ('1' for per-CPU states), only the
first CPU is loaded and the other 3 CPUs are 100% idle. This seems to be
a problem on 3 of my desktop machines (different-generation Intel CPUs:
i5-660, i5-2400, i3-3220). All of these computers run custom kernels.

Further investigation showed that CPU affinity was set to 1 (CPU0 only)
for init and all of its children. Kernel threads had affinities 1, 2, 4,
8 and f (which seems normal).

Even more interesting was the behaviour after setting affinity to f for 
all userland processes and then running make -j4. The other cores were 
still idle!
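
The affinity values above (1, 2, 4, 8 and f) are hexadecimal bitmasks in
which bit N stands for CPU N. As a minimal sketch of that encoding, the
following Python helpers convert between a mask and a list of CPU
numbers. The helper names are made up for illustration, and the final
check uses os.sched_getaffinity(), which is only available on some
platforms (Linux among them).

```python
import os


def cpus_from_mask(mask_hex):
    """Return the CPU numbers whose bits are set in a hex affinity mask."""
    mask = int(mask_hex, 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


def mask_from_cpus(cpus):
    """Return the hex affinity mask that has one bit set per CPU number."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


if __name__ == "__main__":
    # The masks reported in the message above.
    for mask_hex in ("1", "2", "4", "8", "f"):
        print(mask_hex, "->", cpus_from_mask(mask_hex))

    # Affinity of the current process, where the platform supports it.
    if hasattr(os, "sched_getaffinity"):
        print("this process:", mask_from_cpus(os.sched_getaffinity(0)))
```

Read this way, a mask of 1 allows CPU0 only, while f allows CPU0 through
CPU3.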

Switching back to 4.2.0 with my config, the problem persisted. 4.2.3 as
packaged by Debian worked fine. 4.0.0 and 4.1.0 with my config also
worked fine. systemd and sysvinit behaved the same, and no affinity was
configured for systemd.

So I did a kernel config bisection between my kernel config and the
Debian config and arrived at CONFIG_NO_HZ_FULL_ALL. Debian has it off, I
had it on. Turning it off fixed the scheduling, and the system spread
the tasks across all the cores.
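
The first step of such a config bisection is listing the options on
which the two configs differ. A rough sketch in Python, assuming the
usual .config syntax ("CONFIG_X=value" and "# CONFIG_X is not set") and
treating an option that is absent from one file as "n"; the function
names and the two small config fragments are made up for illustration
and are not the configs from this report.

```python
def parse_config(text):
    """Map each CONFIG_ option to its value ('n' if marked as not set)."""
    options = {}
    suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
        elif line.startswith("# CONFIG_") and line.endswith(suffix):
            # Drop the leading "# " and the trailing " is not set".
            options[line[2:-len(suffix)]] = "n"
    return options


def diff_configs(a, b):
    """Return {option: (value_in_a, value_in_b)} for options that differ."""
    return {
        name: (a.get(name, "n"), b.get(name, "n"))
        for name in sorted(set(a) | set(b))
        if a.get(name, "n") != b.get(name, "n")
    }


if __name__ == "__main__":
    # Hypothetical fragments, for illustration only.
    mine = "CONFIG_SMP=y\nCONFIG_NO_HZ_FULL_ALL=y\n"
    theirs = "CONFIG_SMP=y\n# CONFIG_NO_HZ_FULL_ALL is not set\n"
    for name, values in diff_configs(parse_config(mine),
                                     parse_config(theirs)).items():
        print(name, values)
```

Half of the differing options can then be flipped per test build until a
single option remains.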

I do not remember changing this value for a long time; I enabled these
options after they were introduced and have used them since. So it seems
this broke in 4.2.0 but was working in 4.1, though I do not have a 4.1
config saved anywhere (many make oldconfigs since).

Bisection between 4.1 and 4.2 is possible but not easy since the 
machines are usually actively used when I am near them.

-- 
Meelis Roos (mroos@linux.ee)


* Re: 4.2: CONFIG_NO_HZ_FULL_ALL effectively disabling non-boot CPUs
  2015-10-10 19:14 4.2: CONFIG_NO_HZ_FULL_ALL effectively disabling non-boot CPUs Meelis Roos
@ 2015-10-10 19:24 ` Paul E. McKenney
  2015-10-11  7:27   ` Meelis Roos
  2015-10-12  0:51   ` Frederic Weisbecker
  0 siblings, 2 replies; 4+ messages in thread
From: Paul E. McKenney @ 2015-10-10 19:24 UTC (permalink / raw)
  To: Meelis Roos; +Cc: Linux Kernel list, Thomas Gleixner, Frederic Weisbecker

On Sat, Oct 10, 2015 at 10:14:25PM +0300, Meelis Roos wrote:
> Short summary: turning on CONFIG_NO_HZ_FULL_ALL seems to disable all 
> non-boot CPUs for scheduler.
> 
> [...]

This is expected and intended behavior.  The whole point of
CONFIG_NO_HZ_FULL_ALL is to keep everything off of the non-boot CPUs
that is not explicitly placed there.  Without CONFIG_NO_HZ_FULL_ALL,
you can use the nohz_full boot parameter to select exactly which
CPUs are to behave this way.

							Thanx, Paul
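
The nohz_full boot parameter mentioned in the reply above takes a CPU
list such as "1-3". As a hedged sketch, the Python below expands such a
list and, where the running kernel exposes it, reads the current set
from /sys/devices/system/cpu/nohz_full. The function names are made up
for illustration, and the sysfs file is assumed to hold either an empty
line or a CPU list; anything else is reported as unknown (None).

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a CPU list such as '1-3' or '1,3-5' into sorted CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def read_nohz_full(path="/sys/devices/system/cpu/nohz_full"):
    """Return the nohz_full CPUs, or None if they cannot be determined."""
    try:
        return parse_cpu_list(Path(path).read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("nohz_full=1-3 covers CPUs", parse_cpu_list("1-3"))
    print("currently nohz_full:", read_nohz_full())
```

On a 4-CPU machine, nohz_full=1-3 would select every CPU except CPU0,
which matches the set described above for CONFIG_NO_HZ_FULL_ALL.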



* Re: 4.2: CONFIG_NO_HZ_FULL_ALL effectively disabling non-boot CPUs
  2015-10-10 19:24 ` Paul E. McKenney
@ 2015-10-11  7:27   ` Meelis Roos
  2015-10-12  0:51   ` Frederic Weisbecker
  1 sibling, 0 replies; 4+ messages in thread
From: Meelis Roos @ 2015-10-11  7:27 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Linux Kernel list, Thomas Gleixner, Frederic Weisbecker

> > Short summary: turning on CONFIG_NO_HZ_FULL_ALL seems to disable all 
> > non-boot CPUs for scheduler.

[...]

> This is expected and intended behavior.  The whole point of
> CONFIG_NO_HZ_FULL_ALL is to keep everything off of the non-boot CPUs
> that is not explicitly placed there.  Without CONFIG_NO_HZ_FULL_ALL,
> you can use the nohz_full boot parameter to select exactly which
> CPUs are to behave this way.

OK, thanks - disabled it on my desktops anyway now.

Perhaps the Kconfig description of it should explain the consequences
more clearly so other people do not stumble upon it like I did?

-- 
Meelis Roos (mroos@linux.ee)


* Re: 4.2: CONFIG_NO_HZ_FULL_ALL effectively disabling non-boot CPUs
  2015-10-10 19:24 ` Paul E. McKenney
  2015-10-11  7:27   ` Meelis Roos
@ 2015-10-12  0:51   ` Frederic Weisbecker
  1 sibling, 0 replies; 4+ messages in thread
From: Frederic Weisbecker @ 2015-10-12  0:51 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Meelis Roos, Linux Kernel list, Thomas Gleixner

On Sat, Oct 10, 2015 at 12:24:39PM -0700, Paul E. McKenney wrote:
> On Sat, Oct 10, 2015 at 10:14:25PM +0300, Meelis Roos wrote:
> > Short summary: turning on CONFIG_NO_HZ_FULL_ALL seems to disable all 
> > non-boot CPUs for scheduler.
> > 
> > [...]
> 
> This is expected and intended behavior.  The whole point of
> CONFIG_NO_HZ_FULL_ALL is to keep everything off of the non-boot CPUs
> that is not explicitly placed there.  Without CONFIG_NO_HZ_FULL_ALL,
> you can use the nohz_full boot parameter to select exactly which
> CPUs are to behave this way.

I'm preparing a revert of this. Many people are complaining about it.
Most of the time it is about accidentally enabling NO_HZ_FULL_ALL, and I
could fix that with a warning so that users do not waste time chasing a
non-bug. But Mike says that CONFIG_NO_HZ_FULL_ALL makes the machine
unusable for anything other than isolation workloads, whereas the machine
may also need to run some "normal" workload before or after an isolation
task.
