* Subject: PROBLEM: CPU accounting/scheduling regression in v4.6 CPU scheduling patchset?
@ 2016-07-03 17:24 Vladimir Panteleev
2016-07-04 20:40 ` Thomas Gleixner
2016-07-05 8:13 ` Peter Zijlstra
0 siblings, 2 replies; 3+ messages in thread
From: Vladimir Panteleev @ 2016-07-03 17:24 UTC
To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86; +Cc: linux-kernel
Hi,
Since updating my PC to Linux 4.6, I noticed the following problems:
1. CPU-bound tasks which use all CPU cores have a severe impact on
responsiveness. For example, the following bash command (which
simply starts one busyloop per core) is enough to make the machine
almost completely unresponsive:
for N in $(seq $(nproc)) ; do while true ; do : ; done & done
2. Nearly all tasks in the process listing are shown with 0% CPU
usage, even when they're CPU-bound. The only exceptions are the
kernel migration and kthreadd tasks, and occasionally the init
process.
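One way to check the second symptom without relying on top or ps is to read the raw per-process counters from /proc/<pid>/stat, which is where those tools get their numbers. The following is only a sketch: the helper name cpu_ticks is made up for illustration, and it assumes the field layout documented in proc(5), where utime and stime are fields 14 and 15.

```shell
#!/usr/bin/env bash
# Print utime + stime (in clock ticks) from a /proc/<pid>/stat-style file.
# cpu_ticks is a hypothetical helper name, not an existing tool.
cpu_ticks() {
    local stat_file=$1 line rest
    read -r line < "$stat_file" || return 1
    # The comm field (2nd) may itself contain spaces or parentheses, so drop
    # everything up to the last ") " before splitting on whitespace.
    rest=${line##*) }
    set -- $rest
    # After stripping, $1 is field 3 (state), so field 14 (utime) is ${12}
    # and field 15 (stime) is ${13}.
    [ $# -ge 13 ] || return 1
    echo $(( ${12} + ${13} ))
}

# Usage sketch (left commented so sourcing this file has no side effects):
#   ( while : ; do : ; done ) & pid=$!
#   before=$(cpu_ticks "/proc/$pid/stat"); sleep 1
#   after=$(cpu_ticks "/proc/$pid/stat")
#   echo "ticks in 1s: $(( after - before )) (CLK_TCK=$(getconf CLK_TCK))"
#   kill "$pid"
```

On a kernel with working accounting, a busy loop sampled this way should consume close to CLK_TCK ticks per second. A difference that stays at zero would match the 0% readings described above.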
I have bisected the problem to commit
1cf4f629d9d246519a1e76c021806f2a51ddba4d ("cpu/hotplug: Move online
calls to hotplugged cpu"), which is part of Thomas Gleixner's CPU
hotplug refactoring patchset [1]. It introduces both problems
described above.
My system is a GIGABYTE X79S-UP5-WIFI motherboard (F5f BIOS) with an
i7-4960X CPU, running Arch Linux. I've reproduced with both the
distro's kernel config [2], as well as a minimal config for my
system. I can reproduce the problems on the latest rc at the moment,
v4.7-rc5.
Comparing dmesg output before and after 1cf4f629, I see no notable
differences.
I noticed an existing thread "S3 resume regression" [3] referencing
this commit, however it describes a different problem. I also found a
Bugzilla issue for the zero CPU usage problem [4], however it has no
replies.
[1]: https://lkml.org/lkml/2016/2/26/806
[2]: https://aur.archlinux.org/cgit/aur.git/tree/config.x86_64?h=linux-git
[3]: https://lkml.org/lkml/2016/5/11/238
[4]: https://bugzilla.kernel.org/show_bug.cgi?id=120151
Stuff REPORTING-BUGS told me to include:
ver_linux output:
https://dump.v.panteleev.md/616390d43a4c6a3d085acc5eaa390c82/16%3A58%3A08-stdin.txt
/proc/cpuinfo:
https://dump.v.panteleev.md/5dfeba5d7c64028de51d50559b566088/16%3A58%3A49-stdin.txt
/proc/modules:
https://dump.v.panteleev.md/868c0f2b23651be8164975fa5d7e7aab/16%3A59%3A18-stdin.txt
/proc/ioports:
https://dump.v.panteleev.md/5e44aa12cc403dbd783b0273bd3edab4/17%3A01%3A33-stdin.txt
/proc/iomem:
https://dump.v.panteleev.md/110a8fdd0f647fd8d729c54f4f01a3d0/17%3A01%3A49-stdin.txt
"lspci -vvv" output:
https://dump.v.panteleev.md/0c2448fa8a872e34c4555d876b656013/17%3A02%3A18-stdin.txt
/proc/scsi/scsi:
https://dump.v.panteleev.md/6efa007ce74f0bf4ce10ae56690c63de/17%3A02%3A54-stdin.txt
dmesg output:
https://dump.v.panteleev.md/b8a3ba608a914a3d70667dad697dddfb/1467563818.log
* Re: Subject: PROBLEM: CPU accounting/scheduling regression in v4.6 CPU scheduling patchset?
From: Thomas Gleixner @ 2016-07-04 20:40 UTC
To: Vladimir Panteleev; +Cc: Ingo Molnar, H. Peter Anvin, x86, LKML, Peter Zijlstra
On Sun, 3 Jul 2016, Vladimir Panteleev wrote:
> Since updating my PC to Linux 4.6, I noticed the following problems:
>
> 1. CPU-bound tasks which use all CPU cores have a severe impact on
> responsiveness. For example, the following bash command (which
> simply starts one busyloop per core) is enough to make the machine
> almost completely unresponsive:
>
> for N in $(seq $(nproc)) ; do while true ; do : ; done & done
>
> 2. Nearly all tasks in the process listing are shown with 0% CPU
> usage, even when they're CPU-bound. The only exceptions are the
> kernel migration and kthreadd tasks, and occasionally the init
> process.
>
> I have bisected the problem to commit
> 1cf4f629d9d246519a1e76c021806f2a51ddba4d ("cpu/hotplug: Move online
> calls to hotplugged cpu"), which is part of Thomas Gleixner's CPU
> hotplug refactoring patchset [1]. It introduces both problems
> described above.
I doubt that, but that commit has been a bisect victim before ...
> My system is a GIGABYTE X79S-UP5-WIFI motherboard (F5f BIOS) with an
> i7-4960X CPU, running Arch Linux. I've reproduced with both the
> distro's kernel config [2], as well as a minimal config for my
> system. I can reproduce the problems on the latest rc at the moment,
> v4.7-rc5.
>
> Comparing dmesg output before and after 1cf4f629, I see no notable
> differences.
>
> I noticed an existing thread "S3 resume regression" [3] referencing
> this commit, however it describes a different problem. I also found a
> Bugzilla issue for the zero CPU usage problem [4], however it has no
> replies.
That one says:
* After an hour or less (I have no idea), the top/ps start working
* I do have exactly the same problem with the LTS branch 4.4.14
* With 4.5.4 I cannot reproduce the problem just after booting
I tried to reproduce the issue on a couple of machines, but no luck.
No idea at the moment, but Cc'ed scheduler folks.
Thanks,
tglx
* Re: Subject: PROBLEM: CPU accounting/scheduling regression in v4.6 CPU scheduling patchset?
From: Peter Zijlstra @ 2016-07-05 8:13 UTC
To: Vladimir Panteleev
Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, linux-kernel
On Sun, Jul 03, 2016 at 05:24:11PM +0000, Vladimir Panteleev wrote:
> dmesg output:
> https://dump.v.panteleev.md/b8a3ba608a914a3d70667dad697dddfb/1467563818.log
[ 0.059350] smpboot: CPU0: Intel(R) Core(TM) i7-4960X CPU @ 3.60GHz (family: 0x6, model: 0x3e, stepping: 0x4)
...
[ 0.069372] x86: Booting SMP configuration:
[ 0.069375] .... node #0, CPUs: #1
[ 0.132420] TSC synchronization [CPU#0 -> CPU#1]:
[ 0.132427] Measured 170505122558937 cycles TSC warp between CPUs, turning off TSC clock.
[ 0.132435] tsc: Marking TSC unstable due to check_tsc_sync_source failed
[ 0.136636] #2 #3 #4 #5 #6 #7 #8 #9 #10 #11
[ 0.773136] x86: Booted up 1 node, 12 CPUs
How can TSC be borken on single socket IvyBridge !?
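For anyone following along, the clocksource the kernel ended up using after marking the TSC unstable can be read from sysfs. This is a rough sketch only: the function name clocksource_report is invented for illustration, and the optional directory argument exists purely so it can be pointed at a copy of the files. The default path is the standard sysfs location.

```shell
#!/usr/bin/env bash
# Report the active and available clocksources from a sysfs-style directory.
# clocksource_report is a hypothetical helper name.
clocksource_report() {
    local dir=${1:-/sys/devices/system/clocksource/clocksource0}
    local current available
    read -r current   < "$dir/current_clocksource"   || return 1
    read -r available < "$dir/available_clocksource" || return 1
    echo "current: $current"
    echo "available: $available"
    # Flag the case seen in the dmesg above, where TSC was turned off.
    if [ "$current" != "tsc" ]; then
        echo "note: TSC is not the active clocksource"
    fi
}

# Usage sketch:
#   clocksource_report
#   dmesg | grep -i tsc
```

Comparing this output between a kernel before and after the bisected commit might show whether the TSC warp message is new or was already present.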