On Tue 23-08-16 10:26:03, Michal Hocko wrote: > On Mon 22-08-16 19:47:09, Michal Hocko wrote: > > On Mon 22-08-16 19:29:36, Michal Hocko wrote: > > > On Mon 22-08-16 18:45:54, Michal Hocko wrote: > > > [...] > > > > I have no idea why those numbers are so different on my laptop > > > > yet. It surely looks suspicious. I will try to debug this further > > > > tomorrow. > > > > > > Hmm, so I've tried to use my version of awk on other machine and vice > > > versa and it didn't make any difference. So this is independent on the > > > awk version it seems. So I've tried to strace /usr/bin/time and > > > wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, {ru_utime={0, 0}, ru_stime={0, 688438}, ...}) = 9128 > > > > > > so the kernel indeed reports 0 user time for some reason. Note I > > > was testing with 4.7 and right now with 4.8.0-rc3 kernel (no local > > > modifications). The other machine which reports non-0 utime is 3.12 > > > SLES kernel. Maybe I am hitting some accounting bug. At first I was > > > suspecting CONFIG_NO_HZ_FULL because that is the main difference between > > > my and the other machine but then I've noticed that the tests I was > > > doing in kvm have this disabled too.. so it must be something else. > > > > 4.5 reports non-0 while 4.6 zero utime. NO_HZ configuration is the same > > in both kernels. > > and one more thing. It is not like utime accounting would be completely > broken and always report 0. Other commands report non-0 values even on > 4.6 kernels. I will try to bisect this down later today. OK, so it seems I found it. I was quite lucky because account_user_time is not all that popular function and there were basically no changes besides Riks ff9a9b4c4334 ("sched, time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity") and that seems to cause the regression. Reverting the commit on top of the current mmotm seems to fix the issue for me. And just to give Rik more context. While debugging overhead of the /proc//smaps I am getting a misleading output from /usr/bin/time -v (source for ./max_mmap is [1]) root@test1:~# uname -r 4.5.0-rc6-bisect1-00025-gff9a9b4c4334 root@test1:~# ./max_map pid:2990 maps:65515 root@test1:~# /usr/bin/time -v awk '/^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}' /proc/2990/smaps rss:263368 pss:262203 Command being timed: "awk /^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss} /proc/2990/smaps" User time (seconds): 0.00 System time (seconds): 0.45 Percent of CPU this job got: 98% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.46 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 1796 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 1 Minor (reclaiming a frame) page faults: 83 Voluntary context switches: 6 Involuntary context switches: 6 Swaps: 0 File system inputs: 248 File system outputs: 0 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0 See the User time being 0 (as you can see above in the quoted text it is not a rounding error in userspace or something similar because wait4 really returns 0). Now with the revert root@test1:~# uname -r 4.5.0-rc6-revert-00026-g7fc86f968bf5 root@test1:~# /usr/bin/time -v awk '/^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}' /proc/3015/smaps rss:263316 pss:262199 Command being timed: "awk /^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss} /proc/3015/smaps" User time (seconds): 0.18 System time (seconds): 0.29 Percent of CPU this job got: 97% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.50 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 1760 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 1 Minor (reclaiming a frame) page faults: 79 Voluntary context switches: 5 Involuntary context switches: 7 Swaps: 0 File system inputs: 248 File system outputs: 0 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0 So it looks like the whole user time is accounted as the system time. My config is attached and yes I do have CONFIG_VIRT_CPU_ACCOUNTING_GEN enabled. Could you have a look please? [1] http://lkml.kernel.org/r/20160817082200.GA10547@dhcp22.suse.cz -- Michal Hocko SUSE Labs