* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
@ 2011-11-28 22:28 pomac
2011-11-29 7:52 ` Michal Hocko
0 siblings, 1 reply; 23+ messages in thread
From: pomac @ 2011-11-28 22:28 UTC (permalink / raw)
To: linux-kernel; +Cc: mhocko, rjw, tino.keitel, t.artem
Hi,
All this time I have been thinking I'm the only one - and I've been too
loaded with work during working hours and tired when home =P
Anyways, I've been seeing this since -rc1 on:
Intel(R) Core(TM) i5 CPU U520 @ 1.07GHz - x86-64
AMD Phenom(tm) II X6 1090T - x86-64
AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ - x86-64
Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz - i686
Configs available on demand - I've been running the same config for
quite some time though.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-11-28 22:28 [REGRESSION] [Linux 3.2] top/htop and all other CPU usage pomac
@ 2011-11-29 7:52 ` Michal Hocko
2011-11-29 11:38 ` Artem S. Tashkinov
` (2 more replies)
0 siblings, 3 replies; 23+ messages in thread
From: Michal Hocko @ 2011-11-29 7:52 UTC (permalink / raw)
To: pomac; +Cc: linux-kernel, rjw, tino.keitel, t.artem
On Mon 28-11-11 23:28:03, pomac@vapor.com wrote:
> Hi,
>
> All this time i have been thinking i'm the only one - and i've been to
> loaded with work during working hours and tired when home =P
>
> Anyways, I've neen seeing this since -rc1 on:
> Intel(R) Core(TM) i5 CPU U520 @ 1.07GHz - x86-64
> AMD Phenom(tm) II X6 1090T - x86-64
> AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ - x86-64
> Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz - i686
>
> Configs available on demand - I've been running the same config for
> quite some time though.
As I have written in another email, could you post your config and collect
the following data?
# take 30 snapshots of /proc/stat, one second apart, one file per timestamp
for i in `seq 30`;
do
	cat /proc/stat > `date +'%s'`
	sleep 1
done
export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
# for all your available CPUs
# (the first output line shows raw counters, the following ones show deltas)
grep cpu0 * | while read cpu user nice sys idle iowait rest;
do
	echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
	old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
done
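For reference, the same delta computation can be run over every per-CPU line in one go. This is only a rough sketch, assuming the snapshots are the timestamp-named files written by the loop above and that nothing else lives in that directory; stat_deltas is a made-up helper name, not an existing tool. Unlike the loop above, it does not print the raw counters of the first snapshot.

```shell
#!/bin/sh
# Hypothetical helper: print deltas of user/nice/system/idle/iowait between
# consecutive /proc/stat snapshots for every cpuN line.
# Assumption: directory $1 holds only snapshot files named by epoch seconds.
stat_deltas() {
    dir=${1:-.}
    # Sort numerically so the snapshots are processed in time order.
    for f in $(ls "$dir" | sort -n); do
        # Prefix each per-CPU line with its timestamp; the aggregate "cpu"
        # line has no digit after "cpu" and is skipped.
        awk -v ts="$f" '/^cpu[0-9]+ / { print ts, $1, $2, $3, $4, $5, $6 }' "$dir/$f"
    done | awk '
        {
            cpu = $2
            if (cpu in seen) {
                # Difference against the previous snapshot of the same CPU.
                printf "%s:%s %d %d %d %d %d\n", $1, cpu,
                    $3 - u[cpu], $4 - n[cpu], $5 - s[cpu], $6 - i[cpu], $7 - w[cpu]
            }
            seen[cpu] = 1
            u[cpu] = $3; n[cpu] = $4; s[cpu] = $5; i[cpu] = $6; w[cpu] = $7
        }'
}
```

Calling stat_deltas with the snapshot directory prints one line per CPU and timestamp, in the same timestamp:cpuN layout as the output of the loop above.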
Thanks
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
* Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-11-29 7:52 ` Michal Hocko
@ 2011-11-29 11:38 ` Artem S. Tashkinov
2011-11-29 12:31 ` Michal Hocko
2011-12-02 13:35 ` Michal Hocko
2011-11-29 17:23 ` Ian Kumlien
2011-11-29 17:31 ` Ian Kumlien
2 siblings, 2 replies; 23+ messages in thread
From: Artem S. Tashkinov @ 2011-11-29 11:38 UTC (permalink / raw)
To: mhocko; +Cc: pomac, linux-kernel, rjw, tino.keitel
[-- Attachment #1: Type: text/plain, Size: 2189 bytes --]
On Nov 29, 2011, Michal Hocko <mhocko@suse.cz> wrote:
> As I have written in other email could you post your config and collect
> the following data?
> for i in `seq 30`;
> do
> cat /proc/stat > `date +'%s'`
> sleep 1
> done
> export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
>
> # for all your available CPUs
> grep cpu0 * | while read cpu user nice sys idle iowait rest;
> do
> echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> done
1322566208:cpu0 5199 0 2931 357890604 2541
1322566209:cpu0 0 0 1 0 0
1322566210:cpu0 0 0 0 0 0
1322566211:cpu0 0 0 0 0 0
1322566212:cpu0 0 0 0 0 0
1322566213:cpu0 0 0 0 0 0
1322566214:cpu0 1 0 0 0 0
1322566215:cpu0 2 0 0 0 0
1322566216:cpu0 3 0 0 0 0
1322566217:cpu0 2 0 0 0 0
1322566218:cpu0 4 0 0 0 0
1322566219:cpu0 1 0 0 0 0
1322566220:cpu0 2 0 0 0 0
1322566221:cpu0 2 0 1 0 0
1322566222:cpu0 1 0 0 0 0
1322566223:cpu0 2 0 0 0 0
1322566224:cpu0 1 0 1 0 0
1322566225:cpu0 1 0 0 0 0
1322566226:cpu0 2 0 0 0 0
1322566227:cpu0 1 0 1 0 0
1322566228:cpu0 2 0 0 0 0
1322566229:cpu0 2 0 0 0 0
1322566230:cpu0 6 0 3 0 0
1322566231:cpu0 1 0 0 0 0
1322566232:cpu0 2 0 0 0 0
1322566233:cpu0 3 0 0 0 0
1322566234:cpu0 2 0 0 0 0
1322566235:cpu0 2 0 2 0 0
1322566236:cpu0 0 0 1 0 0
1322566237:cpu0 1 0 0 0 0
$ grep . -r /sys/devices/system/cpu/cpuidle/
/sys/devices/system/cpu/cpuidle/current_driver:intel_idle
/sys/devices/system/cpu/cpuidle/current_governor_ro:menu
$ grep . -r /sys/devices/system/cpu/cpufreq/
/sys/devices/system/cpu/cpufreq/ondemand/sampling_rate_min:10000
/sys/devices/system/cpu/cpufreq/ondemand/sampling_rate:10000
/sys/devices/system/cpu/cpufreq/ondemand/up_threshold:95
/sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor:1
/sys/devices/system/cpu/cpufreq/ondemand/ignore_nice_load:0
/sys/devices/system/cpu/cpufreq/ondemand/powersave_bias:0
/sys/devices/system/cpu/cpufreq/ondemand/io_is_busy:1
One thing I have to note: it takes some time (from 30 seconds to 10 minutes) before this bug starts manifesting itself.
[-- Attachment #2: out.tar.xz --]
[-- Type: application/octet-stream, Size: 1816 bytes --]
[-- Attachment #3: config.bz2 --]
[-- Type: application/octet-stream, Size: 13899 bytes --]
* Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-11-29 11:38 ` Artem S. Tashkinov
@ 2011-11-29 12:31 ` Michal Hocko
2011-11-29 12:44 ` Michal Hocko
2011-12-02 13:35 ` Michal Hocko
1 sibling, 1 reply; 23+ messages in thread
From: Michal Hocko @ 2011-11-29 12:31 UTC (permalink / raw)
To: Artem S. Tashkinov; +Cc: pomac, linux-kernel, rjw, tino.keitel
On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> On Nov 29, 2011, Michal Hocko <mhocko@suse.cz> wrote:
>
> > As I have written in other email could you post your config and collect
> > the following data?
> > for i in `seq 30`;
> > do
> > cat /proc/stat > `date +'%s'`
> > sleep 1
> > done
> > export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
> >
> > # for all your available CPUs
> > grep cpu0 * | while read cpu user nice sys idle iowait rest;
> > do
> > echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> > old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> > done
>
> 1322566208:cpu0 5199 0 2931 357890604 2541
> 1322566209:cpu0 0 0 1 0 0
> 1322566210:cpu0 0 0 0 0 0
> 1322566211:cpu0 0 0 0 0 0
> 1322566212:cpu0 0 0 0 0 0
> 1322566213:cpu0 0 0 0 0 0
> 1322566214:cpu0 1 0 0 0 0
> 1322566215:cpu0 2 0 0 0 0
> 1322566216:cpu0 3 0 0 0 0
> 1322566217:cpu0 2 0 0 0 0
> 1322566218:cpu0 4 0 0 0 0
> 1322566219:cpu0 1 0 0 0 0
> 1322566220:cpu0 2 0 0 0 0
> 1322566221:cpu0 2 0 1 0 0
> 1322566222:cpu0 1 0 0 0 0
> 1322566223:cpu0 2 0 0 0 0
> 1322566224:cpu0 1 0 1 0 0
> 1322566225:cpu0 1 0 0 0 0
> 1322566226:cpu0 2 0 0 0 0
> 1322566227:cpu0 1 0 1 0 0
> 1322566228:cpu0 2 0 0 0 0
> 1322566229:cpu0 2 0 0 0 0
> 1322566230:cpu0 6 0 3 0 0
> 1322566231:cpu0 1 0 0 0 0
> 1322566232:cpu0 2 0 0 0 0
> 1322566233:cpu0 3 0 0 0 0
> 1322566234:cpu0 2 0 0 0 0
> 1322566235:cpu0 2 0 2 0 0
> 1322566236:cpu0 0 0 1 0 0
> 1322566237:cpu0 1 0 0 0 0
Hmm, really strange. It looks like idle/iowait is not accounted at
all, which would explain why the numbers you are seeing are so weird.
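A quick way to check for this symptom on another box, as a rough sketch: compare two /proc/stat snapshots and list the CPUs whose idle and iowait counters are identical in both. check_idle_accounting is a made-up name used only for illustration. Note that a CPU which is genuinely 100% busy for the whole interval would be listed too, so this is a hint rather than a proof.

```shell
#!/bin/sh
# Hypothetical helper: given two /proc/stat snapshots (older one first),
# list the CPUs whose idle and iowait counters did not change.
# Fields of a cpuN line: name user nice system idle iowait ...
#
# Possible usage:
#   cat /proc/stat > before; sleep 5; cat /proc/stat > after
#   check_idle_accounting before after
check_idle_accounting() {
    awk '
        /^cpu[0-9]+ / {
            if (FNR == NR) {
                # First file: remember idle ($5) and iowait ($6) per CPU.
                idle[$1] = $5; iowait[$1] = $6
            } else if (($1 in idle) && $5 == idle[$1] && $6 == iowait[$1]) {
                print $1 ": idle/iowait did not advance"
            }
        }' "$1" "$2"
}
```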
>
> $ grep . -r /sys/devices/system/cpu/cpuidle/
> /sys/devices/system/cpu/cpuidle/current_driver:intel_idle
> /sys/devices/system/cpu/cpuidle/current_governor_ro:menu
I will check whether I have a machine with intel_idle somewhere around.
>
> $ grep . -r /sys/devices/system/cpu/cpufreq/
> /sys/devices/system/cpu/cpufreq/ondemand/sampling_rate_min:10000
> /sys/devices/system/cpu/cpufreq/ondemand/sampling_rate:10000
> /sys/devices/system/cpu/cpufreq/ondemand/up_threshold:95
> /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor:1
> /sys/devices/system/cpu/cpufreq/ondemand/ignore_nice_load:0
> /sys/devices/system/cpu/cpufreq/ondemand/powersave_bias:0
> /sys/devices/system/cpu/cpufreq/ondemand/io_is_busy:1
>
> One thing I have to note, it takes some time (from 30 seconds to 10
> minutes) before this bug starts manifesting itself.
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
* Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-11-29 12:31 ` Michal Hocko
@ 2011-11-29 12:44 ` Michal Hocko
2011-11-29 12:54 ` Artem S. Tashkinov
0 siblings, 1 reply; 23+ messages in thread
From: Michal Hocko @ 2011-11-29 12:44 UTC (permalink / raw)
To: Artem S. Tashkinov; +Cc: pomac, linux-kernel, rjw, tino.keitel
On Tue 29-11-11 13:31:56, Michal Hocko wrote:
> On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
[...]
> > $ grep . -r /sys/devices/system/cpu/cpuidle/
> > /sys/devices/system/cpu/cpuidle/current_driver:intel_idle
> > /sys/devices/system/cpu/cpuidle/current_governor_ro:menu
Could you try with acpi_idle driver?
echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
* Re: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-11-29 12:44 ` Michal Hocko
@ 2011-11-29 12:54 ` Artem S. Tashkinov
2011-11-29 13:10 ` Michal Hocko
0 siblings, 1 reply; 23+ messages in thread
From: Artem S. Tashkinov @ 2011-11-29 12:54 UTC (permalink / raw)
To: mhocko; +Cc: pomac, linux-kernel, rjw, tino.keitel
> On Nov 29, 2011, Michal Hocko wrote:
>
> On Tue 29-11-11 13:31:56, Michal Hocko wrote:
> > On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> [...]
> > > $ grep . -r /sys/devices/system/cpu/cpuidle/
> > > /sys/devices/system/cpu/cpuidle/current_driver:intel_idle
> > > /sys/devices/system/cpu/cpuidle/current_governor_ro:menu
>
> Could you try with acpi_idle driver?
> echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
[root@localhost ~]# echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
-bash: /sys/devices/system/cpu/cpuidle/current_driver: Permission denied
* Re: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-11-29 12:54 ` Artem S. Tashkinov
@ 2011-11-29 13:10 ` Michal Hocko
2011-11-29 13:51 ` Artem S. Tashkinov
2011-11-29 22:51 ` Rafael J. Wysocki
0 siblings, 2 replies; 23+ messages in thread
From: Michal Hocko @ 2011-11-29 13:10 UTC (permalink / raw)
To: Artem S. Tashkinov; +Cc: pomac, linux-kernel, rjw, tino.keitel
On Tue 29-11-11 12:54:16, Artem S. Tashkinov wrote:
> > On Nov 29, 2011, Michal Hocko wrote:
> >
> > On Tue 29-11-11 13:31:56, Michal Hocko wrote:
> > > On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> > [...]
> > > > $ grep . -r /sys/devices/system/cpu/cpuidle/
> > > > /sys/devices/system/cpu/cpuidle/current_driver:intel_idle
> > > > /sys/devices/system/cpu/cpuidle/current_governor_ro:menu
> >
> > Could you try with acpi_idle driver?
> > echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
>
> [root@localhost ~]# echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
> -bash: /sys/devices/system/cpu/cpuidle/current_driver: Permission denied
It seems that this cannot be set at runtime:
SYSDEV_CLASS_ATTR(current_driver, 0444, show_current_driver, NULL);
Or maybe you need to boot with cpuidle_sysfs_switch. According to the
documentation you might be able to change the governor. I have no idea
whether this can help somehow but let's try that.
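Something along these lines could be used to try another governor. It is only a sketch, assuming the kernel was booted with cpuidle_sysfs_switch so that available_governors and a writable current_governor show up in the cpuidle sysfs directory; set_cpuidle_governor and its optional second argument are made up for illustration.

```shell
#!/bin/sh
# Sketch: switch the cpuidle governor through sysfs.
# Assumption: booted with cpuidle_sysfs_switch, so that available_governors
# and a writable current_governor exist under $base.
set_cpuidle_governor() {
    gov=$1
    base=${2:-/sys/devices/system/cpu/cpuidle}
    if [ ! -w "$base/current_governor" ]; then
        echo "current_governor is not writable under $base" >&2
        return 1
    fi
    # Refuse governors the kernel does not list as available.
    if ! tr ' ' '\n' < "$base/available_governors" | grep -qx "$gov"; then
        echo "governor '$gov' not in available_governors" >&2
        return 1
    fi
    echo "$gov" > "$base/current_governor"
}
```

For example, set_cpuidle_governor ladder (as root) would select the ladder governor if the kernel lists it.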
I haven't found any intel_idle machine in my lab so far and all other
acpi_idle machines seem to work (or at least randomly selected ones) so
this smells like a major difference in the setup.
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
* Re: Re: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-11-29 13:10 ` Michal Hocko
@ 2011-11-29 13:51 ` Artem S. Tashkinov
2011-11-29 22:51 ` Rafael J. Wysocki
1 sibling, 0 replies; 23+ messages in thread
From: Artem S. Tashkinov @ 2011-11-29 13:51 UTC (permalink / raw)
To: mhocko; +Cc: pomac, linux-kernel, rjw, tino.keitel
On Nov 29, 2011, Michal Hocko wrote:
> It seems that this is cannot be set in runtime:
> SYSDEV_CLASS_ATTR(current_driver, 0444, show_current_driver, NULL);
>
> Or maybe you need to boot with cpuidle_sysfs_switch. According to the
> documentation you might be able to change the governor. I have no idea
> whether this can help somehow but let's try that.
>
> I haven't found any intel_idle machine in my lab so far and all other
> acpi_idle machines seem to work (or at least randomly selected ones) so
> this smells like a major difference in the setup.
# echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
bash: /sys/devices/system/cpu/cpuidle/current_driver: Permission denied
BTW
# cat /sys/devices/system/cpu/cpuidle/available_governors
ladder menu
dmesg | grep -i idle
[ 0.000000] Kernel command line: root=/dev/sda1 ro cpuidle_sysfs_switch
[ 0.000126] using mwait in idle threads.
[ 1.083872] intel_idle: MWAIT substates: 0x1120
[ 1.083873] intel_idle: v0.4 model 0x2A
[ 1.083874] intel_idle: lapic_timer_reliable_states 0xffffffff
[ 1.159043] cpuidle: using governor ladder
[ 1.168842] cpuidle: using governor menu
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-11-29 7:52 ` Michal Hocko
2011-11-29 11:38 ` Artem S. Tashkinov
@ 2011-11-29 17:23 ` Ian Kumlien
2011-11-29 17:31 ` Ian Kumlien
2 siblings, 0 replies; 23+ messages in thread
From: Ian Kumlien @ 2011-11-29 17:23 UTC (permalink / raw)
To: Michal Hocko; +Cc: linux-kernel, rjw, tino.keitel, t.artem
[-- Attachment #1: Type: text/plain, Size: 3089 bytes --]
On tis, 2011-11-29 at 08:52 +0100, Michal Hocko wrote:
> On Mon 28-11-11 23:28:03, pomac@vapor.com wrote:
> > Hi,
> >
> > All this time i have been thinking i'm the only one - and i've been to
> > loaded with work during working hours and tired when home =P
> >
> > Anyways, I've neen seeing this since -rc1 on:
> > Intel(R) Core(TM) i5 CPU U520 @ 1.07GHz - x86-64
> > AMD Phenom(tm) II X6 1090T - x86-64
> > AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ - x86-64
> > Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz - i686
> >
> > Configs available on demand - I've been running the same config for
> > quite some time though.
>
> As I have written in other email could you post your config and collect
> the following data?
> for i in `seq 30`;
> do
> cat /proc/stat > `date +'%s'`
> sleep 1
> done
> export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
>
> # for all your available CPUs
> grep cpu0 * | while read cpu user nice sys idle iowait rest;
> do
> echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> done
1322587011:cpu0 5838553 164 1252921 1844674407370 78234
1322587012:cpu0 4 0 1 0 0
1322587013:cpu0 4 0 2 0 0
1322587014:cpu0 2 0 1 0 0
1322587015:cpu0 3 0 2 0 0
1322587016:cpu0 7 0 2 0 4
1322587017:cpu0 2 0 2 0 0
1322587018:cpu0 1 0 1 0 0
1322587019:cpu0 4 0 2 0 1
1322587020:cpu0 5 0 3 0 5
1322587021:cpu0 7 0 1 0 3
1322587022:cpu0 4 0 2 0 0
1322587023:cpu0 4 0 2 0 0
1322587024:cpu0 3 0 3 0 0
1322587025:cpu0 3 0 1 0 0
1322587026:cpu0 4 0 2 0 4
1322587027:cpu0 2 0 2 0 0
1322587028:cpu0 2 0 1 0 0
1322587029:cpu0 4 0 1 0 0
1322587030:cpu0 5 0 1 0 0
1322587032:cpu0 5 0 1 0 4
1322587033:cpu0 6 0 1 0 0
1322587034:cpu0 1 0 2 0 0
1322587035:cpu0 3 0 2 0 0
1322587036:cpu0 3 0 1 0 0
1322587037:cpu0 4 0 2 0 2
1322587038:cpu0 4 0 1 0 0
1322587039:cpu0 2 0 1 0 0
1322587040:cpu0 3 0 1 0 0
1322587041:cpu0 4 0 3 0 0
1322587011:cpu1 5944677 172 1287816 1844674407370 871
1322587012:cpu1 4 0 2 0 0
1322587013:cpu1 2 0 2 0 0
1322587014:cpu1 3 0 0 0 0
1322587015:cpu1 3 0 1 0 9
1322587016:cpu1 22 0 3 0 18
1322587017:cpu1 10 0 3 0 13
1322587018:cpu1 3 0 1 0 0
1322587019:cpu1 4 0 3 0 14
1322587020:cpu1 14 0 2 0 7
1322587021:cpu1 16 0 1 0 2
1322587022:cpu1 5 0 0 0 0
1322587023:cpu1 7 0 1 0 0
1322587024:cpu1 8 0 1 0 0
1322587025:cpu1 2 0 2 0 0
1322587026:cpu1 4 0 2 0 5
1322587027:cpu1 7 0 2 0 0
1322587028:cpu1 5 0 0 0 0
1322587029:cpu1 4 0 2 0 0
1322587030:cpu1 6 0 0 0 0
1322587032:cpu1 3 0 2 0 0
1322587033:cpu1 6 0 1 0 0
1322587034:cpu1 2 0 1 0 0
1322587035:cpu1 4 0 2 0 0
1322587036:cpu1 3 0 1 0 0
1322587037:cpu1 4 0 0 0 0
1322587038:cpu1 8 0 1 0 0
1322587039:cpu1 4 0 2 0 0
1322587040:cpu1 4 0 0 0 0
1322587041:cpu1 8 0 1 0 0
For me the cpu usage is sporadic and only shows up as user or kernel,
nothing else.
--
Ian Kumlien -- http://demius.net || http://pomac.netswarm.net
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-11-29 7:52 ` Michal Hocko
2011-11-29 11:38 ` Artem S. Tashkinov
2011-11-29 17:23 ` Ian Kumlien
@ 2011-11-29 17:31 ` Ian Kumlien
2011-11-29 17:56 ` Michal Hocko
2 siblings, 1 reply; 23+ messages in thread
From: Ian Kumlien @ 2011-11-29 17:31 UTC (permalink / raw)
To: Michal Hocko; +Cc: linux-kernel, rjw, tino.keitel, t.artem
[-- Attachment #1: Type: text/plain, Size: 3099 bytes --]
On tis, 2011-11-29 at 08:52 +0100, Michal Hocko wrote:
> On Mon 28-11-11 23:28:03, pomac@vapor.com wrote:
> > Hi,
> >
> > All this time i have been thinking i'm the only one - and i've been to
> > loaded with work during working hours and tired when home =P
> >
> > Anyways, I've neen seeing this since -rc1 on:
> > Intel(R) Core(TM) i5 CPU U520 @ 1.07GHz - x86-64
> > AMD Phenom(tm) II X6 1090T - x86-64
> > AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ - x86-64
> > Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz - i686
> >
> > Configs available on demand - I've been running the same config for
> > quite some time though.
>
> As I have written in other email could you post your config and collect
> the following data?
> for i in `seq 30`;
> do
> cat /proc/stat > `date +'%s'`
> sleep 1
> done
> export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
>
> # for all your available CPUs
> grep cpu0 * | while read cpu user nice sys idle iowait rest;
> do
> echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> done
Sorry, the previous one was AMD X2 4400+ and now the T7200:
1322587696:cpu0 1410110 148 368843 357890604 357890604
1322587697:cpu0 0 0 1 0 0
1322587698:cpu0 1 0 1 0 0
1322587699:cpu0 2 0 1 0 0
1322587700:cpu0 0 0 1 0 0
1322587701:cpu0 0 0 1 0 0
1322587702:cpu0 0 0 1 0 0
1322587703:cpu0 1 0 0 0 0
1322587704:cpu0 0 0 0 0 0
1322587705:cpu0 0 0 1 0 0
1322587706:cpu0 1 0 1 0 0
1322587707:cpu0 0 0 1 0 0
1322587708:cpu0 1 0 0 0 0
1322587709:cpu0 1 0 0 0 0
1322587711:cpu0 0 0 2 0 0
1322587712:cpu0 0 0 0 0 0
1322587713:cpu0 1 0 1 0 0
1322587714:cpu0 1 0 0 0 0
1322587715:cpu0 1 0 1 0 0
1322587716:cpu0 1 0 0 0 0
1322587717:cpu0 0 0 1 0 0
1322587718:cpu0 1 0 1 0 0
1322587719:cpu0 0 0 1 0 0
1322587720:cpu0 0 0 0 0 0
1322587721:cpu0 2 0 1 0 0
1322587722:cpu0 1 0 0 0 0
1322587723:cpu0 0 0 3 0 0
1322587724:cpu0 1 0 0 0 0
1322587725:cpu0 0 0 0 0 0
1322587726:cpu0 0 0 1 0 0
1322587696:cpu1 1395249 509 284978 357890604 95640
1322587697:cpu1 0 0 0 0 0
1322587698:cpu1 0 0 0 0 0
1322587699:cpu1 0 0 0 0 0
1322587700:cpu1 1 0 0 0 0
1322587701:cpu1 1 0 0 0 0
1322587702:cpu1 0 0 0 0 0
1322587703:cpu1 0 0 1 0 0
1322587704:cpu1 1 0 0 0 0
1322587705:cpu1 0 0 1 0 0
1322587706:cpu1 1 0 0 0 0
1322587707:cpu1 0 0 0 0 0
1322587708:cpu1 1 0 0 0 0
1322587709:cpu1 1 0 1 0 0
1322587711:cpu1 0 0 0 0 0
1322587712:cpu1 0 0 0 0 0
1322587713:cpu1 0 0 0 0 0
1322587714:cpu1 0 0 0 0 0
1322587715:cpu1 0 0 0 0 0
1322587716:cpu1 0 0 1 0 0
1322587717:cpu1 0 0 0 0 0
1322587718:cpu1 0 0 0 0 0
1322587719:cpu1 0 0 1 0 0
1322587720:cpu1 1 0 0 0 0
1322587721:cpu1 2 0 0 0 0
1322587722:cpu1 0 0 0 0 0
1322587723:cpu1 0 0 1 0 0
1322587724:cpu1 0 0 0 0 0
1322587725:cpu1 1 0 0 0 0
1322587726:cpu1 0 0 0 0 0
Btw, perf doesn't seem to yield anything..
--
Ian Kumlien -- http://demius.net || http://pomac.netswarm.net
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-11-29 17:31 ` Ian Kumlien
@ 2011-11-29 17:56 ` Michal Hocko
2011-11-29 18:37 ` Ian Kumlien
0 siblings, 1 reply; 23+ messages in thread
From: Michal Hocko @ 2011-11-29 17:56 UTC (permalink / raw)
To: Ian Kumlien; +Cc: linux-kernel, rjw, tino.keitel, t.artem
On Tue 29-11-11 18:31:00, Ian Kumlien wrote:
> On tis, 2011-11-29 at 08:52 +0100, Michal Hocko wrote:
> > On Mon 28-11-11 23:28:03, pomac@vapor.com wrote:
> > > Hi,
> > >
> > > All this time i have been thinking i'm the only one - and i've been to
> > > loaded with work during working hours and tired when home =P
> > >
> > > Anyways, I've neen seeing this since -rc1 on:
> > > Intel(R) Core(TM) i5 CPU U520 @ 1.07GHz - x86-64
> > > AMD Phenom(tm) II X6 1090T - x86-64
> > > AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ - x86-64
> > > Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz - i686
> > >
> > > Configs available on demand - I've been running the same config for
> > > quite some time though.
> >
> > As I have written in other email could you post your config and collect
> > the following data?
> > for i in `seq 30`;
> > do
> > cat /proc/stat > `date +'%s'`
> > sleep 1
> > done
> > export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
> >
> > # for all your available CPUs
> > grep cpu0 * | while read cpu user nice sys idle iowait rest;
> > do
> > echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> > old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> > done
>
> Sorry, the previous one was AMD X2 4400+ and now the T7200:
>
> 1322587696:cpu0 1410110 148 368843 357890604 357890604
> 1322587697:cpu0 0 0 1 0 0
> 1322587698:cpu0 1 0 1 0 0
> 1322587699:cpu0 2 0 1 0 0
> 1322587700:cpu0 0 0 1 0 0
> 1322587701:cpu0 0 0 1 0 0
> 1322587702:cpu0 0 0 1 0 0
> 1322587703:cpu0 1 0 0 0 0
> 1322587704:cpu0 0 0 0 0 0
> 1322587705:cpu0 0 0 1 0 0
> 1322587706:cpu0 1 0 1 0 0
> 1322587707:cpu0 0 0 1 0 0
> 1322587708:cpu0 1 0 0 0 0
> 1322587709:cpu0 1 0 0 0 0
> 1322587711:cpu0 0 0 2 0 0
> 1322587712:cpu0 0 0 0 0 0
> 1322587713:cpu0 1 0 1 0 0
> 1322587714:cpu0 1 0 0 0 0
> 1322587715:cpu0 1 0 1 0 0
> 1322587716:cpu0 1 0 0 0 0
> 1322587717:cpu0 0 0 1 0 0
> 1322587718:cpu0 1 0 1 0 0
> 1322587719:cpu0 0 0 1 0 0
> 1322587720:cpu0 0 0 0 0 0
> 1322587721:cpu0 2 0 1 0 0
> 1322587722:cpu0 1 0 0 0 0
> 1322587723:cpu0 0 0 3 0 0
> 1322587724:cpu0 1 0 0 0 0
> 1322587725:cpu0 0 0 0 0 0
> 1322587726:cpu0 0 0 1 0 0
OK, so the same thing as in another email in the thread (no idle/io_wait
accounting).
Could you double-check which idle driver you are using?
cat /sys/devices/system/cpu/cpuidle/current_driver
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-11-29 17:56 ` Michal Hocko
@ 2011-11-29 18:37 ` Ian Kumlien
0 siblings, 0 replies; 23+ messages in thread
From: Ian Kumlien @ 2011-11-29 18:37 UTC (permalink / raw)
To: Michal Hocko; +Cc: linux-kernel, rjw, tino.keitel, t.artem
[-- Attachment #1: Type: text/plain, Size: 1877 bytes --]
On tis, 2011-11-29 at 18:56 +0100, Michal Hocko wrote:
> On Tue 29-11-11 18:31:00, Ian Kumlien wrote:
> > On tis, 2011-11-29 at 08:52 +0100, Michal Hocko wrote:
> > > On Mon 28-11-11 23:28:03, pomac@vapor.com wrote:
> > > > Hi,
> > > >
> > > > All this time i have been thinking i'm the only one - and i've been to
> > > > loaded with work during working hours and tired when home =P
> > > >
> > > > Anyways, I've neen seeing this since -rc1 on:
> > > > Intel(R) Core(TM) i5 CPU U520 @ 1.07GHz - x86-64
> > > > AMD Phenom(tm) II X6 1090T - x86-64
> > > > AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ - x86-64
> > > > Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz - i686
> > > >
> > > > Configs available on demand - I've been running the same config for
> > > > quite some time though.
> > >
> > > As I have written in other email could you post your config and collect
> > > the following data?
> > > for i in `seq 30`;
> > > do
> > > cat /proc/stat > `date +'%s'`
> > > sleep 1
> > > done
> > > export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
> > >
> > > # for all your available CPUs
> > > grep cpu0 * | while read cpu user nice sys idle iowait rest;
> > > do
> > > echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> > > old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> > > done
> >
> > Sorry, the previous one was AMD X2 4400+ and now the T7200:
--8<-- [data] --8<--
> OK, so the same thing as in another email in the thread (no idle/io_wait
> accounting).
> Could you double check what kind of idle driver are you using?
> cat /sys/devices/system/cpu/cpuidle/current_driver
"none", on both machines
--
Ian Kumlien -- http://demius.net || http://pomac.netswarm.net
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-11-29 13:10 ` Michal Hocko
2011-11-29 13:51 ` Artem S. Tashkinov
@ 2011-11-29 22:51 ` Rafael J. Wysocki
2011-11-30 10:12 ` Michal Hocko
1 sibling, 1 reply; 23+ messages in thread
From: Rafael J. Wysocki @ 2011-11-29 22:51 UTC (permalink / raw)
To: Michal Hocko; +Cc: Artem S. Tashkinov, pomac, linux-kernel, tino.keitel
On Tuesday, November 29, 2011, Michal Hocko wrote:
> On Tue 29-11-11 12:54:16, Artem S. Tashkinov wrote:
> > > On Nov 29, 2011, Michal Hocko wrote:
> > >
> > > On Tue 29-11-11 13:31:56, Michal Hocko wrote:
> > > > On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> > > [...]
> > > > > $ grep . -r /sys/devices/system/cpu/cpuidle/
> > > > > /sys/devices/system/cpu/cpuidle/current_driver:intel_idle
> > > > > /sys/devices/system/cpu/cpuidle/current_governor_ro:menu
> > >
> > > Could you try with acpi_idle driver?
> > > echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
> >
> > [root@localhost ~]# echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
> > -bash: /sys/devices/system/cpu/cpuidle/current_driver: Permission denied
>
> It seems that this is cannot be set in runtime:
> SYSDEV_CLASS_ATTR(current_driver, 0444, show_current_driver, NULL);
>
> Or maybe you need to boot with cpuidle_sysfs_switch. According to the
> documentation you might be able to change the governor. I have no idea
> whether this can help somehow but let's try that.
>
> I haven't found any intel_idle machine in my lab so far and all other
> acpi_idle machines seem to work (or at least randomly selected ones) so
> this smells like a major difference in the setup.
I'm able to reproduce that with acpi_idle on one box, but not on demand.
Thanks,
Rafael
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-11-29 22:51 ` Rafael J. Wysocki
@ 2011-11-30 10:12 ` Michal Hocko
2011-11-30 19:56 ` Rafael J. Wysocki
0 siblings, 1 reply; 23+ messages in thread
From: Michal Hocko @ 2011-11-30 10:12 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Artem S. Tashkinov, pomac, linux-kernel, tino.keitel
On Tue 29-11-11 23:51:16, Rafael J. Wysocki wrote:
> On Tuesday, November 29, 2011, Michal Hocko wrote:
> > On Tue 29-11-11 12:54:16, Artem S. Tashkinov wrote:
> > > > On Nov 29, 2011, Michal Hocko wrote:
> > > >
> > > > On Tue 29-11-11 13:31:56, Michal Hocko wrote:
> > > > > On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> > > > [...]
> > > > > > $ grep . -r /sys/devices/system/cpu/cpuidle/
> > > > > > /sys/devices/system/cpu/cpuidle/current_driver:intel_idle
> > > > > > /sys/devices/system/cpu/cpuidle/current_governor_ro:menu
> > > >
> > > > Could you try with acpi_idle driver?
> > > > echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
> > >
> > > [root@localhost ~]# echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
> > > -bash: /sys/devices/system/cpu/cpuidle/current_driver: Permission denied
> >
> > It seems that this is cannot be set in runtime:
> > SYSDEV_CLASS_ATTR(current_driver, 0444, show_current_driver, NULL);
> >
> > Or maybe you need to boot with cpuidle_sysfs_switch. According to the
> > documentation you might be able to change the governor. I have no idea
> > whether this can help somehow but let's try that.
> >
> > I haven't found any intel_idle machine in my lab so far and all other
> > acpi_idle machines seem to work (or at least randomly selected ones) so
> > this smells like a major difference in the setup.
>
> I'm able to reproduce that with acpi_driver on one box, but not on demand.
And do you see the same thing (no idle/io_wait updates)?
>
> Thanks,
> Rafael
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-11-30 10:12 ` Michal Hocko
@ 2011-11-30 19:56 ` Rafael J. Wysocki
2011-12-01 14:07 ` Michal Hocko
0 siblings, 1 reply; 23+ messages in thread
From: Rafael J. Wysocki @ 2011-11-30 19:56 UTC (permalink / raw)
To: Michal Hocko; +Cc: Artem S. Tashkinov, pomac, linux-kernel, tino.keitel
On Wednesday, November 30, 2011, Michal Hocko wrote:
> On Tue 29-11-11 23:51:16, Rafael J. Wysocki wrote:
> > On Tuesday, November 29, 2011, Michal Hocko wrote:
> > > On Tue 29-11-11 12:54:16, Artem S. Tashkinov wrote:
> > > > > On Nov 29, 2011, Michal Hocko wrote:
> > > > >
> > > > > On Tue 29-11-11 13:31:56, Michal Hocko wrote:
> > > > > > On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> > > > > [...]
> > > > > > > $ grep . -r /sys/devices/system/cpu/cpuidle/
> > > > > > > /sys/devices/system/cpu/cpuidle/current_driver:intel_idle
> > > > > > > /sys/devices/system/cpu/cpuidle/current_governor_ro:menu
> > > > >
> > > > > Could you try with acpi_idle driver?
> > > > > echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
> > > >
> > > > [root@localhost ~]# echo acpi_idle > /sys/devices/system/cpu/cpuidle/current_driver
> > > > -bash: /sys/devices/system/cpu/cpuidle/current_driver: Permission denied
> > >
> > > It seems that this is cannot be set in runtime:
> > > SYSDEV_CLASS_ATTR(current_driver, 0444, show_current_driver, NULL);
> > >
> > > Or maybe you need to boot with cpuidle_sysfs_switch. According to the
> > > documentation you might be able to change the governor. I have no idea
> > > whether this can help somehow but let's try that.
> > >
> > > I haven't found any intel_idle machine in my lab so far and all other
> > > acpi_idle machines seem to work (or at least randomly selected ones) so
> > > this smells like a major difference in the setup.
> >
> > I'm able to reproduce that with acpi_driver on one box, but not on demand.
>
> And do you see the same thing (no idle/io_wait) updates?
Actually, I was wrong. The box I'm seeing the issue on also has "none"
in /sys/devices/system/cpu/cpuidle/current_driver. Sorry for the confusion.
Thanks,
Rafael
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-11-30 19:56 ` Rafael J. Wysocki
@ 2011-12-01 14:07 ` Michal Hocko
2011-12-02 10:39 ` Michal Hocko
0 siblings, 1 reply; 23+ messages in thread
From: Michal Hocko @ 2011-12-01 14:07 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Artem S. Tashkinov, pomac, linux-kernel, tino.keitel, Len Brown
[Let's add Len to the CC for idle driver]
On Wed 30-11-11 20:56:54, Rafael J. Wysocki wrote:
> On Wednesday, November 30, 2011, Michal Hocko wrote:
> > On Tue 29-11-11 23:51:16, Rafael J. Wysocki wrote:
> > > On Tuesday, November 29, 2011, Michal Hocko wrote:
[...]
> > > > I haven't found any intel_idle machine in my lab so far and all other
> > > > acpi_idle machines seem to work (or at least randomly selected ones) so
> > > > this smells like a major difference in the setup.
> > >
> > > I'm able to reproduce that with acpi_driver on one box, but not on demand.
> >
> > And do you see the same thing (no idle/io_wait) updates?
>
> Actually, I was wrong. The box I'm seeing the issue on also has "none"
> in /sys/devices/system/cpu/cpuidle/current_driver. Sorry for the confusion.
OK. So far we have seen the issue only with the intel_idle and "none"
drivers. acpi_idle, which is what my machines use, works just fine.
I think we should focus on those two drivers.
To summarize the issue:
Users are seeing weird values reported by [h]top. CPUs seem to be at
100% even though nothing is hogging them. /proc/stat data collected
on the affected systems shows that idle/io_wait are not accounted
properly.
It has been identified that the problem disappears if a25cac51 [proc:
Consider NO_HZ when printing idle and iowait times] is reverted.
That patch fixes a bug where idle/io_wait times are not reported
properly while a CPU is tickless. It relies on get_cpu_idle_time_us,
which reports either the tick_sched idle_sleeptime or
(idle_sleeptime + now - idle_entrytime) if we are idle at the moment.
The implementation is not race free (we had better not use locks in
that path...), so we might race, e.g.:
CPU1                                    CPU2
now = ktime_get
                                        tick_nohz_start_idle
                                          ts->idle_entrytime = now;
if (ts->idle_active)
                                          ts->idle_active = 1
[...]
return idle_sleeptime
But this is OK because sleeptime will be more or less accurate. We just
skip a few ticks.
It would be worse if we had a race like:
CPU1                                    CPU2
now = ktime_get
                                        tick_nohz_start_idle
                                          now = ktime_get
                                          update_ts_time_stats()
                                          ts->idle_entrytime = now;
                                          ts->idle_active = 1
if (ts->idle_active)
  delta = ktime_sub(now, idle_entrytime)
  ktime_add(idle_sleeptime, delta)
In this case we might get an overflow from ktime_sub, but AFAIU the
ktime_* magic, the overflow should only give us a slightly smaller
idle_sleeptime after ktime_add (we effectively subtract a small number
rather than add one), right?
So it shouldn't be a big deal either.
So the question is: what is the role of the idle driver here?
Thanks
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
* Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-12-01 14:07 ` Michal Hocko
@ 2011-12-02 10:39 ` Michal Hocko
0 siblings, 0 replies; 23+ messages in thread
From: Michal Hocko @ 2011-12-02 10:39 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Artem S. Tashkinov, pomac, linux-kernel, tino.keitel, Len Brown
On Thu 01-12-11 15:07:49, Michal Hocko wrote:
[...]
> While implementation is not race free (we better not use locks in that
> path...) so we might race:
>
> E.g.
>
> CPU1                                    CPU2
> now = ktime_get
>                                         tick_nohz_start_idle
>                                           ts->idle_entrytime = now;
> if (ts->idle_active)
>                                           ts->idle_active = 1
> [...]
> return idle_sleeptime
>
> But this is OK because sleeptime will be more or less accurate. We just
> skip few ticks.
>
> It would be worse if we had a race like:
> CPU1                                    CPU2
> now = ktime_get
>                                         tick_nohz_start_idle
>                                           now = ktime_get
>                                           update_ts_time_stats()
>                                           ts->idle_entrytime = now;
>                                           ts->idle_active = 1
> if (ts->idle_active)
>   delta = ktime_sub(now, idle_entrytime)
>   ktime_add(idle_sleeptime, delta)
>
> In this case we might get an overflow from ktime_sub but AFAIU the
> ktime_* magic the overflow should cause to get smaller idle_sleeptime
> in the end after ktime_add (we do not add a small number but rather
> subtract it), right?
Scratch that. Dunno why but I thought that ktime_t has unsigned values
but it is apparently not true (tv64 is s64). Anyway the above races should
be safe.
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
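
For readers following the race analysis, here is a minimal sketch of the lockless read it describes, using plain signed 64-bit nanosecond values in place of ktime_t. The struct and function names are hypothetical stand-ins, not kernel code; only the field names idle_sleeptime, idle_entrytime and idle_active come from the thread. It illustrates why a stale "now" in the second race only yields a slightly smaller result once the arithmetic is signed.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU fields discussed in the thread.
 * All times are signed 64-bit nanoseconds, like ktime_t's tv64 (s64). */
struct idle_model {
	int64_t idle_sleeptime;  /* idle time accumulated so far */
	int64_t idle_entrytime;  /* when the CPU last entered idle */
	int idle_active;         /* non-zero while the CPU is idle */
};

/* Lockless reader in the spirit of get_cpu_idle_time_us() (assumption:
 * simplified, no iowait handling and no conversion to microseconds). */
static int64_t model_idle_time_ns(const struct idle_model *ts, int64_t now)
{
	if (ts->idle_active) {
		/* In the second race "now" was sampled before idle_entrytime
		 * was written, so delta can be a small negative number. */
		int64_t delta = now - ts->idle_entrytime;

		return ts->idle_sleeptime + delta;
	}
	/* First race: idle_active not yet visible, report the old total. */
	return ts->idle_sleeptime;
}
```

With idle_sleeptime = 50000, idle_entrytime = 1005 and a stale now = 1000, the reader returns 49995: marginally too small, but not a wrapped-around huge value.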
* Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-11-29 11:38 ` Artem S. Tashkinov
2011-11-29 12:31 ` Michal Hocko
@ 2011-12-02 13:35 ` Michal Hocko
2011-12-02 16:49 ` [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz (was: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage) Michal Hocko
2011-12-02 17:43 ` Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage Artem S. Tashkinov
1 sibling, 2 replies; 23+ messages in thread
From: Michal Hocko @ 2011-12-02 13:35 UTC (permalink / raw)
To: Artem S. Tashkinov; +Cc: pomac, linux-kernel, rjw, tino.keitel
On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> On Nov 29, 2011, Michal Hocko <mhocko@suse.cz> wrote:
>
> > As I have written in other email could you post your config and collect
> > the following data?
> > for i in `seq 30`;
> > do
> > cat /proc/stat > `date +'%s'`
> > sleep 1
> > done
> > export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
> >
> > # for all your available CPUs
> > grep cpu0 * | while read cpu user nice sys idle iowait rest;
> > do
> > echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> > old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> > done
>
> 1322566208:cpu0 5199 0 2931 357890604 2541
> 1322566209:cpu0 0 0 1 0 0
> 1322566210:cpu0 0 0 0 0 0
> 1322566211:cpu0 0 0 0 0 0
> 1322566212:cpu0 0 0 0 0 0
> 1322566213:cpu0 0 0 0 0 0
> 1322566214:cpu0 1 0 0 0 0
> 1322566215:cpu0 2 0 0 0 0
> 1322566216:cpu0 3 0 0 0 0
> 1322566217:cpu0 2 0 0 0 0
> 1322566218:cpu0 4 0 0 0 0
> 1322566219:cpu0 1 0 0 0 0
> 1322566220:cpu0 2 0 0 0 0
> 1322566221:cpu0 2 0 1 0 0
> 1322566222:cpu0 1 0 0 0 0
> 1322566223:cpu0 2 0 0 0 0
> 1322566224:cpu0 1 0 1 0 0
> 1322566225:cpu0 1 0 0 0 0
> 1322566226:cpu0 2 0 0 0 0
> 1322566227:cpu0 1 0 1 0 0
> 1322566228:cpu0 2 0 0 0 0
> 1322566229:cpu0 2 0 0 0 0
> 1322566230:cpu0 6 0 3 0 0
> 1322566231:cpu0 1 0 0 0 0
> 1322566232:cpu0 2 0 0 0 0
> 1322566233:cpu0 3 0 0 0 0
> 1322566234:cpu0 2 0 0 0 0
> 1322566235:cpu0 2 0 2 0 0
> 1322566236:cpu0 0 0 1 0 0
> 1322566237:cpu0 1 0 0 0 0
Could you post the raw data as well? Ideally starting right after boot and
collected for more than 30s (the longer the better...)
Thanks!
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
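
The deltas quoted above show why monitors report 100%: once idle and iowait stop advancing, every tick that is still counted is a busy tick. The sketch below parses two "cpuN" lines from /proc/stat and computes a busy percentage roughly the way a monitoring tool might. The struct and function names are hypothetical, and the formula is a simplified assumption, not the exact calculation used by top or htop.

```c
#include <stdio.h>

/* First five per-CPU counters of a /proc/stat "cpuN" line, in jiffies. */
struct cpu_sample {
	unsigned long long user, nice, sys, idle, iowait;
};

/* Parse "cpuN user nice sys idle iowait ..."; returns 0 on success. */
static int parse_cpu_line(const char *line, struct cpu_sample *s)
{
	int n = sscanf(line, "%*s %llu %llu %llu %llu %llu",
		       &s->user, &s->nice, &s->sys, &s->idle, &s->iowait);

	return n == 5 ? 0 : -1;
}

/* Share of non-idle ticks between two samples, in whole percent.
 * Returns 0 when no counter moved at all. */
static unsigned busy_percent(const struct cpu_sample *prev,
			     const struct cpu_sample *cur)
{
	unsigned long long busy = (cur->user - prev->user) +
				  (cur->nice - prev->nice) +
				  (cur->sys - prev->sys);
	unsigned long long idle = (cur->idle - prev->idle) +
				  (cur->iowait - prev->iowait);
	unsigned long long total = busy + idle;

	if (total == 0)
		return 0;
	return (unsigned)(busy * 100 / total);
}
```

Taking the first sample from the report (5199 0 2931 357890604 2541) and a second one where only user has advanced by 2, busy_percent() yields 100 even though the CPU did almost nothing in that second.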
* [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz (was: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage)
2011-12-02 13:35 ` Michal Hocko
@ 2011-12-02 16:49 ` Michal Hocko
2011-12-02 17:59 ` Michal Hocko
2011-12-02 17:43 ` Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage Artem S. Tashkinov
1 sibling, 1 reply; 23+ messages in thread
From: Michal Hocko @ 2011-12-02 16:49 UTC (permalink / raw)
To: Artem S. Tashkinov; +Cc: pomac, linux-kernel, rjw, tino.keitel
On Fri 02-12-11 14:35:15, Michal Hocko wrote:
> On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> > On Nov 29, 2011, Michal Hocko <mhocko@suse.cz> wrote:
> >
> > > As I have written in other email could you post your config and collect
> > > the following data?
> > > for i in `seq 30`;
> > > do
> > > cat /proc/stat > `date +'%s'`
> > > sleep 1
> > > done
> > > export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
> > >
> > > # for all your available CPUs
> > > grep cpu0 * | while read cpu user nice sys idle iowait rest;
> > > do
> > > echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> > > old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> > > done
> >
> > 1322566208:cpu0 5199 0 2931 357890604 2541
> > 1322566209:cpu0 0 0 1 0 0
> > 1322566210:cpu0 0 0 0 0 0
> > 1322566211:cpu0 0 0 0 0 0
[...]
>
> Could you post raw data as well? Ideally starting right after boot and
> collected for more than 30s (longer better...)
Ahh, I missed that you had attached the data. I also noticed that you are
using CONFIG_HZ_300, which explains the problem and why I cannot reproduce
it.
get_{idle,iowait}_time translates microseconds to cputime64_t using
usecs_to_cputime, which is just an alias for usecs_to_jiffies, and that does
	if (u > jiffies_to_usecs(MAX_JIFFY_OFFSET))
		return MAX_JIFFY_OFFSET;
which in your case (HZ=300) means that we overflow much more often than
for HZ==100. The patch below should fix this:
---
From 23882e2aabe27934df4d23b0ed52749fd4f61ab4 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Fri, 2 Dec 2011 16:17:03 +0100
Subject: [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz
get_{idle,iowait}_time use usecs_to_cputime to translate a time in
microseconds to cputime64_t. This is just an alias for usecs_to_jiffies,
which reduces the data type from u64 to unsigned int and also checks whether
the given parameter overflows jiffies_to_usecs(MAX_JIFFY_OFFSET) and
returns MAX_JIFFY_OFFSET in that case. The overflow threshold depends on
CONFIG_HZ, and especially for CONFIG_HZ_300 it is quite low (1431649781).
This results in a bug where people saw [h]top going mad, reporting 100%
CPU usage even though there was basically no CPU load at all. The reason
was simply that /proc/stat stopped reporting idle/io_wait changes (and
reported MAX_JIFFY_OFFSET), so the only changes happening were in the
user/system times.
Let's use nsecs_to_jiffies64 instead as it doesn't overflow.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 fs/proc/stat.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 42b274d..2a30d67 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -32,7 +32,7 @@ static cputime64_t get_idle_time(int cpu)
 		idle = kstat_cpu(cpu).cpustat.idle;
 		idle = cputime64_add(idle, arch_idle_time(cpu));
 	} else
-		idle = usecs_to_cputime(idle_time);
+		idle = nsecs_to_jiffies64(1000 * idle_time);
 
 	return idle;
 }
@@ -46,7 +46,7 @@ static cputime64_t get_iowait_time(int cpu)
 		/* !NO_HZ so we can rely on cpustat.iowait */
 		iowait = kstat_cpu(cpu).cpustat.iowait;
 	else
-		iowait = usecs_to_cputime(iowait_time);
+		iowait = nsecs_to_jiffies64(1000 * iowait_time);
 
 	return iowait;
 }
--
1.7.7.3
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
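
To make the failure mode concrete, here is a small user-space model of the two conversions for HZ=300. It is a sketch under stated assumptions, not the kernel implementation: the function names are hypothetical, the clamp threshold is simply the 1431649781 value quoted in the changelog, MODEL_SATURATED is an arbitrary stand-in for MAX_JIFFY_OFFSET, and the rounding is simplified.

```c
#include <stdint.h>

#define MODEL_HZ          300u
#define MODEL_CLAMP_USECS 1431649781u  /* threshold quoted in the changelog */
#define MODEL_SATURATED   UINT64_MAX   /* stand-in for MAX_JIFFY_OFFSET */

/* Model of the old path: the argument is only 32 bits wide, and anything
 * above the threshold collapses to one constant value. */
static uint64_t model_usecs_to_jiffies(uint32_t usecs)
{
	if (usecs > MODEL_CLAMP_USECS)
		return MODEL_SATURATED;
	/* round up, as a timeout conversion would */
	return ((uint64_t)usecs * MODEL_HZ + 999999u) / 1000000u;
}

/* Model of the new path: a plain 64-bit conversion with no clamp.
 * Seconds and the sub-second remainder are converted separately so the
 * multiplication cannot overflow. */
static uint64_t model_nsecs_to_jiffies64(uint64_t nsecs)
{
	const uint64_t nsec_per_sec = 1000000000u;

	return (nsecs / nsec_per_sec) * MODEL_HZ +
	       ((nsecs % nsec_per_sec) * MODEL_HZ) / nsec_per_sec;
}
```

In this model a CPU that has been idle for 1500 s and one that has been idle for 1501 s both convert to MODEL_SATURATED through the old path, so the difference between two /proc/stat samples is zero. The 64-bit path returns 450000 and 450300 jiffies, so the counter keeps advancing.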
* Re: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage
2011-12-02 13:35 ` Michal Hocko
2011-12-02 16:49 ` [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz (was: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage) Michal Hocko
@ 2011-12-02 17:43 ` Artem S. Tashkinov
1 sibling, 0 replies; 23+ messages in thread
From: Artem S. Tashkinov @ 2011-12-02 17:43 UTC (permalink / raw)
To: mhocko; +Cc: pomac, linux-kernel, rjw, tino.keitel
On Dec 2, 2011, Michal Hocko <mhocko@suse.cz> wrote:
> Could you post raw data as well? Ideally starting right after boot and
> collected for more than 30s (longer better...)
Already posted that under the "out.tar.xz" filename - just grep your mail agent history.
If it's not there, here's a copy: http://www.gossamer-threads.com/lists/linux/kernel/1460716#1460716
Best wishes,
Artem
* Re: [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz (was: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage)
2011-12-02 16:49 ` [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz (was: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage) Michal Hocko
@ 2011-12-02 17:59 ` Michal Hocko
2011-12-02 20:12 ` Artem S. Tashkinov
0 siblings, 1 reply; 23+ messages in thread
From: Michal Hocko @ 2011-12-02 17:59 UTC (permalink / raw)
To: Artem S. Tashkinov; +Cc: pomac, linux-kernel, rjw, tino.keitel
On Fri 02-12-11 17:49:17, Michal Hocko wrote:
> On Fri 02-12-11 14:35:15, Michal Hocko wrote:
> > On Tue 29-11-11 11:38:47, Artem S. Tashkinov wrote:
> > > On Nov 29, 2011, Michal Hocko <mhocko@suse.cz> wrote:
> > >
> > > > As I have written in other email could you post your config and collect
> > > > the following data?
> > > > for i in `seq 30`;
> > > > do
> > > > cat /proc/stat > `date +'%s'`
> > > > sleep 1
> > > > done
> > > > export old_user=0 old_nice=0 old_sys=0 old_idle=0 old_iowait=0;
> > > >
> > > > # for all your available CPUs
> > > > grep cpu0 * | while read cpu user nice sys idle iowait rest;
> > > > do
> > > > echo $cpu $(($user-$old_user)) $(($nice-$old_nice)) $(($sys-$old_sys)) $(($idle-$old_idle)) $(($iowait-$old_iowait))
> > > > old_user=$user old_nice=$nice old_sys=$sys old_idle=$idle old_iowait=$iowait
> > > > done
> > >
> > > 1322566208:cpu0 5199 0 2931 357890604 2541
> > > 1322566209:cpu0 0 0 1 0 0
> > > 1322566210:cpu0 0 0 0 0 0
> > > 1322566211:cpu0 0 0 0 0 0
> [...]
> >
> > Could you post raw data as well? Ideally starting right after boot and
> > collected for more than 30s (longer better...)
>
> Ahh, missed that you attached data. And also noticed that you are using
> CONFIG_HZ_300 which explains the problem and why I do cannot reproduce
> it.
>
> get_{idle,iowait}_time translates us to cputime64_t and it uses
> usecs_to_cputime which is just an alias for usecs_to_jiffies and it does
> if (u > jiffies_to_usecs(MAX_JIFFY_OFFSET))
> return MAX_JIFFY_OFFSET;
> which in your case (HZ=300) means that we overflow much more often than
> for HZ==100. The patch below should fix this:
And here is one with a cleaned-up changelog. No functional changes.
---
From 107887016b91de59194a93c751d040b05d5e37fe Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Fri, 2 Dec 2011 16:17:03 +0100
Subject: [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz
Since a25cac51 [proc: Consider NO_HZ when printing idle and iowait times]
we are reporting idle/io_wait time also while a CPU is tickless. We rely
on get_{idle,iowait}_time functions to retrieve proper data.
These functions, however, use usecs_to_cputime to translate a time in
microseconds to cputime64_t. This is just an alias for usecs_to_jiffies,
which reduces the data type from u64 to unsigned int and also checks
whether the given parameter overflows jiffies_to_usecs(MAX_JIFFY_OFFSET)
and returns MAX_JIFFY_OFFSET in that case.
When we overflow depends on CONFIG_HZ, but especially for
CONFIG_HZ_300 the limit is quite low (1431649781), so we are getting
MAX_JIFFY_OFFSET for >3000s (!) until we overflow unsigned int.
Just for reference, CONFIG_HZ_100 has an overflow window of around 20s,
CONFIG_HZ_250 ~8s and CONFIG_HZ_1000 ~2s.
This results in a bug where people saw [h]top going mad, reporting 100%
CPU usage even though there was basically no CPU load. The reason was
simply that /proc/stat stopped reporting idle/io_wait changes (and
reported MAX_JIFFY_OFFSET), so the only changes happening were in the
user/system times.
Let's use nsecs_to_jiffies64 instead, which doesn't reduce the precision
to a 32b type and is much more appropriate for cumulative time values
(unlike usecs_to_jiffies, which is intended for timeout calculations).
Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 fs/proc/stat.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 42b274d..2a30d67 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -32,7 +32,7 @@ static cputime64_t get_idle_time(int cpu)
 		idle = kstat_cpu(cpu).cpustat.idle;
 		idle = cputime64_add(idle, arch_idle_time(cpu));
 	} else
-		idle = usecs_to_cputime(idle_time);
+		idle = nsecs_to_jiffies64(1000 * idle_time);
 
 	return idle;
 }
@@ -46,7 +46,7 @@ static cputime64_t get_iowait_time(int cpu)
 		/* !NO_HZ so we can rely on cpustat.iowait */
 		iowait = kstat_cpu(cpu).cpustat.iowait;
 	else
-		iowait = usecs_to_cputime(iowait_time);
+		iowait = nsecs_to_jiffies64(1000 * iowait_time);
 
 	return iowait;
 }
--
1.7.7.3
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
* Re: Re: [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz (was: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage)
2011-12-02 17:59 ` Michal Hocko
@ 2011-12-02 20:12 ` Artem S. Tashkinov
2011-12-05 8:56 ` Michal Hocko
0 siblings, 1 reply; 23+ messages in thread
From: Artem S. Tashkinov @ 2011-12-02 20:12 UTC (permalink / raw)
To: mhocko; +Cc: pomac, linux-kernel, rjw, tino.keitel
On Dec 2, 2011, Michal Hocko wrote:
> And the one with a more cleaned up changelog. No functional changes
> ---
> From 107887016b91de59194a93c751d040b05d5e37fe Mon Sep 17 00:00:00 2001
> From: Michal Hocko <>
> Date: Fri, 2 Dec 2011 16:17:03 +0100
> Subject: [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz
>
> Since a25cac51 [proc: Consider NO_HZ when printing idle and iowait times]
> we are reporting idle/io_wait time also while a CPU is tickless. We rely
> on get_{idle,iowait}_time functions to retrieve proper data.
>
> These functions, however, use usecs_to_cputime to translate micro
> seconds time to cputime64_t. This is just an alias to usecs_to_jiffies
> which reduces the data type from u64 to unsigned int and also checks
> whether the given parameter overflows jiffies_to_usecs(MAX_JIFFY_OFFSET)
> and returns MAX_JIFFY_OFFSET in that case.
>
> When do we overflow depends on CONFIG_HZ but especially for
> CONFIG_HZ_300 it is quite low (1431649781) so we are getting
> MAX_JIFFY_OFFSET for >3000s! until we overflow unsigned int.
> Just for reference CONFIG_100 has an overflow window around 20s,
> CONFIG_250 ~8s and CONFIG_1000 ~2s.
>
> This results in a bug when people saw [h]top going mad reporting 100%
> CPU usage even though there was basically no CPU load. The reason was
> simply that /proc/stat stopped reporting idle/io_wait changes (and
> reported MAX_JIFFY_OFFSET) and so the only change happening was for
> user system time.
>
> Let's use nsecs_to_jiffies64 instead which doesn't reduce the precision
> to 32b type and it is much more appropriate for cumulative time values
> (unlike usecs_to_jiffies which intended for timeout calculations).
>
> Signed-off-by: Michal Hocko <mhocko@suse.cz>
> ---
> fs/proc/stat.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/proc/stat.c b/fs/proc/stat.c
> index 42b274d..2a30d67 100644
> --- a/fs/proc/stat.c
> +++ b/fs/proc/stat.c
> @@ -32,7 +32,7 @@ static cputime64_t get_idle_time(int cpu)
> idle = kstat_cpu(cpu).cpustat.idle;
> idle = cputime64_add(idle, arch_idle_time(cpu));
> } else
> - idle = usecs_to_cputime(idle_time);
> + idle = nsecs_to_jiffies64(1000 * idle_time);
>
> return idle;
> }
> @@ -46,7 +46,7 @@ static cputime64_t get_iowait_time(int cpu)
> /* !NO_HZ so we can rely on cpustat.iowait */
> iowait = kstat_cpu(cpu).cpustat.iowait;
> else
> - iowait = usecs_to_cputime(iowait_time);
> + iowait = nsecs_to_jiffies64(1000 * iowait_time);
>
> return iowait;
> }
> --
> 1.7.7.3
Thank you, this patch has fixed the issue for me.
Tested-by: Artem S. Tashkinov <t.artem@mailcity.com>
* Re: Re: [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz (was: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage)
2011-12-02 20:12 ` Artem S. Tashkinov
@ 2011-12-05 8:56 ` Michal Hocko
0 siblings, 0 replies; 23+ messages in thread
From: Michal Hocko @ 2011-12-05 8:56 UTC (permalink / raw)
To: Artem S. Tashkinov; +Cc: pomac, linux-kernel, rjw, tino.keitel
On Fri 02-12-11 20:12:14, Artem S. Tashkinov wrote:
> On Dec 2, 2011, Michal Hocko wrote:
>
> > And the one with a more cleaned up changelog. No functional changes
> > ---
> > From 107887016b91de59194a93c751d040b05d5e37fe Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <>
> > Date: Fri, 2 Dec 2011 16:17:03 +0100
> > Subject: [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz
> >
> > Since a25cac51 [proc: Consider NO_HZ when printing idle and iowait times]
> > we are reporting idle/io_wait time also while a CPU is tickless. We rely
> > on get_{idle,iowait}_time functions to retrieve proper data.
> >
> > These functions, however, use usecs_to_cputime to translate micro
> > seconds time to cputime64_t. This is just an alias to usecs_to_jiffies
> > which reduces the data type from u64 to unsigned int and also checks
> > whether the given parameter overflows jiffies_to_usecs(MAX_JIFFY_OFFSET)
> > and returns MAX_JIFFY_OFFSET in that case.
> >
> > When do we overflow depends on CONFIG_HZ but especially for
> > CONFIG_HZ_300 it is quite low (1431649781) so we are getting
> > MAX_JIFFY_OFFSET for >3000s! until we overflow unsigned int.
> > Just for reference CONFIG_100 has an overflow window around 20s,
> > CONFIG_250 ~8s and CONFIG_1000 ~2s.
> >
> > This results in a bug when people saw [h]top going mad reporting 100%
> > CPU usage even though there was basically no CPU load. The reason was
> > simply that /proc/stat stopped reporting idle/io_wait changes (and
> > reported MAX_JIFFY_OFFSET) and so the only change happening was for
> > user system time.
> >
> > Let's use nsecs_to_jiffies64 instead which doesn't reduce the precision
> > to 32b type and it is much more appropriate for cumulative time values
> > (unlike usecs_to_jiffies which intended for timeout calculations).
> >
> > Signed-off-by: Michal Hocko <mhocko@suse.cz>
> > ---
> > fs/proc/stat.c | 4 ++--
> > 1 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/proc/stat.c b/fs/proc/stat.c
> > index 42b274d..2a30d67 100644
> > --- a/fs/proc/stat.c
> > +++ b/fs/proc/stat.c
> > @@ -32,7 +32,7 @@ static cputime64_t get_idle_time(int cpu)
> > idle = kstat_cpu(cpu).cpustat.idle;
> > idle = cputime64_add(idle, arch_idle_time(cpu));
> > } else
> > - idle = usecs_to_cputime(idle_time);
> > + idle = nsecs_to_jiffies64(1000 * idle_time);
> >
> > return idle;
> > }
> > @@ -46,7 +46,7 @@ static cputime64_t get_iowait_time(int cpu)
> > /* !NO_HZ so we can rely on cpustat.iowait */
> > iowait = kstat_cpu(cpu).cpustat.iowait;
> > else
> > - iowait = usecs_to_cputime(iowait_time);
> > + iowait = nsecs_to_jiffies64(1000 * iowait_time);
> >
> > return iowait;
> > }
> > --
> > 1.7.7.3
>
> Thank you, this patch has fixed the issue for me.
>
> Tested-by: Artem S. Tashkinov <t.artem@mailcity.com>
Thanks for retesting!
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
Thread overview: 23+ messages
2011-11-28 22:28 [REGRESSION] [Linux 3.2] top/htop and all other CPU usage pomac
2011-11-29 7:52 ` Michal Hocko
2011-11-29 11:38 ` Artem S. Tashkinov
2011-11-29 12:31 ` Michal Hocko
2011-11-29 12:44 ` Michal Hocko
2011-11-29 12:54 ` Artem S. Tashkinov
2011-11-29 13:10 ` Michal Hocko
2011-11-29 13:51 ` Artem S. Tashkinov
2011-11-29 22:51 ` Rafael J. Wysocki
2011-11-30 10:12 ` Michal Hocko
2011-11-30 19:56 ` Rafael J. Wysocki
2011-12-01 14:07 ` Michal Hocko
2011-12-02 10:39 ` Michal Hocko
2011-12-02 13:35 ` Michal Hocko
2011-12-02 16:49 ` [PATCH] proc: Do not overflow get_{idle,iowait}_time for nohz (was: Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage) Michal Hocko
2011-12-02 17:59 ` Michal Hocko
2011-12-02 20:12 ` Artem S. Tashkinov
2011-12-05 8:56 ` Michal Hocko
2011-12-02 17:43 ` Re: Re: [REGRESSION] [Linux 3.2] top/htop and all other CPU usage Artem S. Tashkinov
2011-11-29 17:23 ` Ian Kumlien
2011-11-29 17:31 ` Ian Kumlien
2011-11-29 17:56 ` Michal Hocko
2011-11-29 18:37 ` Ian Kumlien