* Re: scheduler bug: process running since 5124095h
[not found] <4BADD408.8080609@gmail.com>
@ 2010-03-28 8:49 ` Török Edwin
2010-03-29 10:52 ` Mike Galbraith
0 siblings, 1 reply; 11+ messages in thread
From: Török Edwin @ 2010-03-28 8:49 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra; +Cc: Linux Kernel
On 03/27/2010 11:46 AM, Török Edwin wrote:
> Hi Ingo, Peter,
>
> top has just shown me this:
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>
> 6524
> edwin 20 0 228m 10m 8116 R 2 0.3 5124095h gkrellm
>
> Now obviously that process is not running since 5124095h!
> It looks like some overflow to me, the time in nanoseconds would be
> approx 0xFFFFFE1D2D476000, which is approx. minus 34 minutes.
> Thats about consistent with the uptime, but I don't know why it became
> negative:
> 11:45:48 up 42 min, 9 users, load average: 0.56, 0.25, 0.19
>
> I've attached the cfs-debug-info.sh output.
>
> This happens when using Linux 2.6.33 (actually glisse's drm-radeon tree
> which is based on 2.6.33), its the first time I noticed this.
>
> I don't know what caused it, the last things I did was:
I have a simple way to reproduce this:
1. Boot the system, run top, confirm everything is normal
2. Run latencytop, and quit (I used version 0.5)
3. Run top, see 5124095h in the TIME column
For example:
6649 daemon 20 0 74500 8892 3796 S 6 0.2 0:00.03 debsecan
4255 root 20 0 8908 360 260 S 2 0.0 0:00.04 irqbalance
1 root 20 0 10332 692 580 S 0 0.0 5124095h init
The processes that get the 5124095h seem random, but there are plenty of
them (if you sort by 'T', the entire top display is filled with
processes running since 5124095h).
Best regards,
--Edwin
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler bug: process running since 5124095h
2010-03-28 8:49 ` scheduler bug: process running since 5124095h Török Edwin
@ 2010-03-29 10:52 ` Mike Galbraith
2010-03-29 10:54 ` Peter Zijlstra
2010-03-29 12:04 ` Hidetoshi Seto
0 siblings, 2 replies; 11+ messages in thread
From: Mike Galbraith @ 2010-03-29 10:52 UTC (permalink / raw)
To: Török Edwin
Cc: Ingo Molnar, Peter Zijlstra, Linux Kernel, Hidetoshi Seto
On Sun, 2010-03-28 at 11:49 +0300, Török Edwin wrote:
> On 03/27/2010 11:46 AM, Török Edwin wrote:
> > Hi Ingo, Peter,
> >
> > top has just shown me this:
> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> >
> > 6524
> > edwin 20 0 228m 10m 8116 R 2 0.3 5124095h gkrellm
> >
> > Now obviously that process is not running since 5124095h!
> > It looks like some overflow to me, the time in nanoseconds would be
> > approx 0xFFFFFE1D2D476000, which is approx. minus 34 minutes.
> > Thats about consistent with the uptime, but I don't know why it became
> > negative:
> > 11:45:48 up 42 min, 9 users, load average: 0.56, 0.25, 0.19
> >
> > I've attached the cfs-debug-info.sh output.
> >
> > This happens when using Linux 2.6.33 (actually glisse's drm-radeon tree
> > which is based on 2.6.33), its the first time I noticed this.
> >
> > I don't know what caused it, the last things I did was:
>
> I have a simple way to reproduce this:
> 1. Boot the system, run top, confirm everything is normal
> 2. Run latencytop, and quit (I used version 0.5)
> 3. Run top, see 5124095h in the TIME column
Indeed, and I don't even have CONFIG_LATENCYTOP set. It bisected to...
761b1d26df542fd5eb348837351e4d2f3bc7bffe is the first bad commit
commit 761b1d26df542fd5eb348837351e4d2f3bc7bffe
Author: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Date: Thu Nov 12 13:33:45 2009 +0900
sched: Fix granularity of task_u/stime()
Originally task_s/utime() were designed to return clock_t but
later changed to return cputime_t by following commit:
commit efe567fc8281661524ffa75477a7c4ca9b466c63
Author: Christian Borntraeger <borntraeger@de.ibm.com>
Date: Thu Aug 23 15:18:02 2007 +0200
It only changed the type of return value, but not the
implementation. As the result the granularity of task_s/utime()
is still that of clock_t, not that of cputime_t.
So using task_s/utime() in __exit_signal() makes values
accumulated to the signal struct to be rounded and coarse
grained.
This patch removes casts to clock_t in task_u/stime(), to keep
granularity of cputime_t over the calculation.
v2:
Use div_u64() to avoid error "undefined reference to `__udivdi3`"
on some 32bit systems.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: xiyou.wangcong@gmail.com
Cc: Spencer Candland <spencer@bluehost.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
LKML-Reference: <4AFB9029.9000208@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
:040000 040000 9b4e832857ee9694807de86be7d88bbbc2223c50 a78e66f0b8356005c3b963de64b38269147e47e0 M kernel
git bisect start
# good: [7f5e918e62cbc9ac27c2f47d3c3dd4b86f67ff0e] Linux 2.6.32.9
git bisect good 7f5e918e62cbc9ac27c2f47d3c3dd4b86f67ff0e
# bad: [baac35c4155a8aa826c70acee6553368ca5243a2] security: fix error return path in ima_inode_alloc
git bisect bad baac35c4155a8aa826c70acee6553368ca5243a2
# bad: [baac35c4155a8aa826c70acee6553368ca5243a2] security: fix error return path in ima_inode_alloc
git bisect bad baac35c4155a8aa826c70acee6553368ca5243a2
# bad: [baac35c4155a8aa826c70acee6553368ca5243a2] security: fix error return path in ima_inode_alloc
git bisect bad baac35c4155a8aa826c70acee6553368ca5243a2
# good: [22763c5cf3690a681551162c15d34d935308c8d7] Linux 2.6.32
git bisect good 22763c5cf3690a681551162c15d34d935308c8d7
# bad: [5f1141eb352ea79d849920039503e40dd623fffa] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6
git bisect bad 5f1141eb352ea79d849920039503e40dd623fffa
# bad: [d7fc02c7bae7b1cf69269992cf880a43a350cdaa] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
git bisect bad d7fc02c7bae7b1cf69269992cf880a43a350cdaa
# good: [cfb3f91af49dff9b50de6929dc4de06100c4cfa8] ixgbe: handle parameters for tx and rx EITR, no div0
git bisect good cfb3f91af49dff9b50de6929dc4de06100c4cfa8
# good: [c3fa27d1367fac63ac8533d6f20ea851d0d70a10] Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect good c3fa27d1367fac63ac8533d6f20ea851d0d70a10
# good: [86e1d57e4f24ca27ce813bdc2afaac4adafcbaf4] Merge branch 'topic/hda' into for-linus
git bisect good 86e1d57e4f24ca27ce813bdc2afaac4adafcbaf4
# good: [8f56874bd7e8bee73ed6a1cf80dcec2753616262] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
git bisect good 8f56874bd7e8bee73ed6a1cf80dcec2753616262
# bad: [c2ed69cdc9da49a8d2d7b4212fd225abf902ceaa] Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect bad c2ed69cdc9da49a8d2d7b4212fd225abf902ceaa
# bad: [897e81bea1fcfcd2c5cdb720c9efdb25da9ff374] Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect bad 897e81bea1fcfcd2c5cdb720c9efdb25da9ff374
# bad: [fe3bcfe1f6c1fc4ea7706ac2d05e579fd9092682] sched: More generic WAKE_AFFINE vs select_idle_sibling()
git bisect bad fe3bcfe1f6c1fc4ea7706ac2d05e579fd9092682
# good: [e2c880630438f80b474378d5487b511b07665051] cpumask: Simplify sched_rt.c
git bisect good e2c880630438f80b474378d5487b511b07665051
# good: [d8c80ce091f6ead6710bc71b58f2c32e5bf855e4] sched, no_hz: Remove unused rq->last_tick_seen field
git bisect good d8c80ce091f6ead6710bc71b58f2c32e5bf855e4
# good: [ffd44db5f02af32bcc25a8eb5981bf02a141cdab] sched: Make sure task has correct sched_class after policy change
git bisect good ffd44db5f02af32bcc25a8eb5981bf02a141cdab
# bad: [a50bde5130f65733142b32975616427d0ea50856] sched: Cleanup select_task_rq_fair()
git bisect bad a50bde5130f65733142b32975616427d0ea50856
# bad: [761b1d26df542fd5eb348837351e4d2f3bc7bffe] sched: Fix granularity of task_u/stime()
git bisect bad 761b1d26df542fd5eb348837351e4d2f3bc7bffe
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler bug: process running since 5124095h
2010-03-29 10:52 ` Mike Galbraith
@ 2010-03-29 10:54 ` Peter Zijlstra
2010-03-29 11:33 ` Mike Galbraith
2010-03-29 12:04 ` Hidetoshi Seto
1 sibling, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2010-03-29 10:54 UTC (permalink / raw)
To: Mike Galbraith
Cc: Török Edwin, Ingo Molnar, Linux Kernel, Hidetoshi Seto
On Mon, 2010-03-29 at 12:52 +0200, Mike Galbraith wrote:
> > I have a simple way to reproduce this:
> > 1. Boot the system, run top, confirm everything is normal
> > 2. Run latencytop, and quit (I used version 0.5)
> > 3. Run top, see 5124095h in the TIME column
>
> Indeed, and I don't even have CONFIG_LATENCYTOP set. It bisected
> to...
>
> 761b1d26df542fd5eb348837351e4d2f3bc7bffe is the first bad commit
> commit 761b1d26df542fd5eb348837351e4d2f3bc7bffe
> Author: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
> Date: Thu Nov 12 13:33:45 2009 +0900
Oh lovely, and reverting that cures the issue?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler bug: process running since 5124095h
2010-03-29 10:54 ` Peter Zijlstra
@ 2010-03-29 11:33 ` Mike Galbraith
0 siblings, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2010-03-29 11:33 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Török Edwin, Ingo Molnar, Linux Kernel, Hidetoshi Seto
On Mon, 2010-03-29 at 12:54 +0200, Peter Zijlstra wrote:
> On Mon, 2010-03-29 at 12:52 +0200, Mike Galbraith wrote:
> > > I have a simple way to reproduce this:
> > > 1. Boot the system, run top, confirm everything is normal
> > > 2. Run latencytop, and quit (I used version 0.5)
> > > 3. Run top, see 5124095h in the TIME column
> >
> > Indeed, and I don't even have CONFIG_LATENCYTOP set. It bisected
> > to...
> >
> > 761b1d26df542fd5eb348837351e4d2f3bc7bffe is the first bad commit
> > commit 761b1d26df542fd5eb348837351e4d2f3bc7bffe
> > Author: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
> > Date: Thu Nov 12 13:33:45 2009 +0900
>
>
> Oh lovely, and reverting that cures the issue?
Haven't done so yet, will do later this afternoon.
-Mike
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler bug: process running since 5124095h
2010-03-29 10:52 ` Mike Galbraith
2010-03-29 10:54 ` Peter Zijlstra
@ 2010-03-29 12:04 ` Hidetoshi Seto
2010-03-30 6:22 ` Hidetoshi Seto
1 sibling, 1 reply; 11+ messages in thread
From: Hidetoshi Seto @ 2010-03-29 12:04 UTC (permalink / raw)
To: Mike Galbraith
Cc: Török Edwin, Ingo Molnar, Peter Zijlstra, Linux Kernel
(2010/03/29 19:52), Mike Galbraith wrote:
> On Sun, 2010-03-28 at 11:49 +0300, Török Edwin wrote:
>> On 03/27/2010 11:46 AM, Török Edwin wrote:
>>> Hi Ingo, Peter,
>>>
>>> top has just shown me this:
>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>>
>>> 6524
>>> edwin 20 0 228m 10m 8116 R 2 0.3 5124095h gkrellm
>>>
>>> Now obviously that process is not running since 5124095h!
>>> It looks like some overflow to me, the time in nanoseconds would be
>>> approx 0xFFFFFE1D2D476000, which is approx. minus 34 minutes.
>>> Thats about consistent with the uptime, but I don't know why it became
>>> negative:
>>> 11:45:48 up 42 min, 9 users, load average: 0.56, 0.25, 0.19
>>>
>>> I've attached the cfs-debug-info.sh output.
>>>
>>> This happens when using Linux 2.6.33 (actually glisse's drm-radeon tree
>>> which is based on 2.6.33), its the first time I noticed this.
>>>
>>> I don't know what caused it, the last things I did was:
>>
>> I have a simple way to reproduce this:
>> 1. Boot the system, run top, confirm everything is normal
>> 2. Run latencytop, and quit (I used version 0.5)
>> 3. Run top, see 5124095h in the TIME column
>
> Indeed, and I don't even have CONFIG_LATENCYTOP set. It bisected to...
>
> 761b1d26df542fd5eb348837351e4d2f3bc7bffe is the first bad commit
> commit 761b1d26df542fd5eb348837351e4d2f3bc7bffe
> Author: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
> Date: Thu Nov 12 13:33:45 2009 +0900
Wow, it's easy to reproduce. I'll check it later...
Thanks,
H.Seto
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler bug: process running since 5124095h
2010-03-29 12:04 ` Hidetoshi Seto
@ 2010-03-30 6:22 ` Hidetoshi Seto
2010-03-30 8:07 ` Mike Galbraith
2010-03-30 9:01 ` Peter Zijlstra
0 siblings, 2 replies; 11+ messages in thread
From: Hidetoshi Seto @ 2010-03-30 6:22 UTC (permalink / raw)
To: Mike Galbraith
Cc: Török Edwin, Ingo Molnar, Peter Zijlstra, Linux Kernel
(2010/03/29 21:04), Hidetoshi Seto wrote:
> (2010/03/29 19:52), Mike Galbraith wrote:
>> On Sun, 2010-03-28 at 11:49 +0300, Török Edwin wrote:
>>> On 03/27/2010 11:46 AM, Török Edwin wrote:
>>>> Hi Ingo, Peter,
>>>>
>>>> top has just shown me this:
>>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>>>
>>>> 6524
>>>> edwin 20 0 228m 10m 8116 R 2 0.3 5124095h gkrellm
>>>>
>>>> Now obviously that process is not running since 5124095h!
>>>> It looks like some overflow to me, the time in nanoseconds would be
>>>> approx 0xFFFFFE1D2D476000, which is approx. minus 34 minutes.
>>>> Thats about consistent with the uptime, but I don't know why it became
>>>> negative:
>>>> 11:45:48 up 42 min, 9 users, load average: 0.56, 0.25, 0.19
>>>>
>>>> I've attached the cfs-debug-info.sh output.
>>>>
>>>> This happens when using Linux 2.6.33 (actually glisse's drm-radeon tree
>>>> which is based on 2.6.33), its the first time I noticed this.
>>>>
>>>> I don't know what caused it, the last things I did was:
>>>
>>> I have a simple way to reproduce this:
>>> 1. Boot the system, run top, confirm everything is normal
>>> 2. Run latencytop, and quit (I used version 0.5)
>>> 3. Run top, see 5124095h in the TIME column
>>
>> Indeed, and I don't even have CONFIG_LATENCYTOP set. It bisected to...
>>
>> 761b1d26df542fd5eb348837351e4d2f3bc7bffe is the first bad commit
>> commit 761b1d26df542fd5eb348837351e4d2f3bc7bffe
>> Author: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
>> Date: Thu Nov 12 13:33:45 2009 +0900
Quick report:
The reason why this commit have bisected is because it changed
the type of time values from signed clock_t to unsigned cputime_t,
so that the following if-block become to be always taken:
> - stime = nsec_to_clock_t(p->se.sum_exec_runtime) -
> - cputime_to_clock_t(task_utime(p));
> + stime = nsecs_to_cputime(p->se.sum_exec_runtime) - task_utime(p);
>
>> > if (stime >= 0)
> - p->prev_stime = max(p->prev_stime, clock_t_to_cputime(stime));
> + p->prev_stime = max(p->prev_stime, stime);
>
> return p->prev_stime;
>From strace of latancytop, it does write to /proc/<pid>/sched:
5891 open("/proc/1/sched", O_RDWR) = 5
5891 fstat(5, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
5891 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
0x7fc6668f3000
5891 read(5, "init (1, #threads: 1)\n----------"..., 1024) = 776
5891 read(5, "", 1024) = 0
>> 5891 write(5, "erase", 5) = 5
5891 close(5) = 0
It results in:
[kernel/sched_debug.c]
void proc_sched_set_task(struct task_struct *p)
{
:
p->se.sum_exec_runtime = 0;
p->se.prev_sum_exec_runtime = 0;
p->nvcsw = 0;
p->nivcsw = 0;
}
So soon some task will have great (in fact negative) stime.
There would be no doubt that this initialize in sched_debug.c
will break monotonicity of sum_exec_runtime. I confirmed that
the issue is disappeared by comment-out of lines above.
Reverting the bisected commit is wrong solution, because it
will bring another issue, i.e. lost of runtime, and u/stime
seems to be frozen because these values restart from 0 so
prev_* is used for a while.
How to fix? Is this a bug of latencytop? Kernel?
Please comment.
Thanks,
H.Seto
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler bug: process running since 5124095h
2010-03-30 6:22 ` Hidetoshi Seto
@ 2010-03-30 8:07 ` Mike Galbraith
2010-03-30 9:09 ` Mike Galbraith
2010-03-30 9:01 ` Peter Zijlstra
1 sibling, 1 reply; 11+ messages in thread
From: Mike Galbraith @ 2010-03-30 8:07 UTC (permalink / raw)
To: Hidetoshi Seto
Cc: Török Edwin, Ingo Molnar, Peter Zijlstra, Linux Kernel
On Tue, 2010-03-30 at 15:22 +0900, Hidetoshi Seto wrote:
> Quick report:
>
> The reason why this commit have bisected is because it changed
> the type of time values from signed clock_t to unsigned cputime_t,
> so that the following if-block become to be always taken:
>
> > - stime = nsec_to_clock_t(p->se.sum_exec_runtime) -
> > - cputime_to_clock_t(task_utime(p));
> > + stime = nsecs_to_cputime(p->se.sum_exec_runtime) - task_utime(p);
> >
> >> > if (stime >= 0)
> > - p->prev_stime = max(p->prev_stime, clock_t_to_cputime(stime));
> > + p->prev_stime = max(p->prev_stime, stime);
> >
> > return p->prev_stime;
>
> >From strace of latancytop, it does write to /proc/<pid>/sched:
>
> 5891 open("/proc/1/sched", O_RDWR) = 5
> 5891 fstat(5, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> 5891 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
> 0x7fc6668f3000
> 5891 read(5, "init (1, #threads: 1)\n----------"..., 1024) = 776
> 5891 read(5, "", 1024) = 0
> >> 5891 write(5, "erase", 5) = 5
> 5891 close(5) = 0
>
> It results in:
>
> [kernel/sched_debug.c]
> void proc_sched_set_task(struct task_struct *p)
> {
> :
> p->se.sum_exec_runtime = 0;
> p->se.prev_sum_exec_runtime = 0;
> p->nvcsw = 0;
> p->nivcsw = 0;
> }
Hm. Dunno why that (evilness) exists.
> So soon some task will have great (in fact negative) stime.
>
> There would be no doubt that this initialize in sched_debug.c
> will break monotonicity of sum_exec_runtime. I confirmed that
> the issue is disappeared by comment-out of lines above.
>
> Reverting the bisected commit is wrong solution, because it
> will bring another issue, i.e. lost of runtime, and u/stime
> seems to be frozen because these values restart from 0 so
> prev_* is used for a while.
>
> How to fix? Is this a bug of latencytop? Kernel?
> Please comment.
I looked at it in tip this morning (where task_stime doesn't exist), and
did the below to swat the underflow. Seems ok, but I haven't yet
enabled CONFIG_LATENCYTOP.
diff --git a/kernel/sched.c b/kernel/sched.c
index 81575f1..cdf4fa4 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -3366,6 +3366,7 @@ void task_times(struct task_struct *p, cputime_t *ut, cputime_t *st)
* Compare with previous values, to keep monotonicity:
*/
p->prev_utime = max(p->prev_utime, utime);
+ rtime = max(rtime, p->prev_utime);
p->prev_stime = max(p->prev_stime, cputime_sub(rtime, p->prev_utime));
*ut = p->prev_utime;
@@ -3396,6 +3397,7 @@ void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *st)
utime = rtime;
sig->prev_utime = max(sig->prev_utime, utime);
+ rtime = max(rtime, sig->prev_utime);
sig->prev_stime = max(sig->prev_stime,
cputime_sub(rtime, sig->prev_utime));
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: scheduler bug: process running since 5124095h
2010-03-30 6:22 ` Hidetoshi Seto
2010-03-30 8:07 ` Mike Galbraith
@ 2010-03-30 9:01 ` Peter Zijlstra
2010-03-30 9:12 ` Mike Galbraith
1 sibling, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2010-03-30 9:01 UTC (permalink / raw)
To: Hidetoshi Seto
Cc: Mike Galbraith, Török Edwin, Ingo Molnar, Linux Kernel,
Arjan van de Ven
On Tue, 2010-03-30 at 15:22 +0900, Hidetoshi Seto wrote:
>
> [kernel/sched_debug.c]
> void proc_sched_set_task(struct task_struct *p)
> {
> :
> p->se.sum_exec_runtime = 0;
> p->se.prev_sum_exec_runtime = 0;
> p->nvcsw = 0;
> p->nivcsw = 0;
> }
Urgh, that code should really die, bugger that latencytop relies on that
though. Arjan anything you can do to cure latencytop of this madness?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler bug: process running since 5124095h
2010-03-30 8:07 ` Mike Galbraith
@ 2010-03-30 9:09 ` Mike Galbraith
0 siblings, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2010-03-30 9:09 UTC (permalink / raw)
To: Hidetoshi Seto
Cc: Török Edwin, Ingo Molnar, Peter Zijlstra, Linux Kernel
On Tue, 2010-03-30 at 10:07 +0200, Mike Galbraith wrote:
> On Tue, 2010-03-30 at 15:22 +0900, Hidetoshi Seto wrote:
>
> > Quick report:
> >
> > The reason why this commit have bisected is because it changed
> > the type of time values from signed clock_t to unsigned cputime_t,
> > so that the following if-block become to be always taken:
> >
> > > - stime = nsec_to_clock_t(p->se.sum_exec_runtime) -
> > > - cputime_to_clock_t(task_utime(p));
> > > + stime = nsecs_to_cputime(p->se.sum_exec_runtime) - task_utime(p);
> > >
> > >> > if (stime >= 0)
> > > - p->prev_stime = max(p->prev_stime, clock_t_to_cputime(stime));
> > > + p->prev_stime = max(p->prev_stime, stime);
> > >
> > > return p->prev_stime;
> >
> > >From strace of latancytop, it does write to /proc/<pid>/sched:
> >
> > 5891 open("/proc/1/sched", O_RDWR) = 5
> > 5891 fstat(5, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> > 5891 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
> > 0x7fc6668f3000
> > 5891 read(5, "init (1, #threads: 1)\n----------"..., 1024) = 776
> > 5891 read(5, "", 1024) = 0
> > >> 5891 write(5, "erase", 5) = 5
> > 5891 close(5) = 0
> >
> > It results in:
> >
> > [kernel/sched_debug.c]
> > void proc_sched_set_task(struct task_struct *p)
> > {
> > :
> > p->se.sum_exec_runtime = 0;
> > p->se.prev_sum_exec_runtime = 0;
> > p->nvcsw = 0;
> > p->nivcsw = 0;
> > }
>
> Hm. Dunno why that (evilness) exists.
>
> > So soon some task will have great (in fact negative) stime.
> >
> > There would be no doubt that this initialize in sched_debug.c
> > will break monotonicity of sum_exec_runtime. I confirmed that
> > the issue is disappeared by comment-out of lines above.
> >
> > Reverting the bisected commit is wrong solution, because it
> > will bring another issue, i.e. lost of runtime, and u/stime
> > seems to be frozen because these values restart from 0 so
> > prev_* is used for a while.
> >
> > How to fix? Is this a bug of latencytop? Kernel?
> > Please comment.
I think the correct fix is to not allow resetting of those 4 fields, as
they're used elsewhere in kernel. Latencytop doesn't seem to miss being
allowed to scribble. Dunno why it wants to reset any of these though.
sched: fix proc_sched_set_task()
Latencytop clearing sum_exec_runtime via proc_sched_set_task() breaks
task_times(). Other places in kernel use nvcsw and nivcsw, which are
being cleared as well, Clear task statistics only.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reported-by: Török Edwin <edwintorok@gmail.com>
LKML-Reference: <new-submission>
diff --git a/kernel/sched_debug.c b/kernel/sched_debug.c
index 8a46a71..ea2c690 100644
--- a/kernel/sched_debug.c
+++ b/kernel/sched_debug.c
@@ -491,8 +491,4 @@ void proc_sched_set_task(struct task_struct *p)
#ifdef CONFIG_SCHEDSTATS
memset(&p->se.statistics, 0, sizeof(p->se.statistics));
#endif
- p->se.sum_exec_runtime = 0;
- p->se.prev_sum_exec_runtime = 0;
- p->nvcsw = 0;
- p->nivcsw = 0;
}
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: scheduler bug: process running since 5124095h
2010-03-30 9:01 ` Peter Zijlstra
@ 2010-03-30 9:12 ` Mike Galbraith
2010-03-30 9:34 ` Peter Zijlstra
0 siblings, 1 reply; 11+ messages in thread
From: Mike Galbraith @ 2010-03-30 9:12 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Hidetoshi Seto, Török Edwin, Ingo Molnar, Linux Kernel,
Arjan van de Ven
On Tue, 2010-03-30 at 11:01 +0200, Peter Zijlstra wrote:
> On Tue, 2010-03-30 at 15:22 +0900, Hidetoshi Seto wrote:
> >
> > [kernel/sched_debug.c]
> > void proc_sched_set_task(struct task_struct *p)
> > {
> > :
> > p->se.sum_exec_runtime = 0;
> > p->se.prev_sum_exec_runtime = 0;
> > p->nvcsw = 0;
> > p->nivcsw = 0;
> > }
>
> Urgh, that code should really die, bugger that latencytop relies on that
> though. Arjan anything you can do to cure latencytop of this madness?
I killed those 4 fields without _seeming_ to upset it.
-Mike
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: scheduler bug: process running since 5124095h
2010-03-30 9:12 ` Mike Galbraith
@ 2010-03-30 9:34 ` Peter Zijlstra
0 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2010-03-30 9:34 UTC (permalink / raw)
To: Mike Galbraith
Cc: Hidetoshi Seto, Török Edwin, Ingo Molnar, Linux Kernel,
Arjan van de Ven
On Tue, 2010-03-30 at 11:12 +0200, Mike Galbraith wrote:
> On Tue, 2010-03-30 at 11:01 +0200, Peter Zijlstra wrote:
> > On Tue, 2010-03-30 at 15:22 +0900, Hidetoshi Seto wrote:
> > >
> > > [kernel/sched_debug.c]
> > > void proc_sched_set_task(struct task_struct *p)
> > > {
> > > :
> > > p->se.sum_exec_runtime = 0;
> > > p->se.prev_sum_exec_runtime = 0;
> > > p->nvcsw = 0;
> > > p->nivcsw = 0;
> > > }
> >
> > Urgh, that code should really die, bugger that latencytop relies on that
> > though. Arjan anything you can do to cure latencytop of this madness?
>
> I killed those 4 fields without _seeming_ to upset it.
OK, if Arjan doesn't complain I'll take it ;-)
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2010-03-30 9:35 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <4BADD408.8080609@gmail.com>
2010-03-28 8:49 ` scheduler bug: process running since 5124095h Török Edwin
2010-03-29 10:52 ` Mike Galbraith
2010-03-29 10:54 ` Peter Zijlstra
2010-03-29 11:33 ` Mike Galbraith
2010-03-29 12:04 ` Hidetoshi Seto
2010-03-30 6:22 ` Hidetoshi Seto
2010-03-30 8:07 ` Mike Galbraith
2010-03-30 9:09 ` Mike Galbraith
2010-03-30 9:01 ` Peter Zijlstra
2010-03-30 9:12 ` Mike Galbraith
2010-03-30 9:34 ` Peter Zijlstra
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.