All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: scheduler bug: process running since 5124095h
       [not found] <4BADD408.8080609@gmail.com>
@ 2010-03-28  8:49 ` Török Edwin
  2010-03-29 10:52   ` Mike Galbraith
  0 siblings, 1 reply; 11+ messages in thread
From: Török Edwin @ 2010-03-28  8:49 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra; +Cc: Linux Kernel

On 03/27/2010 11:46 AM, Török Edwin wrote:
> Hi Ingo, Peter,
> 
> top has just shown me this:
> PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 
>                                                                  6524
> edwin     20   0  228m  10m 8116 R    2  0.3  5124095h gkrellm
> 
> Now obviously that process is not running since 5124095h!
> It looks like some overflow to me, the time in nanoseconds would be
> approx 0xFFFFFE1D2D476000, which is approx. minus 34 minutes.
> Thats about consistent with the uptime, but I don't know why it became
> negative:
>  11:45:48 up 42 min,  9 users,  load average: 0.56, 0.25, 0.19
> 
> I've attached the cfs-debug-info.sh output.
> 
> This happens when using Linux 2.6.33 (actually glisse's drm-radeon tree
> which is based on 2.6.33), its the first time I noticed this.
> 
> I don't know what caused it, the last things I did was:

I have a simple way to reproduce this:
1. Boot the system, run top, confirm everything is normal
2. Run latencytop, and quit (I used version 0.5)
3. Run top, see 5124095h in the TIME column

For example:
 6649 daemon    20   0 74500 8892 3796 S    6  0.2   0:00.03 debsecan
 4255 root      20   0  8908  360  260 S    2  0.0   0:00.04 irqbalance
    1 root      20   0 10332  692  580 S    0  0.0  5124095h init

The processes that get the 5124095h seem random, but there are plenty of
them (if you sort by 'T', the entire top display is filled with
processes running since 5124095h).

Best regards,
--Edwin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: scheduler bug: process running since 5124095h
  2010-03-28  8:49 ` scheduler bug: process running since 5124095h Török Edwin
@ 2010-03-29 10:52   ` Mike Galbraith
  2010-03-29 10:54     ` Peter Zijlstra
  2010-03-29 12:04     ` Hidetoshi Seto
  0 siblings, 2 replies; 11+ messages in thread
From: Mike Galbraith @ 2010-03-29 10:52 UTC (permalink / raw)
  To: Török Edwin
  Cc: Ingo Molnar, Peter Zijlstra, Linux Kernel, Hidetoshi Seto

On Sun, 2010-03-28 at 11:49 +0300, Török Edwin wrote:
> On 03/27/2010 11:46 AM, Török Edwin wrote:
> > Hi Ingo, Peter,
> > 
> > top has just shown me this:
> > PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> > 
> >                                                                  6524
> > edwin     20   0  228m  10m 8116 R    2  0.3  5124095h gkrellm
> > 
> > Now obviously that process is not running since 5124095h!
> > It looks like some overflow to me, the time in nanoseconds would be
> > approx 0xFFFFFE1D2D476000, which is approx. minus 34 minutes.
> > Thats about consistent with the uptime, but I don't know why it became
> > negative:
> >  11:45:48 up 42 min,  9 users,  load average: 0.56, 0.25, 0.19
> > 
> > I've attached the cfs-debug-info.sh output.
> > 
> > This happens when using Linux 2.6.33 (actually glisse's drm-radeon tree
> > which is based on 2.6.33), its the first time I noticed this.
> > 
> > I don't know what caused it, the last things I did was:
> 
> I have a simple way to reproduce this:
> 1. Boot the system, run top, confirm everything is normal
> 2. Run latencytop, and quit (I used version 0.5)
> 3. Run top, see 5124095h in the TIME column

Indeed, and I don't even have CONFIG_LATENCYTOP set.  It bisected to...

761b1d26df542fd5eb348837351e4d2f3bc7bffe is the first bad commit
commit 761b1d26df542fd5eb348837351e4d2f3bc7bffe
Author: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Date:   Thu Nov 12 13:33:45 2009 +0900

    sched: Fix granularity of task_u/stime()
    
    Originally task_s/utime() were designed to return clock_t but
    later changed to return cputime_t by following commit:
    
      commit efe567fc8281661524ffa75477a7c4ca9b466c63
      Author: Christian Borntraeger <borntraeger@de.ibm.com>
      Date:   Thu Aug 23 15:18:02 2007 +0200
    
    It only changed the type of return value, but not the
    implementation. As the result the granularity of task_s/utime()
    is still that of clock_t, not that of cputime_t.
    
    So using task_s/utime() in __exit_signal() makes values
    accumulated to the signal struct to be rounded and coarse
    grained.
    
    This patch removes casts to clock_t in task_u/stime(), to keep
    granularity of cputime_t over the calculation.
    
    v2:
      Use div_u64() to avoid error "undefined reference to `__udivdi3`"
      on some 32bit systems.
    
    Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
    Acked-by: Peter Zijlstra <peterz@infradead.org>
    Cc: xiyou.wangcong@gmail.com
    Cc: Spencer Candland <spencer@bluehost.com>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Cc: Stanislaw Gruszka <sgruszka@redhat.com>
    LKML-Reference: <4AFB9029.9000208@jp.fujitsu.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

:040000 040000 9b4e832857ee9694807de86be7d88bbbc2223c50 a78e66f0b8356005c3b963de64b38269147e47e0 M	kernel
git bisect start
# good: [7f5e918e62cbc9ac27c2f47d3c3dd4b86f67ff0e] Linux 2.6.32.9
git bisect good 7f5e918e62cbc9ac27c2f47d3c3dd4b86f67ff0e
# bad: [baac35c4155a8aa826c70acee6553368ca5243a2] security: fix error return path in ima_inode_alloc
git bisect bad baac35c4155a8aa826c70acee6553368ca5243a2
# bad: [baac35c4155a8aa826c70acee6553368ca5243a2] security: fix error return path in ima_inode_alloc
git bisect bad baac35c4155a8aa826c70acee6553368ca5243a2
# bad: [baac35c4155a8aa826c70acee6553368ca5243a2] security: fix error return path in ima_inode_alloc
git bisect bad baac35c4155a8aa826c70acee6553368ca5243a2
# good: [22763c5cf3690a681551162c15d34d935308c8d7] Linux 2.6.32
git bisect good 22763c5cf3690a681551162c15d34d935308c8d7
# bad: [5f1141eb352ea79d849920039503e40dd623fffa] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6
git bisect bad 5f1141eb352ea79d849920039503e40dd623fffa
# bad: [d7fc02c7bae7b1cf69269992cf880a43a350cdaa] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
git bisect bad d7fc02c7bae7b1cf69269992cf880a43a350cdaa
# good: [cfb3f91af49dff9b50de6929dc4de06100c4cfa8] ixgbe: handle parameters for tx and rx EITR, no div0
git bisect good cfb3f91af49dff9b50de6929dc4de06100c4cfa8
# good: [c3fa27d1367fac63ac8533d6f20ea851d0d70a10] Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect good c3fa27d1367fac63ac8533d6f20ea851d0d70a10
# good: [86e1d57e4f24ca27ce813bdc2afaac4adafcbaf4] Merge branch 'topic/hda' into for-linus
git bisect good 86e1d57e4f24ca27ce813bdc2afaac4adafcbaf4
# good: [8f56874bd7e8bee73ed6a1cf80dcec2753616262] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
git bisect good 8f56874bd7e8bee73ed6a1cf80dcec2753616262
# bad: [c2ed69cdc9da49a8d2d7b4212fd225abf902ceaa] Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect bad c2ed69cdc9da49a8d2d7b4212fd225abf902ceaa
# bad: [897e81bea1fcfcd2c5cdb720c9efdb25da9ff374] Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect bad 897e81bea1fcfcd2c5cdb720c9efdb25da9ff374
# bad: [fe3bcfe1f6c1fc4ea7706ac2d05e579fd9092682] sched: More generic WAKE_AFFINE vs select_idle_sibling()
git bisect bad fe3bcfe1f6c1fc4ea7706ac2d05e579fd9092682
# good: [e2c880630438f80b474378d5487b511b07665051] cpumask: Simplify sched_rt.c
git bisect good e2c880630438f80b474378d5487b511b07665051
# good: [d8c80ce091f6ead6710bc71b58f2c32e5bf855e4] sched, no_hz: Remove unused rq->last_tick_seen field
git bisect good d8c80ce091f6ead6710bc71b58f2c32e5bf855e4
# good: [ffd44db5f02af32bcc25a8eb5981bf02a141cdab] sched: Make sure task has correct sched_class after policy change
git bisect good ffd44db5f02af32bcc25a8eb5981bf02a141cdab
# bad: [a50bde5130f65733142b32975616427d0ea50856] sched: Cleanup select_task_rq_fair()
git bisect bad a50bde5130f65733142b32975616427d0ea50856
# bad: [761b1d26df542fd5eb348837351e4d2f3bc7bffe] sched: Fix granularity of task_u/stime()
git bisect bad 761b1d26df542fd5eb348837351e4d2f3bc7bffe



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: scheduler bug: process running since 5124095h
  2010-03-29 10:52   ` Mike Galbraith
@ 2010-03-29 10:54     ` Peter Zijlstra
  2010-03-29 11:33       ` Mike Galbraith
  2010-03-29 12:04     ` Hidetoshi Seto
  1 sibling, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2010-03-29 10:54 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Török Edwin, Ingo Molnar, Linux Kernel, Hidetoshi Seto

On Mon, 2010-03-29 at 12:52 +0200, Mike Galbraith wrote:
> > I have a simple way to reproduce this:
> > 1. Boot the system, run top, confirm everything is normal
> > 2. Run latencytop, and quit (I used version 0.5)
> > 3. Run top, see 5124095h in the TIME column
> 
> Indeed, and I don't even have CONFIG_LATENCYTOP set.  It bisected
> to...
> 
> 761b1d26df542fd5eb348837351e4d2f3bc7bffe is the first bad commit
> commit 761b1d26df542fd5eb348837351e4d2f3bc7bffe
> Author: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
> Date:   Thu Nov 12 13:33:45 2009 +0900


Oh lovely, and reverting that cures the issue?


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: scheduler bug: process running since 5124095h
  2010-03-29 10:54     ` Peter Zijlstra
@ 2010-03-29 11:33       ` Mike Galbraith
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2010-03-29 11:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Török Edwin, Ingo Molnar, Linux Kernel, Hidetoshi Seto

On Mon, 2010-03-29 at 12:54 +0200, Peter Zijlstra wrote:
> On Mon, 2010-03-29 at 12:52 +0200, Mike Galbraith wrote:
> > > I have a simple way to reproduce this:
> > > 1. Boot the system, run top, confirm everything is normal
> > > 2. Run latencytop, and quit (I used version 0.5)
> > > 3. Run top, see 5124095h in the TIME column
> > 
> > Indeed, and I don't even have CONFIG_LATENCYTOP set.  It bisected
> > to...
> > 
> > 761b1d26df542fd5eb348837351e4d2f3bc7bffe is the first bad commit
> > commit 761b1d26df542fd5eb348837351e4d2f3bc7bffe
> > Author: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
> > Date:   Thu Nov 12 13:33:45 2009 +0900
> 
> 
> Oh lovely, and reverting that cures the issue?

Haven't done so yet, will do later this afternoon.

	-Mike


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: scheduler bug: process running since 5124095h
  2010-03-29 10:52   ` Mike Galbraith
  2010-03-29 10:54     ` Peter Zijlstra
@ 2010-03-29 12:04     ` Hidetoshi Seto
  2010-03-30  6:22       ` Hidetoshi Seto
  1 sibling, 1 reply; 11+ messages in thread
From: Hidetoshi Seto @ 2010-03-29 12:04 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Török Edwin, Ingo Molnar, Peter Zijlstra, Linux Kernel

(2010/03/29 19:52), Mike Galbraith wrote:
> On Sun, 2010-03-28 at 11:49 +0300, Török Edwin wrote:
>> On 03/27/2010 11:46 AM, Török Edwin wrote:
>>> Hi Ingo, Peter,
>>>
>>> top has just shown me this:
>>> PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>
>>>                                                                  6524
>>> edwin     20   0  228m  10m 8116 R    2  0.3  5124095h gkrellm
>>>
>>> Now obviously that process is not running since 5124095h!
>>> It looks like some overflow to me, the time in nanoseconds would be
>>> approx 0xFFFFFE1D2D476000, which is approx. minus 34 minutes.
>>> Thats about consistent with the uptime, but I don't know why it became
>>> negative:
>>>  11:45:48 up 42 min,  9 users,  load average: 0.56, 0.25, 0.19
>>>
>>> I've attached the cfs-debug-info.sh output.
>>>
>>> This happens when using Linux 2.6.33 (actually glisse's drm-radeon tree
>>> which is based on 2.6.33), its the first time I noticed this.
>>>
>>> I don't know what caused it, the last things I did was:
>>
>> I have a simple way to reproduce this:
>> 1. Boot the system, run top, confirm everything is normal
>> 2. Run latencytop, and quit (I used version 0.5)
>> 3. Run top, see 5124095h in the TIME column
> 
> Indeed, and I don't even have CONFIG_LATENCYTOP set.  It bisected to...
> 
> 761b1d26df542fd5eb348837351e4d2f3bc7bffe is the first bad commit
> commit 761b1d26df542fd5eb348837351e4d2f3bc7bffe
> Author: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
> Date:   Thu Nov 12 13:33:45 2009 +0900

Wow, it's easy to reproduce. I'll check it later... 


Thanks,
H.Seto


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: scheduler bug: process running since 5124095h
  2010-03-29 12:04     ` Hidetoshi Seto
@ 2010-03-30  6:22       ` Hidetoshi Seto
  2010-03-30  8:07         ` Mike Galbraith
  2010-03-30  9:01         ` Peter Zijlstra
  0 siblings, 2 replies; 11+ messages in thread
From: Hidetoshi Seto @ 2010-03-30  6:22 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Török Edwin, Ingo Molnar, Peter Zijlstra, Linux Kernel

(2010/03/29 21:04), Hidetoshi Seto wrote:
> (2010/03/29 19:52), Mike Galbraith wrote:
>> On Sun, 2010-03-28 at 11:49 +0300, Török Edwin wrote:
>>> On 03/27/2010 11:46 AM, Török Edwin wrote:
>>>> Hi Ingo, Peter,
>>>>
>>>> top has just shown me this:
>>>> PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>>
>>>>                                                                  6524
>>>> edwin     20   0  228m  10m 8116 R    2  0.3  5124095h gkrellm
>>>>
>>>> Now obviously that process is not running since 5124095h!
>>>> It looks like some overflow to me, the time in nanoseconds would be
>>>> approx 0xFFFFFE1D2D476000, which is approx. minus 34 minutes.
>>>> Thats about consistent with the uptime, but I don't know why it became
>>>> negative:
>>>>  11:45:48 up 42 min,  9 users,  load average: 0.56, 0.25, 0.19
>>>>
>>>> I've attached the cfs-debug-info.sh output.
>>>>
>>>> This happens when using Linux 2.6.33 (actually glisse's drm-radeon tree
>>>> which is based on 2.6.33), its the first time I noticed this.
>>>>
>>>> I don't know what caused it, the last things I did was:
>>>
>>> I have a simple way to reproduce this:
>>> 1. Boot the system, run top, confirm everything is normal
>>> 2. Run latencytop, and quit (I used version 0.5)
>>> 3. Run top, see 5124095h in the TIME column
>>
>> Indeed, and I don't even have CONFIG_LATENCYTOP set.  It bisected to...
>>
>> 761b1d26df542fd5eb348837351e4d2f3bc7bffe is the first bad commit
>> commit 761b1d26df542fd5eb348837351e4d2f3bc7bffe
>> Author: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
>> Date:   Thu Nov 12 13:33:45 2009 +0900

Quick report:

The reason why this commit have bisected is because it changed
the type of time values from signed clock_t to unsigned cputime_t,
so that the following if-block become to be always taken:

   > -       stime = nsec_to_clock_t(p->se.sum_exec_runtime) -
   > -                       cputime_to_clock_t(task_utime(p));
   > +       stime = nsecs_to_cputime(p->se.sum_exec_runtime) - task_utime(p);
   > 
>> >         if (stime >= 0)
   > -               p->prev_stime = max(p->prev_stime, clock_t_to_cputime(stime));
   > +               p->prev_stime = max(p->prev_stime, stime);
   > 
   >         return p->prev_stime;

>From strace of latancytop, it does write to /proc/<pid>/sched:

   5891  open("/proc/1/sched", O_RDWR)     = 5
   5891  fstat(5, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
   5891  mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
    0x7fc6668f3000
   5891  read(5, "init (1, #threads: 1)\n----------"..., 1024) = 776
   5891  read(5, "", 1024)                 = 0
>> 5891  write(5, "erase", 5)              = 5
   5891  close(5)                          = 0

It results in:

[kernel/sched_debug.c]
void proc_sched_set_task(struct task_struct *p)
{
 :
        p->se.sum_exec_runtime                  = 0;
        p->se.prev_sum_exec_runtime             = 0;
        p->nvcsw                                = 0;
        p->nivcsw                               = 0;
}

So soon some task will have great (in fact negative) stime.

There would be no doubt that this initialize in sched_debug.c
will break monotonicity of sum_exec_runtime.  I confirmed that
the issue is disappeared by comment-out of lines above.

Reverting the bisected commit is wrong solution, because it
will bring another issue, i.e. lost of runtime, and u/stime
seems to be frozen because these values restart from 0 so
prev_* is used for a while.

How to fix?  Is this a bug of latencytop? Kernel?
Please comment.


Thanks,
H.Seto


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: scheduler bug: process running since 5124095h
  2010-03-30  6:22       ` Hidetoshi Seto
@ 2010-03-30  8:07         ` Mike Galbraith
  2010-03-30  9:09           ` Mike Galbraith
  2010-03-30  9:01         ` Peter Zijlstra
  1 sibling, 1 reply; 11+ messages in thread
From: Mike Galbraith @ 2010-03-30  8:07 UTC (permalink / raw)
  To: Hidetoshi Seto
  Cc: Török Edwin, Ingo Molnar, Peter Zijlstra, Linux Kernel

On Tue, 2010-03-30 at 15:22 +0900, Hidetoshi Seto wrote:

> Quick report:
> 
> The reason why this commit have bisected is because it changed
> the type of time values from signed clock_t to unsigned cputime_t,
> so that the following if-block become to be always taken:
> 
>    > -       stime = nsec_to_clock_t(p->se.sum_exec_runtime) -
>    > -                       cputime_to_clock_t(task_utime(p));
>    > +       stime = nsecs_to_cputime(p->se.sum_exec_runtime) - task_utime(p);
>    > 
> >> >         if (stime >= 0)
>    > -               p->prev_stime = max(p->prev_stime, clock_t_to_cputime(stime));
>    > +               p->prev_stime = max(p->prev_stime, stime);
>    > 
>    >         return p->prev_stime;
> 
> >From strace of latancytop, it does write to /proc/<pid>/sched:
> 
>    5891  open("/proc/1/sched", O_RDWR)     = 5
>    5891  fstat(5, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
>    5891  mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
>     0x7fc6668f3000
>    5891  read(5, "init (1, #threads: 1)\n----------"..., 1024) = 776
>    5891  read(5, "", 1024)                 = 0
> >> 5891  write(5, "erase", 5)              = 5
>    5891  close(5)                          = 0
> 
> It results in:
> 
> [kernel/sched_debug.c]
> void proc_sched_set_task(struct task_struct *p)
> {
>  :
>         p->se.sum_exec_runtime                  = 0;
>         p->se.prev_sum_exec_runtime             = 0;
>         p->nvcsw                                = 0;
>         p->nivcsw                               = 0;
> }

Hm.  Dunno why that (evilness) exists.

> So soon some task will have great (in fact negative) stime.
> 
> There would be no doubt that this initialize in sched_debug.c
> will break monotonicity of sum_exec_runtime.  I confirmed that
> the issue is disappeared by comment-out of lines above.
> 
> Reverting the bisected commit is wrong solution, because it
> will bring another issue, i.e. lost of runtime, and u/stime
> seems to be frozen because these values restart from 0 so
> prev_* is used for a while.
> 
> How to fix?  Is this a bug of latencytop? Kernel?
> Please comment.

I looked at it in tip this morning (where task_stime doesn't exist), and
did the below to swat the underflow.  Seems ok, but I haven't yet
enabled CONFIG_LATENCYTOP.

diff --git a/kernel/sched.c b/kernel/sched.c
index 81575f1..cdf4fa4 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -3366,6 +3366,7 @@ void task_times(struct task_struct *p, cputime_t *ut, cputime_t *st)
 	 * Compare with previous values, to keep monotonicity:
 	 */
 	p->prev_utime = max(p->prev_utime, utime);
+	rtime = max(rtime, p->prev_utime);
 	p->prev_stime = max(p->prev_stime, cputime_sub(rtime, p->prev_utime));
 
 	*ut = p->prev_utime;
@@ -3396,6 +3397,7 @@ void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *st)
 		utime = rtime;
 
 	sig->prev_utime = max(sig->prev_utime, utime);
+	rtime = max(rtime, sig->prev_utime);
 	sig->prev_stime = max(sig->prev_stime,
 			      cputime_sub(rtime, sig->prev_utime));
 



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: scheduler bug: process running since 5124095h
  2010-03-30  6:22       ` Hidetoshi Seto
  2010-03-30  8:07         ` Mike Galbraith
@ 2010-03-30  9:01         ` Peter Zijlstra
  2010-03-30  9:12           ` Mike Galbraith
  1 sibling, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2010-03-30  9:01 UTC (permalink / raw)
  To: Hidetoshi Seto
  Cc: Mike Galbraith, Török Edwin, Ingo Molnar, Linux Kernel,
	Arjan van de Ven

On Tue, 2010-03-30 at 15:22 +0900, Hidetoshi Seto wrote:
> 
> [kernel/sched_debug.c]
> void proc_sched_set_task(struct task_struct *p)
> {
>  :
>         p->se.sum_exec_runtime                  = 0;
>         p->se.prev_sum_exec_runtime             = 0;
>         p->nvcsw                                = 0;
>         p->nivcsw                               = 0;
> } 

Urgh, that code should really die, bugger that latencytop relies on that
though. Arjan anything you can do to cure latencytop of this madness?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: scheduler bug: process running since 5124095h
  2010-03-30  8:07         ` Mike Galbraith
@ 2010-03-30  9:09           ` Mike Galbraith
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2010-03-30  9:09 UTC (permalink / raw)
  To: Hidetoshi Seto
  Cc: Török Edwin, Ingo Molnar, Peter Zijlstra, Linux Kernel

On Tue, 2010-03-30 at 10:07 +0200, Mike Galbraith wrote:
> On Tue, 2010-03-30 at 15:22 +0900, Hidetoshi Seto wrote:
> 
> > Quick report:
> > 
> > The reason why this commit have bisected is because it changed
> > the type of time values from signed clock_t to unsigned cputime_t,
> > so that the following if-block become to be always taken:
> > 
> >    > -       stime = nsec_to_clock_t(p->se.sum_exec_runtime) -
> >    > -                       cputime_to_clock_t(task_utime(p));
> >    > +       stime = nsecs_to_cputime(p->se.sum_exec_runtime) - task_utime(p);
> >    > 
> > >> >         if (stime >= 0)
> >    > -               p->prev_stime = max(p->prev_stime, clock_t_to_cputime(stime));
> >    > +               p->prev_stime = max(p->prev_stime, stime);
> >    > 
> >    >         return p->prev_stime;
> > 
> > >From strace of latancytop, it does write to /proc/<pid>/sched:
> > 
> >    5891  open("/proc/1/sched", O_RDWR)     = 5
> >    5891  fstat(5, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> >    5891  mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
> >     0x7fc6668f3000
> >    5891  read(5, "init (1, #threads: 1)\n----------"..., 1024) = 776
> >    5891  read(5, "", 1024)                 = 0
> > >> 5891  write(5, "erase", 5)              = 5
> >    5891  close(5)                          = 0
> > 
> > It results in:
> > 
> > [kernel/sched_debug.c]
> > void proc_sched_set_task(struct task_struct *p)
> > {
> >  :
> >         p->se.sum_exec_runtime                  = 0;
> >         p->se.prev_sum_exec_runtime             = 0;
> >         p->nvcsw                                = 0;
> >         p->nivcsw                               = 0;
> > }
> 
> Hm.  Dunno why that (evilness) exists.
> 
> > So soon some task will have great (in fact negative) stime.
> > 
> > There would be no doubt that this initialize in sched_debug.c
> > will break monotonicity of sum_exec_runtime.  I confirmed that
> > the issue is disappeared by comment-out of lines above.
> > 
> > Reverting the bisected commit is wrong solution, because it
> > will bring another issue, i.e. lost of runtime, and u/stime
> > seems to be frozen because these values restart from 0 so
> > prev_* is used for a while.
> > 
> > How to fix?  Is this a bug of latencytop? Kernel?
> > Please comment.

I think the correct fix is to not allow resetting of those 4 fields, as
they're used elsewhere in kernel.  Latencytop doesn't seem to miss being
allowed to scribble.  Dunno why it wants to reset any of these though.

sched: fix proc_sched_set_task()

Latencytop clearing sum_exec_runtime via proc_sched_set_task() breaks
task_times().  Other places in kernel use nvcsw and nivcsw, which are
being cleared as well,  Clear task statistics only.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reported-by: Török Edwin <edwintorok@gmail.com>
LKML-Reference: <new-submission>

diff --git a/kernel/sched_debug.c b/kernel/sched_debug.c
index 8a46a71..ea2c690 100644
--- a/kernel/sched_debug.c
+++ b/kernel/sched_debug.c
@@ -491,8 +491,4 @@ void proc_sched_set_task(struct task_struct *p)
 #ifdef CONFIG_SCHEDSTATS
 	memset(&p->se.statistics, 0, sizeof(p->se.statistics));
 #endif
-	p->se.sum_exec_runtime			= 0;
-	p->se.prev_sum_exec_runtime		= 0;
-	p->nvcsw				= 0;
-	p->nivcsw				= 0;
 }




^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: scheduler bug: process running since 5124095h
  2010-03-30  9:01         ` Peter Zijlstra
@ 2010-03-30  9:12           ` Mike Galbraith
  2010-03-30  9:34             ` Peter Zijlstra
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Galbraith @ 2010-03-30  9:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Hidetoshi Seto, Török Edwin, Ingo Molnar, Linux Kernel,
	Arjan van de Ven

On Tue, 2010-03-30 at 11:01 +0200, Peter Zijlstra wrote:
> On Tue, 2010-03-30 at 15:22 +0900, Hidetoshi Seto wrote:
> > 
> > [kernel/sched_debug.c]
> > void proc_sched_set_task(struct task_struct *p)
> > {
> >  :
> >         p->se.sum_exec_runtime                  = 0;
> >         p->se.prev_sum_exec_runtime             = 0;
> >         p->nvcsw                                = 0;
> >         p->nivcsw                               = 0;
> > } 
> 
> Urgh, that code should really die, bugger that latencytop relies on that
> though. Arjan anything you can do to cure latencytop of this madness?

I killed those 4 fields without _seeming_ to upset it.

	-Mike


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: scheduler bug: process running since 5124095h
  2010-03-30  9:12           ` Mike Galbraith
@ 2010-03-30  9:34             ` Peter Zijlstra
  0 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2010-03-30  9:34 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Hidetoshi Seto, Török Edwin, Ingo Molnar, Linux Kernel,
	Arjan van de Ven

On Tue, 2010-03-30 at 11:12 +0200, Mike Galbraith wrote:
> On Tue, 2010-03-30 at 11:01 +0200, Peter Zijlstra wrote:
> > On Tue, 2010-03-30 at 15:22 +0900, Hidetoshi Seto wrote:
> > > 
> > > [kernel/sched_debug.c]
> > > void proc_sched_set_task(struct task_struct *p)
> > > {
> > >  :
> > >         p->se.sum_exec_runtime                  = 0;
> > >         p->se.prev_sum_exec_runtime             = 0;
> > >         p->nvcsw                                = 0;
> > >         p->nivcsw                               = 0;
> > > } 
> > 
> > Urgh, that code should really die, bugger that latencytop relies on that
> > though. Arjan anything you can do to cure latencytop of this madness?
> 
> I killed those 4 fields without _seeming_ to upset it.

OK, if Arjan doesn't complain I'll take it ;-)

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2010-03-30  9:35 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <4BADD408.8080609@gmail.com>
2010-03-28  8:49 ` scheduler bug: process running since 5124095h Török Edwin
2010-03-29 10:52   ` Mike Galbraith
2010-03-29 10:54     ` Peter Zijlstra
2010-03-29 11:33       ` Mike Galbraith
2010-03-29 12:04     ` Hidetoshi Seto
2010-03-30  6:22       ` Hidetoshi Seto
2010-03-30  8:07         ` Mike Galbraith
2010-03-30  9:09           ` Mike Galbraith
2010-03-30  9:01         ` Peter Zijlstra
2010-03-30  9:12           ` Mike Galbraith
2010-03-30  9:34             ` Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.