* Re: reading /proc/stat segfaults after long uptimes
From: Martin Schwidefsky @ 2009-10-20  8:00 UTC (permalink / raw)
  To: linux-s390

On Tue, 20 Oct 2009 09:13:15 +0200
Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:

> On Mon, 19 Oct 2009 00:17:30 +0200
> Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> 
> > On Sun, 18 Oct 2009 05:35:29 -0400
> > Mike Frysinger <vapier@gentoo.org> wrote:
> > 
> > > this bug has been around for as long as i can remember (before 2.6.16.x), and 
> > > it is still in 2.6.27.10.  i'm upgrading to 2.6.31.4 now, but it'll be a while 
> > > before i can report back since the bug doesnt manifest itself for a long time.  
> > > current uptime is ~3 months.
> > > 
> > > $ cat /proc/stat
> > > Segmentation fault
> > > 
> > > $ dmesg
> > > ------------[ cut here ]------------
> > > Kernel BUG at 001b0c92 [verbose debug info unavailable]
> > > fixpoint divide exception: 0009 [#13] SMP
> > > Modules linked in: ipv6
> > > CPU: 1 Tainted: G      D   2.6.27.10 #5
> > > Process cat (pid: 21352, task: 1fb34138, ksp: 1d2a3d98)
> > > Krnl PSW : 070c2000 801b0c92 (show_stat+0x2ca/0x68c)
> > >            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0
> > > Krnl GPRS: 00000001 00001388 00000bb8 0015d2a1
> > >            00000000 00000000 000003e8 0001fd91
> > >            00000000 00000000 0000129d eecd2ff0
> > >            1cc533b9 0036f780 801b0bce 1d2a3cc0
> > > Krnl Code: 801b0c86: f18890abf198       mvo     171(9,%r9),408(9,%r15)
> > >            801b0c8c: 98abf170           lm      %r10,%r11,368(%r15)
> > >            801b0c90: 1da1               dr      %r10,%r1
> > >           >801b0c92: 90abf170           stm     %r10,%r11,368(%r15)
> > >            801b0c96: 98abf190           lm      %r10,%r11,400(%r15)
> > >            801b0c9a: 1da1               dr      %r10,%r1
> > >            801b0c9c: 90abf190           stm     %r10,%r11,400(%r15)
> > >            801b0ca0: 18a3               lr      %r10,%r3
> > > Call Trace:
> > > ([<00000000001b09f4>] show_stat+0x2c/0x68c)
> > >  [<000000000018dcee>] seq_read+0xb2/0x364
> > >  [<00000000001a9980>] proc_reg_read+0x68/0x98
> > >  [<00000000001705ee>] vfs_read+0x6e/0xe8
> > >  [<0000000000170732>] sys_read+0x36/0x78
> > >  [<000000000010f750>] sysc_do_restart+0x12/0x16
> > >  [<0000000077f3ad6a>] 0x77f3ad6a
> > >  <4>---[ end trace 1436ea9559d3de9e ]---
> > > 
> > > i'm making sure to enable verbose debug this time in case the bug comes up 
> > > again, but perhaps someone can ninja this out before
> > 
> > The dr %r10,%r1 got a divide exception because the division result does
> > not fit into a 32 bit register. This does not happen on 64 bit because the
> > target register is larger. The culprit is probably the idle time conversion
> > to ticks. I'll have a look.
> 
> The bug should show up on a completely idle machine after 49.7 days.
> cputime64_to_clock_t is broken. I'll post a patch shortly.
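
To illustrate the overflow described above, here is a minimal user-space
sketch in plain C. It is not the kernel code. It assumes USER_HZ is 100 and
that the 31 bit divide can only produce a signed 32 bit quotient;
quotient_fits_32() and msecs_to_days() are hypothetical helper names used
only for this illustration. The divisor is the one that appears in the patch.
Note that 2^32 milliseconds works out to roughly 49.7 days, which matches the
figure quoted above.

```c
#include <stdint.h>

/* Assumption: USER_HZ is 100; other configurations change the divisor. */
#define USER_HZ 100ULL
/* Divisor as it appears in the patch: cputime units per clock_t tick. */
#define CPUTIME_PER_TICK (4096000000ULL / USER_HZ)

/*
 * Hypothetical helper: returns nonzero if cputime / divisor fits into a
 * signed 32 bit quotient, which is the assumed limit of the 31 bit divide.
 */
static int quotient_fits_32(uint64_t cputime, uint32_t divisor)
{
	return cputime / divisor <= (uint64_t)INT32_MAX;
}

/* Hypothetical helper: convert a millisecond count to days. */
static double msecs_to_days(uint64_t msecs)
{
	return (double)msecs / (1000.0 * 60 * 60 * 24);
}
```

With these helpers, quotient_fits_32((1ULL << 31) * CPUTIME_PER_TICK,
CPUTIME_PER_TICK) returns 0: the quotient no longer fits, which is the
situation in which a 32 bit divide would raise the exception.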

This patch should fix it:
--
Subject: [PATCH] cputime: fix overflow on 31 bit systems

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

The cputime_to_msecs, cputime_to_clock_t and cputime64_to_clock_t
functions cause fixpoint divide exceptions if the cputime is too large.
On a machine that has collected 49.7 days' worth of idle time, reading
from /proc/stat will generate oopses like this:

Kernel BUG at 001b0c92 [verbose debug info unavailable]
fixpoint divide exception: 0009 [#13] SMP
Modules linked in: ipv6
CPU: 1 Tainted: G      D   2.6.27.10 #5
Process cat (pid: 21352, task: 1fb34138, ksp: 1d2a3d98)
Krnl PSW : 070c2000 801b0c92 (show_stat+0x2ca/0x68c)
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0
Krnl GPRS: 00000001 00001388 00000bb8 0015d2a1
           00000000 00000000 000003e8 0001fd91
           00000000 00000000 0000129d eecd2ff0
           1cc533b9 0036f780 801b0bce 1d2a3cc0
Krnl Code: 801b0c86: f18890abf198       mvo     171(9,%r9),408(9,%r15)
           801b0c8c: 98abf170           lm      %r10,%r11,368(%r15)
           801b0c90: 1da1               dr      %r10,%r1
          >801b0c92: 90abf170           stm     %r10,%r11,368(%r15)  
           801b0c96: 98abf190           lm      %r10,%r11,400(%r15)
           801b0c9a: 1da1               dr      %r10,%r1
           801b0c9c: 90abf190           stm     %r10,%r11,400(%r15)
           801b0ca0: 18a3               lr      %r10,%r3
Call Trace:
([<00000000001b09f4>] show_stat+0x2c/0x68c)
 [<000000000018dcee>] seq_read+0xb2/0x364
 [<00000000001a9980>] proc_reg_read+0x68/0x98
 [<00000000001705ee>] vfs_read+0x6e/0xe8
 [<0000000000170732>] sys_read+0x36/0x78
 [<000000000010f750>] sysc_do_restart+0x12/0x16
 [<0000000077f3ad6a>] 0x77f3ad6a
 <4>---[ end trace 1436ea9559d3de9e ]---

Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---

 arch/s390/include/asm/cputime.h |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff -urpN linux-2.6/arch/s390/include/asm/cputime.h linux-2.6-patched/arch/s390/include/asm/cputime.h
--- linux-2.6/arch/s390/include/asm/cputime.h	2009-10-20 09:48:48.000000000 +0200
+++ linux-2.6-patched/arch/s390/include/asm/cputime.h	2009-10-20 09:48:56.000000000 +0200
@@ -78,7 +78,7 @@ cputime64_to_jiffies64(cputime64_t cputi
 static inline unsigned int
 cputime_to_msecs(const cputime_t cputime)
 {
-	return __div(cputime, 4096000);
+	return cputime_div(cputime, 4096000);
 }
 
 static inline cputime_t
@@ -160,7 +160,7 @@ cputime_to_timeval(const cputime_t cputi
 static inline clock_t
 cputime_to_clock_t(cputime_t cputime)
 {
-	return __div(cputime, 4096000000ULL / USER_HZ);
+	return cputime_div(cputime, 4096000000ULL / USER_HZ);
 }
 
 static inline cputime_t
@@ -175,7 +175,7 @@ clock_t_to_cputime(unsigned long x)
 static inline clock_t
 cputime64_to_clock_t(cputime64_t cputime)
 {
-       return __div(cputime, 4096000000ULL / USER_HZ);
+       return cputime_div(cputime, 4096000000ULL / USER_HZ);
 }
 
 struct s390_idle_data {
-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
