Re: Re: kswapd OOPS under 2.4.19-pre8 (ext3, Reiserfs + (soft)raid0)

* Re: Re: kswapd OOPS under 2.4.19-pre8 (ext3, Reiserfs + (soft)raid0)
       [not found] <200205160528.g4G5S631019167@sol.mixi.net>
@ 2002-05-16 12:28 ` Todd R. Eigenschink
  2002-05-16 19:38   ` William Lee Irwin III
  2002-05-20 12:58   ` Todd R. Eigenschink
  0 siblings, 2 replies; 9+ messages in thread
From: Todd R. Eigenschink @ 2002-05-16 12:28 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: linux-kernel

Mike Galbraith writes:
>Methinks there's an easier way to get to the line in question.  Compile sched.c with -g via make kernel/sched.o EXTRA_CFLAGS=-g.  gdb can then be used to get you the line with list *__wake_up+0xb2.


Ooh, spiffy idea.  (Like I said, asm rookie.)  I just compiled gdb,
and here's what it says.  Interesting, to me, at least.


(gdb) list *__wake_up+0xb2
0x9d6 is in __wake_up
(/src/linux-2.4.19-pre8/include/asm/processor.h:488).
483     #ifdef  CONFIG_MPENTIUMIII
484
485     #define ARCH_HAS_PREFETCH
486     extern inline void prefetch(const void *x)
487     {
488             __asm__ __volatile__ ("prefetchnta (%0)" : : "r"(x));
489     }
490
491     #elif CONFIG_X86_USE_3DNOW
492



* Re: Re: kswapd OOPS under 2.4.19-pre8 (ext3, Reiserfs + (soft)raid0)
  2002-05-16 12:28 ` Re: kswapd OOPS under 2.4.19-pre8 (ext3, Reiserfs + (soft)raid0) Todd R. Eigenschink
@ 2002-05-16 19:38   ` William Lee Irwin III
  2002-05-20 12:58   ` Todd R. Eigenschink
  1 sibling, 0 replies; 9+ messages in thread
From: William Lee Irwin III @ 2002-05-16 19:38 UTC (permalink / raw)
  To: Todd R. Eigenschink; +Cc: Mike Galbraith, linux-kernel

On Thu, May 16, 2002 at 07:28:44AM -0500, Todd R. Eigenschink wrote:
> Ooh, spiffy idea.  (Like I said, asm rookie.)  I just compiled gdb,
> and here's what it says.  Interesting, to me, at least.
> (gdb) list *__wake_up+0xb2
> 0x9d6 is in __wake_up
> (/src/linux-2.4.19-pre8/include/asm/processor.h:488).
> 483     #ifdef  CONFIG_MPENTIUMIII
> 484
> 485     #define ARCH_HAS_PREFETCH
> 486     extern inline void prefetch(const void *x)
> 487     {
> 488             __asm__ __volatile__ ("prefetchnta (%0)" : : "r"(x));
> 489     }
> 490
> 491     #elif CONFIG_X86_USE_3DNOW

list_for_each() uses prefetch() and is used in __wake_up_common(), which
is in turn used by __wake_up(). This is waitqueue list corruption.
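
A minimal userspace sketch of that relationship, assuming the 2.4-era form
of list_for_each(), which prefetches pos->next on every step. The prefetch()
stand-in (built on the GCC/Clang __builtin_prefetch), list_add_tail() and
list_count() are illustrative and are not copied from the kernel tree.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Stand-in for the kernel's prefetch(): a compiler cache hint. */
static inline void prefetch(const void *x)
{
    __builtin_prefetch(x);
}

/* Modeled on the 2.4-era macro: every step loads pos->next to prefetch it. */
#define list_for_each(pos, head) \
    for (pos = (head)->next, prefetch(pos->next); pos != (head); \
         pos = pos->next, prefetch(pos->next))

/* Link entry in just before head, i.e. at the tail of the circular list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    struct list_head *prev = head->prev;

    entry->next = head;
    entry->prev = prev;
    prev->next = entry;
    head->prev = entry;
}

/* Walk the list the same way __wake_up_common() walks a waitqueue. */
static size_t list_count(struct list_head *head)
{
    struct list_head *pos;
    size_t n = 0;

    assert(head != NULL);
    list_for_each(pos, head)
        n++;
    return n;
}
```

Each step has to load pos->next before it can prefetch it, so a list whose
links were overwritten makes the walk itself read through a bad pointer.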


Cheers,
Bill


* Re: Re: kswapd OOPS under 2.4.19-pre8 (ext3, Reiserfs + (soft)raid0)
  2002-05-16 12:28 ` Re: kswapd OOPS under 2.4.19-pre8 (ext3, Reiserfs + (soft)raid0) Todd R. Eigenschink
  2002-05-16 19:38   ` William Lee Irwin III
@ 2002-05-20 12:58   ` Todd R. Eigenschink
  2002-05-20 17:00     ` William Lee Irwin III
  1 sibling, 1 reply; 9+ messages in thread
From: Todd R. Eigenschink @ 2002-05-20 12:58 UTC (permalink / raw)
  To: linux-kernel

Todd R. Eigenschink writes:
>Mike Galbraith writes:
>>Methinks there's an easier way to get to the line in question.  Compile sched.c with -g via make kernel/sched.o EXTRA_CFLAGS=-g.  gdb can then be used to get you the line with list *__wake_up+0xb2.


Since the particular snippet of code at the point of oops in the last
one I posted was P3-specific, I recompiled for 586.  The oops remains
the same, although the call stack happens to be a lot longer this
time.

I'm going to run memtest86 on it for a while after it gets done with
its morning processing, although this failure seems a little too
consistent to be memory related.


Trace; c0129b39 <unlock_page+81/88>
Trace; c0139179 <end_buffer_io_async+8d/a8>
Trace; c01b6f45 <end_that_request_first+65/c8>
Trace; c01c1c3c <ide_end_request+68/a8>
Trace; c01c806a <ide_dma_intr+6a/ac>
Trace; c01c38ad <ide_intr+f9/164>
Trace; c01c8000 <ide_dma_intr+0/ac>
Trace; c010a1e1 <handle_IRQ_event+59/84>
Trace; c010a3d9 <do_IRQ+a9/f4>
Trace; c010c568 <call_do_IRQ+5/d>
Trace; c0154b07 <statm_pgd_range+133/1a8>
Trace; c0154c43 <proc_pid_statm+c7/16c>
Trace; c015279e <proc_info_read+5a/118>
Trace; c0137497 <sys_read+8f/104>
Trace; c0108a43 <system_call+33/40>

Code;  c0116383 <__wake_up+3b/c0>
00000000 <_EIP>:
Code;  c0116383 <__wake_up+3b/c0>   <=====
   0:   8b 01                     mov    (%ecx),%eax   <=====
Code;  c0116385 <__wake_up+3d/c0>
   2:   85 45 fc                  test   %eax,0xfffffffc(%ebp)
Code;  c0116388 <__wake_up+40/c0>
   5:   74 66                     je     6d <_EIP+0x6d> c01163f0 <__wake_up+a8/c0>
Code;  c011638a <__wake_up+42/c0>
   7:   31 d2                     xor    %edx,%edx
Code;  c011638c <__wake_up+44/c0>
   9:   9c                        pushf  
Code;  c011638d <__wake_up+45/c0>
   a:   5e                        pop    %esi
Code;  c011638e <__wake_up+46/c0>
   b:   fa                        cli    
Code;  c011638f <__wake_up+47/c0>
   c:   f0 fe 0d 80 99 30 c0      lock decb 0xc0309980
Code;  c0116396 <__wake_up+4e/c0>
  13:   0f 00 00                  sldtl  (%eax)


(gdb) list *__wake_up+0x3b
0x96f is in __wake_up (kernel/sched.c:732).
727                     wait_queue_t *curr = list_entry(tmp, wait_queue_t, task_list);
728
729                     CHECK_MAGIC(curr->__magic);
730                     p = curr->task;
731                     state = p->state;
732                     if (state & mode) {
733                             WQ_NOTE_WAKER(curr);
734                             if (try_to_wake_up(p, sync) && (curr->flags&WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
735                                     break;
736                     }



* Re: Re: kswapd OOPS under 2.4.19-pre8 (ext3, Reiserfs + (soft)raid0)
  2002-05-20 12:58   ` Todd R. Eigenschink
@ 2002-05-20 17:00     ` William Lee Irwin III
  2002-05-20 20:26       ` Todd R. Eigenschink
  0 siblings, 1 reply; 9+ messages in thread
From: William Lee Irwin III @ 2002-05-20 17:00 UTC (permalink / raw)
  To: Todd R. Eigenschink; +Cc: linux-kernel

On Mon, May 20, 2002 at 07:58:25AM -0500, Todd R. Eigenschink wrote:
> Since the particular snippet of code at the point of oops in the last
> one I posted was P3-specific, I recompiled for 586.  The oops remains
> the same, although the call stack happens to be a lot longer this
> time.

I suspect the lowest parts of the call chain are being handed bad data.


On Mon, May 20, 2002 at 07:58:25AM -0500, Todd R. Eigenschink wrote:
> I'm going to run memtest86 on it for a while after it gets done with
> its morning processing, although this failure seems a little too
> consistent to be memory related.

I hope I didn't say that.


On Mon, May 20, 2002 at 07:58:25AM -0500, Todd R. Eigenschink wrote:
> Trace; c0129b39 <unlock_page+81/88>
> Trace; c0139179 <end_buffer_io_async+8d/a8>
> Trace; c01b6f45 <end_that_request_first+65/c8>
> Trace; c01c1c3c <ide_end_request+68/a8>
> Trace; c01c806a <ide_dma_intr+6a/ac>
> Trace; c01c38ad <ide_intr+f9/164>
> Trace; c01c8000 <ide_dma_intr+0/ac>
> Trace; c010a1e1 <handle_IRQ_event+59/84>
> Trace; c010a3d9 <do_IRQ+a9/f4>
> Trace; c010c568 <call_do_IRQ+5/d>
> Trace; c0154b07 <statm_pgd_range+133/1a8>
> Trace; c0154c43 <proc_pid_statm+c7/16c>
> Trace; c015279e <proc_info_read+5a/118>
> Trace; c0137497 <sys_read+8f/104>
> Trace; c0108a43 <system_call+33/40>

The __wake_up()/unlock_page() isn't the interesting part of the call
chain, the parts from end_buffer_io_async() to ide_dma_intr() are.

Any chance you can list them in gdb?


Cheers,
Bill


* Re: Re: kswapd OOPS under 2.4.19-pre8 (ext3, Reiserfs + (soft)raid0)
  2002-05-20 17:00     ` William Lee Irwin III
@ 2002-05-20 20:26       ` Todd R. Eigenschink
  2002-05-20 22:36         ` William Lee Irwin III
  0 siblings, 1 reply; 9+ messages in thread
From: Todd R. Eigenschink @ 2002-05-20 20:26 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: linux-kernel

William Lee Irwin III writes:
>On Mon, May 20, 2002 at 07:58:25AM -0500, Todd R. Eigenschink wrote:
>> I'm going to run memtest86 on it for a while after it gets done with
>> its morning processing, although this failure seems a little too
>> consistent to be memory related.
>
>I hope I didn't say that.

Someone else had suggested testing the memory and power supply.
memtest86 is easy to run, so I'll try that.  It'll have to be tonight,
now.


>The __wake_up()/unlock_page() isn't the interesting part of the call
>chain, the parts from end_buffer_io_async() to ide_dma_intr() are.
>
>Any chance you can list them in gdb?

Well, after my posting from earlier today, I recompiled the kernel
after stripping some more stuff.  I just induced an oops in that one,
so I can list the call stack for it.

No IDE stuff this time; this looks a lot like most of the other ones
I've seen.  This morning was the first time I've ever seen IDE stuff
in the post-oops call stack.

It seems I can pretty much induce them at will, now.  I started up
four simultaneous Webtrends sessions, which grow fairly quickly to
400-600 MB each, give or take.  (The machine has 2 GB of RAM, so it
only swaps a little, sometimes.)  Within half an hour, it fell over.

Here's the oops itself, then the gdb output.


----------------------------------------------------------------------
Oops: 0000
CPU:    1
EIP:    0010:[<c0116363>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010087
eax: c2802db4   ebx: c2002db4   ecx: 00000000   edx: 00000003
esi: c2802db0   edi: c2802db0   ebp: f7bf3ee8   esp: f7bf3ecc
ds: 0018   es: 0018   ss: 0018
Process kswapd (pid: 5, stackpage=f7bf3000)
Stack: c133d790 c2802db0 c02acbf4 c2802db4 00000000 00000282 00000003 d911d9f0
       c0129b19 0076eb00 c133d790 f7bf3f4c c0130817 00000000 c12e9ca0 00000020
       00008efe 81e65000 81a7d000 1147d047 00000009 81c00000 f6c99818 81c00000
Call Trace: [<c0129b19>] [<c0130817>] [<c0130ca7>] [<c0130ea0>] [<c0130efc>]
   [<c0130f97>] [<c0130ff6>] [<c0131107>] [<c010712c>]
Code: 8b 01 85 45 fc 74 66 31 d2 9c 5e fa f0 fe 0d 80 99 30 c0 0f


>>EIP; c0116363 <__wake_up+3b/c0>   <=====

>>eax; c2802db4 <END_OF_CODE+249b758/????>
>>ebx; c2002db4 <END_OF_CODE+1c9b758/????>
>>esi; c2802db0 <END_OF_CODE+249b754/????>
>>edi; c2802db0 <END_OF_CODE+249b754/????>
>>ebp; f7bf3ee8 <END_OF_CODE+3788c88c/????>
>>esp; f7bf3ecc <END_OF_CODE+3788c870/????>

Trace; c0129b19 <unlock_page+81/88>
Trace; c0130817 <swap_out+347/4b4>
Trace; c0130ca7 <shrink_cache+323/3cc>
Trace; c0130ea0 <shrink_caches+5c/84>
Trace; c0130efc <try_to_free_pages+34/54>
Trace; c0130f97 <kswapd_balance_pgdat+47/90>
Trace; c0130ff6 <kswapd_balance+16/2c>
Trace; c0131107 <kswapd+9b/b6>
Trace; c010712c <kernel_thread+28/38>

Code;  c0116363 <__wake_up+3b/c0>
00000000 <_EIP>:
Code;  c0116363 <__wake_up+3b/c0>   <=====
   0:   8b 01                     mov    (%ecx),%eax   <=====
Code;  c0116365 <__wake_up+3d/c0>
   2:   85 45 fc                  test   %eax,0xfffffffc(%ebp)
Code;  c0116368 <__wake_up+40/c0>
   5:   74 66                     je     6d <_EIP+0x6d> c01163d0 <__wake_up+a8/c0>
Code;  c011636a <__wake_up+42/c0>
   7:   31 d2                     xor    %edx,%edx
Code;  c011636c <__wake_up+44/c0>
   9:   9c                        pushf  
Code;  c011636d <__wake_up+45/c0>
   a:   5e                        pop    %esi
Code;  c011636e <__wake_up+46/c0>
   b:   fa                        cli    
Code;  c011636f <__wake_up+47/c0>
   c:   f0 fe 0d 80 99 30 c0      lock decb 0xc0309980
Code;  c0116376 <__wake_up+4e/c0>
  13:   0f 00 00                  sldtl  (%eax)


----------------------------------------------------------------------

(gdb) list *__wake_up+0x3b
0x973 is in __wake_up (sched.c:731).
726                     unsigned int state;
727                     wait_queue_t *curr = list_entry(tmp, wait_queue_t, task_list);
728
729                     CHECK_MAGIC(curr->__magic);
730                     p = curr->task;
731                     state = p->state;
732                     if (state & mode) {
733                             WQ_NOTE_WAKER(curr);
734                             if (try_to_wake_up(p, sync) && (curr->flags&WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
735                                     break;


(gdb) list *unlock_page+0x81
0xcf9 is in unlock_page (filemap.c:845).
840             smp_mb__before_clear_bit();
841             if (!test_and_clear_bit(PG_locked, &(page)->flags))
842                     BUG();
843             smp_mb__after_clear_bit(); 
844             if (waitqueue_active(waitqueue))
845                     wake_up_all(waitqueue);
846     }
847
848     /*
849      * Get a lock on the page, assuming we need to sleep



(gdb) list *swap_out+0x347
No source file for address 0x347.

(gdb) list *swap_out
0x0 is in kswapd_init (vmscan.c:750).
745             }
746     }
747
748     static int __init kswapd_init(void)
749     {
750             printk("Starting kswapd\n");
751             swap_setup();
752             kernel_thread(kswapd, NULL, CLONE_FS | CLONE_FILES | CLONE_SIGNAL);
753             return 0;
754     }


(I'm fuzzy on swap_out...can I not see the code because it's a static
function?)



(gdb) list *shrink_cache+0x323
0x7d7 is in shrink_cache (vmscan.c:483).
478                              * Alert! We've found too many mapped pages on the
479                              * inactive list, so we start swapping out now!
480                              */
481                             spin_unlock(&pagemap_lru_lock);
482                             swap_out(priority, gfp_mask, classzone);
483                             return nr_pages;
484                     }
485
486                     /*
487                      * It is critical to check PageDirty _after_ we made sure


(gdb) list *shrink_caches+0x5c
0x9d0 is in shrink_caches (vmscan.c:571).
566             nr_pages = chunk_size;
567             /* try to keep the active list 2/3 of the size of the cache */
568             ratio = (unsigned long) nr_pages * nr_active_pages / ((nr_inactive_pages + 1) * 2);
569             refill_inactive(ratio);
570
571             nr_pages = shrink_cache(nr_pages, classzone, gfp_mask, priority);
572             if (nr_pages <= 0)
573                     return 0;
574
575             shrink_dcache_memory(priority, gfp_mask);


(gdb) list *try_to_free_pages+0x34
0xa2c is in try_to_free_pages (vmscan.c:591).
586             int priority = DEF_PRIORITY;
587             int nr_pages = SWAP_CLUSTER_MAX;
588
589             gfp_mask = pf_gfp_mask(gfp_mask);
590             do {
591                     nr_pages = shrink_caches(classzone, priority, gfp_mask, nr_pages);
592                     if (nr_pages <= 0)
593                             return 1;
594             } while (--priority);
595


(gdb) list *kswapd_balance_pgdat+0x47
0xac7 is in kswapd_balance_pgdat (vmscan.c:630).
625                     zone = pgdat->node_zones + i;
626                     if (unlikely(current->need_resched))
627                             schedule();
628                     if (!zone->need_balance)
629                             continue;
630                     if (!try_to_free_pages(zone, GFP_KSWAPD, 0)) {
631                             zone->need_balance = 0;
632                             __set_current_state(TASK_INTERRUPTIBLE);
633                             schedule_timeout(HZ);
634                             continue;


(gdb) list *kswapd_balance+0x16
0xb26 is in kswapd_balance (vmscan.c:655).
650             do {
651                     need_more_balance = 0;
652                     pgdat = pgdat_list;
653                     do
654                             need_more_balance |= kswapd_balance_pgdat(pgdat);
655                     while ((pgdat = pgdat->node_next));
656             } while (need_more_balance);
657     }
658
659     static int kswapd_can_sleep_pgdat(pg_data_t * pgdat)


(gdb) list *kswapd+0x9b
0xc37 is in kswapd (/src/linux-2.4.19-pre8/include/linux/tqueue.h:121).
116
117     extern void __run_task_queue(task_queue *list);
118
119     static inline void run_task_queue(task_queue *list)
120     {
121             if (TQ_ACTIVE(*list))
122                     __run_task_queue(list);
123     }
124
125     #endif /* _LINUX_TQUEUE_H */


(gdb) list *kernel_thread+0x28
0x3fc is in kernel_thread (process.c:492).
487      */
488     int kernel_thread(int (*fn)(void *), void * arg, unsigned long flags)
489     {
490             long retval, d0;
491
492             __asm__ __volatile__(
493                     "movl %%esp,%%esi\n\t"
494                     "int $0x80\n\t"         /* Linux/i386 system call */
495                     "cmpl %%esp,%%esi\n\t"  /* child or parent? */
496                     "je 1f\n\t"             /* parent - jump */

----------------------------------------------------------------------



* Re: Re: kswapd OOPS under 2.4.19-pre8 (ext3, Reiserfs + (soft)raid0)
  2002-05-20 20:26       ` Todd R. Eigenschink
@ 2002-05-20 22:36         ` William Lee Irwin III
  2002-05-20 23:07           ` Todd R. Eigenschink
  0 siblings, 1 reply; 9+ messages in thread
From: William Lee Irwin III @ 2002-05-20 22:36 UTC (permalink / raw)
  To: Todd R. Eigenschink; +Cc: linux-kernel

On Mon, May 20, 2002 at 03:26:56PM -0500, Todd R. Eigenschink wrote:
> Someone else had suggested testing the memory and power supply.
> memtest86 is easy to run, so I'll try that.  It'll have to be tonight,
> now.

Bitflips are usually things where a pointer turns up invalid (or
non-NULL) and the difference between it and a valid pointer (or NULL)
is one bit. I don't see that here and don't like blaming hardware.
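
That rule of thumb can be written down as a small check.  This is only a
sketch in C of the heuristic as stated, assuming 32-bit pointer values as on
this i386 box; popcount32(), is_single_bit_flip() and bitflip_examples() are
made-up names for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Count set bits with a portable loop (clears the lowest set bit each pass). */
static unsigned popcount32(uint32_t v)
{
    unsigned n = 0;

    while (v) {
        v &= v - 1;
        n++;
    }
    return n;
}

/* Nonzero when the observed value differs from the expected one in exactly one bit. */
static int is_single_bit_flip(uint32_t seen, uint32_t expected)
{
    return popcount32(seen ^ expected) == 1;
}

/* Worked examples; call from main() to run the checks. */
static void bitflip_examples(void)
{
    assert(is_single_bit_flip(0x00000004u, 0));  /* NULL with one stray bit set */
    assert(!is_single_bit_flip(0, 0));           /* plain NULL: nothing differs */
    assert(!is_single_bit_flip(0x00000006u, 0)); /* two bits off: not one flip */
}
```

The check needs a guess at what the value should have been, which is the
hard part in practice.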


On Mon, May 20, 2002 at 03:26:56PM -0500, Todd R. Eigenschink wrote:
> Well, after my posting from earlier today, I recompiled the kernel
> after stripping some more stuff.  I just induced an oops in that one,
> so I can list the call stack for it.

Nice, I presume you've got -g there? Any chance of doing something like
objdump --disassemble --source vmlinux and sending me the annotated
disassembly of __wake_up()? I want to doublecheck something...


On Mon, May 20, 2002 at 03:26:56PM -0500, Todd R. Eigenschink wrote:
> No IDE stuff this time; this looks a lot like most of the other ones
> I've seen.  This morning was the first time I've ever seen IDE stuff
> in the post-oops call stack.

This is pretty strange, yes.


On Mon, May 20, 2002 at 03:26:56PM -0500, Todd R. Eigenschink wrote:
> It seems I can pretty much induce them at will, now.  I started up
> four simultaneous Webtrends sessions, which grow fairly quickly to
> 400-600 MB each, give or take.  (The machine has 2 GB of RAM, so it
> only swaps a little, sometimes.)  Within half an hour, it fell over.
> Here's the oops itself, then the gdb output.

Great stuff! Thanks.


On Mon, May 20, 2002 at 03:26:56PM -0500, Todd R. Eigenschink wrote:
> Oops: 0000
> CPU:    1
> EIP:    0010:[<c0116363>]    Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EFLAGS: 00010087
> eax: c2802db4   ebx: c2002db4   ecx: 00000000   edx: 00000003
> esi: c2802db0   edi: c2802db0   ebp: f7bf3ee8   esp: f7bf3ecc
> ds: 0018   es: 0018   ss: 0018

Okay, %ecx is 0 -- no bitflip, just plain old NULL...


On Mon, May 20, 2002 at 03:26:56PM -0500, Todd R. Eigenschink wrote:
> Code;  c0116363 <__wake_up+3b/c0>
> 00000000 <_EIP>:
> Code;  c0116363 <__wake_up+3b/c0>   <=====
>    0:   8b 01                     mov    (%ecx),%eax   <=====
> Code;  c0116365 <__wake_up+3d/c0>
>    2:   85 45 fc                  test   %eax,0xfffffffc(%ebp)
> Code;  c0116368 <__wake_up+40/c0>
>    5:   74 66                     je     6d <_EIP+0x6d> c01163d0 <__wake_up+a8/c

Okay, the offending instruction is mov (%ecx), %eax -- dereferencing the
NULL %ecx...


On Mon, May 20, 2002 at 03:26:56PM -0500, Todd R. Eigenschink wrote:
> (gdb) list *__wake_up+0x3b
> 0x973 is in __wake_up (sched.c:731).
> 726                     unsigned int state;
> 727                     wait_queue_t *curr = list_entry(tmp, wait_queue_t, task_list);
> 728
> 729                     CHECK_MAGIC(curr->__magic);
> 730                     p = curr->task;
> 731                     state = p->state;
> 732                     if (state & mode) {
> 733                             WQ_NOTE_WAKER(curr);
> 734                             if (try_to_wake_up(p, sync) && (curr->flags&WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
> 735                                     break;


This makes it pretty clear the offending instruction corresponds to the
first dereference of curr->task. Someone's leaving a NULL pointer in
there when they shouldn't. So this entire call chain has nothing to do
with the offender -- it only trips over the bad pointer the offending
code left behind. This looks like a PITA. The objdump --disassemble
--source stuff is just to have the assembly and source next to each
other for a "more convincing" demonstration, not that this isn't already
pretty good as it stands. Of course, finding the offender will be painful.
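
To spell out where the walk trips, here is a minimal userspace model in C.
It is a sketch under assumptions: struct task and struct wait_entry are
simplified stand-ins for task_struct and wait_queue_t, and find_null_task()
is a made-up helper that reports the bad entry instead of dereferencing it.
It is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Stand-in for task_struct; only the field the loop reads. */
struct task {
    unsigned int state;
};

/* Stand-in for wait_queue_t: flags, task pointer, then the list linkage. */
struct wait_entry {
    unsigned int flags;
    struct task *task;
    struct list_head task_list;
};

/* Recover the containing structure from a pointer to its list member. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Walk the queue the way __wake_up_common() does, but return the first
 * entry whose task pointer is NULL rather than reading p->state from it.
 * Returns NULL when every entry has a task.
 */
static struct wait_entry *find_null_task(struct list_head *head)
{
    struct list_head *tmp;

    assert(head != NULL);
    for (tmp = head->next; tmp != head; tmp = tmp->next) {
        struct wait_entry *curr = list_entry(tmp, struct wait_entry, task_list);

        if (curr->task == NULL)
            return curr; /* the real loop would fault on p->state here */
    }
    return NULL;
}
```

The walker only discovers the NULL; whoever stored it has long since
returned, which is why the call trace does not name the culprit.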


Cheers,
Bill


* Re: Re: kswapd OOPS under 2.4.19-pre8 (ext3, Reiserfs + (soft)raid0)
  2002-05-20 22:36         ` William Lee Irwin III
@ 2002-05-20 23:07           ` Todd R. Eigenschink
  2002-05-20 23:28             ` William Lee Irwin III
  0 siblings, 1 reply; 9+ messages in thread
From: Todd R. Eigenschink @ 2002-05-20 23:07 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: linux-kernel

William Lee Irwin III writes:
>Bitflips are usually things where a pointer turns up invalid (or
>non-NULL) and the difference between it and a valid pointer (or NULL)
>is one bit. I don't see that here and don't like blaming hardware.

Good point.


>Nice, I presume you've got -g there? Any chance of doing something like
>objdump --disassemble --source vmlinux and sending me the annotated
>disassembly of __wake_up()? I want to doublecheck something...

Everything's compiled with -g at the moment.  In fact, I tried
compiling without the -O2, but found out pretty quickly that You Can't
Do That. :) The disassembly is included below.  It's not too big.

I was upstairs rebooting from another oops when your mail arrived,
just a few hours after the last oops.  (Same workload, continuing
where it left off before.)  It was identical apart from trivialities,
and of course %ecx was 0.


>                                          The objdump --disassemble
>--source stuff is just to have the assembly and source next to each
>other for a "more convincing" demonstration, not that this isn't already
>pretty good as it stands. Of course, finding the offender will be painful.

I'll be glad to do whatever I can to help.  If four jobs crash it in
a couple of hours, 20 will probably crash it a lot sooner. :)


For whatever this may be worth--probably nothing--I have softdog
compiled in, but it has only successfully rebooted after an oops maybe
twice out of 20 or more oopsen.  On a bunch of them, the message has
come out to the serial console that it was initiating a reboot (but it
didn't).  Most of the time, it's just the oops and then...darkness.

Also, on the off chance that this is a code generation problem, this
is gcc 2.95.3.  I actually was about to say 3.0.4 and wait for the
slaps-upside-the-head, but I just checked and realized I haven't
upgraded this box.


Todd


Partial disassembly follows.  If for some strange reason you want the
whole thing, it's ~5MB and at
http://www.mixi.net/~eigenstr/vmlinux.disassembly.bz2 .

----------------------------------------------------------------------

c0116328 <__wake_up>:

/*
 * The core wakeup function.  Non-exclusive wakeups (nr_exclusive == 0) just wake everything
 * up.  If it's an exclusive wakeup (nr_exclusive == small +ve number) then we wake all the
 * non-exclusive tasks and one exclusive task.
 *
 * There are circumstances in which we can try to wake a task which has already
 * started to run but is not in state TASK_RUNNING.  try_to_wake_up() returns zero
 * in this (rare) case, and we handle it by contonuing to scan the queue.
 */
static inline void __wake_up_common (wait_queue_head_t *q, unsigned int mode,
			 	     int nr_exclusive, const int sync)
{
	struct list_head *tmp;
	struct task_struct *p;

	CHECK_MAGIC_WQHEAD(q);
	WQ_CHECK_LIST_HEAD(&q->task_list);
	
	list_for_each(tmp,&q->task_list) {
		unsigned int state;
                wait_queue_t *curr = list_entry(tmp, wait_queue_t, task_list);

		CHECK_MAGIC(curr->__magic);
		p = curr->task;
		state = p->state;
		if (state & mode) {
			WQ_NOTE_WAKER(curr);
			if (try_to_wake_up(p, sync) && (curr->flags&WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
				break;
		}
	}
}

void __wake_up(wait_queue_head_t *q, unsigned int mode, int nr)
{
c0116328:	55                   	push   %ebp
c0116329:	89 e5                	mov    %esp,%ebp
c011632b:	83 ec 10             	sub    $0x10,%esp
c011632e:	57                   	push   %edi
c011632f:	56                   	push   %esi
c0116330:	53                   	push   %ebx
c0116331:	89 55 fc             	mov    %edx,0xfffffffc(%ebp)
c0116334:	89 c7                	mov    %eax,%edi
	if (q) {
c0116336:	85 ff                	test   %edi,%edi
c0116338:	0f 84 a2 00 00 00    	je     c01163e0 <__wake_up+0xb8>
		unsigned long flags;
		wq_read_lock_irqsave(&q->lock, flags);
c011633e:	9c                   	pushf  
c011633f:	8f 45 f8             	popl   0xfffffff8(%ebp)
c0116342:	fa                   	cli    
printk("eip: %p\n", &&here);
		BUG();
	}
#endif
	__asm__ __volatile__(
c0116343:	f0 fe 0f             	lock decb (%edi)
c0116346:	0f 88 5f 0f 00 00    	js     c01172ab <Letext+0x8a>
c011634c:	89 4d f4             	mov    %ecx,0xfffffff4(%ebp)
c011634f:	8b 5f 04             	mov    0x4(%edi),%ebx
c0116352:	8d 47 04             	lea    0x4(%edi),%eax
c0116355:	89 45 f0             	mov    %eax,0xfffffff0(%ebp)
c0116358:	39 c3                	cmp    %eax,%ebx
c011635a:	74 7b                	je     c01163d7 <__wake_up+0xaf>
c011635c:	8d 74 26 00          	lea    0x0(%esi,1),%esi
c0116360:	8b 4b fc             	mov    0xfffffffc(%ebx),%ecx
c0116363:	8b 01                	mov    (%ecx),%eax
c0116365:	85 45 fc             	test   %eax,0xfffffffc(%ebp)
c0116368:	74 66                	je     c01163d0 <__wake_up+0xa8>
c011636a:	31 d2                	xor    %edx,%edx
c011636c:	9c                   	pushf  
c011636d:	5e                   	pop    %esi
c011636e:	fa                   	cli    
printk("eip: %p\n", &&here);
		BUG();
	}
#endif
	__asm__ __volatile__(
c011636f:	f0 fe 0d 80 99 30 c0 	lock decb 0xc0309980
c0116376:	0f 88 3b 0f 00 00    	js     c01172b7 <Letext+0x96>
c011637c:	c7 01 00 00 00 00    	movl   $0x0,(%ecx)
c0116382:	83 79 3c 00          	cmpl   $0x0,0x3c(%ecx)
c0116386:	75 2d                	jne    c01163b5 <__wake_up+0x8d>
 */
static __inline__ void __list_add(struct list_head * new,
	struct list_head * prev,
	struct list_head * next)
{
c0116388:	a1 c0 b5 2a c0       	mov    0xc02ab5c0,%eax
	next->prev = new;
	new->next = next;
	new->prev = prev;
	prev->next = new;
}

/**
 * list_add - add a new entry
 * @new: new entry to be added
 * @head: list head to add it after
 *
 * Insert a new entry after the specified head.
 * This is good for implementing stacks.
 */
static __inline__ void list_add(struct list_head *new, struct list_head *head)
{
c011638d:	8d 51 3c             	lea    0x3c(%ecx),%edx
c0116390:	89 50 04             	mov    %edx,0x4(%eax)
c0116393:	89 41 3c             	mov    %eax,0x3c(%ecx)
c0116396:	c7 42 04 c0 b5 2a c0 	movl   $0xc02ab5c0,0x4(%edx)
c011639d:	89 15 c0 b5 2a c0    	mov    %edx,0xc02ab5c0
c01163a3:	ff 05 60 7a 32 c0    	incl   0xc0327a60
c01163a9:	89 c8                	mov    %ecx,%eax
c01163ab:	e8 48 f6 ff ff       	call   c01159f8 <reschedule_idle>
c01163b0:	ba 01 00 00 00       	mov    $0x1,%edx
		:"0" (oldval) : "memory"

static inline void spin_unlock(spinlock_t *lock)
{
	char oldval = 1;
c01163b5:	b0 01                	mov    $0x1,%al
#if SPINLOCK_DEBUG
	if (lock->magic != SPINLOCK_MAGIC)
		BUG();
	if (!spin_is_locked(lock))
		BUG();
#endif
	__asm__ __volatile__(
c01163b7:	86 05 80 99 30 c0    	xchg   %al,0xc0309980
c01163bd:	56                   	push   %esi
c01163be:	9d                   	popf   
c01163bf:	85 d2                	test   %edx,%edx
c01163c1:	74 0d                	je     c01163d0 <__wake_up+0xa8>
c01163c3:	f6 43 f8 01          	testb  $0x1,0xfffffff8(%ebx)
c01163c7:	74 07                	je     c01163d0 <__wake_up+0xa8>
c01163c9:	ff 4d f4             	decl   0xfffffff4(%ebp)
c01163cc:	74 09                	je     c01163d7 <__wake_up+0xaf>
c01163ce:	89 f6                	mov    %esi,%esi
c01163d0:	8b 1b                	mov    (%ebx),%ebx
c01163d2:	3b 5d f0             	cmp    0xfffffff0(%ebp),%ebx
c01163d5:	75 89                	jne    c0116360 <__wake_up+0x38>
		:"0" (oldval) : "memory"

static inline void spin_unlock(spinlock_t *lock)
{
	char oldval = 1;
c01163d7:	b0 01                	mov    $0x1,%al
#if SPINLOCK_DEBUG
	if (lock->magic != SPINLOCK_MAGIC)
		BUG();
	if (!spin_is_locked(lock))
		BUG();
#endif
	__asm__ __volatile__(
c01163d9:	86 07                	xchg   %al,(%edi)
		__wake_up_common(q, mode, nr, 0);
		wq_read_unlock_irqrestore(&q->lock, flags);
c01163db:	ff 75 f8             	pushl  0xfffffff8(%ebp)
c01163de:	9d                   	popf   
	}
c01163df:	90                   	nop    
c01163e0:	5b                   	pop    %ebx
c01163e1:	5e                   	pop    %esi
c01163e2:	5f                   	pop    %edi
c01163e3:	89 ec                	mov    %ebp,%esp
c01163e5:	5d                   	pop    %ebp
c01163e6:	c3                   	ret    
}
c01163e7:	90                   	nop    

----------------------------------------------------------------------



* Re: Re: kswapd OOPS under 2.4.19-pre8 (ext3, Reiserfs + (soft)raid0)
  2002-05-20 23:07           ` Todd R. Eigenschink
@ 2002-05-20 23:28             ` William Lee Irwin III
  2002-05-20 23:59               ` Todd R. Eigenschink
  0 siblings, 1 reply; 9+ messages in thread
From: William Lee Irwin III @ 2002-05-20 23:28 UTC (permalink / raw)
  To: Todd R. Eigenschink; +Cc: linux-kernel

On Mon, May 20, 2002 at 06:07:12PM -0500, Todd R. Eigenschink wrote:
> For whatever this may be worth--probably nothing--I have softdog
> compiled in, but it has only successfully rebooted after an oops maybe
> twice out of 20 or more oopsen.  On a bunch of them, the message has
> come out to the serial console that it was initiating a reboot (but it
> didn't).  Most of the time, it's just the oops and then...darkness.

Actually, getting a notion of your sourcebase and what's actually
running sounds like a great idea. Any chance you could rattle off what
patches you've got and/or name the tree, and maybe send me a .config?
Also, any chance you could tell me a little about the hardware?
I'm not going to tell you what to run or not to run, I just want to
know where to start looking.

On Mon, May 20, 2002 at 06:07:12PM -0500, Todd R. Eigenschink wrote:
> Also, on the off chance that this is a code generation problem, this
> is gcc 2.95.3.  I actually was about to say 3.0.4 and wait for the
> slaps-upside-the-head, but I just checked and realized I haven't
> upgraded this box.

I don't know of any particular issues with gcc 2.95.3, but I'll compare
the disassemblies you sent me just in case.

Your help in tracking this down has been immense, I hope you have the
patience to bear with me as I try to fix this for you.

Thanks,
Bill


* Re: Re: kswapd OOPS under 2.4.19-pre8 (ext3, Reiserfs + (soft)raid0)
  2002-05-20 23:28             ` William Lee Irwin III
@ 2002-05-20 23:59               ` Todd R. Eigenschink
  0 siblings, 0 replies; 9+ messages in thread
From: Todd R. Eigenschink @ 2002-05-20 23:59 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: linux-kernel

William Lee Irwin III writes:
>Actually, getting a notion of your sourcebase and what's actually
>running sounds like a great idea. Any chance you could rattle off what
>patches you've got and/or name the tree, and maybe send me a .config?
>Also, any chance you could tell me a little about the hardware?
>I'm not going to tell you what to run or not to run, I just want to
>know where to start looking.

Kernel: vanilla 2.4.19-pre8 at the moment.  I recompiled after adding
Stephen Tweedie's latest ext3 patch the other night, but that's it.
I've been following the 2.4.19-pre kernels "religiously", but never
mix in *any* other patches.  While I don't have any actual oops output
from previous kernels, I think this has been around in every
2.4.19-pre.  (I've been having trouble for longer than that, but my
last round--see link below--at least *appeared* different.)

Stuff That Runs: vanilla.  syslog-ng, bind 9.2.1, gated, portmap,
ypserv, xinetd, automount, cron, rpc.mountd, ypbind, rpc.nfsd, Apache
(hardly ever touched), Backup Exec agent, postgres 7.2.1 (only hit by
Apache).

Webtrends runs early every morning.  A bunch of other machines rcp log
files to it between midnight and 04:00.  I've had oopsen while
webtrends is running and while it's not running.  I've had them just
when there are rsh/rcp sessions from a couple different machines at
the same time.  I've even had them when the machine is (as far as I
could predict) completely idle.

If you have suggestions for stuff to run (or not)--whatever--I'll be
glad to try it.  I can start going backwards kernel-wise, if you want
me to try to pin a starting point for the problem.

A couple other references:

http://groups.google.com/groups?q=todd+eigenschink&hl=en&lr=&ie=utf-8&oe=utf-8&scoring=d&selm=linux.kernel.15404.36497.77658.797884%40rtfm.ofc.tekinteractive.com&rnum=7

http://groups.google.com/groups?q=todd+eigenschink&hl=en&lr=&ie=utf-8&oe=utf-8&scoring=d&selm=linux.kernel.3C3D375C.E4A7EE77%40zip.com.au&rnum=6

>Your help in tracking this down has been immense; I hope you have the
>patience to bear with me as I try to fix this for you.

I have a lot more patience than kernel hacking skill, so I'll do what
I can, and you do your thing. :-)

A steak dinner and a case of your favorite if you fix it.  I'm
*really* tired of getting paged and driving in to the office in the
wee hours of the morning to hit the freaking reset button.  I do
preemptive reboots some evenings so I can control it, but it may still
croak a couple hours later.  (I'd love an APC MasterSwitch right now,
but I can do a *lot* of driving and switch-flipping for $600.)

Todd

(Hardware info and .config follows.)

----------------------------------------------------------------------

Hardware:

Intel L440GX-C mainboard.  Dual P3/500 CPUs, 2 GB of RAM.

1 9GB SCSI disk, 1 36GB SCSI, 4 x 30GB IDE disks, all on the internal
IDE & Adaptec SCSI.  (The IDE used to be one 4-disk softraid RAID0
partition; now it's two separate 2-disk RAID0 partitions.)

----------------------------------------------------------------------
"grep =y .config" (nothing configured as modules).  It had been
CONFIG_MPENTIUMIII; I recompiled as M586 a few days ago.  No change.

CONFIG_X86=y
CONFIG_ISA=y
CONFIG_UID16=y
CONFIG_EXPERIMENTAL=y
CONFIG_MODULES=y
CONFIG_MODVERSIONS=y
CONFIG_KMOD=y
CONFIG_M586=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_USE_STRING_486=y
CONFIG_X86_ALIGNMENT_16=y
CONFIG_X86_PPRO_FENCE=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
CONFIG_HIGHMEM4G=y
CONFIG_HIGHMEM=y
CONFIG_MTRR=y
CONFIG_SMP=y
CONFIG_HAVE_DEC_LOCK=y
CONFIG_NET=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_PCI=y
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_NAMES=y
CONFIG_SYSVIPC=y
CONFIG_SYSCTL=y
CONFIG_KCORE_ELF=y
CONFIG_BINFMT_ELF=y
CONFIG_BLK_DEV_FD=y
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_RAID0=y
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_NETLINK_DEV=y
CONFIG_NETFILTER=y
CONFIG_FILTER=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_NF_CONNTRACK=y
CONFIG_IP_NF_FTP=y
CONFIG_IP_NF_IPTABLES=y
CONFIG_IP_NF_MATCH_MULTIPORT=y
CONFIG_IP_NF_MATCH_STATE=y
CONFIG_IP_NF_FILTER=y
CONFIG_IP_NF_TARGET_REJECT=y
CONFIG_IP_NF_TARGET_LOG=y
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_BLK_DEV_IDECD=y
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
CONFIG_IDEDMA_PCI_AUTO=y
CONFIG_BLK_DEV_IDEDMA=y
CONFIG_BLK_DEV_ADMA=y
CONFIG_BLK_DEV_PIIX=y
CONFIG_PIIX_TUNING=y
CONFIG_IDE_CHIPSETS=y
CONFIG_IDEDMA_AUTO=y
CONFIG_BLK_DEV_IDE_MODES=y
CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_AIC7XXX=y
CONFIG_NETDEVICES=y
CONFIG_NET_ETHERNET=y
CONFIG_NET_PCI=y
CONFIG_EEPRO100=y
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_SERIAL=y
CONFIG_SERIAL_CONSOLE=y
CONFIG_UNIX98_PTYS=y
CONFIG_WATCHDOG=y
CONFIG_SOFT_WATCHDOG=y
CONFIG_RTC=y
CONFIG_AUTOFS_FS=y
CONFIG_AUTOFS4_FS=y
CONFIG_EXT3_FS=y
CONFIG_JBD=y
CONFIG_RAMFS=y
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_PROC_FS=y
CONFIG_DEVPTS_FS=y
CONFIG_EXT2_FS=y
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
CONFIG_SUNRPC=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_MSDOS_PARTITION=y
CONFIG_NLS=y
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
CONFIG_VGA_CONSOLE=y

----------------------------------------------------------------------
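
For anyone wanting to compare two such listings (say, the earlier
CONFIG_MPENTIUMIII build against the M586 one), here is a minimal sketch
of the same filtering that "grep =y .config" does, written in Python.
The function name enabled_options and the sample text are hypothetical,
made up for illustration; CONFIG_EXAMPLE_VALUE is not a real option.
Unlike a bare grep, it anchors the match so that comments and non-"y"
values are skipped.

```python
import re


def enabled_options(config_text):
    """Return the names of options set to 'y', in file order.

    Roughly what `grep =y .config` shows, but anchored so that
    comment lines and numeric or string values are ignored.
    """
    names = []
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=y$", line.strip())
        if match:
            names.append(match.group(1))
    return names


# Hypothetical sample, not the real .config from this machine.
sample = """\
# CONFIG_MPENTIUMIII is not set
CONFIG_M586=y
CONFIG_SMP=y
CONFIG_HIGHMEM4G=y
CONFIG_EXAMPLE_VALUE=32
"""

if __name__ == "__main__":
    for name in enabled_options(sample):
        print(name)
```

Taking set(enabled_options(a)) - set(enabled_options(b)) on two config
files would then show which built-in options differ between builds.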


end of thread, other threads:[~2002-05-20 23:59 UTC | newest]

Thread overview: 9+ messages
     [not found] <200205160528.g4G5S631019167@sol.mixi.net>
2002-05-16 12:28 ` Re: kswapd OOPS under 2.4.19-pre8 (ext3, Reiserfs + (soft)raid0) Todd R. Eigenschink
2002-05-16 19:38   ` William Lee Irwin III
2002-05-20 12:58   ` Todd R. Eigenschink
2002-05-20 17:00     ` William Lee Irwin III
2002-05-20 20:26       ` Todd R. Eigenschink
2002-05-20 22:36         ` William Lee Irwin III
2002-05-20 23:07           ` Todd R. Eigenschink
2002-05-20 23:28             ` William Lee Irwin III
2002-05-20 23:59               ` Todd R. Eigenschink
