linux-kernel.vger.kernel.org archive mirror
From: Andrew Morton <akpm@linux-foundation.org>
To: Carsten Aulbert <carsten.aulbert@aei.mpg.de>
Cc: linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org
Subject: Re: kernel BUG at kernel/workqueue.c:291
Date: Mon, 2 Mar 2009 23:26:43 -0800
Message-ID: <20090302232643.7c7ca284.akpm@linux-foundation.org>
In-Reply-To: <49ABBA44.1060302@aei.mpg.de>

On Mon, 02 Mar 2009 11:51:48 +0100 Carsten Aulbert <carsten.aulbert@aei.mpg.de> wrote:

> Hi again,
> 
> In the meantime, 43 of our nodes have been struck by this error. It seems
> that the jobs of a certain user can trigger this bug; however, I have no
> clue how to trigger it manually.

That's a lot of nodes.

> My questions:
> Is this a known bug for 2.6.27.14 (we can upgrade to .19 if necessary)?
> As this file was not modified recently, I suspect there is no ready
> fix for it.
> 
> Do you need any more info about our systems (Intel X3220 based Supermicro
> systems), the kernel config (deadline scheduler in use,...) or something
> else?

Let's cc the NFS developers, see if this rpciod crash is familiar to them?

> Carsten Aulbert wrote:
> > [228704.928037] ------------[ cut here ]------------
> > [228704.928224] kernel BUG at kernel/workqueue.c:291!
> > [228704.928404] invalid opcode: 0000 [1] SMP
> > [228704.928647] CPU 0
> > [228704.928852] Modules linked in: lm92 w83793 w83781d hwmon_vid hwmon nfs nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs autofs4 netconsole configfs ipmi_si ipmi_devintf ipmi_watchdog ipmi_poweroff ipmi_msghandler e1000e i2c_i801 8250_pnp 8250 serial_core i2c_core
> > [228704.930002] Pid: 1609, comm: rpciod/0 Not tainted 2.6.27.14-nodes #1
> > [228704.930002] RIP: 0010:[<ffffffff8023c6db>]  [<ffffffff8023c6db>] run_workqueue+0x6f/0x102
> > [228704.930002] RSP: 0018:ffff880214bcdec0  EFLAGS: 00010207
> > [228704.930002] RAX: 0000000000000000 RBX: ffff880214b82f40 RCX: ffff880215444418
> > [228704.930002] RDX: ffff880187d07d58 RSI: ffff880214bcdee0 RDI: ffff880215444410
> > [228704.930002] RBP: ffffffffa0077186 R08: ffff880214bcc000 R09: ffff88021491f808
> > [228704.930002] R10: 0000000000000246 R11: ffff880187d07d50 R12: ffff880214ad7d28
> > [228704.930002] R13: ffffffff806065a0 R14: ffffffff80607280 R15: 0000000000000000
> > [228704.930002] FS:  0000000000000000(0000) GS:ffffffff80636040(0000) knlGS:0000000000000000
> > [228704.930002] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > [228704.930002] CR2: 00007fc056333fd8 CR3: 00000001ed270000 CR4: 00000000000006e0
> > [228704.930002] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [228704.930002] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > [228704.930002] Process rpciod/0 (pid: 1609, threadinfo ffff880214bcc000, task ffff880217b08780)
> > [228704.930002] Stack:  ffff880214b82f40 ffff880214b82f40 ffff880214b82f58 ffffffff8023cff3
> > [228704.930002]  0000000000000000 ffff880217b08780 ffffffff8023f7d7 ffff880214bcdef8
> > [228704.930002]  ffff880214bcdef8 ffffffff806065a0 ffffffff80607280 ffff880214b82f40
> > [228704.930002] Call Trace:
> > [228704.930002]  [<ffffffff8023cff3>] ? worker_thread+0x90/0x9b
> > [228704.930002]  [<ffffffff8023f7d7>] ? autoremove_wake_function+0x0/0x2e
> > [228704.930002]  [<ffffffff8023cf63>] ? worker_thread+0x0/0x9b
> > [228704.930002]  [<ffffffff8023f6c2>] ? kthread+0x47/0x75
> > [228704.930002]  [<ffffffff8022afa8>] ? schedule_tail+0x27/0x5f
> > [228704.930002]  [<ffffffff8020ccb9>] ? child_rip+0xa/0x11
> > [228704.930002]  [<ffffffff8023f67b>] ? kthread+0x0/0x75
> > [228704.930002]  [<ffffffff8020ccaf>] ? child_rip+0x0/0x11
> > [228704.930002]
> > [228704.930002]
> > [228704.930002] Code: 6f 18 48 89 7b 30 48 8b 11 48 8b 41 08 48 89 42 08 48 89 10 48 89 49 08 48 89 09 fe 03 fb 48 8b 41 f8 48 83 e0 fc 48 39 d8 74 04 <0f> 0b eb fe f0 80 61 f8 fe ff d5 65 48 8b 04 25 10 00 00 00 8b
> > [228704.930002] RIP  [<ffffffff8023c6db>] run_workqueue+0x6f/0x102
> > [228704.930002]  RSP <ffff880214bcdec0>
> > [228704.941003] ---[ end trace deef6e5387b5a584 ]---
> 
> Thanks for any input, for right now I'm quite helpless....



Thread overview: 8+ messages
2009-02-27 19:48 kernel BUG at kernel/workqueue.c:291 Carsten Aulbert
2009-03-02 10:51 ` Carsten Aulbert
2009-03-03  7:26   ` Andrew Morton [this message]
2009-03-03  7:36     ` Carsten Aulbert
2009-03-03 15:16     ` Trond Myklebust
2009-03-03 15:23       ` Carsten Aulbert
2009-03-03 20:41         ` Aaron Straus
2009-03-03 21:21         ` Trond Myklebust
