From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bas van der Vlies
Subject: Re: nfs clients crashes
Date: Thu, 12 Mar 2009 22:24:15 +0100
Message-ID: <516A1955-7F37-435A-99FD-EC26BF5D35E0@sara.nl>
References: <49B91468.3020006@sara.nl>
 <1236880443.7179.35.camel@heimdal.trondhjem.org>
Mime-Version: 1.0 (Apple Message framework v930.3)
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Cc: "linux-nfs@vger.kernel.org"
To: Trond Myklebust
Return-path:
Received: from smtp-vbr15.xs4all.nl ([194.109.24.35]:1932 "EHLO
 smtp-vbr15.xs4all.nl" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1756924AbZCLVYZ (ORCPT );
 Thu, 12 Mar 2009 17:24:25 -0400
In-Reply-To: <1236880443.7179.35.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

On 12 Mar 2009, at 18:54, Trond Myklebust wrote:

> On Thu, 2009-03-12 at 14:55 +0100, Bas van der Vlies wrote:
>> OS: debian lenny
>> kernel release tested: 2.6.28.[1-7] , 2.6.29.rc5 and 2.6.29.rc7
>>
>> NFS-server: solaris 10 zfs/nfs server
>>
>> Is this a familiar bug?
>> {{{
>> ------------[ cut here ]------------
>> kernel BUG at fs/nfs/write.c:252!
>> invalid opcode: 0000 [#1] SMP
>> last sysfs file: /sys/class/infiniband/mlx4_0/ports/1/gids/0
>> CPU 2
>> Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler autofs4 fuse
>> dm_snapshot dm_mirror dm_region_hash dm_log dm_mod mptctl rdma_ucm
>> rdma_cm
>> iw_cm ib_addr ib_ipoib inet_lro ib_ucm ib_cm ib_sa ib_uverbs ib_umad
>> mlx4_ib ib_mad ib_core dcdbas ehci_hcd uhci_hcd mlx4_core bnx2 crc32
>> Pid: 262, comm: pdflush Not tainted 2.6.28.7-sara1 #1
>> RIP: 0010:[] []
>> nfs_do_writepage+0x107/0x1a0
>> RSP: 0000:ffff88043e0f7b10 EFLAGS: 00010202
>> RAX: 0000000000000001 RBX: ffffe2000e5c63e8 RCX: 0000000000000015
>> RDX: 0000000000000000 RSI: 0000000000600020 RDI: ffff8804354a7550
>> RBP: ffff88043e0f7b40 R08: ffff880435597268 R09: ffff8804386ea140
>> R10: 0000000000000000 R11: 0000000000000000 R12: ffff8804386ea140
>> R13: ffff8804354a769c R14: ffffe2000e5c63e8 R15: ffff8804354a75e8
>> FS: 0000000000000000(0000) GS:ffff88043f846840(0000) knlGS:
>> 0000000000000000
>> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
>> CR2: 000000005555c18c CR3: 0000000000201000 CR4: 00000000000406e0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Process pdflush (pid: 262, threadinfo ffff88043e0f6000, task
>> ffff88043faf9270)
>> Stack:
>> ffff88043e0f7c90 ffffe2000e5c63e8 ffffe2000e5c63e8 ffff88043e0f7be0
>> 0000000000000001 0000000000000002 ffff88043e0f7b60 ffffffff803096a9
>> ffffe2000e5c63e8 0000000000000001 ffff88043e0f7c80 ffffffff80272707
>> Call Trace:
>> [] nfs_writepages_callback+0x19/0x30
>> [] write_cache_pages+0x227/0x460
>> [] ? nfs_writepages_callback+0x0/0x30
>> [] ? nfs_flush_one+0xb1/0xf0
>> [] nfs_writepages+0xa2/0xf0
>> [] ? nfs_flush_one+0x0/0xf0
>> [] do_writepages+0x28/0x50
>> [] __writeback_single_inode+0x9b/0x470
>> [] ? update_curr+0xd0/0x120
>> [] ? dequeue_entity+0x18/0x190
>> [] generic_sync_sb_inodes+0x3a0/0x4d0
>> [] writeback_inodes+0x4e/0xf0
>> [] wb_kupdate+0xa4/0x130
>> [] pdflush+0x10e/0x1f0
>> [] ? wb_kupdate+0x0/0x130
>> [] ? pdflush+0x0/0x1f0
>> [] kthread+0x49/0x90
>> [] child_rip+0xa/0x11
>> [] ? kthread+0x0/0x90
>> [] ? child_rip+0x0/0x11
>> Code: b4 00 00 00 31 db 48 83 c4 08 89 d8 5b 41 5c 41 5d 41 5e 41
>> 5f c9 c3
>> 0f 1f 44 00 00 41 f6 44 24 40 02 74 0b 41 fe 87 b4 00 00 00 <0f> 0b
>> eb fe
>> 4c 89 f7 e8 2d 8c f6 ff 85 c0 75 72 49 8b 46 18 ba
>> RIP [] nfs_do_writepage+0x107/0x1a0
>> RSP
>> ---[ end trace 4fac3d44a611662b ]---
>> }}}
>
> Would this be occurring when you're doing mmap() writes? If so I might
> have an idea about what's wrong.
>

We run burn-in tests on our new hardware, and as part of them we start:
 * http://boinc.berkeley.edu

I do not know whether it uses mmap(). I will have to check the source.

Regards
--
Bas van der Vlies
basv-mYZPGKKnAUw@public.gmane.org
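
The mmap() question can probably also be answered without reading the BOINC source, by looking at a running process. What follows is only a minimal sketch, assuming a Linux client where /proc/<pid>/maps is available; the function name is made up for illustration. It lists file-backed mappings that are both writable and shared (permission string such as "rw-s"), which is what a MAP_SHARED mapping with PROT_WRITE looks like. It only sees mappings that exist at the moment the file is read.

```python
def shared_writable_file_mappings(maps_text):
    """Return the paths of file-backed mappings that are writable and shared.

    maps_text is the content of /proc/<pid>/maps. Each line has the form:
        address perms offset dev inode [pathname]
    (hypothetical helper, not part of any existing tool)
    """
    paths = []
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # no pathname: anonymous mapping
        perms = fields[1]
        path = fields[5].strip()
        # perms looks like "rw-s": 'w' = writable, trailing 's' = shared
        if "w" in perms and perms.endswith("s") and path.startswith("/"):
            paths.append(path)
    return paths
```

To use it, read /proc/<pid>/maps for the pid of one of the burn-test worker processes and pass the text to the function. Any path on the NFS mount in the result would suggest that mmap() writes are involved.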