From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1750919AbVLFQId (ORCPT ); Tue, 6 Dec 2005 11:08:33 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1750847AbVLFQId (ORCPT ); Tue, 6 Dec 2005 11:08:33 -0500
Received: from solarneutrino.net ([66.199.224.43]:38661 "EHLO tau.solarneutrino.net") by vger.kernel.org with ESMTP id S1750832AbVLFQIc (ORCPT ); Tue, 6 Dec 2005 11:08:32 -0500
Date: Tue, 6 Dec 2005 11:08:15 -0500
To: Hugh Dickins
Cc: Linus Torvalds, Kai Makisara, Andrew Morton, James Bottomley, linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org, ryan@tau.solarneutrino.net
Subject: Re: Fw: crash on x86_64 - mm related?
Message-ID: <20051206160815.GC11560@tau.solarneutrino.net>
References: <20051129092432.0f5742f0.akpm@osdl.org> <20051201195657.GB7236@tau.solarneutrino.net> <20051202180326.GB7634@tau.solarneutrino.net> <20051202194447.GA7679@tau.solarneutrino.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.9i
From: Ryan Richter
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Another crash during last night's backup:

Bad page state at free_hot_cold_page (in process 'taper', page ffff810002410cc0)
flags:0x010000000000000c mapping:ffff81017d5beb70 mapcount:2 count:0
Backtrace:
Call Trace:{bad_page+99} {free_hot_cold_page+101}
       {__page_cache_release+151} {sgl_unmap_user_pages+120}
       {release_buffering+27} {st_write+1697}
       {vfs_write+198} {sys_write+83}
       {system_call+126}
Trying to fix it up, but a reboot is needed
Bad page state at free_hot_cold_page (in process 'taper', page ffff810002410cc0)
flags:0x010000000000081c mapping:ffff810064cd6910 mapcount:0 count:0
Backtrace:
Call Trace:{bad_page+99} {free_hot_cold_page+101}
       {__page_cache_release+151} {sgl_unmap_user_pages+120}
       {release_buffering+27} {st_write+1697}
       {vfs_write+198} {sys_write+83}
       {system_call+126}
Trying to fix it up, but a reboot is needed
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at include/linux/mm.h:341
invalid operand: 0000 [1] SMP
CPU 1
Modules linked in: bonding
Pid: 11402, comm: taper Tainted: G B 2.6.14.3 #1
RIP: 0010:[] {sgl_unmap_user_pages+93}
RSP: 0018:ffff8100ca6d9e18 EFLAGS: 00010256
RAX: 0000000000000000 RBX: 0000000000000007 RCX: 0000000000000005
RDX: 00000000000000e0 RSI: 0000000000000001 RDI: ffff810002410cc0
RBP: ffff810004990068 R08: 00000000ffffffff R09: 0000000000000000
R10: 0000000000008000 R11: 0000000000000200 R12: 0000000000000008
R13: 0000000000000000 R14: 0000000000008000 R15: ffff810004953d10
FS: 00002aaaab53d880(0000) GS:ffffffff804db880(0000) knlGS:00000000555bc920
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aaaaab5ffff CR3: 000000008c49e000 CR4: 00000000000006e0
Process taper (pid: 11402, threadinfo ffff8100ca6d8000, task ffff81017dabc200)
Stack: ffff81017ff1e600 ffff810004990000 0000000000000040 0000000000008000
       ffff810004953c00 ffffffff802b497b ffff810004990000 ffffffff802b5031
       ffff810000000000 ffffffff00000001
Call Trace:{release_buffering+27} {st_write+1697}
       {vfs_write+198} {sys_write+83}
       {system_call+126}

Code: 0f 0b 68 3a 13 3a 80 c2 55 01 f0 83 47 08 ff 0f 98 c0 84 c0
RIP {sgl_unmap_user_pages+93} RSP
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at mm/rmap.c:487
invalid operand: 0000 [2] SMP
CPU 1
Modules linked in: bonding
Pid: 11402, comm: taper Tainted: G B 2.6.14.3 #1
RIP: 0010:[] {page_remove_rmap+39}
RSP: 0018:ffff8100ca6d9ab0 EFLAGS: 00010286
RAX: 00000000ffffffff RBX: ffff8100cc4e96f8 RCX: ffff81000000f000
RDX: 0000000000000000 RSI: 800000005bba8067 RDI: ffff810002410cc0
RBP: 00002aaaaaadf000 R08: 0000000000000000 R09: ffff810002802738
R10: 00000000fffffffa R11: 0000000000000000 R12: ffff810101c22380
R13: 800000005bba8067 R14: ffff810002410cc0 R15: 0000000000000000
FS: 00002aaaab53d880(0000) GS:ffffffff804db880(0000) knlGS:00000000555bc920
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aaaaab5ffff CR3: 000000008c49e000 CR4: 00000000000006e0
Process taper (pid: 11402, threadinfo ffff8100ca6d8000, task ffff81017dabc200)
Stack: ffffffff80166f0d 00002aaaaab62000 ffff81006d16faa8 00002aaaaab62000
       00002aaaaab62000 00002aaaaab61fff ffff8100b424c550 00002aaaaab62000
       ffffffff801671c0 ffff8100ca6d9d68
Call Trace:{zap_pte_range+477} {unmap_page_range+496}
       {unmap_vmas+293} {exit_mmap+162}
       {mmput+49} {do_exit+438}
       {die+81} {do_invalid_op+159}
       {sgl_unmap_user_pages+93} {thread_return+86}
       {sym_setup_data_and_start+402} {error_exit+0}
       {sgl_unmap_user_pages+93} {sgl_unmap_user_pages+120}
       {release_buffering+27} {st_write+1697}
       {vfs_write+198} {sys_write+83}
       {system_call+126}

Code: 0f 0b 68 1b 36 3a 80 c2 e7 01 48 c7 c6 ff ff ff ff bf 20 00
RIP {page_remove_rmap+39} RSP
<1>Fixing recursive fault but reboot is needed!
Bad page state at prep_new_page (in process 'dumper', page ffff810002410cc0)
flags:0x010000000000001c mapping:0000000000000000 mapcount:-1 count:0
Backtrace:
Call Trace:{bad_page+99} {prep_new_page+65}
       {buffered_rmqueue+302} {__alloc_pages+261}
       {generic_file_buffered_write+413} {inode_update_time+62}
       {__generic_file_aio_write_nolock+936} {sock_common_recvmsg+52}<0>Bad page state at free_hot_cold_page (in process 'dump', page ffff810002410cc0)
flags:0x010000000000001c mapping:0000000000000000 mapcount:-1 count:0
Backtrace:
Call Trace:{bad_page+99} {free_hot_cold_page+101}
       {__pagevec_free+32} {release_pages+349}
       {journal_stop+512} {__ext3_journal_stop+45}
       {__pagevec_release+23} {mpage_writepages+690}
       {ext3_ordered_writepage+0} {__sync_single_inode+114}
       {__writeback_single_inode+355} {find_get_pages_tag+127}
       {pagevec_lookup_tag+26} {wait_on_page_writeback_range+220}
       {sync_sb_inodes+481} {sync_inodes_sb+132}
       {__sync_inodes+116} {sync_inodes+17}
       {do_sync+18} {sys_sync+14}
       {system_call+126}
Trying to fix it up, but a reboot is needed
       {sock_aio_read+272} {generic_file_aio_write+110}
       {ext3_file_write+35} {do_sync_write+211}
       {__pollwait+0} {autoremove_wake_function+0}
       {sys_select+1153} {vfs_write+198}
       {sys_write+83} {system_call+126}
Trying to fix it up, but a reboot is needed
general protection fault: 0000 [3] SMP
CPU 0
Modules linked in: bonding
Pid: 1303, comm: kjournald Tainted: G B 2.6.14.3 #1
RIP: 0010:[] {cache_alloc_refill+330}
RSP: 0018:ffff81017ed4dc38 EFLAGS: 00010012
RAX: cb0f489d027409f5 RBX: 000000000000002c RCX: 000000000000000f
RDX: ffff810004820760 RSI: ffff81005bba8000 RDI: ffff81001a184030
RBP: ffff81000483a400 R08: ffff810004820750 R09: ffff810004820760
R10: 0000000000000180 R11: 0000000000000000 R12: ffff810004820740
R13: ffff810004834440 R14: ffff810004820788 R15: 0000000000011200
FS: 00002aaaab00c4a0(0000) GS:ffffffff804db800(0000) knlGS:00000000555bc920
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aaaaaac0000 CR3: 00000000a21ae000 CR4: 00000000000006e0
Process kjournald (pid: 1303, threadinfo ffff81017ed4c000, task ffff8101028bcf40)
Stack: ffffffff8014a2f0 0000000000011200 0000000000011210 ffff810004820ec0
       0000000000000010 ffff81017eda3000 0000000000000200 ffffffff80160266
       0000000000000296 ffffffff80159669
Call Trace:{autoremove_wake_function+0} {kmem_cache_alloc+54}
       {mempool_alloc+57} {wake_bit_function+0}
       {bio_alloc_bioset+28} {bio_alloc+16}
       {submit_bh+150} {ll_rw_block+143}
       {journal_commit_transaction+1114} {thread_return+0}
       {thread_return+86} {lock_timer_base+41}
       {kjournald+238} {autoremove_wake_function+0}
       {autoremove_wake_function+0} {commit_timeout+0}
       {child_rip+8} {kjournald+0}
       {child_rip+0}

Code: 48 89 50 08 48 89 02 48 c7 46 08 00 02 20 00 83 7e 24 ff 48
RIP {cache_alloc_refill+330} RSP
NMI Watchdog detected LOCKUP on CPU 1
CPU 1
Modules linked in: bonding
Pid: 918, comm: md0_raid5 Tainted: G B 2.6.14.3 #1
RIP: 0010:[] {.text.lock.spinlock+116}
RSP: 0018:ffff8101028a1c90 EFLAGS: 00000086
RAX: ffff810004820740 RBX: ffff810004820788 RCX: 000000000000000c
RDX: 0000000000000001 RSI: ffff81000000f000 RDI: ffff810004820788
RBP: ffff810004834440 R08: ffff8100048208c0 R09: 0000000000000000
R10: 00000000ffffffff R11: 0000000000000001 R12: 0000000000000000
R13: ffff8101028547c0 R14: ffff8101028547d0 R15: 0000000000000001
FS: 00002aaaab53d8e0(0000) GS:ffffffff804db880(0000) knlGS:00000000555bc920
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002aaaaaac0000 CR3: 00000000a21af000 CR4: 00000000000006e0
Process md0_raid5 (pid: 918, threadinfo ffff8101028a0000, task ffff8101028bc1c0)

(serial console output cut off here)

-ryan