From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from cloud1-vm154.de-nserver.de ([178.250.10.56]:59904 "EHLO cloud1-vm154.de-nserver.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751225AbdCMH0Z (ORCPT ); Mon, 13 Mar 2017 03:26:25 -0400
Subject: Re: [PATCH v7 1/2] btrfs: Fix metadata underflow caused by btrfs_reloc_clone_csum error
To: Qu Wenruo, "linux-btrfs@vger.kernel.org", fdmanana@suse.com
References: <20170308022552.14686-1-quwenruo@cn.fujitsu.com> <03d57894-e8df-1001-b57a-daa205037f1c@cn.fujitsu.com>
From: Stefan Priebe - Profihost AG
Message-ID: <899cc443-07ab-eb7a-5658-f68d8b8b3ce7@profihost.ag>
Date: Mon, 13 Mar 2017 08:26:22 +0100
MIME-Version: 1.0
In-Reply-To: <03d57894-e8df-1001-b57a-daa205037f1c@cn.fujitsu.com>
Content-Type: text/plain; charset=utf-8
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

Hi Qu,

On 13.03.2017 at 02:16, Qu Wenruo wrote:
>
> At 03/13/2017 04:49 AM, Stefan Priebe - Profihost AG wrote:
>> Hi Qu,
>>
>> while V5 was running fine against the openSUSE-42.2 kernel (based on
>> v4.4).
>
> Thanks for the test.
>
>> V7 results in an oops for me:
>> BUG: unable to handle kernel NULL pointer dereference at 00000000000001f0
>
> This 0x1f0 is the same as offsetof(struct btrfs_root, fs_info), quite
> a nice clue.
>
>> IP: [] __endio_write_update_ordered+0x33/0x140 [btrfs]
>
> IP points to:
> ---
> static inline bool btrfs_is_free_space_inode(struct btrfs_inode *inode)
> {
>         struct btrfs_root *root = inode->root; << Either here
>
>         if (root == root->fs_info->tree_root && << Or here
>             btrfs_ino(inode) != BTRFS_BTREE_INODE_OBJECTID)
>
> ---
>
> Taking the above offset into consideration, only the latter case is
> possible.
>
> So here, we have a btrfs_inode whose @root is NULL.

But wasn't this part of the code identical in V5? Why does it only
happen with V7?

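To make the quoted offset argument concrete, here is a minimal user-space
sketch. It is not kernel code: struct demo_root and struct demo_fs_info are
hypothetical stand-ins, padded so that fs_info lands at 0x1f0 as in the oops;
the real struct btrfs_root layout depends on kernel version and config. It
only illustrates why reading root->fs_info through a NULL root would fault at
address 0x1f0, the address reported in the BUG line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct btrfs_fs_info (assumption, not the real type). */
struct demo_fs_info {
	void *tree_root;
};

/*
 * Hypothetical stand-in for struct btrfs_root. The padding is chosen only so
 * that fs_info sits at offset 0x1f0, matching the value quoted in the oops.
 */
struct demo_root {
	unsigned char pad[0x1f0];
	struct demo_fs_info *fs_info;
};

/* Compile-time check that the mock layout matches the quoted offset. */
_Static_assert(offsetof(struct demo_root, fs_info) == 0x1f0,
	       "demo layout must place fs_info at 0x1f0");

/* Address the CPU tries to read when loading a member through a base pointer. */
static uintptr_t member_load_address(uintptr_t base, size_t member_offset)
{
	return base + member_offset;
}

/*
 * Returns non-zero if the faulting address is what a load of root->fs_info
 * would touch when root itself is NULL.
 */
static int fault_matches_null_root(uintptr_t fault_addr)
{
	return fault_addr ==
	       member_load_address(0, offsetof(struct demo_root, fs_info));
}
```

With this mock layout, fault_matches_null_root(0x1f0) is true, which is the
same reasoning that points at the root->fs_info load rather than the
inode->root load.
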
> This can be fixed easily by checking @root inside
> btrfs_is_free_space_inode(), as the backtrace shows that it's only
> happening for DirectIO, and it won't happen for a free space cache inode.
>
> But I'd rather find out how this happened so we can make a more accurate
> fix; otherwise we could have other NULL pointer accesses.
>
> Do you have any reproducer for this?

Sorry, no. This is a production MariaDB server running btrfs with
compress-force=zlib.

But if I can test anything, I will.

Greets,
Stefan

>
> Thanks,
> Qu
>
>> PGD 14e18d4067 PUD 14e1868067 PMD 0
>> Oops: 0000 [#1] SMP
>> Modules linked in: netconsole xt_multiport ipt_REJECT nf_reject_ipv4
>> xt_set iptable_filter ip_tables x_tables ip_set_hash_net ip_set
>> nfnetlink crc32_pclmul button loop btrfs xor usbhid raid6_pq ata_generic
>> virtio_blk virtio_net uhci_hcd ehci_hcd i2c_piix4 usbcore virtio_pci
>> i2c_core usb_common ata_piix floppy
>> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.52+112-ph #1
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>> 1.7.5-20140722_172050-sagunt 04/01/2014
>> task: ffffffffb4e0f500 ti: ffffffffb4e00000 task.ti: ffffffffb4e00000
>> RIP: 0010:[] []
>> __endio_write_update_ordered+0x33/0x140 [btrfs]
>> RSP: 0018:ffff8814eae03cd8 EFLAGS: 00010086
>> RAX: 0000000000000000 RBX: ffff8814e8fd5aa8 RCX: 0000000000000001
>> RDX: 0000000000100000 RSI: 0000000000100000 RDI: ffff8814e45885c0
>> RBP: ffff8814eae03d10 R08: ffff8814e8334000 R09: 000000018040003a
>> R10: ffffea00507d8d00 R11: ffff88141f634080 R12: ffff8814e45885c0
>> R13: ffff8814e125d700 R14: 0000000000100000 R15: ffff8800376c6a80
>> FS: 0000000000000000(0000) GS:ffff8814eae00000(0000)
>> knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00000000000001f0 CR3: 00000014e34c9000 CR4: 00000000001406f0
>> Stack:
>> 0000000000000000 0000000000100000 ffff8814e8fd5aa8 ffff8814e953f3c0
>> ffff8814e125d700 0000000000100000 ffff8800376c6a80 ffff8814eae03d38
>> ffffffffc03ddf67 ffff8814e86b6a80
>> ffff8814e8fd5aa8 0000000000000001
>> Call Trace:
>> [] btrfs_endio_direct_write+0x37/0x60 [btrfs]
>> [] bio_endio+0x57/0x60
>> [] btrfs_end_bio+0xa1/0x140 [btrfs]
>> [] bio_endio+0x57/0x60
>> [] blk_update_request+0x8b/0x330
>> [] blk_mq_end_request+0x1a/0x70
>> [] virtblk_request_done+0x3f/0x70 [virtio_blk]
>> [] __blk_mq_complete_request+0x78/0xe0
>> [] blk_mq_complete_request+0x1c/0x20
>> [] virtblk_done+0x64/0xe0 [virtio_blk]
>> [] vring_interrupt+0x3a/0x90
>> [] __handle_irq_event_percpu+0x89/0x1b0
>> [] handle_irq_event_percpu+0x23/0x60
>> [] handle_irq_event+0x3b/0x60
>> [] handle_edge_irq+0x6f/0x150
>> [] handle_irq+0x1d/0x30
>> [] do_IRQ+0x4b/0xd0
>> [] common_interrupt+0x8c/0x8c
>> DWARF2 unwinder stuck at ret_from_intr+0x0/0x1b
>> Leftover inexact backtrace:
>> 2017-03-12 20:33:08
>> 2017-03-12 20:33:08 [] ? native_safe_halt+0x6/0x10
>> [] default_idle+0x1e/0xe0
>> [] arch_cpu_idle+0xf/0x20
>> [] default_idle_call+0x3b/0x40
>> [] cpu_startup_entry+0x29a/0x370
>> [] rest_init+0x7c/0x80
>> [] start_kernel+0x490/0x49d
>> [] ? early_idt_handler_array+0x120/0x120
>> [] x86_64_start_reservations+0x2a/0x2c
>> [] x86_64_start_kernel+0x13b/0x14a
>> Code: e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 ec 10 48 8b 87 70 fc
>> ff ff 4c 8b 87 38 fe ff ff 48 c7 45 c8 00 00 00 00 48 89 75 d0 <48> 8b
>> b8 f0 01 00 00 48 3b 47 28 49 8b 84 24 78 fc ff ff 0f 84
>> RIP [] __endio_write_update_ordered+0x33/0x140 [btrfs]
>> RSP
>> CR2: 00000000000001f0
>> ---[ end trace 7529a0652fd7873e ]---
>> Kernel panic - not syncing: Fatal exception in interrupt
>> Kernel Offset: 0x33000000 from 0xffffffff81000000 (relocation range:
>> 0xffffffff80000000-0xffffffffbfffffff)
>>
>> Greets,
>> Stefan
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>>
>
>