* Null pointer oops
@ 2014-08-13  5:02 Larkin Lowrey
       [not found] ` <CALJ65z=25CrrO9uMc2vfYVAQWb=6eK+OhB5TGJJrCp=D4ALvrQ@mail.gmail.com>
  0 siblings, 1 reply; 13+ messages in thread
From: Larkin Lowrey @ 2014-08-13  5:02 UTC (permalink / raw)
  To: linux-bcache

I got an oops while doing some heavy I/O. I have an md raid10 cache
device (4 SSDs) and 3 md raid5/6 backing devices. This setup has been
well behaved for about 6 months.

If this isn't a known issue is there anything I can do to provide more
useful information?

I'm running kernel 3.15.8-200.fc20.x86_64.

[210884.047249] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[210884.055605] IP: [<ffffffffa01625fc>] bch_btree_node_read_done+0x4c/0x450 [bcache]
[210884.063723] PGD 0
[210884.066053] Oops: 0002 [#1] SMP
[210884.069610] Modules linked in: lp parport binfmt_misc ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM iptable_mangle tun bridge stp llc xt_multiport ebtable_nat ebtables hwmon_vid ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter xt_conntrack ip6_tables nf_conntrack keyspan ezusb kvm_amd kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel microcode serio_raw amd64_edac_mod edac_core fam15h_power k10temp edac_mce_amd sp5100_tco i2c_piix4 igb ptp pps_core dca shpchp acpi_cpufreq btrfs bcache raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid10 i2c_algo_bit drm_kms_helper ttm drm i2c_core mpt2sas mvsas libsas raid_class scsi_transport_sas cpufreq_stats
[210884.140704] CPU: 5 PID: 11188 Comm: kworker/5:1 Not tainted 3.15.8-200.fc20.x86_64 #1
[210884.149069] Hardware name: /H8DG6/H8DGi, BIOS 3.0a 07/2
[210884.155280] Workqueue: bcache cache_lookup [bcache]
[210884.160531] task: ffff880218633160 ti: ffff8800217b8000 task.ti: ffff8800217b8000
[210884.168502] RIP: 0010:[<ffffffffa01625fc>] [<ffffffffa01625fc>] bch_btree_node_read_done+0x4c/0x450 [bcache]
[210884.179105] RSP: 0000:ffff8800217bbbe8 EFLAGS: 00010212
[210884.184806] RAX: 0000000000000400 RBX: ffff880245ec0000 RCX: 0000000000000000
[210884.192480] RDX: 0000000000000000 RSI: ffff880418380000 RDI: 0000000000000246
[210884.200075] RBP: ffff8800217bbc10 R08: 0000000000000000 R09: 0000000000000f6b
[210884.207738] R10: 0000000000000000 R11: 0000000000000400 R12: ffff880413d06c00
[210884.215391] R13: 0000000000000000 R14: ffff8800217bbc20 R15: ffff880413d06c00
[210884.222961] FS: 00007f73bacd6880(0000) GS:ffff88021fd40000(0000) knlGS:0000000000000000
[210884.231516] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[210884.237557] CR2: 0000000000000008 CR3: 0000000001c11000 CR4: 00000000000407e0
[210884.245131] Stack:
[210884.247395] ffff880274f4d020 ffff880413d06c00 0000bfcc44a463f8 ffff8800217bbc20
[210884.255337] ffff880413d06c00 ffff8800217bbc78 ffffffffa0162b68 0000000000000000
[210884.263256] ffff880218633160 0000000000000000 0000000000000000 0000000000000000
[210884.271234] Call Trace:
[210884.273985] [<ffffffffa0162b68>] bch_btree_node_read+0x168/0x190 [bcache]
[210884.281258] [<ffffffffa0163f69>] bch_btree_node_get+0x169/0x290 [bcache]
[210884.288377] [<ffffffffa01642f5>] bch_btree_map_keys_recurse+0xd5/0x1d0 [bcache]
[210884.296311] [<ffffffffa016dcb0>] ? cached_dev_congested+0x180/0x180 [bcache]
[210884.303953] [<ffffffff8135b204>] ? call_rwsem_down_read_failed+0x14/0x30
[210884.311158] [<ffffffffa01673f7>] bch_btree_map_keys+0x127/0x150 [bcache]
[210884.318273] [<ffffffffa016dcb0>] ? cached_dev_congested+0x180/0x180 [bcache]
[210884.325826] [<ffffffffa016e7f5>] cache_lookup+0xf5/0x1f0 [bcache]
[210884.332325] [<ffffffff810a4af6>] process_one_work+0x176/0x430
[210884.338427] [<ffffffff810a578b>] worker_thread+0x11b/0x3a0
[210884.344282] [<ffffffff810a5670>] ? rescuer_thread+0x3b0/0x3b0
[210884.350447] [<ffffffff810ac528>] kthread+0xd8/0xf0
[210884.355615] [<ffffffff810ac450>] ? insert_kthread_work+0x40/0x40
[210884.362017] [<ffffffff816ff93c>] ret_from_fork+0x7c/0xb0
[210884.367756] [<ffffffff810ac450>] ? insert_kthread_work+0x40/0x40
[210884.374234] Code: 08 01 00 00 48 8b b8 58 cb 00 00 e8 bf 25 01 e1 49 8b b4 24 80 00 00 00 49 89 c5 31 d2 0f b7 86 32 04 00 00 66 f7 b6 30 04 00 00 <49> c7 45 08 00 00 00 00 0f b7 c0 49 89 45 00 48 8b 43 10 48 85
[210884.395405] RIP [<ffffffffa01625fc>] bch_btree_node_read_done+0x4c/0x450 [bcache]
[210884.403389] RSP <ffff8800217bbbe8>
[210884.407171] CR2: 0000000000000008
[210884.411233] ---[ end trace 0064e6abfd068c85 ]---
[210884.416352] BUG: unable to handle kernel paging request at ffffffffffffffd8
[210884.423871] IP: [<ffffffff810acb10>] kthread_data+0x10/0x20
[210884.429915] PGD 1c14067 PUD 1c16067 PMD 0

--Larkin

^ permalink raw reply	[flat|nested] 13+ messages in thread
[parent not found: <CALJ65z=25CrrO9uMc2vfYVAQWb=6eK+OhB5TGJJrCp=D4ALvrQ@mail.gmail.com>]
* Re: Null pointer oops
       [not found] ` <CALJ65z=25CrrO9uMc2vfYVAQWb=6eK+OhB5TGJJrCp=D4ALvrQ@mail.gmail.com>
@ 2014-08-13 16:40   ` Larkin Lowrey
  2014-08-13 17:41     ` Slava Pestov
  0 siblings, 1 reply; 13+ messages in thread
From: Larkin Lowrey @ 2014-08-13 16:40 UTC (permalink / raw)
  To: Kent Overstreet; +Cc: linux-bcache

This is making me feel very dumb. I've googled extensively but can't
figure out how to run addr2line for a module.

I'm running Fedora 20 and the kernel did not have debugging symbols. I
downloaded the version with symbols but I don't know if the addresses
are going to be the same. Bcache is a module for me and that's where
things get tricky. Do you have any tips?

--Larkin

On 8/13/2014 12:04 AM, Kent Overstreet wrote:
>
> Any chance you could do an addr2line and get me the exact line where
> it happened?
>
> On Aug 12, 2014 10:02 PM, "Larkin Lowrey" <llowrey@nuclearwinter.com> wrote:
>
>     I got an oops while doing some heavy I/O. I have an md raid10 cache
>     device (4 SSDs) and 3 md raid5/6 backing devices. This setup has been
>     well behaved for about 6 months.
>
>     If this isn't a known issue is there anything I can do to provide more
>     useful information?
>
>     I'm running kernel 3.15.8-200.fc20.x86_64.
^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: Null pointer oops
  2014-08-13 16:40   ` Larkin Lowrey
@ 2014-08-13 17:41     ` Slava Pestov
  2014-08-13 18:35       ` Larkin Lowrey
  0 siblings, 1 reply; 13+ messages in thread
From: Slava Pestov @ 2014-08-13 17:41 UTC (permalink / raw)
  To: Larkin Lowrey; +Cc: Kent Overstreet, linux-bcache

You can try to use gdb:

gdb /lib/modules/.../foo.ko

list *(bch_btree_node_read_done+0x4c)

On Wed, Aug 13, 2014 at 9:40 AM, Larkin Lowrey <llowrey@nuclearwinter.com> wrote:
> This is making me feel very dumb. I've googled extensively but can't
> figure out how to run addr2line for a module.
>
> I'm running Fedora 20 and the kernel did not have debugging symbols. I
> downloaded the version with symbols but I don't know if the addresses
> are going to be the same. Bcache is a module for me and that's where
> things get tricky. Do you have any tips?
>
> --Larkin
>
> On 8/13/2014 12:04 AM, Kent Overstreet wrote:
>>
>> Any chance you could do an addr2line and get me the exact line where
>> it happened?
>>
>> On Aug 12, 2014 10:02 PM, "Larkin Lowrey" <llowrey@nuclearwinter.com> wrote:
>>
>>     I got an oops while doing some heavy I/O. I have an md raid10 cache
>>     device (4 SSDs) and 3 md raid5/6 backing devices. This setup has been
>>     well behaved for about 6 months.
>>
>>     If this isn't a known issue is there anything I can do to provide more
>>     useful information?
>>
>>     I'm running kernel 3.15.8-200.fc20.x86_64.
^ permalink raw reply	[flat|nested] 13+ messages in thread
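On the worry about whether the addresses will be the same: the absolute address in the oops (ffffffffa01625fc) depends on where the module happened to be loaded, but the symbol+offset form does not, provided the debuginfo matches the running build. A minimal Python sketch of that arithmetic follows. The helper names and the symbol start address 0x65b0 are assumptions for illustration (the latter stands in for whatever a symbol listing of the module's debug file reports); they are not part of any kernel tooling.

```python
import re

# Matches the "symbol+0xOFFSET/0xSIZE" form printed on the IP:/RIP: lines of an oops.
_LOCATION = re.compile(
    r"(?P<sym>[A-Za-z_.][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_oops_location(text):
    """Return (symbol, offset, size) from an oops location string."""
    m = _LOCATION.search(text)
    if m is None:
        raise ValueError("no symbol+offset/size found in %r" % text)
    return m.group("sym"), int(m.group("off"), 16), int(m.group("size"), 16)


def address_in_file(symbol_start, offset, size):
    """Translate a function-relative offset into an address inside the module file."""
    if not 0 <= offset < size:
        raise ValueError("offset lies outside the function")
    return symbol_start + offset


sym, off, size = parse_oops_location(
    "IP: [<ffffffffa01625fc>] bch_btree_node_read_done+0x4c/0x450 [bcache]"
)

# Assumed start of the symbol inside the module's debug file (hypothetical value).
SYMBOL_START = 0x65B0
print(sym, hex(address_in_file(SYMBOL_START, off, size)))
```

The file-relative address computed this way is the kind of value addr2line takes together with -e pointing at the module's debug file; gdb's `list *(symbol+offset)` performs the same addition itself, which is why it needs no numeric address.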
* Re: Null pointer oops
  2014-08-13 17:41     ` Slava Pestov
@ 2014-08-13 18:35       ` Larkin Lowrey
  2014-08-13 18:45         ` Slava Pestov
  0 siblings, 1 reply; 13+ messages in thread
From: Larkin Lowrey @ 2014-08-13 18:35 UTC (permalink / raw)
  To: Slava Pestov; +Cc: Kent Overstreet, linux-bcache

Thanks. Trying gdb helped me find the answer. I needed to install the
kernel-debuginfo-3.15.8-200.fc20.x86_64 package via yum.

From addr2line:
> bch_btree_node_read_done+0x4c
> drivers/md/bcache/btree.c:207

Here's a snippet from gdb:

> (gdb) list *(bch_btree_node_read_done+0x4c)
> 0x65fc is in bch_btree_node_read_done (drivers/md/bcache/btree.c:207).
> 202         struct bset *i = btree_bset_first(b);
> 203         struct btree_iter *iter;
> 204
> 205         iter = mempool_alloc(b->c->fill_iter, GFP_NOWAIT);
> 206         iter->size = b->c->sb.bucket_size / b->c->sb.block_size;
> 207         iter->used = 0;
> 208
> 209 #ifdef CONFIG_BCACHE_DEBUG
> 210         iter->b = &b->keys;
> 211 #endif

This doesn't make any sense to me. If iter was null I would expect line
206 to blow up first.

--Larkin

On 8/13/2014 12:41 PM, Slava Pestov wrote:
> You can try to use gdb:
>
> gdb /lib/modules/.../foo.ko
>
> list *(bch_btree_node_read_done+0x4c)
>
>
> On Wed, Aug 13, 2014 at 9:40 AM, Larkin Lowrey
> <llowrey@nuclearwinter.com> wrote:
>> This is making me feel very dumb. I've googled extensively but can't
>> figure out how to run addr2line for a module.
>>
>> I'm running Fedora 20 and the kernel did not have debugging symbols. I
>> downloaded the version with symbols but I don't know if the addresses
>> are going to be the same. Bcache is a module for me and that's where
>> things get tricky. Do you have any tips?
>>
>> --Larkin
>>
>> On 8/13/2014 12:04 AM, Kent Overstreet wrote:
>>> Any chance you could do an addr2line and get me the exact line where
>>> it happened?
>>>
>>> On Aug 12, 2014 10:02 PM, "Larkin Lowrey" <llowrey@nuclearwinter.com> wrote:
>>>
>>>     I got an oops while doing some heavy I/O.
^ permalink raw reply	[flat|nested] 13+ messages in thread
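One way to look at the ordering question is to decode the bytes around the `<49>` marker on the oops's Code: line, since that marker is where the faulting instruction starts. The Python sketch below handles only the two encodings that appear there (a REX.W `C7 /0` store of an immediate and a REX.W `89 /r` store of a register, both with an 8-bit displacement). `decode_store` is a hypothetical helper written for this note, not a general x86 decoder.

```python
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_store(code):
    """Decode one REX.W store with a [base + disp8] memory operand.

    Supports only `C7 /0` (mov imm32 -> mem) and `89 /r` (mov reg -> mem).
    Returns (base_register, displacement, source, length_in_bytes).
    """
    rex, opcode, modrm = code[0], code[1], code[2]
    if rex & 0xF8 != 0x48:                      # 0100 1xxx: REX prefix with W set
        raise ValueError("expected a REX.W prefix")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:
        raise ValueError("only [base + disp8] without SIB is handled")
    base = REG64[rm | ((rex & 1) << 3)]         # REX.B extends the base register
    disp = code[3] - 256 if code[3] > 127 else code[3]
    if opcode == 0xC7 and reg == 0:
        imm = int.from_bytes(code[4:8], "little", signed=True)
        return base, disp, "$%#x" % imm, 8
    if opcode == 0x89:
        src = REG64[reg | ((rex & 4) << 1)]     # REX.R extends the source register
        return base, disp, src, 4
    raise ValueError("unsupported opcode %#x" % opcode)


# Tail of the Code: line from the oops, starting at the faulting byte.
CODE_TAIL = "<49> c7 45 08 00 00 00 00 0f b7 c0 49 89 45 00 48 8b 43 10 48 85"
raw = bytes(int(tok.strip("<>"), 16) for tok in CODE_TAIL.split())

first = decode_store(raw)                       # instruction at the faulting RIP
# Skip the 3-byte `0f b7 c0` (a movzwl %ax,%eax) to reach the next store.
second = decode_store(raw[first[3] + 3:])

R13 = 0x0                                       # value from the oops register dump
print(first, hex(R13 + first[1]))
print(second, hex(R13 + second[1]))
```

Read this way, the store to offset 8 of %r13 comes before the store to offset 0, and with R13 shown as 0 in the register dump the first store lands on address 8, which matches CR2. That would be consistent with iter itself being NULL (mempool_alloc is allowed to fail rather than sleep when the gfp mask does not permit waiting) and with the compiler having emitted the line 207 store ahead of the line 206 store, which has to wait for the division result.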
* Re: Null pointer oops
  2014-08-13 18:35       ` Larkin Lowrey
@ 2014-08-13 18:45         ` Slava Pestov
  2014-08-13 21:21           ` Larkin Lowrey
  0 siblings, 1 reply; 13+ messages in thread
From: Slava Pestov @ 2014-08-13 18:45 UTC (permalink / raw)
  To: Larkin Lowrey; +Cc: Kent Overstreet, linux-bcache

Can you post the disassembly of the function?

On Wed, Aug 13, 2014 at 11:35 AM, Larkin Lowrey
<llowrey@nuclearwinter.com> wrote:
> Thanks. Trying gdb helped me find the answer. I needed to install the
> kernel-debuginfo-3.15.8-200.fc20.x86_64 package via yum.
>
> From addr2line:
>> bch_btree_node_read_done+0x4c
>> drivers/md/bcache/btree.c:207
>
> Here's a snippet from gdb:
>
>> (gdb) list *(bch_btree_node_read_done+0x4c)
>> 0x65fc is in bch_btree_node_read_done (drivers/md/bcache/btree.c:207).
>> 202         struct bset *i = btree_bset_first(b);
>> 203         struct btree_iter *iter;
>> 204
>> 205         iter = mempool_alloc(b->c->fill_iter, GFP_NOWAIT);
>> 206         iter->size = b->c->sb.bucket_size / b->c->sb.block_size;
>> 207         iter->used = 0;
>> 208
>> 209 #ifdef CONFIG_BCACHE_DEBUG
>> 210         iter->b = &b->keys;
>> 211 #endif
>
> This doesn't make any sense to me. If iter was null I would expect line
> 206 to blow up first.
>
> --Larkin
>
> On 8/13/2014 12:41 PM, Slava Pestov wrote:
>> You can try to use gdb:
>>
>> gdb /lib/modules/.../foo.ko
>>
>> list *(bch_btree_node_read_done+0x4c)
>>
>>
>> On Wed, Aug 13, 2014 at 9:40 AM, Larkin Lowrey
>> <llowrey@nuclearwinter.com> wrote:
>>> This is making me feel very dumb. I've googled extensively but can't
>>> figure out how to run addr2line for a module.
>>>
>>> I'm running Fedora 20 and the kernel did not have debugging symbols. I
>>> downloaded the version with symbols but I don't know if the addresses
>>> are going to be the same. Bcache is a module for me and that's where
>>> things get tricky. Do you have any tips?
^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: Null pointer oops 2014-08-13 18:45 ` Slava Pestov @ 2014-08-13 21:21 ` Larkin Lowrey 2014-08-13 21:25 ` Slava Pestov 0 siblings, 1 reply; 13+ messages in thread From: Larkin Lowrey @ 2014-08-13 21:21 UTC (permalink / raw) To: Slava Pestov; +Cc: Kent Overstreet, linux-bcache

Here's the disassembly of bch_btree_node_read_done. The offending line
is 207 and the instruction is at offset 76.

--Larkin

199 void bch_btree_node_read_done(struct btree *b)
200 {
   0x00000000000065b0 <+0>:     callq  0x65b5 <bch_btree_node_read_done+5>
   0x00000000000065b5 <+5>:     push   %rbp
   0x00000000000065b8 <+8>:     mov    %rsp,%rbp
   0x00000000000065bb <+11>:    push   %r15
   0x00000000000065bd <+13>:    push   %r14
   0x00000000000065bf <+15>:    push   %r13
   0x00000000000065c1 <+17>:    push   %r12
   0x00000000000065c3 <+19>:    mov    %rdi,%r12
   0x00000000000065c6 <+22>:    push   %rbx

201     const char *err = "bad btree header";
   0x0000000000006800 <+592>:   mov    $0x0,%rdx

202     struct bset *i = btree_bset_first(b);
203     struct btree_iter *iter;
204
205     iter = mempool_alloc(b->c->fill_iter, GFP_NOWAIT);
   0x00000000000065b6 <+6>:     xor    %esi,%esi
   0x00000000000065c7 <+23>:    mov    0x80(%rdi),%rax
   0x00000000000065d5 <+37>:    mov    0xcb58(%rax),%rdi
   0x00000000000065dc <+44>:    callq  0x65e1 <bch_btree_node_read_done+49>
   0x00000000000065e9 <+57>:    mov    %rax,%r13

206     iter->size = b->c->sb.bucket_size / b->c->sb.block_size;
   0x00000000000065e1 <+49>:    mov    0x80(%r12),%rsi
   0x00000000000065ec <+60>:    xor    %edx,%edx
   0x00000000000065ee <+62>:    movzwl 0x432(%rsi),%eax
   0x00000000000065f5 <+69>:    divw   0x430(%rsi)
   0x0000000000006604 <+84>:    movzwl %ax,%eax
   0x0000000000006607 <+87>:    mov    %rax,0x0(%r13)

207     iter->used = 0;
   0x00000000000065fc <+76>:    movq   $0x0,0x8(%r13)

208
209 #ifdef CONFIG_BCACHE_DEBUG
210     iter->b = &b->keys;
211 #endif
212
213     if (!i->seq)
   0x000000000000660b <+91>:    mov    0x10(%rbx),%rax
   0x000000000000660f <+95>:    test   %rax,%rax
   0x0000000000006612 <+98>:    je     0x6800 <bch_btree_node_read_done+592>

214         goto err;
215
216     for (;
   0x000000000000664d <+157>:   cmp    %r9d,%ecx
   0x0000000000006650 <+160>:   jae    0x6882 <bch_btree_node_read_done+722>
   0x0000000000006744 <+404>:   cmp    %r9d,%r10d
   0x0000000000006747 <+407>:   jae    0x6898 <bch_btree_node_read_done+744>

217          b->written < btree_blocks(b) && i->seq == b->keys.set[0].data->seq;
   0x0000000000006618 <+104>:   mov    0x80(%r12),%rsi
   0x0000000000006625 <+117>:   movzwl 0xc0(%r12),%edi
   0x000000000000662e <+126>:   mov    0x108(%r12),%r8
   0x0000000000006636 <+134>:   movzwl 0xde2(%rsi),%ecx
   0x0000000000006644 <+148>:   mov    %rdx,%r9
   0x0000000000006647 <+151>:   shr    %cl,%r9
   0x000000000000664a <+154>:   movzwl %di,%ecx
   0x0000000000006656 <+166>:   cmp    0x10(%r8),%rax
   0x000000000000665a <+170>:   jne    0x6882 <bch_btree_node_read_done+722>
   0x000000000000670f <+351>:   mov    %rdx,%r9
   0x000000000000672a <+378>:   movzwl 0xde2(%rsi),%ecx
   0x0000000000006738 <+392>:   shr    %cl,%r9
   0x000000000000674d <+413>:   mov    0x10(%r8),%rcx
   0x0000000000006751 <+417>:   cmp    %rcx,0x10(%rbx)
   0x0000000000006755 <+421>:   jne    0x6898 <bch_btree_node_read_done+744>
   0x0000000000006892 <+738>:   add    %r8,%rbx
   0x0000000000006895 <+741>:   nopl   (%rax)

218          i = write_block(b)) {
219         err = "unsupported bset version";
   0x00000000000069c0 <+1040>:  mov    $0x0,%rdx
   0x00000000000069c7 <+1047>:  jmpq   0x6807 <bch_btree_node_read_done+599>
   0x00000000000069cc <+1052>:  nopl   0x0(%rax)

220         if (i->version > BCACHE_BSET_VERSION)
   0x0000000000006660 <+176>:   mov    0x18(%rbx),%r10d
   0x0000000000006664 <+180>:   cmp    $0x1,%r10d
   0x0000000000006668 <+184>:   ja     0x69c0 <bch_btree_node_read_done+1040>
   0x000000000000666e <+190>:   movzwl 0x430(%rsi),%r11d
   0x0000000000006676 <+198>:   jmpq   0x6769 <bch_btree_node_read_done+441>
   0x000000000000667b <+203>:   nopl   0x0(%rax,%rax,1)
   0x000000000000675b <+427>:   mov    0x18(%rbx),%r10d
   0x000000000000675f <+431>:   cmp    $0x1,%r10d
   0x0000000000006763 <+435>:   ja     0x69c0 <bch_btree_node_read_done+1040>

221             goto err;
222
223         err = "bad btree header";
224         if (b->written + set_blocks(i, block_bytes(b->c)) >
   0x0000000000006769 <+441>:   mov    0x1c(%rbx),%eax
   0x000000000000676c <+444>:   mov    %r11,%rcx
   0x000000000000676f <+447>:   xor    %edx,%edx
   0x0000000000006771 <+449>:   shl    $0x9,%rcx
   0x0000000000006775 <+453>:   movzwl %di,%edi
   0x0000000000006778 <+456>:   mov    %r9d,%r9d
   0x000000000000677b <+459>:   and    $0x1fffe00,%ecx
   0x0000000000006781 <+465>:   lea    0x20(,%rax,8),%r8
   0x0000000000006789 <+473>:   lea    -0x1(%r8,%rcx,1),%rax
   0x000000000000678e <+478>:   div    %rcx
   0x0000000000006791 <+481>:   add    %rdi,%rax
   0x0000000000006794 <+484>:   cmp    %r9,%rax
   0x0000000000006797 <+487>:   ja     0x6800 <bch_btree_node_read_done+592>

225             btree_blocks(b))
226             goto err;
227
228         err = "bad magic";
   0x00000000000069d0 <+1056>:  mov    $0x0,%rdx
   0x00000000000069d7 <+1063>:  jmpq   0x6807 <bch_btree_node_read_done+599>
   0x00000000000069dc <+1068>:  nopl   0x0(%rax)

229         if (i->magic != bset_magic(&b->c->sb))
   0x00000000000067aa <+506>:   cmp    %rax,0x8(%rbx)
   0x00000000000067ae <+510>:   jne    0x69d0 <bch_btree_node_read_done+1056>

230             goto err;
231
232         err = "bad checksum";
   0x00000000000067df <+559>:   mov    $0x0,%rdx
   0x00000000000067e6 <+566>:   jmp    0x6807 <bch_btree_node_read_done+599>
   0x00000000000067e8 <+568>:   nopl   0x0(%rax,%rax,1)
   0x00000000000067f0 <+576>:   mov    0x1c(%rbx),%eax
   0x00000000000067f3 <+579>:   jmpq   0x66bf <bch_btree_node_read_done+271>
   0x00000000000067f8 <+584>:   nopl   0x0(%rax,%rax,1)

233         switch (i->version) {
   0x00000000000067b4 <+516>:   cmp    $0x1,%r10d
   0x00000000000067bb <+523>:   je     0x6680 <bch_btree_node_read_done+208>

234         case 0:
235             if (i->csum != csum_set(i))
   0x00000000000067c1 <+529>:   lea    0x20(%rbx),%r14
   0x00000000000067c5 <+533>:   lea    0x8(%rbx),%rdi
   0x00000000000067ce <+542>:   sub    %rdi,%rsi
   0x00000000000067d1 <+545>:   callq  0x67d6 <bch_btree_node_read_done+550>
   0x00000000000067d6 <+550>:   cmp    %rax,%r15
   0x00000000000067d9 <+553>:   je     0x66a6 <bch_btree_node_read_done+246>
236                 goto err;
237             break;
238         case BCACHE_BSET_VERSION:
239             if (i->csum != btree_csum_set(b, i))
   0x000000000000669d <+237>:   cmp    %rax,%r15
   0x00000000000066a0 <+240>:   jne    0x67df <bch_btree_node_read_done+559>
   0x00000000000067b8 <+520>:   mov    (%rbx),%r15

240                 goto err;
241             break;
242         }
243
244         err = "empty set";
   0x00000000000069e0 <+1072>:  mov    $0x0,%rdx
   0x00000000000069e7 <+1079>:  jmpq   0x6807 <bch_btree_node_read_done+599>

245         if (i != b->keys.set[0].data && !i->keys)
   0x00000000000066a6 <+246>:   cmp    %rbx,0x108(%r12)
   0x00000000000066ae <+254>:   je     0x67f0 <bch_btree_node_read_done+576>
   0x00000000000066b4 <+260>:   mov    0x1c(%rbx),%eax
   0x00000000000066b7 <+263>:   test   %eax,%eax
   0x00000000000066b9 <+265>:   je     0x69e0 <bch_btree_node_read_done+1072>

246             goto err;
247
248         bch_btree_iter_push(iter, i->start, bset_bkey_last(i));
   0x00000000000066c3 <+275>:   mov    %r14,%rsi
   0x00000000000066c6 <+278>:   mov    %r13,%rdi
   0x00000000000066c9 <+281>:   callq  0x66ce <bch_btree_node_read_done+286>

249
250         b->written += set_blocks(i, block_bytes(b->c));
   0x00000000000066ce <+286>:   mov    0x80(%r12),%rsi
   0x00000000000066d6 <+294>:   mov    0x1c(%rbx),%eax
   0x00000000000066d9 <+297>:   xor    %edx,%edx
   0x00000000000066e3 <+307>:   movzwl 0x430(%rsi),%ecx
   0x00000000000066ea <+314>:   shl    $0x9,%ecx
   0x00000000000066ed <+317>:   movslq %ecx,%rcx
   0x00000000000066f0 <+320>:   lea    0x1f(%rcx,%rax,8),%rax
   0x00000000000066f5 <+325>:   div    %rcx
   0x0000000000006704 <+340>:   mov    %eax,%edi
   0x0000000000006706 <+342>:   add    0xc0(%r12),%di
   0x0000000000006712 <+354>:   mov    %di,0xc0(%r12)

251     }
252
253     err = "corrupted btree";
   0x00000000000069b0 <+1024>:  mov    $0x0,%rdx
   0x00000000000069b7 <+1031>:  jmpq   0x6807 <bch_btree_node_read_done+599>
   0x00000000000069bc <+1036>:  nopl   0x0(%rax)

254     for (i = write_block(b);
   0x00000000000068a1 <+753>:   cmp    %rdx,%rcx
   0x00000000000068a4 <+756>:   jae    0x68e5 <bch_btree_node_read_done+821>
   0x00000000000068e0 <+816>:   cmp    %rdx,%rcx
   0x00000000000068e3 <+819>:   jb     0x68c8 <bch_btree_node_read_done+792>

255          bset_sector_offset(&b->keys, i) < KEY_SIZE(&b->key);
256          i = ((void *) i) + block_bytes(b->c))
   0x00000000000068d7 <+807>:   mov    %rcx,%rbx
   0x00000000000068da <+810>:   sub    %r8d,%ecx

257         if (i->seq == b->keys.set[0].data->seq)
   0x00000000000068a6 <+758>:   mov    0x10(%r8),%rdi
   0x00000000000068aa <+762>:   cmp    %rdi,0x10(%rbx)
   0x00000000000068ae <+766>:   je     0x69b0 <bch_btree_node_read_done+1024>
   0x00000000000068b4 <+772>:   cltq
   0x00000000000068b6 <+774>:   mov    %rax,%r9
   0x00000000000068b9 <+777>:   lea    (%rbx,%rax,1),%rcx
   0x00000000000068bd <+781>:   neg    %r9
   0x00000000000068c0 <+784>:   jmp    0x68d7 <bch_btree_node_read_done+807>
   0x00000000000068c2 <+786>:   nopw   0x0(%rax,%rax,1)
   0x00000000000068c8 <+792>:   lea    (%rbx,%rax,1),%rcx
   0x00000000000068cc <+796>:   cmp    0x10(%rcx,%r9,1),%rdi
   0x00000000000068d1 <+801>:   je     0x69b0 <bch_btree_node_read_done+1024>

258             goto err;
259
260     bch_btree_sort_and_fix_extents(&b->keys, iter, &b->c->sort);
   0x00000000000068e5 <+821>:   lea    0xc8(%r12),%r14
   0x00000000000068ed <+829>:   lea    0xcb60(%rsi),%rdx
   0x00000000000068f4 <+836>:   mov    %r13,%rsi
   0x00000000000068f7 <+839>:   mov    %r14,%rdi
   0x00000000000068fa <+842>:   callq  0x68ff <bch_btree_node_read_done+847>

261
262     i = b->keys.set[0].data;
   0x0000000000006907 <+855>:   mov    0x108(%r12),%rbx

263     err = "short btree key";
   0x00000000000069ec <+1084>:  mov    $0x0,%rdx
   0x00000000000069f3 <+1091>:  jmpq   0x6807 <bch_btree_node_read_done+599>

264     if (b->keys.set[0].size &&
   0x00000000000068ff <+847>:   mov    0xe0(%r12),%eax
   0x0000000000006914 <+868>:   test   %eax,%eax
   0x0000000000006916 <+870>:   je     0x694d <bch_btree_node_read_done+925>
   0x0000000000006944 <+916>:   test   %rax,%rax
   0x0000000000006947 <+919>:   js     0x69ec <bch_btree_node_read_done+1084>

265         bkey_cmp(&b->key, &b->keys.set[0].end) < 0)
266         goto err;
267
268     if (b->written < btree_blocks(b))
   0x000000000000694d <+925>:   mov    0x80(%r12),%rax
   0x0000000000006955 <+933>:   movzwl 0xc0(%r12),%esi
   0x0000000000006965 <+949>:   movzwl 0xde2(%rax),%ecx
   0x000000000000696c <+956>:   shr    %cl,%rdx
   0x000000000000696f <+959>:   cmp    %edx,%esi
   0x0000000000006971 <+961>:   jae    0x6868 <bch_btree_node_read_done+696>

269         bch_bset_init_next(&b->keys, write_block(b),
   0x000000000000698f <+991>:   mov    %r14,%rdi
   0x000000000000699e <+1006>:  callq  0x69a3 <bch_btree_node_read_done+1011>
   0x00000000000069a3 <+1011>:  mov    0x80(%r12),%rax
   0x00000000000069ab <+1019>:  jmpq   0x6868 <bch_btree_node_read_done+696>

270                            bset_magic(&b->c->sb));
271 out:
272     mempool_free(iter, b->c->fill_iter);
   0x0000000000006868 <+696>:   mov    0xcb58(%rax),%rsi
   0x000000000000686f <+703>:   mov    %r13,%rdi
   0x0000000000006872 <+706>:   callq  0x6877 <bch_btree_node_read_done+711>

273     return;
274 err:
275     set_btree_node_io_error(b);
276     bch_cache_set_error(b->c, "%s at bucket %zu, block %u, %u keys",
   0x0000000000006829 <+633>:   mov    0x1c(%rbx),%r9d
   0x000000000000684a <+666>:   mov    %esi,%ecx
   0x000000000000684c <+668>:   mov    $0x0,%rsi
   0x0000000000006853 <+675>:   shr    %cl,%r8d
   0x0000000000006856 <+678>:   mov    %rax,%rcx
   0x0000000000006859 <+681>:   xor    %eax,%eax
   0x000000000000685b <+683>:   callq  0x6860 <bch_btree_node_read_done+688>
   0x0000000000006860 <+688>:   mov    0x80(%r12),%rax

277                         err, PTR_BUCKET_NR(b->c, &b->key, 0),
278                         bset_block_offset(b, i), i->keys);
279     goto out;
280 }
   0x0000000000006877 <+711>:   pop    %rbx
   0x0000000000006878 <+712>:   pop    %r12
   0x000000000000687a <+714>:   pop    %r13
   0x000000000000687c <+716>:   pop    %r14
   0x000000000000687e <+718>:   pop    %r15
   0x0000000000006880 <+720>:   pop    %rbp
   0x0000000000006881 <+721>:   retq
   0x0000000000006882 <+722>:   movzwl 0x430(%rsi),%eax
   0x0000000000006889 <+729>:   shl    $0x9,%eax
   0x000000000000688c <+732>:   imul   %eax,%ecx
   0x000000000000688f <+735>:   movslq %ecx,%rbx

On 8/13/2014 1:45 PM, Slava Pestov wrote:
> Can you post the disassembly of the function?
>
> On Wed, Aug 13, 2014 at 11:35 AM, Larkin Lowrey
> <llowrey@nuclearwinter.com> wrote:
>> Thanks. Trying gdb helped me find the answer. I needed to install the
>> kernel-debuginfo-3.15.8-200.fc20.x86_64 package via yum.
>>
>> From addr2line:
>>> bch_btree_node_read_done+0x4c
>>> drivers/md/bcache/btree.c:207
>> Here's a snippet from gdb:
>>
>>> (gdb) list *(bch_btree_node_read_done+0x4c)
>>> 0x65fc is in bch_btree_node_read_done (drivers/md/bcache/btree.c:207).
>>> 202     struct bset *i = btree_bset_first(b);
>>> 203     struct btree_iter *iter;
>>> 204
>>> 205     iter = mempool_alloc(b->c->fill_iter, GFP_NOWAIT);
>>> 206     iter->size = b->c->sb.bucket_size / b->c->sb.block_size;
>>> 207     iter->used = 0;
>>> 208
>>> 209 #ifdef CONFIG_BCACHE_DEBUG
>>> 210     iter->b = &b->keys;
>>> 211 #endif
>> This doesn't make any sense to me. If iter was null I would expect line
>> 206 to blow up first.
>>
>> --Larkin
>>
>> On 8/13/2014 12:41 PM, Slava Pestov wrote:
>>> You can try to use gdb:
>>>
>>> gdb /lib/modules/.../foo.ko
>>>
>>> list *(bch_btree_node_read_done+0x4c)
>>>
>>>
>>> On Wed, Aug 13, 2014 at 9:40 AM, Larkin Lowrey
>>> <llowrey@nuclearwinter.com> wrote:
>>>> This is making me feel very dumb. I've googled extensively but can't
>>>> figure out how to run addr2line for a module.
>>>>
>>>> I'm running Fedora 20 and the kernel did not have debugging symbols. I
>>>> downloaded the version with symbols but I don't know if the addresses
>>>> are going to be the same. Bcache is a module for me and that's where
>>>> things get tricky. Do you have any tips?
>>>>
>>>> --Larkin
>>>>
>>>> On 8/13/2014 12:04 AM, Kent Overstreet wrote:
>>>>> Any chance you could do an addr2line and get me the exact line where
>>>>> it happened?
>>>>>
>>>>> On Aug 12, 2014 10:02 PM, "Larkin Lowrey" <llowrey@nuclearwinter.com
>>>>> <mailto:llowrey@nuclearwinter.com>> wrote:
>>>>>
>>>>> I got an oops while doing some heavy I/O. I have an md raid10 cache
>>>>> device (4 SSDs) and 3 md raid5/6 backing devices. This setup has been
>>>>> well behaved for about 6 months.
>>>>>
>>>>> If this isn't a known issue is there anything I can do to provide more
>>>>> useful information?
>>>>>
>>>>> I'm running kernel 3.15.8-200.fc20.x86_64.
^ permalink raw reply [flat|nested] 13+ messages in thread
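On the question of why line 207 faults rather than line 206: in the disassembly of this message, the store for line 207 (movq $0x0,0x8(%r13) at +76) is scheduled ahead of the store for line 206 (mov %rax,0x0(%r13) at +87), and the oops reports R13 = 0 with CR2 = 0x8. A compiler may order two independent stores this way, so a NULL iter is consistent with the fault landing on iter->used. A minimal sketch of that arithmetic follows, assuming a hypothetical struct iter_like whose two leading 8-byte fields mirror the offsets seen in the disassembly; it is an illustration, not the kernel's definition of struct btree_iter, and the helper names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the iterator. The layout is inferred only from
 * the two stores in the disassembly: 0x0(%r13) for line 206 and 0x8(%r13)
 * for line 207. Fixed-width fields are used so the offsets do not depend on
 * the build machine (size_t is 8 bytes on x86_64).
 */
struct iter_like {
    uint64_t size;  /* written by line 206 at offset 0x0 */
    uint64_t used;  /* written by line 207 at offset 0x8 */
};

/* Address touched by a store to the member at member_offset from base. */
static uint64_t store_address(uint64_t base, size_t member_offset)
{
    return base + (uint64_t)member_offset;
}

/*
 * Faulting address expected when the base pointer is NULL and the first
 * store to execute is the one to the `used` member.
 */
static uint64_t null_iter_used_fault(void)
{
    return store_address(0, offsetof(struct iter_like, used));
}
```

With a NULL base, null_iter_used_fault() yields 0x8, the same value as the CR2 reported in the oops, whereas a fault on the line 206 store would have reported address 0x0.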
* Re: Null pointer oops 2014-08-13 21:21 ` Larkin Lowrey @ 2014-08-13 21:25 ` Slava Pestov 2014-08-13 21:30 ` Slava Pestov 2014-08-13 21:32 ` Larkin Lowrey 0 siblings, 2 replies; 13+ messages in thread From: Slava Pestov @ 2014-08-13 21:25 UTC (permalink / raw) To: Larkin Lowrey; +Cc: Kent Overstreet, linux-bcache

Indeed it looks like iter is NULL. I see the bug is still present in
the latest dev branch. The problem is that we're not checking the
return value of mempool_alloc(), which may be NULL if we pass
GFP_NOWAIT.

On Wed, Aug 13, 2014 at 2:21 PM, Larkin Lowrey
<llowrey@nuclearwinter.com> wrote:
> Here's the disassembly of bch_btree_node_read_done. The offending line
> is 207 and the instruction is at offset 76.
>
> --Larkin
>
> 199 void bch_btree_node_read_done(struct btree *b)
> 200 {
>    0x00000000000065b0 <+0>:     callq  0x65b5 <bch_btree_node_read_done+5>
>    0x00000000000065b5 <+5>:     push   %rbp
>    0x00000000000065b8 <+8>:     mov    %rsp,%rbp
>    0x00000000000065bb <+11>:    push   %r15
>    0x00000000000065bd <+13>:    push   %r14
>    0x00000000000065bf <+15>:    push   %r13
>    0x00000000000065c1 <+17>:    push   %r12
>    0x00000000000065c3 <+19>:    mov    %rdi,%r12
>    0x00000000000065c6 <+22>:    push   %rbx
>
> 201     const char *err = "bad btree header";
>    0x0000000000006800 <+592>:   mov    $0x0,%rdx
>
> 202     struct bset *i = btree_bset_first(b);
> 203     struct btree_iter *iter;
> 204
> 205     iter = mempool_alloc(b->c->fill_iter, GFP_NOWAIT);
>    0x00000000000065b6 <+6>:     xor    %esi,%esi
>    0x00000000000065c7 <+23>:    mov    0x80(%rdi),%rax
>    0x00000000000065d5 <+37>:    mov    0xcb58(%rax),%rdi
>    0x00000000000065dc <+44>:    callq  0x65e1 <bch_btree_node_read_done+49>
>    0x00000000000065e9 <+57>:    mov    %rax,%r13
>
> 206     iter->size = b->c->sb.bucket_size / b->c->sb.block_size;
>    0x00000000000065e1 <+49>:    mov    0x80(%r12),%rsi
>    0x00000000000065ec <+60>:    xor    %edx,%edx
>    0x00000000000065ee <+62>:    movzwl 0x432(%rsi),%eax
>    0x00000000000065f5 <+69>:    divw   0x430(%rsi)
>    0x0000000000006604 <+84>:    movzwl %ax,%eax
>    0x0000000000006607 <+87>:    mov
%rax,0x0(%r13) > > 207 iter->used = 0; > 0x00000000000065fc <+76>: movq $0x0,0x8(%r13) > > 208 > 209 #ifdef CONFIG_BCACHE_DEBUG > 210 iter->b = &b->keys; > 211 #endif > 212 > 213 if (!i->seq) > 0x000000000000660b <+91>: mov 0x10(%rbx),%rax > 0x000000000000660f <+95>: test %rax,%rax > 0x0000000000006612 <+98>: je 0x6800 <bch_btree_node_read_done+592> > > 214 goto err; > 215 > 216 for (; > 0x000000000000664d <+157>: cmp %r9d,%ecx > 0x0000000000006650 <+160>: jae 0x6882 <bch_btree_node_read_done+722> > 0x0000000000006744 <+404>: cmp %r9d,%r10d > 0x0000000000006747 <+407>: jae 0x6898 <bch_btree_node_read_done+744> > > 217 b->written < btree_blocks(b) && i->seq == > b->keys.set[0].data->seq; > 0x0000000000006618 <+104>: mov 0x80(%r12),%rsi > 0x0000000000006625 <+117>: movzwl 0xc0(%r12),%edi > 0x000000000000662e <+126>: mov 0x108(%r12),%r8 > 0x0000000000006636 <+134>: movzwl 0xde2(%rsi),%ecx > 0x0000000000006644 <+148>: mov %rdx,%r9 > 0x0000000000006647 <+151>: shr %cl,%r9 > 0x000000000000664a <+154>: movzwl %di,%ecx > 0x0000000000006656 <+166>: cmp 0x10(%r8),%rax > 0x000000000000665a <+170>: jne 0x6882 <bch_btree_node_read_done+722> > 0x000000000000670f <+351>: mov %rdx,%r9 > 0x000000000000672a <+378>: movzwl 0xde2(%rsi),%ecx > 0x0000000000006738 <+392>: shr %cl,%r9 > 0x000000000000674d <+413>: mov 0x10(%r8),%rcx > 0x0000000000006751 <+417>: cmp %rcx,0x10(%rbx) > 0x0000000000006755 <+421>: jne 0x6898 <bch_btree_node_read_done+744> > 0x0000000000006892 <+738>: add %r8,%rbx > 0x0000000000006895 <+741>: nopl (%rax) > > 218 i = write_block(b)) { > 219 err = "unsupported bset version"; > 0x00000000000069c0 <+1040>: mov $0x0,%rdx > 0x00000000000069c7 <+1047>: jmpq 0x6807 <bch_btree_node_read_done+599> > 0x00000000000069cc <+1052>: nopl 0x0(%rax) > > 220 if (i->version > BCACHE_BSET_VERSION) > 0x0000000000006660 <+176>: mov 0x18(%rbx),%r10d > 0x0000000000006664 <+180>: cmp $0x1,%r10d > 0x0000000000006668 <+184>: ja 0x69c0 > <bch_btree_node_read_done+1040> > 0x000000000000666e 
<+190>: movzwl 0x430(%rsi),%r11d > 0x0000000000006676 <+198>: jmpq 0x6769 <bch_btree_node_read_done+441> > 0x000000000000667b <+203>: nopl 0x0(%rax,%rax,1) > 0x000000000000675b <+427>: mov 0x18(%rbx),%r10d > 0x000000000000675f <+431>: cmp $0x1,%r10d > 0x0000000000006763 <+435>: ja 0x69c0 > <bch_btree_node_read_done+1040> > > 221 goto err; > 222 > 223 err = "bad btree header"; > 224 if (b->written + set_blocks(i, block_bytes(b->c)) > > 0x0000000000006769 <+441>: mov 0x1c(%rbx),%eax > 0x000000000000676c <+444>: mov %r11,%rcx > 0x000000000000676f <+447>: xor %edx,%edx > 0x0000000000006771 <+449>: shl $0x9,%rcx > 0x0000000000006775 <+453>: movzwl %di,%edi > 0x0000000000006778 <+456>: mov %r9d,%r9d > 0x000000000000677b <+459>: and $0x1fffe00,%ecx > 0x0000000000006781 <+465>: lea 0x20(,%rax,8),%r8 > 0x0000000000006789 <+473>: lea -0x1(%r8,%rcx,1),%rax > 0x000000000000678e <+478>: div %rcx > 0x0000000000006791 <+481>: add %rdi,%rax > 0x0000000000006794 <+484>: cmp %r9,%rax > 0x0000000000006797 <+487>: ja 0x6800 <bch_btree_node_read_done+592> > > 225 btree_blocks(b)) > 226 goto err; > 227 > 228 err = "bad magic"; > 0x00000000000069d0 <+1056>: mov $0x0,%rdx > 0x00000000000069d7 <+1063>: jmpq 0x6807 <bch_btree_node_read_done+599> > 0x00000000000069dc <+1068>: nopl 0x0(%rax) > > 229 if (i->magic != bset_magic(&b->c->sb)) > 0x00000000000067aa <+506>: cmp %rax,0x8(%rbx) > 0x00000000000067ae <+510>: jne 0x69d0 > <bch_btree_node_read_done+1056> > > 230 goto err; > 231 > 232 err = "bad checksum"; > 0x00000000000067df <+559>: mov $0x0,%rdx > 0x00000000000067e6 <+566>: jmp 0x6807 <bch_btree_node_read_done+599> > 0x00000000000067e8 <+568>: nopl 0x0(%rax,%rax,1) > 0x00000000000067f0 <+576>: mov 0x1c(%rbx),%eax > 0x00000000000067f3 <+579>: jmpq 0x66bf <bch_btree_node_read_done+271> > 0x00000000000067f8 <+584>: nopl 0x0(%rax,%rax,1) > > 233 switch (i->version) { > 0x00000000000067b4 <+516>: cmp $0x1,%r10d > 0x00000000000067bb <+523>: je 0x6680 <bch_btree_node_read_done+208> > > 234 case 
0: > 235 if (i->csum != csum_set(i)) > 0x00000000000067c1 <+529>: lea 0x20(%rbx),%r14 > 0x00000000000067c5 <+533>: lea 0x8(%rbx),%rdi > 0x00000000000067ce <+542>: sub %rdi,%rsi > 0x00000000000067d1 <+545>: callq 0x67d6 <bch_btree_node_read_done+550> > 0x00000000000067d6 <+550>: cmp %rax,%r15 > 0x00000000000067d9 <+553>: je 0x66a6 <bch_btree_node_read_done+246> > 236 goto err; > 237 break; > 238 case BCACHE_BSET_VERSION: > 239 if (i->csum != btree_csum_set(b, i)) > 0x000000000000669d <+237>: cmp %rax,%r15 > 0x00000000000066a0 <+240>: jne 0x67df <bch_btree_node_read_done+559> > 0x00000000000067b8 <+520>: mov (%rbx),%r15 > > 240 goto err; > 241 break; > 242 } > 243 > 244 err = "empty set"; > 0x00000000000069e0 <+1072>: mov $0x0,%rdx > 0x00000000000069e7 <+1079>: jmpq 0x6807 <bch_btree_node_read_done+599> > > 245 if (i != b->keys.set[0].data && !i->keys) > 0x00000000000066a6 <+246>: cmp %rbx,0x108(%r12) > 0x00000000000066ae <+254>: je 0x67f0 <bch_btree_node_read_done+576> > 0x00000000000066b4 <+260>: mov 0x1c(%rbx),%eax > 0x00000000000066b7 <+263>: test %eax,%eax > 0x00000000000066b9 <+265>: je 0x69e0 > <bch_btree_node_read_done+1072> > > 246 goto err; > 247 > 248 bch_btree_iter_push(iter, i->start, > bset_bkey_last(i)); > 0x00000000000066c3 <+275>: mov %r14,%rsi > 0x00000000000066c6 <+278>: mov %r13,%rdi > 0x00000000000066c9 <+281>: callq 0x66ce <bch_btree_node_read_done+286> > > 249 > 250 b->written += set_blocks(i, block_bytes(b->c)); > 0x00000000000066ce <+286>: mov 0x80(%r12),%rsi > 0x00000000000066d6 <+294>: mov 0x1c(%rbx),%eax > 0x00000000000066d9 <+297>: xor %edx,%edx > 0x00000000000066e3 <+307>: movzwl 0x430(%rsi),%ecx > 0x00000000000066ea <+314>: shl $0x9,%ecx > 0x00000000000066ed <+317>: movslq %ecx,%rcx > 0x00000000000066f0 <+320>: lea 0x1f(%rcx,%rax,8),%rax > 0x00000000000066f5 <+325>: div %rcx > 0x0000000000006704 <+340>: mov %eax,%edi > 0x0000000000006706 <+342>: add 0xc0(%r12),%di > 0x0000000000006712 <+354>: mov %di,0xc0(%r12) > > 251 } > 252 > 253 err 
= "corrupted btree"; > 0x00000000000069b0 <+1024>: mov $0x0,%rdx > 0x00000000000069b7 <+1031>: jmpq 0x6807 <bch_btree_node_read_done+599> > 0x00000000000069bc <+1036>: nopl 0x0(%rax) > > 254 for (i = write_block(b); > 0x00000000000068a1 <+753>: cmp %rdx,%rcx > 0x00000000000068a4 <+756>: jae 0x68e5 <bch_btree_node_read_done+821> > 0x00000000000068e0 <+816>: cmp %rdx,%rcx > 0x00000000000068e3 <+819>: jb 0x68c8 <bch_btree_node_read_done+792> > > 255 bset_sector_offset(&b->keys, i) < KEY_SIZE(&b->key); > 256 i = ((void *) i) + block_bytes(b->c)) > 0x00000000000068d7 <+807>: mov %rcx,%rbx > 0x00000000000068da <+810>: sub %r8d,%ecx > > 257 if (i->seq == b->keys.set[0].data->seq) > 0x00000000000068a6 <+758>: mov 0x10(%r8),%rdi > 0x00000000000068aa <+762>: cmp %rdi,0x10(%rbx) > 0x00000000000068ae <+766>: je 0x69b0 > <bch_btree_node_read_done+1024> > 0x00000000000068b4 <+772>: cltq > 0x00000000000068b6 <+774>: mov %rax,%r9 > 0x00000000000068b9 <+777>: lea (%rbx,%rax,1),%rcx > 0x00000000000068bd <+781>: neg %r9 > 0x00000000000068c0 <+784>: jmp 0x68d7 <bch_btree_node_read_done+807> > 0x00000000000068c2 <+786>: nopw 0x0(%rax,%rax,1) > 0x00000000000068c8 <+792>: lea (%rbx,%rax,1),%rcx > 0x00000000000068cc <+796>: cmp 0x10(%rcx,%r9,1),%rdi > 0x00000000000068d1 <+801>: je 0x69b0 > <bch_btree_node_read_done+1024> > > 258 goto err; > 259 > 260 bch_btree_sort_and_fix_extents(&b->keys, iter, &b->c->sort); > 0x00000000000068e5 <+821>: lea 0xc8(%r12),%r14 > 0x00000000000068ed <+829>: lea 0xcb60(%rsi),%rdx > 0x00000000000068f4 <+836>: mov %r13,%rsi > 0x00000000000068f7 <+839>: mov %r14,%rdi > 0x00000000000068fa <+842>: callq 0x68ff <bch_btree_node_read_done+847> > > 261 > 262 i = b->keys.set[0].data; > 0x0000000000006907 <+855>: mov 0x108(%r12),%rbx > > 263 err = "short btree key"; > 0x00000000000069ec <+1084>: mov $0x0,%rdx > 0x00000000000069f3 <+1091>: jmpq 0x6807 <bch_btree_node_read_done+599> > > 264 if (b->keys.set[0].size && > 0x00000000000068ff <+847>: mov 0xe0(%r12),%eax > 
0x0000000000006914 <+868>: test %eax,%eax > 0x0000000000006916 <+870>: je 0x694d <bch_btree_node_read_done+925> > 0x0000000000006944 <+916>: test %rax,%rax > 0x0000000000006947 <+919>: js 0x69ec > <bch_btree_node_read_done+1084> > > 265 bkey_cmp(&b->key, &b->keys.set[0].end) < 0) > 266 goto err; > 267 > 268 if (b->written < btree_blocks(b)) > 0x000000000000694d <+925>: mov 0x80(%r12),%rax > 0x0000000000006955 <+933>: movzwl 0xc0(%r12),%esi > 0x0000000000006965 <+949>: movzwl 0xde2(%rax),%ecx > 0x000000000000696c <+956>: shr %cl,%rdx > 0x000000000000696f <+959>: cmp %edx,%esi > 0x0000000000006971 <+961>: jae 0x6868 <bch_btree_node_read_done+696> > > 269 bch_bset_init_next(&b->keys, write_block(b), > 0x000000000000698f <+991>: mov %r14,%rdi > 0x000000000000699e <+1006>: callq 0x69a3 > <bch_btree_node_read_done+1011> > 0x00000000000069a3 <+1011>: mov 0x80(%r12),%rax > 0x00000000000069ab <+1019>: jmpq 0x6868 <bch_btree_node_read_done+696> > > 270 bset_magic(&b->c->sb)); > 271 out: > 272 mempool_free(iter, b->c->fill_iter); > 0x0000000000006868 <+696>: mov 0xcb58(%rax),%rsi > 0x000000000000686f <+703>: mov %r13,%rdi > 0x0000000000006872 <+706>: callq 0x6877 <bch_btree_node_read_done+711> > > 273 return; > 274 err: > 275 set_btree_node_io_error(b); > 276 bch_cache_set_error(b->c, "%s at bucket %zu, block %u, > %u keys", > 0x0000000000006829 <+633>: mov 0x1c(%rbx),%r9d > 0x000000000000684a <+666>: mov %esi,%ecx > 0x000000000000684c <+668>: mov $0x0,%rsi > 0x0000000000006853 <+675>: shr %cl,%r8d > 0x0000000000006856 <+678>: mov %rax,%rcx > 0x0000000000006859 <+681>: xor %eax,%eax > 0x000000000000685b <+683>: callq 0x6860 <bch_btree_node_read_done+688> > 0x0000000000006860 <+688>: mov 0x80(%r12),%rax > > 277 err, PTR_BUCKET_NR(b->c, &b->key, 0), > 278 bset_block_offset(b, i), i->keys); > 279 goto out; > 280 } > 0x0000000000006877 <+711>: pop %rbx > 0x0000000000006878 <+712>: pop %r12 > 0x000000000000687a <+714>: pop %r13 > 0x000000000000687c <+716>: pop %r14 > 
0x000000000000687e <+718>: pop %r15 > 0x0000000000006880 <+720>: pop %rbp > 0x0000000000006881 <+721>: retq > 0x0000000000006882 <+722>: movzwl 0x430(%rsi),%eax > 0x0000000000006889 <+729>: shl $0x9,%eax > 0x000000000000688c <+732>: imul %eax,%ecx > 0x000000000000688f <+735>: movslq %ecx,%rbx > > > On 8/13/2014 1:45 PM, Slava Pestov wrote: >> Can you post the disassembly of the function? >> >> On Wed, Aug 13, 2014 at 11:35 AM, Larkin Lowrey >> <llowrey@nuclearwinter.com> wrote: >>> Thanks. Trying gdb helped me find the answer. I needed to install the >>> kernel-debuginfo-3.15.8-200.fc20.x86_64 package via yum. >>> >>> From addr2line: >>>> bch_btree_node_read_done+0x4c >>>> drivers/md/bcache/btree.c:207 >>> Here'a a snippet from gdb: >>> >>>> (gdb) list *(bch_btree_node_read_done+0x4c) >>>> 0x65fc is in bch_btree_node_read_done (drivers/md/bcache/btree.c:207). >>>> 202 struct bset *i = btree_bset_first(b); >>>> 203 struct btree_iter *iter; >>>> 204 >>>> 205 iter = mempool_alloc(b->c->fill_iter, GFP_NOWAIT); >>>> 206 iter->size = b->c->sb.bucket_size / b->c->sb.block_size; >>>> 207 iter->used = 0; >>>> 208 >>>> 209 #ifdef CONFIG_BCACHE_DEBUG >>>> 210 iter->b = &b->keys; >>>> 211 #endif >>> This doesn't make any sense to me. If iter was null I would expect line >>> 206 to blow up first. >>> >>> --Larkin >>> >>> On 8/13/2014 12:41 PM, Slava Pestov wrote: >>>> You can try to use gdb: >>>> >>>> gdb /lib/modules/.../foo.ko >>>> >>>> list *(bch_btree_node_read_done+0x4c) >>>> >>>> >>>> On Wed, Aug 13, 2014 at 9:40 AM, Larkin Lowrey >>>> <llowrey@nuclearwinter.com> wrote: >>>>> This is making be feel very dumb. I've googled extensively but can't >>>>> figure out how to run addr2line for a module. >>>>> >>>>> I'm running Fedora 20 and the kernel did not have debugging symbols. I >>>>> downloaded the version with symbols but I don't know if the addresses >>>>> are going to be the same. Bcache is a module for me and that's where >>>>> things get tricky. Do you have any tips? 
>>>>> >>>>> --Larkin >>>>> >>>>> On 8/13/2014 12:04 AM, Kent Overstreet wrote: >>>>>> Any chance you could do an addr2line and get me the exact line where >>>>>> it happened? >>>>>> >>>>>> On Aug 12, 2014 10:02 PM, "Larkin Lowrey" <llowrey@nuclearwinter.com >>>>>> <mailto:llowrey@nuclearwinter.com>> wrote: >>>>>> >>>>>> I got an oops while doing some heavy I/O. I have an md raid10 cache >>>>>> device (4 SSDs) and 3 md raid5/6 backing devices. This setup has been >>>>>> well behaved for about 6 months. >>>>>> >>>>>> If this isn't a known issue is there anything I can do to provide more >>>>>> useful information? >>>>>> >>>>>> I'm running kernel 3.15.8-200.fc20.x86_64. >>>>>> >>>>>> [210884.047249] BUG: unable to handle kernel NULL pointer >>>>>> dereference at 0000000000000008 >>>>>> [210884.055605] IP: [<ffffffffa01625fc>] >>>>>> bch_btree_node_read_done+0x4c/0x450 [bcache] >>>>>> [210884.063723] PGD 0 >>>>>> [210884.066053] Oops: 0002 [#1] SMP >>>>>> [210884.069610] Modules linked in: lp parport binfmt_misc >>>>>> ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM >>>>>> iptable_mangle tun bridge stp llc xt_multiport ebtable_nat >>>>>> ebtables hwmon_vid ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4 >>>>>> nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter xt_conntrack >>>>>> ip6_tables nf_conntrack keyspan ezusb kvm_amd kvm crct10dif_pclmul >>>>>> crc32_pclmul crc32c_intel ghash_clmulni_intel microcode serio_raw >>>>>> amd64_edac_mod edac_core fam15h_power k10temp edac_mce_amd >>>>>> sp5100_tco i2c_piix4 igb ptp pps_core dca shpchp acpi_cpufreq >>>>>> btrfs bcache raid456 async_raid6_recov async_memcpy async_pq >>>>>> async_xor async_tx xor raid6_pq raid10 i2c_algo_bit drm_kms_helper >>>>>> ttm drm i2c_core mpt2sas mvsas libsas raid_class >>>>>> scsi_transport_sas cpufreq_stats >>>>>> [210884.140704] CPU: 5 PID: 11188 Comm: kworker/5:1 Not tainted >>>>>> 3.15.8-200.fc20.x86_64 #1 >>>>>> [210884.149069] Hardware name: /H8DG6/H8DGi, BIOS 3.0a 07/2 >>>>>> 
[210884.155280] Workqueue: bcache cache_lookup [bcache] >>>>>> [210884.160531] task: ffff880218633160 ti: ffff8800217b8000 >>>>>> task.ti: ffff8800217b8000 >>>>>> [210884.168502] RIP: 0010:[<ffffffffa01625fc>] >>>>>> [<ffffffffa01625fc>] bch_btree_node_read_done+0x4c/0x450 [bcache] >>>>>> [210884.179105] RSP: 0000:ffff8800217bbbe8 EFLAGS: 00010212 >>>>>> [210884.184806] RAX: 0000000000000400 RBX: ffff880245ec0000 RCX: >>>>>> 0000000000000000 >>>>>> [210884.192480] RDX: 0000000000000000 RSI: ffff880418380000 RDI: >>>>>> 0000000000000246 >>>>>> [210884.200075] RBP: ffff8800217bbc10 R08: 0000000000000000 R09: >>>>>> 0000000000000f6b >>>>>> [210884.207738] R10: 0000000000000000 R11: 0000000000000400 R12: >>>>>> ffff880413d06c00 >>>>>> [210884.215391] R13: 0000000000000000 R14: ffff8800217bbc20 R15: >>>>>> ffff880413d06c00 >>>>>> [210884.222961] FS: 00007f73bacd6880(0000) >>>>>> GS:ffff88021fd40000(0000) knlGS:0000000000000000 >>>>>> [210884.231516] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b >>>>>> [210884.237557] CR2: 0000000000000008 CR3: 0000000001c11000 CR4: >>>>>> 00000000000407e0 >>>>>> [210884.245131] Stack: >>>>>> [210884.247395] ffff880274f4d020 ffff880413d06c00 >>>>>> 0000bfcc44a463f8 ffff8800217bbc20 >>>>>> [210884.255337] ffff880413d06c00 ffff8800217bbc78 >>>>>> ffffffffa0162b68 0000000000000000 >>>>>> [210884.263256] ffff880218633160 0000000000000000 >>>>>> 0000000000000000 0000000000000000 >>>>>> [210884.271234] Call Trace: >>>>>> [210884.273985] [<ffffffffa0162b68>] >>>>>> bch_btree_node_read+0x168/0x190 [bcache] >>>>>> [210884.281258] [<ffffffffa0163f69>] >>>>>> bch_btree_node_get+0x169/0x290 [bcache] >>>>>> [210884.288377] [<ffffffffa01642f5>] >>>>>> bch_btree_map_keys_recurse+0xd5/0x1d0 [bcache] >>>>>> [210884.296311] [<ffffffffa016dcb0>] ? >>>>>> cached_dev_congested+0x180/0x180 [bcache] >>>>>> [210884.303953] [<ffffffff8135b204>] ? 
>>>>>> call_rwsem_down_read_failed+0x14/0x30 >>>>>> [210884.311158] [<ffffffffa01673f7>] >>>>>> bch_btree_map_keys+0x127/0x150 [bcache] >>>>>> [210884.318273] [<ffffffffa016dcb0>] ? >>>>>> cached_dev_congested+0x180/0x180 [bcache] >>>>>> [210884.325826] [<ffffffffa016e7f5>] cache_lookup+0xf5/0x1f0 [bcache] >>>>>> [210884.332325] [<ffffffff810a4af6>] process_one_work+0x176/0x430 >>>>>> [210884.338427] [<ffffffff810a578b>] worker_thread+0x11b/0x3a0 >>>>>> [210884.344282] [<ffffffff810a5670>] ? rescuer_thread+0x3b0/0x3b0 >>>>>> [210884.350447] [<ffffffff810ac528>] kthread+0xd8/0xf0 >>>>>> [210884.355615] [<ffffffff810ac450>] ? insert_kthread_work+0x40/0x40 >>>>>> [210884.362017] [<ffffffff816ff93c>] ret_from_fork+0x7c/0xb0 >>>>>> [210884.367756] [<ffffffff810ac450>] ? insert_kthread_work+0x40/0x40 >>>>>> [210884.374234] Code: 08 01 00 00 48 8b b8 58 cb 00 00 e8 bf 25 01 >>>>>> e1 49 8b b4 24 80 00 00 00 49 89 c5 31 d2 0f b7 86 32 04 00 00 66 >>>>>> f7 b6 30 04 00 00 <49> c7 45 08 00 00 00 00 0f b7 c0 49 89 45 00 >>>>>> 48 8b 43 10 48 85 >>>>>> [210884.395405] RIP [<ffffffffa01625fc>] >>>>>> bch_btree_node_read_done+0x4c/0x450 [bcache] >>>>>> [210884.403389] RSP <ffff8800217bbbe8> >>>>>> [210884.407171] CR2: 0000000000000008 >>>>>> [210884.411233] ---[ end trace 0064e6abfd068c85 ]--- >>>>>> [210884.416352] BUG: unable to handle kernel paging request at >>>>>> ffffffffffffffd8 >>>>>> [210884.423871] IP: [<ffffffff810acb10>] kthread_data+0x10/0x20 >>>>>> [210884.429915] PGD 1c14067 PUD 1c16067 PMD 0 >>>>>> >>>>>> --Larkin >>>>>> >>>>>> -- >>>>>> To unsubscribe from this list: send the line "unsubscribe >>>>>> linux-bcache" in >>>>>> the body of a message to majordomo@vger.kernel.org >>>>>> <mailto:majordomo@vger.kernel.org> >>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>>> >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in >>>>> the body of a message to majordomo@vger.kernel.org >>>>> More majordomo 
info at http://vger.kernel.org/majordomo-info.html >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in >>>> the body of a message to majordomo@vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 13+ messages in thread
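For readers following the thread, here is a minimal userspace sketch of the failure mode being discussed. It is not the bcache code and not the upstream fix. The struct `iter_sketch`, the helper `try_alloc_iter()` and its `pool_empty` flag are hypothetical stand-ins, and the only layout assumption is that `size` comes first and `used` second, which matches the faulting store `movq $0x0,0x8(%r13)` with R13 = 0 and CR2 = 0x8 in the oops. The point it illustrates: an allocator that is allowed to fail returns NULL, and the first store through that pointer faults at the field's offset unless the caller checks first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the iterator; only the first two fields matter. */
struct iter_sketch {
    size_t size;   /* offset 0 */
    size_t used;   /* offset sizeof(size_t), i.e. 8 on x86-64 */
};

/* Address a store to ->used would touch if the pointer were NULL. */
static size_t fault_address_for_used(void)
{
    return offsetof(struct iter_sketch, used);
}

/*
 * Hypothetical non-blocking allocator: like an allocation that is not
 * allowed to wait, it may return NULL. `pool_empty` simulates exhaustion.
 */
static struct iter_sketch *try_alloc_iter(int pool_empty)
{
    if (pool_empty)
        return NULL;
    return malloc(sizeof(struct iter_sketch));
}

/*
 * Allocate and initialise an iterator.
 * Returns 0 on success, -1 if the allocation failed; *out is only
 * written on success.
 */
static int init_iter(int pool_empty, size_t bucket_size, size_t block_size,
                     struct iter_sketch **out)
{
    struct iter_sketch *it;

    assert(block_size != 0);

    it = try_alloc_iter(pool_empty);
    if (!it)            /* the check missing from the code under discussion */
        return -1;

    it->size = bucket_size / block_size;
    it->used = 0;
    *out = it;
    return 0;
}
```

A caller would treat a -1 return as "try again later or fall back to a blocking path". On an LP64 target `fault_address_for_used()` returns 8, the same value the oops reports in CR2.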
* Re: Null pointer oops
 2014-08-13 21:25 ` Slava Pestov
@ 2014-08-13 21:30 ` Slava Pestov
 2014-08-13 21:34 ` Jianjian Huo
 ` (2 more replies)
 2014-08-13 21:32 ` Larkin Lowrey
 1 sibling, 3 replies; 13+ messages in thread
From: Slava Pestov @ 2014-08-13 21:30 UTC (permalink / raw)
 To: Larkin Lowrey; +Cc: Kent Overstreet, linux-bcache

I was mistaken. The bug is fixed in the pull request Kent sent to Jens for 3.16:

http://evilpiepirate.org/git/linux-bcache.git/commit/?h=bcache-dev&id=bcf090e0040e30f8409e6a535a01e6473afb096f

On Wed, Aug 13, 2014 at 2:25 PM, Slava Pestov <sp@datera.io> wrote:
> Indeed it looks like iter is NULL. I see the bug is still present in
> the latest dev branch. The problem is that we're not checking the
> return value of mempool_alloc(), which may be NULL if we pass
> GFP_NOWAIT.
>
> On Wed, Aug 13, 2014 at 2:21 PM, Larkin Lowrey
> <llowrey@nuclearwinter.com> wrote:
>> Here's the disassembly of bch_btree_node_read_done. The offending line
>> is 207 and the instruction is at offset 76.
* Re: Null pointer oops
 2014-08-13 21:30 ` Slava Pestov
@ 2014-08-13 21:34 ` Jianjian Huo
 2014-08-13 22:14 ` Larkin Lowrey
 2014-08-16 5:48 ` Peter Kieser
 2 siblings, 0 replies; 13+ messages in thread
From: Jianjian Huo @ 2014-08-13 21:34 UTC (permalink / raw)
 To: Slava Pestov; +Cc: Larkin Lowrey, Kent Overstreet, linux-bcache

Yes, it's GFP_NOIO in 3.16. And line 207 could be executed before line 206 because the compiler reordered the two stores: in the disassembly, the store for line 207 is at +76, ahead of the store for line 206 at +87.

On Wed, Aug 13, 2014 at 2:30 PM, Slava Pestov <sp@datera.io> wrote:
> I was mistaken. The bug is fixed in the pull request Kent sent to Jens for 3.16:
>
> http://evilpiepirate.org/git/linux-bcache.git/commit/?h=bcache-dev&id=bcf090e0040e30f8409e6a535a01e6473afb096f
>
> On Wed, Aug 13, 2014 at 2:25 PM, Slava Pestov <sp@datera.io> wrote:
>> Indeed it looks like iter is NULL. I see the bug is still present in
>> the latest dev branch. The problem is that we're not checking the
>> return value of mempool_alloc(), which may be NULL if we pass
>> GFP_NOWAIT.
>>
>> On Wed, Aug 13, 2014 at 2:21 PM, Larkin Lowrey
>> <llowrey@nuclearwinter.com> wrote:
>>> Here's the disassembly of bch_btree_node_read_done. The offending line
>>> is 207 and the instruction is at offset 76.
239 if (i->csum != btree_csum_set(b, i)) >>> 0x000000000000669d <+237>: cmp %rax,%r15 >>> 0x00000000000066a0 <+240>: jne 0x67df <bch_btree_node_read_done+559> >>> 0x00000000000067b8 <+520>: mov (%rbx),%r15 >>> >>> 240 goto err; >>> 241 break; >>> 242 } >>> 243 >>> 244 err = "empty set"; >>> 0x00000000000069e0 <+1072>: mov $0x0,%rdx >>> 0x00000000000069e7 <+1079>: jmpq 0x6807 <bch_btree_node_read_done+599> >>> >>> 245 if (i != b->keys.set[0].data && !i->keys) >>> 0x00000000000066a6 <+246>: cmp %rbx,0x108(%r12) >>> 0x00000000000066ae <+254>: je 0x67f0 <bch_btree_node_read_done+576> >>> 0x00000000000066b4 <+260>: mov 0x1c(%rbx),%eax >>> 0x00000000000066b7 <+263>: test %eax,%eax >>> 0x00000000000066b9 <+265>: je 0x69e0 >>> <bch_btree_node_read_done+1072> >>> >>> 246 goto err; >>> 247 >>> 248 bch_btree_iter_push(iter, i->start, >>> bset_bkey_last(i)); >>> 0x00000000000066c3 <+275>: mov %r14,%rsi >>> 0x00000000000066c6 <+278>: mov %r13,%rdi >>> 0x00000000000066c9 <+281>: callq 0x66ce <bch_btree_node_read_done+286> >>> >>> 249 >>> 250 b->written += set_blocks(i, block_bytes(b->c)); >>> 0x00000000000066ce <+286>: mov 0x80(%r12),%rsi >>> 0x00000000000066d6 <+294>: mov 0x1c(%rbx),%eax >>> 0x00000000000066d9 <+297>: xor %edx,%edx >>> 0x00000000000066e3 <+307>: movzwl 0x430(%rsi),%ecx >>> 0x00000000000066ea <+314>: shl $0x9,%ecx >>> 0x00000000000066ed <+317>: movslq %ecx,%rcx >>> 0x00000000000066f0 <+320>: lea 0x1f(%rcx,%rax,8),%rax >>> 0x00000000000066f5 <+325>: div %rcx >>> 0x0000000000006704 <+340>: mov %eax,%edi >>> 0x0000000000006706 <+342>: add 0xc0(%r12),%di >>> 0x0000000000006712 <+354>: mov %di,0xc0(%r12) >>> >>> 251 } >>> 252 >>> 253 err = "corrupted btree"; >>> 0x00000000000069b0 <+1024>: mov $0x0,%rdx >>> 0x00000000000069b7 <+1031>: jmpq 0x6807 <bch_btree_node_read_done+599> >>> 0x00000000000069bc <+1036>: nopl 0x0(%rax) >>> >>> 254 for (i = write_block(b); >>> 0x00000000000068a1 <+753>: cmp %rdx,%rcx >>> 0x00000000000068a4 <+756>: jae 0x68e5 
<bch_btree_node_read_done+821> >>> 0x00000000000068e0 <+816>: cmp %rdx,%rcx >>> 0x00000000000068e3 <+819>: jb 0x68c8 <bch_btree_node_read_done+792> >>> >>> 255 bset_sector_offset(&b->keys, i) < KEY_SIZE(&b->key); >>> 256 i = ((void *) i) + block_bytes(b->c)) >>> 0x00000000000068d7 <+807>: mov %rcx,%rbx >>> 0x00000000000068da <+810>: sub %r8d,%ecx >>> >>> 257 if (i->seq == b->keys.set[0].data->seq) >>> 0x00000000000068a6 <+758>: mov 0x10(%r8),%rdi >>> 0x00000000000068aa <+762>: cmp %rdi,0x10(%rbx) >>> 0x00000000000068ae <+766>: je 0x69b0 >>> <bch_btree_node_read_done+1024> >>> 0x00000000000068b4 <+772>: cltq >>> 0x00000000000068b6 <+774>: mov %rax,%r9 >>> 0x00000000000068b9 <+777>: lea (%rbx,%rax,1),%rcx >>> 0x00000000000068bd <+781>: neg %r9 >>> 0x00000000000068c0 <+784>: jmp 0x68d7 <bch_btree_node_read_done+807> >>> 0x00000000000068c2 <+786>: nopw 0x0(%rax,%rax,1) >>> 0x00000000000068c8 <+792>: lea (%rbx,%rax,1),%rcx >>> 0x00000000000068cc <+796>: cmp 0x10(%rcx,%r9,1),%rdi >>> 0x00000000000068d1 <+801>: je 0x69b0 >>> <bch_btree_node_read_done+1024> >>> >>> 258 goto err; >>> 259 >>> 260 bch_btree_sort_and_fix_extents(&b->keys, iter, &b->c->sort); >>> 0x00000000000068e5 <+821>: lea 0xc8(%r12),%r14 >>> 0x00000000000068ed <+829>: lea 0xcb60(%rsi),%rdx >>> 0x00000000000068f4 <+836>: mov %r13,%rsi >>> 0x00000000000068f7 <+839>: mov %r14,%rdi >>> 0x00000000000068fa <+842>: callq 0x68ff <bch_btree_node_read_done+847> >>> >>> 261 >>> 262 i = b->keys.set[0].data; >>> 0x0000000000006907 <+855>: mov 0x108(%r12),%rbx >>> >>> 263 err = "short btree key"; >>> 0x00000000000069ec <+1084>: mov $0x0,%rdx >>> 0x00000000000069f3 <+1091>: jmpq 0x6807 <bch_btree_node_read_done+599> >>> >>> 264 if (b->keys.set[0].size && >>> 0x00000000000068ff <+847>: mov 0xe0(%r12),%eax >>> 0x0000000000006914 <+868>: test %eax,%eax >>> 0x0000000000006916 <+870>: je 0x694d <bch_btree_node_read_done+925> >>> 0x0000000000006944 <+916>: test %rax,%rax >>> 0x0000000000006947 <+919>: js 0x69ec >>> 
<bch_btree_node_read_done+1084> >>> >>> 265 bkey_cmp(&b->key, &b->keys.set[0].end) < 0) >>> 266 goto err; >>> 267 >>> 268 if (b->written < btree_blocks(b)) >>> 0x000000000000694d <+925>: mov 0x80(%r12),%rax >>> 0x0000000000006955 <+933>: movzwl 0xc0(%r12),%esi >>> 0x0000000000006965 <+949>: movzwl 0xde2(%rax),%ecx >>> 0x000000000000696c <+956>: shr %cl,%rdx >>> 0x000000000000696f <+959>: cmp %edx,%esi >>> 0x0000000000006971 <+961>: jae 0x6868 <bch_btree_node_read_done+696> >>> >>> 269 bch_bset_init_next(&b->keys, write_block(b), >>> 0x000000000000698f <+991>: mov %r14,%rdi >>> 0x000000000000699e <+1006>: callq 0x69a3 >>> <bch_btree_node_read_done+1011> >>> 0x00000000000069a3 <+1011>: mov 0x80(%r12),%rax >>> 0x00000000000069ab <+1019>: jmpq 0x6868 <bch_btree_node_read_done+696> >>> >>> 270 bset_magic(&b->c->sb)); >>> 271 out: >>> 272 mempool_free(iter, b->c->fill_iter); >>> 0x0000000000006868 <+696>: mov 0xcb58(%rax),%rsi >>> 0x000000000000686f <+703>: mov %r13,%rdi >>> 0x0000000000006872 <+706>: callq 0x6877 <bch_btree_node_read_done+711> >>> >>> 273 return; >>> 274 err: >>> 275 set_btree_node_io_error(b); >>> 276 bch_cache_set_error(b->c, "%s at bucket %zu, block %u, >>> %u keys", >>> 0x0000000000006829 <+633>: mov 0x1c(%rbx),%r9d >>> 0x000000000000684a <+666>: mov %esi,%ecx >>> 0x000000000000684c <+668>: mov $0x0,%rsi >>> 0x0000000000006853 <+675>: shr %cl,%r8d >>> 0x0000000000006856 <+678>: mov %rax,%rcx >>> 0x0000000000006859 <+681>: xor %eax,%eax >>> 0x000000000000685b <+683>: callq 0x6860 <bch_btree_node_read_done+688> >>> 0x0000000000006860 <+688>: mov 0x80(%r12),%rax >>> >>> 277 err, PTR_BUCKET_NR(b->c, &b->key, 0), >>> 278 bset_block_offset(b, i), i->keys); >>> 279 goto out; >>> 280 } >>> 0x0000000000006877 <+711>: pop %rbx >>> 0x0000000000006878 <+712>: pop %r12 >>> 0x000000000000687a <+714>: pop %r13 >>> 0x000000000000687c <+716>: pop %r14 >>> 0x000000000000687e <+718>: pop %r15 >>> 0x0000000000006880 <+720>: pop %rbp >>> 0x0000000000006881 <+721>: retq 
>>> 0x0000000000006882 <+722>: movzwl 0x430(%rsi),%eax
>>> 0x0000000000006889 <+729>: shl $0x9,%eax
>>> 0x000000000000688c <+732>: imul %eax,%ecx
>>> 0x000000000000688f <+735>: movslq %ecx,%rbx
>>>
>>>
>>> On 8/13/2014 1:45 PM, Slava Pestov wrote:
>>>> Can you post the disassembly of the function?
>>>>
>>>> On Wed, Aug 13, 2014 at 11:35 AM, Larkin Lowrey
>>>> <llowrey@nuclearwinter.com> wrote:
>>>>> Thanks. Trying gdb helped me find the answer. I needed to install the
>>>>> kernel-debuginfo-3.15.8-200.fc20.x86_64 package via yum.
>>>>>
>>>>> From addr2line:
>>>>>> bch_btree_node_read_done+0x4c
>>>>>> drivers/md/bcache/btree.c:207
>>>>> Here's a snippet from gdb:
>>>>>
>>>>>> (gdb) list *(bch_btree_node_read_done+0x4c)
>>>>>> 0x65fc is in bch_btree_node_read_done (drivers/md/bcache/btree.c:207).
>>>>>> 202 struct bset *i = btree_bset_first(b);
>>>>>> 203 struct btree_iter *iter;
>>>>>> 204
>>>>>> 205 iter = mempool_alloc(b->c->fill_iter, GFP_NOWAIT);
>>>>>> 206 iter->size = b->c->sb.bucket_size / b->c->sb.block_size;
>>>>>> 207 iter->used = 0;
>>>>>> 208
>>>>>> 209 #ifdef CONFIG_BCACHE_DEBUG
>>>>>> 210 iter->b = &b->keys;
>>>>>> 211 #endif
>>>>> This doesn't make any sense to me. If iter was null I would expect line
>>>>> 206 to blow up first.
>>>>>
>>>>> --Larkin
>>>>>
>>>>> On 8/13/2014 12:41 PM, Slava Pestov wrote:
>>>>>> You can try to use gdb:
>>>>>>
>>>>>> gdb /lib/modules/.../foo.ko
>>>>>>
>>>>>> list *(bch_btree_node_read_done+0x4c)
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 13, 2014 at 9:40 AM, Larkin Lowrey
>>>>>> <llowrey@nuclearwinter.com> wrote:
>>>>>>> This is making me feel very dumb. I've googled extensively but can't
>>>>>>> figure out how to run addr2line for a module.
>>>>>>>
>>>>>>> I'm running Fedora 20 and the kernel did not have debugging symbols. I
>>>>>>> downloaded the version with symbols but I don't know if the addresses
>>>>>>> are going to be the same. Bcache is a module for me and that's where
>>>>>>> things get tricky. Do you have any tips?
>>>>>>> >>>>>>> --Larkin >>>>>>> >>>>>>> On 8/13/2014 12:04 AM, Kent Overstreet wrote: >>>>>>>> Any chance you could do an addr2line and get me the exact line where >>>>>>>> it happened? >>>>>>>> >>>>>>>> On Aug 12, 2014 10:02 PM, "Larkin Lowrey" <llowrey@nuclearwinter.com >>>>>>>> <mailto:llowrey@nuclearwinter.com>> wrote: >>>>>>>> >>>>>>>> I got an oops while doing some heavy I/O. I have an md raid10 cache >>>>>>>> device (4 SSDs) and 3 md raid5/6 backing devices. This setup has been >>>>>>>> well behaved for about 6 months. >>>>>>>> >>>>>>>> If this isn't a known issue is there anything I can do to provide more >>>>>>>> useful information? >>>>>>>> >>>>>>>> I'm running kernel 3.15.8-200.fc20.x86_64. >>>>>>>> >>>>>>>> [210884.047249] BUG: unable to handle kernel NULL pointer >>>>>>>> dereference at 0000000000000008 >>>>>>>> [210884.055605] IP: [<ffffffffa01625fc>] >>>>>>>> bch_btree_node_read_done+0x4c/0x450 [bcache] >>>>>>>> [210884.063723] PGD 0 >>>>>>>> [210884.066053] Oops: 0002 [#1] SMP >>>>>>>> [210884.069610] Modules linked in: lp parport binfmt_misc >>>>>>>> ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM >>>>>>>> iptable_mangle tun bridge stp llc xt_multiport ebtable_nat >>>>>>>> ebtables hwmon_vid ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4 >>>>>>>> nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter xt_conntrack >>>>>>>> ip6_tables nf_conntrack keyspan ezusb kvm_amd kvm crct10dif_pclmul >>>>>>>> crc32_pclmul crc32c_intel ghash_clmulni_intel microcode serio_raw >>>>>>>> amd64_edac_mod edac_core fam15h_power k10temp edac_mce_amd >>>>>>>> sp5100_tco i2c_piix4 igb ptp pps_core dca shpchp acpi_cpufreq >>>>>>>> btrfs bcache raid456 async_raid6_recov async_memcpy async_pq >>>>>>>> async_xor async_tx xor raid6_pq raid10 i2c_algo_bit drm_kms_helper >>>>>>>> ttm drm i2c_core mpt2sas mvsas libsas raid_class >>>>>>>> scsi_transport_sas cpufreq_stats >>>>>>>> [210884.140704] CPU: 5 PID: 11188 Comm: kworker/5:1 Not tainted >>>>>>>> 3.15.8-200.fc20.x86_64 #1 
>>>>>>>> [210884.149069] Hardware name: /H8DG6/H8DGi, BIOS 3.0a 07/2 >>>>>>>> [210884.155280] Workqueue: bcache cache_lookup [bcache] >>>>>>>> [210884.160531] task: ffff880218633160 ti: ffff8800217b8000 >>>>>>>> task.ti: ffff8800217b8000 >>>>>>>> [210884.168502] RIP: 0010:[<ffffffffa01625fc>] >>>>>>>> [<ffffffffa01625fc>] bch_btree_node_read_done+0x4c/0x450 [bcache] >>>>>>>> [210884.179105] RSP: 0000:ffff8800217bbbe8 EFLAGS: 00010212 >>>>>>>> [210884.184806] RAX: 0000000000000400 RBX: ffff880245ec0000 RCX: >>>>>>>> 0000000000000000 >>>>>>>> [210884.192480] RDX: 0000000000000000 RSI: ffff880418380000 RDI: >>>>>>>> 0000000000000246 >>>>>>>> [210884.200075] RBP: ffff8800217bbc10 R08: 0000000000000000 R09: >>>>>>>> 0000000000000f6b >>>>>>>> [210884.207738] R10: 0000000000000000 R11: 0000000000000400 R12: >>>>>>>> ffff880413d06c00 >>>>>>>> [210884.215391] R13: 0000000000000000 R14: ffff8800217bbc20 R15: >>>>>>>> ffff880413d06c00 >>>>>>>> [210884.222961] FS: 00007f73bacd6880(0000) >>>>>>>> GS:ffff88021fd40000(0000) knlGS:0000000000000000 >>>>>>>> [210884.231516] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b >>>>>>>> [210884.237557] CR2: 0000000000000008 CR3: 0000000001c11000 CR4: >>>>>>>> 00000000000407e0 >>>>>>>> [210884.245131] Stack: >>>>>>>> [210884.247395] ffff880274f4d020 ffff880413d06c00 >>>>>>>> 0000bfcc44a463f8 ffff8800217bbc20 >>>>>>>> [210884.255337] ffff880413d06c00 ffff8800217bbc78 >>>>>>>> ffffffffa0162b68 0000000000000000 >>>>>>>> [210884.263256] ffff880218633160 0000000000000000 >>>>>>>> 0000000000000000 0000000000000000 >>>>>>>> [210884.271234] Call Trace: >>>>>>>> [210884.273985] [<ffffffffa0162b68>] >>>>>>>> bch_btree_node_read+0x168/0x190 [bcache] >>>>>>>> [210884.281258] [<ffffffffa0163f69>] >>>>>>>> bch_btree_node_get+0x169/0x290 [bcache] >>>>>>>> [210884.288377] [<ffffffffa01642f5>] >>>>>>>> bch_btree_map_keys_recurse+0xd5/0x1d0 [bcache] >>>>>>>> [210884.296311] [<ffffffffa016dcb0>] ? 
>>>>>>>> cached_dev_congested+0x180/0x180 [bcache] >>>>>>>> [210884.303953] [<ffffffff8135b204>] ? >>>>>>>> call_rwsem_down_read_failed+0x14/0x30 >>>>>>>> [210884.311158] [<ffffffffa01673f7>] >>>>>>>> bch_btree_map_keys+0x127/0x150 [bcache] >>>>>>>> [210884.318273] [<ffffffffa016dcb0>] ? >>>>>>>> cached_dev_congested+0x180/0x180 [bcache] >>>>>>>> [210884.325826] [<ffffffffa016e7f5>] cache_lookup+0xf5/0x1f0 [bcache] >>>>>>>> [210884.332325] [<ffffffff810a4af6>] process_one_work+0x176/0x430 >>>>>>>> [210884.338427] [<ffffffff810a578b>] worker_thread+0x11b/0x3a0 >>>>>>>> [210884.344282] [<ffffffff810a5670>] ? rescuer_thread+0x3b0/0x3b0 >>>>>>>> [210884.350447] [<ffffffff810ac528>] kthread+0xd8/0xf0 >>>>>>>> [210884.355615] [<ffffffff810ac450>] ? insert_kthread_work+0x40/0x40 >>>>>>>> [210884.362017] [<ffffffff816ff93c>] ret_from_fork+0x7c/0xb0 >>>>>>>> [210884.367756] [<ffffffff810ac450>] ? insert_kthread_work+0x40/0x40 >>>>>>>> [210884.374234] Code: 08 01 00 00 48 8b b8 58 cb 00 00 e8 bf 25 01 >>>>>>>> e1 49 8b b4 24 80 00 00 00 49 89 c5 31 d2 0f b7 86 32 04 00 00 66 >>>>>>>> f7 b6 30 04 00 00 <49> c7 45 08 00 00 00 00 0f b7 c0 49 89 45 00 >>>>>>>> 48 8b 43 10 48 85 >>>>>>>> [210884.395405] RIP [<ffffffffa01625fc>] >>>>>>>> bch_btree_node_read_done+0x4c/0x450 [bcache] >>>>>>>> [210884.403389] RSP <ffff8800217bbbe8> >>>>>>>> [210884.407171] CR2: 0000000000000008 >>>>>>>> [210884.411233] ---[ end trace 0064e6abfd068c85 ]--- >>>>>>>> [210884.416352] BUG: unable to handle kernel paging request at >>>>>>>> ffffffffffffffd8 >>>>>>>> [210884.423871] IP: [<ffffffff810acb10>] kthread_data+0x10/0x20 >>>>>>>> [210884.429915] PGD 1c14067 PUD 1c16067 PMD 0 >>>>>>>> >>>>>>>> --Larkin >>>>>>>> >>>>>>>> -- >>>>>>>> To unsubscribe from this list: send the line "unsubscribe >>>>>>>> linux-bcache" in >>>>>>>> the body of a message to majordomo@vger.kernel.org >>>>>>>> <mailto:majordomo@vger.kernel.org> >>>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>>>>> 
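On the question quoted above (why line 207 faults rather than line 206): in the disassembly, the store for line 207 (`movq $0x0,0x8(%r13)` at +76) is scheduled before the store for line 206 (`mov %rax,0x0(%r13)` at +87), and the loads for line 206 go through b->c rather than iter. With R13 = 0 in the register dump, the first access through iter is therefore a write to address 0 + 8, which matches CR2 = 0000000000000008. The sketch below is a userspace illustration only, not bcache code. `struct iter_sketch`, `field_store_address()`, `alloc_nowait()` and `iter_init()` are hypothetical names, the two-field layout is assumed from the 0x0/0x8 offsets in the disassembly, and the NULL check is a generic pattern for an allocation that may fail, not the upstream fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the first two fields of struct btree_iter.
 * The layout is an assumption taken from the disassembly offsets. */
struct iter_sketch {
    uint64_t size;  /* written by line 206 via 0x0(%r13) */
    uint64_t used;  /* written by line 207 via 0x8(%r13) */
};

/* Address a store to a field touches when the struct lives at `base`.
 * With base == 0 (a NULL iter) this is the value the oops reports in CR2. */
static uint64_t field_store_address(uint64_t base, size_t field_offset)
{
    return base + field_offset;
}

/* Hypothetical allocator that may return NULL instead of waiting,
 * as a non-blocking allocation can. `fail` forces the failure path. */
static struct iter_sketch *alloc_nowait(int fail)
{
    return fail ? NULL : malloc(sizeof(struct iter_sketch));
}

/* Returns 0 on success and stores the iterator in *out (caller frees it).
 * Returns -1 without touching *out if the allocation failed.
 * block_size is assumed to be nonzero. */
static int iter_init(struct iter_sketch **out, uint64_t bucket_size,
                     uint64_t block_size, int fail)
{
    struct iter_sketch *iter = alloc_nowait(fail);

    if (!iter)  /* check the result before the first store through it */
        return -1;

    iter->size = bucket_size / block_size;
    iter->used = 0;
    *out = iter;
    return 0;
}
```

Calling `field_store_address(0, offsetof(struct iter_sketch, used))` gives 8, the faulting address in the oops. How the real function should behave when the allocation fails (retry, block, or report an error) is a design decision for the bcache code and is not shown here.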
* Re: Null pointer oops
  2014-08-13 21:30 ` Slava Pestov
  2014-08-13 21:34   ` Jianjian Huo
@ 2014-08-13 22:14   ` Larkin Lowrey
  2014-08-16  5:48   ` Peter Kieser
  2 siblings, 0 replies; 13+ messages in thread
From: Larkin Lowrey @ 2014-08-13 22:14 UTC (permalink / raw)
  To: Slava Pestov; +Cc: Kent Overstreet, linux-bcache

Thanks for looking into this. It's good to know it has already been
addressed.

--Larkin

On 8/13/2014 4:30 PM, Slava Pestov wrote:
> I was mistaken. The bug is fixed in the pull request Kent sent to Jens for 3.16:
>
> http://evilpiepirate.org/git/linux-bcache.git/commit/?h=bcache-dev&id=bcf090e0040e30f8409e6a535a01e6473afb096f
>
> On Wed, Aug 13, 2014 at 2:25 PM, Slava Pestov <sp@datera.io> wrote:
>> Indeed it looks like iter is NULL. I see the bug is still present in
>> the latest dev branch. The problem is that we're not checking the
>> return value of mempool_alloc(), which may be NULL if we pass
>> GFP_NOWAIT.
>>
>> On Wed, Aug 13, 2014 at 2:21 PM, Larkin Lowrey
>> <llowrey@nuclearwinter.com> wrote:
>>> Here's the disassembly of bch_btree_node_read_done. The offending line
>>> is 207 and the instruction is at offset 76.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Null pointer oops 2014-08-13 21:30 ` Slava Pestov 2014-08-13 21:34 ` Jianjian Huo 2014-08-13 22:14 ` Larkin Lowrey @ 2014-08-16 5:48 ` Peter Kieser 2 siblings, 0 replies; 13+ messages in thread
From: Peter Kieser @ 2014-08-16 5:48 UTC (permalink / raw)
To: Slava Pestov, Larkin Lowrey; +Cc: Kent Overstreet, linux-bcache

On 2014-08-13 2:30 PM, Slava Pestov wrote:
> I was mistaken. The bug is fixed in the pull request Kent sent to Jens for 3.16:
>
> http://evilpiepirate.org/git/linux-bcache.git/commit/?h=bcache-dev&id=bcf090e0040e30f8409e6a535a01e6473afb096f

(Again) are these fixes going to be backported to Linux 3.10 (or other
longterm kernels)?

-Peter

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Null pointer oops 2014-08-13 21:25 ` Slava Pestov 2014-08-13 21:30 ` Slava Pestov @ 2014-08-13 21:32 ` Larkin Lowrey 2014-08-13 21:37 ` Slava Pestov 1 sibling, 1 reply; 13+ messages in thread
From: Larkin Lowrey @ 2014-08-13 21:32 UTC (permalink / raw)
To: Slava Pestov; +Cc: Kent Overstreet, linux-bcache

My swap is an LVM LV on top of a raid10-backed bcache device. I have had
a few oopses in recent months but have not been able to pin down the
cause. I have begun to suspect that the swap may be involved. The SSDs
in that raid10 are junky OCZ Agility3s. They seem to have a reputation
for periodic freezes or long pauses. Could it be that the kernel wanted
to write to the swap but couldn't because the SSDs were in a long pause,
and that caused mempool_alloc to return NULL, which then blew up the world?

Is there any reason not to put swap on top of a bcache device?

--Larkin

On 8/13/2014 4:25 PM, Slava Pestov wrote:
> Indeed it looks like iter is NULL. I see the bug is still present in
> the latest dev branch. The problem is that we're not checking the
> return value of mempool_alloc(), which may be NULL if we pass
> GFP_NOWAIT.
>
> On Wed, Aug 13, 2014 at 2:21 PM, Larkin Lowrey
> <llowrey@nuclearwinter.com> wrote:
>> Here's the disassembly of bch_btree_node_read_done. The offending line
>> is 207 and the instruction is at offset 76.
>> >> --Larkin >> >> 199 void bch_btree_node_read_done(struct btree *b) >> 200 { >> 0x00000000000065b0 <+0>: callq 0x65b5 <bch_btree_node_read_done+5> >> 0x00000000000065b5 <+5>: push %rbp >> 0x00000000000065b8 <+8>: mov %rsp,%rbp >> 0x00000000000065bb <+11>: push %r15 >> 0x00000000000065bd <+13>: push %r14 >> 0x00000000000065bf <+15>: push %r13 >> 0x00000000000065c1 <+17>: push %r12 >> 0x00000000000065c3 <+19>: mov %rdi,%r12 >> 0x00000000000065c6 <+22>: push %rbx >> >> 201 const char *err = "bad btree header"; >> 0x0000000000006800 <+592>: mov $0x0,%rdx >> >> 202 struct bset *i = btree_bset_first(b); >> 203 struct btree_iter *iter; >> 204 >> 205 iter = mempool_alloc(b->c->fill_iter, GFP_NOWAIT); >> 0x00000000000065b6 <+6>: xor %esi,%esi >> 0x00000000000065c7 <+23>: mov 0x80(%rdi),%rax >> 0x00000000000065d5 <+37>: mov 0xcb58(%rax),%rdi >> 0x00000000000065dc <+44>: callq 0x65e1 <bch_btree_node_read_done+49> >> 0x00000000000065e9 <+57>: mov %rax,%r13 >> >> 206 iter->size = b->c->sb.bucket_size / b->c->sb.block_size; >> 0x00000000000065e1 <+49>: mov 0x80(%r12),%rsi >> 0x00000000000065ec <+60>: xor %edx,%edx >> 0x00000000000065ee <+62>: movzwl 0x432(%rsi),%eax >> 0x00000000000065f5 <+69>: divw 0x430(%rsi) >> 0x0000000000006604 <+84>: movzwl %ax,%eax >> 0x0000000000006607 <+87>: mov %rax,0x0(%r13) >> >> 207 iter->used = 0; >> 0x00000000000065fc <+76>: movq $0x0,0x8(%r13) >> >> 208 >> 209 #ifdef CONFIG_BCACHE_DEBUG >> 210 iter->b = &b->keys; >> 211 #endif >> 212 >> 213 if (!i->seq) >> 0x000000000000660b <+91>: mov 0x10(%rbx),%rax >> 0x000000000000660f <+95>: test %rax,%rax >> 0x0000000000006612 <+98>: je 0x6800 <bch_btree_node_read_done+592> >> >> 214 goto err; >> 215 >> 216 for (; >> 0x000000000000664d <+157>: cmp %r9d,%ecx >> 0x0000000000006650 <+160>: jae 0x6882 <bch_btree_node_read_done+722> >> 0x0000000000006744 <+404>: cmp %r9d,%r10d >> 0x0000000000006747 <+407>: jae 0x6898 <bch_btree_node_read_done+744> >> >> 217 b->written < btree_blocks(b) && i->seq == >> 
b->keys.set[0].data->seq; >> 0x0000000000006618 <+104>: mov 0x80(%r12),%rsi >> 0x0000000000006625 <+117>: movzwl 0xc0(%r12),%edi >> 0x000000000000662e <+126>: mov 0x108(%r12),%r8 >> 0x0000000000006636 <+134>: movzwl 0xde2(%rsi),%ecx >> 0x0000000000006644 <+148>: mov %rdx,%r9 >> 0x0000000000006647 <+151>: shr %cl,%r9 >> 0x000000000000664a <+154>: movzwl %di,%ecx >> 0x0000000000006656 <+166>: cmp 0x10(%r8),%rax >> 0x000000000000665a <+170>: jne 0x6882 <bch_btree_node_read_done+722> >> 0x000000000000670f <+351>: mov %rdx,%r9 >> 0x000000000000672a <+378>: movzwl 0xde2(%rsi),%ecx >> 0x0000000000006738 <+392>: shr %cl,%r9 >> 0x000000000000674d <+413>: mov 0x10(%r8),%rcx >> 0x0000000000006751 <+417>: cmp %rcx,0x10(%rbx) >> 0x0000000000006755 <+421>: jne 0x6898 <bch_btree_node_read_done+744> >> 0x0000000000006892 <+738>: add %r8,%rbx >> 0x0000000000006895 <+741>: nopl (%rax) >> >> 218 i = write_block(b)) { >> 219 err = "unsupported bset version"; >> 0x00000000000069c0 <+1040>: mov $0x0,%rdx >> 0x00000000000069c7 <+1047>: jmpq 0x6807 <bch_btree_node_read_done+599> >> 0x00000000000069cc <+1052>: nopl 0x0(%rax) >> >> 220 if (i->version > BCACHE_BSET_VERSION) >> 0x0000000000006660 <+176>: mov 0x18(%rbx),%r10d >> 0x0000000000006664 <+180>: cmp $0x1,%r10d >> 0x0000000000006668 <+184>: ja 0x69c0 >> <bch_btree_node_read_done+1040> >> 0x000000000000666e <+190>: movzwl 0x430(%rsi),%r11d >> 0x0000000000006676 <+198>: jmpq 0x6769 <bch_btree_node_read_done+441> >> 0x000000000000667b <+203>: nopl 0x0(%rax,%rax,1) >> 0x000000000000675b <+427>: mov 0x18(%rbx),%r10d >> 0x000000000000675f <+431>: cmp $0x1,%r10d >> 0x0000000000006763 <+435>: ja 0x69c0 >> <bch_btree_node_read_done+1040> >> >> 221 goto err; >> 222 >> 223 err = "bad btree header"; >> 224 if (b->written + set_blocks(i, block_bytes(b->c)) > >> 0x0000000000006769 <+441>: mov 0x1c(%rbx),%eax >> 0x000000000000676c <+444>: mov %r11,%rcx >> 0x000000000000676f <+447>: xor %edx,%edx >> 0x0000000000006771 <+449>: shl $0x9,%rcx >> 
0x0000000000006775 <+453>: movzwl %di,%edi >> 0x0000000000006778 <+456>: mov %r9d,%r9d >> 0x000000000000677b <+459>: and $0x1fffe00,%ecx >> 0x0000000000006781 <+465>: lea 0x20(,%rax,8),%r8 >> 0x0000000000006789 <+473>: lea -0x1(%r8,%rcx,1),%rax >> 0x000000000000678e <+478>: div %rcx >> 0x0000000000006791 <+481>: add %rdi,%rax >> 0x0000000000006794 <+484>: cmp %r9,%rax >> 0x0000000000006797 <+487>: ja 0x6800 <bch_btree_node_read_done+592> >> >> 225 btree_blocks(b)) >> 226 goto err; >> 227 >> 228 err = "bad magic"; >> 0x00000000000069d0 <+1056>: mov $0x0,%rdx >> 0x00000000000069d7 <+1063>: jmpq 0x6807 <bch_btree_node_read_done+599> >> 0x00000000000069dc <+1068>: nopl 0x0(%rax) >> >> 229 if (i->magic != bset_magic(&b->c->sb)) >> 0x00000000000067aa <+506>: cmp %rax,0x8(%rbx) >> 0x00000000000067ae <+510>: jne 0x69d0 >> <bch_btree_node_read_done+1056> >> >> 230 goto err; >> 231 >> 232 err = "bad checksum"; >> 0x00000000000067df <+559>: mov $0x0,%rdx >> 0x00000000000067e6 <+566>: jmp 0x6807 <bch_btree_node_read_done+599> >> 0x00000000000067e8 <+568>: nopl 0x0(%rax,%rax,1) >> 0x00000000000067f0 <+576>: mov 0x1c(%rbx),%eax >> 0x00000000000067f3 <+579>: jmpq 0x66bf <bch_btree_node_read_done+271> >> 0x00000000000067f8 <+584>: nopl 0x0(%rax,%rax,1) >> >> 233 switch (i->version) { >> 0x00000000000067b4 <+516>: cmp $0x1,%r10d >> 0x00000000000067bb <+523>: je 0x6680 <bch_btree_node_read_done+208> >> >> 234 case 0: >> 235 if (i->csum != csum_set(i)) >> 0x00000000000067c1 <+529>: lea 0x20(%rbx),%r14 >> 0x00000000000067c5 <+533>: lea 0x8(%rbx),%rdi >> 0x00000000000067ce <+542>: sub %rdi,%rsi >> 0x00000000000067d1 <+545>: callq 0x67d6 <bch_btree_node_read_done+550> >> 0x00000000000067d6 <+550>: cmp %rax,%r15 >> 0x00000000000067d9 <+553>: je 0x66a6 <bch_btree_node_read_done+246> >> 236 goto err; >> 237 break; >> 238 case BCACHE_BSET_VERSION: >> 239 if (i->csum != btree_csum_set(b, i)) >> 0x000000000000669d <+237>: cmp %rax,%r15 >> 0x00000000000066a0 <+240>: jne 0x67df 
<bch_btree_node_read_done+559> >> 0x00000000000067b8 <+520>: mov (%rbx),%r15 >> >> 240 goto err; >> 241 break; >> 242 } >> 243 >> 244 err = "empty set"; >> 0x00000000000069e0 <+1072>: mov $0x0,%rdx >> 0x00000000000069e7 <+1079>: jmpq 0x6807 <bch_btree_node_read_done+599> >> >> 245 if (i != b->keys.set[0].data && !i->keys) >> 0x00000000000066a6 <+246>: cmp %rbx,0x108(%r12) >> 0x00000000000066ae <+254>: je 0x67f0 <bch_btree_node_read_done+576> >> 0x00000000000066b4 <+260>: mov 0x1c(%rbx),%eax >> 0x00000000000066b7 <+263>: test %eax,%eax >> 0x00000000000066b9 <+265>: je 0x69e0 >> <bch_btree_node_read_done+1072> >> >> 246 goto err; >> 247 >> 248 bch_btree_iter_push(iter, i->start, >> bset_bkey_last(i)); >> 0x00000000000066c3 <+275>: mov %r14,%rsi >> 0x00000000000066c6 <+278>: mov %r13,%rdi >> 0x00000000000066c9 <+281>: callq 0x66ce <bch_btree_node_read_done+286> >> >> 249 >> 250 b->written += set_blocks(i, block_bytes(b->c)); >> 0x00000000000066ce <+286>: mov 0x80(%r12),%rsi >> 0x00000000000066d6 <+294>: mov 0x1c(%rbx),%eax >> 0x00000000000066d9 <+297>: xor %edx,%edx >> 0x00000000000066e3 <+307>: movzwl 0x430(%rsi),%ecx >> 0x00000000000066ea <+314>: shl $0x9,%ecx >> 0x00000000000066ed <+317>: movslq %ecx,%rcx >> 0x00000000000066f0 <+320>: lea 0x1f(%rcx,%rax,8),%rax >> 0x00000000000066f5 <+325>: div %rcx >> 0x0000000000006704 <+340>: mov %eax,%edi >> 0x0000000000006706 <+342>: add 0xc0(%r12),%di >> 0x0000000000006712 <+354>: mov %di,0xc0(%r12) >> >> 251 } >> 252 >> 253 err = "corrupted btree"; >> 0x00000000000069b0 <+1024>: mov $0x0,%rdx >> 0x00000000000069b7 <+1031>: jmpq 0x6807 <bch_btree_node_read_done+599> >> 0x00000000000069bc <+1036>: nopl 0x0(%rax) >> >> 254 for (i = write_block(b); >> 0x00000000000068a1 <+753>: cmp %rdx,%rcx >> 0x00000000000068a4 <+756>: jae 0x68e5 <bch_btree_node_read_done+821> >> 0x00000000000068e0 <+816>: cmp %rdx,%rcx >> 0x00000000000068e3 <+819>: jb 0x68c8 <bch_btree_node_read_done+792> >> >> 255 bset_sector_offset(&b->keys, i) < 
KEY_SIZE(&b->key); >> 256 i = ((void *) i) + block_bytes(b->c)) >> 0x00000000000068d7 <+807>: mov %rcx,%rbx >> 0x00000000000068da <+810>: sub %r8d,%ecx >> >> 257 if (i->seq == b->keys.set[0].data->seq) >> 0x00000000000068a6 <+758>: mov 0x10(%r8),%rdi >> 0x00000000000068aa <+762>: cmp %rdi,0x10(%rbx) >> 0x00000000000068ae <+766>: je 0x69b0 >> <bch_btree_node_read_done+1024> >> 0x00000000000068b4 <+772>: cltq >> 0x00000000000068b6 <+774>: mov %rax,%r9 >> 0x00000000000068b9 <+777>: lea (%rbx,%rax,1),%rcx >> 0x00000000000068bd <+781>: neg %r9 >> 0x00000000000068c0 <+784>: jmp 0x68d7 <bch_btree_node_read_done+807> >> 0x00000000000068c2 <+786>: nopw 0x0(%rax,%rax,1) >> 0x00000000000068c8 <+792>: lea (%rbx,%rax,1),%rcx >> 0x00000000000068cc <+796>: cmp 0x10(%rcx,%r9,1),%rdi >> 0x00000000000068d1 <+801>: je 0x69b0 >> <bch_btree_node_read_done+1024> >> >> 258 goto err; >> 259 >> 260 bch_btree_sort_and_fix_extents(&b->keys, iter, &b->c->sort); >> 0x00000000000068e5 <+821>: lea 0xc8(%r12),%r14 >> 0x00000000000068ed <+829>: lea 0xcb60(%rsi),%rdx >> 0x00000000000068f4 <+836>: mov %r13,%rsi >> 0x00000000000068f7 <+839>: mov %r14,%rdi >> 0x00000000000068fa <+842>: callq 0x68ff <bch_btree_node_read_done+847> >> >> 261 >> 262 i = b->keys.set[0].data; >> 0x0000000000006907 <+855>: mov 0x108(%r12),%rbx >> >> 263 err = "short btree key"; >> 0x00000000000069ec <+1084>: mov $0x0,%rdx >> 0x00000000000069f3 <+1091>: jmpq 0x6807 <bch_btree_node_read_done+599> >> >> 264 if (b->keys.set[0].size && >> 0x00000000000068ff <+847>: mov 0xe0(%r12),%eax >> 0x0000000000006914 <+868>: test %eax,%eax >> 0x0000000000006916 <+870>: je 0x694d <bch_btree_node_read_done+925> >> 0x0000000000006944 <+916>: test %rax,%rax >> 0x0000000000006947 <+919>: js 0x69ec >> <bch_btree_node_read_done+1084> >> >> 265 bkey_cmp(&b->key, &b->keys.set[0].end) < 0) >> 266 goto err; >> 267 >> 268 if (b->written < btree_blocks(b)) >> 0x000000000000694d <+925>: mov 0x80(%r12),%rax >> 0x0000000000006955 <+933>: movzwl 
0xc0(%r12),%esi >> 0x0000000000006965 <+949>: movzwl 0xde2(%rax),%ecx >> 0x000000000000696c <+956>: shr %cl,%rdx >> 0x000000000000696f <+959>: cmp %edx,%esi >> 0x0000000000006971 <+961>: jae 0x6868 <bch_btree_node_read_done+696> >> >> 269 bch_bset_init_next(&b->keys, write_block(b), >> 0x000000000000698f <+991>: mov %r14,%rdi >> 0x000000000000699e <+1006>: callq 0x69a3 >> <bch_btree_node_read_done+1011> >> 0x00000000000069a3 <+1011>: mov 0x80(%r12),%rax >> 0x00000000000069ab <+1019>: jmpq 0x6868 <bch_btree_node_read_done+696> >> >> 270 bset_magic(&b->c->sb)); >> 271 out: >> 272 mempool_free(iter, b->c->fill_iter); >> 0x0000000000006868 <+696>: mov 0xcb58(%rax),%rsi >> 0x000000000000686f <+703>: mov %r13,%rdi >> 0x0000000000006872 <+706>: callq 0x6877 <bch_btree_node_read_done+711> >> >> 273 return; >> 274 err: >> 275 set_btree_node_io_error(b); >> 276 bch_cache_set_error(b->c, "%s at bucket %zu, block %u, >> %u keys", >> 0x0000000000006829 <+633>: mov 0x1c(%rbx),%r9d >> 0x000000000000684a <+666>: mov %esi,%ecx >> 0x000000000000684c <+668>: mov $0x0,%rsi >> 0x0000000000006853 <+675>: shr %cl,%r8d >> 0x0000000000006856 <+678>: mov %rax,%rcx >> 0x0000000000006859 <+681>: xor %eax,%eax >> 0x000000000000685b <+683>: callq 0x6860 <bch_btree_node_read_done+688> >> 0x0000000000006860 <+688>: mov 0x80(%r12),%rax >> >> 277 err, PTR_BUCKET_NR(b->c, &b->key, 0), >> 278 bset_block_offset(b, i), i->keys); >> 279 goto out; >> 280 } >> 0x0000000000006877 <+711>: pop %rbx >> 0x0000000000006878 <+712>: pop %r12 >> 0x000000000000687a <+714>: pop %r13 >> 0x000000000000687c <+716>: pop %r14 >> 0x000000000000687e <+718>: pop %r15 >> 0x0000000000006880 <+720>: pop %rbp >> 0x0000000000006881 <+721>: retq >> 0x0000000000006882 <+722>: movzwl 0x430(%rsi),%eax >> 0x0000000000006889 <+729>: shl $0x9,%eax >> 0x000000000000688c <+732>: imul %eax,%ecx >> 0x000000000000688f <+735>: movslq %ecx,%rbx >> >> >> On 8/13/2014 1:45 PM, Slava Pestov wrote: >>> Can you post the disassembly of the 
function? >>> >>> On Wed, Aug 13, 2014 at 11:35 AM, Larkin Lowrey >>> <llowrey@nuclearwinter.com> wrote: >>>> Thanks. Trying gdb helped me find the answer. I needed to install the >>>> kernel-debuginfo-3.15.8-200.fc20.x86_64 package via yum. >>>> >>>> From addr2line: >>>>> bch_btree_node_read_done+0x4c >>>>> drivers/md/bcache/btree.c:207 >>>> Here'a a snippet from gdb: >>>> >>>>> (gdb) list *(bch_btree_node_read_done+0x4c) >>>>> 0x65fc is in bch_btree_node_read_done (drivers/md/bcache/btree.c:207). >>>>> 202 struct bset *i = btree_bset_first(b); >>>>> 203 struct btree_iter *iter; >>>>> 204 >>>>> 205 iter = mempool_alloc(b->c->fill_iter, GFP_NOWAIT); >>>>> 206 iter->size = b->c->sb.bucket_size / b->c->sb.block_size; >>>>> 207 iter->used = 0; >>>>> 208 >>>>> 209 #ifdef CONFIG_BCACHE_DEBUG >>>>> 210 iter->b = &b->keys; >>>>> 211 #endif >>>> This doesn't make any sense to me. If iter was null I would expect line >>>> 206 to blow up first. >>>> >>>> --Larkin >>>> >>>> On 8/13/2014 12:41 PM, Slava Pestov wrote: >>>>> You can try to use gdb: >>>>> >>>>> gdb /lib/modules/.../foo.ko >>>>> >>>>> list *(bch_btree_node_read_done+0x4c) >>>>> >>>>> >>>>> On Wed, Aug 13, 2014 at 9:40 AM, Larkin Lowrey >>>>> <llowrey@nuclearwinter.com> wrote: >>>>>> This is making be feel very dumb. I've googled extensively but can't >>>>>> figure out how to run addr2line for a module. >>>>>> >>>>>> I'm running Fedora 20 and the kernel did not have debugging symbols. I >>>>>> downloaded the version with symbols but I don't know if the addresses >>>>>> are going to be the same. Bcache is a module for me and that's where >>>>>> things get tricky. Do you have any tips? >>>>>> >>>>>> --Larkin >>>>>> >>>>>> On 8/13/2014 12:04 AM, Kent Overstreet wrote: >>>>>>> Any chance you could do an addr2line and get me the exact line where >>>>>>> it happened? 
>>>>>>> >>>>>>> On Aug 12, 2014 10:02 PM, "Larkin Lowrey" <llowrey@nuclearwinter.com >>>>>>> <mailto:llowrey@nuclearwinter.com>> wrote: >>>>>>> >>>>>>> I got an oops while doing some heavy I/O. I have an md raid10 cache >>>>>>> device (4 SSDs) and 3 md raid5/6 backing devices. This setup has been >>>>>>> well behaved for about 6 months. >>>>>>> >>>>>>> If this isn't a known issue is there anything I can do to provide more >>>>>>> useful information? >>>>>>> >>>>>>> I'm running kernel 3.15.8-200.fc20.x86_64. >>>>>>> >>>>>>> [210884.047249] BUG: unable to handle kernel NULL pointer >>>>>>> dereference at 0000000000000008 >>>>>>> [210884.055605] IP: [<ffffffffa01625fc>] >>>>>>> bch_btree_node_read_done+0x4c/0x450 [bcache] >>>>>>> [210884.063723] PGD 0 >>>>>>> [210884.066053] Oops: 0002 [#1] SMP >>>>>>> [210884.069610] Modules linked in: lp parport binfmt_misc >>>>>>> ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM >>>>>>> iptable_mangle tun bridge stp llc xt_multiport ebtable_nat >>>>>>> ebtables hwmon_vid ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4 >>>>>>> nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter xt_conntrack >>>>>>> ip6_tables nf_conntrack keyspan ezusb kvm_amd kvm crct10dif_pclmul >>>>>>> crc32_pclmul crc32c_intel ghash_clmulni_intel microcode serio_raw >>>>>>> amd64_edac_mod edac_core fam15h_power k10temp edac_mce_amd >>>>>>> sp5100_tco i2c_piix4 igb ptp pps_core dca shpchp acpi_cpufreq >>>>>>> btrfs bcache raid456 async_raid6_recov async_memcpy async_pq >>>>>>> async_xor async_tx xor raid6_pq raid10 i2c_algo_bit drm_kms_helper >>>>>>> ttm drm i2c_core mpt2sas mvsas libsas raid_class >>>>>>> scsi_transport_sas cpufreq_stats >>>>>>> [210884.140704] CPU: 5 PID: 11188 Comm: kworker/5:1 Not tainted >>>>>>> 3.15.8-200.fc20.x86_64 #1 >>>>>>> [210884.149069] Hardware name: /H8DG6/H8DGi, BIOS 3.0a 07/2 >>>>>>> [210884.155280] Workqueue: bcache cache_lookup [bcache] >>>>>>> [210884.160531] task: ffff880218633160 ti: ffff8800217b8000 >>>>>>> task.ti: 
ffff8800217b8000 >>>>>>> [210884.168502] RIP: 0010:[<ffffffffa01625fc>] >>>>>>> [<ffffffffa01625fc>] bch_btree_node_read_done+0x4c/0x450 [bcache] >>>>>>> [210884.179105] RSP: 0000:ffff8800217bbbe8 EFLAGS: 00010212 >>>>>>> [210884.184806] RAX: 0000000000000400 RBX: ffff880245ec0000 RCX: >>>>>>> 0000000000000000 >>>>>>> [210884.192480] RDX: 0000000000000000 RSI: ffff880418380000 RDI: >>>>>>> 0000000000000246 >>>>>>> [210884.200075] RBP: ffff8800217bbc10 R08: 0000000000000000 R09: >>>>>>> 0000000000000f6b >>>>>>> [210884.207738] R10: 0000000000000000 R11: 0000000000000400 R12: >>>>>>> ffff880413d06c00 >>>>>>> [210884.215391] R13: 0000000000000000 R14: ffff8800217bbc20 R15: >>>>>>> ffff880413d06c00 >>>>>>> [210884.222961] FS: 00007f73bacd6880(0000) >>>>>>> GS:ffff88021fd40000(0000) knlGS:0000000000000000 >>>>>>> [210884.231516] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b >>>>>>> [210884.237557] CR2: 0000000000000008 CR3: 0000000001c11000 CR4: >>>>>>> 00000000000407e0 >>>>>>> [210884.245131] Stack: >>>>>>> [210884.247395] ffff880274f4d020 ffff880413d06c00 >>>>>>> 0000bfcc44a463f8 ffff8800217bbc20 >>>>>>> [210884.255337] ffff880413d06c00 ffff8800217bbc78 >>>>>>> ffffffffa0162b68 0000000000000000 >>>>>>> [210884.263256] ffff880218633160 0000000000000000 >>>>>>> 0000000000000000 0000000000000000 >>>>>>> [210884.271234] Call Trace: >>>>>>> [210884.273985] [<ffffffffa0162b68>] >>>>>>> bch_btree_node_read+0x168/0x190 [bcache] >>>>>>> [210884.281258] [<ffffffffa0163f69>] >>>>>>> bch_btree_node_get+0x169/0x290 [bcache] >>>>>>> [210884.288377] [<ffffffffa01642f5>] >>>>>>> bch_btree_map_keys_recurse+0xd5/0x1d0 [bcache] >>>>>>> [210884.296311] [<ffffffffa016dcb0>] ? >>>>>>> cached_dev_congested+0x180/0x180 [bcache] >>>>>>> [210884.303953] [<ffffffff8135b204>] ? >>>>>>> call_rwsem_down_read_failed+0x14/0x30 >>>>>>> [210884.311158] [<ffffffffa01673f7>] >>>>>>> bch_btree_map_keys+0x127/0x150 [bcache] >>>>>>> [210884.318273] [<ffffffffa016dcb0>] ? 
>>>>>>> cached_dev_congested+0x180/0x180 [bcache] >>>>>>> [210884.325826] [<ffffffffa016e7f5>] cache_lookup+0xf5/0x1f0 [bcache] >>>>>>> [210884.332325] [<ffffffff810a4af6>] process_one_work+0x176/0x430 >>>>>>> [210884.338427] [<ffffffff810a578b>] worker_thread+0x11b/0x3a0 >>>>>>> [210884.344282] [<ffffffff810a5670>] ? rescuer_thread+0x3b0/0x3b0 >>>>>>> [210884.350447] [<ffffffff810ac528>] kthread+0xd8/0xf0 >>>>>>> [210884.355615] [<ffffffff810ac450>] ? insert_kthread_work+0x40/0x40 >>>>>>> [210884.362017] [<ffffffff816ff93c>] ret_from_fork+0x7c/0xb0 >>>>>>> [210884.367756] [<ffffffff810ac450>] ? insert_kthread_work+0x40/0x40 >>>>>>> [210884.374234] Code: 08 01 00 00 48 8b b8 58 cb 00 00 e8 bf 25 01 >>>>>>> e1 49 8b b4 24 80 00 00 00 49 89 c5 31 d2 0f b7 86 32 04 00 00 66 >>>>>>> f7 b6 30 04 00 00 <49> c7 45 08 00 00 00 00 0f b7 c0 49 89 45 00 >>>>>>> 48 8b 43 10 48 85 >>>>>>> [210884.395405] RIP [<ffffffffa01625fc>] >>>>>>> bch_btree_node_read_done+0x4c/0x450 [bcache] >>>>>>> [210884.403389] RSP <ffff8800217bbbe8> >>>>>>> [210884.407171] CR2: 0000000000000008 >>>>>>> [210884.411233] ---[ end trace 0064e6abfd068c85 ]--- >>>>>>> [210884.416352] BUG: unable to handle kernel paging request at >>>>>>> ffffffffffffffd8 >>>>>>> [210884.423871] IP: [<ffffffff810acb10>] kthread_data+0x10/0x20 >>>>>>> [210884.429915] PGD 1c14067 PUD 1c16067 PMD 0 >>>>>>> >>>>>>> --Larkin >>>>>>> >>>>>>> -- >>>>>>> To unsubscribe from this list: send the line "unsubscribe >>>>>>> linux-bcache" in >>>>>>> the body of a message to majordomo@vger.kernel.org >>>>>>> <mailto:majordomo@vger.kernel.org> >>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>>>> >>>>>> -- >>>>>> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in >>>>>> the body of a message to majordomo@vger.kernel.org >>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in >>>>> 
the body of a message to majordomo@vger.kernel.org >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 13+ messages in thread
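Note on the line 206 versus line 207 question raised above: the
disassembly suggests the compiler emitted the store for line 207
(offset +76, movq $0x0,0x8(%r13)) ahead of the store for line 206
(offset +87, mov %rax,0x0(%r13)). With iter NULL in %r13, the first
faulting write would then land at address 0x8, which matches CR2. The
userspace C sketch below illustrates that address arithmetic together
with a checked allocation pattern. Everything in it is hypothetical:
struct iter_sketch, pool_alloc_nowait(), used_store_address() and
iter_init_checked() are illustrative stand-ins, not bcache or kernel
APIs, and this is not the fix referenced in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the iterator header. Only the two leading
 * fields matter: the disassembly stores size at 0x0(%r13) and used at
 * 0x8(%r13), which this layout reproduces on x86-64. */
struct iter_sketch {
    size_t size;
    size_t used;
};

/* Hypothetical non-blocking allocator: returns NULL when asked to
 * simulate memory pressure, as a no-wait pool allocation may. */
struct iter_sketch *pool_alloc_nowait(int simulate_pressure)
{
    if (simulate_pressure)
        return NULL;
    return malloc(sizeof(struct iter_sketch));
}

/* Address that a store to iter->used touches for a given base pointer
 * value. With base == 0 (a NULL iter) this is the faulting address. */
size_t used_store_address(size_t base)
{
    return base + offsetof(struct iter_sketch, used);
}

/* Checked variant of the pattern at btree.c:205-207. Returns 0 on
 * success and -1 if the allocation failed; it never writes through a
 * NULL pointer and leaves *out untouched on failure. */
int iter_init_checked(int simulate_pressure, size_t bucket_size,
                      size_t block_size, struct iter_sketch **out)
{
    struct iter_sketch *iter;

    assert(block_size != 0); /* guard the division below */

    iter = pool_alloc_nowait(simulate_pressure);
    if (!iter)
        return -1;

    iter->size = bucket_size / block_size;
    iter->used = 0;
    *out = iter;
    return 0;
}
```

On x86-64, used_store_address(0) evaluates to 8, the same value the
oops reports in CR2. How a real caller should react to a failed
allocation (retry, block, or report an error) is a design decision the
sketch does not attempt to answer.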
* Re: Null pointer oops 2014-08-13 21:32 ` Larkin Lowrey @ 2014-08-13 21:37 ` Slava Pestov 0 siblings, 0 replies; 13+ messages in thread
From: Slava Pestov @ 2014-08-13 21:37 UTC (permalink / raw)
To: Larkin Lowrey; +Cc: Kent Overstreet, linux-bcache

Hi Larkin,

A mempool_alloc() failing indicates memory pressure. The SSD is not at
fault here.

On Wed, Aug 13, 2014 at 2:32 PM, Larkin Lowrey
<llowrey@nuclearwinter.com> wrote:
> My swap is an LVM LV on top of a raid10-backed bcache device. I have had
> a few oopses in recent months but have not been able to pin down the
> cause. I have begun to suspect that the swap may be involved. The SSDs
> in that raid10 are junky OCZ Agility3s. They seem to have a reputation
> for periodic freezes or long pauses. Could it be that the kernel wanted
> to write to the swap but couldn't because the SSDs were in a long pause,
> and that caused mempool_alloc to return NULL, which then blew up the world?
>
> Is there any reason not to put swap on top of a bcache device?
>
> --Larkin
>
> On 8/13/2014 4:25 PM, Slava Pestov wrote:
>> Indeed it looks like iter is NULL. I see the bug is still present in
>> the latest dev branch. The problem is that we're not checking the
>> return value of mempool_alloc(), which may be NULL if we pass
>> GFP_NOWAIT.
>>
>> On Wed, Aug 13, 2014 at 2:21 PM, Larkin Lowrey
>> <llowrey@nuclearwinter.com> wrote:
>>> Here's the disassembly of bch_btree_node_read_done. The offending line
>>> is 207 and the instruction is at offset 76.
>>> >>> --Larkin >>> >>> 199 void bch_btree_node_read_done(struct btree *b) >>> 200 { >>> 0x00000000000065b0 <+0>: callq 0x65b5 <bch_btree_node_read_done+5> >>> 0x00000000000065b5 <+5>: push %rbp >>> 0x00000000000065b8 <+8>: mov %rsp,%rbp >>> 0x00000000000065bb <+11>: push %r15 >>> 0x00000000000065bd <+13>: push %r14 >>> 0x00000000000065bf <+15>: push %r13 >>> 0x00000000000065c1 <+17>: push %r12 >>> 0x00000000000065c3 <+19>: mov %rdi,%r12 >>> 0x00000000000065c6 <+22>: push %rbx >>> >>> 201 const char *err = "bad btree header"; >>> 0x0000000000006800 <+592>: mov $0x0,%rdx >>> >>> 202 struct bset *i = btree_bset_first(b); >>> 203 struct btree_iter *iter; >>> 204 >>> 205 iter = mempool_alloc(b->c->fill_iter, GFP_NOWAIT); >>> 0x00000000000065b6 <+6>: xor %esi,%esi >>> 0x00000000000065c7 <+23>: mov 0x80(%rdi),%rax >>> 0x00000000000065d5 <+37>: mov 0xcb58(%rax),%rdi >>> 0x00000000000065dc <+44>: callq 0x65e1 <bch_btree_node_read_done+49> >>> 0x00000000000065e9 <+57>: mov %rax,%r13 >>> >>> 206 iter->size = b->c->sb.bucket_size / b->c->sb.block_size; >>> 0x00000000000065e1 <+49>: mov 0x80(%r12),%rsi >>> 0x00000000000065ec <+60>: xor %edx,%edx >>> 0x00000000000065ee <+62>: movzwl 0x432(%rsi),%eax >>> 0x00000000000065f5 <+69>: divw 0x430(%rsi) >>> 0x0000000000006604 <+84>: movzwl %ax,%eax >>> 0x0000000000006607 <+87>: mov %rax,0x0(%r13) >>> >>> 207 iter->used = 0; >>> 0x00000000000065fc <+76>: movq $0x0,0x8(%r13) >>> >>> 208 >>> 209 #ifdef CONFIG_BCACHE_DEBUG >>> 210 iter->b = &b->keys; >>> 211 #endif >>> 212 >>> 213 if (!i->seq) >>> 0x000000000000660b <+91>: mov 0x10(%rbx),%rax >>> 0x000000000000660f <+95>: test %rax,%rax >>> 0x0000000000006612 <+98>: je 0x6800 <bch_btree_node_read_done+592> >>> >>> 214 goto err; >>> 215 >>> 216 for (; >>> 0x000000000000664d <+157>: cmp %r9d,%ecx >>> 0x0000000000006650 <+160>: jae 0x6882 <bch_btree_node_read_done+722> >>> 0x0000000000006744 <+404>: cmp %r9d,%r10d >>> 0x0000000000006747 <+407>: jae 0x6898 <bch_btree_node_read_done+744> >>> 
>>> 217 b->written < btree_blocks(b) && i->seq == >>> b->keys.set[0].data->seq; >>> 0x0000000000006618 <+104>: mov 0x80(%r12),%rsi >>> 0x0000000000006625 <+117>: movzwl 0xc0(%r12),%edi >>> 0x000000000000662e <+126>: mov 0x108(%r12),%r8 >>> 0x0000000000006636 <+134>: movzwl 0xde2(%rsi),%ecx >>> 0x0000000000006644 <+148>: mov %rdx,%r9 >>> 0x0000000000006647 <+151>: shr %cl,%r9 >>> 0x000000000000664a <+154>: movzwl %di,%ecx >>> 0x0000000000006656 <+166>: cmp 0x10(%r8),%rax >>> 0x000000000000665a <+170>: jne 0x6882 <bch_btree_node_read_done+722> >>> 0x000000000000670f <+351>: mov %rdx,%r9 >>> 0x000000000000672a <+378>: movzwl 0xde2(%rsi),%ecx >>> 0x0000000000006738 <+392>: shr %cl,%r9 >>> 0x000000000000674d <+413>: mov 0x10(%r8),%rcx >>> 0x0000000000006751 <+417>: cmp %rcx,0x10(%rbx) >>> 0x0000000000006755 <+421>: jne 0x6898 <bch_btree_node_read_done+744> >>> 0x0000000000006892 <+738>: add %r8,%rbx >>> 0x0000000000006895 <+741>: nopl (%rax) >>> >>> 218 i = write_block(b)) { >>> 219 err = "unsupported bset version"; >>> 0x00000000000069c0 <+1040>: mov $0x0,%rdx >>> 0x00000000000069c7 <+1047>: jmpq 0x6807 <bch_btree_node_read_done+599> >>> 0x00000000000069cc <+1052>: nopl 0x0(%rax) >>> >>> 220 if (i->version > BCACHE_BSET_VERSION) >>> 0x0000000000006660 <+176>: mov 0x18(%rbx),%r10d >>> 0x0000000000006664 <+180>: cmp $0x1,%r10d >>> 0x0000000000006668 <+184>: ja 0x69c0 >>> <bch_btree_node_read_done+1040> >>> 0x000000000000666e <+190>: movzwl 0x430(%rsi),%r11d >>> 0x0000000000006676 <+198>: jmpq 0x6769 <bch_btree_node_read_done+441> >>> 0x000000000000667b <+203>: nopl 0x0(%rax,%rax,1) >>> 0x000000000000675b <+427>: mov 0x18(%rbx),%r10d >>> 0x000000000000675f <+431>: cmp $0x1,%r10d >>> 0x0000000000006763 <+435>: ja 0x69c0 >>> <bch_btree_node_read_done+1040> >>> >>> 221 goto err; >>> 222 >>> 223 err = "bad btree header"; >>> 224 if (b->written + set_blocks(i, block_bytes(b->c)) > >>> 0x0000000000006769 <+441>: mov 0x1c(%rbx),%eax >>> 0x000000000000676c <+444>: mov %r11,%rcx 
>>> 0x000000000000676f <+447>: xor %edx,%edx >>> 0x0000000000006771 <+449>: shl $0x9,%rcx >>> 0x0000000000006775 <+453>: movzwl %di,%edi >>> 0x0000000000006778 <+456>: mov %r9d,%r9d >>> 0x000000000000677b <+459>: and $0x1fffe00,%ecx >>> 0x0000000000006781 <+465>: lea 0x20(,%rax,8),%r8 >>> 0x0000000000006789 <+473>: lea -0x1(%r8,%rcx,1),%rax >>> 0x000000000000678e <+478>: div %rcx >>> 0x0000000000006791 <+481>: add %rdi,%rax >>> 0x0000000000006794 <+484>: cmp %r9,%rax >>> 0x0000000000006797 <+487>: ja 0x6800 <bch_btree_node_read_done+592> >>> >>> 225 btree_blocks(b)) >>> 226 goto err; >>> 227 >>> 228 err = "bad magic"; >>> 0x00000000000069d0 <+1056>: mov $0x0,%rdx >>> 0x00000000000069d7 <+1063>: jmpq 0x6807 <bch_btree_node_read_done+599> >>> 0x00000000000069dc <+1068>: nopl 0x0(%rax) >>> >>> 229 if (i->magic != bset_magic(&b->c->sb)) >>> 0x00000000000067aa <+506>: cmp %rax,0x8(%rbx) >>> 0x00000000000067ae <+510>: jne 0x69d0 >>> <bch_btree_node_read_done+1056> >>> >>> 230 goto err; >>> 231 >>> 232 err = "bad checksum"; >>> 0x00000000000067df <+559>: mov $0x0,%rdx >>> 0x00000000000067e6 <+566>: jmp 0x6807 <bch_btree_node_read_done+599> >>> 0x00000000000067e8 <+568>: nopl 0x0(%rax,%rax,1) >>> 0x00000000000067f0 <+576>: mov 0x1c(%rbx),%eax >>> 0x00000000000067f3 <+579>: jmpq 0x66bf <bch_btree_node_read_done+271> >>> 0x00000000000067f8 <+584>: nopl 0x0(%rax,%rax,1) >>> >>> 233 switch (i->version) { >>> 0x00000000000067b4 <+516>: cmp $0x1,%r10d >>> 0x00000000000067bb <+523>: je 0x6680 <bch_btree_node_read_done+208> >>> >>> 234 case 0: >>> 235 if (i->csum != csum_set(i)) >>> 0x00000000000067c1 <+529>: lea 0x20(%rbx),%r14 >>> 0x00000000000067c5 <+533>: lea 0x8(%rbx),%rdi >>> 0x00000000000067ce <+542>: sub %rdi,%rsi >>> 0x00000000000067d1 <+545>: callq 0x67d6 <bch_btree_node_read_done+550> >>> 0x00000000000067d6 <+550>: cmp %rax,%r15 >>> 0x00000000000067d9 <+553>: je 0x66a6 <bch_btree_node_read_done+246> >>> 236 goto err; >>> 237 break; >>> 238 case BCACHE_BSET_VERSION: >>> 
239 if (i->csum != btree_csum_set(b, i)) >>> 0x000000000000669d <+237>: cmp %rax,%r15 >>> 0x00000000000066a0 <+240>: jne 0x67df <bch_btree_node_read_done+559> >>> 0x00000000000067b8 <+520>: mov (%rbx),%r15 >>> >>> 240 goto err; >>> 241 break; >>> 242 } >>> 243 >>> 244 err = "empty set"; >>> 0x00000000000069e0 <+1072>: mov $0x0,%rdx >>> 0x00000000000069e7 <+1079>: jmpq 0x6807 <bch_btree_node_read_done+599> >>> >>> 245 if (i != b->keys.set[0].data && !i->keys) >>> 0x00000000000066a6 <+246>: cmp %rbx,0x108(%r12) >>> 0x00000000000066ae <+254>: je 0x67f0 <bch_btree_node_read_done+576> >>> 0x00000000000066b4 <+260>: mov 0x1c(%rbx),%eax >>> 0x00000000000066b7 <+263>: test %eax,%eax >>> 0x00000000000066b9 <+265>: je 0x69e0 >>> <bch_btree_node_read_done+1072> >>> >>> 246 goto err; >>> 247 >>> 248 bch_btree_iter_push(iter, i->start, >>> bset_bkey_last(i)); >>> 0x00000000000066c3 <+275>: mov %r14,%rsi >>> 0x00000000000066c6 <+278>: mov %r13,%rdi >>> 0x00000000000066c9 <+281>: callq 0x66ce <bch_btree_node_read_done+286> >>> >>> 249 >>> 250 b->written += set_blocks(i, block_bytes(b->c)); >>> 0x00000000000066ce <+286>: mov 0x80(%r12),%rsi >>> 0x00000000000066d6 <+294>: mov 0x1c(%rbx),%eax >>> 0x00000000000066d9 <+297>: xor %edx,%edx >>> 0x00000000000066e3 <+307>: movzwl 0x430(%rsi),%ecx >>> 0x00000000000066ea <+314>: shl $0x9,%ecx >>> 0x00000000000066ed <+317>: movslq %ecx,%rcx >>> 0x00000000000066f0 <+320>: lea 0x1f(%rcx,%rax,8),%rax >>> 0x00000000000066f5 <+325>: div %rcx >>> 0x0000000000006704 <+340>: mov %eax,%edi >>> 0x0000000000006706 <+342>: add 0xc0(%r12),%di >>> 0x0000000000006712 <+354>: mov %di,0xc0(%r12) >>> >>> 251 } >>> 252 >>> 253 err = "corrupted btree"; >>> 0x00000000000069b0 <+1024>: mov $0x0,%rdx >>> 0x00000000000069b7 <+1031>: jmpq 0x6807 <bch_btree_node_read_done+599> >>> 0x00000000000069bc <+1036>: nopl 0x0(%rax) >>> >>> 254 for (i = write_block(b); >>> 0x00000000000068a1 <+753>: cmp %rdx,%rcx >>> 0x00000000000068a4 <+756>: jae 0x68e5 
<bch_btree_node_read_done+821> >>> 0x00000000000068e0 <+816>: cmp %rdx,%rcx >>> 0x00000000000068e3 <+819>: jb 0x68c8 <bch_btree_node_read_done+792> >>> >>> 255 bset_sector_offset(&b->keys, i) < KEY_SIZE(&b->key); >>> 256 i = ((void *) i) + block_bytes(b->c)) >>> 0x00000000000068d7 <+807>: mov %rcx,%rbx >>> 0x00000000000068da <+810>: sub %r8d,%ecx >>> >>> 257 if (i->seq == b->keys.set[0].data->seq) >>> 0x00000000000068a6 <+758>: mov 0x10(%r8),%rdi >>> 0x00000000000068aa <+762>: cmp %rdi,0x10(%rbx) >>> 0x00000000000068ae <+766>: je 0x69b0 >>> <bch_btree_node_read_done+1024> >>> 0x00000000000068b4 <+772>: cltq >>> 0x00000000000068b6 <+774>: mov %rax,%r9 >>> 0x00000000000068b9 <+777>: lea (%rbx,%rax,1),%rcx >>> 0x00000000000068bd <+781>: neg %r9 >>> 0x00000000000068c0 <+784>: jmp 0x68d7 <bch_btree_node_read_done+807> >>> 0x00000000000068c2 <+786>: nopw 0x0(%rax,%rax,1) >>> 0x00000000000068c8 <+792>: lea (%rbx,%rax,1),%rcx >>> 0x00000000000068cc <+796>: cmp 0x10(%rcx,%r9,1),%rdi >>> 0x00000000000068d1 <+801>: je 0x69b0 >>> <bch_btree_node_read_done+1024> >>> >>> 258 goto err; >>> 259 >>> 260 bch_btree_sort_and_fix_extents(&b->keys, iter, &b->c->sort); >>> 0x00000000000068e5 <+821>: lea 0xc8(%r12),%r14 >>> 0x00000000000068ed <+829>: lea 0xcb60(%rsi),%rdx >>> 0x00000000000068f4 <+836>: mov %r13,%rsi >>> 0x00000000000068f7 <+839>: mov %r14,%rdi >>> 0x00000000000068fa <+842>: callq 0x68ff <bch_btree_node_read_done+847> >>> >>> 261 >>> 262 i = b->keys.set[0].data; >>> 0x0000000000006907 <+855>: mov 0x108(%r12),%rbx >>> >>> 263 err = "short btree key"; >>> 0x00000000000069ec <+1084>: mov $0x0,%rdx >>> 0x00000000000069f3 <+1091>: jmpq 0x6807 <bch_btree_node_read_done+599> >>> >>> 264 if (b->keys.set[0].size && >>> 0x00000000000068ff <+847>: mov 0xe0(%r12),%eax >>> 0x0000000000006914 <+868>: test %eax,%eax >>> 0x0000000000006916 <+870>: je 0x694d <bch_btree_node_read_done+925> >>> 0x0000000000006944 <+916>: test %rax,%rax >>> 0x0000000000006947 <+919>: js 0x69ec >>> 
<bch_btree_node_read_done+1084> >>> >>> 265 bkey_cmp(&b->key, &b->keys.set[0].end) < 0) >>> 266 goto err; >>> 267 >>> 268 if (b->written < btree_blocks(b)) >>> 0x000000000000694d <+925>: mov 0x80(%r12),%rax >>> 0x0000000000006955 <+933>: movzwl 0xc0(%r12),%esi >>> 0x0000000000006965 <+949>: movzwl 0xde2(%rax),%ecx >>> 0x000000000000696c <+956>: shr %cl,%rdx >>> 0x000000000000696f <+959>: cmp %edx,%esi >>> 0x0000000000006971 <+961>: jae 0x6868 <bch_btree_node_read_done+696> >>> >>> 269 bch_bset_init_next(&b->keys, write_block(b), >>> 0x000000000000698f <+991>: mov %r14,%rdi >>> 0x000000000000699e <+1006>: callq 0x69a3 >>> <bch_btree_node_read_done+1011> >>> 0x00000000000069a3 <+1011>: mov 0x80(%r12),%rax >>> 0x00000000000069ab <+1019>: jmpq 0x6868 <bch_btree_node_read_done+696> >>> >>> 270 bset_magic(&b->c->sb)); >>> 271 out: >>> 272 mempool_free(iter, b->c->fill_iter); >>> 0x0000000000006868 <+696>: mov 0xcb58(%rax),%rsi >>> 0x000000000000686f <+703>: mov %r13,%rdi >>> 0x0000000000006872 <+706>: callq 0x6877 <bch_btree_node_read_done+711> >>> >>> 273 return; >>> 274 err: >>> 275 set_btree_node_io_error(b); >>> 276 bch_cache_set_error(b->c, "%s at bucket %zu, block %u, >>> %u keys", >>> 0x0000000000006829 <+633>: mov 0x1c(%rbx),%r9d >>> 0x000000000000684a <+666>: mov %esi,%ecx >>> 0x000000000000684c <+668>: mov $0x0,%rsi >>> 0x0000000000006853 <+675>: shr %cl,%r8d >>> 0x0000000000006856 <+678>: mov %rax,%rcx >>> 0x0000000000006859 <+681>: xor %eax,%eax >>> 0x000000000000685b <+683>: callq 0x6860 <bch_btree_node_read_done+688> >>> 0x0000000000006860 <+688>: mov 0x80(%r12),%rax >>> >>> 277 err, PTR_BUCKET_NR(b->c, &b->key, 0), >>> 278 bset_block_offset(b, i), i->keys); >>> 279 goto out; >>> 280 } >>> 0x0000000000006877 <+711>: pop %rbx >>> 0x0000000000006878 <+712>: pop %r12 >>> 0x000000000000687a <+714>: pop %r13 >>> 0x000000000000687c <+716>: pop %r14 >>> 0x000000000000687e <+718>: pop %r15 >>> 0x0000000000006880 <+720>: pop %rbp >>> 0x0000000000006881 <+721>: retq 
>>> 0x0000000000006882 <+722>: movzwl 0x430(%rsi),%eax
>>> 0x0000000000006889 <+729>: shl $0x9,%eax
>>> 0x000000000000688c <+732>: imul %eax,%ecx
>>> 0x000000000000688f <+735>: movslq %ecx,%rbx
>>>
>>>
>>> On 8/13/2014 1:45 PM, Slava Pestov wrote:
>>>> Can you post the disassembly of the function?
>>>>
>>>> On Wed, Aug 13, 2014 at 11:35 AM, Larkin Lowrey
>>>> <llowrey@nuclearwinter.com> wrote:
>>>>> Thanks. Trying gdb helped me find the answer. I needed to install the
>>>>> kernel-debuginfo-3.15.8-200.fc20.x86_64 package via yum.
>>>>>
>>>>> From addr2line:
>>>>>> bch_btree_node_read_done+0x4c
>>>>>> drivers/md/bcache/btree.c:207
>>>>> Here's a snippet from gdb:
>>>>>
>>>>>> (gdb) list *(bch_btree_node_read_done+0x4c)
>>>>>> 0x65fc is in bch_btree_node_read_done (drivers/md/bcache/btree.c:207).
>>>>>> 202 struct bset *i = btree_bset_first(b);
>>>>>> 203 struct btree_iter *iter;
>>>>>> 204
>>>>>> 205 iter = mempool_alloc(b->c->fill_iter, GFP_NOWAIT);
>>>>>> 206 iter->size = b->c->sb.bucket_size / b->c->sb.block_size;
>>>>>> 207 iter->used = 0;
>>>>>> 208
>>>>>> 209 #ifdef CONFIG_BCACHE_DEBUG
>>>>>> 210 iter->b = &b->keys;
>>>>>> 211 #endif
>>>>> This doesn't make any sense to me. If iter was null I would expect line
>>>>> 206 to blow up first.
>>>>>
>>>>> --Larkin
>>>>>
>>>>> On 8/13/2014 12:41 PM, Slava Pestov wrote:
>>>>>> You can try to use gdb:
>>>>>>
>>>>>> gdb /lib/modules/.../foo.ko
>>>>>>
>>>>>> list *(bch_btree_node_read_done+0x4c)
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 13, 2014 at 9:40 AM, Larkin Lowrey
>>>>>> <llowrey@nuclearwinter.com> wrote:
>>>>>>> This is making me feel very dumb. I've googled extensively but can't
>>>>>>> figure out how to run addr2line for a module.
>>>>>>>
>>>>>>> I'm running Fedora 20 and the kernel did not have debugging symbols. I
>>>>>>> downloaded the version with symbols but I don't know if the addresses
>>>>>>> are going to be the same. Bcache is a module for me and that's where
>>>>>>> things get tricky. Do you have any tips?
>>>>>>> >>>>>>> --Larkin >>>>>>> >>>>>>> On 8/13/2014 12:04 AM, Kent Overstreet wrote: >>>>>>>> Any chance you could do an addr2line and get me the exact line where >>>>>>>> it happened? >>>>>>>> >>>>>>>> On Aug 12, 2014 10:02 PM, "Larkin Lowrey" <llowrey@nuclearwinter.com >>>>>>>> <mailto:llowrey@nuclearwinter.com>> wrote: >>>>>>>> >>>>>>>> I got an oops while doing some heavy I/O. I have an md raid10 cache >>>>>>>> device (4 SSDs) and 3 md raid5/6 backing devices. This setup has been >>>>>>>> well behaved for about 6 months. >>>>>>>> >>>>>>>> If this isn't a known issue is there anything I can do to provide more >>>>>>>> useful information? >>>>>>>> >>>>>>>> I'm running kernel 3.15.8-200.fc20.x86_64. >>>>>>>> >>>>>>>> [210884.047249] BUG: unable to handle kernel NULL pointer >>>>>>>> dereference at 0000000000000008 >>>>>>>> [210884.055605] IP: [<ffffffffa01625fc>] >>>>>>>> bch_btree_node_read_done+0x4c/0x450 [bcache] >>>>>>>> [210884.063723] PGD 0 >>>>>>>> [210884.066053] Oops: 0002 [#1] SMP >>>>>>>> [210884.069610] Modules linked in: lp parport binfmt_misc >>>>>>>> ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM >>>>>>>> iptable_mangle tun bridge stp llc xt_multiport ebtable_nat >>>>>>>> ebtables hwmon_vid ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4 >>>>>>>> nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter xt_conntrack >>>>>>>> ip6_tables nf_conntrack keyspan ezusb kvm_amd kvm crct10dif_pclmul >>>>>>>> crc32_pclmul crc32c_intel ghash_clmulni_intel microcode serio_raw >>>>>>>> amd64_edac_mod edac_core fam15h_power k10temp edac_mce_amd >>>>>>>> sp5100_tco i2c_piix4 igb ptp pps_core dca shpchp acpi_cpufreq >>>>>>>> btrfs bcache raid456 async_raid6_recov async_memcpy async_pq >>>>>>>> async_xor async_tx xor raid6_pq raid10 i2c_algo_bit drm_kms_helper >>>>>>>> ttm drm i2c_core mpt2sas mvsas libsas raid_class >>>>>>>> scsi_transport_sas cpufreq_stats >>>>>>>> [210884.140704] CPU: 5 PID: 11188 Comm: kworker/5:1 Not tainted >>>>>>>> 3.15.8-200.fc20.x86_64 #1 
>>>>>>>> [210884.149069] Hardware name: /H8DG6/H8DGi, BIOS 3.0a 07/2 >>>>>>>> [210884.155280] Workqueue: bcache cache_lookup [bcache] >>>>>>>> [210884.160531] task: ffff880218633160 ti: ffff8800217b8000 >>>>>>>> task.ti: ffff8800217b8000 >>>>>>>> [210884.168502] RIP: 0010:[<ffffffffa01625fc>] >>>>>>>> [<ffffffffa01625fc>] bch_btree_node_read_done+0x4c/0x450 [bcache] >>>>>>>> [210884.179105] RSP: 0000:ffff8800217bbbe8 EFLAGS: 00010212 >>>>>>>> [210884.184806] RAX: 0000000000000400 RBX: ffff880245ec0000 RCX: >>>>>>>> 0000000000000000 >>>>>>>> [210884.192480] RDX: 0000000000000000 RSI: ffff880418380000 RDI: >>>>>>>> 0000000000000246 >>>>>>>> [210884.200075] RBP: ffff8800217bbc10 R08: 0000000000000000 R09: >>>>>>>> 0000000000000f6b >>>>>>>> [210884.207738] R10: 0000000000000000 R11: 0000000000000400 R12: >>>>>>>> ffff880413d06c00 >>>>>>>> [210884.215391] R13: 0000000000000000 R14: ffff8800217bbc20 R15: >>>>>>>> ffff880413d06c00 >>>>>>>> [210884.222961] FS: 00007f73bacd6880(0000) >>>>>>>> GS:ffff88021fd40000(0000) knlGS:0000000000000000 >>>>>>>> [210884.231516] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b >>>>>>>> [210884.237557] CR2: 0000000000000008 CR3: 0000000001c11000 CR4: >>>>>>>> 00000000000407e0 >>>>>>>> [210884.245131] Stack: >>>>>>>> [210884.247395] ffff880274f4d020 ffff880413d06c00 >>>>>>>> 0000bfcc44a463f8 ffff8800217bbc20 >>>>>>>> [210884.255337] ffff880413d06c00 ffff8800217bbc78 >>>>>>>> ffffffffa0162b68 0000000000000000 >>>>>>>> [210884.263256] ffff880218633160 0000000000000000 >>>>>>>> 0000000000000000 0000000000000000 >>>>>>>> [210884.271234] Call Trace: >>>>>>>> [210884.273985] [<ffffffffa0162b68>] >>>>>>>> bch_btree_node_read+0x168/0x190 [bcache] >>>>>>>> [210884.281258] [<ffffffffa0163f69>] >>>>>>>> bch_btree_node_get+0x169/0x290 [bcache] >>>>>>>> [210884.288377] [<ffffffffa01642f5>] >>>>>>>> bch_btree_map_keys_recurse+0xd5/0x1d0 [bcache] >>>>>>>> [210884.296311] [<ffffffffa016dcb0>] ? 
>>>>>>>> cached_dev_congested+0x180/0x180 [bcache] >>>>>>>> [210884.303953] [<ffffffff8135b204>] ? >>>>>>>> call_rwsem_down_read_failed+0x14/0x30 >>>>>>>> [210884.311158] [<ffffffffa01673f7>] >>>>>>>> bch_btree_map_keys+0x127/0x150 [bcache] >>>>>>>> [210884.318273] [<ffffffffa016dcb0>] ? >>>>>>>> cached_dev_congested+0x180/0x180 [bcache] >>>>>>>> [210884.325826] [<ffffffffa016e7f5>] cache_lookup+0xf5/0x1f0 [bcache] >>>>>>>> [210884.332325] [<ffffffff810a4af6>] process_one_work+0x176/0x430 >>>>>>>> [210884.338427] [<ffffffff810a578b>] worker_thread+0x11b/0x3a0 >>>>>>>> [210884.344282] [<ffffffff810a5670>] ? rescuer_thread+0x3b0/0x3b0 >>>>>>>> [210884.350447] [<ffffffff810ac528>] kthread+0xd8/0xf0 >>>>>>>> [210884.355615] [<ffffffff810ac450>] ? insert_kthread_work+0x40/0x40 >>>>>>>> [210884.362017] [<ffffffff816ff93c>] ret_from_fork+0x7c/0xb0 >>>>>>>> [210884.367756] [<ffffffff810ac450>] ? insert_kthread_work+0x40/0x40 >>>>>>>> [210884.374234] Code: 08 01 00 00 48 8b b8 58 cb 00 00 e8 bf 25 01 >>>>>>>> e1 49 8b b4 24 80 00 00 00 49 89 c5 31 d2 0f b7 86 32 04 00 00 66 >>>>>>>> f7 b6 30 04 00 00 <49> c7 45 08 00 00 00 00 0f b7 c0 49 89 45 00 >>>>>>>> 48 8b 43 10 48 85 >>>>>>>> [210884.395405] RIP [<ffffffffa01625fc>] >>>>>>>> bch_btree_node_read_done+0x4c/0x450 [bcache] >>>>>>>> [210884.403389] RSP <ffff8800217bbbe8> >>>>>>>> [210884.407171] CR2: 0000000000000008 >>>>>>>> [210884.411233] ---[ end trace 0064e6abfd068c85 ]--- >>>>>>>> [210884.416352] BUG: unable to handle kernel paging request at >>>>>>>> ffffffffffffffd8 >>>>>>>> [210884.423871] IP: [<ffffffff810acb10>] kthread_data+0x10/0x20 >>>>>>>> [210884.429915] PGD 1c14067 PUD 1c16067 PMD 0 >>>>>>>> >>>>>>>> --Larkin >>>>>>>> >>>>>>>> -- >>>>>>>> To unsubscribe from this list: send the line "unsubscribe >>>>>>>> linux-bcache" in >>>>>>>> the body of a message to majordomo@vger.kernel.org >>>>>>>> <mailto:majordomo@vger.kernel.org> >>>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>>>>> 
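Larkin's question above (why the fault is attributed to line 207 rather than line 206) can be checked against numbers already posted in the thread. In the disassembly, the store for line 207 (movq $0x0,0x8(%r13) at +76) is scheduled before the store for line 206 (mov %rax,0x0(%r13) at +87). The register dump shows R13 as 0 and CR2 as 0x8, which is consistent with mempool_alloc(..., GFP_NOWAIT) having returned NULL. This is a reading of the posted dump, not a confirmed diagnosis. The C sketch below only reproduces the arithmetic. struct iter_sketch and both helper functions are hypothetical names, and the two-field layout is inferred from the 0x0/0x8 offsets in the disassembly rather than copied from the bcache source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct btree_iter: only the two leading
 * fields, placed at the offsets the disassembly stores to (0x0 and 0x8).
 */
struct iter_sketch {
	uint64_t size;
	uint64_t used;
};

/* Compile-time check that 'used' really sits 8 bytes into the sketch. */
_Static_assert(offsetof(struct iter_sketch, used) == 8,
	       "used must be at offset 8");

/*
 * Address that a store to iter->used would touch for a given base
 * pointer value. Computed arithmetically; nothing is dereferenced.
 */
static uintptr_t used_field_address(uintptr_t iter_base)
{
	return iter_base + offsetof(struct iter_sketch, used);
}

/*
 * Offset of an instruction inside its function, i.e. the "+0x.." part
 * of "symbol+0x..", from two addresses in a disassembly listing.
 */
static uintptr_t offset_in_function(uintptr_t insn, uintptr_t func_start)
{
	assert(insn >= func_start); /* instruction must lie inside the function */
	return insn - func_start;
}
```

With the values from the thread, used_field_address(0) gives 0x8, matching CR2, and offset_in_function(0x65fc, 0x65b0) gives 0x4c, matching the offset reported in the oops and by gdb.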
end of thread, other threads:[~2014-08-16 5:48 UTC | newest]

Thread overview: 13+ messages
2014-08-13 5:02 Null pointer oops Larkin Lowrey
[not found] ` <CALJ65z=25CrrO9uMc2vfYVAQWb=6eK+OhB5TGJJrCp=D4ALvrQ@mail.gmail.com>
2014-08-13 16:40 ` Larkin Lowrey
2014-08-13 17:41 ` Slava Pestov
2014-08-13 18:35 ` Larkin Lowrey
2014-08-13 18:45 ` Slava Pestov
2014-08-13 21:21 ` Larkin Lowrey
2014-08-13 21:25 ` Slava Pestov
2014-08-13 21:30 ` Slava Pestov
2014-08-13 21:34 ` Jianjian Huo
2014-08-13 22:14 ` Larkin Lowrey
2014-08-16 5:48 ` Peter Kieser
2014-08-13 21:32 ` Larkin Lowrey
2014-08-13 21:37 ` Slava Pestov