* null pointer dereference in nvme_tcp_io_work
From: Stefan Majer @ 2019-12-25 10:06 UTC
To: linux-nvme; +Cc: kbusch
Hi,
I'm trying to set up an nvme-over-tcp test environment with a qemu-kvm
based nvmet-tcp target running ubuntu-19.10 and an ubuntu-19.10 host
with kernel 5.4.6 installed. The kernel was taken from
https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.4.6/ . The same panic
occurs with the ubuntu 19.10 kernel 5.3.x.
After setting up the target I can discover and connect to the exported nvme
device from the host with:
modprobe nvme
modprobe nvme-tcp
nvme discover -t tcp -a 192.168.22.1 -s 4420
nvme connect -t tcp -n nvmet-test -a 192.168.22.1 -s 4420
No errors so far, but when I try to format the device with:
mkfs.ext4 /dev/nvme0n1
The kernel panics with:
Writing inode tables:
[ 692.651243] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 692.653158] #PF: supervisor read access in kernel mode
[ 692.653922] #PF: error_code(0x0000) - not-present page
[ 692.653922] PGD 0 P4D 0
[ 692.653922] Oops: 0000 [#1] SMP PTI
[ 692.653922] CPU: 0 PID: 224 Comm: kworker/0:1H Not tainted 5.4.6-050406-generic #201912211140
[ 692.653922] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[ 692.653922] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
[ 692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
[ 692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48 89 75
[ 692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
[ 692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
[ 692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
[ 692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
[ 692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
[ 692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
[ 692.653922] FS: 0000000000000000(0000) GS:ffff93767f600000(0000) knlGS:0000000000000000
[ 692.653922] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
[ 692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 692.653922] Call Trace:
[ 692.653922] process_one_work+0x1ec/0x3a0
[ 692.653922] worker_thread+0x4d/0x400
[ 692.653922] kthread+0x104/0x140
[ 692.653922] ? process_one_work+0x3a0/0x3a0
[ 692.653922] ? kthread_park+0x90/0x90
[ 692.653922] ret_from_fork+0x35/0x40
[ 692.653922] Modules linked in: binfmt_misc nvme_tcp nvme_fabrics nvme nvme_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter br_netfilter bridge stp llc overlay intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel nls_iso8859_1 crypto_simd cryptd cirrus glue_helper drm_kms_helper drm input_leds fb_sys_fops joydev serio_raw syscopyarea sysfillrect sysimgblt mac_hid qemu_fw_cfg bonding sch_fq_codel ipmi_watchdog ipmi_devintf ipmi_msghandler virtio_rng ip_tables x_tables autofs4 ahci psmouse virtio_net net_failover failover libahci i2c_piix4 pata_acpi floppy
[ 692.653922] CR2: 0000000000000008
[ 692.653922] ---[ end trace d688c2c182feef87 ]---
[ 692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
[ 692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48 89 75
[ 692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
[ 692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
[ 692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
[ 692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
[ 692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
[ 692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
[ 692.653922] FS: 0000000000000000(0000) GS:ffff93767f600000(0000) knlGS:0000000000000000
[ 692.653922] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
[ 692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
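A note on reading the oops above: the byte wrapped in <..> on the Code: line is the first byte of the faulting instruction. The Python sketch below is only an illustration, not a general disassembler. It locates that marker, decodes the single `8b /r` load form with an 8-bit displacement, and checks the resulting address against CR2. The constants are copied from the log; the helper names are made up for this example.

```python
# Values copied from the 5.4.6 oops above.
CODE_LINE = (
    "Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49 "
    "8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45 "
    "2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48 "
    "89 75"
)
RAX = 0x0000000000000000
CR2 = 0x0000000000000008

# Register numbering without a REX prefix (none precedes the marked byte).
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def trapping_bytes(code_line):
    """Return (offset, bytes) starting at the byte marked with <..>."""
    tokens = code_line.split(":", 1)[1].split()
    for index, token in enumerate(tokens):
        if token.startswith("<"):
            return index, [int(t.strip("<>"), 16) for t in tokens[index:]]
    raise ValueError("no <..> marker found")


def decode_mov_load_disp8(insn):
    """Decode only `8b /r` with mod=01 (8-bit displacement, no SIB byte)."""
    opcode, modrm, disp = insn[0], insn[1], insn[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 0b01 or rm == 0b100:
        raise ValueError("outside the one encoding this sketch handles")
    if disp >= 0x80:  # disp8 is a signed byte
        disp -= 0x100
    return REG32[reg], REG64[rm], disp


offset, insn = trapping_bytes(CODE_LINE)
dest, base, disp = decode_mov_load_disp8(insn)
print(f"+{offset:#x}: mov {disp:#x}(%{base}),%{dest}")
print(f"RAX + disp = {RAX + disp:#x}, CR2 = {CR2:#x}")
```

With RAX equal to zero, a 4-byte read at displacement 8 lands on address 0x8, which is the value the kernel reports in CR2. In other words, a field at offset 8 is being read through a NULL pointer.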
Any help appreciated.
Greetings
--
Stefan Majer
_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
* Re: null pointer dereference in nvme_tcp_io_work
From: Keith Busch @ 2019-12-26 17:47 UTC
To: Stefan Majer, sagi; +Cc: linux-nvme
Adding Sagi.
* Re: null pointer dereference in nvme_tcp_io_work
From: Stefan Majer @ 2019-12-27 7:18 UTC
To: Keith Busch; +Cc: sagi, linux-nvme
Hi,
I have to add that doing the same on bare metal works without any problems.
I suspect this is caused by the fact that in the above
example my target is a qemu-kvm machine with an emulated nvme device.
Greetings
Stefan
--
Stefan Majer
* Re: null pointer dereference in nvme_tcp_io_work
From: Stefan Majer @ 2019-12-27 7:54 UTC
To: sagi grimberg; +Cc: Keith Busch, linux-nvme
Hi,
No problem, I am also on vacation.
The issue is not reproducible in a pure bare metal environment, where target
and host are both physical machines.
In the environment where it happens, both machines are kvm based.
I first have to figure out how to use gdb on the kernel crash. That's not my
daily job, so please be patient.
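For reference, one common way to get line info for the RIP line is gdb's `list *(symbol+offset)` form, run against a module object that carries debug info. The Python sketch below only builds that command line from an oops RIP line. The module file name `nvme-tcp.ko` is a placeholder, since the real path depends on where the distribution installs its debug symbols, and the helper name is made up for this example.

```python
import re
import shlex

# RIP line copied from the 5.4.6 oops earlier in this thread.
RIP_LINE = "[ 692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]"

# Matches "<segment>:<symbol>+<offset>/<size>" after "RIP:".
RIP_PATTERN = re.compile(
    r"RIP:\s+[0-9a-f]+:([\w.]+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)"
)


def gdb_list_command(rip_line, module_path):
    """Build a gdb invocation that lists source around the faulting address.

    module_path is a placeholder for a module built with debug info.
    """
    match = RIP_PATTERN.search(rip_line)
    if match is None:
        raise ValueError("not an oops RIP line")
    symbol, offset, _size = match.groups()
    return ["gdb", "-batch", "-ex", f"list *({symbol}+{offset})", module_path]


command = gdb_list_command(RIP_LINE, "nvme-tcp.ko")
print(shlex.join(command))
```

The offset is only meaningful for the exact build that crashed, so the module given to gdb has to match the running kernel version.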
Greetings
Stefan
On Fri, Dec 27, 2019 at 8:49 AM sagi grimberg <sagi@grimberg.me> wrote:
>
> Hey,
>
> On vacation so not able to take a look right now, but can you provide a line info from gdb on the RIP line?
>
> Also, did you say that the issue is not reproducible when the host is on bare metal but only on kvm? ( You said the target, but I'm asking about the host).
>
--
Stefan Majer
* Re: null pointer dereference in nvme_tcp_io_work
From: Stefan Majer @ 2019-12-28 17:49 UTC
To: sagi grimberg; +Cc: Keith Busch, linux-nvme
Hi,
It took a while, but I have now reproduced it with the ubuntu-19.10 kernel 5.3.x.
I installed the debug symbols and ran decode_stacktrace.sh from the kernel
sources, which gives me:
[ 29.266954] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.22.1:4420
[ 29.267477] nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
[ 29.285732] nvme nvme0: creating 1 I/O queues.
[ 29.286632] nvme nvme0: mapped 1/0 default/read queues.
[ 29.288565] nvme nvme0: new ctrl: NQN "nvmet-test", addr 192.168.22.1:4420
[ 29.293146] nvme0n1: detected capacity change from 0 to 1084227584
[ 39.196846] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 39.198524] #PF: supervisor read access in kernel mode
[ 39.199786] #PF: error_code(0x0000) - not-present page
[ 39.201198] PGD 0 P4D 0
[ 39.201849] Oops: 0000 [#1] SMP PTI
[ 39.202679] CPU: 0 PID: 223 Comm: kworker/0:1H Kdump: loaded Not tainted 5.3.0-24-generic #26-Ubuntu
[ 39.204830] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[ 39.207205] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
[ 39.209005] RIP: 0010:nvme_tcp_io_work+0x341/0x7f0 nvme_tcp
[ 39.210686] Code: 8b 87 98 00 00 00 83 f8 02 0f 85 34 fd ff ff 49 8b 47 28 4d 89 fe 48 89 45 a8 49 8b 46 78 49 8b 56 68 45 8b 66 34 45 2b 66 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48 89 75
All code
========
0: 8b 87 98 00 00 00 mov 0x98(%rdi),%eax
6: 83 f8 02 cmp $0x2,%eax
9: 0f 85 34 fd ff ff jne 0xfffffffffffffd43
f: 49 8b 47 28 mov 0x28(%r15),%rax
13: 4d 89 fe mov %r15,%r14
16: 48 89 45 a8 mov %rax,-0x58(%rbp)
1a: 49 8b 46 78 mov 0x78(%r14),%rax
1e: 49 8b 56 68 mov 0x68(%r14),%rdx
22: 45 8b 66 34 mov 0x34(%r14),%r12d
26: 45 2b 66 38 sub 0x38(%r14),%r12d
2a:* 8b 58 08 mov 0x8(%rax),%ebx <-- trapping instruction
2d: 8b 48 0c mov 0xc(%rax),%ecx
30: 4c 8b 28 mov (%rax),%r13
33: 48 29 d3 sub %rdx,%rbx
36: 48 8d 34 11 lea (%rcx,%rdx,1),%rsi
3a: 4c 39 e3 cmp %r12,%rbx
3d: 48 rex.W
3e: 89 .byte 0x89
3f: 75 .byte 0x75
Code starting with the faulting instruction
===========================================
0: 8b 58 08 mov 0x8(%rax),%ebx
3: 8b 48 0c mov 0xc(%rax),%ecx
6: 4c 8b 28 mov (%rax),%r13
9: 48 29 d3 sub %rdx,%rbx
c: 48 8d 34 11 lea (%rcx,%rdx,1),%rsi
10: 4c 39 e3 cmp %r12,%rbx
13: 48 rex.W
14: 89 .byte 0x89
15: 75 .byte 0x75
[ 39.216464] RSP: 0018:ffffb0f8c0453dd8 EFLAGS: 00010206
[ 39.218053] RAX: 0000000000000000 RBX: 00000000b4e42801 RCX: 0000000000000000
[ 39.219803] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9dd8e6e49478
[ 39.221766] RBP: ffffb0f8c0453e60 R08: 0000000000001000 R09: 0000000002800809
[ 39.223635] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
[ 39.226010] R13: 0000000000000048 R14: ffff9dd8e6e49418 R15: ffff9dd8e6e49418
[ 39.228992] FS: 0000000000000000(0000) GS:ffff9dd8ff600000(0000) knlGS:0000000000000000
[ 39.233660] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 39.237863] CR2: 0000000000000008 CR3: 0000000067c6a005 CR4: 0000000000360ef0
[ 39.241807] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 39.244496] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 39.246569] Call Trace:
[ 39.247272] process_one_work (/build/linux-4AS01l/linux-5.3.0/arch/x86/include/asm/jump_label.h:25 /build/linux-4AS01l/linux-5.3.0/include/linux/jump_label.h:200 /build/linux-4AS01l/linux-5.3.0/include/trace/events/workqueue.h:114 /build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2274)
[ 39.248361] worker_thread (/build/linux-4AS01l/linux-5.3.0/include/linux/compiler.h:199 /build/linux-4AS01l/linux-5.3.0/include/linux/list.h:268 /build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2416)
[ 39.249364] kthread (/build/linux-4AS01l/linux-5.3.0/kernel/kthread.c:255)
[ 39.250243] ? process_one_work (/build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2358)
[ 39.251485] ? kthread_park (/build/linux-4AS01l/linux-5.3.0/kernel/kthread.c:215)
[ 39.252474] ret_from_fork (/build/linux-4AS01l/linux-5.3.0/arch/x86/entry/entry_64.S:358)
[ 39.253476] Modules linked in: nvme_tcp nvme_fabrics nvme nvme_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter br_netfilter bridge stp llc aufs overlay intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cirrus nls_iso8859_1 cryptd glue_helper drm_kms_helper drm input_leds joydev fb_sys_fops serio_raw syscopyarea sysfillrect sysimgblt mac_hid qemu_fw_cfg bonding sch_fq_codel ipmi_watchdog ipmi_devintf ipmi_msghandler virtio_rng ip_tables x_tables autofs4 psmouse virtio_net net_failover failover ahci libahci i2c_piix4 pata_acpi floppy
[ 39.269809] CR2: 0000000000000008
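A note on the decoded listing above: the faulting instruction and the two loads that follow read offsets 0x8, 0xc and 0x0 from the same base register, RAX, which is zero. That access pattern fits a structure with a 64-bit pointer followed by two 32-bit fields. The Python sketch below models such a layout with ctypes and maps each load to its field. The field names are borrowed from the kernel's struct bio_vec, which has this shape on 64-bit builds, but whether the driver is really walking a bio_vec at this point is an assumption that the trace alone does not confirm.

```python
import ctypes


class BvecLike(ctypes.Structure):
    """Hypothetical stand-in: one 64-bit pointer, then two 32-bit fields."""

    _fields_ = [
        ("bv_page", ctypes.c_uint64),    # mov (%rax),%r13
        ("bv_len", ctypes.c_uint32),     # mov 0x8(%rax),%ebx  (the faulting load)
        ("bv_offset", ctypes.c_uint32),  # mov 0xc(%rax),%ecx
    ]


def access_address(base, field_name):
    """Address touched when reading field_name through a pointer equal to base."""
    return base + getattr(BvecLike, field_name).offset


RAX = 0x0  # from the register dump: the base pointer is NULL
CR2 = 0x8  # faulting address reported by the kernel

for name, _ctype in BvecLike._fields_:
    print(f"{name}: {access_address(RAX, name):#x}")
```

Under this assumed layout, the first load executed is the one at offset 8, so the reported fault address is 0x8 and not 0x0, even though the pointer itself is NULL.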
greetings
Stefan
> >> > > [ 692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
> >> > > 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
> >> > > 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
> >> > > 89 75
> >> > > [ 692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
> >> > > [ 692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
> >> > > [ 692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
> >> > > [ 692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
> >> > > [ 692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
> >> > > [ 692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
> >> > > [ 692.653922] FS: 0000000000000000(0000) GS:ffff93767f600000(0000)
> >> > > knlGS:0000000000000000
> >> > > [ 692.653922] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> > > [ 692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
> >> > > [ 692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> > > [ 692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >> > >
> >> > >
> >> > > Any help appreciated.
> >> > >
> >> > > Greetings
> >> > >
> >> > > --
> >> > > Stefan Majer
> >>
> >>
> >>
> >> --
> >> Stefan Majer
>
>
>
> --
> Stefan Majer
--
Stefan Majer
_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
* Re: null pointer dereference in nvme_tcp_io_work
2019-12-28 17:49 ` Stefan Majer
@ 2019-12-28 17:53 ` Stefan Majer
2020-01-07 15:41 ` Stefan Majer
0 siblings, 1 reply; 10+ messages in thread
From: Stefan Majer @ 2019-12-28 17:53 UTC (permalink / raw)
To: sagi grimberg; +Cc: Keith Busch, linux-nvme
I have to add:
./faddr2line /var/lib/debug/lib/modules/5.3.0-24-generic/kernel/drivers/nvme/host/nvme-tcp.ko
nvme_tcp_io_work+0x341/0x7f0
nvme_tcp_io_work+0x341/0x7f0:
nvme_tcp_req_cur_length at
/build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:189
(inlined by) nvme_tcp_try_send_data at
/build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:854
(inlined by) nvme_tcp_try_send at
/build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:1011
(inlined by) nvme_tcp_io_work at
/build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:1048
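A rough way to see why the faulting address is 0x8: faddr2line points at nvme_tcp_req_cur_length, and the decoded oops quoted below shows RAX = 0 at the trapping load `mov 0x8(%rax),%ebx`. That would be consistent with a NULL bio_vec pointer being read at the offset of its length field. The sketch below is a userspace model only, not the driver code. The names `bio_vec_model`, `bv_len_load_address` and `cur_length_model` are hypothetical, the structure layout is an assumption for x86-64, and the "minimum of two remainders" shape is inferred from the disassembly (a subtract followed by a compare against the remaining PDU bytes).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed x86-64 layout of a bio_vec-like structure (model only):
 * an 8-byte page pointer followed by two 32-bit fields. */
struct bio_vec_model {
    uint64_t bv_page;    /* stands in for the 8-byte page pointer, offset 0x0 */
    uint32_t bv_len;     /* offset 0x8 */
    uint32_t bv_offset;  /* offset 0xc */
};

/* Address that a load of bv_len touches for a given base pointer value.
 * With base == 0 this is the address the CPU reports in CR2. */
static uintptr_t bv_len_load_address(uintptr_t base)
{
    return base + offsetof(struct bio_vec_model, bv_len);
}

/* Model of the length computation suggested by the disassembly:
 * min(bytes left in the current bvec, bytes left in the PDU).
 * Precondition: bvec must not be NULL. In the oops, the equivalent
 * pointer was NULL, so the very first field read faulted. */
static size_t cur_length_model(const struct bio_vec_model *bvec,
                               size_t iov_offset,
                               uint32_t pdu_len, uint32_t pdu_sent)
{
    size_t left_in_bvec = bvec->bv_len - iov_offset;
    size_t left_in_pdu = pdu_len - pdu_sent;

    return left_in_bvec < left_in_pdu ? left_in_bvec : left_in_pdu;
}
```

Calling `bv_len_load_address(0)` yields 0x8, which matches the CR2 value in the oops. This only explains the address; it says nothing about why the pointer ended up NULL.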
On Sat, Dec 28, 2019 at 6:49 PM Stefan Majer <stefan.majer@gmail.com> wrote:
>
> Hi,
>
> took a while, but I have now reproduced it with the ubuntu-19.10 kernel
> 5.3.x. I installed the debug symbols and ran decode_stacktrace.sh from the
> kernel sources, which gives me:
>
> [ 29.266954] nvme nvme0: new ctrl: NQN
> "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.22.1:4420
> [ 29.267477] nvme nvme0: Removing ctrl: NQN
> "nqn.2014-08.org.nvmexpress.discovery"
> [ 29.285732] nvme nvme0: creating 1 I/O queues.
> [ 29.286632] nvme nvme0: mapped 1/0 default/read queues.
> [ 29.288565] nvme nvme0: new ctrl: NQN "nvmet-test", addr
> 192.168.22.1:4420
> [ 29.293146] nvme0n1: detected capacity change from 0 to 1084227584
> [ 39.196846] BUG: kernel NULL pointer dereference, address:
> 0000000000000008
> [ 39.198524] #PF: supervisor read access in kernel mode
> [ 39.199786] #PF: error_code(0x0000) - not-present page
> [ 39.201198] PGD 0 P4D 0
> [ 39.201849] Oops: 0000 [#1] SMP PTI
> [ 39.202679] CPU: 0 PID: 223 Comm: kworker/0:1H Kdump: loaded Not
> tainted 5.3.0-24-generic #26-Ubuntu
> [ 39.204830] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 0.0.0 02/06/2015
> [ 39.207205] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
> [ 39.209005] RIP: 0010:nvme_tcp_io_work+0x341/0x7f0 nvme_tcp
> [ 39.210686] Code: 8b 87 98 00 00 00 83 f8 02 0f 85 34 fd ff ff 49 8b
> 47 28 4d 89 fe 48 89 45 a8 49 8b 46 78 49 8b 56 68 45 8b 66 34 45 2b
> 66 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48 89
> 75
> All code
> ========
> 0: 8b 87 98 00 00 00 mov 0x98(%rdi),%eax
> 6: 83 f8 02 cmp $0x2,%eax
> 9: 0f 85 34 fd ff ff jne 0xfffffffffffffd43
> f: 49 8b 47 28 mov 0x28(%r15),%rax
> 13: 4d 89 fe mov %r15,%r14
> 16: 48 89 45 a8 mov %rax,-0x58(%rbp)
> 1a: 49 8b 46 78 mov 0x78(%r14),%rax
> 1e: 49 8b 56 68 mov 0x68(%r14),%rdx
> 22: 45 8b 66 34 mov 0x34(%r14),%r12d
> 26: 45 2b 66 38 sub 0x38(%r14),%r12d
> 2a:* 8b 58 08 mov 0x8(%rax),%ebx <-- trapping instruction
> 2d: 8b 48 0c mov 0xc(%rax),%ecx
> 30: 4c 8b 28 mov (%rax),%r13
> 33: 48 29 d3 sub %rdx,%rbx
> 36: 48 8d 34 11 lea (%rcx,%rdx,1),%rsi
> 3a: 4c 39 e3 cmp %r12,%rbx
> 3d: 48 rex.W
> 3e: 89 .byte 0x89
> 3f: 75 .byte 0x75
>
> Code starting with the faulting instruction
> ===========================================
> 0: 8b 58 08 mov 0x8(%rax),%ebx
> 3: 8b 48 0c mov 0xc(%rax),%ecx
> 6: 4c 8b 28 mov (%rax),%r13
> 9: 48 29 d3 sub %rdx,%rbx
> c: 48 8d 34 11 lea (%rcx,%rdx,1),%rsi
> 10: 4c 39 e3 cmp %r12,%rbx
> 13: 48 rex.W
> 14: 89 .byte 0x89
> 15: 75 .byte 0x75
> [ 39.216464] RSP: 0018:ffffb0f8c0453dd8 EFLAGS: 00010206
> [ 39.218053] RAX: 0000000000000000 RBX: 00000000b4e42801 RCX: 0000000000000000
> [ 39.219803] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9dd8e6e49478
> [ 39.221766] RBP: ffffb0f8c0453e60 R08: 0000000000001000 R09: 0000000002800809
> [ 39.223635] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
> [ 39.226010] R13: 0000000000000048 R14: ffff9dd8e6e49418 R15: ffff9dd8e6e49418
> [ 39.228992] FS: 0000000000000000(0000) GS:ffff9dd8ff600000(0000)
> knlGS:0000000000000000
> [ 39.233660] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 39.237863] CR2: 0000000000000008 CR3: 0000000067c6a005 CR4: 0000000000360ef0
> [ 39.241807] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 39.244496] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 39.246569] Call Trace:
> [ 39.247272] process_one_work
> (/build/linux-4AS01l/linux-5.3.0/arch/x86/include/asm/jump_label.h:25
> /build/linux-4AS01l/linux-5.3.0/include/linux/jump_label.h:200
> /build/linux-4AS01l/linux-5.3.0/include/trace/events/workqueue.h:114
> /build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2274)
> [ 39.248361] worker_thread
> (/build/linux-4AS01l/linux-5.3.0/include/linux/compiler.h:199
> /build/linux-4AS01l/linux-5.3.0/include/linux/list.h:268
> /build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2416)
> [ 39.249364] kthread (/build/linux-4AS01l/linux-5.3.0/kernel/kthread.c:255)
> [ 39.250243] ? process_one_work
> (/build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2358)
> [ 39.251485] ? kthread_park
> (/build/linux-4AS01l/linux-5.3.0/kernel/kthread.c:215)
> [ 39.252474] ret_from_fork
> (/build/linux-4AS01l/linux-5.3.0/arch/x86/entry/entry_64.S:358)
> [ 39.253476] Modules linked in: nvme_tcp nvme_fabrics nvme nvme_core
> xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user
> xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack
> nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter br_netfilter bridge
> stp llc aufs overlay intel_rapl_msr intel_rapl_common kvm_intel kvm
> irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
> aes_x86_64 crypto_simd cirrus nls_iso8859_1 cryptd glue_helper
> drm_kms_helper drm input_leds joydev fb_sys_fops serio_raw syscopyarea
> sysfillrect sysimgblt mac_hid qemu_fw_cfg bonding sch_fq_codel
> ipmi_watchdog ipmi_devintf ipmi_msghandler virtio_rng ip_tables
> x_tables autofs4 psmouse virtio_net net_failover failover ahci libahci
> i2c_piix4 pata_acpi floppy
> [ 39.269809] CR2: 0000000000000008
>
> greetings
> Stefan
>
--
Stefan Majer
* Re: null pointer dereference in nvme_tcp_io_work
2019-12-28 17:53 ` Stefan Majer
@ 2020-01-07 15:41 ` Stefan Majer
2020-01-07 16:48 ` Nadolski, Edmund
2020-01-15 20:03 ` Sagi Grimberg
0 siblings, 2 replies; 10+ messages in thread
From: Stefan Majer @ 2020-01-07 15:41 UTC (permalink / raw)
To: sagi grimberg; +Cc: Keith Busch, linux-nvme
Hi,
is there anything I can do to help narrow down the problem further?
Please let me know.
Stefan
On Sat, Dec 28, 2019 at 6:53 PM Stefan Majer <stefan.majer@gmail.com> wrote:
>
> I have to add:
>
> ./faddr2line /var/lib/debug/lib/modules/5.3.0-24-generic/kernel/drivers/nvme/host/nvme-tcp.ko
> nvme_tcp_io_work+0x341/0x7f0
> nvme_tcp_io_work+0x341/0x7f0:
> nvme_tcp_req_cur_length at
> /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:189
> (inlined by) nvme_tcp_try_send_data at
> /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:854
> (inlined by) nvme_tcp_try_send at
> /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:1011
> (inlined by) nvme_tcp_io_work at
> /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:1048
>
--
Stefan Majer
* Re: null pointer dereference in nvme_tcp_io_work
2020-01-07 15:41 ` Stefan Majer
@ 2020-01-07 16:48 ` Nadolski, Edmund
2020-01-15 20:03 ` Sagi Grimberg
1 sibling, 0 replies; 10+ messages in thread
From: Nadolski, Edmund @ 2020-01-07 16:48 UTC (permalink / raw)
To: Stefan Majer, sagi grimberg; +Cc: Keith Busch, linux-nvme
On 1/7/2020 8:41 AM, Stefan Majer wrote:
> Hi,
>
> is there anything i can help with to further nail down the problem ?
>
> please let me know.
> Stefan
Is there any kind of reset in progress when this hits?
Thanks,
Ed
> On Sat, Dec 28, 2019 at 6:53 PM Stefan Majer <stefan.majer@gmail.com> wrote:
>>
>> I have to add:
>>
>> ./faddr2line /var/lib/debug/lib/modules/5.3.0-24-generic/kernel/drivers/nvme/host/nvme-tcp.ko
>> nvme_tcp_io_work+0x341/0x7f0
>> nvme_tcp_io_work+0x341/0x7f0:
>> nvme_tcp_req_cur_length at
>> /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:189
>> (inlined by) nvme_tcp_try_send_data at
>> /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:854
>> (inlined by) nvme_tcp_try_send at
>> /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:1011
>> (inlined by) nvme_tcp_io_work at
>> /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:1048
>>
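For reference, a minimal sketch of why the fault address comes out as 0x8.
It assumes (this is not confirmed anywhere in the thread) that the load at
tcp.c:189 has the form req->iter.bvec->bv_len, and that struct bio_vec on
x86_64 is laid out as an 8-byte page pointer followed by the 32-bit bv_len
and bv_offset fields. Under that assumption, the three loads in the decoded
oops quoted below (0x8(%rax), 0xc(%rax), (%rax)) would line up with bv_len,
bv_offset and bv_page, and a NULL pointer in RAX would give CR2 = 8. The
BioVec class and fault_address() are hypothetical userspace stand-ins
written in Python, not kernel code:

```python
import ctypes


class BioVec(ctypes.Structure):
    """Userspace stand-in for the kernel's struct bio_vec.

    ASSUMPTION: x86_64 layout with an 8-byte page pointer followed by two
    32-bit fields. Fixed-width types are used so the offsets do not depend
    on the machine running this sketch.
    """

    _fields_ = [
        ("bv_page", ctypes.c_uint64),    # models `struct page *` on x86_64
        ("bv_len", ctypes.c_uint32),
        ("bv_offset", ctypes.c_uint32),
    ]


def fault_address(base, field):
    """Return the address touched when `field` is read through pointer `base`."""
    return base + getattr(BioVec, field).offset


if __name__ == "__main__":
    # Print the offset of every field in the assumed layout.
    for name, _ctype in BioVec._fields_:
        print(f"{name}: offset {getattr(BioVec, name).offset:#x}")
    # A NULL struct pointer plus the field offset is the address that faults.
    print(f"NULL bvec, reading bv_len: {fault_address(0, 'bv_len'):#x}")
```

If that reading is right, the request reached the send path without a bvec
set up in its iterator. The sketch only explains the address; it says
nothing about why the pointer was NULL.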
>> On Sat, Dec 28, 2019 at 6:49 PM Stefan Majer <stefan.majer@gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > It took a while, but I have now reproduced it with the ubuntu-19.10
>> > kernel 5.3.x. I installed the debug symbols and ran decode_stacktrace.sh
>> > from the kernel sources, which gives me:
>> >
>> > [ 29.266954] nvme nvme0: new ctrl: NQN
>> > "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.22.1:4420
>> > [ 29.267477] nvme nvme0: Removing ctrl: NQN
>> > "nqn.2014-08.org.nvmexpress.discovery"
>> > [ 29.285732] nvme nvme0: creating 1 I/O queues.
>> > [ 29.286632] nvme nvme0: mapped 1/0 default/read queues.
>> > [ 29.288565] nvme nvme0: new ctrl: NQN "nvmet-test", addr
>> > 192.168.22.1:4420
>> > [ 29.293146] nvme0n1: detected capacity change from 0 to 1084227584
>> > [ 39.196846] BUG: kernel NULL pointer dereference, address:
>> > 0000000000000008
>> > [ 39.198524] #PF: supervisor read access in kernel mode
>> > [ 39.199786] #PF: error_code(0x0000) - not-present page
>> > [ 39.201198] PGD 0 P4D 0
>> > [ 39.201849] Oops: 0000 [#1] SMP PTI
>> > [ 39.202679] CPU: 0 PID: 223 Comm: kworker/0:1H Kdump: loaded Not
>> > tainted 5.3.0-24-generic #26-Ubuntu
>> > [ 39.204830] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> > BIOS 0.0.0 02/06/2015
>> > [ 39.207205] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
>> > [ 39.209005] RIP: 0010:nvme_tcp_io_work+0x341/0x7f0 nvme_tcp
>> > [ 39.210686] Code: 8b 87 98 00 00 00 83 f8 02 0f 85 34 fd ff ff 49 8b
>> > 47 28 4d 89 fe 48 89 45 a8 49 8b 46 78 49 8b 56 68 45 8b 66 34 45 2b
>> > 66 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48 89
>> > 75
>> > All code
>> > ========
>> > 0: 8b 87 98 00 00 00 mov 0x98(%rdi),%eax
>> > 6: 83 f8 02 cmp $0x2,%eax
>> > 9: 0f 85 34 fd ff ff jne 0xfffffffffffffd43
>> > f: 49 8b 47 28 mov 0x28(%r15),%rax
>> > 13: 4d 89 fe mov %r15,%r14
>> > 16: 48 89 45 a8 mov %rax,-0x58(%rbp)
>> > 1a: 49 8b 46 78 mov 0x78(%r14),%rax
>> > 1e: 49 8b 56 68 mov 0x68(%r14),%rdx
>> > 22: 45 8b 66 34 mov 0x34(%r14),%r12d
>> > 26: 45 2b 66 38 sub 0x38(%r14),%r12d
>> > 2a:* 8b 58 08 mov 0x8(%rax),%ebx <-- trapping instruction
>> > 2d: 8b 48 0c mov 0xc(%rax),%ecx
>> > 30: 4c 8b 28 mov (%rax),%r13
>> > 33: 48 29 d3 sub %rdx,%rbx
>> > 36: 48 8d 34 11 lea (%rcx,%rdx,1),%rsi
>> > 3a: 4c 39 e3 cmp %r12,%rbx
>> > 3d: 48 rex.W
>> > 3e: 89 .byte 0x89
>> > 3f: 75 .byte 0x75
>> >
>> > Code starting with the faulting instruction
>> > ===========================================
>> > 0: 8b 58 08 mov 0x8(%rax),%ebx
>> > 3: 8b 48 0c mov 0xc(%rax),%ecx
>> > 6: 4c 8b 28 mov (%rax),%r13
>> > 9: 48 29 d3 sub %rdx,%rbx
>> > c: 48 8d 34 11 lea (%rcx,%rdx,1),%rsi
>> > 10: 4c 39 e3 cmp %r12,%rbx
>> > 13: 48 rex.W
>> > 14: 89 .byte 0x89
>> > 15: 75 .byte 0x75
>> > [ 39.216464] RSP: 0018:ffffb0f8c0453dd8 EFLAGS: 00010206
>> > [ 39.218053] RAX: 0000000000000000 RBX: 00000000b4e42801 RCX: 0000000000000000
>> > [ 39.219803] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9dd8e6e49478
>> > [ 39.221766] RBP: ffffb0f8c0453e60 R08: 0000000000001000 R09: 0000000002800809
>> > [ 39.223635] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
>> > [ 39.226010] R13: 0000000000000048 R14: ffff9dd8e6e49418 R15: ffff9dd8e6e49418
>> > [ 39.228992] FS: 0000000000000000(0000) GS:ffff9dd8ff600000(0000)
>> > knlGS:0000000000000000
>> > [ 39.233660] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > [ 39.237863] CR2: 0000000000000008 CR3: 0000000067c6a005 CR4: 0000000000360ef0
>> > [ 39.241807] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> > [ 39.244496] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> > [ 39.246569] Call Trace:
>> > [ 39.247272] process_one_work
>> > (/build/linux-4AS01l/linux-5.3.0/arch/x86/include/asm/jump_label.h:25
>> > /build/linux-4AS01l/linux-5.3.0/include/linux/jump_label.h:200
>> > /build/linux-4AS01l/linux-5.3.0/include/trace/events/workqueue.h:114
>> > /build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2274)
>> > [ 39.248361] worker_thread
>> > (/build/linux-4AS01l/linux-5.3.0/include/linux/compiler.h:199
>> > /build/linux-4AS01l/linux-5.3.0/include/linux/list.h:268
>> > /build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2416)
>> > [ 39.249364] kthread (/build/linux-4AS01l/linux-5.3.0/kernel/kthread.c:255)
>> > [ 39.250243] ? process_one_work
>> > (/build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2358)
>> > [ 39.251485] ? kthread_park
>> > (/build/linux-4AS01l/linux-5.3.0/kernel/kthread.c:215)
>> > [ 39.252474] ret_from_fork
>> > (/build/linux-4AS01l/linux-5.3.0/arch/x86/entry/entry_64.S:358)
>> > [ 39.253476] Modules linked in: nvme_tcp nvme_fabrics nvme nvme_core
>> > xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user
>> > xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack
>> > nf_defrag_ipv6
>> > nf_defrag_ipv4 libcrc32c bpfilter br_netfilter bridge stp llc aufs
>> > overlay intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass
>> > crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
>> > aes_x86_64 crypto_simd cirrus
>> > nls_iso8859_1 cryptd glue_helper drm_kms_helper drm input_leds joydev
>> > fb_sys_fops serio_raw syscopyarea sysfillrect sysimgblt mac_hid
>> > qemu_fw_cfg bonding sch_fq_codel ipmi_watchdog ipmi_devintf
>> > ipmi_msghandler virtio_rng ip_tables
>> > x_tables autofs4 psmouse virtio_net net_failover failover ahci libahci
>> > i2c_piix4 pata_acpi floppy
>> > [ 39.269809] CR2: 0000000000000008
>> >
>> > greetings
>> > Stefan
>> >
>> > On Fri, Dec 27, 2019 at 8:54 AM Stefan Majer <stefan.majer@gmail.com> wrote:
>> > >
>> > > Hi,
>> > >
>> > > No problem, I am also on vacation.
>> > >
>> > > The issue is not reproducible in a pure bare-metal environment, where
>> > > target and host are both physical machines.
>> > > In the environment where it happens, both machines are kvm based.
>> > >
>> > > I first have to figure out how to use gdb on the kernel crash. That's not
>> > > part of my daily job, so please be patient.
>> > >
>> > > Greetings
>> > > Stefan
>> > >
>> > > On Fri, Dec 27, 2019 at 8:49 AM sagi grimberg <sagi@grimberg.me> wrote:
>> > > >
>> > > > Hey,
>> > > >
>> > > > I'm on vacation, so I can't take a look right now, but can you provide line info from gdb for the RIP line?
>> > > >
>> > > > Also, did you say that the issue is not reproducible when the host is on bare metal, but only on kvm? (You said the target, but I'm asking about the host.)
>> > > >
>> > > > On Thu, Dec 26, 2019, 23:18 Stefan Majer <stefan.majer@gmail.com> wrote:
>> > > >>
>> > > >> Hi,
>> > > >>
>> > > >> I have to add that doing the same on bare metal works without any problems.
>> > > >> I suspect that this is caused by the fact that in the above
>> > > >> example my target is a qemu-kvm machine with an emulated nvme device.
>> > > >> Greetings
>> > > >> Stefan
>> > > >>
>> > > >> On Thu, Dec 26, 2019 at 6:47 PM Keith Busch <kbusch@kernel.org> wrote:
>> > > >> >
>> > > >> > Adding Sagi.
>> > > >> >
>> > > >> > On Wed, Dec 25, 2019 at 11:06:17AM +0100, Stefan Majer wrote:
>> > > >> > > Hi,
>> > > >> > >
>> > > >> > > I'm trying to set up an nvme-over-tcp test environment with a qemu-kvm
>> > > >> > > based nvmet-tcp target running ubuntu-19.10 and an ubuntu-19.10 host
>> > > >> > > with kernel 5.4.6 installed. The kernel was taken from
>> > > >> > > https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.4.6/ . The same panic
>> > > >> > > occurs with the ubuntu 19.10 kernel 5.3.x.
>> > > >> > >
>> > > >> > > After setting up the target I can discover and connect the exported nvme
>> > > >> > > device on the host with:
>> > > >> > > modprobe nvme
>> > > >> > > modprobe nvme-tcp
>> > > >> > > nvme discover -t tcp -a 192.168.22.1 -s 4420
>> > > >> > > nvme connect -t tcp -n nvmet-test -a 192.168.22.1 -s 4420
>> > > >> > >
>> > > >> > > No errors so far, but when I try to format the device with:
>> > > >> > >
>> > > >> > > mkfs.ext4 /dev/nvme0n1
>> > > >> > >
>> > > >> > > The kernel panics with:
>> > > >> > > Writing inode tables:
>> > > >> > > [ 692.651243] BUG: kernel NULL pointer dereference, address: 0000000000000008
>> > > >> > > [ 692.653158] #PF: supervisor read access in kernel mode
>> > > >> > > [ 692.653922] #PF: error_code(0x0000) - not-present page
>> > > >> > > [ 692.653922] PGD 0 P4D 0
>> > > >> > > [ 692.653922] Oops: 0000 [#1] SMP PTI
>> > > >> > > [ 692.653922] CPU: 0 PID: 224 Comm: kworker/0:1H Not tainted
>> > > >> > > 5.4.6-050406-generic #201912211140
>> > > >> > > [ 692.653922] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> > > >> > > BIOS 0.0.0 02/06/2015
>> > > >> > > [ 692.653922] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
>> > > >> > > [ 692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
>> > > >> > > [ 692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
>> > > >> > > 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
>> > > >> > > 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
>> > > >> > > 89 75
>> > > >> > > [ 692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
>> > > >> > > [ 692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
>> > > >> > > [ 692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
>> > > >> > > [ 692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
>> > > >> > > [ 692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
>> > > >> > > [ 692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
>> > > >> > > [ 692.653922] FS: 0000000000000000(0000) GS:ffff93767f600000(0000)
>> > > >> > > knlGS:0000000000000000
>> > > >> > > [ 692.653922] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > > >> > > [ 692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
>> > > >> > > [ 692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> > > >> > > [ 692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> > > >> > > [ 692.653922] Call Trace:
>> > > >> > > [ 692.653922] process_one_work+0x1ec/0x3a0
>> > > >> > > [ 692.653922] worker_thread+0x4d/0x400
>> > > >> > > [ 692.653922] kthread+0x104/0x140
>> > > >> > > [ 692.653922] ? process_one_work+0x3a0/0x3a0
>> > > >> > > [ 692.653922] ? kthread_park+0x90/0x90
>> > > >> > > [ 692.653922] ret_from_fork+0x35/0x40
>> > > >> > > [ 692.653922] Modules linked in: binfmt_misc nvme_tcp nvme_fabrics
>> > > >> > > nvme nvme_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink
>> > > >> > > nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat
>> > > >> > > nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter
>> > > >> > > br_netfilter bridge stp llc overlay intel_rapl_msr intel_rapl_common
>> > > >> > > kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul
>> > > >> > > ghash_clmulni_intel aesni_intel nls_iso8859_1 crypto_simd cryptd
>> > > >> > > cirrus glue_helper drm_kms_helper drm input_leds fb_sys_fops joydev
>> > > >> > > serio_raw syscopyarea sysfillrect sysimgblt mac_hid qemu_fw_cfg
>> > > >> > > bonding sch_fq_codel ipmi_watchdog ipmi_devintf ipmi_msghandler
>> > > >> > > virtio_rng ip_tables x_tables autofs4 ahci psmouse virtio_net
>> > > >> > > net_failover failover libahci i2c_piix4 pata_acpi floppy
>> > > >> > > [ 692.653922] CR2: 0000000000000008
>> > > >> > > [ 692.653922] ---[ end trace d688c2c182feef87 ]---
>> > > >> > > [ 692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
>> > > >> > > [ 692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
>> > > >> > > 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
>> > > >> > > 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
>> > > >> > > 89 75
>> > > >> > > [ 692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
>> > > >> > > [ 692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
>> > > >> > > [ 692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
>> > > >> > > [ 692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
>> > > >> > > [ 692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
>> > > >> > > [ 692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
>> > > >> > > [ 692.653922] FS: 0000000000000000(0000) GS:ffff93767f600000(0000)
>> > > >> > > knlGS:0000000000000000
>> > > >> > > [ 692.653922] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > > >> > > [ 692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
>> > > >> > > [ 692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> > > >> > > [ 692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> > > >> > >
>> > > >> > >
>> > > >> > > Any help appreciated.
>> > > >> > >
>> > > >> > > Greetings
>> > > >> > >
>> > > >> > > --
>> > > >> > > Stefan Majer
* Re: null pointer dereference in nvme_tcp_io_work
2020-01-07 15:41 ` Stefan Majer
2020-01-07 16:48 ` Nadolski, Edmund
@ 2020-01-15 20:03 ` Sagi Grimberg
2020-01-16 6:28 ` Stefan Majer
1 sibling, 1 reply; 10+ messages in thread
From: Sagi Grimberg @ 2020-01-15 20:03 UTC (permalink / raw)
To: Stefan Majer; +Cc: Keith Busch, linux-nvme
> Hi,
>
> Is there anything I can help with to further nail down the problem?
Hi Stefan,
I cannot reproduce this issue in my environment (both host and target
are VMs on my laptop, kernel 5.4.0, qemu version 3.1.0).
Would it be possible to try and use kernel 5.4 for the sake of the test?
* Re: null pointer dereference in nvme_tcp_io_work
2020-01-15 20:03 ` Sagi Grimberg
@ 2020-01-16 6:28 ` Stefan Majer
0 siblings, 0 replies; 10+ messages in thread
From: Stefan Majer @ 2020-01-16 6:28 UTC (permalink / raw)
To: Sagi Grimberg; +Cc: Keith Busch, linux-nvme
Hi Sagi,
Sure, no problem.
Will report back. Thanks for looking into it.
Greetings
Stefan
On Wed, Jan 15, 2020 at 9:03 PM Sagi Grimberg <sagi@grimberg.me> wrote:
>
>
> > Hi,
> >
> > Is there anything I can help with to further nail down the problem?
>
> Hi Stefan,
>
> I cannot reproduce this issue in my environment (both host and target
> are VMs on my laptop, kernel 5.4.0, qemu version 3.1.0).
>
> Would it be possible to try and use kernel 5.4 for the sake of the test?
--
Stefan Majer
Thread overview: 10+ messages
2019-12-25 10:06 null pointer dereference in nvme_tcp_io_work Stefan Majer
2019-12-26 17:47 ` Keith Busch
2019-12-27 7:18 ` Stefan Majer
[not found] ` <CAB5Wxwco3KD1e_nRGQ_mWAMa_2d-wP2-1Aao4ZXtDeVgFQQM_w@mail.gmail.com>
2019-12-27 7:54 ` Stefan Majer
2019-12-28 17:49 ` Stefan Majer
2019-12-28 17:53 ` Stefan Majer
2020-01-07 15:41 ` Stefan Majer
2020-01-07 16:48 ` Nadolski, Edmund
2020-01-15 20:03 ` Sagi Grimberg
2020-01-16 6:28 ` Stefan Majer