linux-nvme.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* null pointer dereference in nvme_tcp_io_work
@ 2019-12-25 10:06 Stefan Majer
  2019-12-26 17:47 ` Keith Busch
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Majer @ 2019-12-25 10:06 UTC (permalink / raw)
  To: linux-nvme; +Cc: kbusch

Hi,

im trying to setup a nvme-over-tcp test environment with a qemu-kvm
based nvmet-tcp target based on ubuntu-19.10 and a ubuntu-19.10 host
with kernel 5.4.6 installed. Kernel was taken from
https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.4.6/ . Same Panic
occurs with ubuntu 19.10 kernel 5.3.x

After setup the target i can discover and connect the exported nvme
device on the host with:
modprobe nvme
modprobe nvme-tcp
nvme discover -t tcp -a 192.168.22.1 -s 4420
nvme connect -t tcp -n nvmet-test -a 192.168.22.1 -s 4420

No errors so far, but when i try to format the device with:

mkfs.ext4 /dev/nvme0n1

The kernel panics with:
Writing inode tables:
[  692.651243] BUG: kernel NULL pointer dereference, address: 0000000000000008
[  692.653158] #PF: supervisor read access in kernel mode
[  692.653922] #PF: error_code(0x0000) - not-present page
[  692.653922] PGD 0 P4D 0
[  692.653922] Oops: 0000 [#1] SMP PTI
[  692.653922] CPU: 0 PID: 224 Comm: kworker/0:1H Not tainted
5.4.6-050406-generic #201912211140
[  692.653922] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 0.0.0 02/06/2015
[  692.653922] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
[  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
[  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
89 75
[  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
[  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
[  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
[  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
[  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
[  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
[  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
knlGS:0000000000000000
[  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
[  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  692.653922] Call Trace:
[  692.653922]  process_one_work+0x1ec/0x3a0
[  692.653922]  worker_thread+0x4d/0x400
[  692.653922]  kthread+0x104/0x140
[  692.653922]  ? process_one_work+0x3a0/0x3a0
[  692.653922]  ? kthread_park+0x90/0x90
[  692.653922]  ret_from_fork+0x35/0x40
[  692.653922] Modules linked in: binfmt_misc nvme_tcp nvme_fabrics
nvme nvme_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink
nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat
nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter
br_netfilter bridge stp llc overlay intel_rapl_msr intel_rapl_common
kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel aesni_intel nls_iso8859_1 crypto_simd cryptd
cirrus glue_helper drm_kms_helper drm input_leds fb_sys_fops joydev
serio_raw syscopyarea sysfillrect sysimgblt mac_hid qemu_fw_cfg
bonding sch_fq_codel ipmi_watchdog ipmi_devintf ipmi_msghandler
virtio_rng ip_tables x_tables autofs4 ahci psmouse virtio_net
net_failover failover libahci i2c_piix4 pata_acpi floppy
[  692.653922] CR2: 0000000000000008
[  692.653922] ---[ end trace d688c2c182feef87 ]---
[  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
[  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
89 75
[  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
[  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
[  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
[  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
[  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
[  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
[  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
knlGS:0000000000000000
[  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
[  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


Any help appreciated.

Greetings

-- 
Stefan Majer

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: null pointer dereference in nvme_tcp_io_work
  2019-12-25 10:06 null pointer dereference in nvme_tcp_io_work Stefan Majer
@ 2019-12-26 17:47 ` Keith Busch
  2019-12-27  7:18   ` Stefan Majer
  0 siblings, 1 reply; 10+ messages in thread
From: Keith Busch @ 2019-12-26 17:47 UTC (permalink / raw)
  To: Stefan Majer, sagi; +Cc: linux-nvme

Adding Sagi.

On Wed, Dec 25, 2019 at 11:06:17AM +0100, Stefan Majer wrote:
> Hi,
> 
> im trying to setup a nvme-over-tcp test environment with a qemu-kvm
> based nvmet-tcp target based on ubuntu-19.10 and a ubuntu-19.10 host
> with kernel 5.4.6 installed. Kernel was taken from
> https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.4.6/ . Same Panic
> occurs with ubuntu 19.10 kernel 5.3.x
> 
> After setup the target i can discover and connect the exported nvme
> device on the host with:
> modprobe nvme
> modprobe nvme-tcp
> nvme discover -t tcp -a 192.168.22.1 -s 4420
> nvme connect -t tcp -n nvmet-test -a 192.168.22.1 -s 4420
> 
> No errors so far, but when i try to format the device with:
> 
> mkfs.ext4 /dev/nvme0n1
> 
> The kernel panics with:
> Writing inode tables:
> [  692.651243] BUG: kernel NULL pointer dereference, address: 0000000000000008
> [  692.653158] #PF: supervisor read access in kernel mode
> [  692.653922] #PF: error_code(0x0000) - not-present page
> [  692.653922] PGD 0 P4D 0
> [  692.653922] Oops: 0000 [#1] SMP PTI
> [  692.653922] CPU: 0 PID: 224 Comm: kworker/0:1H Not tainted
> 5.4.6-050406-generic #201912211140
> [  692.653922] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 0.0.0 02/06/2015
> [  692.653922] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
> [  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
> [  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
> 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
> 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
> 89 75
> [  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
> [  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
> [  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
> [  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
> [  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
> [  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
> [  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
> knlGS:0000000000000000
> [  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
> [  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  692.653922] Call Trace:
> [  692.653922]  process_one_work+0x1ec/0x3a0
> [  692.653922]  worker_thread+0x4d/0x400
> [  692.653922]  kthread+0x104/0x140
> [  692.653922]  ? process_one_work+0x3a0/0x3a0
> [  692.653922]  ? kthread_park+0x90/0x90
> [  692.653922]  ret_from_fork+0x35/0x40
> [  692.653922] Modules linked in: binfmt_misc nvme_tcp nvme_fabrics
> nvme nvme_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink
> nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat
> nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter
> br_netfilter bridge stp llc overlay intel_rapl_msr intel_rapl_common
> kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul
> ghash_clmulni_intel aesni_intel nls_iso8859_1 crypto_simd cryptd
> cirrus glue_helper drm_kms_helper drm input_leds fb_sys_fops joydev
> serio_raw syscopyarea sysfillrect sysimgblt mac_hid qemu_fw_cfg
> bonding sch_fq_codel ipmi_watchdog ipmi_devintf ipmi_msghandler
> virtio_rng ip_tables x_tables autofs4 ahci psmouse virtio_net
> net_failover failover libahci i2c_piix4 pata_acpi floppy
> [  692.653922] CR2: 0000000000000008
> [  692.653922] ---[ end trace d688c2c182feef87 ]---
> [  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
> [  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
> 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
> 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
> 89 75
> [  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
> [  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
> [  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
> [  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
> [  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
> [  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
> [  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
> knlGS:0000000000000000
> [  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
> [  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> 
> 
> Any help appreciated.
> 
> Greetings
> 
> -- 
> Stefan Majer

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: null pointer dereference in nvme_tcp_io_work
  2019-12-26 17:47 ` Keith Busch
@ 2019-12-27  7:18   ` Stefan Majer
       [not found]     ` <CAB5Wxwco3KD1e_nRGQ_mWAMa_2d-wP2-1Aao4ZXtDeVgFQQM_w@mail.gmail.com>
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Majer @ 2019-12-27  7:18 UTC (permalink / raw)
  To: Keith Busch; +Cc: sagi, linux-nvme

Hi,

i have to add that doing the same on bare metal does work without any problems.
I suspect that this is probably caused by the fact that in the above
example my target is a qemu-kvm machine with a emulated nvme device.
Greetings
Stefan

On Thu, Dec 26, 2019 at 6:47 PM Keith Busch <kbusch@kernel.org> wrote:
>
> Adding Sagi.
>
> On Wed, Dec 25, 2019 at 11:06:17AM +0100, Stefan Majer wrote:
> > Hi,
> >
> > im trying to setup a nvme-over-tcp test environment with a qemu-kvm
> > based nvmet-tcp target based on ubuntu-19.10 and a ubuntu-19.10 host
> > with kernel 5.4.6 installed. Kernel was taken from
> > https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.4.6/ . Same Panic
> > occurs with ubuntu 19.10 kernel 5.3.x
> >
> > After setup the target i can discover and connect the exported nvme
> > device on the host with:
> > modprobe nvme
> > modprobe nvme-tcp
> > nvme discover -t tcp -a 192.168.22.1 -s 4420
> > nvme connect -t tcp -n nvmet-test -a 192.168.22.1 -s 4420
> >
> > No errors so far, but when i try to format the device with:
> >
> > mkfs.ext4 /dev/nvme0n1
> >
> > The kernel panics with:
> > Writing inode tables:
> > [  692.651243] BUG: kernel NULL pointer dereference, address: 0000000000000008
> > [  692.653158] #PF: supervisor read access in kernel mode
> > [  692.653922] #PF: error_code(0x0000) - not-present page
> > [  692.653922] PGD 0 P4D 0
> > [  692.653922] Oops: 0000 [#1] SMP PTI
> > [  692.653922] CPU: 0 PID: 224 Comm: kworker/0:1H Not tainted
> > 5.4.6-050406-generic #201912211140
> > [  692.653922] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > BIOS 0.0.0 02/06/2015
> > [  692.653922] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
> > [  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
> > [  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
> > 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
> > 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
> > 89 75
> > [  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
> > [  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
> > [  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
> > [  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
> > [  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
> > [  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
> > [  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
> > knlGS:0000000000000000
> > [  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
> > [  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [  692.653922] Call Trace:
> > [  692.653922]  process_one_work+0x1ec/0x3a0
> > [  692.653922]  worker_thread+0x4d/0x400
> > [  692.653922]  kthread+0x104/0x140
> > [  692.653922]  ? process_one_work+0x3a0/0x3a0
> > [  692.653922]  ? kthread_park+0x90/0x90
> > [  692.653922]  ret_from_fork+0x35/0x40
> > [  692.653922] Modules linked in: binfmt_misc nvme_tcp nvme_fabrics
> > nvme nvme_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink
> > nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat
> > nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter
> > br_netfilter bridge stp llc overlay intel_rapl_msr intel_rapl_common
> > kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul
> > ghash_clmulni_intel aesni_intel nls_iso8859_1 crypto_simd cryptd
> > cirrus glue_helper drm_kms_helper drm input_leds fb_sys_fops joydev
> > serio_raw syscopyarea sysfillrect sysimgblt mac_hid qemu_fw_cfg
> > bonding sch_fq_codel ipmi_watchdog ipmi_devintf ipmi_msghandler
> > virtio_rng ip_tables x_tables autofs4 ahci psmouse virtio_net
> > net_failover failover libahci i2c_piix4 pata_acpi floppy
> > [  692.653922] CR2: 0000000000000008
> > [  692.653922] ---[ end trace d688c2c182feef87 ]---
> > [  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
> > [  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
> > 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
> > 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
> > 89 75
> > [  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
> > [  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
> > [  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
> > [  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
> > [  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
> > [  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
> > [  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
> > knlGS:0000000000000000
> > [  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
> > [  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >
> >
> > Any help appreciated.
> >
> > Greetings
> >
> > --
> > Stefan Majer



-- 
Stefan Majer

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: null pointer dereference in nvme_tcp_io_work
       [not found]     ` <CAB5Wxwco3KD1e_nRGQ_mWAMa_2d-wP2-1Aao4ZXtDeVgFQQM_w@mail.gmail.com>
@ 2019-12-27  7:54       ` Stefan Majer
  2019-12-28 17:49         ` Stefan Majer
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Majer @ 2019-12-27  7:54 UTC (permalink / raw)
  To: sagi grimberg; +Cc: Keith Busch, linux-nvme

Hi,

no problem, i am also on vacation.

the issue is not reproducible in a pure bare metal environment, target
and host are physical machines.
The environment where it happens both machines are kvm based.

I first have to figure out howto gdb on the kernel crash, thats not my
daily jobs, so please be patient.

Greetings
Stefan

On Fri, Dec 27, 2019 at 8:49 AM sagi grimberg <sagi@grimberg.me> wrote:
>
> Hey,
>
> On vacation so not able to take a look right now, but can you provide a line info from gdb on the RIP line?
>
> Also, did you say that the issue is not reproducible when the host is on bare metal but only on kvm? ( You said the target, but I'm asking about the host).
>
> On Thu, Dec 26, 2019, 23:18 Stefan Majer <stefan.majer@gmail.com> wrote:
>>
>> Hi,
>>
>> i have to add that doing the same on bare metal does work without any problems.
>> I suspect that this is probably caused by the fact that in the above
>> example my target is a qemu-kvm machine with a emulated nvme device.
>> Greetings
>> Stefan
>>
>> On Thu, Dec 26, 2019 at 6:47 PM Keith Busch <kbusch@kernel.org> wrote:
>> >
>> > Adding Sagi.
>> >
>> > On Wed, Dec 25, 2019 at 11:06:17AM +0100, Stefan Majer wrote:
>> > > Hi,
>> > >
>> > > im trying to setup a nvme-over-tcp test environment with a qemu-kvm
>> > > based nvmet-tcp target based on ubuntu-19.10 and a ubuntu-19.10 host
>> > > with kernel 5.4.6 installed. Kernel was taken from
>> > > https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.4.6/ . Same Panic
>> > > occurs with ubuntu 19.10 kernel 5.3.x
>> > >
>> > > After setup the target i can discover and connect the exported nvme
>> > > device on the host with:
>> > > modprobe nvme
>> > > modprobe nvme-tcp
>> > > nvme discover -t tcp -a 192.168.22.1 -s 4420
>> > > nvme connect -t tcp -n nvmet-test -a 192.168.22.1 -s 4420
>> > >
>> > > No errors so far, but when i try to format the device with:
>> > >
>> > > mkfs.ext4 /dev/nvme0n1
>> > >
>> > > The kernel panics with:
>> > > Writing inode tables:
>> > > [  692.651243] BUG: kernel NULL pointer dereference, address: 0000000000000008
>> > > [  692.653158] #PF: supervisor read access in kernel mode
>> > > [  692.653922] #PF: error_code(0x0000) - not-present page
>> > > [  692.653922] PGD 0 P4D 0
>> > > [  692.653922] Oops: 0000 [#1] SMP PTI
>> > > [  692.653922] CPU: 0 PID: 224 Comm: kworker/0:1H Not tainted
>> > > 5.4.6-050406-generic #201912211140
>> > > [  692.653922] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> > > BIOS 0.0.0 02/06/2015
>> > > [  692.653922] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
>> > > [  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
>> > > [  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
>> > > 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
>> > > 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
>> > > 89 75
>> > > [  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
>> > > [  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
>> > > [  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
>> > > [  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
>> > > [  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
>> > > [  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
>> > > [  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
>> > > knlGS:0000000000000000
>> > > [  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > > [  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
>> > > [  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> > > [  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> > > [  692.653922] Call Trace:
>> > > [  692.653922]  process_one_work+0x1ec/0x3a0
>> > > [  692.653922]  worker_thread+0x4d/0x400
>> > > [  692.653922]  kthread+0x104/0x140
>> > > [  692.653922]  ? process_one_work+0x3a0/0x3a0
>> > > [  692.653922]  ? kthread_park+0x90/0x90
>> > > [  692.653922]  ret_from_fork+0x35/0x40
>> > > [  692.653922] Modules linked in: binfmt_misc nvme_tcp nvme_fabrics
>> > > nvme nvme_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink
>> > > nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat
>> > > nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter
>> > > br_netfilter bridge stp llc overlay intel_rapl_msr intel_rapl_common
>> > > kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul
>> > > ghash_clmulni_intel aesni_intel nls_iso8859_1 crypto_simd cryptd
>> > > cirrus glue_helper drm_kms_helper drm input_leds fb_sys_fops joydev
>> > > serio_raw syscopyarea sysfillrect sysimgblt mac_hid qemu_fw_cfg
>> > > bonding sch_fq_codel ipmi_watchdog ipmi_devintf ipmi_msghandler
>> > > virtio_rng ip_tables x_tables autofs4 ahci psmouse virtio_net
>> > > net_failover failover libahci i2c_piix4 pata_acpi floppy
>> > > [  692.653922] CR2: 0000000000000008
>> > > [  692.653922] ---[ end trace d688c2c182feef87 ]---
>> > > [  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
>> > > [  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
>> > > 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
>> > > 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
>> > > 89 75
>> > > [  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
>> > > [  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
>> > > [  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
>> > > [  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
>> > > [  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
>> > > [  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
>> > > [  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
>> > > knlGS:0000000000000000
>> > > [  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > > [  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
>> > > [  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> > > [  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> > >
>> > >
>> > > Any help appreciated.
>> > >
>> > > Greetings
>> > >
>> > > --
>> > > Stefan Majer
>>
>>
>>
>> --
>> Stefan Majer



-- 
Stefan Majer

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: null pointer dereference in nvme_tcp_io_work
  2019-12-27  7:54       ` Stefan Majer
@ 2019-12-28 17:49         ` Stefan Majer
  2019-12-28 17:53           ` Stefan Majer
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Majer @ 2019-12-28 17:49 UTC (permalink / raw)
  To: sagi grimberg; +Cc: Keith Busch, linux-nvme

Hi,

took a while, but now reproduced with ubuntu-19.10 kernel 5.3.x i
installed the debug symbols and ran decodestacktrace.sh from kernel
sources which gives me:

[   29.266954] nvme nvme0: new ctrl: NQN
"nqn.2014-08.org.nvmexpress.discovery", addr 192.168.22.1:4420
[   29.267477] nvme nvme0: Removing ctrl: NQN
"nqn.2014-08.org.nvmexpress.discovery"
[   29.285732] nvme nvme0: creating 1 I/O queues.
[   29.286632] nvme nvme0: mapped 1/0 default/read queues.
[   29.288565] nvme nvme0: new ctrl: NQN "nvmet-test", addr
192.168.22.1:4420
[   29.293146] nvme0n1: detected capacity change from 0 to 1084227584
[   39.196846] BUG: kernel NULL pointer dereference, address:
0000000000000008
[   39.198524] #PF: supervisor read access in kernel mode
[   39.199786] #PF: error_code(0x0000) - not-present page
[   39.201198] PGD 0 P4D 0
[   39.201849] Oops: 0000 [#1] SMP PTI
[   39.202679] CPU: 0 PID: 223 Comm: kworker/0:1H Kdump: loaded Not
tainted 5.3.0-24-generic #26-Ubuntu
[   39.204830] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 0.0.0 02/06/2015
[   39.207205] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
[   39.209005] RIP: 0010:nvme_tcp_io_work+0x341/0x7f0 nvme_tcp
[ 39.210686] Code: 8b 87 98 00 00 00 83 f8 02 0f 85 34 fd ff ff 49 8b
47 28 4d 89 fe 48 89 45 a8 49 8b 46 78 49 8b 56 68 45 8b 66 34 45 2b
66 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48 89
75
All code
========
   0:   8b 87 98 00 00 00       mov    0x98(%rdi),%eax
   6:   83 f8 02                cmp    $0x2,%eax
   9:   0f 85 34 fd ff ff       jne    0xfffffffffffffd43
   f:   49 8b 47 28             mov    0x28(%r15),%rax
  13:   4d 89 fe                mov    %r15,%r14
  16:   48 89 45 a8             mov    %rax,-0x58(%rbp)
  1a:   49 8b 46 78             mov    0x78(%r14),%rax
  1e:   49 8b 56 68             mov    0x68(%r14),%rdx
  22:   45 8b 66 34             mov    0x34(%r14),%r12d
  26:   45 2b 66 38             sub    0x38(%r14),%r12d
  2a:*  8b 58 08                mov    0x8(%rax),%ebx           <--
trapping instruction
  2d:   8b 48 0c                mov    0xc(%rax),%ecx
  30:   4c 8b 28                mov    (%rax),%r13
  33:   48 29 d3                sub    %rdx,%rbx
  36:   48 8d 34 11             lea    (%rcx,%rdx,1),%rsi
  3a:   4c 39 e3                cmp    %r12,%rbx
  3d:   48                      rex.W
  3e:   89                      .byte 0x89
  3f:   75                      .byte 0x75

Code starting with the faulting instruction
===========================================
   0:   8b 58 08                mov    0x8(%rax),%ebx
   3:   8b 48 0c                mov    0xc(%rax),%ecx
   6:   4c 8b 28                mov    (%rax),%r13
   9:   48 29 d3                sub    %rdx,%rbx
   c:   48 8d 34 11             lea    (%rcx,%rdx,1),%rsi
  10:   4c 39 e3                cmp    %r12,%rbx
  13:   48                      rex.W
  14:   89                      .byte 0x89
  15:   75                      .byte 0x75
[   39.216464] RSP: 0018:ffffb0f8c0453dd8 EFLAGS: 00010206
[   39.218053] RAX: 0000000000000000 RBX: 00000000b4e42801 RCX: 0000000000000000
[   39.219803] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9dd8e6e49478
[   39.221766] RBP: ffffb0f8c0453e60 R08: 0000000000001000 R09: 0000000002800809
[   39.223635] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
[   39.226010] R13: 0000000000000048 R14: ffff9dd8e6e49418 R15: ffff9dd8e6e49418
[   39.228992] FS:  0000000000000000(0000) GS:ffff9dd8ff600000(0000)
knlGS:0000000000000000
[   39.233660] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   39.237863] CR2: 0000000000000008 CR3: 0000000067c6a005 CR4: 0000000000360ef0
[   39.241807] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   39.244496] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   39.246569] Call Trace:
[   39.247272] process_one_work
(/build/linux-4AS01l/linux-5.3.0/arch/x86/include/asm/jump_label.h:25
/build/linux-4AS01l/linux-5.3.0/include/linux/jump_label.h:200
/build/linux-4AS01l/linux-5.3.0/include/trace/events/workqueu
e.h:114 /build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2274)
[   39.248361] worker_thread
(/build/linux-4AS01l/linux-5.3.0/include/linux/compiler.h:199
/build/linux-4AS01l/linux-5.3.0/include/linux/list.h:268
/build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2416)
[   39.249364] kthread (/build/linux-4AS01l/linux-5.3.0/kernel/kthread.c:255)
[   39.250243] ? process_one_work
(/build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2358)
[   39.251485] ? kthread_park
(/build/linux-4AS01l/linux-5.3.0/kernel/kthread.c:215)
[   39.252474] ret_from_fork
(/build/linux-4AS01l/linux-5.3.0/arch/x86/entry/entry_64.S:358)
[   39.253476] Modules linked in: nvme_tcp nvme_fabrics nvme nvme_core
xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user
xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack
nf_defrag_ipv6 nf_
defrag_ipv4 libcrc32c bpfilter br_netfilter bridge stp llc aufs
overlay intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
aes_x86_64 crypto_simd cirrus nls_i
so8859_1 cryptd glue_helper drm_kms_helper drm input_leds joydev
fb_sys_fops serio_raw syscopyarea sysfillrect sysimgblt mac_hid
qemu_fw_cfg bonding sch_fq_codel ipmi_watchdog ipmi_devintf
ipmi_msghandler virtio_rng ip_tables
x_tables autofs4 psmouse virtio_net net_failover failover ahci libahci
i2c_piix4 pata_acpi floppy
[   39.269809] CR2: 0000000000000008

greetings
Stefan

On Fri, Dec 27, 2019 at 8:54 AM Stefan Majer <stefan.majer@gmail.com> wrote:
>
> Hi,
>
> no problem, i am also on vacation.
>
> the issue is not reproducible in a pure bare metal environment, target
> and host are physical machines.
> The environment where it happens both machines are kvm based.
>
> I first have to figure out howto gdb on the kernel crash, thats not my
> daily jobs, so please be patient.
>
> Greetings
> Stefan
>
> On Fri, Dec 27, 2019 at 8:49 AM sagi grimberg <sagi@grimberg.me> wrote:
> >
> > Hey,
> >
> > On vacation so not able to take a look right now, but can you provide a line info from gdb on the RIP line?
> >
> > Also, did you say that the issue is not reproducible when the host is on bare metal but only on kvm? ( You said the target, but I'm asking about the host).
> >
> > On Thu, Dec 26, 2019, 23:18 Stefan Majer <stefan.majer@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> i have to add that doing the same on bare metal does work without any problems.
> >> I suspect that this is probably caused by the fact that in the above
> >> example my target is a qemu-kvm machine with a emulated nvme device.
> >> Greetings
> >> Stefan
> >>
> >> On Thu, Dec 26, 2019 at 6:47 PM Keith Busch <kbusch@kernel.org> wrote:
> >> >
> >> > Adding Sagi.
> >> >
> >> > On Wed, Dec 25, 2019 at 11:06:17AM +0100, Stefan Majer wrote:
> >> > > Hi,
> >> > >
> >> > > im trying to setup a nvme-over-tcp test environment with a qemu-kvm
> >> > > based nvmet-tcp target based on ubuntu-19.10 and a ubuntu-19.10 host
> >> > > with kernel 5.4.6 installed. Kernel was taken from
> >> > > https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.4.6/ . Same Panic
> >> > > occurs with ubuntu 19.10 kernel 5.3.x
> >> > >
> >> > > After setup the target i can discover and connect the exported nvme
> >> > > device on the host with:
> >> > > modprobe nvme
> >> > > modprobe nvme-tcp
> >> > > nvme discover -t tcp -a 192.168.22.1 -s 4420
> >> > > nvme connect -t tcp -n nvmet-test -a 192.168.22.1 -s 4420
> >> > >
> >> > > No errors so far, but when i try to format the device with:
> >> > >
> >> > > mkfs.ext4 /dev/nvme0n1
> >> > >
> >> > > The kernel panics with:
> >> > > Writing inode tables:
> >> > > [  692.651243] BUG: kernel NULL pointer dereference, address: 0000000000000008
> >> > > [  692.653158] #PF: supervisor read access in kernel mode
> >> > > [  692.653922] #PF: error_code(0x0000) - not-present page
> >> > > [  692.653922] PGD 0 P4D 0
> >> > > [  692.653922] Oops: 0000 [#1] SMP PTI
> >> > > [  692.653922] CPU: 0 PID: 224 Comm: kworker/0:1H Not tainted
> >> > > 5.4.6-050406-generic #201912211140
> >> > > [  692.653922] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> >> > > BIOS 0.0.0 02/06/2015
> >> > > [  692.653922] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
> >> > > [  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
> >> > > [  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
> >> > > 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
> >> > > 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
> >> > > 89 75
> >> > > [  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
> >> > > [  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
> >> > > [  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
> >> > > [  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
> >> > > [  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
> >> > > [  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
> >> > > [  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
> >> > > knlGS:0000000000000000
> >> > > [  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> > > [  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
> >> > > [  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> > > [  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >> > > [  692.653922] Call Trace:
> >> > > [  692.653922]  process_one_work+0x1ec/0x3a0
> >> > > [  692.653922]  worker_thread+0x4d/0x400
> >> > > [  692.653922]  kthread+0x104/0x140
> >> > > [  692.653922]  ? process_one_work+0x3a0/0x3a0
> >> > > [  692.653922]  ? kthread_park+0x90/0x90
> >> > > [  692.653922]  ret_from_fork+0x35/0x40
> >> > > [  692.653922] Modules linked in: binfmt_misc nvme_tcp nvme_fabrics
> >> > > nvme nvme_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink
> >> > > nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat
> >> > > nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter
> >> > > br_netfilter bridge stp llc overlay intel_rapl_msr intel_rapl_common
> >> > > kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul
> >> > > ghash_clmulni_intel aesni_intel nls_iso8859_1 crypto_simd cryptd
> >> > > cirrus glue_helper drm_kms_helper drm input_leds fb_sys_fops joydev
> >> > > serio_raw syscopyarea sysfillrect sysimgblt mac_hid qemu_fw_cfg
> >> > > bonding sch_fq_codel ipmi_watchdog ipmi_devintf ipmi_msghandler
> >> > > virtio_rng ip_tables x_tables autofs4 ahci psmouse virtio_net
> >> > > net_failover failover libahci i2c_piix4 pata_acpi floppy
> >> > > [  692.653922] CR2: 0000000000000008
> >> > > [  692.653922] ---[ end trace d688c2c182feef87 ]---
> >> > > [  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
> >> > > [  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
> >> > > 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
> >> > > 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
> >> > > 89 75
> >> > > [  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
> >> > > [  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
> >> > > [  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
> >> > > [  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
> >> > > [  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
> >> > > [  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
> >> > > [  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
> >> > > knlGS:0000000000000000
> >> > > [  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> > > [  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
> >> > > [  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> > > [  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >> > >
> >> > >
> >> > > Any help appreciated.
> >> > >
> >> > > Greetings
> >> > >
> >> > > --
> >> > > Stefan Majer
> >>
> >>
> >>
> >> --
> >> Stefan Majer
>
>
>
> --
> Stefan Majer



-- 
Stefan Majer

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: null pointer dereference in nvme_tcp_io_work
  2019-12-28 17:49         ` Stefan Majer
@ 2019-12-28 17:53           ` Stefan Majer
  2020-01-07 15:41             ` Stefan Majer
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Majer @ 2019-12-28 17:53 UTC (permalink / raw)
  To: sagi grimberg; +Cc: Keith Busch, linux-nvme

I have to add:

./faddr2line  /var/lib/debug/lib/modules/5.3.0-24-generic/kernel/drivers/nvme/host/nvme-tcp.ko
nvme_tcp_io_work+0x341/0x7f0
nvme_tcp_io_work+0x341/0x7f0:
nvme_tcp_req_cur_length at
/build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:189
(inlined by) nvme_tcp_try_send_data at
/build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:854
(inlined by) nvme_tcp_try_send at
/build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:1011
(inlined by) nvme_tcp_io_work at
/build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:1048

On Sat, Dec 28, 2019 at 6:49 PM Stefan Majer <stefan.majer@gmail.com> wrote:
>
> Hi,
>
> took a while, but now reproduced with ubuntu-19.10 kernel 5.3.x i
> installed the debug symbols and ran decodestacktrace.sh from kernel
> sources which gives me:
>
> [   29.266954] nvme nvme0: new ctrl: NQN
> "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.22.1:4420
> [   29.267477] nvme nvme0: Removing ctrl: NQN
> "nqn.2014-08.org.nvmexpress.discovery"
> [   29.285732] nvme nvme0: creating 1 I/O queues.
> [   29.286632] nvme nvme0: mapped 1/0 default/read queues.
> [   29.288565] nvme nvme0: new ctrl: NQN "nvmet-test", addr
> 192.168.22.1:4420
> [   29.293146] nvme0n1: detected capacity change from 0 to 1084227584
> [   39.196846] BUG: kernel NULL pointer dereference, address:
> 0000000000000008
> [   39.198524] #PF: supervisor read access in kernel mode
> [   39.199786] #PF: error_code(0x0000) - not-present page
> [   39.201198] PGD 0 P4D 0
> [   39.201849] Oops: 0000 [#1] SMP PTI
> [   39.202679] CPU: 0 PID: 223 Comm: kworker/0:1H Kdump: loaded Not
> tainted 5.3.0-24-generic #26-Ubuntu
> [   39.204830] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 0.0.0 02/06/2015
> [   39.207205] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
> [   39.209005] RIP: 0010:nvme_tcp_io_work+0x341/0x7f0 nvme_tcp
> [ 39.210686] Code: 8b 87 98 00 00 00 83 f8 02 0f 85 34 fd ff ff 49 8b
> 47 28 4d 89 fe 48 89 45 a8 49 8b 46 78 49 8b 56 68 45 8b 66 34 45 2b
> 66 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48 89
> 75
> All code
> ========
>    0:   8b 87 98 00 00 00       mov    0x98(%rdi),%eax
>    6:   83 f8 02                cmp    $0x2,%eax
>    9:   0f 85 34 fd ff ff       jne    0xfffffffffffffd43
>    f:   49 8b 47 28             mov    0x28(%r15),%rax
>   13:   4d 89 fe                mov    %r15,%r14
>   16:   48 89 45 a8             mov    %rax,-0x58(%rbp)
>   1a:   49 8b 46 78             mov    0x78(%r14),%rax
>   1e:   49 8b 56 68             mov    0x68(%r14),%rdx
>   22:   45 8b 66 34             mov    0x34(%r14),%r12d
>   26:   45 2b 66 38             sub    0x38(%r14),%r12d
>   2a:*  8b 58 08                mov    0x8(%rax),%ebx           <--
> trapping instruction
>   2d:   8b 48 0c                mov    0xc(%rax),%ecx
>   30:   4c 8b 28                mov    (%rax),%r13
>   33:   48 29 d3                sub    %rdx,%rbx
>   36:   48 8d 34 11             lea    (%rcx,%rdx,1),%rsi
>   3a:   4c 39 e3                cmp    %r12,%rbx
>   3d:   48                      rex.W
>   3e:   89                      .byte 0x89
>   3f:   75                      .byte 0x75
>
> Code starting with the faulting instruction
> ===========================================
>    0:   8b 58 08                mov    0x8(%rax),%ebx
>    3:   8b 48 0c                mov    0xc(%rax),%ecx
>    6:   4c 8b 28                mov    (%rax),%r13
>    9:   48 29 d3                sub    %rdx,%rbx
>    c:   48 8d 34 11             lea    (%rcx,%rdx,1),%rsi
>   10:   4c 39 e3                cmp    %r12,%rbx
>   13:   48                      rex.W
>   14:   89                      .byte 0x89
>   15:   75                      .byte 0x75
> [   39.216464] RSP: 0018:ffffb0f8c0453dd8 EFLAGS: 00010206
> [   39.218053] RAX: 0000000000000000 RBX: 00000000b4e42801 RCX: 0000000000000000
> [   39.219803] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9dd8e6e49478
> [   39.221766] RBP: ffffb0f8c0453e60 R08: 0000000000001000 R09: 0000000002800809
> [   39.223635] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
> [   39.226010] R13: 0000000000000048 R14: ffff9dd8e6e49418 R15: ffff9dd8e6e49418
> [   39.228992] FS:  0000000000000000(0000) GS:ffff9dd8ff600000(0000)
> knlGS:0000000000000000
> [   39.233660] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   39.237863] CR2: 0000000000000008 CR3: 0000000067c6a005 CR4: 0000000000360ef0
> [   39.241807] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   39.244496] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   39.246569] Call Trace:
> [   39.247272] process_one_work
> (/build/linux-4AS01l/linux-5.3.0/arch/x86/include/asm/jump_label.h:25
> /build/linux-4AS01l/linux-5.3.0/include/linux/jump_label.h:200
> /build/linux-4AS01l/linux-5.3.0/include/trace/events/workqueu
> e.h:114 /build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2274)
> [   39.248361] worker_thread
> (/build/linux-4AS01l/linux-5.3.0/include/linux/compiler.h:199
> /build/linux-4AS01l/linux-5.3.0/include/linux/list.h:268
> /build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2416)
> [   39.249364] kthread (/build/linux-4AS01l/linux-5.3.0/kernel/kthread.c:255)
> [   39.250243] ? process_one_work
> (/build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2358)
> [   39.251485] ? kthread_park
> (/build/linux-4AS01l/linux-5.3.0/kernel/kthread.c:215)
> [   39.252474] ret_from_fork
> (/build/linux-4AS01l/linux-5.3.0/arch/x86/entry/entry_64.S:358)
> [   39.253476] Modules linked in: nvme_tcp nvme_fabrics nvme nvme_core
> xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user
> xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack
> nf_defrag_ipv6 nf_
> defrag_ipv4 libcrc32c bpfilter br_netfilter bridge stp llc aufs
> overlay intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass
> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
> aes_x86_64 crypto_simd cirrus nls_i
> so8859_1 cryptd glue_helper drm_kms_helper drm input_leds joydev
> fb_sys_fops serio_raw syscopyarea sysfillrect sysimgblt mac_hid
> qemu_fw_cfg bonding sch_fq_codel ipmi_watchdog ipmi_devintf
> ipmi_msghandler virtio_rng ip_tables
> x_tables autofs4 psmouse virtio_net net_failover failover ahci libahci
> i2c_piix4 pata_acpi floppy
> [   39.269809] CR2: 0000000000000008
>
> greetings
> Stefan
>
> On Fri, Dec 27, 2019 at 8:54 AM Stefan Majer <stefan.majer@gmail.com> wrote:
> >
> > Hi,
> >
> > no problem, i am also on vacation.
> >
> > the issue is not reproducible in a pure bare metal environment, target
> > and host are physical machines.
> > The environment where it happens both machines are kvm based.
> >
> > I first have to figure out howto gdb on the kernel crash, thats not my
> > daily jobs, so please be patient.
> >
> > Greetings
> > Stefan
> >
> > On Fri, Dec 27, 2019 at 8:49 AM sagi grimberg <sagi@grimberg.me> wrote:
> > >
> > > Hey,
> > >
> > > On vacation so not able to take a look right now, but can you provide a line info from gdb on the RIP line?
> > >
> > > Also, did you say that the issue is not reproducible when the host is on bare metal but only on kvm? ( You said the target, but I'm asking about the host).
> > >
> > > On Thu, Dec 26, 2019, 23:18 Stefan Majer <stefan.majer@gmail.com> wrote:
> > >>
> > >> Hi,
> > >>
> > >> i have to add that doing the same on bare metal does work without any problems.
> > >> I suspect that this is probably caused by the fact that in the above
> > >> example my target is a qemu-kvm machine with a emulated nvme device.
> > >> Greetings
> > >> Stefan
> > >>
> > >> On Thu, Dec 26, 2019 at 6:47 PM Keith Busch <kbusch@kernel.org> wrote:
> > >> >
> > >> > Adding Sagi.
> > >> >
> > >> > On Wed, Dec 25, 2019 at 11:06:17AM +0100, Stefan Majer wrote:
> > >> > > Hi,
> > >> > >
> > >> > > im trying to setup a nvme-over-tcp test environment with a qemu-kvm
> > >> > > based nvmet-tcp target based on ubuntu-19.10 and a ubuntu-19.10 host
> > >> > > with kernel 5.4.6 installed. Kernel was taken from
> > >> > > https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.4.6/ . Same Panic
> > >> > > occurs with ubuntu 19.10 kernel 5.3.x
> > >> > >
> > >> > > After setup the target i can discover and connect the exported nvme
> > >> > > device on the host with:
> > >> > > modprobe nvme
> > >> > > modprobe nvme-tcp
> > >> > > nvme discover -t tcp -a 192.168.22.1 -s 4420
> > >> > > nvme connect -t tcp -n nvmet-test -a 192.168.22.1 -s 4420
> > >> > >
> > >> > > No errors so far, but when i try to format the device with:
> > >> > >
> > >> > > mkfs.ext4 /dev/nvme0n1
> > >> > >
> > >> > > The kernel panics with:
> > >> > > Writing inode tables:
> > >> > > [  692.651243] BUG: kernel NULL pointer dereference, address: 0000000000000008
> > >> > > [  692.653158] #PF: supervisor read access in kernel mode
> > >> > > [  692.653922] #PF: error_code(0x0000) - not-present page
> > >> > > [  692.653922] PGD 0 P4D 0
> > >> > > [  692.653922] Oops: 0000 [#1] SMP PTI
> > >> > > [  692.653922] CPU: 0 PID: 224 Comm: kworker/0:1H Not tainted
> > >> > > 5.4.6-050406-generic #201912211140
> > >> > > [  692.653922] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > >> > > BIOS 0.0.0 02/06/2015
> > >> > > [  692.653922] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
> > >> > > [  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
> > >> > > [  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
> > >> > > 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
> > >> > > 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
> > >> > > 89 75
> > >> > > [  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
> > >> > > [  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
> > >> > > [  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
> > >> > > [  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
> > >> > > [  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
> > >> > > [  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
> > >> > > [  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
> > >> > > knlGS:0000000000000000
> > >> > > [  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > >> > > [  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
> > >> > > [  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > >> > > [  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > >> > > [  692.653922] Call Trace:
> > >> > > [  692.653922]  process_one_work+0x1ec/0x3a0
> > >> > > [  692.653922]  worker_thread+0x4d/0x400
> > >> > > [  692.653922]  kthread+0x104/0x140
> > >> > > [  692.653922]  ? process_one_work+0x3a0/0x3a0
> > >> > > [  692.653922]  ? kthread_park+0x90/0x90
> > >> > > [  692.653922]  ret_from_fork+0x35/0x40
> > >> > > [  692.653922] Modules linked in: binfmt_misc nvme_tcp nvme_fabrics
> > >> > > nvme nvme_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink
> > >> > > nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat
> > >> > > nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter
> > >> > > br_netfilter bridge stp llc overlay intel_rapl_msr intel_rapl_common
> > >> > > kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul
> > >> > > ghash_clmulni_intel aesni_intel nls_iso8859_1 crypto_simd cryptd
> > >> > > cirrus glue_helper drm_kms_helper drm input_leds fb_sys_fops joydev
> > >> > > serio_raw syscopyarea sysfillrect sysimgblt mac_hid qemu_fw_cfg
> > >> > > bonding sch_fq_codel ipmi_watchdog ipmi_devintf ipmi_msghandler
> > >> > > virtio_rng ip_tables x_tables autofs4 ahci psmouse virtio_net
> > >> > > net_failover failover libahci i2c_piix4 pata_acpi floppy
> > >> > > [  692.653922] CR2: 0000000000000008
> > >> > > [  692.653922] ---[ end trace d688c2c182feef87 ]---
> > >> > > [  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
> > >> > > [  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
> > >> > > 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
> > >> > > 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
> > >> > > 89 75
> > >> > > [  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
> > >> > > [  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
> > >> > > [  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
> > >> > > [  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
> > >> > > [  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
> > >> > > [  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
> > >> > > [  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
> > >> > > knlGS:0000000000000000
> > >> > > [  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > >> > > [  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
> > >> > > [  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > >> > > [  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > >> > >
> > >> > >
> > >> > > Any help appreciated.
> > >> > >
> > >> > > Greetings
> > >> > >
> > >> > > --
> > >> > > Stefan Majer
> > >>
> > >>
> > >>
> > >> --
> > >> Stefan Majer
> >
> >
> >
> > --
> > Stefan Majer
>
>
>
> --
> Stefan Majer



-- 
Stefan Majer

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: null pointer dereference in nvme_tcp_io_work
  2019-12-28 17:53           ` Stefan Majer
@ 2020-01-07 15:41             ` Stefan Majer
  2020-01-07 16:48               ` Nadolski, Edmund
  2020-01-15 20:03               ` Sagi Grimberg
  0 siblings, 2 replies; 10+ messages in thread
From: Stefan Majer @ 2020-01-07 15:41 UTC (permalink / raw)
  To: sagi grimberg; +Cc: Keith Busch, linux-nvme

Hi,

is there anything i can help with to further nail down the problem ?

please let me know.
Stefan

On Sat, Dec 28, 2019 at 6:53 PM Stefan Majer <stefan.majer@gmail.com> wrote:
>
> I have to add:
>
> ./faddr2line  /var/lib/debug/lib/modules/5.3.0-24-generic/kernel/drivers/nvme/host/nvme-tcp.ko
> nvme_tcp_io_work+0x341/0x7f0
> nvme_tcp_io_work+0x341/0x7f0:
> nvme_tcp_req_cur_length at
> /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:189
> (inlined by) nvme_tcp_try_send_data at
> /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:854
> (inlined by) nvme_tcp_try_send at
> /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:1011
> (inlined by) nvme_tcp_io_work at
> /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:1048
>
> On Sat, Dec 28, 2019 at 6:49 PM Stefan Majer <stefan.majer@gmail.com> wrote:
> >
> > Hi,
> >
> > took a while, but now reproduced with ubuntu-19.10 kernel 5.3.x i
> > installed the debug symbols and ran decodestacktrace.sh from kernel
> > sources which gives me:
> >
> > [   29.266954] nvme nvme0: new ctrl: NQN
> > "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.22.1:4420
> > [   29.267477] nvme nvme0: Removing ctrl: NQN
> > "nqn.2014-08.org.nvmexpress.discovery"
> > [   29.285732] nvme nvme0: creating 1 I/O queues.
> > [   29.286632] nvme nvme0: mapped 1/0 default/read queues.
> > [   29.288565] nvme nvme0: new ctrl: NQN "nvmet-test", addr
> > 192.168.22.1:4420
> > [   29.293146] nvme0n1: detected capacity change from 0 to 1084227584
> > [   39.196846] BUG: kernel NULL pointer dereference, address:
> > 0000000000000008
> > [   39.198524] #PF: supervisor read access in kernel mode
> > [   39.199786] #PF: error_code(0x0000) - not-present page
> > [   39.201198] PGD 0 P4D 0
> > [   39.201849] Oops: 0000 [#1] SMP PTI
> > [   39.202679] CPU: 0 PID: 223 Comm: kworker/0:1H Kdump: loaded Not
> > tainted 5.3.0-24-generic #26-Ubuntu
> > [   39.204830] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > BIOS 0.0.0 02/06/2015
> > [   39.207205] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
> > [   39.209005] RIP: 0010:nvme_tcp_io_work+0x341/0x7f0 nvme_tcp
> > [ 39.210686] Code: 8b 87 98 00 00 00 83 f8 02 0f 85 34 fd ff ff 49 8b
> > 47 28 4d 89 fe 48 89 45 a8 49 8b 46 78 49 8b 56 68 45 8b 66 34 45 2b
> > 66 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48 89
> > 75
> > All code
> > ========
> >    0:   8b 87 98 00 00 00       mov    0x98(%rdi),%eax
> >    6:   83 f8 02                cmp    $0x2,%eax
> >    9:   0f 85 34 fd ff ff       jne    0xfffffffffffffd43
> >    f:   49 8b 47 28             mov    0x28(%r15),%rax
> >   13:   4d 89 fe                mov    %r15,%r14
> >   16:   48 89 45 a8             mov    %rax,-0x58(%rbp)
> >   1a:   49 8b 46 78             mov    0x78(%r14),%rax
> >   1e:   49 8b 56 68             mov    0x68(%r14),%rdx
> >   22:   45 8b 66 34             mov    0x34(%r14),%r12d
> >   26:   45 2b 66 38             sub    0x38(%r14),%r12d
> >   2a:*  8b 58 08                mov    0x8(%rax),%ebx           <--
> > trapping instruction
> >   2d:   8b 48 0c                mov    0xc(%rax),%ecx
> >   30:   4c 8b 28                mov    (%rax),%r13
> >   33:   48 29 d3                sub    %rdx,%rbx
> >   36:   48 8d 34 11             lea    (%rcx,%rdx,1),%rsi
> >   3a:   4c 39 e3                cmp    %r12,%rbx
> >   3d:   48                      rex.W
> >   3e:   89                      .byte 0x89
> >   3f:   75                      .byte 0x75
> >
> > Code starting with the faulting instruction
> > ===========================================
> >    0:   8b 58 08                mov    0x8(%rax),%ebx
> >    3:   8b 48 0c                mov    0xc(%rax),%ecx
> >    6:   4c 8b 28                mov    (%rax),%r13
> >    9:   48 29 d3                sub    %rdx,%rbx
> >    c:   48 8d 34 11             lea    (%rcx,%rdx,1),%rsi
> >   10:   4c 39 e3                cmp    %r12,%rbx
> >   13:   48                      rex.W
> >   14:   89                      .byte 0x89
> >   15:   75                      .byte 0x75
> > [   39.216464] RSP: 0018:ffffb0f8c0453dd8 EFLAGS: 00010206
> > [   39.218053] RAX: 0000000000000000 RBX: 00000000b4e42801 RCX: 0000000000000000
> > [   39.219803] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9dd8e6e49478
> > [   39.221766] RBP: ffffb0f8c0453e60 R08: 0000000000001000 R09: 0000000002800809
> > [   39.223635] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
> > [   39.226010] R13: 0000000000000048 R14: ffff9dd8e6e49418 R15: ffff9dd8e6e49418
> > [   39.228992] FS:  0000000000000000(0000) GS:ffff9dd8ff600000(0000)
> > knlGS:0000000000000000
> > [   39.233660] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   39.237863] CR2: 0000000000000008 CR3: 0000000067c6a005 CR4: 0000000000360ef0
> > [   39.241807] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [   39.244496] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [   39.246569] Call Trace:
> > [   39.247272] process_one_work
> > (/build/linux-4AS01l/linux-5.3.0/arch/x86/include/asm/jump_label.h:25
> > /build/linux-4AS01l/linux-5.3.0/include/linux/jump_label.h:200
> > /build/linux-4AS01l/linux-5.3.0/include/trace/events/workqueu
> > e.h:114 /build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2274)
> > [   39.248361] worker_thread
> > (/build/linux-4AS01l/linux-5.3.0/include/linux/compiler.h:199
> > /build/linux-4AS01l/linux-5.3.0/include/linux/list.h:268
> > /build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2416)
> > [   39.249364] kthread (/build/linux-4AS01l/linux-5.3.0/kernel/kthread.c:255)
> > [   39.250243] ? process_one_work
> > (/build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2358)
> > [   39.251485] ? kthread_park
> > (/build/linux-4AS01l/linux-5.3.0/kernel/kthread.c:215)
> > [   39.252474] ret_from_fork
> > (/build/linux-4AS01l/linux-5.3.0/arch/x86/entry/entry_64.S:358)
> > [   39.253476] Modules linked in: nvme_tcp nvme_fabrics nvme nvme_core
> > xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user
> > xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack
> > nf_defrag_ipv6 nf_
> > defrag_ipv4 libcrc32c bpfilter br_netfilter bridge stp llc aufs
> > overlay intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass
> > crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
> > aes_x86_64 crypto_simd cirrus nls_i
> > so8859_1 cryptd glue_helper drm_kms_helper drm input_leds joydev
> > fb_sys_fops serio_raw syscopyarea sysfillrect sysimgblt mac_hid
> > qemu_fw_cfg bonding sch_fq_codel ipmi_watchdog ipmi_devintf
> > ipmi_msghandler virtio_rng ip_tables
> > x_tables autofs4 psmouse virtio_net net_failover failover ahci libahci
> > i2c_piix4 pata_acpi floppy
> > [   39.269809] CR2: 0000000000000008
> >
> > greetings
> > Stefan
> >
> > On Fri, Dec 27, 2019 at 8:54 AM Stefan Majer <stefan.majer@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > no problem, i am also on vacation.
> > >
> > > the issue is not reproducible in a pure bare metal environment, target
> > > and host are physical machines.
> > > The environment where it happens both machines are kvm based.
> > >
> > > I first have to figure out howto gdb on the kernel crash, thats not my
> > > daily jobs, so please be patient.
> > >
> > > Greetings
> > > Stefan
> > >
> > > On Fri, Dec 27, 2019 at 8:49 AM sagi grimberg <sagi@grimberg.me> wrote:
> > > >
> > > > Hey,
> > > >
> > > > On vacation so not able to take a look right now, but can you provide a line info from gdb on the RIP line?
> > > >
> > > > Also, did you say that the issue is not reproducible when the host is on bare metal but only on kvm? ( You said the target, but I'm asking about the host).
> > > >
> > > > On Thu, Dec 26, 2019, 23:18 Stefan Majer <stefan.majer@gmail.com> wrote:
> > > >>
> > > >> Hi,
> > > >>
> > > >> i have to add that doing the same on bare metal does work without any problems.
> > > >> I suspect that this is probably caused by the fact that in the above
> > > >> example my target is a qemu-kvm machine with a emulated nvme device.
> > > >> Greetings
> > > >> Stefan
> > > >>
> > > >> On Thu, Dec 26, 2019 at 6:47 PM Keith Busch <kbusch@kernel.org> wrote:
> > > >> >
> > > >> > Adding Sagi.
> > > >> >
> > > >> > On Wed, Dec 25, 2019 at 11:06:17AM +0100, Stefan Majer wrote:
> > > >> > > Hi,
> > > >> > >
> > > >> > > im trying to setup a nvme-over-tcp test environment with a qemu-kvm
> > > >> > > based nvmet-tcp target based on ubuntu-19.10 and a ubuntu-19.10 host
> > > >> > > with kernel 5.4.6 installed. Kernel was taken from
> > > >> > > https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.4.6/ . Same Panic
> > > >> > > occurs with ubuntu 19.10 kernel 5.3.x
> > > >> > >
> > > >> > > After setup the target i can discover and connect the exported nvme
> > > >> > > device on the host with:
> > > >> > > modprobe nvme
> > > >> > > modprobe nvme-tcp
> > > >> > > nvme discover -t tcp -a 192.168.22.1 -s 4420
> > > >> > > nvme connect -t tcp -n nvmet-test -a 192.168.22.1 -s 4420
> > > >> > >
> > > >> > > No errors so far, but when i try to format the device with:
> > > >> > >
> > > >> > > mkfs.ext4 /dev/nvme0n1
> > > >> > >
> > > >> > > The kernel panics with:
> > > >> > > Writing inode tables:
> > > >> > > [  692.651243] BUG: kernel NULL pointer dereference, address: 0000000000000008
> > > >> > > [  692.653158] #PF: supervisor read access in kernel mode
> > > >> > > [  692.653922] #PF: error_code(0x0000) - not-present page
> > > >> > > [  692.653922] PGD 0 P4D 0
> > > >> > > [  692.653922] Oops: 0000 [#1] SMP PTI
> > > >> > > [  692.653922] CPU: 0 PID: 224 Comm: kworker/0:1H Not tainted
> > > >> > > 5.4.6-050406-generic #201912211140
> > > >> > > [  692.653922] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > > >> > > BIOS 0.0.0 02/06/2015
> > > >> > > [  692.653922] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
> > > >> > > [  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
> > > >> > > [  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
> > > >> > > 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
> > > >> > > 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
> > > >> > > 89 75
> > > >> > > [  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
> > > >> > > [  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
> > > >> > > [  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
> > > >> > > [  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
> > > >> > > [  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
> > > >> > > [  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
> > > >> > > [  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
> > > >> > > knlGS:0000000000000000
> > > >> > > [  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > >> > > [  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
> > > >> > > [  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > >> > > [  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > >> > > [  692.653922] Call Trace:
> > > >> > > [  692.653922]  process_one_work+0x1ec/0x3a0
> > > >> > > [  692.653922]  worker_thread+0x4d/0x400
> > > >> > > [  692.653922]  kthread+0x104/0x140
> > > >> > > [  692.653922]  ? process_one_work+0x3a0/0x3a0
> > > >> > > [  692.653922]  ? kthread_park+0x90/0x90
> > > >> > > [  692.653922]  ret_from_fork+0x35/0x40
> > > >> > > [  692.653922] Modules linked in: binfmt_misc nvme_tcp nvme_fabrics
> > > >> > > nvme nvme_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink
> > > >> > > nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat
> > > >> > > nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter
> > > >> > > br_netfilter bridge stp llc overlay intel_rapl_msr intel_rapl_common
> > > >> > > kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul
> > > >> > > ghash_clmulni_intel aesni_intel nls_iso8859_1 crypto_simd cryptd
> > > >> > > cirrus glue_helper drm_kms_helper drm input_leds fb_sys_fops joydev
> > > >> > > serio_raw syscopyarea sysfillrect sysimgblt mac_hid qemu_fw_cfg
> > > >> > > bonding sch_fq_codel ipmi_watchdog ipmi_devintf ipmi_msghandler
> > > >> > > virtio_rng ip_tables x_tables autofs4 ahci psmouse virtio_net
> > > >> > > net_failover failover libahci i2c_piix4 pata_acpi floppy
> > > >> > > [  692.653922] CR2: 0000000000000008
> > > >> > > [  692.653922] ---[ end trace d688c2c182feef87 ]---
> > > >> > > [  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
> > > >> > > [  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
> > > >> > > 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
> > > >> > > 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
> > > >> > > 89 75
> > > >> > > [  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
> > > >> > > [  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
> > > >> > > [  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
> > > >> > > [  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
> > > >> > > [  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
> > > >> > > [  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
> > > >> > > [  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
> > > >> > > knlGS:0000000000000000
> > > >> > > [  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > >> > > [  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
> > > >> > > [  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > >> > > [  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > >> > >
> > > >> > >
> > > >> > > Any help appreciated.
> > > >> > >
> > > >> > > Greetings
> > > >> > >
> > > >> > > --
> > > >> > > Stefan Majer
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >> Stefan Majer
> > >
> > >
> > >
> > > --
> > > Stefan Majer
> >
> >
> >
> > --
> > Stefan Majer
>
>
>
> --
> Stefan Majer



-- 
Stefan Majer

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: null pointer dereference in nvme_tcp_io_work
  2020-01-07 15:41             ` Stefan Majer
@ 2020-01-07 16:48               ` Nadolski, Edmund
  2020-01-15 20:03               ` Sagi Grimberg
  1 sibling, 0 replies; 10+ messages in thread
From: Nadolski, Edmund @ 2020-01-07 16:48 UTC (permalink / raw)
  To: Stefan Majer, sagi grimberg; +Cc: Keith Busch, linux-nvme

On 1/7/2020 8:41 AM, Stefan Majer wrote:
> Hi,
> 
> is there anything i can help with to further nail down the problem ?
> 
> please let me know.
> Stefan

Is there any kind of reset in progress when this hits?

Thanks,
Ed

> On Sat, Dec 28, 2019 at 6:53 PM Stefan Majer <stefan.majer@gmail.com> wrote:
>>
>> I have to add:
>>
>> ./faddr2line  /var/lib/debug/lib/modules/5.3.0-24-generic/kernel/drivers/nvme/host/nvme-tcp.ko
>> nvme_tcp_io_work+0x341/0x7f0
>> nvme_tcp_io_work+0x341/0x7f0:
>> nvme_tcp_req_cur_length at
>> /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:189
>> (inlined by) nvme_tcp_try_send_data at
>> /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:854
>> (inlined by) nvme_tcp_try_send at
>> /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:1011
>> (inlined by) nvme_tcp_io_work at
>> /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:1048
>>
>> On Sat, Dec 28, 2019 at 6:49 PM Stefan Majer <stefan.majer@gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > took a while, but now reproduced with ubuntu-19.10 kernel 5.3.x i
>> > installed the debug symbols and ran decodestacktrace.sh from kernel
>> > sources which gives me:
>> >
>> > [   29.266954] nvme nvme0: new ctrl: NQN
>> > "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.22.1:4420
>> > [   29.267477] nvme nvme0: Removing ctrl: NQN
>> > "nqn.2014-08.org.nvmexpress.discovery"
>> > [   29.285732] nvme nvme0: creating 1 I/O queues.
>> > [   29.286632] nvme nvme0: mapped 1/0 default/read queues.
>> > [   29.288565] nvme nvme0: new ctrl: NQN "nvmet-test", addr
>> > 192.168.22.1:4420
>> > [   29.293146] nvme0n1: detected capacity change from 0 to 1084227584
>> > [   39.196846] BUG: kernel NULL pointer dereference, address:
>> > 0000000000000008
>> > [   39.198524] #PF: supervisor read access in kernel mode
>> > [   39.199786] #PF: error_code(0x0000) - not-present page
>> > [   39.201198] PGD 0 P4D 0
>> > [   39.201849] Oops: 0000 [#1] SMP PTI
>> > [   39.202679] CPU: 0 PID: 223 Comm: kworker/0:1H Kdump: loaded Not
>> > tainted 5.3.0-24-generic #26-Ubuntu
>> > [   39.204830] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> > BIOS 0.0.0 02/06/2015
>> > [   39.207205] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
>> > [   39.209005] RIP: 0010:nvme_tcp_io_work+0x341/0x7f0 nvme_tcp
>> > [ 39.210686] Code: 8b 87 98 00 00 00 83 f8 02 0f 85 34 fd ff ff 49 8b
>> > 47 28 4d 89 fe 48 89 45 a8 49 8b 46 78 49 8b 56 68 45 8b 66 34 45 2b
>> > 66 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48 89
>> > 75
>> > All code
>> > ========
>> >    0:   8b 87 98 00 00 00       mov    0x98(%rdi),%eax
>> >    6:   83 f8 02                cmp    $0x2,%eax
>> >    9:   0f 85 34 fd ff ff       jne    0xfffffffffffffd43
>> >    f:   49 8b 47 28             mov    0x28(%r15),%rax
>> >   13:   4d 89 fe                mov    %r15,%r14
>> >   16:   48 89 45 a8             mov    %rax,-0x58(%rbp)
>> >   1a:   49 8b 46 78             mov    0x78(%r14),%rax
>> >   1e:   49 8b 56 68             mov    0x68(%r14),%rdx
>> >   22:   45 8b 66 34             mov    0x34(%r14),%r12d
>> >   26:   45 2b 66 38             sub    0x38(%r14),%r12d
>> >   2a:*  8b 58 08                mov    0x8(%rax),%ebx           <--
>> > trapping instruction
>> >   2d:   8b 48 0c                mov    0xc(%rax),%ecx
>> >   30:   4c 8b 28                mov    (%rax),%r13
>> >   33:   48 29 d3                sub    %rdx,%rbx
>> >   36:   48 8d 34 11             lea    (%rcx,%rdx,1),%rsi
>> >   3a:   4c 39 e3                cmp    %r12,%rbx
>> >   3d:   48                      rex.W
>> >   3e:   89                      .byte 0x89
>> >   3f:   75                      .byte 0x75
>> >
>> > Code starting with the faulting instruction
>> > ===========================================
>> >    0:   8b 58 08                mov    0x8(%rax),%ebx
>> >    3:   8b 48 0c                mov    0xc(%rax),%ecx
>> >    6:   4c 8b 28                mov    (%rax),%r13
>> >    9:   48 29 d3                sub    %rdx,%rbx
>> >    c:   48 8d 34 11             lea    (%rcx,%rdx,1),%rsi
>> >   10:   4c 39 e3                cmp    %r12,%rbx
>> >   13:   48                      rex.W
>> >   14:   89                      .byte 0x89
>> >   15:   75                      .byte 0x75
>> > [   39.216464] RSP: 0018:ffffb0f8c0453dd8 EFLAGS: 00010206
>> > [   39.218053] RAX: 0000000000000000 RBX: 00000000b4e42801 RCX: 0000000000000000
>> > [   39.219803] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9dd8e6e49478
>> > [   39.221766] RBP: ffffb0f8c0453e60 R08: 0000000000001000 R09: 0000000002800809
>> > [   39.223635] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
>> > [   39.226010] R13: 0000000000000048 R14: ffff9dd8e6e49418 R15: ffff9dd8e6e49418
>> > [   39.228992] FS:  0000000000000000(0000) GS:ffff9dd8ff600000(0000)
>> > knlGS:0000000000000000
>> > [   39.233660] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > [   39.237863] CR2: 0000000000000008 CR3: 0000000067c6a005 CR4: 0000000000360ef0
>> > [   39.241807] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> > [   39.244496] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> > [   39.246569] Call Trace:
>> > [   39.247272] process_one_work
>> > (/build/linux-4AS01l/linux-5.3.0/arch/x86/include/asm/jump_label.h:25
>> > /build/linux-4AS01l/linux-5.3.0/include/linux/jump_label.h:200
>> > /build/linux-4AS01l/linux-5.3.0/include/trace/events/workqueu
>> > e.h:114 /build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2274)
>> > [   39.248361] worker_thread
>> > (/build/linux-4AS01l/linux-5.3.0/include/linux/compiler.h:199
>> > /build/linux-4AS01l/linux-5.3.0/include/linux/list.h:268
>> > /build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2416)
>> > [   39.249364] kthread (/build/linux-4AS01l/linux-5.3.0/kernel/kthread.c:255)
>> > [   39.250243] ? process_one_work
>> > (/build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2358)
>> > [   39.251485] ? kthread_park
>> > (/build/linux-4AS01l/linux-5.3.0/kernel/kthread.c:215)
>> > [   39.252474] ret_from_fork
>> > (/build/linux-4AS01l/linux-5.3.0/arch/x86/entry/entry_64.S:358)
>> > [   39.253476] Modules linked in: nvme_tcp nvme_fabrics nvme nvme_core
>> > xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user
>> > xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack
>> > nf_defrag_ipv6 nf_
>> > defrag_ipv4 libcrc32c bpfilter br_netfilter bridge stp llc aufs
>> > overlay intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass
>> > crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
>> > aes_x86_64 crypto_simd cirrus nls_i
>> > so8859_1 cryptd glue_helper drm_kms_helper drm input_leds joydev
>> > fb_sys_fops serio_raw syscopyarea sysfillrect sysimgblt mac_hid
>> > qemu_fw_cfg bonding sch_fq_codel ipmi_watchdog ipmi_devintf
>> > ipmi_msghandler virtio_rng ip_tables
>> > x_tables autofs4 psmouse virtio_net net_failover failover ahci libahci
>> > i2c_piix4 pata_acpi floppy
>> > [   39.269809] CR2: 0000000000000008
>> >
>> > greetings
>> > Stefan
>> >
>> > On Fri, Dec 27, 2019 at 8:54 AM Stefan Majer <stefan.majer@gmail.com> wrote:
>> > >
>> > > Hi,
>> > >
>> > > no problem, i am also on vacation.
>> > >
>> > > the issue is not reproducible in a pure bare metal environment, target
>> > > and host are physical machines.
>> > > The environment where it happens both machines are kvm based.
>> > >
>> > > I first have to figure out howto gdb on the kernel crash, thats not my
>> > > daily jobs, so please be patient.
>> > >
>> > > Greetings
>> > > Stefan
>> > >
>> > > On Fri, Dec 27, 2019 at 8:49 AM sagi grimberg <sagi@grimberg.me> wrote:
>> > > >
>> > > > Hey,
>> > > >
>> > > > On vacation so not able to take a look right now, but can you provide a line info from gdb on the RIP line?
>> > > >
>> > > > Also, did you say that the issue is not reproducible when the host is on bare metal but only on kvm? ( You said the target, but I'm asking about the host).
>> > > >
>> > > > On Thu, Dec 26, 2019, 23:18 Stefan Majer <stefan.majer@gmail.com> wrote:
>> > > >>
>> > > >> Hi,
>> > > >>
>> > > >> i have to add that doing the same on bare metal does work without any problems.
>> > > >> I suspect that this is probably caused by the fact that in the above
>> > > >> example my target is a qemu-kvm machine with a emulated nvme device.
>> > > >> Greetings
>> > > >> Stefan
>> > > >>
>> > > >> On Thu, Dec 26, 2019 at 6:47 PM Keith Busch <kbusch@kernel.org> wrote:
>> > > >> >
>> > > >> > Adding Sagi.
>> > > >> >
>> > > >> > On Wed, Dec 25, 2019 at 11:06:17AM +0100, Stefan Majer wrote:
>> > > >> > > Hi,
>> > > >> > >
>> > > >> > > im trying to setup a nvme-over-tcp test environment with a qemu-kvm
>> > > >> > > based nvmet-tcp target based on ubuntu-19.10 and a ubuntu-19.10 host
>> > > >> > > with kernel 5.4.6 installed. Kernel was taken from
>> > > >> > > https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.4.6/ . Same Panic
>> > > >> > > occurs with ubuntu 19.10 kernel 5.3.x
>> > > >> > >
>> > > >> > > After setup the target i can discover and connect the exported nvme
>> > > >> > > device on the host with:
>> > > >> > > modprobe nvme
>> > > >> > > modprobe nvme-tcp
>> > > >> > > nvme discover -t tcp -a 192.168.22.1 -s 4420
>> > > >> > > nvme connect -t tcp -n nvmet-test -a 192.168.22.1 -s 4420
>> > > >> > >
>> > > >> > > No errors so far, but when i try to format the device with:
>> > > >> > >
>> > > >> > > mkfs.ext4 /dev/nvme0n1
>> > > >> > >
>> > > >> > > The kernel panics with:
>> > > >> > > Writing inode tables:
>> > > >> > > [  692.651243] BUG: kernel NULL pointer dereference, address: 0000000000000008
>> > > >> > > [  692.653158] #PF: supervisor read access in kernel mode
>> > > >> > > [  692.653922] #PF: error_code(0x0000) - not-present page
>> > > >> > > [  692.653922] PGD 0 P4D 0
>> > > >> > > [  692.653922] Oops: 0000 [#1] SMP PTI
>> > > >> > > [  692.653922] CPU: 0 PID: 224 Comm: kworker/0:1H Not tainted
>> > > >> > > 5.4.6-050406-generic #201912211140
>> > > >> > > [  692.653922] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> > > >> > > BIOS 0.0.0 02/06/2015
>> > > >> > > [  692.653922] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
>> > > >> > > [  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
>> > > >> > > [  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
>> > > >> > > 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
>> > > >> > > 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
>> > > >> > > 89 75
>> > > >> > > [  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
>> > > >> > > [  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
>> > > >> > > [  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
>> > > >> > > [  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
>> > > >> > > [  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
>> > > >> > > [  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
>> > > >> > > [  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
>> > > >> > > knlGS:0000000000000000
>> > > >> > > [  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > > >> > > [  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
>> > > >> > > [  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> > > >> > > [  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> > > >> > > [  692.653922] Call Trace:
>> > > >> > > [  692.653922]  process_one_work+0x1ec/0x3a0
>> > > >> > > [  692.653922]  worker_thread+0x4d/0x400
>> > > >> > > [  692.653922]  kthread+0x104/0x140
>> > > >> > > [  692.653922]  ? process_one_work+0x3a0/0x3a0
>> > > >> > > [  692.653922]  ? kthread_park+0x90/0x90
>> > > >> > > [  692.653922]  ret_from_fork+0x35/0x40
>> > > >> > > [  692.653922] Modules linked in: binfmt_misc nvme_tcp nvme_fabrics
>> > > >> > > nvme nvme_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink
>> > > >> > > nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat
>> > > >> > > nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter
>> > > >> > > br_netfilter bridge stp llc overlay intel_rapl_msr intel_rapl_common
>> > > >> > > kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul
>> > > >> > > ghash_clmulni_intel aesni_intel nls_iso8859_1 crypto_simd cryptd
>> > > >> > > cirrus glue_helper drm_kms_helper drm input_leds fb_sys_fops joydev
>> > > >> > > serio_raw syscopyarea sysfillrect sysimgblt mac_hid qemu_fw_cfg
>> > > >> > > bonding sch_fq_codel ipmi_watchdog ipmi_devintf ipmi_msghandler
>> > > >> > > virtio_rng ip_tables x_tables autofs4 ahci psmouse virtio_net
>> > > >> > > net_failover failover libahci i2c_piix4 pata_acpi floppy
>> > > >> > > [  692.653922] CR2: 0000000000000008
>> > > >> > > [  692.653922] ---[ end trace d688c2c182feef87 ]---
>> > > >> > > [  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
>> > > >> > > [  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
>> > > >> > > 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
>> > > >> > > 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
>> > > >> > > 89 75
>> > > >> > > [  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
>> > > >> > > [  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
>> > > >> > > [  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
>> > > >> > > [  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
>> > > >> > > [  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
>> > > >> > > [  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
>> > > >> > > [  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
>> > > >> > > knlGS:0000000000000000
>> > > >> > > [  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > > >> > > [  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
>> > > >> > > [  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> > > >> > > [  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> > > >> > >
>> > > >> > >
>> > > >> > > Any help appreciated.
>> > > >> > >
>> > > >> > > Greetings
>> > > >> > >
>> > > >> > > --
>> > > >> > > Stefan Majer
>> > > >>
>> > > >>
>> > > >>
>> > > >> --
>> > > >> Stefan Majer
>> > >
>> > >
>> > >
>> > > --
>> > > Stefan Majer
>> >
>> >
>> >
>> > --
>> > Stefan Majer
>>
>>
>>
>> --
>> Stefan Majer
> 
> 
> 


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: null pointer dereference in nvme_tcp_io_work
  2020-01-07 15:41             ` Stefan Majer
  2020-01-07 16:48               ` Nadolski, Edmund
@ 2020-01-15 20:03               ` Sagi Grimberg
  2020-01-16  6:28                 ` Stefan Majer
  1 sibling, 1 reply; 10+ messages in thread
From: Sagi Grimberg @ 2020-01-15 20:03 UTC (permalink / raw)
  To: Stefan Majer; +Cc: Keith Busch, linux-nvme


> Hi,
> 
> is there anything i can help with to further nail down the problem ?

Hi Stephen,

I cannot reproduce this issue with my environment (both host and target
are VMs on my laptop, kernel 5.4.0 qemu version 3.1.0).

Would it be possible to try and use kernel 5.4 for the sake of the test?

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: null pointer dereference in nvme_tcp_io_work
  2020-01-15 20:03               ` Sagi Grimberg
@ 2020-01-16  6:28                 ` Stefan Majer
  0 siblings, 0 replies; 10+ messages in thread
From: Stefan Majer @ 2020-01-16  6:28 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: Keith Busch, linux-nvme

Hi Sagi,
Sure, no problem.
Wil report back. Thanks for looking into.
Greetings
Stefan

On Wed, Jan 15, 2020 at 9:03 PM Sagi Grimberg <sagi@grimberg.me> wrote:
>
>
> > Hi,
> >
> > is there anything i can help with to further nail down the problem ?
>
> Hi Stephen,
>
> I cannot reproduce this issue with my environment (both host and target
> are VMs on my laptop, kernel 5.4.0 qemu version 3.1.0).
>
> Would it be possible to try and use kernel 5.4 for the sake of the test?



-- 
Stefan Majer

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-01-16  6:28 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-25 10:06 null pointer dereference in nvme_tcp_io_work Stefan Majer
2019-12-26 17:47 ` Keith Busch
2019-12-27  7:18   ` Stefan Majer
     [not found]     ` <CAB5Wxwco3KD1e_nRGQ_mWAMa_2d-wP2-1Aao4ZXtDeVgFQQM_w@mail.gmail.com>
2019-12-27  7:54       ` Stefan Majer
2019-12-28 17:49         ` Stefan Majer
2019-12-28 17:53           ` Stefan Majer
2020-01-07 15:41             ` Stefan Majer
2020-01-07 16:48               ` Nadolski, Edmund
2020-01-15 20:03               ` Sagi Grimberg
2020-01-16  6:28                 ` Stefan Majer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).