linux-nvme.lists.infradead.org archive mirror
From: Stefan Majer <stefan.majer@gmail.com>
To: sagi grimberg <sagi@grimberg.me>
Cc: Keith Busch <kbusch@kernel.org>,
	linux-nvme <linux-nvme@lists.infradead.org>
Subject: Re: null pointer dereference in nvme_tcp_io_work
Date: Sat, 28 Dec 2019 18:53:10 +0100
Message-ID: <CADdPHGt+vLDp6hx0u3nabW7s6Ut11Jzbb4gx2NRD95zu3H9mvQ@mail.gmail.com>
In-Reply-To: <CADdPHGsT8JxqWN8KKnQgJvNFZXzq08pd5eR1RJeUN-cmhQYH_Q@mail.gmail.com>

I have to add:

./faddr2line /var/lib/debug/lib/modules/5.3.0-24-generic/kernel/drivers/nvme/host/nvme-tcp.ko nvme_tcp_io_work+0x341/0x7f0
nvme_tcp_io_work+0x341/0x7f0:
nvme_tcp_req_cur_length at /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:189
(inlined by) nvme_tcp_try_send_data at /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:854
(inlined by) nvme_tcp_try_send at /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:1011
(inlined by) nvme_tcp_io_work at /build/linux-4AS01l/linux-5.3.0/drivers/nvme/host/tcp.c:1048
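
A note on reading this result together with the oops quoted below: the fault address is 0x8 while RAX is 0, and the instructions around the trap read offsets 0x8, 0xc and 0x0 from RAX. That pattern would fit a NULL pointer to a struct with an 8-byte first member followed by two 4-byte members. The sketch below is only an illustration under assumptions: it models the kernel's struct bio_vec (bv_page, bv_len, bv_offset) in userspace with ctypes, and it assumes the NULL pointer is the request's bvec iterator read in nvme_tcp_req_cur_length. Neither assumption is confirmed anywhere in this thread. The helper name field_at is made up for the example.

```python
import ctypes


class BioVec(ctypes.Structure):
    """Userspace stand-in for struct bio_vec on x86_64 (assumed layout)."""
    _fields_ = [
        ("bv_page", ctypes.c_uint64),   # models a 64-bit 'struct page *'
        ("bv_len", ctypes.c_uint),      # unsigned int
        ("bv_offset", ctypes.c_uint),   # unsigned int
    ]


FAULT_ADDRESS = 0x8   # CR2 value from the oops
BASE_POINTER = 0x0    # RAX value at the trapping instruction


def field_at(struct, address, base):
    """Return the field of `struct` that starts at (address - base), or None."""
    wanted = address - base
    for name, _ctype in struct._fields_:
        if getattr(struct, name).offset == wanted:
            return name
    return None


if __name__ == "__main__":
    for name, _ctype in BioVec._fields_:
        print(f"{name}: offset {getattr(BioVec, name).offset:#x}")
    print("field at fault address:", field_at(BioVec, FAULT_ADDRESS, BASE_POINTER))
```

Under these assumptions the first faulting load would be the bv_len read through a NULL bvec pointer, which would match the source line reported by faddr2line.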

On Sat, Dec 28, 2019 at 6:49 PM Stefan Majer <stefan.majer@gmail.com> wrote:
>
> Hi,
>
> It took a while, but I have now reproduced it with the Ubuntu 19.10 kernel 5.3.x. I
> installed the debug symbols and ran decode_stacktrace.sh from the kernel
> sources, which gives me:
>
> [   29.266954] nvme nvme0: new ctrl: NQN
> "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.22.1:4420
> [   29.267477] nvme nvme0: Removing ctrl: NQN
> "nqn.2014-08.org.nvmexpress.discovery"
> [   29.285732] nvme nvme0: creating 1 I/O queues.
> [   29.286632] nvme nvme0: mapped 1/0 default/read queues.
> [   29.288565] nvme nvme0: new ctrl: NQN "nvmet-test", addr
> 192.168.22.1:4420
> [   29.293146] nvme0n1: detected capacity change from 0 to 1084227584
> [   39.196846] BUG: kernel NULL pointer dereference, address:
> 0000000000000008
> [   39.198524] #PF: supervisor read access in kernel mode
> [   39.199786] #PF: error_code(0x0000) - not-present page
> [   39.201198] PGD 0 P4D 0
> [   39.201849] Oops: 0000 [#1] SMP PTI
> [   39.202679] CPU: 0 PID: 223 Comm: kworker/0:1H Kdump: loaded Not
> tainted 5.3.0-24-generic #26-Ubuntu
> [   39.204830] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 0.0.0 02/06/2015
> [   39.207205] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
> [   39.209005] RIP: 0010:nvme_tcp_io_work+0x341/0x7f0 nvme_tcp
> [ 39.210686] Code: 8b 87 98 00 00 00 83 f8 02 0f 85 34 fd ff ff 49 8b
> 47 28 4d 89 fe 48 89 45 a8 49 8b 46 78 49 8b 56 68 45 8b 66 34 45 2b
> 66 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48 89
> 75
> All code
> ========
>    0:   8b 87 98 00 00 00       mov    0x98(%rdi),%eax
>    6:   83 f8 02                cmp    $0x2,%eax
>    9:   0f 85 34 fd ff ff       jne    0xfffffffffffffd43
>    f:   49 8b 47 28             mov    0x28(%r15),%rax
>   13:   4d 89 fe                mov    %r15,%r14
>   16:   48 89 45 a8             mov    %rax,-0x58(%rbp)
>   1a:   49 8b 46 78             mov    0x78(%r14),%rax
>   1e:   49 8b 56 68             mov    0x68(%r14),%rdx
>   22:   45 8b 66 34             mov    0x34(%r14),%r12d
>   26:   45 2b 66 38             sub    0x38(%r14),%r12d
>   2a:*  8b 58 08                mov    0x8(%rax),%ebx           <--
> trapping instruction
>   2d:   8b 48 0c                mov    0xc(%rax),%ecx
>   30:   4c 8b 28                mov    (%rax),%r13
>   33:   48 29 d3                sub    %rdx,%rbx
>   36:   48 8d 34 11             lea    (%rcx,%rdx,1),%rsi
>   3a:   4c 39 e3                cmp    %r12,%rbx
>   3d:   48                      rex.W
>   3e:   89                      .byte 0x89
>   3f:   75                      .byte 0x75
>
> Code starting with the faulting instruction
> ===========================================
>    0:   8b 58 08                mov    0x8(%rax),%ebx
>    3:   8b 48 0c                mov    0xc(%rax),%ecx
>    6:   4c 8b 28                mov    (%rax),%r13
>    9:   48 29 d3                sub    %rdx,%rbx
>    c:   48 8d 34 11             lea    (%rcx,%rdx,1),%rsi
>   10:   4c 39 e3                cmp    %r12,%rbx
>   13:   48                      rex.W
>   14:   89                      .byte 0x89
>   15:   75                      .byte 0x75
> [   39.216464] RSP: 0018:ffffb0f8c0453dd8 EFLAGS: 00010206
> [   39.218053] RAX: 0000000000000000 RBX: 00000000b4e42801 RCX: 0000000000000000
> [   39.219803] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9dd8e6e49478
> [   39.221766] RBP: ffffb0f8c0453e60 R08: 0000000000001000 R09: 0000000002800809
> [   39.223635] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
> [   39.226010] R13: 0000000000000048 R14: ffff9dd8e6e49418 R15: ffff9dd8e6e49418
> [   39.228992] FS:  0000000000000000(0000) GS:ffff9dd8ff600000(0000)
> knlGS:0000000000000000
> [   39.233660] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   39.237863] CR2: 0000000000000008 CR3: 0000000067c6a005 CR4: 0000000000360ef0
> [   39.241807] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   39.244496] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   39.246569] Call Trace:
> [   39.247272] process_one_work
> (/build/linux-4AS01l/linux-5.3.0/arch/x86/include/asm/jump_label.h:25
> /build/linux-4AS01l/linux-5.3.0/include/linux/jump_label.h:200
> /build/linux-4AS01l/linux-5.3.0/include/trace/events/workqueue.h:114
> /build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2274)
> [   39.248361] worker_thread
> (/build/linux-4AS01l/linux-5.3.0/include/linux/compiler.h:199
> /build/linux-4AS01l/linux-5.3.0/include/linux/list.h:268
> /build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2416)
> [   39.249364] kthread (/build/linux-4AS01l/linux-5.3.0/kernel/kthread.c:255)
> [   39.250243] ? process_one_work
> (/build/linux-4AS01l/linux-5.3.0/kernel/workqueue.c:2358)
> [   39.251485] ? kthread_park
> (/build/linux-4AS01l/linux-5.3.0/kernel/kthread.c:215)
> [   39.252474] ret_from_fork
> (/build/linux-4AS01l/linux-5.3.0/arch/x86/entry/entry_64.S:358)
> [   39.253476] Modules linked in: nvme_tcp nvme_fabrics nvme nvme_core
> xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user
> xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack
> nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter br_netfilter bridge stp llc aufs
> overlay intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass
> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
> aes_x86_64 crypto_simd cirrus nls_iso8859_1 cryptd glue_helper
> drm_kms_helper drm input_leds joydev
> fb_sys_fops serio_raw syscopyarea sysfillrect sysimgblt mac_hid
> qemu_fw_cfg bonding sch_fq_codel ipmi_watchdog ipmi_devintf
> ipmi_msghandler virtio_rng ip_tables
> x_tables autofs4 psmouse virtio_net net_failover failover ahci libahci
> i2c_piix4 pata_acpi floppy
> [   39.269809] CR2: 0000000000000008
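
For anyone reading the Code: line in the log above by hand: the kernel marks the byte at the faulting RIP with angle brackets, so counting the bytes in front of the marker gives the offset of the trapping instruction within the dumped window. The sketch below does that count for the byte string copied from the 5.3.0-24 oops. The function name split_code_line is made up for this example and is not part of any kernel tooling.

```python
# Byte string copied from the "Code:" line of the 5.3.0-24 oops above.
CODE_LINE = (
    "8b 87 98 00 00 00 83 f8 02 0f 85 34 fd ff ff 49 8b 47 28 4d 89 fe "
    "48 89 45 a8 49 8b 46 78 49 8b 56 68 45 8b 66 34 45 2b 66 38 "
    "<8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48 89 75"
)


def split_code_line(code):
    """Split an oops 'Code:' byte string at the <xx> marker.

    Returns (bytes_before_fault, bytes_from_fault_onward).
    """
    before, after = [], []
    seen_marker = False
    for token in code.split():
        if token.startswith("<") and token.endswith(">"):
            seen_marker = True
            token = token[1:-1]          # strip the angle brackets
        (after if seen_marker else before).append(int(token, 16))
    if not seen_marker:
        raise ValueError("no <xx> marker found in Code: line")
    return bytes(before), bytes(after)


before, after = split_code_line(CODE_LINE)
print(f"trapping instruction starts at offset {len(before):#x}")
print("first bytes of trapping instruction:", after[:3].hex(" "))
```

The computed offset is 0x2a, which agrees with the line marked `2a:*` in the decoded listing, and the first three bytes (8b 58 08) are the `mov 0x8(%rax),%ebx` shown there.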
>
> greetings
> Stefan
>
> On Fri, Dec 27, 2019 at 8:54 AM Stefan Majer <stefan.majer@gmail.com> wrote:
> >
> > Hi,
> >
> > No problem, I am also on vacation.
> >
> > The issue is not reproducible in a pure bare-metal environment, where target
> > and host are both physical machines.
> > In the environment where it happens, both machines are KVM based.
> >
> > I first have to figure out how to use gdb on the kernel crash. That is not
> > part of my daily work, so please be patient.
> >
> > Greetings
> > Stefan
> >
> > On Fri, Dec 27, 2019 at 8:49 AM sagi grimberg <sagi@grimberg.me> wrote:
> > >
> > > Hey,
> > >
> > > I'm on vacation, so I'm not able to take a look right now, but can you provide line info from gdb for the RIP line?
> > >
> > > Also, did you say that the issue is not reproducible when the host is on bare metal, but only on KVM? (You said the target, but I'm asking about the host.)
> > >
> > > On Thu, Dec 26, 2019, 23:18 Stefan Majer <stefan.majer@gmail.com> wrote:
> > >>
> > >> Hi,
> > >>
> > >> I have to add that doing the same on bare metal works without any problems.
> > >> I suspect this is caused by the fact that in the above
> > >> example my target is a qemu-kvm machine with an emulated NVMe device.
> > >> Greetings
> > >> Stefan
> > >>
> > >> On Thu, Dec 26, 2019 at 6:47 PM Keith Busch <kbusch@kernel.org> wrote:
> > >> >
> > >> > Adding Sagi.
> > >> >
> > >> > On Wed, Dec 25, 2019 at 11:06:17AM +0100, Stefan Majer wrote:
> > >> > > Hi,
> > >> > >
> > >> > > I'm trying to set up an NVMe-over-TCP test environment with a qemu-kvm
> > >> > > based nvmet-tcp target running Ubuntu 19.10 and an Ubuntu 19.10 host
> > >> > > with kernel 5.4.6 installed. The kernel was taken from
> > >> > > https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.4.6/ . The same panic
> > >> > > occurs with the Ubuntu 19.10 kernel 5.3.x.
> > >> > >
> > >> > > After setting up the target, I can discover and connect to the exported NVMe
> > >> > > device on the host with:
> > >> > > modprobe nvme
> > >> > > modprobe nvme-tcp
> > >> > > nvme discover -t tcp -a 192.168.22.1 -s 4420
> > >> > > nvme connect -t tcp -n nvmet-test -a 192.168.22.1 -s 4420
> > >> > >
> > >> > > No errors so far, but when I try to format the device with:
> > >> > >
> > >> > > mkfs.ext4 /dev/nvme0n1
> > >> > >
> > >> > > The kernel panics with:
> > >> > > Writing inode tables:
> > >> > > [  692.651243] BUG: kernel NULL pointer dereference, address: 0000000000000008
> > >> > > [  692.653158] #PF: supervisor read access in kernel mode
> > >> > > [  692.653922] #PF: error_code(0x0000) - not-present page
> > >> > > [  692.653922] PGD 0 P4D 0
> > >> > > [  692.653922] Oops: 0000 [#1] SMP PTI
> > >> > > [  692.653922] CPU: 0 PID: 224 Comm: kworker/0:1H Not tainted
> > >> > > 5.4.6-050406-generic #201912211140
> > >> > > [  692.653922] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > >> > > BIOS 0.0.0 02/06/2015
> > >> > > [  692.653922] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
> > >> > > [  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
> > >> > > [  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
> > >> > > 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
> > >> > > 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
> > >> > > 89 75
> > >> > > [  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
> > >> > > [  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
> > >> > > [  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
> > >> > > [  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
> > >> > > [  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
> > >> > > [  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
> > >> > > [  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
> > >> > > knlGS:0000000000000000
> > >> > > [  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > >> > > [  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
> > >> > > [  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > >> > > [  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > >> > > [  692.653922] Call Trace:
> > >> > > [  692.653922]  process_one_work+0x1ec/0x3a0
> > >> > > [  692.653922]  worker_thread+0x4d/0x400
> > >> > > [  692.653922]  kthread+0x104/0x140
> > >> > > [  692.653922]  ? process_one_work+0x3a0/0x3a0
> > >> > > [  692.653922]  ? kthread_park+0x90/0x90
> > >> > > [  692.653922]  ret_from_fork+0x35/0x40
> > >> > > [  692.653922] Modules linked in: binfmt_misc nvme_tcp nvme_fabrics
> > >> > > nvme nvme_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink
> > >> > > nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat
> > >> > > nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter
> > >> > > br_netfilter bridge stp llc overlay intel_rapl_msr intel_rapl_common
> > >> > > kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul
> > >> > > ghash_clmulni_intel aesni_intel nls_iso8859_1 crypto_simd cryptd
> > >> > > cirrus glue_helper drm_kms_helper drm input_leds fb_sys_fops joydev
> > >> > > serio_raw syscopyarea sysfillrect sysimgblt mac_hid qemu_fw_cfg
> > >> > > bonding sch_fq_codel ipmi_watchdog ipmi_devintf ipmi_msghandler
> > >> > > virtio_rng ip_tables x_tables autofs4 ahci psmouse virtio_net
> > >> > > net_failover failover libahci i2c_piix4 pata_acpi floppy
> > >> > > [  692.653922] CR2: 0000000000000008
> > >> > > [  692.653922] ---[ end trace d688c2c182feef87 ]---
> > >> > > [  692.653922] RIP: 0010:nvme_tcp_io_work+0x308/0x790 [nvme_tcp]
> > >> > > [  692.653922] Code: 8b 86 98 00 00 00 83 f8 02 0f 85 6d fd ff ff 49
> > >> > > 8b 46 28 4d 89 f7 48 89 45 a8 49 8b 47 78 49 8b 57 68 45 8b 67 34 45
> > >> > > 2b 67 38 <8b> 58 08 8b 48 0c 4c 8b 28 48 29 d3 48 8d 34 11 4c 39 e3 48
> > >> > > 89 75
> > >> > > [  692.653922] RSP: 0018:ffffa49a00447dd8 EFLAGS: 00010206
> > >> > > [  692.653922] RAX: 0000000000000000 RBX: 0000000077bd3601 RCX: 0000000000000000
> > >> > > [  692.653922] RDX: 0000000000000000 RSI: 0000000000000011 RDI: ffff9376781c0500
> > >> > > [  692.653922] RBP: ffffa49a00447e60 R08: 0000000000001000 R09: 0000000005000809
> > >> > > [  692.653922] R10: 0000000000000009 R11: 0000000000000000 R12: 0000000000001000
> > >> > > [  692.653922] R13: 0000000000000048 R14: ffff9376781c04a0 R15: ffff9376781c04a0
> > >> > > [  692.653922] FS:  0000000000000000(0000) GS:ffff93767f600000(0000)
> > >> > > knlGS:0000000000000000
> > >> > > [  692.653922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > >> > > [  692.653922] CR2: 0000000000000008 CR3: 000000007b488003 CR4: 0000000000360ef0
> > >> > > [  692.653922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > >> > > [  692.653922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > >> > >
> > >> > >
> > >> > > Any help appreciated.
> > >> > >
> > >> > > Greetings
> > >> > >
> > >> > > --
> > >> > > Stefan Majer
> > >>
> > >>
> > >>
> > >> --
> > >> Stefan Majer
> >
> >
> >
> > --
> > Stefan Majer
>
>
>
> --
> Stefan Majer



-- 
Stefan Majer

_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

Thread overview: 10+ messages
2019-12-25 10:06 null pointer dereference in nvme_tcp_io_work Stefan Majer
2019-12-26 17:47 ` Keith Busch
2019-12-27  7:18   ` Stefan Majer
     [not found]     ` <CAB5Wxwco3KD1e_nRGQ_mWAMa_2d-wP2-1Aao4ZXtDeVgFQQM_w@mail.gmail.com>
2019-12-27  7:54       ` Stefan Majer
2019-12-28 17:49         ` Stefan Majer
2019-12-28 17:53           ` Stefan Majer [this message]
2020-01-07 15:41             ` Stefan Majer
2020-01-07 16:48               ` Nadolski, Edmund
2020-01-15 20:03               ` Sagi Grimberg
2020-01-16  6:28                 ` Stefan Majer
