* xen-blkfront crash on xl block-detach of not fully attached device
@ 2022-05-11 19:25 Marek Marczykowski-Górecki
  2022-05-12 12:47 ` Jason Andryuk
  0 siblings, 1 reply; 5+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-05-11 19:25 UTC (permalink / raw)
  To: xen-devel

Hi,

The reproducer is trivial:

[user@dom0 ~]$ sudo xl block-attach work backend=sys-usb vdev=xvdi target=/dev/sdz
[user@dom0 ~]$ xl block-list work
Vdev  BE  handle state evt-ch ring-ref BE-path                       
51712 0   241    4     -1     -1       /local/domain/0/backend/vbd/241/51712
51728 0   241    4     -1     -1       /local/domain/0/backend/vbd/241/51728
51744 0   241    4     -1     -1       /local/domain/0/backend/vbd/241/51744
51760 0   241    4     -1     -1       /local/domain/0/backend/vbd/241/51760
51840 3   241    3     -1     -1       /local/domain/3/backend/vbd/241/51840
                 ^ note the state: /dev/sdz doesn't exist in the backend
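
For readers unfamiliar with the numeric state column: the values are the xenbus states from the Xen public header (xen/io/xenbus.h), so 4 is Connected and 3 is Initialised. A minimal C sketch of the mapping follows. The enum values match the public header as far as I know; the helper name state_name() is made up for illustration and is not part of xl or the kernel.

```c
#include <stddef.h>

/* Numeric xenbus states, as defined in the Xen public header xen/io/xenbus.h. */
enum xenbus_state {
    XenbusStateUnknown       = 0,
    XenbusStateInitialising  = 1,
    XenbusStateInitWait      = 2,
    XenbusStateInitialised   = 3,
    XenbusStateConnected     = 4,
    XenbusStateClosing       = 5,
    XenbusStateClosed        = 6,
    XenbusStateReconfiguring = 7,
    XenbusStateReconfigured  = 8
};

/* Hypothetical helper: map the number shown by "xl block-list" to a name. */
static const char *state_name(int state)
{
    static const char *const names[] = {
        "Unknown", "Initialising", "InitWait", "Initialised", "Connected",
        "Closing", "Closed", "Reconfiguring", "Reconfigured"
    };

    /* Anything outside 0..8 is not a defined xenbus state. */
    if (state < 0 || (size_t)state >= sizeof(names) / sizeof(names[0]))
        return "Invalid";
    return names[state];
}
```

With this, state_name(3) gives "Initialised" for vdev 51840, while the four working disks report state_name(4), "Connected".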

[user@dom0 ~]$ sudo xl block-detach work xvdi
[user@dom0 ~]$ xl block-list work
Vdev  BE  handle state evt-ch ring-ref BE-path                       
work is an invalid domain identifier

And the domain's console shows:

BUG: kernel NULL pointer dereference, address: 0000000000000050
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 80000000edebb067 P4D 80000000edebb067 PUD edec2067 PMD 0 
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 1 PID: 52 Comm: xenwatch Not tainted 5.16.18-2.43.fc32.qubes.x86_64 #1
RIP: 0010:blk_mq_stop_hw_queues+0x5/0x40
Code: 00 48 83 e0 fd 83 c3 01 48 89 85 a8 00 00 00 41 39 5c 24 50 77 c0 5b 5d 41 5c 41 5d c3 c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 <8b> 47 50 85 c0 74 32 41 54 49 89 fc 55 53 31 db 49 8b 44 24 48 48
RSP: 0018:ffffc90000bcfe98 EFLAGS: 00010293
RAX: ffffffffc0008370 RBX: 0000000000000005 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000000
RBP: ffff88800775f000 R08: 0000000000000001 R09: ffff888006e620b8
R10: ffff888006e620b0 R11: f000000000000000 R12: ffff8880bff39000
R13: ffff8880bff39000 R14: 0000000000000000 R15: ffff88800604be00
FS:  0000000000000000(0000) GS:ffff8880f3300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000050 CR3: 00000000e932e002 CR4: 00000000003706e0
Call Trace:
 <TASK>
 blkback_changed+0x95/0x137 [xen_blkfront]
 ? read_reply+0x160/0x160
 xenwatch_thread+0xc0/0x1a0
 ? do_wait_intr_irq+0xa0/0xa0
 kthread+0x16b/0x190
 ? set_kthread_struct+0x40/0x40
 ret_from_fork+0x22/0x30
 </TASK>
Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore ipt_REJECT nf_reject_ipv4 xt_state xt_conntrack nft_counter nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables nfnetlink intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel xen_netfront pcspkr xen_scsiback target_core_mod xen_netback xen_privcmd xen_gntdev xen_gntalloc xen_blkback xen_evtchn ipmi_devintf ipmi_msghandler fuse bpf_preload ip_tables overlay xen_blkfront
CR2: 0000000000000050
---[ end trace 7bc9597fd06ae89d ]---
RIP: 0010:blk_mq_stop_hw_queues+0x5/0x40
Code: 00 48 83 e0 fd 83 c3 01 48 89 85 a8 00 00 00 41 39 5c 24 50 77 c0 5b 5d 41 5c 41 5d c3 c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 <8b> 47 50 85 c0 74 32 41 54 49 89 fc 55 53 31 db 49 8b 44 24 48 48
RSP: 0018:ffffc90000bcfe98 EFLAGS: 00010293
RAX: ffffffffc0008370 RBX: 0000000000000005 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000000
RBP: ffff88800775f000 R08: 0000000000000001 R09: ffff888006e620b8
R10: ffff888006e620b0 R11: f000000000000000 R12: ffff8880bff39000
R13: ffff8880bff39000 R14: 0000000000000000 R15: ffff88800604be00
FS:  0000000000000000(0000) GS:ffff8880f3300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000050 CR3: 00000000e932e002 CR4: 00000000003706e0
Kernel panic - not syncing: Fatal exception
Kernel Offset: disabled


-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

* Re: xen-blkfront crash on xl block-detach of not fully attached device
  2022-05-11 19:25 xen-blkfront crash on xl block-detach of not fully attached device Marek Marczykowski-Górecki
@ 2022-05-12 12:47 ` Jason Andryuk
  2022-05-12 13:59   ` Roger Pau Monné
  0 siblings, 1 reply; 5+ messages in thread
From: Jason Andryuk @ 2022-05-12 12:47 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki; +Cc: xen-devel

This looks like it may be blkfront_closing() calling
blk_mq_stop_hw_queues() with info->rq == NULL.  info->rq is only
assigned in blkfront_connect(), which is called for state 4, but your
vbd never made it through there.  It seems like blkfront_closing()
should NULL check info->rq and info->gd before using them.
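
The register dump seems consistent with this: RDI (the first argument) is 0, and the faulting instruction "8b 47 50" reads offset 0x50 from it, which matches CR2. To make the suggested guard concrete, here is a small userspace model in C. It is only a sketch of the pattern, not the driver code and not a patch: struct fake_info, struct fake_queue, struct fake_disk and fake_closing() are stand-ins invented for illustration, and the real blkfront_closing() does more than this.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel's request queue and gendisk (hypothetical). */
struct fake_queue { bool stopped; };
struct fake_disk  { bool dead; };

/* Reduced model of the per-device state: only the two pointers at issue. */
struct fake_info {
    struct fake_queue *rq;   /* stays NULL until the device connects */
    struct fake_disk  *gd;   /* stays NULL until the device connects */
    bool closed;
};

/*
 * Model of the closing path with the suggested NULL checks.
 * Returns true if a queue and disk existed and were torn down,
 * false if the device never connected and there was nothing to stop.
 */
static bool fake_closing(struct fake_info *info)
{
    bool had_disk = false;

    if (info->rq != NULL && info->gd != NULL) {
        info->rq->stopped = true;  /* models blk_mq_stop_hw_queues() */
        info->gd->dead = true;     /* models marking the disk unusable */
        had_disk = true;
    }

    /* The frontend still has to move to Closed in both cases. */
    info->closed = true;
    return had_disk;
}
```

The point of the model is that only the queue and disk steps are skipped for a device that never reached state 4; the transition to Closed still happens, so the detach can complete.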

Regards,
Jason


* Re: xen-blkfront crash on xl block-detach of not fully attached device
  2022-05-12 12:47 ` Jason Andryuk
@ 2022-05-12 13:59   ` Roger Pau Monné
  2022-05-12 19:18     ` Jason Andryuk
  0 siblings, 1 reply; 5+ messages in thread
From: Roger Pau Monné @ 2022-05-12 13:59 UTC (permalink / raw)
  To: Jason Andryuk; +Cc: Marek Marczykowski-Górecki, xen-devel

On Thu, May 12, 2022 at 08:47:01AM -0400, Jason Andryuk wrote:
> This looks like it may be blkfront_closing() calling
> blk_mq_stop_hw_queues() with info->rq == NULL.  info->rq is only
> assigned in blkfront_connect(), which is called for state 4, but your
> vbd never made it through there.  It seems like blkfront_closing()
> should NULL check info->rq and info->gd before using them.

Care to send a patch? :)

Thanks, Roger.


* Re: xen-blkfront crash on xl block-detach of not fully attached device
  2022-05-12 13:59   ` Roger Pau Monné
@ 2022-05-12 19:18     ` Jason Andryuk
  2022-05-13  7:16       ` Roger Pau Monné
  0 siblings, 1 reply; 5+ messages in thread
From: Jason Andryuk @ 2022-05-12 19:18 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Marek Marczykowski-Górecki, xen-devel

On Thu, May 12, 2022 at 9:59 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> Care to send a patch? :)

I will, but because of $reasons, it won't be out until next week.

Regards,
Jason


* Re: xen-blkfront crash on xl block-detach of not fully attached device
  2022-05-12 19:18     ` Jason Andryuk
@ 2022-05-13  7:16       ` Roger Pau Monné
  0 siblings, 0 replies; 5+ messages in thread
From: Roger Pau Monné @ 2022-05-13  7:16 UTC (permalink / raw)
  To: Jason Andryuk; +Cc: Marek Marczykowski-Górecki, xen-devel

On Thu, May 12, 2022 at 03:18:05PM -0400, Jason Andryuk wrote:
> On Thu, May 12, 2022 at 9:59 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > Care to send a patch? :)
> 
> I will, but because of $reasons, it won't be out until next week.

That's fine, I don't think we are in a rush :).

Thanks, Roger.

