* xen-blkfront: BUG_ON(info->nr_rings)
@ 2021-03-10 14:58 Jason Andryuk
  2021-03-11  9:01 ` Paul Durrant
  0 siblings, 1 reply; 4+ messages in thread
From: Jason Andryuk @ 2021-03-10 14:58 UTC (permalink / raw)
  To: xen-devel

Hi,

I was running a loop of `xl block-attach ; xl block-detach` and I
triggered a BUG in xen-blkfront at drivers/block/xen-blkfront.c:1917.
This is BUG_ON(info->nr_rings) in negotiate_mq called by blkback_changed.

I'm using Linux 5.4.103 and blktap3 on Xen 4.12 (OpenXT), though I
don't think that matters.  The backtrace and some preceding logs (from
the reproducer) are below.

I just repro-ed with this:
path=<backend path/state>
xenstore-write $path 5 ; xenstore-write $path 4

info->nr_rings is still set because of the unexpected transition
XenbusStateClosing -> XenbusStateConnected:
dom7: [ 2866.574853] vbd vbd-51728: blkfront:blkback_changed to state 5.
dom7: [ 2866.578385] vbd vbd-51728: blkfront:blkback_changed to state 4.

I'm not totally sure how to handle this.  The XenbusStateConnected
event should be creating a new blkfront device, but instead it's seen
by the old one which hasn't been cleaned up yet.
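
The sequence above can be sketched as a toy user-space model. This is not the kernel code: `toy_front`, `toy_negotiate()`, `toy_backend_changed()` and the ring count of 1 are assumptions made for illustration. Only the state numbers (4 = XenbusStateConnected, 5 = XenbusStateClosing) and the `nr_rings` check come from the report. Where the driver hits BUG_ON, the model returns an error instead.

```c
/* State numbers as they appear in the blkback_changed log lines. */
enum toy_state { TOY_CONNECTED = 4, TOY_CLOSING = 5 };

/* Hypothetical stand-in for the frontend's per-device info. */
struct toy_front {
    unsigned int nr_rings;   /* non-zero while rings are allocated */
};

/* Stand-in for ring negotiation: refuses to run if rings still exist.
 * The report says the driver has BUG_ON(info->nr_rings) at this point. */
static int toy_negotiate(struct toy_front *f)
{
    if (f->nr_rings != 0)
        return -1;           /* this is where the BUG would fire */
    f->nr_rings = 1;         /* assumed ring count for the model */
    return 0;
}

/* Stand-in for the backend-state watch handler. Closing leaves the
 * rings allocated, modelling a frontend that is not cleaned up yet. */
static int toy_backend_changed(struct toy_front *f, enum toy_state s)
{
    switch (s) {
    case TOY_CONNECTED:
        return toy_negotiate(f);
    case TOY_CLOSING:
        return 0;            /* teardown not finished: nr_rings untouched */
    }
    return 0;
}
```

Feeding the model the states 4, 5, 4 on a zero-initialised `toy_front` makes the third call return -1, which corresponds to the 5 -> 4 transition in the log.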

After this BUG, the xenwatch thread is gone.  The VM is still running,
but watches aren't triggering anymore.

Regards,
Jason

dom7: [ 2866.494691] vbd vbd-51728: blkfront:blkback_changed to state 1.
tapback[27208]: backend.c:276 51728 physical_device_changed
tapback[27208]: backend.c:362 51728 found tapdisk[17223], for 254:9
tapdisk[17223]: VBD 9 got disk info: sectors=147456 sector size=512, info=0
xl: [18012] libxl_disk.c:303:device_disk_add:Domain 7:device already exists in xenstore
xl: [18012] libxl_device.c:1468:device_addrm_aocomplete:unable to add device
dom7: [ 2866.507702] vbd vbd-51728: blkfront:blkback_changed to state 2.
xl: [18022] libxl_disk.c:303:device_disk_add:Domain 7:device already exists in xenstore
xl: [18026] libxl_disk.c:303:device_disk_add:Domain 7:device already exists in xenstore
xl: [18024] libxl_disk.c:303:device_disk_add:Domain 7:device already exists in xenstore
xl: [18022] libxl_device.c:1468:device_addrm_aocomplete:unable to add device
xl: [18018] libxl_disk.c:303:device_disk_add:Domain 7:device already exists in xenstore
xl: [18026] libxl_device.c:1468:device_addrm_aocomplete:unable to add device
xl: [18024] libxl_device.c:1468:device_addrm_aocomplete:unable to add device
xl: [18018] libxl_device.c:1468:device_addrm_aocomplete:unable to add device
xl: [18038] libxl_disk.c:303:device_disk_add:Domain 7:device already exists in xenstore
xl: [18034] libxl_disk.c:303:device_disk_add:Domain 7:device already exists in xenstore
xl: [18038] libxl_device.c:1468:device_addrm_aocomplete:unable to add device
xl: [18034] libxl_device.c:1468:device_addrm_aocomplete:unable to add device
xl: [18030] libxl_disk.c:303:device_disk_add:Domain 7:device already exists in xenstore
xl: [18030] libxl_device.c:1468:device_addrm_aocomplete:unable to add device
tapback[27208]: frontend.c:216 51728 front-end supports persistent grants but we don't
tapdisk[17223]: connecting VBD 9 domid=7, devid=51728, pool (null), evt 12, poll duration 0, poll idle threshold 0
tapdisk[17223]: ring 0x74ce10 connected
dom7: [ 2866.536144] vbd vbd-51728: blkfront:blkback_changed to state 5.
xl: [18020] libxl_disk.c:303:device_disk_add:Domain 7:device already exists in xenstore
xl: [18016] libxl_disk.c:303:device_disk_add:Domain 7:device already exists in xenstore
xl: [18020] libxl_device.c:1468:device_addrm_aocomplete:unable to add device
xl: [18016] libxl_device.c:1468:device_addrm_aocomplete:unable to add device
xl: [18036] libxl_disk.c:303:device_disk_add:Domain 7:device already exists in xenstore
dom7: [ 2866.544439] vbd vbd-51728: blkfront:blkback_changed to state 5.
xl: [18036] libxl_device.c:1468:device_addrm_aocomplete:unable to add device
dom7: [ 2866.555778] vbd vbd-51728: blkfront:blkback_changed to state 5.
dom7: [ 2866.565810] vbd vbd-51728: blkfront:blkback_changed to state 5.
dom7: [ 2866.574853] vbd vbd-51728: blkfront:blkback_changed to state 5.
dom7: [ 2866.578385] vbd vbd-51728: blkfront:blkback_changed to state 4.
dom7: [ 2866.578655] ------------[ cut here ]------------
dom7: [ 2866.578662] kernel BUG at .../drivers/block/xen-blkfront.c:1917!
dom7: [ 2866.578681] invalid opcode: 0000 [#1] SMP PTI
dom7: [ 2866.578688] CPU: 0 PID: 76 Comm: xenwatch Tainted: G  O      5.4.103 #1
dom7: [ 2866.578699] RIP: 0010:talk_to_blkback+0x7b7/0xdb0
dom7: [ 2866.578706] Code: ff ff fa ff e9 5d fb ff ff 49 8b 56 08 48 8b b3 08 01 00 00 8b 7c 24 1c e8 96 bb ff ff 85 c0 0f 84 60 ff ff ff e9 4b ff ff ff <0f> 0b 48 c7 c2 0c e7 c0 81 be f4 ff ff ff 4c 89 f7 e8 c3 ff fa ff
dom7: [ 2866.578727] RSP: 0018:ffffc900004e3d80 EFLAGS: 00010202
dom7: [ 2866.578734] RAX: 0000000000000001 RBX: ffff88801df68200 RCX: 0000000000000000
dom7: [ 2866.578743] RDX: 000000000000004a RSI: ffff88801d20ab80 RDI: 0000000000000000
dom7: [ 2866.578752] RBP: ffff88801e31a800 R08: 00000000000003c6 R09: 0000000000000800
dom7: [ 2866.578761] R10: ffffc900004d3db0 R11: 00000000000002da R12: ffffffff81ea4410
dom7: [ 2866.578770] R13: dead000000000122 R14: ffff88801e31a800 R15: ffff88801df68200
dom7: [ 2866.578779] FS:  0000000000000000(0000) GS:ffff88801f200000(0000) knlGS:0000000000000000
dom7: [ 2866.578789] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
dom7: [ 2866.578797] CR2: 00007f34ade5a0d4 CR3: 000000001d0bc003 CR4: 00000000003606b0
dom7: [ 2866.578807] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
dom7: [ 2866.578815] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
dom7: [ 2866.578825] Call Trace:
dom7: [ 2866.578830]  blkback_changed+0x14a/0xc50
dom7: [ 2866.578836]  ? find_watch+0x40/0x40
dom7: [ 2866.578841]  ? xenbus_read_driver_state+0x34/0x60
dom7: [ 2866.578848]  ? find_watch+0x40/0x40
dom7: [ 2866.578853]  xenwatch_thread+0x97/0x160
dom7: [ 2866.578859]  ? wait_woken+0x80/0x80
dom7: [ 2866.578866]  kthread+0xf3/0x130
dom7: [ 2866.578871]  ? kthread_create_worker_on_cpu+0x70/0x70
dom7: [ 2866.578879]  ret_from_fork+0x35/0x40
dom7: [ 2866.578884] Modules linked in: xen_argo(O)
dom7: [ 2866.578890] ---[ end trace 06163b0483faf9c0 ]---
dom7: [ 2866.578898] RIP: 0010:talk_to_blkback+0x7b7/0xdb0
dom7: [ 2866.586251] Code: ff ff fa ff e9 5d fb ff ff 49 8b 56 08 48 8b b3 08 01 00 00 8b 7c 24 1c e8 96 bb ff ff 85 c0 0f 84 60 ff ff ff e9 4b ff ff ff <0f> 0b 48 c7 c2 0c e7 c0 81 be f4 ff ff ff 4c 89 f7 e8 c3 ff fa ff
dom7: [ 2866.586276] RSP: 0018:ffffc900004e3d80 EFLAGS: 00010202
dom7: [ 2866.586288] RAX: 0000000000000001 RBX: ffff88801df68200 RCX: 0000000000000000
dom7: [ 2866.586301] RDX: 000000000000004a RSI: ffff88801d20ab80 RDI: 0000000000000000
dom7: [ 2866.586315] RBP: ffff88801e31a800 R08: 00000000000003c6 R09: 0000000000000800
dom7: [ 2866.586325] R10: ffffc900004d3db0 R11: 00000000000002da R12: ffffffff81ea4410
dom7: [ 2866.586339] R13: dead000000000122 R14: ffff88801e31a800 R15: ffff88801df68200
dom7: [ 2866.586354] FS:  0000000000000000(0000) GS:ffff88801f200000(0000) knlGS:0000000000000000
dom7: [ 2866.586368] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
dom7: [ 2866.586376] CR2: 00007f34ade5a0d4 CR3: 000000001d0bc003 CR4: 00000000003606b0
dom7: [ 2866.586390] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
dom7: [ 2866.586404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
tapdisk[17223]: disconnecting domid=7, devid=51728



* Re: xen-blkfront: BUG_ON(info->nr_rings)
  2021-03-10 14:58 xen-blkfront: BUG_ON(info->nr_rings) Jason Andryuk
@ 2021-03-11  9:01 ` Paul Durrant
  2021-03-11 10:37   ` Roger Pau Monné
  0 siblings, 1 reply; 4+ messages in thread
From: Paul Durrant @ 2021-03-11  9:01 UTC (permalink / raw)
  To: xen-devel

On 10/03/2021 14:58, Jason Andryuk wrote:
> Hi,
> 
> I was running a loop of `xl block-attach ; xl block-detach` and I
> triggered a BUG in xen-blkfront, drivers/block/xen-blkfront.c:1917
> This is BUG_ON(info->nr_rings) in negotiate_mq called by blkback_changed.
> 
> I'm using Linux 5.4.103 and blktap3 on Xen 4.12 (OpenXT), though I
> don't think that matters.  The backtrace and some preceding logs (from
> the reproducer) are below.
> 
> I just repro-ed with this:
> path=<backend path/state>
> xenstore-write $path 5 ; xenstore-write $path 4
> 
> info->nr_rings is still set because of the unexpected transition
> XenbusStateClosing -> XenbusStateConnected:
> dom7: [ 2866.574853] vbd vbd-51728: blkfront:blkback_changed to state 5.
> dom7: [ 2866.578385] vbd vbd-51728: blkfront:blkback_changed to state 4.
> 
> I'm not totally sure how to handle this.  The XenbusStateConnected
> event should be creating a new blkfront device, but instead it's seen
> by the old one which hasn't been cleaned up yet.
> 

Sounds like blkfront needs to be fixed. Once it is in state 5 the only 
state it should go to should be 6. From there it can cycle back to 4.
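
The rule stated above can be written down as a small predicate. This is a sketch only: `backend_transition_ok()` is a hypothetical helper, not a function in blkfront or xenbus. The enum values mirror `enum xenbus_state` from Xen's public io/xenbus.h. Treating a repeated state 5 as acceptable is an assumption, based on the log showing several consecutive "state 5" events.

```c
#include <stdbool.h>

/* Values of enum xenbus_state from Xen's public io/xenbus.h. */
enum xenbus_state {
    XenbusStateUnknown       = 0,
    XenbusStateInitialising  = 1,
    XenbusStateInitWait      = 2,
    XenbusStateInitialised   = 3,
    XenbusStateConnected     = 4,
    XenbusStateClosing       = 5,
    XenbusStateClosed        = 6,
    XenbusStateReconfiguring = 7,
    XenbusStateReconfigured  = 8
};

/* Hypothetical helper encoding only the rule from this message:
 * once the backend is in Closing (5), the only new state allowed is
 * Closed (6). A repeated Closing event is tolerated (assumption).
 * Every other transition is left unrestricted by this sketch. */
static bool backend_transition_ok(enum xenbus_state prev,
                                  enum xenbus_state next)
{
    if (prev == XenbusStateClosing)
        return next == XenbusStateClosing || next == XenbusStateClosed;
    return true;
}
```

With this predicate, 5 -> 4 is rejected while 5 -> 6 followed by 6 -> 4 is accepted, matching the cycle described above.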

   Paul




* Re: xen-blkfront: BUG_ON(info->nr_rings)
  2021-03-11  9:01 ` Paul Durrant
@ 2021-03-11 10:37   ` Roger Pau Monné
  2021-03-16 12:56     ` Jason Andryuk
  0 siblings, 1 reply; 4+ messages in thread
From: Roger Pau Monné @ 2021-03-11 10:37 UTC (permalink / raw)
  To: paul; +Cc: xen-devel

On Thu, Mar 11, 2021 at 09:01:51AM +0000, Paul Durrant wrote:
> On 10/03/2021 14:58, Jason Andryuk wrote:
> > Hi,
> > 
> > I was running a loop of `xl block-attach ; xl block-detach` and I
> > triggered a BUG in xen-blkfront, drivers/block/xen-blkfront.c:1917
> > This is BUG_ON(info->nr_rings) in negotiate_mq called by blkback_changed.
> > 
> > I'm using Linux 5.4.103 and blktap3 on Xen 4.12 (OpenXT), though I
> > don't think that matters.  The backtrace and some preceding logs (from
> > the reproducer) are below.
> > 
> > I just repro-ed with this:
> > path=<backend path/state>
> > xenstore-write $path 5 ; xenstore-write $path 4
> > 
> > info->nr_rings is still set because of the unexpected transition
> > XenbusStateClosing -> XenbusStateConnected:
> > dom7: [ 2866.574853] vbd vbd-51728: blkfront:blkback_changed to state 5.
> > dom7: [ 2866.578385] vbd vbd-51728: blkfront:blkback_changed to state 4.
> > 
> > I'm not totally sure how to handle this.  The XenbusStateConnected
> > event should be creating a new blkfront device, but instead it's seen
> > by the old one which hasn't been cleaned up yet.

IIRC xenbus state changes (like you perform above) never trigger the
creation or destruction of devices on the bus. See
xenbus_otherend_changed.

xl block-detach, however, should indeed remove the device. We should add
an option (e.g. `xl block-detach -w`) to wait for the device to actually
be removed before returning (or exit with a timeout).
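
The wait-with-timeout behaviour proposed above could look roughly like the polling loop below. This is a sketch under assumptions, not libxl code: `wait_for_removal()`, the `gone_fn` and `nap_fn` callback types, and the two demo stand-ins are all hypothetical names introduced here, and the `-w` flag is only the proposal from this message. How "device is gone" would actually be detected (for example by checking xenstore) is deliberately left to the callback.

```c
#include <stdbool.h>

typedef bool (*gone_fn)(void *ctx);   /* true once the device is gone */
typedef void (*nap_fn)(long ms);      /* sleeps between two polls */

/* Poll gone(ctx) until it reports removal or timeout_ms has elapsed.
 * Returns 0 if the device went away, -1 on timeout. Elapsed time is
 * approximated by summing the poll intervals. */
static int wait_for_removal(gone_fn gone, void *ctx,
                            long timeout_ms, long poll_ms, nap_fn nap)
{
    long waited = 0;

    if (poll_ms <= 0)
        poll_ms = 1;                  /* avoid a loop that never advances */

    for (;;) {
        if (gone(ctx))
            return 0;
        if (waited >= timeout_ms)
            return -1;
        nap(poll_ms);
        waited += poll_ms;
    }
}

/* Demo stand-ins so the sketch can be exercised without Xen. */
static bool gone_after_n_polls(void *ctx)
{
    int *remaining = ctx;             /* polls left before "removal" */
    if (*remaining == 0)
        return true;
    (*remaining)--;
    return false;
}

static void no_sleep(long ms)
{
    (void)ms;                         /* a real caller would sleep here */
}
```

Passing the sleep function in as a callback keeps the loop free of platform-specific calls; a real implementation would more likely be event-driven than polled.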

> > 
> 
> Sounds like blkfront needs to be fixed. Once it is in state 5 the only state
> it should go to should be 6. From there it can cycle back to 4.

Indeed, there's likely some logic to be improved in blkfront so it
doesn't get messed up so badly on state changes by blkback.

I'm happy to review patches for both blkfront and libxl/xl in order to
make this better :).

Thanks, Roger.



* Re: xen-blkfront: BUG_ON(info->nr_rings)
  2021-03-11 10:37   ` Roger Pau Monné
@ 2021-03-16 12:56     ` Jason Andryuk
  0 siblings, 0 replies; 4+ messages in thread
From: Jason Andryuk @ 2021-03-16 12:56 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Paul Durrant, xen-devel

On Thu, Mar 11, 2021 at 5:37 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Thu, Mar 11, 2021 at 09:01:51AM +0000, Paul Durrant wrote:
> > On 10/03/2021 14:58, Jason Andryuk wrote:
> > > Hi,
> > >
> > > I was running a loop of `xl block-attach ; xl block-detach` and I
> > > triggered a BUG in xen-blkfront, drivers/block/xen-blkfront.c:1917
> > > This is BUG_ON(info->nr_rings) in negotiate_mq called by blkback_changed.
> > >
> > > I'm using Linux 5.4.103 and blktap3 on Xen 4.12 (OpenXT), though I
> > > don't think that matters.  The backtrace and some preceding logs (from
> > > the reproducer) are below.
> > >
> > > I just repro-ed with this:
> > > path=<backend path/state>
> > > xenstore-write $path 5 ; xenstore-write $path 4
> > >
> > > info->nr_rings is still set because of the unexpected transition
> > > XenbusStateClosing -> XenbusStateConnected:
> > > dom7: [ 2866.574853] vbd vbd-51728: blkfront:blkback_changed to state 5.
> > > dom7: [ 2866.578385] vbd vbd-51728: blkfront:blkback_changed to state 4.
> > >
> > > I'm not totally sure how to handle this.  The XenbusStateConnected
> > > event should be creating a new blkfront device, but instead it's seen
> > > by the old one which hasn't been cleaned up yet.
>
> IIRC xenbus state changes (like you perform above) never trigger the
> creation or destruction of devices on the bus. See
> xenbus_otherend_changed.
>
> xl block-detach however should indeed remove the device. We should add
> an option to `xl block-detach -w` to wait for the device to actually
> be removed before returning (or exit with a timeout).

I didn't realize `xl block-detach` didn't wait.  There is some timeout
logic with detaching devices, but I have to investigate this more.

> > >
> >
> > Sounds like blkfront needs to be fixed. Once it is in state 5 the only state
> > it should go to should be 6. From there it can cycle back to 4.

Ok, thanks for the feedback.  So blocking 5->6 is straightforward.
6->4 triggered the same BUG, so I'm still investigating.

> Indeed, there's likely some logic to be improved in blkfront so it
> doesn't get messed up so badly on state changes by blkback.
>
> I'm happy to review patch for both blkfront and libxl/xl in order to
> make this better :).

Okay.

Regards,
Jason


