All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] scsi-generic: replace logical block count of response of READ CAPACITY
@ 2021-11-20 10:15 Lin Ma
  2021-12-21  9:48 ` Claudio Fontana
  0 siblings, 1 reply; 3+ messages in thread
From: Lin Ma @ 2021-11-20 10:15 UTC (permalink / raw)
  To: pbonzini, fam; +Cc: qemu-devel, Lin Ma

While using SCSI passthrough, Following scenario makes qemu doesn't
realized the capacity change of remote scsi target:
1. online resize the scsi target.
2. issue 'rescan-scsi-bus.sh -s ...' in host.
3. issue 'rescan-scsi-bus.sh -s ...' in vm.

In above scenario I used to experienced errors while accessing the
additional disk space in vm. I think the reasonable operations should
be:
1. online resize the scsi target.
2. issue 'rescan-scsi-bus.sh -s ...' in host.
3. issue 'block_resize' via qmp to notify qemu.
4. issue 'rescan-scsi-bus.sh -s ...' in vm.

The errors disappear once I notify qemu by block_resize via qmp.

So this patch replaces the number of logical blocks of READ CAPACITY
response from scsi target by qemu's bs->total_sectors. If the user in
vm wants to access the additional disk space, The administrator of
host must notify qemu once resizeing the scsi target.

Bonus is that domblkinfo of libvirt can reflect the consistent capacity
information between host and vm in case of missing block_resize in qemu.
E.g:
...
    <disk type='block' device='lun'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/sdc' index='1'/>
      <backingStore/>
      <target dev='sda' bus='scsi'/>
      <alias name='scsi0-0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
...

Before:
1. online resize the scsi target.
2. host:~  # rescan-scsi-bus.sh -s /dev/sdc
3. guest:~ # rescan-scsi-bus.sh -s /dev/sda
4  host:~  # virsh domblkinfo --domain $DOMAIN --human --device sda
Capacity:       4.000 GiB
Allocation:     0.000 B
Physical:       8.000 GiB

5. guest:~ # lsblk /dev/sda
NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda      8:0    0   8G  0 disk
└─sda1   8:1    0   2G  0 part

After:
1. online resize the scsi target.
2. host:~  # rescan-scsi-bus.sh -s /dev/sdc
3. guest:~ # rescan-scsi-bus.sh -s /dev/sda
4  host:~  # virsh domblkinfo --domain $DOMAIN --human --device sda
Capacity:       4.000 GiB
Allocation:     0.000 B
Physical:       8.000 GiB

5. guest:~ # lsblk /dev/sda
NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda      8:0    0   4G  0 disk
└─sda1   8:1    0   2G  0 part

Signed-off-by: Lin Ma <lma@suse.com>
---
 hw/scsi/scsi-generic.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 0306ccc7b1..343b51c2c0 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -315,11 +315,17 @@ static void scsi_read_complete(void * opaque, int ret)
     if (r->req.cmd.buf[0] == READ_CAPACITY_10 &&
         (ldl_be_p(&r->buf[0]) != 0xffffffffU || s->max_lba == 0)) {
         s->blocksize = ldl_be_p(&r->buf[4]);
-        s->max_lba = ldl_be_p(&r->buf[0]) & 0xffffffffULL;
+        BlockBackend *blk = s->conf.blk;
+        BlockDriverState *bs = blk_bs(blk);
+        s->max_lba = bs->total_sectors - 1;
+        stl_be_p(&r->buf[0], s->max_lba);
     } else if (r->req.cmd.buf[0] == SERVICE_ACTION_IN_16 &&
                (r->req.cmd.buf[1] & 31) == SAI_READ_CAPACITY_16) {
         s->blocksize = ldl_be_p(&r->buf[8]);
-        s->max_lba = ldq_be_p(&r->buf[0]);
+        BlockBackend *blk = s->conf.blk;
+        BlockDriverState *bs = blk_bs(blk);
+        s->max_lba = bs->total_sectors - 1;
+        stq_be_p(&r->buf[0], s->max_lba);
     }
     blk_set_guest_block_size(s->conf.blk, s->blocksize);
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] scsi-generic: replace logical block count of response of READ CAPACITY
  2021-11-20 10:15 [PATCH] scsi-generic: replace logical block count of response of READ CAPACITY Lin Ma
@ 2021-12-21  9:48 ` Claudio Fontana
  2021-12-21 11:20   ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 3+ messages in thread
From: Claudio Fontana @ 2021-12-21  9:48 UTC (permalink / raw)
  To: pbonzini, fam; +Cc: Hannes Reinecke, qemu-devel, Lin Ma

Hi Paolo, Hannes,

any thoughts on the following issue?

Introduction:

When using SAN storage for providing block devices to guests, configured as SCSI-passthrough devices, increasing the space available in the VM is a use case.

To do it, it is currently necessary to:

1) expand storage on the actual SAN,
2) run a "virsh blockresize" or equivalent command to make QEMU aware of the new size, and finally
3) do a "rescan-scsi-bus.sh" or equivalent operation in the guest to make the running guest aware of the increased disk size.

The problem:

As of now, the administrator needs to make sure that step 3 won't be done before step 2 has been executed, or the resulting state will be inconsistent.
In practice this creates organizational issues to try to sync up host/storage admins and guest OS admin, and is therefore error prone (due to these human factors).

The proposal:

The patch I replied to here from Ma Lin tries to avoid the inconsistent state, by having "rescan-scsi-bus.sh" still report the old size in the guest until QEMU itself is aware of the new disk size.

The patch:

Before the patch, the SCSI READ_CAPACITY command in the guest os directly receives the unmodified response from the storage backend.

After the patch, QEMU intercepts the READ_CAPACITY response and replaces the maximum LBA with the information which is saved in QEMU.

This means: after resizing the storage on the SAN backend, the host administrator must explicitly notify about CAPACITY HAS CHANGED by issuing a block resize command through QMP or libvirt,
even for SCSI passthrough disks.

Any ideas on this patch or on possible alternatives?

Thanks,

Claudio


On 11/20/21 11:15 AM, Lin Ma wrote:
> While using SCSI passthrough, Following scenario makes qemu doesn't
> realized the capacity change of remote scsi target:
> 1. online resize the scsi target.
> 2. issue 'rescan-scsi-bus.sh -s ...' in host.
> 3. issue 'rescan-scsi-bus.sh -s ...' in vm.
> 
> In above scenario I used to experienced errors while accessing the
> additional disk space in vm. I think the reasonable operations should
> be:
> 1. online resize the scsi target.
> 2. issue 'rescan-scsi-bus.sh -s ...' in host.
> 3. issue 'block_resize' via qmp to notify qemu.
> 4. issue 'rescan-scsi-bus.sh -s ...' in vm.
> 
> The errors disappear once I notify qemu by block_resize via qmp.
> 
> So this patch replaces the number of logical blocks of READ CAPACITY
> response from scsi target by qemu's bs->total_sectors. If the user in
> vm wants to access the additional disk space, The administrator of
> host must notify qemu once resizeing the scsi target.
> 
> Bonus is that domblkinfo of libvirt can reflect the consistent capacity
> information between host and vm in case of missing block_resize in qemu.
> E.g:
> ...
>     <disk type='block' device='lun'>
>       <driver name='qemu' type='raw'/>
>       <source dev='/dev/sdc' index='1'/>
>       <backingStore/>
>       <target dev='sda' bus='scsi'/>
>       <alias name='scsi0-0-0-0'/>
>       <address type='drive' controller='0' bus='0' target='0' unit='0'/>
>     </disk>
> ...
> 
> Before:
> 1. online resize the scsi target.
> 2. host:~  # rescan-scsi-bus.sh -s /dev/sdc
> 3. guest:~ # rescan-scsi-bus.sh -s /dev/sda
> 4  host:~  # virsh domblkinfo --domain $DOMAIN --human --device sda
> Capacity:       4.000 GiB
> Allocation:     0.000 B
> Physical:       8.000 GiB
> 
> 5. guest:~ # lsblk /dev/sda
> NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> sda      8:0    0   8G  0 disk
> └─sda1   8:1    0   2G  0 part
> 
> After:
> 1. online resize the scsi target.
> 2. host:~  # rescan-scsi-bus.sh -s /dev/sdc
> 3. guest:~ # rescan-scsi-bus.sh -s /dev/sda
> 4  host:~  # virsh domblkinfo --domain $DOMAIN --human --device sda
> Capacity:       4.000 GiB
> Allocation:     0.000 B
> Physical:       8.000 GiB
> 
> 5. guest:~ # lsblk /dev/sda
> NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> sda      8:0    0   4G  0 disk
> └─sda1   8:1    0   2G  0 part
> 
> Signed-off-by: Lin Ma <lma@suse.com>
> ---
>  hw/scsi/scsi-generic.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
> index 0306ccc7b1..343b51c2c0 100644
> --- a/hw/scsi/scsi-generic.c
> +++ b/hw/scsi/scsi-generic.c
> @@ -315,11 +315,17 @@ static void scsi_read_complete(void * opaque, int ret)
>      if (r->req.cmd.buf[0] == READ_CAPACITY_10 &&
>          (ldl_be_p(&r->buf[0]) != 0xffffffffU || s->max_lba == 0)) {
>          s->blocksize = ldl_be_p(&r->buf[4]);
> -        s->max_lba = ldl_be_p(&r->buf[0]) & 0xffffffffULL;
> +        BlockBackend *blk = s->conf.blk;
> +        BlockDriverState *bs = blk_bs(blk);
> +        s->max_lba = bs->total_sectors - 1;
> +        stl_be_p(&r->buf[0], s->max_lba);
>      } else if (r->req.cmd.buf[0] == SERVICE_ACTION_IN_16 &&
>                 (r->req.cmd.buf[1] & 31) == SAI_READ_CAPACITY_16) {
>          s->blocksize = ldl_be_p(&r->buf[8]);
> -        s->max_lba = ldq_be_p(&r->buf[0]);
> +        BlockBackend *blk = s->conf.blk;
> +        BlockDriverState *bs = blk_bs(blk);
> +        s->max_lba = bs->total_sectors - 1;
> +        stq_be_p(&r->buf[0], s->max_lba);
>      }
>      blk_set_guest_block_size(s->conf.blk, s->blocksize);
>  
> 



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] scsi-generic: replace logical block count of response of READ CAPACITY
  2021-12-21  9:48 ` Claudio Fontana
@ 2021-12-21 11:20   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 3+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-12-21 11:20 UTC (permalink / raw)
  To: Claudio Fontana, pbonzini, fam
  Cc: Hannes Reinecke, John Snow, qemu-devel, Lin Ma

Hi Claudio,

On 12/21/21 10:48, Claudio Fontana wrote:
> Hi Paolo, Hannes,
> 
> any thoughts on the following issue?
> 
> Introduction:
> 
> When using SAN storage for providing block devices to guests, configured as SCSI-passthrough devices, increasing the space available in the VM is a use case.
> 
> To do it, it is currently necessary to:
> 
> 1) expand storage on the actual SAN,
> 2) run a "virsh blockresize" or equivalent command to make QEMU aware of the new size, and finally
> 3) do a "rescan-scsi-bus.sh" or equivalent operation in the guest to make the running guest aware of the increased disk size.
> 
> The problem:
> 
> As of now, the administrator needs to make sure that step 3 won't be done before step 2 has been executed, or the resulting state will be inconsistent.
> In practice this creates organizational issues to try to sync up host/storage admins and guest OS admin, and is therefore error prone (due to these human factors).
> 
> The proposal:
> 
> The patch I replied to here from Ma Lin tries to avoid the inconsistent state, by having "rescan-scsi-bus.sh" still report the old size in the guest until QEMU itself is aware of the new disk size.
> 
> The patch:
> 
> Before the patch, the SCSI READ_CAPACITY command in the guest os directly receives the unmodified response from the storage backend.
> 
> After the patch, QEMU intercepts the READ_CAPACITY response and replaces the maximum LBA with the information which is saved in QEMU.
> 
> This means: after resizing the storage on the SAN backend, the host administrator must explicitly notify about CAPACITY HAS CHANGED by issuing a block resize command through QMP or libvirt,
> even for SCSI passthrough disks.
> 
> Any ideas on this patch or on possible alternatives?

I am not an SCSI expert, but Lin's description and yours sound
coherent to me.

One minor nitpick below.

> On 11/20/21 11:15 AM, Lin Ma wrote:
>> While using SCSI passthrough, Following scenario makes qemu doesn't
>> realized the capacity change of remote scsi target:
>> 1. online resize the scsi target.
>> 2. issue 'rescan-scsi-bus.sh -s ...' in host.
>> 3. issue 'rescan-scsi-bus.sh -s ...' in vm.
>>
>> In above scenario I used to experienced errors while accessing the
>> additional disk space in vm. I think the reasonable operations should
>> be:
>> 1. online resize the scsi target.
>> 2. issue 'rescan-scsi-bus.sh -s ...' in host.
>> 3. issue 'block_resize' via qmp to notify qemu.
>> 4. issue 'rescan-scsi-bus.sh -s ...' in vm.
>>
>> The errors disappear once I notify qemu by block_resize via qmp.
>>
>> So this patch replaces the number of logical blocks of READ CAPACITY
>> response from scsi target by qemu's bs->total_sectors. If the user in
>> vm wants to access the additional disk space, The administrator of
>> host must notify qemu once resizeing the scsi target.
>>
>> Bonus is that domblkinfo of libvirt can reflect the consistent capacity
>> information between host and vm in case of missing block_resize in qemu.
>> E.g:
>> ...
>>     <disk type='block' device='lun'>
>>       <driver name='qemu' type='raw'/>
>>       <source dev='/dev/sdc' index='1'/>
>>       <backingStore/>
>>       <target dev='sda' bus='scsi'/>
>>       <alias name='scsi0-0-0-0'/>
>>       <address type='drive' controller='0' bus='0' target='0' unit='0'/>
>>     </disk>
>> ...
>>
>> Before:
>> 1. online resize the scsi target.
>> 2. host:~  # rescan-scsi-bus.sh -s /dev/sdc
>> 3. guest:~ # rescan-scsi-bus.sh -s /dev/sda
>> 4  host:~  # virsh domblkinfo --domain $DOMAIN --human --device sda
>> Capacity:       4.000 GiB
>> Allocation:     0.000 B
>> Physical:       8.000 GiB
>>
>> 5. guest:~ # lsblk /dev/sda
>> NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
>> sda      8:0    0   8G  0 disk
>> └─sda1   8:1    0   2G  0 part
>>
>> After:
>> 1. online resize the scsi target.
>> 2. host:~  # rescan-scsi-bus.sh -s /dev/sdc
>> 3. guest:~ # rescan-scsi-bus.sh -s /dev/sda
>> 4  host:~  # virsh domblkinfo --domain $DOMAIN --human --device sda
>> Capacity:       4.000 GiB
>> Allocation:     0.000 B
>> Physical:       8.000 GiB
>>
>> 5. guest:~ # lsblk /dev/sda
>> NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
>> sda      8:0    0   4G  0 disk
>> └─sda1   8:1    0   2G  0 part
>>
>> Signed-off-by: Lin Ma <lma@suse.com>
>> ---
>>  hw/scsi/scsi-generic.c | 10 ++++++++--
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
>> index 0306ccc7b1..343b51c2c0 100644
>> --- a/hw/scsi/scsi-generic.c
>> +++ b/hw/scsi/scsi-generic.c
>> @@ -315,11 +315,17 @@ static void scsi_read_complete(void * opaque, int ret)
>>      if (r->req.cmd.buf[0] == READ_CAPACITY_10 &&
>>          (ldl_be_p(&r->buf[0]) != 0xffffffffU || s->max_lba == 0)) {
>>          s->blocksize = ldl_be_p(&r->buf[4]);
>> -        s->max_lba = ldl_be_p(&r->buf[0]) & 0xffffffffULL;
>> +        BlockBackend *blk = s->conf.blk;
>> +        BlockDriverState *bs = blk_bs(blk);
>> +        s->max_lba = bs->total_sectors - 1;

I'd add a refresh_max_lba() helper:

 static void refresh_max_lba(SCSIDevice *s)
 {
     BlockBackend *blk = s->conf.blk;
     BlockDriverState *bs = blk_bs(blk);
     uint64_t max_lba = bs->total_sectors - 1;

     if (max_lba != s->max_lba) {
         trace_scsi_generic_max_lba_refreshed(s->max_lba, max_lba);
         s->max_lba = max_lba;
     }
 }

Otherwise, to the best of my knowledge:
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

>> +        stl_be_p(&r->buf[0], s->max_lba);
>>      } else if (r->req.cmd.buf[0] == SERVICE_ACTION_IN_16 &&
>>                 (r->req.cmd.buf[1] & 31) == SAI_READ_CAPACITY_16) {
>>          s->blocksize = ldl_be_p(&r->buf[8]);
>> -        s->max_lba = ldq_be_p(&r->buf[0]);
>> +        BlockBackend *blk = s->conf.blk;
>> +        BlockDriverState *bs = blk_bs(blk);
>> +        s->max_lba = bs->total_sectors - 1;
>> +        stq_be_p(&r->buf[0], s->max_lba);
>>      }
>>      blk_set_guest_block_size(s->conf.blk, s->blocksize);
>>  
>>
> 
> 



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-12-21 11:22 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-20 10:15 [PATCH] scsi-generic: replace logical block count of response of READ CAPACITY Lin Ma
2021-12-21  9:48 ` Claudio Fontana
2021-12-21 11:20   ` Philippe Mathieu-Daudé

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.