qemu-devel.nongnu.org archive mirror
* cluster_size got from backup_calculate_cluster_size()
@ 2020-05-21  9:56 Derek Su
  2020-05-21 18:19 ` John Snow
  0 siblings, 1 reply; 4+ messages in thread
From: Derek Su @ 2020-05-21  9:56 UTC (permalink / raw)
  To: qemu-devel

Hi,

The cluster_size returned by backup_calculate_cluster_size(),
i.e. MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size), is 64K regardless
of the target image's cluster size.


For example:

If the source and target qcow2 images both have a 16K cluster size, the
64K returned by backup_calculate_cluster_size() results in unwanted copies
of clusters.

This behavior slows down the backup (block-copy) process when the
source image receives lots of random writes.


Is the following calculation reasonable for the above issue?


```
static int64_t backup_calculate_cluster_size(BlockDriverState *target,
                                             Error **errp)
{
    ...

    ret = bdrv_get_info(target, &bdi);

    ...

    return (bdi.cluster_size == 0 ?
                BACKUP_CLUSTER_SIZE_DEFAULT : cluster_size);
}

```

Thanks.
Regards,

Derek


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: cluster_size got from backup_calculate_cluster_size()
  2020-05-21  9:56 cluster_size got from backup_calculate_cluster_size() Derek Su
@ 2020-05-21 18:19 ` John Snow
  2020-05-21 20:53   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 4+ messages in thread
From: John Snow @ 2020-05-21 18:19 UTC (permalink / raw)
  To: Derek Su, qemu-devel; +Cc: Vladimir Sementsov-Ogievskiy, Qemu-block



On 5/21/20 5:56 AM, Derek Su wrote:
> Hi,
> 
> The cluster_size got from backup_calculate_cluster_size(),
> MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size), is 64K regardless
> of the target image's cluster size.

Not regardless -- but it is using 64K as a minimum.

> 
> 
> For example:
> 
> If the cluster size of source and target qcow2 images are both 16K, the
> 64K from backup_calculate_cluster_size() results in unwanted copies of
> clusters.
> 
> The behavior slows down the backup (block-copy) process when the
> source image receives lots of rand writes.

Are we talking about incremental, full, or top?

> 
> 
> Is the following calculation reasonable for the above issue?
> 
> 
> ```
> static int64_t backup_calculate_cluster_size(BlockDriverState *target,
>                                              Error **errp)
> {
>     ...
> 
>     ret = bdrv_get_info(target, &bdi);
> 
>     ...
> 
>     return (bdi.cluster_size == 0 ?
>                 BACKUP_CLUSTER_SIZE_DEFAULT : cluster_size);
> }
> 
> ```
> 
> Thanks.
> Regards,
> 
> Derek
> 


Might be reasonable. When I made this the "default", it actually used to
just be "hardcoded." We could continue in this direction and go all the
way and turn it into a tunable.

I didn't think to make it lower than 64K because I was thinking about
the linear, full backup case where cluster sizes that were too small
might slow down the loop with too much metadata.

(And the default qcow2 is 64K, so it seemed like a good choice at the time.)

We could create a default-cluster-size tunable, perhaps. It's at 64K
now, but perhaps you can override it down to 16 or lower. We could
possibly treat a value of 0 as "no minimum; use the smallest common size."

This is a separate issue entirely, but we may also wish to begin
offering a cluster-size override directly. In the case that we know this
value is unsafe, we reject the command. In the case that it's ambiguous
due to lack of info, we can defer to the user's judgment. This would
allow us to force the backup to run in cases where we presently abort
out of fear.

CCing Vladimir who has been working on the backup loop extensively.

--js




* Re: cluster_size got from backup_calculate_cluster_size()
  2020-05-21 18:19 ` John Snow
@ 2020-05-21 20:53   ` Vladimir Sementsov-Ogievskiy
  2020-05-22  1:51     ` Derek Su
  0 siblings, 1 reply; 4+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-05-21 20:53 UTC (permalink / raw)
  To: John Snow, Derek Su, qemu-devel; +Cc: Qemu-block

21.05.2020 21:19, John Snow wrote:
> 
> 
> On 5/21/20 5:56 AM, Derek Su wrote:
>> Hi,
>>
>> The cluster_size got from backup_calculate_cluster_size(),
>> MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size), is 64K regardless
>> of the target image's cluster size.
> 
> Not regardless -- but it is using 64K as a minimum.
> 
>>
>>
>> For example:
>>
>> If the cluster size of source and target qcow2 images are both 16K, the
>> 64K from backup_calculate_cluster_size() results in unwanted copies of
>> clusters.
>>
>> The behavior slows down the backup (block-copy) process when the
>> source image receives lots of rand writes.
> 
> Are we talking about incremental, full, or top?
> 
>>
>>
>> Is the following calculation reasonable for the above issue?
>>
>>
>> ```
>> static int64_t backup_calculate_cluster_size(BlockDriverState *target,
>>                                               Error **errp)
>> {
>>      ...
>>
>>      ret = bdrv_get_info(target, &bdi);
>>
>>      ...
>>
>>      return (bdi.cluster_size == 0 ?
>>                  BACKUP_CLUSTER_SIZE_DEFAULT : cluster_size);
>> }
>>
>> ```
>>
>> Thanks.
>> Regards,
>>
>> Derek
>>
> 
> 
> Might be reasonable. When I made this the "default", it actually used to
> just be "hardcoded." We could continue in this direction and go all the
> way and turn it into a tune-able.
> 
> I didn't think to make it lower than 64K because I was thinking about
> the linear, full backup case where cluster sizes that were too small
> might slow down the loop with too much metadata.

Yes, currently the backup loop copies cluster-by-cluster, so if we allow clusters smaller than 64K, it may become slower (at least for full backup). Still, my work on backup performance is close to its end, and I hope it will be merged in 5.1. One of its effects is that the backup copying loop may copy more than one cluster at once, so this problem will go away.

Keeping this in mind, I think we can allow smaller clusters now; small clusters are a rare special case anyway.

> 
> (And the default qcow2 is 64K, so it seemed like a good choice at the time.)
> 
> We could create a default-cluster-size tunable, perhaps. It's at 64K
> now, but perhaps you can override it down to 16 or lower. We could
> possibly treat a value of 0 as "no minimum; use the smallest common size."
> 
> This is a separate issue entirely, but we may also wish to begin
> offering a cluster-size override directly. In the case that we know this
> value is unsafe, we reject the command. In the case that it's ambiguous
> due to lack of info, we can defer to the user's judgment. This would
> allow us to force the backup to run in cases where we presently abort
> out of fear.
> 
> CCing Vladimir who has been working on the backup loop extensively.
> 
> --js
> 


-- 
Best regards,
Vladimir



* Re: cluster_size got from backup_calculate_cluster_size()
  2020-05-21 20:53   ` Vladimir Sementsov-Ogievskiy
@ 2020-05-22  1:51     ` Derek Su
  0 siblings, 0 replies; 4+ messages in thread
From: Derek Su @ 2020-05-22  1:51 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, John Snow, qemu-devel; +Cc: Qemu-block

On 2020/5/22 上午4:53, Vladimir Sementsov-Ogievskiy wrote:
> 21.05.2020 21:19, John Snow wrote:
>>
>>
>> On 5/21/20 5:56 AM, Derek Su wrote:
>>> Hi,
>>>
>>> The cluster_size got from backup_calculate_cluster_size(),
>>> MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size), is 64K regardless
>>> of the target image's cluster size.
>>
>> Not regardless -- but it is using 64K as a minimum.

Right, 64K is a minimum. Thanks for the correction.

>>
>>>
>>>
>>> For example:
>>>
>>> If the cluster size of source and target qcow2 images are both 16K, the
>>> 64K from backup_calculate_cluster_size() results in unwanted copies of
>>> clusters.
>>>
>>> The behavior slows down the backup (block-copy) process when the
>>> source image receives lots of rand writes.
>>
>> Are we talking about incremental, full, or top?

Our use case is MIRROR_SYNC_MODE_NONE (copying only new writes from the
source image to the target).

>>
>>>
>>>
>>> Is the following calculation reasonable for the above issue?
>>>
>>>
>>> ```
>>> static int64_t backup_calculate_cluster_size(BlockDriverState *target,
>>>                                               Error **errp)
>>> {
>>>      ...
>>>
>>>      ret = bdrv_get_info(target, &bdi);
>>>
>>>      ...
>>>
>>>      return (bdi.cluster_size == 0 ?
>>>                  BACKUP_CLUSTER_SIZE_DEFAULT : cluster_size);
>>> }
>>>
>>> ```
>>>
>>> Thanks.
>>> Regards,
>>>
>>> Derek
>>>
>>
>>
>> Might be reasonable. When I made this the "default", it actually used to
>> just be "hardcoded." We could continue in this direction and go all the
>> way and turn it into a tune-able.
>>
>> I didn't think to make it lower than 64K because I was thinking about
>> the linear, full backup case where cluster sizes that were too small
>> might slow down the loop with too much metadata.
> 
> Yes, currently backup-loop is copying cluster-by-cluster, so if we allow 
> clusters less than 64k, it may become slower (at least for full-backup). 
> Still, my work about backup-performance is close to its end, and I hope, 
> in 5.1 will be merged. One of effects is that backup copying loop may 
> copy more than a cluster at once, so this problem will gone.
> 
> Keeping this in mind, I think we can allow smaller clusters now, as 
> anyway, small clusters is a rare special case.
> 
>>
>> (And the default qcow2 is 64K, so it seemed like a good choice at the 
>> time.)
>>
>> We could create a default-cluster-size tunable, perhaps. It's at 64K
>> now, but perhaps you can override it down to 16 or lower. We could
>> possibly treat a value of 0 as "no minimum; use the smallest common 
>> size."
>>
>> This is a separate issue entirely, but we may also wish to begin
>> offering a cluster-size override directly. In the case that we know this
>> value is unsafe, we reject the command. In the case that it's ambiguous
>> due to lack of info, we can defer to the user's judgment. This would
>> allow us to force the backup to run in cases where we presently abort
>> out of fear.
>>
>> CCing Vladimir who has been working on the backup loop extensively.
>>
>> --js
>>
> 
> 

Yes, I agree that a minimum of 64K or larger is a good choice for the
linear, full backup case.

In the COLO application (https://wiki.qemu.org/Features/COLO), the
primary VM (PVM)'s write requests are replicated to the secondary
VM (SVM)'s secondary disk. The writes also trigger block copies that copy
the old data on the secondary disk (source) to the hidden disk (target).
However, the checkpoint mechanism empties the hidden disk periodically,
so it is an endless backup (block-copy) process.

If the PVM write requests are random 4K writes, the block copy on the SVM
becomes inefficient in the above use case.

I conducted some performance tests.
With a 16K qcow2 cluster size, compared against a 64K backup granularity,
a 16K backup granularity shows sequential write performance decreasing by
10% and random write performance increasing by 40%.

I think a tunable default-cluster-size seems good for such applications.

Thanks.
Regards,
Derek


