On 06.11.19 14:34, Dietmar Maurer wrote:
>
>> On 6 November 2019 14:17 Max Reitz wrote:
>>
>>
>> On 06.11.19 14:09, Dietmar Maurer wrote:
>>>> Let me elaborate: Yes, a cluster size generally means that it is most
>>>> “efficient” to access the storage at that size. But there’s a
>>>> tradeoff. At some point, reading the data takes sufficiently long
>>>> that reading a bit of metadata doesn’t matter anymore (usually, that
>>>> is).
>>>
>>> Any network storage suffers from long network latencies, so it always
>>> matters if you do more IOs than necessary.
>>
>> Yes, exactly, that’s why I’m saying it makes sense to me to increase
>> the buffer size from the measly 64 kB that we currently have. I just
>> don’t see the point of increasing it exactly to the source cluster
>> size.
>>
>>>> There is a bit of a problem with making the backup copy size rather
>>>> large, and that is the fact that backup’s copy-before-write causes
>>>> guest writes to stall. So if the guest just writes a bit of data, a
>>>> 4 MB buffer size may mean that in the background it will have to
>>>> wait for 4 MB of data to be copied.[1]
>>>
>>> We use this for several years now in production, and it is not a
>>> problem. (Ceph storage is mostly on 10G (or faster) network
>>> equipment).
>>
>> So you mean for cases where backup already chooses a 4 MB buffer size
>> because the target has that cluster size?
>
> To make it clear. Backups from Ceph as source are slow.

Yep, but if the target were another Ceph instance, the backup buffer
size would be chosen to be 4 MB (AFAIU), so I was wondering whether you
are referring to this effect, or to...

> That is why we use a patched qemu version, which uses:
>
> cluster_size = Max_Block_Size(source, target)

...this.

The main problem with the stall I mentioned is that I think one of the
main use cases of backup is having a fast source and a slow (off-site)
target. In such cases, I suppose it becomes annoying if some guest
writes (which were fast before the backup started) take a long time
because the backup needs to copy quite a bit of data to off-site
storage.

(And blindly taking the source cluster size would mean that such things
could happen if you use local qcow2 files with 2 MB clusters.)

So I’d prefer decoupling the backup buffer size from the bitmap
granularity, and then setting the buffer size to maybe the MAX of the
source and target cluster sizes. But I don’t know when I can get
around to doing that.

And then we should probably also cap it at 4 MB or 8 MB, because that
happens to be what you need, but I’d prefer for it not to use tons of
memory. (The mirror job uses 1 MB per request, for up to 16 parallel
requests; and the backup copy-before-write implementation currently
(on master) copies 1 MB at a time (per concurrent request), and the
whole memory usage of backup is limited to 128 MB.)

(OTOH, the minimum should probably be 1 MB.)

Max
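P.S.: For concreteness, a minimal sketch (in C) of the clamping I have
in mind; the helper name and the exact constants are placeholders, not
existing QEMU code, and the cap could just as well end up being 8 MB:

#include <stdint.h>

/* Illustrative bounds; whether the cap should be 4 MB or 8 MB is
 * still open, see above. */
#define BACKUP_BUF_SIZE_MIN (1 * 1024 * 1024)
#define BACKUP_BUF_SIZE_MAX (4 * 1024 * 1024)

/* Hypothetical helper: choose the backup buffer size independently of
 * the bitmap granularity, from the source and target cluster sizes. */
static int64_t backup_buf_size(int64_t source_cluster, int64_t target_cluster)
{
    int64_t size = source_cluster > target_cluster ? source_cluster
                                                   : target_cluster;

    if (size < BACKUP_BUF_SIZE_MIN) {
        size = BACKUP_BUF_SIZE_MIN;
    }
    if (size > BACKUP_BUF_SIZE_MAX) {
        size = BACKUP_BUF_SIZE_MAX;
    }
    return size;
}

With those numbers, a 64 kB-cluster qcow2 source backed up to a
4 MB-cluster Ceph target would still get the full 4 MB buffer, while a
local 2 MB-cluster qcow2 source copied to a 64 kB-cluster target would
end up at 2 MB instead of today’s 64 kB.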