From: Wolfgang Bumiller <w.bumiller@proxmox.com>
To: Max Reitz <mreitz@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Stefan Hajnoczi <stefanha@gmail.com>,
	Dietmar Maurer <dietmar@proxmox.com>,
	qemu-block@nongnu.org,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: backup_calculate_cluster_size does not consider source
Date: Wed, 6 Nov 2019 11:34:50 +0100
Message-ID: <20191106103450.cafwk7m5xd5eulxo@olga.proxmox.com>
In-Reply-To: <ac30110f-6abe-a144-2aa5-b1cc140d7e8c@redhat.com>

On Wed, Nov 06, 2019 at 10:37:04AM +0100, Max Reitz wrote:
> On 06.11.19 09:32, Stefan Hajnoczi wrote:
> > On Tue, Nov 05, 2019 at 11:02:44AM +0100, Dietmar Maurer wrote:
> >> Example: Backup from ceph disk (rbd_cache=false) to local disk:
> >>
> >> backup_calculate_cluster_size returns 64K (correct for my local .raw image)
> >>
> >> Then the backup job starts to read 64K blocks from ceph.
> >>
> >> But ceph always reads 4M blocks, so this is incredibly slow and produces
> >> way too much network traffic.
> >>
> >> Why does backup_calculate_cluster_size not consider the block size of
> >> the source disk?
> >>
> >> cluster_size = MAX(block_size_source, block_size_target)
> 
> So Ceph always transmits 4 MB over the network, no matter what is
> actually needed?  That sounds, well, interesting.

Or at least it generates that much I/O. In the end, this can slow down
the backup by a multi-digit factor.
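
To put a rough number on that for Dietmar's example (64K backup reads
against 4M ceph blocks, no cache), here is a minimal sketch. It assumes
that every uncached read fetches whole source blocks, which is the
behaviour described above and not something measured here; the function
name is made up for illustration.

```python
KIB = 1024
MIB = 1024 * KIB


def read_amplification(request_size: int, source_block_size: int) -> float:
    """Bytes fetched from the source per byte actually requested.

    Assumption (hypothetical model): an uncached, aligned read always
    pulls in every source block it touches in full.
    """
    if request_size <= 0 or source_block_size <= 0:
        raise ValueError("sizes must be positive")
    # Number of source blocks touched by one aligned request (ceiling division).
    blocks_touched = -(-request_size // source_block_size)
    return blocks_touched * source_block_size / request_size


if __name__ == "__main__":
    # 64K backup cluster reading from a source with 4M blocks.
    print(read_amplification(64 * KIB, 4 * MIB))
    # Reading in units of the source block size avoids the overhead.
    print(read_amplification(4 * MIB, 4 * MIB))
```

Under that assumption, each 64K read costs a full 4M fetch, so the
source does 64 times the I/O that the backup job asked for.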

> backup_calculate_cluster_size() doesn’t consider the source size because
> to my knowledge there is no other medium that behaves this way.  So I
> suppose the assumption was always that the block size of the source
> doesn’t matter, because a partial read is always possible (without
> having to read everything).

Unless you enable qemu-side caching, this only works as long as the
block/cluster size of the source does not exceed that of the target.
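
A small sketch of that condition together with the MAX() rule Dietmar
proposed. This is only a model of the idea, not the code in
backup_calculate_cluster_size(); both function names are hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB


def proposed_cluster_size(source_block_size: int, target_cluster_size: int) -> int:
    """Proposed rule: copy in units of the larger of the two sizes."""
    return max(source_block_size, target_cluster_size)


def needs_partial_source_reads(copy_size: int, source_block_size: int) -> bool:
    """True if one copy unit covers only part of a source block.

    Without a cache on the qemu side, this is the case in which the
    same source block gets fetched again for each copy unit.
    """
    return copy_size < source_block_size


if __name__ == "__main__":
    source, target = 4 * MIB, 64 * KIB
    # Current behaviour in the example: cluster size follows the target only.
    print(needs_partial_source_reads(target, source))
    # With the proposed rule the copy unit matches the source block size.
    print(needs_partial_source_reads(proposed_cluster_size(source, target), source))
```

With the target-only size (64K) the backup reads partial source blocks;
with MAX(source, target) it does not.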

> What would make sense to me is to increase the buffer size in general.
> I don’t think we need to copy clusters at a time, and
> 0e2402452f1f2042923a5 has indeed increased the copy size to 1 MB for
> backup writes that are triggered by guest writes.  We haven’t yet
> increased the copy size for background writes, though.  We can do that,
> of course.  (And probably should.)
> 
> The thing is, it just seems unnecessary to me to take the source cluster
> size into account in general.  It seems weird that a medium only allows
> 4 MB reads, because, well, guests aren’t going to take that into account.

But guests usually have their own page cache, which is why in many
setups qemu (and thereby the backup process) runs without one.
