From: Eric Blake <eblake@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"qemu-block@nongnu.org" <qemu-block@nongnu.org>
Cc: "armbru@redhat.com" <armbru@redhat.com>,
	"fam@euphon.net" <fam@euphon.net>,
	"stefanha@redhat.com" <stefanha@redhat.com>,
	"mreitz@redhat.com" <mreitz@redhat.com>,
	"kwolf@redhat.com" <kwolf@redhat.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	Denis Lunev <den@virtuozzo.com>
Subject: Re: [Qemu-devel] [PATCH] block: don't probe zeroes in bs->file by default on block_status
Date: Fri, 11 Jan 2019 11:12:04 -0600
Message-ID: <3e7a8e8c-ae95-2845-f1a9-7c3b9f868fdb@redhat.com>
In-Reply-To: <589853f3-847e-155b-7cdf-c4061522c8bc@virtuozzo.com>

On 1/11/19 10:22 AM, Vladimir Sementsov-Ogievskiy wrote:

>> Even a dumb most-recent use cache will speed this up: both the second
>> and third queries above can be avoided because we know that both 0x40000
>> and 0x30000 the second query at 0x40000 can be skipped (0x40000 is
>> between our most recent lseek at 0x20000 and hole at 0x10000)
> 
> Is it correct just use results from previous iterations? In mirror source
> is active and may change.

If you keep a cache, you have to keep the cache up-to-date. Any writes
to an area that is covered by the known-hole cache have to flush the
cache, so that the next block status no longer sees a known-hole and
ends up doing another lseek.  Or, if the cache has enough state to track
unknown/known-hole/known-data, then writes update the cache to be
known-data, and future block status can skip the lseek by using the
results of the cache.
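
To make the second variant concrete, here is a minimal sketch of a
tri-state cache using 2 bits per cluster. Every name in it (AllocCache,
alloc_cache_note_write, and so on) is hypothetical and chosen only for
illustration; none of this is existing QEMU code, and a real version
would have to hook into the block layer's write and discard paths.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch, not QEMU API: per-cluster allocation cache. */
typedef enum {
    ALLOC_UNKNOWN = 0,  /* nothing cached, lseek() is still needed */
    ALLOC_DATA    = 1,  /* known to contain data */
    ALLOC_HOLE    = 2,  /* known to be a hole */
} AllocState;

typedef struct {
    uint8_t *bits;          /* 2 bits per cluster, 4 clusters per byte */
    uint64_t nb_clusters;
    uint64_t cluster_size;
} AllocCache;

static int alloc_cache_init(AllocCache *c, uint64_t file_size,
                            uint64_t cluster_size)
{
    assert(cluster_size > 0);
    c->cluster_size = cluster_size;
    c->nb_clusters = (file_size + cluster_size - 1) / cluster_size;
    /* Zeroed memory means every cluster starts as ALLOC_UNKNOWN. */
    c->bits = calloc(c->nb_clusters / 4 + 1, 1);
    return c->bits ? 0 : -1;
}

static AllocState alloc_cache_get(const AllocCache *c, uint64_t cluster)
{
    assert(cluster < c->nb_clusters);
    return (AllocState)((c->bits[cluster / 4] >> ((cluster % 4) * 2)) & 3);
}

static void alloc_cache_set(AllocCache *c, uint64_t cluster, AllocState s)
{
    unsigned shift = (unsigned)(cluster % 4) * 2;

    assert(cluster < c->nb_clusters);
    c->bits[cluster / 4] = (uint8_t)((c->bits[cluster / 4] & ~(3u << shift))
                                     | ((unsigned)s << shift));
}

/* Set every cluster that [offset, offset + bytes) touches, even partly. */
static void alloc_cache_set_touched(AllocCache *c, uint64_t offset,
                                    uint64_t bytes, AllocState s)
{
    if (bytes == 0) {
        return;
    }
    uint64_t first = offset / c->cluster_size;
    uint64_t last = (offset + bytes - 1) / c->cluster_size;
    for (uint64_t i = first; i <= last && i < c->nb_clusters; i++) {
        alloc_cache_set(c, i, s);
    }
}

/* A write turns every touched cluster into known-data. */
static void alloc_cache_note_write(AllocCache *c, uint64_t offset,
                                   uint64_t bytes)
{
    alloc_cache_set_touched(c, offset, bytes, ALLOC_DATA);
}

/* The "flush" variant: forget what we knew, forcing a later lseek(). */
static void alloc_cache_invalidate(AllocCache *c, uint64_t offset,
                                   uint64_t bytes)
{
    alloc_cache_set_touched(c, offset, bytes, ALLOC_UNKNOWN);
}

/* Record a hole reported by lseek(); only fully covered clusters count. */
static void alloc_cache_note_hole(AllocCache *c, uint64_t offset,
                                  uint64_t bytes)
{
    uint64_t first = (offset + c->cluster_size - 1) / c->cluster_size;
    uint64_t end = (offset + bytes) / c->cluster_size;  /* exclusive */
    for (uint64_t i = first; i < end && i < c->nb_clusters; i++) {
        alloc_cache_set(c, i, ALLOC_HOLE);
    }
}
```

A block status query would call alloc_cache_get() first and fall back to
lseek() only for ALLOC_UNKNOWN. Note the asymmetry in rounding: a write
marks every cluster it touches, while a hole is recorded only for
clusters it covers completely, so the cache errs toward data.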

> 
>>
>> Make the cache slightly larger, or use a bitmap with 2 bits per cluster
>> (tracking unknown, known-data, known-hole), with proper flushing of the
>> cache as we write to the image, or whatever, and we should automatically
>> get some performance improvements by using fewer lseek() anywhere that
>> we remember what previous lseek() already told us, with no knobs needed.
>>
> 
> So the cache should consider all writes and discards. And it is obviously
> more difficult to implement it, than just don't call this lseek. And I
> don't understand, why cache + lseek is better for the case when we don't
> need nor the lseek neither the cache. Is this all to not add an option?
> Also Kevin objects to caching lseek in parallel sub-thread.

Kevin objected to caching anything if the image has multiple writers,
where an outside process could change the file allocation in between our
reads. But having multiple writers is rare - in fact, our image locking
for qcow2 formats tries to prevent multiple writers.  Having multiple
threads within one process writing is fine, as long as they properly
coordinate writes to the lseek cache so that readers never see a stale
claim of a hole - although a stale claim of data is safe.
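
One possible way to get that coordination, sketched with C11 atomics:
writers mark a cluster as data before issuing the write, and a reader
publishes an lseek() hole result only if the cluster is still unknown.
The names and the fixed table size are assumptions for illustration
only, and the sketch ignores discards, which would need extra care.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sketch, not QEMU API. */
enum { ST_UNKNOWN = 0, ST_DATA = 1, ST_HOLE = 2 };

#define NB_CLUSTERS 16  /* arbitrary size for the sketch */

/* Static storage is zero-initialized, so every cluster is ST_UNKNOWN. */
static _Atomic unsigned char cluster_state[NB_CLUSTERS];

/*
 * Writer side: call this BEFORE issuing the write. From then on no
 * reader can be told "hole" for this cluster, whether or not the write
 * has reached the file yet. The worst case is a stale claim of data.
 */
static void writer_mark_data(size_t cluster)
{
    assert(cluster < NB_CLUSTERS);
    atomic_store(&cluster_state[cluster], ST_DATA);
}

/*
 * Reader side: after lseek() reported a hole, publish it only if the
 * cluster is still unknown. If a writer marked it as data while the
 * lseek() was in flight, the exchange fails and the data claim stays.
 */
static bool reader_publish_hole(size_t cluster)
{
    unsigned char expected = ST_UNKNOWN;

    assert(cluster < NB_CLUSTERS);
    return atomic_compare_exchange_strong(&cluster_state[cluster],
                                          &expected, ST_HOLE);
}

/* Reader side: consult the cache before deciding whether to lseek(). */
static int reader_lookup(size_t cluster)
{
    assert(cluster < NB_CLUSTERS);
    return atomic_load(&cluster_state[cluster]);
}
```

The unconditional store in writer_mark_data() against the conditional
exchange in reader_publish_hole() is what encodes "stale data is safe,
stale hole is not": data always wins a race.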

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



