From: Eric Blake <eblake@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"qemu-block@nongnu.org" <qemu-block@nongnu.org>
Cc: "armbru@redhat.com" <armbru@redhat.com>,
	"fam@euphon.net" <fam@euphon.net>,
	"stefanha@redhat.com" <stefanha@redhat.com>,
	"mreitz@redhat.com" <mreitz@redhat.com>,
	"kwolf@redhat.com" <kwolf@redhat.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	Denis Lunev <den@virtuozzo.com>
Subject: Re: [Qemu-devel] [PATCH] block: don't probe zeroes in bs->file by default on block_status
Date: Fri, 11 Jan 2019 10:02:33 -0600
Message-ID: <5854b5b8-4682-2188-0c00-55c8d413be5e@redhat.com>
In-Reply-To: <2ad46997-01aa-7a7a-ed53-4463a60ca564@virtuozzo.com>

On 1/11/19 1:54 AM, Vladimir Sementsov-Ogievskiy wrote:

>>
>> How much performance can we buy back without any knobs at all, if we
>> just taught posix-file.c to cache lseek() results?  That is, when
>> visiting a file sequentially, if lseek(fd, 0, SEEK_HOLE) returns EOF on
>> our first block status query, then all subsequent block status queries
>> fall within what we know to be data, and we can skip the lseek() calls.
> 
> EOF is bad mark I think. We may have a small hole not far from EOF, which
> will lead to the same performance, but not EOF returned.

EOF was just an example for a file that has no holes. But even for a
file with holes, caching should help.  That is, if I have a raw file with:

1M data | 1M hole | 1M data | EOF

but a qcow2 file that was created in an out-of-order fashion, so that
all clusters are discontiguous, then our current code may do something
like the following sequence:

block_status 0 - maps to 64k of host file at offset 0x20000
 - lseek(0x20000) detects that we are in a data portion, and the next
hole begins at 0x100000
 - but because the next cluster is not at 0x30000, we throw away the
information for 0x30000 to 0x100000
block_status 64k - maps to 64k of host file at offset 0x40000
 - lseek(0x40000) detects that we are in a data portion, and the next
hole begins at 0x100000
 - but because the next cluster is not at 0x50000, we throw away the
information for 0x50000 to 0x100000
block_status 128k - maps to 64k of host file at offset 0x30000
 - lseek(0x30000) detects that we are in a data portion, and the next
hole begins at 0x100000
...

Even a dumb most-recently-used cache will speed this up: both the second
and third queries above can be avoided, because we know that both
0x40000 and 0x30000 lie between our most recent lseek at 0x20000 and the
hole at 0x100000.
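
A minimal sketch of such a one-entry cache, again as a Python model
rather than a patch. MruDataCache and fake_seek_hole() are hypothetical
names; fake_seek_hole() stands in for lseek(fd, offset, SEEK_HOLE) on
the layout above, and when to call invalidate() is an assumption, not
something QEMU defines.

```python
MiB = 1 << 20


class MruDataCache:
    """Remember only the most recent [start, hole) range known to be data."""

    def __init__(self, seek_hole):
        self.seek_hole = seek_hole   # callable standing in for lseek(SEEK_HOLE)
        self.start = self.hole = 0   # empty range: nothing known yet
        self.lseek_calls = 0

    def next_hole(self, offset):
        if self.start <= offset < self.hole:
            return self.hole         # hit: offset lies inside known data
        self.lseek_calls += 1
        hole = self.seek_hole(offset)
        if hole > offset:            # offset is in data: remember the range
            self.start, self.hole = offset, hole
        return hole

    def invalidate(self):
        """Assumed to be called on any write that could change the layout."""
        self.start = self.hole = 0


def fake_seek_hole(offset):
    # Hypothetical layout: 1M data | 1M hole | 1M data | EOF
    if offset < 1 * MiB:
        return 1 * MiB
    if offset < 2 * MiB:
        return offset
    return 3 * MiB


cache = MruDataCache(fake_seek_hole)
results = [cache.next_hole(off) for off in (0x20000, 0x40000, 0x30000)]
print([hex(r) for r in results], cache.lseek_calls)
```

In this model all three queries report the hole at 0x100000, and only
the first one reaches the simulated lseek.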

Make the cache slightly larger, or use a bitmap with 2 bits per cluster
(tracking unknown, known-data, known-hole), with proper flushing of the
cache as we write to the image, or whatever, and we should automatically
get some performance improvements by skipping lseek() anywhere that
we remember what a previous lseek() already told us, with no knobs needed.
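
The 2-bits-per-cluster variant could look roughly like this. It is a
sketch under assumptions: ClusterStateMap and its methods are invented
names, the cluster size is fixed at 64k, and "flushing" is modelled as
resetting the touched clusters to unknown on write.

```python
UNKNOWN, DATA, HOLE = 0, 1, 2   # each state fits in 2 bits
CLUSTER = 64 * 1024


class ClusterStateMap:
    """Pack one 2-bit state per host cluster into a bytearray."""

    def __init__(self, file_size):
        self.nclusters = (file_size + CLUSTER - 1) // CLUSTER
        self.bits = bytearray((self.nclusters + 3) // 4)  # 4 states per byte

    def get(self, cluster):
        return (self.bits[cluster // 4] >> ((cluster % 4) * 2)) & 0b11

    def set(self, cluster, state):
        shift = (cluster % 4) * 2
        cleared = self.bits[cluster // 4] & ~(0b11 << shift) & 0xFF
        self.bits[cluster // 4] = cleared | (state << shift)

    def record_data(self, start, hole):
        """A seek from start found the next hole at hole: mark the
        clusters lying fully inside [start, hole) as known data."""
        first = (start + CLUSTER - 1) // CLUSTER
        for c in range(first, min(hole // CLUSTER, self.nclusters)):
            self.set(c, DATA)

    def invalidate(self, offset, length):
        """A write touched this byte range: forget what we knew about it."""
        end = (offset + length + CLUSTER - 1) // CLUSTER
        for c in range(offset // CLUSTER, min(end, self.nclusters)):
            self.set(c, UNKNOWN)


m = ClusterStateMap(3 * (1 << 20))
m.record_data(0x20000, 0x100000)            # result of the first lseek
states = [m.get(off // CLUSTER) for off in (0x20000, 0x40000, 0x30000)]
m.invalidate(0x40000, 512)                  # a small write at 0x40000
after_write = [m.get(off // CLUSTER) for off in (0x20000, 0x40000, 0x30000)]
```

Here states is all DATA after the single recorded seek, and after the
write only the cluster at 0x40000 drops back to UNKNOWN. The HOLE state
would be filled in the same way from SEEK_DATA results, which this
sketch leaves out.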

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



