All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: Eric Blake <eblake@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"qemu-block@nongnu.org" <qemu-block@nongnu.org>
Cc: "armbru@redhat.com" <armbru@redhat.com>,
	"fam@euphon.net" <fam@euphon.net>,
	"stefanha@redhat.com" <stefanha@redhat.com>,
	"mreitz@redhat.com" <mreitz@redhat.com>,
	"kwolf@redhat.com" <kwolf@redhat.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	Denis Lunev <den@virtuozzo.com>
Subject: Re: [Qemu-devel] [PATCH] block: don't probe zeroes in bs->file by default on block_status
Date: Fri, 11 Jan 2019 10:13:17 +0000	[thread overview]
Message-ID: <b9bcc040-3831-d250-4b7f-ffe27984e243@virtuozzo.com> (raw)
In-Reply-To: <2ad46997-01aa-7a7a-ed53-4463a60ca564@virtuozzo.com>

11.01.2019 10:54, Vladimir Sementsov-Ogievskiy wrote:
> 10.01.2019 23:51, Eric Blake wrote:
>> On 1/10/19 7:20 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> drv_co_block_status digs bs->file for additional, more accurate search
>>> for hole inside region, reported as DATA by bs since 5daa74a6ebc.
>>
>> s/region, reported/regions reported/
>>
>>>
>>> This accuracy is not free: assume we have qcow2 disk. Actually, qcow2
>>> knows, where are holes and where is data. But every block_status
>>
>> Not quite true. qcow2 knows where some holes are, but does not know if
>> there are even more holes hiding in the sections marked data (such as
>> because of how the disk was pre-allocated).
>>
>>> request calls lseek additionally. Assume a big disk, full of
>>> data, in any iterative copying block job (or img convert) we'll call
>>> lseek(HOLE) on every iteration, and each of these lseeks will have to
>>> iterate through all metadata up to the end of file. It's obviously
>>> ineffective behavior. And for many scenarios we don't need this lseek
>>> at all.
>>
>> How much performance can we buy back without any knobs at all, if we
>> just taught posix-file.c to cache lseek() results?  That is, when
>> visiting a file sequentially, if lseek(fd, 0, SEEK_HOLE) returns EOF on
>> our first block status query, then all subsequent block status queries
>> fall within what we know to be data, and we can skip the lseek() calls.
> 
> EOF is bad mark I think. We may have a small hole not far from EOF, which
> will lead to the same performance, but not EOF returned.
> 
> I think, proper implementation of lseek cache should work, but it is more
> difficult than just disable lseek. And in scenarios without preallocation
> we don't need lseek, so again better is disable it.
> 
> And I don't sure that qemu is a proper place for lseek optimizations.
> 
> And another way may be to select cases when fiemap is safe and use it.
> 
> Again, for scenarios when we don't need nor lseek, nor fiemap, neither
> cache for them the best option is not use them.
> 
>>
>>>
>>> So, let's "5daa74a6ebc" by default, leaving an option to return
>>
>> s/let's/let's undo/
>>
>>> previous behavior, which is needed for scenarios with preallocated
>>> images.
>>>
>>> Add iotest illustrating new option semantics.
>>
>> And none of the existing iotests fail due to changed output?  Does that
>> mean our testsuite does not have good coverage of pre-allocation modes
>> where the zero probe mattered?
> 
> To be honest, I didn't run all the tests, only several.
> 
> Do it now, and found that, yes, at least 102 broken. Will fix it with the following
> version. Strange, do patchew run tests on patches in list now?
> 
>>
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>
>>
>>> +++ b/qapi/block-core.json
>>> @@ -3673,6 +3673,8 @@
>>>   #                 (default: off)
>>>   # @force-share:   force share all permission on added nodes.
>>>   #                 Requires read-only=true. (Since 2.10)
>>> +# @probe-zeroes:  Probe zeroes on protocol layer if format reports data
>>> +#                 (default: false) (since 4.0)
>>
>> Why do we need a new bool?  Can we instead...
>>
>>>   #
>>>   # Remaining options are determined by the block driver.
>>>   #
>>> @@ -3686,7 +3688,8 @@
>>>               '*read-only': 'bool',
>>>               '*auto-read-only': 'bool',
>>>               '*force-share': 'bool',
>>> -            '*detect-zeroes': 'BlockdevDetectZeroesOptions' },
>>> +            '*detect-zeroes': 'BlockdevDetectZeroesOptions',
>>> +            '*probe-zeroes': 'bool' },
>>
>> ...add another enum value to 'detect-zeroes', since after all, what we
>> are really doing is fine-tuning what level of zero probing we are
>> willing to do?
> 
> Should we? Or I just bind old behavior to detect-zeroes=on? And then, if needed,
> we'll add intermediate modes.
> 
>>
>>
>>> +++ b/qemu-options.hx
>>> @@ -615,6 +615,7 @@ DEF("blockdev", HAS_ARG, QEMU_OPTION_blockdev,
>>>       "-blockdev [driver=]driver[,node-name=N][,discard=ignore|unmap]\n"
>>>       "          [,cache.direct=on|off][,cache.no-flush=on|off]\n"
>>>       "          [,read-only=on|off][,detect-zeroes=on|off|unmap]\n"
>>> +    "          [,probe-zeroes=on|off]\n"
>>>       "          [,driver specific parameters...]\n"
>>>       "                configure a block backend\n", QEMU_ARCH_ALL)
>>>   STEXI
>>> @@ -670,6 +671,8 @@ discard requests.
>>>   conversion of plain zero writes by the OS to driver specific optimized
>>>   zero write commands. You may even choose "unmap" if @var{discard} is set
>>>   to "unmap" to allow a zero write to be converted to an @code{unmap} operation.
>>> +@item probe-zeroes=@var{probe-zeroes}
>>> +Probe zeroes on protocol layer if format reports data. Default is "off".
>>
>> Again, fitting this into detect-zeroes seems better than inventing a new
>> knob.
>>
> 
> No objections. But description of detect-zeroes are more about writes, should
> we change them somehow?
> 
> # Describes the operation mode for the automatic conversion of plain
> # zero writes by the OS to driver specific optimized zero write commands.
> 
> ...
> 
> # @detect-zeroes: detect and optimize zero writes (Since 2.1)
> 
> 
> 

Hm, just note, that detect-zeroes was about writes and should be set on target, and
the new thing is about block-status and should be set on source, and this thing should
be described in spec as well.


-- 
Best regards,
Vladimir

  reply	other threads:[~2019-01-11 10:18 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-10 13:20 [Qemu-devel] [PATCH] block: don't probe zeroes in bs->file by default on block_status Vladimir Sementsov-Ogievskiy
2019-01-10 20:51 ` Eric Blake
2019-01-11  7:54   ` Vladimir Sementsov-Ogievskiy
2019-01-11 10:13     ` Vladimir Sementsov-Ogievskiy [this message]
2019-01-11 16:02     ` Eric Blake
2019-01-11 16:05       ` Eric Blake
2019-01-11 16:22       ` Vladimir Sementsov-Ogievskiy
2019-01-11 17:12         ` Eric Blake
2019-01-11 10:41 ` Kevin Wolf
2019-01-11 11:40   ` Vladimir Sementsov-Ogievskiy
2019-01-11 12:21     ` Kevin Wolf
2019-01-11 12:59       ` Vladimir Sementsov-Ogievskiy
2019-01-11 13:15         ` Kevin Wolf
2019-01-11 16:09           ` Vladimir Sementsov-Ogievskiy
2019-01-11 17:04             ` Eric Blake
2019-01-11 17:27               ` Vladimir Sementsov-Ogievskiy
2019-01-22 18:57     ` Kevin Wolf
2019-01-23 11:53       ` Vladimir Sementsov-Ogievskiy
2019-01-23 16:33         ` Kevin Wolf
2019-01-24 14:36           ` Vladimir Sementsov-Ogievskiy
2019-01-24 15:31             ` Kevin Wolf
2019-01-24 15:47               ` Vladimir Sementsov-Ogievskiy
2019-01-23 12:04       ` Vladimir Sementsov-Ogievskiy
2019-01-24 14:37         ` Vladimir Sementsov-Ogievskiy
2019-01-24 15:39           ` Kevin Wolf
2019-01-24 15:49             ` Eric Blake
2019-01-24 15:53             ` Vladimir Sementsov-Ogievskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b9bcc040-3831-d250-4b7f-ffe27984e243@virtuozzo.com \
    --to=vsementsov@virtuozzo.com \
    --cc=armbru@redhat.com \
    --cc=den@virtuozzo.com \
    --cc=eblake@redhat.com \
    --cc=fam@euphon.net \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.