All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Lieven <pl@kamp.de>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	ronniesahlberg@gmail.com, qemu-devel@nongnu.org,
	stefanha@redhat.com, mreitz@redhat.com
Subject: Re: [Qemu-devel] [PATCH 4/4] block: avoid creating oversized writes in multiwrite_merge
Date: Tue, 23 Sep 2014 11:52:23 +0200	[thread overview]
Message-ID: <542142D7.3040006@kamp.de> (raw)
In-Reply-To: <20140923094729.GD3871@noname.str.redhat.com>

On 23.09.2014 11:47, Kevin Wolf wrote:
> Am 23.09.2014 um 11:32 hat Peter Lieven geschrieben:
>> On 23.09.2014 10:59, Kevin Wolf wrote:
>>> Am 23.09.2014 um 08:15 hat Peter Lieven geschrieben:
>>>> On 22.09.2014 21:06, Paolo Bonzini wrote:
>>>>> Il 22/09/2014 11:43, Peter Lieven ha scritto:
>>>>>> This series aims not at touching default behaviour. The default for max_transfer_length
>>>>>> is 0 (no limit). max_transfer_length is a limit that MUST be satisfied otherwise the request
>>>>>> will fail. And Patch 2 aims at catching this fail earlier in the stack.
>>>>> Understood.  But the right fix is to avoid that backend limits transpire
>>>>> into guest ABI, not to catch the limits earlier.  So the right fix would
>>>>> be to implement request splitting.
>>>> Since you proposed to add traces for this would you leave those in?
>>>> And since iSCSI is the only user of this at the moment would you
>>>> go for implementing this check in the iSCSI block layer?
>>>>
>>>> As for the split logic would you think it is enough to build new qiov's
>>>> out of the too big iov without copying the contents? This would work
>>>> as long as a single iov inside the qiov is not bigger the max_transfer_length.
>>>> Otherwise I would need to allocate temporary buffers and copy around.
>>> You can split single iovs, too. There are functions that make this very
>>> easy, they copy a sub-qiov with a byte granularity offset and length
>>> (qemu_iovec_concat and friends). qcow2 uses the same to split requests
>>> at (fragmented) cluster boundaries.
>> Might it be as easy as this?
> This is completely untested, right? :-)

Yes :-)
I was just unsure if the general approach is right.

>
> But ignoring bugs, the principle looks right.
>
>> static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs,
>>      int64_t sector_num, int nb_sectors, QEMUIOVector *qiov,
>>      BdrvRequestFlags flags)
>> {
>>      if (nb_sectors < 0 || nb_sectors > (UINT_MAX >> BDRV_SECTOR_BITS)) {
>>          return -EINVAL;
>>      }
>>
>>      if (bs->bl.max_transfer_length &&
>>          nb_sectors > bs->bl.max_transfer_length) {
>>          int ret = 0;
>>          QEMUIOVector *qiov2 = NULL;
> Make it "QEMUIOVector qiov2;" on the stack.
>
>>          size_t soffset = 0;
>>
>>          trace_bdrv_co_do_readv_toobig(bs, sector_num, nb_sectors,
>> bs->bl.max_transfer_length);
>>
>>          qemu_iovec_init(qiov2, qiov->niov);
> And &qiov2 here, then this doesn't crash with a NULL dereference.
>
>>          while (nb_sectors > bs->bl.max_transfer_length && !ret) {
>>              qemu_iovec_reset(qiov2);
>>              qemu_iovec_concat(qiov2, qiov, soffset,
>>                                bs->bl.max_transfer_length << BDRV_SECTOR_BITS);
>>              ret = bdrv_co_do_preadv(bs, sector_num << BDRV_SECTOR_BITS,
>>                                      bs->bl.max_transfer_length << BDRV_SECTOR_BITS,
>>                                      qiov2, flags);
>>              soffset += bs->bl.max_transfer_length << BDRV_SECTOR_BITS;
>>              sector_num += bs->bl.max_transfer_length;
>>              nb_sectors -= bs->bl.max_transfer_length;
>>          }
>>          qemu_iovec_destroy(qiov2);
>>          if (ret) {
>>              return ret;
>>          }
> The error check needs to be immediately after the assignment of ret,
> otherwise the next loop iteration can overwrite an error with a success
> (and if it didn't, it would still do useless I/O because the request as
> a whole would fail anyway).

There is a && !ret in the loop condition. I wanted to avoid copying the destroy part.

BTW, is it !ret or ret < 0 ?

>
>>      }
>>
>>      return bdrv_co_do_preadv(bs, sector_num << BDRV_SECTOR_BITS,
>>                               nb_sectors << BDRV_SECTOR_BITS, qiov, flags);
> qiov doesn't work here for the splitting case. You need the remaining
> part, not the whole original qiov.

Right ;-)

Peter

  reply	other threads:[~2014-09-23  9:52 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-05 16:51 [Qemu-devel] [PATCH 0/4] introduce max_transfer_length Peter Lieven
2014-09-05 16:51 ` [Qemu-devel] [PATCH 1/4] BlockLimits: " Peter Lieven
2014-09-05 16:51 ` [Qemu-devel] [PATCH 2/4] block: immediately cancel oversized read/write requests Peter Lieven
2014-09-08 13:44   ` Benoît Canet
2014-09-08 13:49     ` Paolo Bonzini
2014-09-08 13:56       ` Peter Lieven
2014-09-08 13:58         ` Paolo Bonzini
2014-09-08 14:35           ` Peter Lieven
2014-09-08 14:42             ` Paolo Bonzini
2014-09-08 14:54               ` Peter Lieven
2014-09-23  8:47                 ` Kevin Wolf
2014-09-23  8:55                   ` Peter Lieven
2014-09-23  9:09                     ` Kevin Wolf
2014-09-08 15:13               ` ronnie sahlberg
2014-09-08 15:15                 ` Paolo Bonzini
2014-09-08 15:18                   ` Peter Lieven
2014-09-08 15:27                     ` Paolo Bonzini
2014-09-08 16:18                       ` Peter Lieven
2014-09-08 16:29                         ` Paolo Bonzini
2014-09-12 11:43                           ` Peter Lieven
2014-09-18 14:12                             ` Paolo Bonzini
2014-09-18 14:16                               ` Peter Lieven
2014-09-18 14:17                                 ` Paolo Bonzini
2014-09-18 22:57                                   ` Peter Lieven
2014-09-08 15:16                 ` Peter Lieven
2014-09-05 16:51 ` [Qemu-devel] [PATCH 3/4] block/iscsi: set max_transfer_length Peter Lieven
2014-09-05 16:51 ` [Qemu-devel] [PATCH 4/4] block: avoid creating oversized writes in multiwrite_merge Peter Lieven
2014-09-18 14:13   ` Paolo Bonzini
2014-09-18 22:56     ` Peter Lieven
2014-09-19 13:33       ` Paolo Bonzini
2014-09-22  9:43         ` Peter Lieven
2014-09-22 19:06           ` Paolo Bonzini
2014-09-23  6:15             ` Peter Lieven
2014-09-23  8:59               ` Kevin Wolf
2014-09-23  9:04                 ` Peter Lieven
2014-09-23  9:32                 ` Peter Lieven
2014-09-23  9:47                   ` Kevin Wolf
2014-09-23  9:52                     ` Peter Lieven [this message]
2014-09-23 10:05                       ` Kevin Wolf
2014-09-30  7:26                         ` Peter Lieven
2014-09-30  8:03                           ` Kevin Wolf
2014-09-05 17:05 ` [Qemu-devel] [PATCH 0/4] introduce max_transfer_length ronnie sahlberg
2014-09-05 19:52   ` Peter Lieven
2014-09-05 21:22     ` ronnie sahlberg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=542142D7.3040006@kamp.de \
    --to=pl@kamp.de \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=ronniesahlberg@gmail.com \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.