qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-block@nongnu.org
Cc: fam@euphon.net, kwolf@redhat.com, integration@gluster.org,
	berto@igalia.com, stefanha@redhat.com, pavel.dovgaluk@ispras.ru,
	sw@weilnetz.de, pl@kamp.de, qemu-devel@nongnu.org,
	mreitz@redhat.com, jsnow@redhat.com, ronniesahlberg@gmail.com,
	pbonzini@redhat.com, namei.unix@gmail.com, dillaman@redhat.com,
	ari@tuxera.com
Subject: Re: [PATCH v4 03/11] block: use int64_t instead of uint64_t in driver read handlers
Date: Tue, 11 May 2021 14:22:50 -0500	[thread overview]
Message-ID: <6d6b0719-4a43-7d2a-153f-cd19dfaa00e5@redhat.com> (raw)
In-Reply-To: <20210324205132.464899-4-vsementsov@virtuozzo.com>

On 3/24/21 3:51 PM, Vladimir Sementsov-Ogievskiy wrote:
> We are going to convert .bdrv_co_preadv_part and .bdrv_co_pwritev_part
> to int64_t type for offset and bytes parameters (as it's already done
> for generic block/io.c layer).
> 
> In qcow2 .bdrv_co_preadv_part is used in some places, so let's add
> corresponding checks and assertions.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Unusual line, especially since...

> 
> block: use int64_t instead of uint64_t in driver read handlers
> 
> We are generally moving to int64_t for both offset and bytes parameters
> on all io paths.
> 
> Main motivation is realization of 64-bit write_zeroes operation for
> fast zeroing large disk chunks, up to the whole disk.
> 
> We chose signed type, to be consistent with off_t (which is signed) and
> with possibility for signed return type (where negative value means
> error).
> 
> So, convert driver read handlers parameters which are already 64bit to
> signed type.
> 
> While being here, convert also flags parameter to be BdrvRequestFlags.
> 
> Now let's consider all callers. Simple
> 
>   git grep '\->bdrv_\(aio\|co\)_preadv\(_part\)\?'
> 
> shows that's there three callers of driver function:
> 
>  bdrv_driver_preadv() in block/io.c, passes int64_t, checked by
>    bdrv_check_qiov_request() to be non-negative.
> 
>  qcow2_load_vmstate() does bdrv_check_qiov_request().
> 
>  do_perform_cow_read() has uint64_t argument. And a lot of things in
>  qcow2 driver are uint64_t, so converting it is big job. But we must
>  not work with requests that doesn't satisfy bdrv_check_qiov_request(),

s/doesn't/don't/

>  so let's just assert it here.
> 
> Still, the functions may be called directly, not only by drv->...
> Let's check:
> 
> git grep '\.bdrv_\(aio\|co\)_preadv\(_part\)\?\s*=' | \
> awk '{print $4}' | sed 's/,//' | sed 's/&//' | sort | uniq | \
> while read func; do git grep "$func(" | \
> grep -v "$func(BlockDriverState"; done
> 
> The only one such caller:
> 
>     QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, &data, 1);
>     ...
>     ret = bdrv_replace_test_co_preadv(bs, 0, 1, &qiov, 0);
> 
> in tesTS/unit/test-bdrv-drain.c, and it's OK obviously.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

...it is repeated here. I'm fine dropping the first.

> ---
>  include/block/block_int.h        | 11 ++++++-----
>  block/backup-top.c               |  4 ++--

>  35 files changed, 120 insertions(+), 89 deletions(-)
> 
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index db7a909ea9..e6bcf74e46 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -234,8 +234,8 @@ struct BlockDriver {
>  
>      /* aio */
>      BlockAIOCB *(*bdrv_aio_preadv)(BlockDriverState *bs,
> -        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags,
> -        BlockCompletionFunc *cb, void *opaque);
> +        int64_t offset, int64_t bytes, QEMUIOVector *qiov,
> +        BdrvRequestFlags flags, BlockCompletionFunc *cb, void *opaque);
>      BlockAIOCB *(*bdrv_aio_pwritev)(BlockDriverState *bs,
>          uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags,
>          BlockCompletionFunc *cb, void *opaque);
> @@ -264,10 +264,11 @@ struct BlockDriver {
>       * The buffer in @qiov may point directly to guest memory.
>       */
>      int coroutine_fn (*bdrv_co_preadv)(BlockDriverState *bs,
> -        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags);
> +        int64_t offset, int64_t bytes, QEMUIOVector *qiov,
> +        BdrvRequestFlags flags);
>      int coroutine_fn (*bdrv_co_preadv_part)(BlockDriverState *bs,
> -        uint64_t offset, uint64_t bytes,
> -        QEMUIOVector *qiov, size_t qiov_offset, int flags);
> +        int64_t offset, int64_t bytes,
> +        QEMUIOVector *qiov, size_t qiov_offset, BdrvRequestFlags flags);

Lots of fallout, but I'm in favor of this signature change.


> +++ b/block/blkdebug.c
> @@ -614,8 +614,8 @@ static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
>  }
>  
>  static int coroutine_fn
> -blkdebug_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
> -                   QEMUIOVector *qiov, int flags)
> +blkdebug_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes,
> +                   QEMUIOVector *qiov, BdrvRequestFlags flags)
>  {
>      int err;

Still calls rule_check() with uint64_t parameters, but since we've
asserted the caller was within range, no behavior change.

> +++ b/block/blkverify.c
> @@ -221,8 +221,8 @@ blkverify_co_prwv(BlockDriverState *bs, BlkverifyRequest *r, uint64_t offset,
>  }
>  
>  static int coroutine_fn
> -blkverify_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
> -                    QEMUIOVector *qiov, int flags)
> +blkverify_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes,
> +                    QEMUIOVector *qiov, BdrvRequestFlags flags)
>  {

Similarly for the call to blkverify_co_prwv(), and probably elsewhere in
the patch.

> +++ b/block/crypto.c
> @@ -397,8 +397,8 @@ static int block_crypto_reopen_prepare(BDRVReopenState *state,
>  #define BLOCK_CRYPTO_MAX_IO_SIZE (1024 * 1024)
>  
>  static coroutine_fn int
> -block_crypto_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
> -                       QEMUIOVector *qiov, int flags)
> +block_crypto_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes,
> +                       QEMUIOVector *qiov, BdrvRequestFlags flags)
>  {
>      BlockCrypto *crypto = bs->opaque;
>      uint64_t cur_bytes; /* number of bytes in current iteration */

We could perhaps change the type of local variables like cur_bytes and
bytes_done; but it doesn't affect semantics.

> +++ b/block/curl.c
> @@ -896,7 +896,8 @@ out:
>  }
>  
>  static int coroutine_fn curl_co_preadv(BlockDriverState *bs,
> -        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
> +        int64_t offset, int64_t bytes, QEMUIOVector *qiov,
> +        BdrvRequestFlags flags)
>  {
>      CURLAIOCB acb = {
>          .co = qemu_coroutine_self(),

Likewise changing the type of CURLAIOCB.offset and .bytes.  Cleanups
like that can be followups.

> +++ b/block/file-posix.c
> @@ -2033,9 +2033,9 @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
>      return raw_thread_pool_submit(bs, handle_aiocb_rw, &acb);
>  }
>  
> -static int coroutine_fn raw_co_preadv(BlockDriverState *bs, uint64_t offset,
> -                                      uint64_t bytes, QEMUIOVector *qiov,
> -                                      int flags)
> +static int coroutine_fn raw_co_preadv(BlockDriverState *bs, int64_t offset,
> +                                      int64_t bytes, QEMUIOVector *qiov,
> +                                      BdrvRequestFlags flags)
>  {
>      return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_READ);

Similarly for fixing the signature of raw_co_prw() (after the
counterpart change to raw_co_pwritev)

> +++ b/block/nvme.c
> @@ -1221,8 +1221,9 @@ static int nvme_co_prw(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
>  }
>  
>  static coroutine_fn int nvme_co_preadv(BlockDriverState *bs,
> -                                       uint64_t offset, uint64_t bytes,
> -                                       QEMUIOVector *qiov, int flags)
> +                                       int64_t offset, int64_t bytes,
> +                                       QEMUIOVector *qiov,
> +                                       BdrvRequestFlags flags)
>  {
>      return nvme_co_prw(bs, offset, bytes, qiov, false, flags);
>  }

And for nvme_co_prw().

> +++ b/block/qcow2.c
> @@ -2281,9 +2281,10 @@ static coroutine_fn int qcow2_co_preadv_task_entry(AioTask *task)
>  }
>  
>  static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
> -                                             uint64_t offset, uint64_t bytes,
> +                                             int64_t offset, int64_t bytes,
>                                               QEMUIOVector *qiov,
> -                                             size_t qiov_offset, int flags)
> +                                             size_t qiov_offset,
> +                                             BdrvRequestFlags flags)
>  {
>      BDRVQcow2State *s = bs->opaque;
>      int ret = 0;

Unrelated to this patch; the loop sets cur_bytes = MIN(bytes, INT_MAX);
but it would be smarter to choose a cluster-aligned max instead of
INT_MAX.  It doesn't matter as long as the block layer has pre-clamped
requests to be < 32 bit to begin with, and then our later call to
qcow2_get_host_offset() further clamps it down to actual cluster
boundaries.  But it does look odd, compared to computing the original
MIN() to an aligned boundary in the first place, whether or not we ever
change the block layer to allow individual drivers to opt in to reading
more than 2G in one transaction.

qcow2_add_task() is another internal function worth improving in a followup.

> +++ b/block/quorum.c
> @@ -663,8 +663,8 @@ static int read_fifo_child(QuorumAIOCB *acb)
>      return ret;
>  }
>  
> -static int quorum_co_preadv(BlockDriverState *bs, uint64_t offset,
> -                            uint64_t bytes, QEMUIOVector *qiov, int flags)
> +static int quorum_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes,
> +                            QEMUIOVector *qiov, BdrvRequestFlags flags)
>  {
>      BDRVQuorumState *s = bs->opaque;
>      QuorumAIOCB *acb = quorum_aio_get(bs, qiov, offset, bytes, flags);

and quorum_aio_get().

> diff --git a/block/raw-format.c b/block/raw-format.c
> index 7717578ed6..e3f459b2cb 100644
> --- a/block/raw-format.c
> +++ b/block/raw-format.c
> @@ -181,8 +181,8 @@ static void raw_reopen_abort(BDRVReopenState *state)
>  }
>  
>  /* Check and adjust the offset, against 'offset' and 'size' options. */
> -static inline int raw_adjust_offset(BlockDriverState *bs, uint64_t *offset,
> -                                    uint64_t bytes, bool is_write)
> +static inline int raw_adjust_offset(BlockDriverState *bs, int64_t *offset,
> +                                    int64_t bytes, bool is_write)
>  {
>      BDRVRawState *s = bs->opaque;
>  
> @@ -201,9 +201,9 @@ static inline int raw_adjust_offset(BlockDriverState *bs, uint64_t *offset,
>      return 0;
>  }
>  
> -static int coroutine_fn raw_co_preadv(BlockDriverState *bs, uint64_t offset,
> -                                      uint64_t bytes, QEMUIOVector *qiov,
> -                                      int flags)
> +static int coroutine_fn raw_co_preadv(BlockDriverState *bs, int64_t offset,
> +                                      int64_t bytes, QEMUIOVector *qiov,
> +                                      BdrvRequestFlags flags)
>  {
>      int ret;
>  
> @@ -259,7 +259,7 @@ static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset,
>          qiov = &local_qiov;
>      }
>  
> -    ret = raw_adjust_offset(bs, &offset, bytes, true);
> +    ret = raw_adjust_offset(bs, (int64_t *)&offset, bytes, true);

Ugly type-punning; thankfully it's transient until the counterpart patch
to driver write handlers.

>      if (ret) {
>          goto fail;
>      }
> @@ -294,7 +294,7 @@ static int coroutine_fn raw_co_pwrite_zeroes(BlockDriverState *bs,
>  {
>      int ret;
>  
> -    ret = raw_adjust_offset(bs, (uint64_t *)&offset, bytes, true);
> +    ret = raw_adjust_offset(bs, (int64_t *)&offset, bytes, true);

And now you should lose this cast...

>      if (ret) {
>          return ret;
>      }
> @@ -306,7 +306,7 @@ static int coroutine_fn raw_co_pdiscard(BlockDriverState *bs,
>  {
>      int ret;
>  
> -    ret = raw_adjust_offset(bs, (uint64_t *)&offset, bytes, true);
> +    ret = raw_adjust_offset(bs, &offset, bytes, true);

...like you did here.

>      if (ret) {
>          return ret;
>      }
> @@ -541,7 +541,7 @@ static int coroutine_fn raw_co_copy_range_from(BlockDriverState *bs,
>  {
>      int ret;
>  
> -    ret = raw_adjust_offset(bs, &src_offset, bytes, false);
> +    ret = raw_adjust_offset(bs, (int64_t *)&src_offset, bytes, false);
>      if (ret) {
>          return ret;
>      }
> @@ -560,7 +560,7 @@ static int coroutine_fn raw_co_copy_range_to(BlockDriverState *bs,
>  {
>      int ret;
>  
> -    ret = raw_adjust_offset(bs, &dst_offset, bytes, true);
> +    ret = raw_adjust_offset(bs, (int64_t *)&dst_offset, bytes, true);

Also transient casts.

Easy enough to fix my nits for qcow2 and the commit message.
Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



  reply	other threads:[~2021-05-11 19:24 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-24 20:51 [PATCH v4 00/11] 64bit block-layer: part II Vladimir Sementsov-Ogievskiy
2021-03-24 20:51 ` [PATCH v4 01/11] block/io: bring request check to bdrv_co_{read, write}v_vmstate Vladimir Sementsov-Ogievskiy via
2021-05-11 17:54   ` [PATCH v4 01/11] block/io: bring request check to bdrv_co_{read,write}v_vmstate Eric Blake
2021-03-24 20:51 ` [PATCH v4 02/11] qcow2: check request on vmstate save/load path Vladimir Sementsov-Ogievskiy
2021-05-11 17:57   ` Eric Blake
2021-03-24 20:51 ` [PATCH v4 03/11] block: use int64_t instead of uint64_t in driver read handlers Vladimir Sementsov-Ogievskiy
2021-05-11 19:22   ` Eric Blake [this message]
2021-05-24 12:57     ` Vladimir Sementsov-Ogievskiy
2021-03-24 20:51 ` [PATCH v4 04/11] block: use int64_t instead of uint64_t in driver write handlers Vladimir Sementsov-Ogievskiy
2021-05-11 21:00   ` Eric Blake
2021-03-24 20:51 ` [PATCH v4 05/11] block: use int64_t instead of uint64_t in copy_range driver handlers Vladimir Sementsov-Ogievskiy
2021-05-11 21:14   ` Eric Blake
2021-03-24 20:51 ` [PATCH v4 06/11] block: make BlockLimits::max_pwrite_zeroes 64bit Vladimir Sementsov-Ogievskiy
2021-05-11 21:19   ` Eric Blake
2021-05-12  6:33     ` Vladimir Sementsov-Ogievskiy
2021-03-24 20:51 ` [PATCH v4 07/11] block: use int64_t instead of int in driver write_zeroes handlers Vladimir Sementsov-Ogievskiy
2021-03-24 20:51 ` [PATCH v4 08/11] block/io: allow 64bit write-zeroes requests Vladimir Sementsov-Ogievskiy
2021-03-24 20:51 ` [PATCH v4 09/11] block: make BlockLimits::max_pdiscard 64bit Vladimir Sementsov-Ogievskiy
2021-03-24 20:51 ` [PATCH v4 10/11] block: use int64_t instead of int in driver discard handlers Vladimir Sementsov-Ogievskiy
2021-03-24 20:51 ` [PATCH v4 11/11] block/io: allow 64bit discard requests Vladimir Sementsov-Ogievskiy
2021-03-24 21:13 ` [PATCH v4 00/11] 64bit block-layer: part II no-reply
2021-03-25  7:42   ` Vladimir Sementsov-Ogievskiy
2021-03-25  8:10     ` Vladimir Sementsov-Ogievskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6d6b0719-4a43-7d2a-153f-cd19dfaa00e5@redhat.com \
    --to=eblake@redhat.com \
    --cc=ari@tuxera.com \
    --cc=berto@igalia.com \
    --cc=dillaman@redhat.com \
    --cc=fam@euphon.net \
    --cc=integration@gluster.org \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=namei.unix@gmail.com \
    --cc=pavel.dovgaluk@ispras.ru \
    --cc=pbonzini@redhat.com \
    --cc=pl@kamp.de \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=ronniesahlberg@gmail.com \
    --cc=stefanha@redhat.com \
    --cc=sw@weilnetz.de \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).