All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Richard W.M. Jones" <rjones@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Cc: fam@euphon.net, kwolf@redhat.com, berto@igalia.com,
	qemu-block@nongnu.org, qemu-devel@nongnu.org, mreitz@redhat.com,
	stefanha@redhat.com, den@openvz.org
Subject: Re: [PATCH 4/4] block: introduce BDRV_MAX_LENGTH
Date: Fri, 8 Jan 2021 11:02:41 +0000	[thread overview]
Message-ID: <20210108110241.GF30079@redhat.com> (raw)
In-Reply-To: <d3413d86-845e-ec9f-83b7-75c0720c1656@virtuozzo.com>

On Fri, Jan 08, 2021 at 01:51:35PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> 07.01.2021 12:58, Richard W.M. Jones wrote:
> >On Fri, Dec 04, 2020 at 01:27:13AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> >>Finally to be safe with calculations, to not calculate different
> >>maximums for different nodes (depending on cluster size and
> >>request_alignment), let's simply set QEMU_ALIGN_DOWN(INT64_MAX, 2^30)
> >>as absolute maximum bytes length for Qemu. Actually, it's not much less
> >>than INT64_MAX.
> >
> >>+/*
> >>+ * We want allow aligning requests and disk length up to any 32bit alignment
> >>+ * and don't afraid of overflow.
> >>+ * To achieve it, and in the same time use some pretty number as maximum disk
> >>+ * size, let's define maximum "length" (a limit for any offset/bytes request and
> >>+ * for disk size) to be the greatest power of 2 less than INT64_MAX.
> >>+ */
> >>+#define BDRV_MAX_ALIGNMENT (1L << 30)
> >>+#define BDRV_MAX_LENGTH (QEMU_ALIGN_DOWN(INT64_MAX, BDRV_MAX_ALIGNMENT))
> >
> >This change broke nbdkit tests.
> >
> >We test that qemu can handle a qemu NBD export of size 2^63 - 512, the
> >largest size that (experimentally) we found qemu could safely handle.
> >eg:
> >
> >   https://github.com/libguestfs/nbdkit/blob/master/tests/test-memory-largest-for-qemu.sh
> >
> >Before this commit:
> >
> >   $ nbdkit memory $(( 2**63 - 512 )) --run './qemu-img info "$uri"'
> >   image: nbd://localhost:10809
> >   file format: raw
> >   virtual size: 8 EiB (9223372036854775296 bytes)
> >   disk size: unavailable
> >
> >After this commit:
> >
> >   $ nbdkit memory $(( 2**63 - 512 )) --run './qemu-img info "$uri"'
> >   qemu-img: Could not open 'nbd://localhost:10809': Could not refresh total sector count: File too large
> >
> >Can I confirm that this limit is now the new official one and we
> >should adjust nbdkit tests?  Or was this change unintentional given
> >that qemu seemed happy to handle 2^63 - 512 disks before?
> >
> >Note that nbdkit & libnbd support up to 2^63 - 1 bytes (we are not
> >limited to whole sectors).  Also the Linux kernel will let you create
> >a /dev/nbdX device of size 2^63 - 1.
> >
> 
> Hi Rich! The change is intentional.
>
> I think the benefit of having clean limit, allowing us to align up
> bytes to some alignment (which is a common operation) exceeds the
> loss of 1G at the end of 2**63 range. We get simpler and safer
> code. And anyway, new limit is not much worse than 2**63.

That's fine, as long as we settle on this.  I've updated the nbdkit tests:

https://github.com/libguestfs/nbdkit/commit/c3ec8c951e39a0f921252c162c236f23c588d2bd

> If at some
> point we have a problem with it being too restrictive, it's than
> very likely that 2**63 would be too small as well, which will
> require so much work that our a bit more restrictive limit is
> unlikely to increase the difficulty.

The next step is definitely working on 128 bit offsets!
https://rwmj.wordpress.com/2011/10/03/when-will-disk-sizes-go-beyond-64-bits/

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/



  reply	other threads:[~2021-01-08 11:03 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-03 22:27 [PATCH 0/4] block: prepare for 64bit Vladimir Sementsov-Ogievskiy
2020-12-03 22:27 ` [PATCH 1/4] block/file-posix: fix workaround in raw_do_pwrite_zeroes() Vladimir Sementsov-Ogievskiy
2020-12-03 22:27 ` [PATCH 2/4] block/io: bdrv_refresh_limits(): use ERRP_GUARD Vladimir Sementsov-Ogievskiy
2020-12-04 15:08   ` Alberto Garcia
2020-12-03 22:27 ` [PATCH 3/4] block/io: bdrv_check_byte_request(): drop bdrv_is_inserted() Vladimir Sementsov-Ogievskiy
2020-12-04 15:16   ` Alberto Garcia
2020-12-03 22:27 ` [PATCH 4/4] block: introduce BDRV_MAX_LENGTH Vladimir Sementsov-Ogievskiy
2021-01-07  9:58   ` Richard W.M. Jones
2021-01-07 10:56     ` Richard W.M. Jones
2021-01-07 12:20       ` Richard W.M. Jones
2021-01-08 11:14         ` Vladimir Sementsov-Ogievskiy
2021-01-08 11:27           ` Daniel P. Berrangé
2021-02-04 14:05         ` Eric Blake
2021-01-08 10:51     ` Vladimir Sementsov-Ogievskiy
2021-01-08 11:02       ` Richard W.M. Jones [this message]
2020-12-08 17:13 ` [PATCH 0/4] block: prepare for 64bit Kevin Wolf
2020-12-08 17:32   ` Vladimir Sementsov-Ogievskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210108110241.GF30079@redhat.com \
    --to=rjones@redhat.com \
    --cc=berto@igalia.com \
    --cc=den@openvz.org \
    --cc=fam@euphon.net \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.