From: Kevin Wolf <kwolf@redhat.com>
To: Wouter Verhelst <w@uter.be>
Cc: "nbd-general@lists.sourceforge.net"
	<nbd-general@lists.sourceforge.net>,
	"Denis V. Lunev" <den@openvz.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Alex Bligh <alex@alex.org.uk>
Subject: Re: [Qemu-devel] [Nbd] [RFC 1/1] nbd (specification): add NBD_CMD_WRITE_ZEROES command
Date: Fri, 4 Mar 2016 10:54:13 +0100
Message-ID: <20160304095413.GC4366@noname.redhat.com>
In-Reply-To: <20160304084911.GA5955@grep.be>

On 04.03.2016 at 09:49, Wouter Verhelst wrote:
> Hi folks,
> 
> (sorry about the lateness of this reply, was busy for the last few weeks)
> 
> On Thu, Feb 18, 2016 at 11:34:04AM +0300, Denis V. Lunev wrote:
> > On 02/18/2016 11:09 AM, Alex Bligh wrote:
> > > On 17 Feb 2016, at 18:10, Denis V. Lunev <den@openvz.org> wrote:
> > >
> > >> The currently available NBD_CMD_TRIM command cannot be used, as the
> > >> specification explicitly says that "a client MUST NOT make any
> > >> assumptions about the contents of the export affected by this
> > >> [NBD_CMD_TRIM] command, until overwriting it again with `NBD_CMD_WRITE`"
> > > Would a flag to NBD_CMD_TRIM that says "ensure the written
> > > data is zeroed" not be an easier solution than adding another
> > > very similar command?
> > >
> > > Or (cough) changing the spec?
> > >
> > From the point of view of the receiver, the situation (from my POV)
> > could be different. Let us assume that we are writing to a plain
> > file.
> > 
> > There are 2 types of queries:
> > - pls make the target sparse, i.e. perform FALLOC_FL_PUNCH_HOLE
> >    and there is no problem that the operation could not be performed,
> >    this is a hint;
> 
> This is what NBD_CMD_TRIM does, currently.
> 
> The reason this is a hint, is that there is no guarantee that the
> underlying operating system or storage even supports
> FALLOC_FL_PUNCH_HOLE (or similar). We could have made NBD_CMD_TRIM fail
> with a "not possible on this export" kind of error in that case, but it
> was chosen not to do that (for reasons I don't remember; maybe we just
> didn't consider this enough).
> 
> This could be remedied if the client could somehow ask what the result
> of a TRIM command would be; i.e., if the server has support for
> FALLOC_FL_PUNCH_HOLE, it could set a flag which would let the client
> know that NBD_CMD_TRIM will zero out bytes. If the server doesn't set
> that flag and the client requires zeroes, it could then just issue a
> WRITE command, followed (maybe) by a TRIM for the same region (which
> would be less optimal, but have the same result with older servers)
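
A minimal sketch of the client-side fallback described in the quoted
paragraph above. Every name here is hypothetical: the NBD specification
defines neither a FLAG_TRIM_ZEROES export flag nor this client API, and
FakeClient is only an in-memory stand-in so that the logic can be run.

```python
# Hypothetical names throughout: neither the flag nor the client methods
# come from the NBD specification.
FLAG_TRIM_ZEROES = 1 << 0  # assumed "TRIM reads back as zeroes" export flag


class FakeClient:
    """In-memory stand-in for an NBD client, for illustration only."""

    def __init__(self, size, export_flags=0):
        self.data = bytearray(b"\xff" * size)
        self.export_flags = export_flags
        self.log = []  # commands issued, in order

    def write(self, offset, payload):
        self.data[offset:offset + len(payload)] = payload
        self.log.append("WRITE")

    def trim(self, offset, length):
        # TRIM is only a hint: this stand-in changes the contents only
        # when the (assumed) flag promises zeroes.
        if self.export_flags & FLAG_TRIM_ZEROES:
            self.data[offset:offset + length] = bytes(length)
        self.log.append("TRIM")


def zero_range(client, offset, length, also_trim=False):
    """Zero a range; rely on TRIM alone only if the server promises zeroes."""
    if client.export_flags & FLAG_TRIM_ZEROES:
        client.trim(offset, length)
        return
    # Older server: send explicit zeroes, optionally followed by a TRIM.
    client.write(offset, bytes(length))
    if also_trim:
        client.trim(offset, length)
```

Against a server that does not advertise the flag, zero_range() costs a
full WRITE of zeroes; the optional TRIM afterwards is the "(maybe)" step
from the quoted text.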

NBD_CMD_TRIM covers the case "I don't need this data any more, you can
throw it away", and I think treating that purely as a hint is perfectly
fine.

> > - pls write the following amount of zeroes in either way (even calling
> >    write directly), i.e. ensure that the data is zeroed and the space on
> >    the file system is allocated for that.
> 
> IOW, you *don't* want to have a sparse file in that case? Or do I
> misunderstand things here?

I think what we're looking for is more like "zero out this area, feel
free to use whatever method is most efficient to achieve that".

So if the server knows that the backing store supports an efficient way
to write zeros (e.g. FALLOC_FL_ZERO_RANGE), it will use that. Otherwise,
if TRIM works and we know that the result is zeroed space instead of
undefined contents, the server is free to use it. And if even that
fails, it just falls back to an explicit write of a zeroed buffer.
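
The fallback order above could be sketched roughly as follows. This is
an illustration under assumptions, not code from any NBD server: the
fast_paths callables are hypothetical wrappers (for example around
fallocate(2) with FALLOC_FL_ZERO_RANGE or FALLOC_FL_PUNCH_HOLE), and the
set of errno values treated as "not supported" is my own choice.

```python
import errno
import os

# errno values treated as "this method is not available here" (assumption).
_UNSUPPORTED = {errno.EOPNOTSUPP, errno.ENOSYS}
_CHUNK = 64 * 1024  # size of the zeroed buffer for the last-resort path


def write_zeroes(fd, offset, length, fast_paths=()):
    """Zero [offset, offset + length) on fd; return the name of the method used.

    fast_paths: callables taking (fd, offset, length), tried in order.
    Each one raises OSError if it cannot do the job.
    """
    for fast_path in fast_paths:
        try:
            fast_path(fd, offset, length)
            return fast_path.__name__
        except OSError as exc:
            if exc.errno not in _UNSUPPORTED:
                raise  # a real I/O error, not a missing feature
    # Last resort: explicitly write a zeroed buffer, chunk by chunk.
    zeros = bytes(min(_CHUNK, length))
    done = 0
    while done < length:
        n = min(len(zeros), length - done)
        done += os.pwrite(fd, zeros[:n], offset + done)
    return "explicit_write"
```

A hole-punching wrapper belongs in fast_paths only when the server knows
that trimmed space reads back as zeroes; otherwise it has to be left out
and the explicit write is used.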

If we want, we can give the client a little more control over whether
or not discarding in the process is allowed (or maybe even preferred).
qemu's interface for writing zeros has a BDRV_REQ_MAY_UNMAP flag, for
example.

Kevin


Thread overview: 22+ messages
2016-02-17 18:10 [Qemu-devel] [RFC 1/1] nbd (specification): add NBD_CMD_WRITE_ZEROES command Denis V. Lunev
2016-02-17 20:58 ` Eric Blake
2016-02-18  4:46   ` Denis V. Lunev
2016-02-18  8:30     ` Denis V. Lunev
2016-02-18  9:18   ` Roman Kagan
2016-02-18 10:36     ` Denis V. Lunev
2016-02-18 16:35     ` Eric Blake
2016-02-18 17:23       ` [Qemu-devel] SUMMARY: " Denis V. Lunev
2016-02-18 17:55         ` Eric Blake
2016-02-18 19:29         ` [Qemu-devel] [Nbd] " Alex Bligh
2016-02-19  7:12         ` [Qemu-devel] " Denis V. Lunev
2016-02-19  8:56           ` Vladimir Sementsov-Ogievskiy
2016-02-19  9:11           ` Daniel P. Berrange
2016-02-18 12:14   ` [Qemu-devel] " Daniel P. Berrange
2016-02-18 14:05     ` Denis V. Lunev
2016-02-18  8:09 ` Alex Bligh
2016-02-18  8:34   ` Denis V. Lunev
2016-03-04  8:49     ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-04  9:54       ` Kevin Wolf [this message]
2016-03-04 14:03         ` Paolo Bonzini
2016-03-06 10:28           ` Wouter Verhelst
2016-03-06 18:54             ` Denis V. Lunev
