qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Klaus Jensen <its@irrelevant.dk>
Cc: Kevin Wolf <kwolf@redhat.com>, Fam Zheng <fam@euphon.net>,
	qemu-block@nongnu.org, Klaus Jensen <k.jensen@samsung.com>,
	qemu-devel@nongnu.org, Max Reitz <mreitz@redhat.com>,
	Keith Busch <kbusch@kernel.org>
Subject: Re: [PATCH RFC v5 12/12] hw/block/nvme: add persistence for zone info
Date: Mon, 30 Nov 2020 14:58:41 +0000	[thread overview]
Message-ID: <20201130145841.GA474479@stefanha-x1.localdomain> (raw)
In-Reply-To: <X8TspZJE+B551qno@apples.localdomain>

[-- Attachment #1: Type: text/plain, Size: 3712 bytes --]

On Mon, Nov 30, 2020 at 01:59:17PM +0100, Klaus Jensen wrote:
> On Nov 30 12:33, Stefan Hajnoczi wrote:
> > On Fri, Nov 27, 2020 at 12:46:01AM +0100, Klaus Jensen wrote:
> > > From: Klaus Jensen <k.jensen@samsung.com>
> > > 
> > > Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> > > ---
> > >  docs/specs/nvme.txt   |  15 +++
> > >  hw/block/nvme-ns.h    |  16 ++++
> > >  hw/block/nvme-ns.c    | 212 +++++++++++++++++++++++++++++++++++++++++-
> > >  hw/block/nvme.c       |  87 +++++++++++++++++
> > >  hw/block/trace-events |   2 +
> > >  5 files changed, 331 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/docs/specs/nvme.txt b/docs/specs/nvme.txt
> > > index 03bb4d9516b4..05d81c88ad4e 100644
> > > --- a/docs/specs/nvme.txt
> > > +++ b/docs/specs/nvme.txt
> > > @@ -20,6 +20,21 @@ The nvme device (-device nvme) emulates an NVM Express Controller.
> > >    `zns.mor`; Specifies the number of open resources available. This is a 0s
> > >       based value.
> > >  
> > > +  `zns.pstate`; This parameter specifies another blockdev to be used for
> > > +     storing zone state persistently.
> > > +
> > > +       -drive id=zns-pstate,file=zns-pstate.img,format=raw
> > > +       -device nvme-ns,zns.pstate=zns-pstate,...
> > > +
> > > +     To reset (or initialize) state, the blockdev image should be of zero size:
> > > +
> > > +       qemu-img create -f raw zns-pstate.img 0
> > > +
> > > +     The image will be intialized with a file format header and truncated to
> > > +     the required size. If the pstate given is of non-zero size, it will be
> > > +     assumed to already contain zone state information and the header will be
> > > +     checked.
> > 
> > In principle this makes sense but at first glance it looks like the code
> > is synchronous - it blocks the vCPU if zone metadata I/O is necessary.
> > That works for test/bring-up code but can't be used in production due to
> > the performance impact.
> > 
> > Is the expectation that the QEMU NVMe device emulation will only be used
> > for test/bring-up now and in the future?
> > 
> 
> Hi Stefan,
> 
> Thanks for taking a look at this.
> 
> I could see why someone would maybe use the core nvme emulation in
> production (but I'm not aware of anyone doing it), but the zoned
> emulation is *probably* not for production (and that is where the zone
> updates are needed). But someone could surprise me with a use case I
> guess.
> 
> And yes, I know this is synchronous in this version. I have not
> extensively evaluated the performance impact, but crucially the blocking
> only happens when the zone changes state (i.e. every write does NOT
> trigger a blk_pwrite just to persist the updated write pointer).
> 
> I know this can be done asynchronously (I have implemented it like so
> previously), but I wanted to get an opinion on the general stategry
> before adding that. The opposing strategy, is to use a some form of
> mmap/msync, but I, for one, pushed back against that because I'd like
> this to work on as many platforms as possible. Hence the RFC for this
> blockdev based approach.
> 
> But if you think a blockdev approach like this is a reasonable QEMU-esce
> way of doing it, then I'll proceed to do a v2 with asynchronous updates.

I don't have a strong opinion, especially since I haven't investigated
how NVMe Zoned Namespaces work (besides watching your interesting KVM
Forum presentation).

If you want to do the mmap approach, please use HostMemoryBackend for
that instead of open coding mmap(2) calls. That way it will support
several types of backing memory, including MAP_SHARED files, persistent
memory, etc.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

      parent reply	other threads:[~2020-11-30 15:03 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-26 23:45 [PATCH v5 00/12] hw/block/nvme: zoned namespace command set Klaus Jensen
2020-11-26 23:45 ` [PATCH v5 01/12] hw/block/nvme: Separate read and write handlers Klaus Jensen
2020-11-26 23:45 ` [PATCH v5 02/12] hw/block/nvme: Merge nvme_write_zeroes() with nvme_write() Klaus Jensen
2020-11-26 23:45 ` [PATCH v5 03/12] hw/block/nvme: add commands supported and effects log page Klaus Jensen
2020-11-26 23:45 ` [PATCH v5 04/12] hw/block/nvme: Generate namespace UUIDs Klaus Jensen
2020-11-26 23:45 ` [PATCH v5 05/12] hw/block/nvme: support namespace types Klaus Jensen
2020-11-26 23:45 ` [PATCH v5 06/12] hw/block/nvme: add basic read/write for zoned namespaces Klaus Jensen
2020-11-26 23:45 ` [PATCH v5 07/12] hw/block/nvme: add the zone management receive command Klaus Jensen
2020-11-26 23:45 ` [PATCH v5 08/12] hw/block/nvme: add the zone management send command Klaus Jensen
2020-11-26 23:45 ` [PATCH v5 09/12] hw/block/nvme: add the zone append command Klaus Jensen
2020-11-26 23:45 ` [PATCH v5 10/12] hw/block/nvme: track and enforce zone resources Klaus Jensen
2020-11-26 23:46 ` [PATCH v5 11/12] hw/block/nvme: allow open to close zone transitions by controller Klaus Jensen
2020-11-26 23:46 ` [PATCH RFC v5 12/12] hw/block/nvme: add persistence for zone info Klaus Jensen
2020-11-30 12:33   ` Stefan Hajnoczi
2020-11-30 12:59     ` Klaus Jensen
2020-11-30 13:18       ` Klaus Jensen
2020-11-30 14:58       ` Stefan Hajnoczi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201130145841.GA474479@stefanha-x1.localdomain \
    --to=stefanha@redhat.com \
    --cc=fam@euphon.net \
    --cc=its@irrelevant.dk \
    --cc=k.jensen@samsung.com \
    --cc=kbusch@kernel.org \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).