All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Shivaprasad G Bhat <sbhat@linux.ibm.com>,
	David Gibson <david@gibson.dropbear.id.au>,
	Greg Kurz <groug@kaod.org>,
	qemu-ppc@nongnu.org, Eduardo Habkost <ehabkost@redhat.com>,
	Marcel Apfelbaum <marcel.apfelbaum@gmail.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
	Peter Maydell <peter.maydell@linaro.org>,
	Eric Blake <eblake@redhat.com>,
	qemu-arm@nongnu.org, richard.henderson@linaro.org,
	Paolo Bonzini <pbonzini@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Haozhong Zhang <haozhong.zhang@intel.com>,
	shameerali.kolothum.thodi@huawei.com, kwangwoo.lee@sk.com,
	Markus Armbruster <armbru@redhat.com>,
	Qemu Developers <qemu-devel@nongnu.org>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	kvm-ppc@vger.kernel.org, shivaprasadbhat@gmail.com,
	bharata@linux.vnet.ibm.com
Subject: Re: [PATCH v4 0/3] nvdimm: Enable sync-dax property for nvdimm
Date: Tue, 4 May 2021 07:43:53 +0200	[thread overview]
Message-ID: <CAM9Jb+g8bKF0Z7za4sZpc2tZ01Sp4c4FEaV65He8w1+QOL3_yw@mail.gmail.com> (raw)
In-Reply-To: <d21fcac6-6a54-35fd-3088-6c56b85fbf25@linux.ibm.com>

> > The proposal that "sync-dax=unsafe" for non-PPC architectures, is a
> > fundamental misrepresentation of how this is supposed to work. Rather
> > than make "sync-dax" a first class citizen of the device-description
> > interface I'm proposing that you make this a separate device-type.
> > This also solves the problem that "sync-dax" with an implicit
> > architecture backend assumption is not precise, but a new "non-nvdimm"
> > device type would make it explicit what the host is advertising to the
> > guest.
> >
>
> Currently, users can use a virtualized nvdimm support in Qemu to share
> host page cache to the guest via the below command
>
> -object memory-backend-file,id=memnvdimm1,mem-path=file_name_in_host_fs
> -device nvdimm,memdev=memnvdimm1
>
> Such usage can results in wrong application behavior because there is no
> hint to the application/guest OS that a cpu cache flush is not
> sufficient to ensure persistence.
>
> I understand that virio-pmem is suggested as an alternative for that.
> But why not fix virtualized nvdimm if platforms can express the details.
>
> ie, can ACPI indicate to the guest OS that the device need a flush
> mechanism to ensure persistence in the above case?
>
> What this patch series did was to express that property via a device
> tree node and guest driver enables a hypercall based flush mechanism to
> ensure persistence.

Would VIRTIO (entirely asynchronous, no trap at host side) based
mechanism is better
than hyper-call based? Registering memory can be done any way. We
implemented virtio-pmem
flush mechanisms with below considerations:

- Proper semantic for guest flush requests.
- Efficient mechanism for performance pov.

I am just asking myself if we have platform agnostic mechanism already
there, maybe
we can extend it to suit our needs? Maybe I am missing some points here.

> >> On PPC, the default is "sync-dax=writeback" - so the ND_REGION_ASYNC
> >>
> >> is set for the region and the guest makes hcalls to issue fsync on the host.
> >>
> >>
> >> Are you suggesting me to keep it "unsafe" as default for all architectures
> >>
> >> including PPC and a user can set it to "writeback" if desired.
> >
> > No, I am suggesting that "sync-dax" is insufficient to convey this
> > property. This behavior warrants its own device type, not an ambiguous
> > property of the memory-backend-file with implicit architecture
> > assumptions attached.
> >
>
> Why is it insufficient?  Is it because other architectures don't have an
> ability express this detail to guest OS? Isn't that an arch limitations?
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

WARNING: multiple messages have this Message-ID (diff)
From: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
	Shivaprasad G Bhat <sbhat@linux.ibm.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Qemu Developers <qemu-devel@nongnu.org>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	Markus Armbruster <armbru@redhat.com>,
	bharata@linux.vnet.ibm.com,
	Haozhong Zhang <haozhong.zhang@intel.com>,
	Eduardo Habkost <ehabkost@redhat.com>,
	richard.henderson@linaro.org, Greg Kurz <groug@kaod.org>,
	kvm-ppc@vger.kernel.org, qemu-arm@nongnu.org,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Dan Williams <dan.j.williams@intel.com>,
	kwangwoo.lee@sk.com, David Gibson <david@gibson.dropbear.id.au>,
	Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
	shameerali.kolothum.thodi@huawei.com, shivaprasadbhat@gmail.com,
	qemu-ppc@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH v4 0/3] nvdimm: Enable sync-dax property for nvdimm
Date: Tue, 4 May 2021 07:43:53 +0200	[thread overview]
Message-ID: <CAM9Jb+g8bKF0Z7za4sZpc2tZ01Sp4c4FEaV65He8w1+QOL3_yw@mail.gmail.com> (raw)
In-Reply-To: <d21fcac6-6a54-35fd-3088-6c56b85fbf25@linux.ibm.com>

> > The proposal that "sync-dax=unsafe" for non-PPC architectures, is a
> > fundamental misrepresentation of how this is supposed to work. Rather
> > than make "sync-dax" a first class citizen of the device-description
> > interface I'm proposing that you make this a separate device-type.
> > This also solves the problem that "sync-dax" with an implicit
> > architecture backend assumption is not precise, but a new "non-nvdimm"
> > device type would make it explicit what the host is advertising to the
> > guest.
> >
>
> Currently, users can use a virtualized nvdimm support in Qemu to share
> host page cache to the guest via the below command
>
> -object memory-backend-file,id=memnvdimm1,mem-path=file_name_in_host_fs
> -device nvdimm,memdev=memnvdimm1
>
> Such usage can results in wrong application behavior because there is no
> hint to the application/guest OS that a cpu cache flush is not
> sufficient to ensure persistence.
>
> I understand that virio-pmem is suggested as an alternative for that.
> But why not fix virtualized nvdimm if platforms can express the details.
>
> ie, can ACPI indicate to the guest OS that the device need a flush
> mechanism to ensure persistence in the above case?
>
> What this patch series did was to express that property via a device
> tree node and guest driver enables a hypercall based flush mechanism to
> ensure persistence.

Would VIRTIO (entirely asynchronous, no trap at host side) based
mechanism is better
than hyper-call based? Registering memory can be done any way. We
implemented virtio-pmem
flush mechanisms with below considerations:

- Proper semantic for guest flush requests.
- Efficient mechanism for performance pov.

I am just asking myself if we have platform agnostic mechanism already
there, maybe
we can extend it to suit our needs? Maybe I am missing some points here.

> >> On PPC, the default is "sync-dax=writeback" - so the ND_REGION_ASYNC
> >>
> >> is set for the region and the guest makes hcalls to issue fsync on the host.
> >>
> >>
> >> Are you suggesting me to keep it "unsafe" as default for all architectures
> >>
> >> including PPC and a user can set it to "writeback" if desired.
> >
> > No, I am suggesting that "sync-dax" is insufficient to convey this
> > property. This behavior warrants its own device type, not an ambiguous
> > property of the memory-backend-file with implicit architecture
> > assumptions attached.
> >
>
> Why is it insufficient?  Is it because other architectures don't have an
> ability express this detail to guest OS? Isn't that an arch limitations?


WARNING: multiple messages have this Message-ID (diff)
From: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Shivaprasad G Bhat <sbhat@linux.ibm.com>,
	David Gibson <david@gibson.dropbear.id.au>,
	Greg Kurz <groug@kaod.org>,
	qemu-ppc@nongnu.org, Eduardo Habkost <ehabkost@redhat.com>,
	Marcel Apfelbaum <marcel.apfelbaum@gmail.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
	Peter Maydell <peter.maydell@linaro.org>,
	Eric Blake <eblake@redhat.com>,
	qemu-arm@nongnu.org, richard.henderson@linaro.org,
	Paolo Bonzini <pbonzini@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Haozhong Zhang <haozhong.zhang@intel.com>,
	shameerali.kolothum.thodi@huawei.com, kwangwoo.lee@sk.com,
	Markus Armbruster <armbru@redhat.com>,
	Qemu Developers <qemu-devel@nongnu.org>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	kvm-ppc@vger.kernel.org, shivaprasadbhat@gmail.com,
	bharata@linux.vnet.ibm.com
Subject: Re: [PATCH v4 0/3] nvdimm: Enable sync-dax property for nvdimm
Date: Tue, 04 May 2021 05:43:53 +0000	[thread overview]
Message-ID: <CAM9Jb+g8bKF0Z7za4sZpc2tZ01Sp4c4FEaV65He8w1+QOL3_yw@mail.gmail.com> (raw)
In-Reply-To: <d21fcac6-6a54-35fd-3088-6c56b85fbf25@linux.ibm.com>

> > The proposal that "sync-dax=unsafe" for non-PPC architectures, is a
> > fundamental misrepresentation of how this is supposed to work. Rather
> > than make "sync-dax" a first class citizen of the device-description
> > interface I'm proposing that you make this a separate device-type.
> > This also solves the problem that "sync-dax" with an implicit
> > architecture backend assumption is not precise, but a new "non-nvdimm"
> > device type would make it explicit what the host is advertising to the
> > guest.
> >
>
> Currently, users can use a virtualized nvdimm support in Qemu to share
> host page cache to the guest via the below command
>
> -object memory-backend-file,id=memnvdimm1,mem-path=file_name_in_host_fs
> -device nvdimm,memdev=memnvdimm1
>
> Such usage can results in wrong application behavior because there is no
> hint to the application/guest OS that a cpu cache flush is not
> sufficient to ensure persistence.
>
> I understand that virio-pmem is suggested as an alternative for that.
> But why not fix virtualized nvdimm if platforms can express the details.
>
> ie, can ACPI indicate to the guest OS that the device need a flush
> mechanism to ensure persistence in the above case?
>
> What this patch series did was to express that property via a device
> tree node and guest driver enables a hypercall based flush mechanism to
> ensure persistence.

Would VIRTIO (entirely asynchronous, no trap at host side) based
mechanism is better
than hyper-call based? Registering memory can be done any way. We
implemented virtio-pmem
flush mechanisms with below considerations:

- Proper semantic for guest flush requests.
- Efficient mechanism for performance pov.

I am just asking myself if we have platform agnostic mechanism already
there, maybe
we can extend it to suit our needs? Maybe I am missing some points here.

> >> On PPC, the default is "sync-dax=writeback" - so the ND_REGION_ASYNC
> >>
> >> is set for the region and the guest makes hcalls to issue fsync on the host.
> >>
> >>
> >> Are you suggesting me to keep it "unsafe" as default for all architectures
> >>
> >> including PPC and a user can set it to "writeback" if desired.
> >
> > No, I am suggesting that "sync-dax" is insufficient to convey this
> > property. This behavior warrants its own device type, not an ambiguous
> > property of the memory-backend-file with implicit architecture
> > assumptions attached.
> >
>
> Why is it insufficient?  Is it because other architectures don't have an
> ability express this detail to guest OS? Isn't that an arch limitations?

  reply	other threads:[~2021-05-04  5:44 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-29  3:48 [PATCH v4 0/3] nvdimm: Enable sync-dax property for nvdimm Shivaprasad G Bhat
2021-04-29  3:48 ` Shivaprasad G Bhat
2021-04-29  3:48 ` Shivaprasad G Bhat
2021-04-29  3:48 ` [PATCH v4 1/3] spapr: nvdimm: Forward declare and move the definitions Shivaprasad G Bhat
2021-04-29  3:48   ` Shivaprasad G Bhat
2021-04-29  3:48   ` Shivaprasad G Bhat
2021-05-03 18:23   ` Eric Blake
2021-05-03 18:23     ` Eric Blake
2021-05-03 18:23     ` Eric Blake
2021-05-04  1:21     ` David Gibson
2021-05-04  1:21       ` David Gibson
2021-05-04  1:21       ` David Gibson
2021-04-29  3:48 ` [PATCH v4 2/3] spapr: nvdimm: Implement H_SCM_FLUSH hcall Shivaprasad G Bhat
2021-04-29  3:48   ` Shivaprasad G Bhat
2021-04-29  3:48   ` Shivaprasad G Bhat
2021-04-29  3:49 ` [PATCH v4 3/3] nvdimm: Enable sync-dax device property for nvdimm Shivaprasad G Bhat
2021-04-29  3:49   ` Shivaprasad G Bhat
2021-04-29  3:49   ` Shivaprasad G Bhat
2021-05-03 18:27   ` Eric Blake
2021-05-03 18:27     ` Eric Blake
2021-05-03 18:27     ` Eric Blake
2021-04-29 15:55 ` [PATCH v4 0/3] nvdimm: Enable sync-dax " Stefan Hajnoczi
2021-04-29 15:55   ` Stefan Hajnoczi
2021-04-29 15:55   ` Stefan Hajnoczi
2021-04-29 16:32   ` Aneesh Kumar K.V
2021-04-29 16:44     ` Aneesh Kumar K.V
2021-04-29 16:32     ` Aneesh Kumar K.V
2021-04-30  4:27     ` David Gibson
2021-04-30  4:27       ` David Gibson
2021-04-30  4:27       ` David Gibson
2021-04-30 15:08       ` Stefan Hajnoczi
2021-04-30 15:08         ` Stefan Hajnoczi
2021-04-30 15:08         ` Stefan Hajnoczi
2021-04-30 19:14 ` Dan Williams
2021-04-30 19:14   ` Dan Williams
2021-04-30 19:14   ` Dan Williams
2021-05-01 13:55   ` Aneesh Kumar K.V
2021-05-01 13:56     ` Aneesh Kumar K.V
2021-05-01 13:55     ` Aneesh Kumar K.V
2021-05-03 14:05   ` Shivaprasad G Bhat
2021-05-03 14:17     ` Shivaprasad G Bhat
2021-05-03 14:05     ` Shivaprasad G Bhat
2021-05-03 19:41     ` Dan Williams
2021-05-03 19:41       ` Dan Williams
2021-05-03 19:41       ` Dan Williams
2021-05-04  4:59       ` Aneesh Kumar K.V
2021-05-04  4:59         ` Aneesh Kumar K.V
2021-05-04  4:59         ` Aneesh Kumar K.V
2021-05-04  5:43         ` Pankaj Gupta [this message]
2021-05-04  5:43           ` Pankaj Gupta
2021-05-04  5:43           ` Pankaj Gupta
2021-05-04  9:02           ` Aneesh Kumar K.V
2021-05-04  9:14             ` Aneesh Kumar K.V
2021-05-04  9:02             ` Aneesh Kumar K.V
2021-05-05  0:12             ` Dan Williams
2021-05-05  0:12               ` Dan Williams
2021-05-05  0:12               ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAM9Jb+g8bKF0Z7za4sZpc2tZ01Sp4c4FEaV65He8w1+QOL3_yw@mail.gmail.com \
    --to=pankaj.gupta.linux@gmail.com \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=armbru@redhat.com \
    --cc=bharata@linux.vnet.ibm.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=eblake@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=groug@kaod.org \
    --cc=haozhong.zhang@intel.com \
    --cc=imammedo@redhat.com \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=kwangwoo.lee@sk.com \
    --cc=linux-nvdimm@lists.01.org \
    --cc=marcel.apfelbaum@gmail.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=sbhat@linux.ibm.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=shivaprasadbhat@gmail.com \
    --cc=stefanha@redhat.com \
    --cc=xiaoguangrong.eric@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.