From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Dan Williams <dan.j.williams@intel.com>,
	Shivaprasad G Bhat <sbhat@linux.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>,
	Greg Kurz <groug@kaod.org>,
	qemu-ppc@nongnu.org, Eduardo Habkost <ehabkost@redhat.com>,
	marcel.apfelbaum@gmail.com, "Michael S. Tsirkin" <mst@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
	Peter Maydell <peter.maydell@linaro.org>,
	Eric Blake <eblake@redhat.com>,
	qemu-arm@nongnu.org, richard.henderson@linaro.org,
	Paolo Bonzini <pbonzini@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Haozhong Zhang <haozhong.zhang@intel.com>,
	shameerali.kolothum.thodi@huawei.com, kwangwoo.lee@sk.com,
	Markus Armbruster <armbru@redhat.com>,
	Qemu Developers <qemu-devel@nongnu.org>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	kvm-ppc@vger.kernel.org, shivaprasadbhat@gmail.com,
	bharata@linux.vnet.ibm.com
Subject: Re: [PATCH v4 0/3] nvdimm: Enable sync-dax property for nvdimm
Date: Tue, 4 May 2021 10:29:11 +0530
Message-ID: <d21fcac6-6a54-35fd-3088-6c56b85fbf25@linux.ibm.com>
In-Reply-To: <CAPcyv4hAOC89JOXr-ZCps=n8gEKD5W0jmGU1Enfo8ECVMf3veQ@mail.gmail.com>

On 5/4/21 1:11 AM, Dan Williams wrote:
> On Mon, May 3, 2021 at 7:06 AM Shivaprasad G Bhat <sbhat@linux.ibm.com> wrote:
>>


.....

> 
> The proposal that "sync-dax=unsafe" for non-PPC architectures, is a
> fundamental misrepresentation of how this is supposed to work. Rather
> than make "sync-dax" a first class citizen of the device-description
> interface I'm proposing that you make this a separate device-type.
> This also solves the problem that "sync-dax" with an implicit
> architecture backend assumption is not precise, but a new "non-nvdimm"
> device type would make it explicit what the host is advertising to the
> guest.
> 

Currently, users can use the virtualized nvdimm support in QEMU to share
the host page cache with the guest via the command below:

-object memory-backend-file,id=memnvdimm1,mem-path=file_name_in_host_fs
-device nvdimm,memdev=memnvdimm1

Such usage can result in wrong application behavior because there is no
hint to the application/guest OS that a CPU cache flush is not
sufficient to ensure persistence.
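
To make the failure mode concrete, here is a minimal host-side sketch,
assuming a Unix-like host and a regular file as the backing store. It is
not taken from the patch series; the function names are hypothetical.
Stores into a shared file mapping only reach the host page cache, and the
data is guaranteed to be on stable storage only after an explicit
msync()/fsync():

```python
import mmap
import os
import tempfile


def write_through_mapping(path, payload, sync):
    """Store payload via a shared mapping of path; optionally flush it."""
    fd = os.open(path, os.O_RDWR)
    try:
        # On Unix, mmap.mmap() defaults to a shared, read/write mapping.
        with mmap.mmap(fd, len(payload)) as mm:
            mm[:len(payload)] = payload  # lands in the host page cache
            if sync:
                mm.flush()  # msync(): write dirty pages back to the file
        if sync:
            os.fsync(fd)  # ask the kernel to flush the file to the device
    finally:
        os.close(fd)


def demo():
    """Hypothetical helper: write with sync=True and read the file back."""
    payload = b"persist me"
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * len(payload))  # the mapping needs a sized file
        path = f.name
    try:
        write_through_mapping(path, payload, sync=True)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)
```

Reading the file back succeeds with or without the sync, because reads are
served from the page cache. The difference only shows up after a host crash
or power loss, which is exactly why a guest that believes it has real
persistent memory cannot detect the problem on its own.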

I understand that virtio-pmem is suggested as an alternative for that.
But why not fix virtualized nvdimm if platforms can express the details?

i.e., can ACPI indicate to the guest OS that the device needs a flush
mechanism to ensure persistence in the above case?

What this patch series did was to express that property via a device
tree node, and the guest driver enables a hypercall-based flush
mechanism to ensure persistence.
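
The flow described above can be sketched as a toy model. This is purely
illustrative: the class and method names are hypothetical, the device tree
property is reduced to a boolean, and the hypercall is modelled as a direct
call into the host's fsync. It does not reflect the actual guest driver or
QEMU code.

```python
class HostBackingFile:
    """Toy host file: page_cache is volatile, disk survives power loss."""

    def __init__(self):
        self.page_cache = {}
        self.disk = {}

    def fsync(self):
        # Write every cached page back to stable storage.
        self.disk.update(self.page_cache)

    def power_loss(self):
        # Anything that was never written back is gone.
        self.page_cache = dict(self.disk)


class GuestRegion:
    """Toy guest view of the region backed by the host file."""

    def __init__(self, host_file, needs_flush_hcall):
        self.host_file = host_file
        # Stands in for the property advertised to the guest.
        self.needs_flush_hcall = needs_flush_hcall

    def store(self, offset, value):
        # A guest store only dirties the host page cache.
        self.host_file.page_cache[offset] = value

    def persist(self):
        # A CPU cache flush alone does not reach the host's disk, so it
        # is modelled as a no-op here. Only the hypercall path makes the
        # host write the data back.
        if self.needs_flush_hcall:
            self.host_file.fsync()  # stands in for the flush hypercall


def survives_power_loss(needs_flush_hcall):
    """Return True if a persisted store is still there after power loss."""
    host = HostBackingFile()
    region = GuestRegion(host, needs_flush_hcall)
    region.store(0, b"data")
    region.persist()
    host.power_loss()
    return host.page_cache.get(0) == b"data"
```

With the property advertised, survives_power_loss(True) keeps the data;
without it, survives_power_loss(False) loses the store even though the
guest believed it had persisted it.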


....

>>
>> On PPC, the default is "sync-dax=writeback" - so the ND_REGION_ASYNC
>> is set for the region and the guest makes hcalls to issue fsync on the host.
>>
>> Are you suggesting me to keep it "unsafe" as default for all architectures
>> including PPC and a user can set it to "writeback" if desired.
> 
> No, I am suggesting that "sync-dax" is insufficient to convey this
> property. This behavior warrants its own device type, not an ambiguous
> property of the memory-backend-file with implicit architecture
> assumptions attached.
> 

Why is it insufficient? Is it because other architectures don't have the
ability to express this detail to the guest OS? Isn't that an arch limitation?

-aneesh
