* [PATCH v3 4/5] ext4: disable map_sync for virtio pmem
@ 2019-01-09 13:56 Pankaj Gupta
2019-01-09 13:56 ` [PATCH v3 5/5] xfs: " Pankaj Gupta
2019-01-09 14:42 ` [PATCH v3 4/5] ext4: " Jan Kara
0 siblings, 2 replies; 5+ messages in thread
From: Pankaj Gupta @ 2019-01-09 13:56 UTC
To: linux-kernel, kvm, qemu-devel, linux-nvdimm, linux-fsdevel,
virtualization, linux-acpi, linux-ext4, linux-xfs
Cc: jack, stefanha, dan.j.williams, riel, nilal, kwolf, pbonzini,
zwisler, vishal.l.verma, dave.jiang, david, jmoyer,
xiaoguangrong.eric, hch, mst, jasowang, lcapitulino, imammedo,
eblake, willy, tytso, adilger.kernel, darrick.wong, rjw, pagupta
Virtio pmem provides asynchronous host page cache flush
mechanism. We don't support 'MAP_SYNC' with virtio pmem
and ext4.
Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
---
fs/ext4/file.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 69d65d4..e54f48b 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -360,8 +360,10 @@ static const struct vm_operations_struct ext4_file_vm_ops = {
static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma)
{
struct inode *inode = file->f_mapping->host;
+ struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+ struct dax_device *dax_dev = sbi->s_daxdev;
- if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb))))
+ if (unlikely(ext4_forced_shutdown(sbi)))
return -EIO;
/*
@@ -371,6 +373,13 @@ static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma)
if (!IS_DAX(file_inode(file)) && (vma->vm_flags & VM_SYNC))
return -EOPNOTSUPP;
+ /* We don't support synchronous mappings with guest direct access
+ * and virtio based host page cache flush mechanism.
+ */
+ if (IS_DAX(file_inode(file)) && virtio_pmem_host_cache_enabled(dax_dev)
+ && (vma->vm_flags & VM_SYNC))
+ return -EOPNOTSUPP;
+
file_accessed(file);
if (IS_DAX(file_inode(file))) {
vma->vm_ops = &ext4_dax_vm_ops;
--
2.9.3
* [PATCH v3 5/5] xfs: disable map_sync for virtio pmem
2019-01-09 13:56 [PATCH v3 4/5] ext4: disable map_sync for virtio pmem Pankaj Gupta
@ 2019-01-09 13:56 ` Pankaj Gupta
2019-01-09 14:42 ` [PATCH v3 4/5] ext4: " Jan Kara
1 sibling, 0 replies; 5+ messages in thread
From: Pankaj Gupta @ 2019-01-09 13:56 UTC
To: linux-kernel, kvm, qemu-devel, linux-nvdimm, linux-fsdevel,
virtualization, linux-acpi, linux-ext4, linux-xfs
Cc: jack, stefanha, dan.j.williams, riel, nilal, kwolf, pbonzini,
zwisler, vishal.l.verma, dave.jiang, david, jmoyer,
xiaoguangrong.eric, hch, mst, jasowang, lcapitulino, imammedo,
eblake, willy, tytso, adilger.kernel, darrick.wong, rjw, pagupta
Virtio pmem provides asynchronous host page cache flush
mechanism. We don't support 'MAP_SYNC' with virtio pmem
and xfs.
Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
---
fs/xfs/xfs_file.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index e474250..eae4aa4 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1190,6 +1190,14 @@ xfs_file_mmap(
if (!IS_DAX(file_inode(filp)) && (vma->vm_flags & VM_SYNC))
return -EOPNOTSUPP;
+ /* We don't support synchronous mappings with guest direct access
+ * and virtio based host page cache mechanism.
+ */
+ if (IS_DAX(file_inode(filp)) && virtio_pmem_host_cache_enabled(
+ xfs_find_daxdev_for_inode(file_inode(filp))) &&
+ (vma->vm_flags & VM_SYNC))
+ return -EOPNOTSUPP;
+
file_accessed(filp);
vma->vm_ops = &xfs_file_vm_ops;
if (IS_DAX(file_inode(filp)))
--
2.9.3
* Re: [PATCH v3 4/5] ext4: disable map_sync for virtio pmem
2019-01-09 13:56 [PATCH v3 4/5] ext4: disable map_sync for virtio pmem Pankaj Gupta
2019-01-09 13:56 ` [PATCH v3 5/5] xfs: " Pankaj Gupta
@ 2019-01-09 14:42 ` Jan Kara
2019-01-09 14:54 ` Pankaj Gupta
1 sibling, 1 reply; 5+ messages in thread
From: Jan Kara @ 2019-01-09 14:42 UTC
To: Pankaj Gupta
Cc: linux-kernel, kvm, qemu-devel, linux-nvdimm, linux-fsdevel,
virtualization, linux-acpi, linux-ext4, linux-xfs, jack,
stefanha, dan.j.williams, riel, nilal, kwolf, pbonzini, zwisler,
vishal.l.verma, dave.jiang, david, jmoyer, xiaoguangrong.eric,
hch, mst, jasowang, lcapitulino, imammedo, eblake, willy, tytso,
adilger.kernel, darrick.wong, rjw
On Wed 09-01-19 19:26:05, Pankaj Gupta wrote:
> Virtio pmem provides asynchronous host page cache flush
> mechanism. We don't support 'MAP_SYNC' with virtio pmem
> and ext4.
>
> Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
...
> @@ -371,6 +373,13 @@ static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma)
> if (!IS_DAX(file_inode(file)) && (vma->vm_flags & VM_SYNC))
> return -EOPNOTSUPP;
>
> + /* We don't support synchronous mappings with guest direct access
> + * and virtio based host page cache flush mechanism.
> + */
> + if (IS_DAX(file_inode(file)) && virtio_pmem_host_cache_enabled(dax_dev)
> + && (vma->vm_flags & VM_SYNC))
> + return -EOPNOTSUPP;
> +
Shouldn't there rather be some generic way of doing this? Having a
virtio_pmem_host_cache_enabled() check in filesystem code just looks like
the filesystem sniffing into details it should not care about... Maybe just
name this (or have a wrapper) dax_dev_map_sync_supported()?
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: [PATCH v3 4/5] ext4: disable map_sync for virtio pmem
2019-01-09 14:42 ` [PATCH v3 4/5] ext4: " Jan Kara
@ 2019-01-09 14:54 ` Pankaj Gupta
0 siblings, 0 replies; 5+ messages in thread
From: Pankaj Gupta @ 2019-01-09 14:54 UTC
To: Jan Kara
Cc: linux-kernel, kvm, qemu-devel, linux-nvdimm, linux-fsdevel,
virtualization, linux-acpi, linux-ext4, linux-xfs, stefanha,
dan.j.williams, riel, nilal, kwolf, pbonzini, zwisler,
vishal.l.verma, dave.jiang, david, jmoyer, xiaoguangrong.eric,
hch, mst, jasowang, lcapitulino, imammedo, eblake, willy, tytso,
adilger.kernel, darrick.wong, rjw
>
> On Wed 09-01-19 19:26:05, Pankaj Gupta wrote:
> > Virtio pmem provides asynchronous host page cache flush
> > mechanism. We don't support 'MAP_SYNC' with virtio pmem
> > and ext4.
> >
> > Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
> ...
> > @@ -371,6 +373,13 @@ static int ext4_file_mmap(struct file *file, struct
> > vm_area_struct *vma)
> > if (!IS_DAX(file_inode(file)) && (vma->vm_flags & VM_SYNC))
> > return -EOPNOTSUPP;
> >
> > + /* We don't support synchronous mappings with guest direct access
> > + * and virtio based host page cache flush mechanism.
> > + */
> > + if (IS_DAX(file_inode(file)) && virtio_pmem_host_cache_enabled(dax_dev)
> > + && (vma->vm_flags & VM_SYNC))
> > + return -EOPNOTSUPP;
> > +
>
> Shouldn't there rather be some generic way of doing this? Having
> virtio_pmem_host_cache_enabled() check in filesystem code just looks like
> filesystem sniffing into details it should not care about... Maybe just
> naming this (or having a wrapper) dax_dev_map_sync_supported()?
Thanks for the feedback.
I just wanted to avoid 'dax' in the function name, to differentiate this from real dax.
But yes, I can add a wrapper: dax_dev_map_sync_supported()
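
To make the wrapper idea concrete, here is a minimal userspace sketch, not kernel code. Only the name dax_dev_map_sync_supported() comes from this thread; `struct model_dax_device`, its `host_cache_flush` field, `MODEL_VM_SYNC` and `model_file_mmap()` are hypothetical stand-ins invented for illustration, and the real dax_device layout is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's VM_SYNC flag. */
#define MODEL_VM_SYNC 0x1UL

/* Hypothetical stand-in for struct dax_device. */
struct model_dax_device {
	bool host_cache_flush;	/* true for a virtio-pmem-like device */
};

/*
 * Generic capability query: the filesystem asks what the device can do,
 * not which driver backs it.
 */
static bool dax_dev_map_sync_supported(const struct model_dax_device *dax_dev)
{
	return !dax_dev->host_cache_flush;
}

/* Model of the MAP_SYNC checks in a filesystem ->mmap() handler. */
static int model_file_mmap(bool is_dax, const struct model_dax_device *dax_dev,
			   unsigned long vm_flags)
{
	if (!(vm_flags & MODEL_VM_SYNC))
		return 0;		/* ordinary mapping, nothing to check */
	if (!is_dax)
		return -EOPNOTSUPP;	/* MAP_SYNC needs a DAX inode */

	/* In this model a DAX inode always has a backing device. */
	assert(dax_dev != NULL);
	if (!dax_dev_map_sync_supported(dax_dev))
		return -EOPNOTSUPP;	/* device cannot honour MAP_SYNC */
	return 0;
}
```

With this shape the ext4 and xfs checks would no longer mention virtio at all; whether the final helper takes the dax_device or something else is left open here.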
Thanks,
Pankaj
* [PATCH v3 0/5] kvm "virtio pmem" device
@ 2019-01-09 14:47 Pankaj Gupta
2019-01-09 14:47 ` [PATCH v3 4/5] ext4: disable map_sync for virtio pmem Pankaj Gupta
0 siblings, 1 reply; 5+ messages in thread
From: Pankaj Gupta @ 2019-01-09 14:47 UTC
To: linux-kernel, kvm, qemu-devel, linux-nvdimm, linux-fsdevel,
virtualization, linux-acpi, linux-ext4, linux-xfs
Cc: jack, stefanha, dan.j.williams, riel, nilal, kwolf, pbonzini,
zwisler, vishal.l.verma, dave.jiang, david, jmoyer,
xiaoguangrong.eric, hch, mst, jasowang, lcapitulino, imammedo,
eblake, willy, tytso, adilger.kernel, darrick.wong, rjw, pagupta
This patch series implements "virtio pmem".
"virtio pmem" is fake persistent memory (nvdimm) in the guest
which allows the guest page cache to be bypassed. The series also
implements a VIRTIO based asynchronous flush mechanism.
This patchset shares the guest kernel driver with the
changes suggested in v2. Tested with Qemu side device
emulation for virtio-pmem [6].
Details of the project idea for the 'virtio pmem' flushing interface
are shared in [3] & [4].
Implementation is divided into two parts:
New virtio pmem guest driver and qemu code changes for new
virtio pmem paravirtualized device.
1. Guest virtio-pmem kernel driver
---------------------------------
- Reads persistent memory range from paravirt device and
registers with 'nvdimm_bus'.
- 'nvdimm/pmem' driver uses this information to allocate a
persistent memory region and set up filesystem operations
on the allocated memory.
- virtio pmem driver implements asynchronous flushing
interface to flush from guest to host.
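
The guest-side flush path above can be pictured with a small userspace sketch, assuming a per-region flush callback that is reached through an indirect call and returns 0 or -EIO. All names here (`struct model_region`, `model_generic_flush`, `model_virtio_flush`, `model_region_flush`) are hypothetical, and the host round trip is reduced to reading a status integer; the real driver queues a virtio request and waits for completion.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an nvdimm region with a flush callback. */
struct model_region {
	int (*flush)(struct model_region *region);
	void *provider_data;	/* driver-private state */
};

/* Stand-in for the default flush on real persistent memory. */
static int model_generic_flush(struct model_region *region)
{
	(void)region;
	return 0;
}

/*
 * Stand-in for the virtio flush. provider_data points at an int that
 * plays the role of the host's reply: 0 means the host flushed its
 * page cache, anything else is reported to the caller as -EIO.
 */
static int model_virtio_flush(struct model_region *region)
{
	const int *host_status = region->provider_data;

	return (host_status && *host_status == 0) ? 0 : -EIO;
}

/* Callers flush through the region's callback, whichever driver set it. */
static int model_region_flush(struct model_region *region)
{
	assert(region != NULL && region->flush != NULL);
	return region->flush(region);
}
```

The point of the indirection is that the pmem code does not need to know whether the flush is a local cache flush or a request to the host; it only has to handle a failing return value.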
2. Qemu virtio-pmem device
---------------------------------
- Creates virtio pmem device and exposes a memory range to
KVM guest.
- At host side this is file backed memory which acts as
persistent memory.
- Qemu side flush uses aio thread pool APIs and virtio
for asynchronous guest multi request handling.
David Hildenbrand (CCed) also posted a modified version [6] of
the qemu virtio-pmem code based on the updated Qemu memory device API.
Virtio-pmem error handling:
----------------------------------------
Checked the behaviour of virtio-pmem for the types of errors below.
Suggestions are needed on the expected behaviour for handling these errors.
- Hardware Errors: Uncorrectable recoverable Errors:
a] virtio-pmem:
- As per current logic, if the error page belongs to the Qemu process,
the host MCE handler isolates (hwpoison) that page and sends SIGBUS.
The Qemu SIGBUS handler injects an exception into the KVM guest.
- The KVM guest then isolates the page and sends SIGBUS to the guest
userspace process which has mapped the page.
b] Existing implementation for ACPI pmem driver:
- Handles such errors with an MCE notifier and creates a list
of bad blocks. Read/direct access DAX operations return EIO
if the accessed memory page falls in the bad block list.
- It also starts background scrubbing.
- Similar functionality can be reused in virtio-pmem with an MCE
notifier but without scrubbing (no ACPI/ARS)? Inputs are needed to
confirm whether this behaviour is ok or needs any change.
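
As an illustration of the bad block behaviour described in b], the following userspace sketch fails an access with -EIO when it overlaps any recorded bad range. It is only a model: `struct model_bad_range` and `model_check_access()` are hypothetical names, and the kernel's actual bad block tracking is not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical record of one poisoned byte range [start, start + len). */
struct model_bad_range {
	unsigned long long start;
	unsigned long long len;
};

/*
 * Return -EIO if the access [off, off + len) overlaps any bad range,
 * 0 otherwise. A linear scan is enough for a sketch.
 */
static int model_check_access(const struct model_bad_range *bad, size_t nr_bad,
			      unsigned long long off, unsigned long long len)
{
	size_t i;

	assert(bad != NULL || nr_bad == 0);
	for (i = 0; i < nr_bad; i++) {
		unsigned long long bad_end = bad[i].start + bad[i].len;

		if (off < bad_end && bad[i].start < off + len)
			return -EIO;
	}
	return 0;
}
```

In this model an MCE notifier would only have to append a range to the list; reads and direct access then fail for that range without any scrubbing support.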
Changes from PATCH v2: [1]
- Disable MAP_SYNC for ext4 & XFS filesystems - [Dan]
- Use name 'virtio pmem' in place of 'fake dax'
Changes from PATCH v1: [2]
- 0-day build test for build dependency on libnvdimm
Changes suggested by - [Dan Williams]
- Split the driver into two parts virtio & pmem
- Move queuing of async block request to block layer
- Add "sync" parameter in nvdimm_flush function
- Use indirect call for nvdimm_flush
- Don’t move declarations to common global header e.g nd.h
- nvdimm_flush() return 0 or -EIO if it fails
- Teach nsio_rw_bytes() that the flush can fail
- Rename nvdimm_flush() to generic_nvdimm_flush()
- Use 'nd_region->provider_data' for long dereferencing
- Remove virtio_pmem_freeze/restore functions
- Replace BSD license text with SPDX license text
- Add might_sleep() in virtio_pmem_flush - [Luiz]
- Make spin_lock_irqsave() narrow
Changes from RFC v3:
- Rebase to latest upstream - Luiz
- Call ndregion->flush in place of nvdimm_flush - Luiz
- kmalloc return check - Luiz
- virtqueue full handling - Stefan
- Don't map entire virtio_pmem_req to device - Stefan
- request leak, correct sizeof req - Stefan
- Move declaration to virtio_pmem.c
Changes from RFC v2:
- Add flush function in the nd_region in place of switching
on a flag - Dan & Stefan
- Add flush completion function with proper locking and wait
for host side flush completion - Stefan & Dan
- Keep userspace API in uapi header file - Stefan, MST
- Use LE fields & New device id - MST
- Indentation & spacing suggestions - MST & Eric
- Remove extra header files & add licensing - Stefan
Changes from RFC v1:
- Reuse existing 'pmem' code for registering persistent
memory and other operations instead of creating an entirely
new block driver.
- Use VIRTIO driver to register memory information with
nvdimm_bus and create region_type accordingly.
- Call VIRTIO flush from existing pmem driver.
Pankaj Gupta (5):
libnvdimm: nd_region flush callback support
virtio-pmem: Add virtio-pmem guest driver
libnvdimm: add nd_region buffered dax_dev flag
ext4: disable map_sync for virtio pmem
xfs: disable map_sync for virtio pmem
[2] https://lkml.org/lkml/2018/8/31/407
[3] https://www.spinics.net/lists/kvm/msg149761.html
[4] https://www.spinics.net/lists/kvm/msg153095.html
[5] https://lkml.org/lkml/2018/8/31/413
[6] https://marc.info/?l=qemu-devel&m=153555721901824&w=2
drivers/acpi/nfit/core.c | 4 -
drivers/dax/super.c | 17 +++++
drivers/nvdimm/claim.c | 6 +
drivers/nvdimm/nd.h | 1
drivers/nvdimm/pmem.c | 15 +++-
drivers/nvdimm/region_devs.c | 45 +++++++++++++-
drivers/nvdimm/virtio_pmem.c | 84 ++++++++++++++++++++++++++
drivers/virtio/Kconfig | 10 +++
drivers/virtio/Makefile | 1
drivers/virtio/pmem.c | 125 +++++++++++++++++++++++++++++++++++++++
fs/ext4/file.c | 11 +++
fs/xfs/xfs_file.c | 8 ++
include/linux/dax.h | 9 ++
include/linux/libnvdimm.h | 11 +++
include/linux/virtio_pmem.h | 60 ++++++++++++++++++
include/uapi/linux/virtio_ids.h | 1
include/uapi/linux/virtio_pmem.h | 10 +++
17 files changed, 406 insertions(+), 12 deletions(-)