* [PATCH 0/3] dax: require 'struct page' and other fixups
@ 2017-09-27 23:49 ` Dan Williams
0 siblings, 0 replies; 19+ messages in thread
From: Dan Williams @ 2017-09-27 23:49 UTC (permalink / raw)
To: akpm; +Cc: Jan Kara, linux-nvdimm, linux-mm, linux-fsdevel, Christoph Hellwig
Prompted by a recent change to add more protection around setting up
'vm_flags' for a dax vma [1], rework the implementation to remove the
requirement to set VM_MIXEDMAP and VM_HUGEPAGE.
VM_MIXEDMAP is used by dax to tell mm paths like vm_normal_page() that
the memory page they are dealing with is not typical memory from the
linear map. The get_user_pages_fast() path, since it does not resolve
the vma, is already using {pte,pmd}_devmap() as a stand-in for
VM_MIXEDMAP, so we use that as a VM_MIXEDMAP replacement in some
locations. In the cases where there is no pte to consult we fall back to
using vma_is_dax() to detect the VM_MIXEDMAP special case.
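As a rough userspace illustration of the detection order described above (not kernel code): consult the per-entry devmap bit when a pte is available, and fall back to the vma-level dax check when it is not. Every model_* name and the flag value below are hypothetical stand-ins, chosen only so the sketch compiles on its own.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are the kernel's types. */
#define MODEL_VM_MIXEDMAP 0x1u	/* illustrative bit, not the kernel's value */

struct model_vma {
	unsigned int vm_flags;
	bool is_dax;		/* stands in for vma_is_dax(vma) */
};

struct model_pte {
	bool devmap;		/* stands in for pte_devmap(pte) */
};

/*
 * Returns true when the mapping must be treated as "not typical memory".
 * Pass pte == NULL to model call sites that have no entry to consult.
 */
static inline bool model_is_special(const struct model_vma *vma,
				    const struct model_pte *pte)
{
	if (vma->vm_flags & MODEL_VM_MIXEDMAP)
		return true;		/* legacy flag is still honored */
	if (pte)
		return pte->devmap;	/* per-entry check, no vma flag needed */
	return vma->is_dax;		/* fallback when there is no pte */
}
```

With a dax vma that no longer sets the flag, model_is_special(&vma, &pte) follows the entry's devmap bit, while model_is_special(&vma, NULL) relies on is_dax alone.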
This patch series passes a run of the ndctl unit test suite and the
'mmap.sh' [2] test in particular. 'mmap.sh' tries to catch dependencies
on VM_MIXEDMAP and {pte,pmd}_devmap().
[1]: https://lkml.org/lkml/2017/9/25/638
[2]: https://github.com/pmem/ndctl/blob/master/test/mmap.sh
---
Dan Williams (3):
dax: disable filesystem dax on devices that do not map pages
dax: stop using VM_MIXEDMAP for dax
dax: stop using VM_HUGEPAGE for dax
drivers/dax/device.c | 1 -
drivers/dax/super.c | 7 +++++++
fs/ext2/file.c | 1 -
fs/ext4/file.c | 1 -
fs/xfs/xfs_file.c | 2 --
mm/huge_memory.c | 8 ++++----
mm/ksm.c | 3 +++
mm/madvise.c | 2 +-
mm/memory.c | 20 ++++++++++++++++++--
mm/migrate.c | 3 ++-
mm/mlock.c | 3 ++-
mm/mmap.c | 5 +++--
12 files changed, 40 insertions(+), 16 deletions(-)
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 1/3] dax: disable filesystem dax on devices that do not map pages
2017-09-27 23:49 ` Dan Williams
@ 2017-09-27 23:49 ` Dan Williams
-1 siblings, 0 replies; 19+ messages in thread
From: Dan Williams @ 2017-09-27 23:49 UTC (permalink / raw)
To: akpm; +Cc: Jan Kara, linux-nvdimm, linux-mm, linux-fsdevel, Christoph Hellwig
If a dax buffer from a device that does not map pages is passed to
read(2) or write(2) as a target for direct-I/O it triggers SIGBUS. If
gdb attempts to examine the contents of a dax buffer from a device that
does not map pages it triggers SIGBUS. If fork(2) is called on a process
with a dax mapping from a device that does not map pages it triggers
SIGBUS. 'struct page' is required; otherwise several kernel code paths
break in surprising ways. Disable filesystem-dax on devices that do not
map pages.
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
drivers/dax/super.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 30d8a5aedd23..d9ac57b3e49a 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -15,6 +15,7 @@
#include <linux/mount.h>
#include <linux/magic.h>
#include <linux/genhd.h>
+#include <linux/pfn_t.h>
#include <linux/cdev.h>
#include <linux/hash.h>
#include <linux/slab.h>
@@ -123,6 +124,12 @@ int __bdev_dax_supported(struct super_block *sb, int blocksize)
return len < 0 ? len : -EIO;
}
+ if (!pfn_t_has_page(pfn)) {
+ pr_err("VFS (%s): error: dax support not enabled\n",
+ sb->s_id);
+ return -EOPNOTSUPP;
+ }
+
return 0;
}
EXPORT_SYMBOL_GPL(__bdev_dax_supported);
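The gate added in the hunk above can be modeled in userspace as follows. This is a sketch under assumptions: the model_* names are hypothetical, the bit positions are illustrative rather than the kernel's PFN_DEV / PFN_MAP definitions, and the predicate reflects the usual reading of pfn_t_has_page(): ordinary memory always has a 'struct page', device memory only when it is also marked as mapped.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; the kernel defines its own PFN_DEV / PFN_MAP. */
#define MODEL_PFN_DEV (1ULL << 61)	/* pfn refers to device memory */
#define MODEL_PFN_MAP (1ULL << 60)	/* device memory has 'struct page' */

typedef struct { uint64_t val; } model_pfn_t;

/* True when a 'struct page' exists for this pfn. */
static inline bool model_pfn_has_page(model_pfn_t pfn)
{
	return (pfn.val & MODEL_PFN_MAP) == MODEL_PFN_MAP ||
	       (pfn.val & MODEL_PFN_DEV) == 0;
}

/* Mirrors the new early return: refuse dax when pages are missing. */
static inline int model_bdev_dax_supported(model_pfn_t pfn)
{
	if (!model_pfn_has_page(pfn))
		return -EOPNOTSUPP;
	return 0;
}
```

A pfn carrying only MODEL_PFN_DEV is rejected with -EOPNOTSUPP, while one carrying both bits, or neither, is accepted.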
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 2/3] dax: stop using VM_MIXEDMAP for dax
2017-09-27 23:49 ` Dan Williams
@ 2017-09-27 23:49 ` Dan Williams
-1 siblings, 0 replies; 19+ messages in thread
From: Dan Williams @ 2017-09-27 23:49 UTC (permalink / raw)
To: akpm; +Cc: Jan Kara, linux-nvdimm, linux-mm, linux-fsdevel, Christoph Hellwig
Now that we always have pages for DAX we can stop setting VM_MIXEDMAP.
This does require some small fixups for the pte insert routines that dax
utilizes.
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
drivers/dax/device.c | 2 +-
fs/ext2/file.c | 1 -
fs/ext4/file.c | 2 +-
fs/xfs/xfs_file.c | 2 +-
mm/huge_memory.c | 8 ++++----
mm/ksm.c | 3 +++
mm/madvise.c | 2 +-
mm/memory.c | 20 ++++++++++++++++++--
mm/migrate.c | 3 ++-
mm/mlock.c | 3 ++-
mm/mmap.c | 5 +++--
11 files changed, 36 insertions(+), 15 deletions(-)
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index e9f3b3e4bbf4..ed79d006026e 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -450,7 +450,7 @@ static int dax_mmap(struct file *filp, struct vm_area_struct *vma)
return rc;
vma->vm_ops = &dax_vm_ops;
- vma->vm_flags |= VM_MIXEDMAP | VM_HUGEPAGE;
+ vma->vm_flags |= VM_HUGEPAGE;
return 0;
}
diff --git a/fs/ext2/file.c b/fs/ext2/file.c
index ff3a3636a5ca..70657e8550ed 100644
--- a/fs/ext2/file.c
+++ b/fs/ext2/file.c
@@ -125,7 +125,6 @@ static int ext2_file_mmap(struct file *file, struct vm_area_struct *vma)
file_accessed(file);
vma->vm_ops = &ext2_dax_vm_ops;
- vma->vm_flags |= VM_MIXEDMAP;
return 0;
}
#else
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index b1da660ac3bc..0cc9d205bd96 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -352,7 +352,7 @@ static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma)
file_accessed(file);
if (IS_DAX(file_inode(file))) {
vma->vm_ops = &ext4_dax_vm_ops;
- vma->vm_flags |= VM_MIXEDMAP | VM_HUGEPAGE;
+ vma->vm_flags |= VM_HUGEPAGE;
} else {
vma->vm_ops = &ext4_file_vm_ops;
}
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index ebdd0bd2b261..dece8fe937f5 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1131,7 +1131,7 @@ xfs_file_mmap(
file_accessed(filp);
vma->vm_ops = &xfs_file_vm_ops;
if (IS_DAX(file_inode(filp)))
- vma->vm_flags |= VM_MIXEDMAP | VM_HUGEPAGE;
+ vma->vm_flags |= VM_HUGEPAGE;
return 0;
}
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 269b5df58543..c69d30e27fd9 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -765,11 +765,11 @@ int vmf_insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
* but we need to be consistent with PTEs and architectures that
* can't support a 'special' bit.
*/
- BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)));
+ BUG_ON(!((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))
+ || pfn_t_devmap(pfn)));
BUG_ON((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) ==
(VM_PFNMAP|VM_MIXEDMAP));
BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags));
- BUG_ON(!pfn_t_devmap(pfn));
if (addr < vma->vm_start || addr >= vma->vm_end)
return VM_FAULT_SIGBUS;
@@ -824,11 +824,11 @@ int vmf_insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,
* but we need to be consistent with PTEs and architectures that
* can't support a 'special' bit.
*/
- BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)));
+ BUG_ON(!((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))
+ || pfn_t_devmap(pfn)));
BUG_ON((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) ==
(VM_PFNMAP|VM_MIXEDMAP));
BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags));
- BUG_ON(!pfn_t_devmap(pfn));
if (addr < vma->vm_start || addr >= vma->vm_end)
return VM_FAULT_SIGBUS;
diff --git a/mm/ksm.c b/mm/ksm.c
index 15dd7415f7b3..787dfe4f3d44 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2358,6 +2358,9 @@ int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
VM_HUGETLB | VM_MIXEDMAP))
return 0; /* just ignore the advice */
+ if (vma_is_dax(vma))
+ return 0;
+
#ifdef VM_SAO
if (*vm_flags & VM_SAO)
return 0;
diff --git a/mm/madvise.c b/mm/madvise.c
index 21261ff0466f..40344d43e565 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -95,7 +95,7 @@ static long madvise_behavior(struct vm_area_struct *vma,
new_flags |= VM_DONTDUMP;
break;
case MADV_DODUMP:
- if (new_flags & VM_SPECIAL) {
+ if (vma_is_dax(vma) || (new_flags & VM_SPECIAL)) {
error = -EINVAL;
goto out;
}
diff --git a/mm/memory.c b/mm/memory.c
index ec4e15494901..771acaf54fe6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -830,6 +830,8 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
return vma->vm_ops->find_special_page(vma, addr);
if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
return NULL;
+ if (pte_devmap(pte))
+ return NULL;
if (is_zero_pfn(pfn))
return NULL;
@@ -917,6 +919,8 @@ struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
}
}
+ if (pmd_devmap(pmd))
+ return NULL;
if (is_zero_pfn(pfn))
return NULL;
if (unlikely(pfn > highest_memmap_pfn))
@@ -1227,7 +1231,7 @@ int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
* efficient than faulting.
*/
if (!(vma->vm_flags & (VM_HUGETLB | VM_PFNMAP | VM_MIXEDMAP)) &&
- !vma->anon_vma)
+ !vma->anon_vma && !vma_is_dax(vma))
return 0;
if (is_vm_hugetlb_page(vma))
@@ -1896,12 +1900,24 @@ int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
}
EXPORT_SYMBOL(vm_insert_pfn_prot);
+static bool vm_mixed_ok(struct vm_area_struct *vma, pfn_t pfn)
+{
+ /* these checks mirror the abort conditions in vm_normal_page */
+ if (vma->vm_flags & VM_MIXEDMAP)
+ return true;
+ if (pfn_t_devmap(pfn))
+ return true;
+ if (is_zero_pfn(pfn_t_to_pfn(pfn)))
+ return true;
+ return false;
+}
+
static int __vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
pfn_t pfn, bool mkwrite)
{
pgprot_t pgprot = vma->vm_page_prot;
- BUG_ON(!(vma->vm_flags & VM_MIXEDMAP));
+ BUG_ON(!vm_mixed_ok(vma, pfn));
if (addr < vma->vm_start || addr >= vma->vm_end)
return -EFAULT;
diff --git a/mm/migrate.c b/mm/migrate.c
index 6954c1435833..179a84a311f6 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2927,7 +2927,8 @@ int migrate_vma(const struct migrate_vma_ops *ops,
/* Sanity check the arguments */
start &= PAGE_MASK;
end &= PAGE_MASK;
- if (!vma || is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_SPECIAL))
+ if (!vma || is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_SPECIAL)
+ || vma_is_dax(vma))
return -EINVAL;
if (start < vma->vm_start || start >= vma->vm_end)
return -EINVAL;
diff --git a/mm/mlock.c b/mm/mlock.c
index dfc6f1912176..4d009350893f 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -520,7 +520,8 @@ static int mlock_fixup(struct vm_area_struct *vma, struct vm_area_struct **prev,
vm_flags_t old_flags = vma->vm_flags;
if (newflags == vma->vm_flags || (vma->vm_flags & VM_SPECIAL) ||
- is_vm_hugetlb_page(vma) || vma == get_gate_vma(current->mm))
+ is_vm_hugetlb_page(vma) || vma == get_gate_vma(current->mm) ||
+ vma_is_dax(vma))
/* don't set VM_LOCKED or VM_LOCKONFAULT and don't count */
goto out;
diff --git a/mm/mmap.c b/mm/mmap.c
index 680506faceae..d682f60670ff 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1111,7 +1111,7 @@ struct vm_area_struct *vma_merge(struct mm_struct *mm,
* We later require that vma->vm_flags == vm_flags,
* so this tests vma->vm_flags & VM_SPECIAL, too.
*/
- if (vm_flags & VM_SPECIAL)
+ if ((vm_flags & VM_SPECIAL))
return NULL;
if (prev)
@@ -1723,7 +1723,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
vm_stat_account(mm, vm_flags, len >> PAGE_SHIFT);
if (vm_flags & VM_LOCKED) {
if (!((vm_flags & VM_SPECIAL) || is_vm_hugetlb_page(vma) ||
- vma == get_gate_vma(current->mm)))
+ vma == get_gate_vma(current->mm) ||
+ vma_is_dax(vma)))
mm->locked_vm += (len >> PAGE_SHIFT);
else
vma->vm_flags &= VM_LOCKED_CLEAR_MASK;
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 3/3] dax: stop using VM_HUGEPAGE for dax
2017-09-27 23:49 ` Dan Williams
@ 2017-09-27 23:49 ` Dan Williams
-1 siblings, 0 replies; 19+ messages in thread
From: Dan Williams @ 2017-09-27 23:49 UTC (permalink / raw)
To: akpm; +Cc: Jan Kara, linux-nvdimm, linux-mm, linux-fsdevel, Christoph Hellwig
For dax mappings, setting VM_HUGEPAGE is superseded by the vma_is_dax()
check in transparent_hugepage_enabled(), added in commit baabda261424
("mm: always enable thp for dax mappings").
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
drivers/dax/device.c | 1 -
fs/ext4/file.c | 1 -
fs/xfs/xfs_file.c | 2 --
3 files changed, 4 deletions(-)
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index ed79d006026e..74a35eb5e6d3 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -450,7 +450,6 @@ static int dax_mmap(struct file *filp, struct vm_area_struct *vma)
return rc;
vma->vm_ops = &dax_vm_ops;
- vma->vm_flags |= VM_HUGEPAGE;
return 0;
}
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 0cc9d205bd96..a54e1b4c49f9 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -352,7 +352,6 @@ static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma)
file_accessed(file);
if (IS_DAX(file_inode(file))) {
vma->vm_ops = &ext4_dax_vm_ops;
- vma->vm_flags |= VM_HUGEPAGE;
} else {
vma->vm_ops = &ext4_file_vm_ops;
}
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index dece8fe937f5..c0e0fcbe1bd3 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1130,8 +1130,6 @@ xfs_file_mmap(
{
file_accessed(filp);
vma->vm_ops = &xfs_file_vm_ops;
- if (IS_DAX(file_inode(filp)))
- vma->vm_flags |= VM_HUGEPAGE;
return 0;
}
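To illustrate why dropping the flag should not change behavior for dax, here is a simplified userspace model of a THP-enablement decision that consults a vma-level dax check before looking at VM_HUGEPAGE. It is a sketch only: the model_* names, the mode enum, and the ordering of the checks are assumptions, and the real transparent_hugepage_enabled() has additional conditions that are omitted here.

```c
#include <stdbool.h>

#define MODEL_VM_HUGEPAGE 0x1u	/* illustrative bit, not the kernel's value */

/* Hypothetical stand-in for the system-wide THP policy. */
enum model_thp_mode { MODEL_THP_NEVER, MODEL_THP_MADVISE, MODEL_THP_ALWAYS };

struct model_vma {
	unsigned int vm_flags;
	bool is_dax;		/* stands in for vma_is_dax(vma) */
};

static inline bool model_thp_enabled(enum model_thp_mode mode,
				     const struct model_vma *vma)
{
	if (mode == MODEL_THP_ALWAYS)
		return true;
	if (vma->is_dax)
		return true;	/* dax qualifies without VM_HUGEPAGE */
	if (mode == MODEL_THP_MADVISE)
		return (vma->vm_flags & MODEL_VM_HUGEPAGE) != 0;
	return false;
}
```

In this model a dax vma with vm_flags == 0 is still eligible in every mode, which is the property the patch depends on.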
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 2/3] dax: stop using VM_MIXEDMAP for dax
2017-09-27 23:49 ` Dan Williams
@ 2017-09-28 0:09 ` Dan Williams
-1 siblings, 0 replies; 19+ messages in thread
From: Dan Williams @ 2017-09-28 0:09 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-fsdevel, Linux MM, Jan Kara, Christoph Hellwig, linux-nvdimm
On Wed, Sep 27, 2017 at 4:49 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> Now that we always have pages for DAX we can stop setting VM_MIXEDMAP.
> This does require some small fixups for the pte insert routines that dax
> utilizes.
>
This changelog can be improved with this from the cover letter:
VM_MIXEDMAP is used by dax to tell mm paths like vm_normal_page() that
the memory page they are dealing with is not typical memory from the
linear map. The get_user_pages_fast() path, since it does not resolve
the vma, is already using {pte,pmd}_devmap() as a stand-in for
VM_MIXEDMAP, so we use that as a VM_MIXEDMAP replacement in some
locations. In the cases where there is no pte to consult we fall back to
using vma_is_dax() to detect the VM_MIXEDMAP special case.
...I'll fold this in for v2.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/3] dax: disable filesystem dax on devices that do not map pages
2017-09-27 23:49 ` Dan Williams
(?)
@ 2017-09-28 16:25 ` Jeff Moyer
-1 siblings, 0 replies; 19+ messages in thread
From: Jeff Moyer @ 2017-09-28 16:25 UTC (permalink / raw)
To: Dan Williams
Cc: Jan Kara, linux-nvdimm, Christoph Hellwig, linux-mm, linux-fsdevel, akpm
Dan Williams <dan.j.williams@intel.com> writes:
> If a dax buffer from a device that does not map pages is passed to
> read(2) or write(2) as a target for direct-I/O it triggers SIGBUS. If
> gdb attempts to examine the contents of a dax buffer from a device that
> does not map pages it triggers SIGBUS. If fork(2) is called on a process
> with a dax mapping from a device that does not map pages it triggers
> SIGBUS. 'struct page' is required otherwise several kernel code paths
> break in surprising ways. Disable filesystem-dax on devices that do not
> map pages.
>
[...]
> @@ -123,6 +124,12 @@ int __bdev_dax_supported(struct super_block *sb, int blocksize)
>  		return len < 0 ? len : -EIO;
>  	}
>
> +	if (!pfn_t_has_page(pfn)) {
> +		pr_err("VFS (%s): error: dax support not enabled\n",
> +				sb->s_id);
Is the pr_err really necessary? At least one caller already prints a
warning. It seems cleaner to me to let the caller determine whether
it's worth printing anything.
-Jeff
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/3] dax: disable filesystem dax on devices that do not map pages
2017-09-28 16:25 ` Jeff Moyer
@ 2017-09-28 16:28 ` Dan Williams
-1 siblings, 0 replies; 19+ messages in thread
From: Dan Williams @ 2017-09-28 16:28 UTC (permalink / raw)
To: Jeff Moyer
Cc: Jan Kara, linux-nvdimm, Christoph Hellwig, Linux MM,
linux-fsdevel, Andrew Morton
On Thu, Sep 28, 2017 at 9:25 AM, Jeff Moyer <jmoyer@redhat.com> wrote:
> Dan Williams <dan.j.williams@intel.com> writes:
>
>> If a dax buffer from a device that does not map pages is passed to
>> read(2) or write(2) as a target for direct-I/O it triggers SIGBUS. If
>> gdb attempts to examine the contents of a dax buffer from a device that
>> does not map pages it triggers SIGBUS. If fork(2) is called on a process
>> with a dax mapping from a device that does not map pages it triggers
>> SIGBUS. 'struct page' is required otherwise several kernel code paths
>> break in surprising ways. Disable filesystem-dax on devices that do not
>> map pages.
>>
> [...]
>> @@ -123,6 +124,12 @@ int __bdev_dax_supported(struct super_block *sb, int blocksize)
>>  		return len < 0 ? len : -EIO;
>>  	}
>>
>> +	if (!pfn_t_has_page(pfn)) {
>> +		pr_err("VFS (%s): error: dax support not enabled\n",
>> +				sb->s_id);
>
> Is the pr_err really necessary? At least one caller already prints a
> warning. It seems cleaner to me to let the caller determine whether
> it's worth printing anything.
Agreed, I'll drop it in v2.
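
A minimal sketch of the split being agreed on here, written as plain
userspace C rather than kernel code. The function names, the device id
string, and the choice of -EOPNOTSUPP are all assumptions made for
illustration; the idea is only that the support check reports failure
through its return value and the caller decides whether to log:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical callee: reports the problem via its return value only. */
static int model_dax_supported(bool has_page)
{
	if (!has_page)
		return -EOPNOTSUPP;	/* no message here; the caller decides */
	return 0;
}

/*
 * Hypothetical caller: owns the logging policy. Returns 1 if dax ends up
 * enabled and 0 if the mount proceeds without it.
 */
static int model_mount(const char *id, bool has_page, bool want_dax)
{
	if (!want_dax)
		return 0;

	if (model_dax_supported(has_page) != 0) {
		/* One warning, printed once, at the layer that knows the context. */
		fprintf(stderr, "%s: dax not supported, continuing without it\n", id);
		return 0;
	}
	return 1;
}
```

With this shape a caller that already warns does not produce a second,
redundant message, and a caller that merely probes for support can stay
silent.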
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/3] dax: stop using VM_MIXEDMAP for dax
2017-09-27 23:49 ` Dan Williams
@ 2017-09-28 16:32 ` Jeff Moyer
-1 siblings, 0 replies; 19+ messages in thread
From: Jeff Moyer @ 2017-09-28 16:32 UTC (permalink / raw)
To: Dan Williams
Cc: Jan Kara, linux-nvdimm, Christoph Hellwig, linux-mm, linux-fsdevel, akpm
Dan Williams <dan.j.williams@intel.com> writes:
> Now that we always have pages for DAX we can stop setting VM_MIXEDMAP.
> This does require some small fixups for the pte insert routines that dax
> utilizes.
It used to be that userspace would look to see if it had a 'mm' entry in
/proc/pid/smaps to determine whether or not it got a direct mapping.
Later, that same userspace (nvml) just uniformly declared dax not
available from any Linux file system, since msync was required. And, I
guess DAX has always been marked experimental, so the interface can be
changed.
All this is to say I guess it's fine to change this.
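
For readers unfamiliar with the heuristic mentioned above, here is a
sketch of how a userspace check for the 'mm' (mixed map) flag on a
VmFlags line of /proc/pid/smaps could look. The helper name is
hypothetical and this is not taken from nvml; it is shown only to make
the discussion concrete, not as a recommended way to detect dax, since a
mapping that never sets VM_MIXEDMAP will not show the flag at all:

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if 'line' is a "VmFlags:" line from
 * /proc/<pid>/smaps and contains 'flag' as a whole whitespace-separated
 * token (for example "mm").
 */
static bool vmflags_has(const char *line, const char *flag)
{
	static const char prefix[] = "VmFlags:";
	size_t flen = strlen(flag);
	const char *p;

	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return false;

	p = line + sizeof(prefix) - 1;
	while (*p) {
		size_t n;

		p += strspn(p, " \t\n");	/* skip separators */
		n = strcspn(p, " \t\n");	/* length of next token */
		if (n > 0 && n == flen && strncmp(p, flag, n) == 0)
			return true;
		p += n;
	}
	return false;
}
```

A caller would read smaps line by line, find the entry covering the
mapping of interest, and pass its VmFlags line to this helper.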
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 680506faceae..d682f60670ff 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1111,7 +1111,7 @@ struct vm_area_struct *vma_merge(struct mm_struct *mm,
>  	 * We later require that vma->vm_flags == vm_flags,
>  	 * so this tests vma->vm_flags & VM_SPECIAL, too.
>  	 */
> -	if (vm_flags & VM_SPECIAL)
> +	if ((vm_flags & VM_SPECIAL))
>  		return NULL;
That looks superfluous.
-Jeff
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/3] dax: stop using VM_MIXEDMAP for dax
2017-09-28 16:32 ` Jeff Moyer
@ 2017-09-28 16:41 ` Dan Williams
-1 siblings, 0 replies; 19+ messages in thread
From: Dan Williams @ 2017-09-28 16:41 UTC (permalink / raw)
To: Jeff Moyer
Cc: Andrew Morton, Jan Kara, linux-nvdimm, Linux MM, linux-fsdevel,
Ross Zwisler, Christoph Hellwig
On Thu, Sep 28, 2017 at 9:32 AM, Jeff Moyer <jmoyer@redhat.com> wrote:
> Dan Williams <dan.j.williams@intel.com> writes:
>
>> Now that we always have pages for DAX we can stop setting VM_MIXEDMAP.
>> This does require some small fixups for the pte insert routines that dax
>> utilizes.
>
> It used to be that userspace would look to see if it had a 'mm' entry in
> /proc/pid/smaps to determine whether or not it got a direct mapping.
> Later, that same userspace (nvml) just uniformly declared dax not
> available from any Linux file system, since msync was required. And, I
> guess DAX has always been marked experimental, so the interface can be
> changed.
>
> All this is to say I guess it's fine to change this.
Yes, it was always broken / dangerous to look for 'mm' as a pseudo-dax flag.
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index 680506faceae..d682f60670ff 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -1111,7 +1111,7 @@ struct vm_area_struct *vma_merge(struct mm_struct *mm,
>>  	 * We later require that vma->vm_flags == vm_flags,
>>  	 * so this tests vma->vm_flags & VM_SPECIAL, too.
>>  	 */
>> -	if (vm_flags & VM_SPECIAL)
>> +	if ((vm_flags & VM_SPECIAL))
>>  		return NULL;
>
> That looks superfluous.
Whoops, yeah. That was a case where I converted it to add a
vma_is_dax() check and then decided we don't need that.
^ permalink raw reply [flat|nested] 19+ messages in thread