* [PATCH 0/7 v4] DAX cleanups and fixes @ 2016-05-11 9:58 ` Jan Kara 0 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm; +Cc: Ted Tso, linux-fsdevel, Jan Kara, linux-ext4 Hi! This is another part of the series that implements new DAX page fault locking. This part contains easy DAX cleanups and fixes that were already reviewed and are rather obvious so they could be merged right away. They mostly remove dead unused code. The patches didn't change since the last posting of the DAX page fault locking series except for some added Reviewed-by tags. Since DAX error handling patches depend on these I assume they will get merged through NVDIMM tree. These patches depend on ext4 fixes from the series "ext4: DAX fixes". The dependency is mostly functional (these patches remove zeroing from DAX code and ext4 was depending on this zeroing until the above series fixed that) so NVDIMM tree should be merged after ext4 tree to avoid breaking ext4 DAX support. Honza _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm ^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 0/7 v4] DAX cleanups and fixes @ 2016-05-11 9:58 ` Jan Kara 0 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm Cc: Ted Tso, linux-ext4, Vishal Verma, Ross Zwisler, Dan Williams, linux-fsdevel, Jan Kara Hi! This is another part of the series that implements new DAX page fault locking. This part contains easy DAX cleanups and fixes that were already reviewed and are rather obvious so they could be merged right away. They mostly remove dead unused code. The patches didn't change since the last posting of the DAX page fault locking series except for some added Reviewed-by tags. Since DAX error handling patches depend on these I assume they will get merged through NVDIMM tree. These patches depend on ext4 fixes from the series "ext4: DAX fixes". The dependency is mostly functional (these patches remove zeroing from DAX code and ext4 was depending on this zeroing until the above series fixed that) so NVDIMM tree should be merged after ext4 tree to avoid breaking ext4 DAX support. Honza ^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 1/7] DAX: move RADIX_DAX_ definitions to dax.c 2016-05-11 9:58 ` Jan Kara (?) @ 2016-05-11 9:58 ` Jan Kara -1 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm; +Cc: Ted Tso, NeilBrown, linux-fsdevel, Jan Kara, linux-ext4 From: NeilBrown <neilb@suse.com> These don't belong in radix-tree.c any more than PAGECACHE_TAG_* do. Let's try to maintain the idea that radix-tree simply implements an abstract data type. Acked-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/dax.c | 9 +++++++++ include/linux/radix-tree.h | 9 --------- 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 75ba46d82a76..08799a510b4d 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -32,6 +32,15 @@ #include <linux/pfn_t.h> #include <linux/sizes.h> +#define RADIX_DAX_MASK 0xf +#define RADIX_DAX_SHIFT 4 +#define RADIX_DAX_PTE (0x4 | RADIX_TREE_EXCEPTIONAL_ENTRY) +#define RADIX_DAX_PMD (0x8 | RADIX_TREE_EXCEPTIONAL_ENTRY) +#define RADIX_DAX_TYPE(entry) ((unsigned long)entry & RADIX_DAX_MASK) +#define RADIX_DAX_SECTOR(entry) (((unsigned long)entry >> RADIX_DAX_SHIFT)) +#define RADIX_DAX_ENTRY(sector, pmd) ((void *)((unsigned long)sector << \ + RADIX_DAX_SHIFT | (pmd ? RADIX_DAX_PMD : RADIX_DAX_PTE))) + static long dax_map_atomic(struct block_device *bdev, struct blk_dax_ctl *dax) { struct request_queue *q = bdev->bd_queue; diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h index 51a97ac8bfbf..d08d6ec3bf53 100644 --- a/include/linux/radix-tree.h +++ b/include/linux/radix-tree.h @@ -52,15 +52,6 @@ #define RADIX_TREE_EXCEPTIONAL_ENTRY 2 #define RADIX_TREE_EXCEPTIONAL_SHIFT 2 -#define RADIX_DAX_MASK 0xf -#define RADIX_DAX_SHIFT 4 -#define RADIX_DAX_PTE (0x4 | RADIX_TREE_EXCEPTIONAL_ENTRY) -#define RADIX_DAX_PMD (0x8 | RADIX_TREE_EXCEPTIONAL_ENTRY) -#define RADIX_DAX_TYPE(entry) ((unsigned long)entry & RADIX_DAX_MASK) -#define RADIX_DAX_SECTOR(entry) (((unsigned long)entry >> RADIX_DAX_SHIFT)) -#define RADIX_DAX_ENTRY(sector, pmd) ((void *)((unsigned long)sector << \ - RADIX_DAX_SHIFT | (pmd ? RADIX_DAX_PMD : RADIX_DAX_PTE))) - static inline int radix_tree_is_indirect_ptr(void *ptr) { return (int)((unsigned long)ptr & RADIX_TREE_INDIRECT_PTR); -- 2.6.6 _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 1/7] DAX: move RADIX_DAX_ definitions to dax.c @ 2016-05-11 9:58 ` Jan Kara 0 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm Cc: Ted Tso, linux-ext4, Vishal Verma, Ross Zwisler, Dan Williams, linux-fsdevel, NeilBrown, Jan Kara From: NeilBrown <neilb@suse.com> These don't belong in radix-tree.c any more than PAGECACHE_TAG_* do. Let's try to maintain the idea that radix-tree simply implements an abstract data type. Acked-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/dax.c | 9 +++++++++ include/linux/radix-tree.h | 9 --------- 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 75ba46d82a76..08799a510b4d 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -32,6 +32,15 @@ #include <linux/pfn_t.h> #include <linux/sizes.h> +#define RADIX_DAX_MASK 0xf +#define RADIX_DAX_SHIFT 4 +#define RADIX_DAX_PTE (0x4 | RADIX_TREE_EXCEPTIONAL_ENTRY) +#define RADIX_DAX_PMD (0x8 | RADIX_TREE_EXCEPTIONAL_ENTRY) +#define RADIX_DAX_TYPE(entry) ((unsigned long)entry & RADIX_DAX_MASK) +#define RADIX_DAX_SECTOR(entry) (((unsigned long)entry >> RADIX_DAX_SHIFT)) +#define RADIX_DAX_ENTRY(sector, pmd) ((void *)((unsigned long)sector << \ + RADIX_DAX_SHIFT | (pmd ? RADIX_DAX_PMD : RADIX_DAX_PTE))) + static long dax_map_atomic(struct block_device *bdev, struct blk_dax_ctl *dax) { struct request_queue *q = bdev->bd_queue; diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h index 51a97ac8bfbf..d08d6ec3bf53 100644 --- a/include/linux/radix-tree.h +++ b/include/linux/radix-tree.h @@ -52,15 +52,6 @@ #define RADIX_TREE_EXCEPTIONAL_ENTRY 2 #define RADIX_TREE_EXCEPTIONAL_SHIFT 2 -#define RADIX_DAX_MASK 0xf -#define RADIX_DAX_SHIFT 4 -#define RADIX_DAX_PTE (0x4 | RADIX_TREE_EXCEPTIONAL_ENTRY) -#define RADIX_DAX_PMD (0x8 | RADIX_TREE_EXCEPTIONAL_ENTRY) -#define RADIX_DAX_TYPE(entry) ((unsigned long)entry & RADIX_DAX_MASK) -#define RADIX_DAX_SECTOR(entry) (((unsigned long)entry >> RADIX_DAX_SHIFT)) -#define RADIX_DAX_ENTRY(sector, pmd) ((void *)((unsigned long)sector << \ - RADIX_DAX_SHIFT | (pmd ? RADIX_DAX_PMD : RADIX_DAX_PTE))) ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 1/7] DAX: move RADIX_DAX_ definitions to dax.c @ 2016-05-11 9:58 ` Jan Kara 0 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm Cc: Ted Tso, linux-ext4, Vishal Verma, Ross Zwisler, Dan Williams, linux-fsdevel, NeilBrown, Jan Kara From: NeilBrown <neilb@suse.com> These don't belong in radix-tree.c any more than PAGECACHE_TAG_* do. Let's try to maintain the idea that radix-tree simply implements an abstract data type. Acked-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/dax.c | 9 +++++++++ include/linux/radix-tree.h | 9 --------- 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 75ba46d82a76..08799a510b4d 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -32,6 +32,15 @@ #include <linux/pfn_t.h> #include <linux/sizes.h> +#define RADIX_DAX_MASK 0xf +#define RADIX_DAX_SHIFT 4 +#define RADIX_DAX_PTE (0x4 | RADIX_TREE_EXCEPTIONAL_ENTRY) +#define RADIX_DAX_PMD (0x8 | RADIX_TREE_EXCEPTIONAL_ENTRY) +#define RADIX_DAX_TYPE(entry) ((unsigned long)entry & RADIX_DAX_MASK) +#define RADIX_DAX_SECTOR(entry) (((unsigned long)entry >> RADIX_DAX_SHIFT)) +#define RADIX_DAX_ENTRY(sector, pmd) ((void *)((unsigned long)sector << \ + RADIX_DAX_SHIFT | (pmd ? RADIX_DAX_PMD : RADIX_DAX_PTE))) + static long dax_map_atomic(struct block_device *bdev, struct blk_dax_ctl *dax) { struct request_queue *q = bdev->bd_queue; diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h index 51a97ac8bfbf..d08d6ec3bf53 100644 --- a/include/linux/radix-tree.h +++ b/include/linux/radix-tree.h @@ -52,15 +52,6 @@ #define RADIX_TREE_EXCEPTIONAL_ENTRY 2 #define RADIX_TREE_EXCEPTIONAL_SHIFT 2 -#define RADIX_DAX_MASK 0xf -#define RADIX_DAX_SHIFT 4 -#define RADIX_DAX_PTE (0x4 | RADIX_TREE_EXCEPTIONAL_ENTRY) -#define RADIX_DAX_PMD (0x8 | RADIX_TREE_EXCEPTIONAL_ENTRY) -#define RADIX_DAX_TYPE(entry) ((unsigned long)entry & RADIX_DAX_MASK) -#define RADIX_DAX_SECTOR(entry) (((unsigned long)entry >> RADIX_DAX_SHIFT)) -#define RADIX_DAX_ENTRY(sector, pmd) ((void *)((unsigned long)sector << \ - RADIX_DAX_SHIFT | (pmd ? RADIX_DAX_PMD : RADIX_DAX_PTE))) - static inline int radix_tree_is_indirect_ptr(void *ptr) { return (int)((unsigned long)ptr & RADIX_TREE_INDIRECT_PTR); -- 2.6.6 ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 2/7] dax: Remove complete_unwritten argument 2016-05-11 9:58 ` Jan Kara @ 2016-05-11 9:58 ` Jan Kara -1 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm; +Cc: Ted Tso, linux-fsdevel, Jan Kara, linux-ext4 Fault handlers currently take complete_unwritten argument to convert unwritten extents after PTEs are updated. However no filesystem uses this anymore as the code is racy. Remove the unused argument. Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/block_dev.c | 4 ++-- fs/dax.c | 43 +++++++++---------------------------------- fs/ext2/file.c | 4 ++-- fs/ext4/file.c | 4 ++-- fs/xfs/xfs_file.c | 7 +++---- include/linux/dax.h | 17 +++++++---------- include/linux/fs.h | 1 - 7 files changed, 25 insertions(+), 55 deletions(-) diff --git a/fs/block_dev.c b/fs/block_dev.c index 20a2c02b77c4..b25bb230b28a 100644 --- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -1746,7 +1746,7 @@ static const struct address_space_operations def_blk_aops = { */ static int blkdev_dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf) { - return __dax_fault(vma, vmf, blkdev_get_block, NULL); + return __dax_fault(vma, vmf, blkdev_get_block); } static int blkdev_dax_pfn_mkwrite(struct vm_area_struct *vma, @@ -1758,7 +1758,7 @@ static int blkdev_dax_pfn_mkwrite(struct vm_area_struct *vma, static int blkdev_dax_pmd_fault(struct vm_area_struct *vma, unsigned long addr, pmd_t *pmd, unsigned int flags) { - return __dax_pmd_fault(vma, addr, pmd, flags, blkdev_get_block, NULL); + return __dax_pmd_fault(vma, addr, pmd, flags, blkdev_get_block); } static const struct vm_operations_struct blkdev_dax_vm_ops = { diff --git a/fs/dax.c b/fs/dax.c index 08799a510b4d..c5ccf745d279 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -612,19 +612,13 @@ static int dax_insert_mapping(struct inode *inode, struct buffer_head *bh, * @vma: The virtual memory area where the fault occurred * @vmf: The description of the fault * @get_block: The filesystem method used to translate file offsets to blocks - * @complete_unwritten: The filesystem method used to convert unwritten blocks - * to written so the data written to them is exposed. This is required for - * required by write faults for filesystems that will return unwritten - * extent mappings from @get_block, but it is optional for reads as - * dax_insert_mapping() will always zero unwritten blocks. If the fs does - * not support unwritten extents, the it should pass NULL. * * When a page fault occurs, filesystems may call this helper in their * fault handler for DAX files. __dax_fault() assumes the caller has done all * the necessary locking for the page fault to proceed successfully. */ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, - get_block_t get_block, dax_iodone_t complete_unwritten) + get_block_t get_block) { struct file *file = vma->vm_file; struct address_space *mapping = file->f_mapping; @@ -727,23 +721,9 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, page = NULL; } - /* - * If we successfully insert the new mapping over an unwritten extent, - * we need to ensure we convert the unwritten extent. If there is an - * error inserting the mapping, the filesystem needs to leave it as - * unwritten to prevent exposure of the stale underlying data to - * userspace, but we still need to call the completion function so - * the private resources on the mapping buffer can be released. We - * indicate what the callback should do via the uptodate variable, same - * as for normal BH based IO completions. - */ + /* Filesystem should not return unwritten buffers to us! */ + WARN_ON_ONCE(buffer_unwritten(&bh)); error = dax_insert_mapping(inode, &bh, vma, vmf); - if (buffer_unwritten(&bh)) { - if (complete_unwritten) - complete_unwritten(&bh, !error); - else - WARN_ON_ONCE(!(vmf->flags & FAULT_FLAG_WRITE)); - } out: if (error == -ENOMEM) @@ -772,7 +752,7 @@ EXPORT_SYMBOL(__dax_fault); * fault handler for DAX files. */ int dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, - get_block_t get_block, dax_iodone_t complete_unwritten) + get_block_t get_block) { int result; struct super_block *sb = file_inode(vma->vm_file)->i_sb; @@ -781,7 +761,7 @@ int dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, sb_start_pagefault(sb); file_update_time(vma->vm_file); } - result = __dax_fault(vma, vmf, get_block, complete_unwritten); + result = __dax_fault(vma, vmf, get_block); if (vmf->flags & FAULT_FLAG_WRITE) sb_end_pagefault(sb); @@ -815,8 +795,7 @@ static void __dax_dbg(struct buffer_head *bh, unsigned long address, #define dax_pmd_dbg(bh, address, reason) __dax_dbg(bh, address, reason, "dax_pmd") int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, - pmd_t *pmd, unsigned int flags, get_block_t get_block, - dax_iodone_t complete_unwritten) + pmd_t *pmd, unsigned int flags, get_block_t get_block) { struct file *file = vma->vm_file; struct address_space *mapping = file->f_mapping; @@ -875,6 +854,7 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, if (get_block(inode, block, &bh, 1) != 0) return VM_FAULT_SIGBUS; alloc = true; + WARN_ON_ONCE(buffer_unwritten(&bh)); } bdev = bh.b_bdev; @@ -1020,9 +1000,6 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, out: i_mmap_unlock_read(mapping); - if (buffer_unwritten(&bh)) - complete_unwritten(&bh, !(result & VM_FAULT_ERROR)); - return result; fallback: @@ -1042,8 +1019,7 @@ EXPORT_SYMBOL_GPL(__dax_pmd_fault); * pmd_fault handler for DAX files. */ int dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, - pmd_t *pmd, unsigned int flags, get_block_t get_block, - dax_iodone_t complete_unwritten) + pmd_t *pmd, unsigned int flags, get_block_t get_block) { int result; struct super_block *sb = file_inode(vma->vm_file)->i_sb; @@ -1052,8 +1028,7 @@ int dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, sb_start_pagefault(sb); file_update_time(vma->vm_file); } - result = __dax_pmd_fault(vma, address, pmd, flags, get_block, - complete_unwritten); + result = __dax_pmd_fault(vma, address, pmd, flags, get_block); if (flags & FAULT_FLAG_WRITE) sb_end_pagefault(sb); diff --git a/fs/ext2/file.c b/fs/ext2/file.c index c1400b109805..868c02317b05 100644 --- a/fs/ext2/file.c +++ b/fs/ext2/file.c @@ -51,7 +51,7 @@ static int ext2_dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf) } down_read(&ei->dax_sem); - ret = __dax_fault(vma, vmf, ext2_get_block, NULL); + ret = __dax_fault(vma, vmf, ext2_get_block); up_read(&ei->dax_sem); if (vmf->flags & FAULT_FLAG_WRITE) @@ -72,7 +72,7 @@ static int ext2_dax_pmd_fault(struct vm_area_struct *vma, unsigned long addr, } down_read(&ei->dax_sem); - ret = __dax_pmd_fault(vma, addr, pmd, flags, ext2_get_block, NULL); + ret = __dax_pmd_fault(vma, addr, pmd, flags, ext2_get_block); up_read(&ei->dax_sem); if (flags & FAULT_FLAG_WRITE) diff --git a/fs/ext4/file.c b/fs/ext4/file.c index dfb33da04589..cdcab0b36faa 100644 --- a/fs/ext4/file.c +++ b/fs/ext4/file.c @@ -207,7 +207,7 @@ static int ext4_dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf) if (IS_ERR(handle)) result = VM_FAULT_SIGBUS; else - result = __dax_fault(vma, vmf, ext4_dax_get_block, NULL); + result = __dax_fault(vma, vmf, ext4_dax_get_block); if (write) { if (!IS_ERR(handle)) @@ -243,7 +243,7 @@ static int ext4_dax_pmd_fault(struct vm_area_struct *vma, unsigned long addr, result = VM_FAULT_SIGBUS; else result = __dax_pmd_fault(vma, addr, pmd, flags, - ext4_dax_get_block, NULL); + ext4_dax_get_block); if (write) { if (!IS_ERR(handle)) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 569938a4a357..c2946f436a3a 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -1558,7 +1558,7 @@ xfs_filemap_page_mkwrite( xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED); if (IS_DAX(inode)) { - ret = __dax_mkwrite(vma, vmf, xfs_get_blocks_dax_fault, NULL); + ret = __dax_mkwrite(vma, vmf, xfs_get_blocks_dax_fault); } else { ret = block_page_mkwrite(vma, vmf, xfs_get_blocks); ret = block_page_mkwrite_return(ret); @@ -1592,7 +1592,7 @@ xfs_filemap_fault( * changes to xfs_get_blocks_direct() to map unwritten extent * ioend for conversion on read-only mappings. */ - ret = __dax_fault(vma, vmf, xfs_get_blocks_dax_fault, NULL); + ret = __dax_fault(vma, vmf, xfs_get_blocks_dax_fault); } else ret = filemap_fault(vma, vmf); xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED); @@ -1629,8 +1629,7 @@ xfs_filemap_pmd_fault( } xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED); - ret = __dax_pmd_fault(vma, addr, pmd, flags, xfs_get_blocks_dax_fault, - NULL); + ret = __dax_pmd_fault(vma, addr, pmd, flags, xfs_get_blocks_dax_fault); xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED); if (flags & FAULT_FLAG_WRITE) diff --git a/include/linux/dax.h b/include/linux/dax.h index 636dd59ab505..7c45ac7ea1d1 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -10,10 +10,8 @@ ssize_t dax_do_io(struct kiocb *, struct inode *, struct iov_iter *, loff_t, int dax_clear_sectors(struct block_device *bdev, sector_t _sector, long _size); int dax_zero_page_range(struct inode *, loff_t from, unsigned len, get_block_t); int dax_truncate_page(struct inode *, loff_t from, get_block_t); -int dax_fault(struct vm_area_struct *, struct vm_fault *, get_block_t, - dax_iodone_t); -int __dax_fault(struct vm_area_struct *, struct vm_fault *, get_block_t, - dax_iodone_t); +int dax_fault(struct vm_area_struct *, struct vm_fault *, get_block_t); +int __dax_fault(struct vm_area_struct *, struct vm_fault *, get_block_t); #ifdef CONFIG_FS_DAX struct page *read_dax_sector(struct block_device *bdev, sector_t n); @@ -27,21 +25,20 @@ static inline struct page *read_dax_sector(struct block_device *bdev, #ifdef CONFIG_TRANSPARENT_HUGEPAGE int dax_pmd_fault(struct vm_area_struct *, unsigned long addr, pmd_t *, - unsigned int flags, get_block_t, dax_iodone_t); + unsigned int flags, get_block_t); int __dax_pmd_fault(struct vm_area_struct *, unsigned long addr, pmd_t *, - unsigned int flags, get_block_t, dax_iodone_t); + unsigned int flags, get_block_t); #else static inline int dax_pmd_fault(struct vm_area_struct *vma, unsigned long addr, - pmd_t *pmd, unsigned int flags, get_block_t gb, - dax_iodone_t di) + pmd_t *pmd, unsigned int flags, get_block_t gb) { return VM_FAULT_FALLBACK; } #define __dax_pmd_fault dax_pmd_fault #endif int dax_pfn_mkwrite(struct vm_area_struct *, struct vm_fault *); -#define dax_mkwrite(vma, vmf, gb, iod) dax_fault(vma, vmf, gb, iod) -#define __dax_mkwrite(vma, vmf, gb, iod) __dax_fault(vma, vmf, gb, iod) +#define dax_mkwrite(vma, vmf, gb) dax_fault(vma, vmf, gb) +#define __dax_mkwrite(vma, vmf, gb) __dax_fault(vma, vmf, gb) static inline bool vma_is_dax(struct vm_area_struct *vma) { diff --git a/include/linux/fs.h b/include/linux/fs.h index 70e61b58baaf..9f2813090d1b 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -74,7 +74,6 @@ typedef int (get_block_t)(struct inode *inode, sector_t iblock, struct buffer_head *bh_result, int create); typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset, ssize_t bytes, void *private); -typedef void (dax_iodone_t)(struct buffer_head *bh_map, int uptodate); #define MAY_EXEC 0x00000001 #define MAY_WRITE 0x00000002 -- 2.6.6 _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 2/7] dax: Remove complete_unwritten argument @ 2016-05-11 9:58 ` Jan Kara 0 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm Cc: Ted Tso, linux-ext4, Vishal Verma, Ross Zwisler, Dan Williams, linux-fsdevel, Jan Kara Fault handlers currently take complete_unwritten argument to convert unwritten extents after PTEs are updated. However no filesystem uses this anymore as the code is racy. Remove the unused argument. Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/block_dev.c | 4 ++-- fs/dax.c | 43 +++++++++---------------------------------- fs/ext2/file.c | 4 ++-- fs/ext4/file.c | 4 ++-- fs/xfs/xfs_file.c | 7 +++---- include/linux/dax.h | 17 +++++++---------- include/linux/fs.h | 1 - 7 files changed, 25 insertions(+), 55 deletions(-) diff --git a/fs/block_dev.c b/fs/block_dev.c index 20a2c02b77c4..b25bb230b28a 100644 --- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -1746,7 +1746,7 @@ static const struct address_space_operations def_blk_aops = { */ static int blkdev_dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf) { - return __dax_fault(vma, vmf, blkdev_get_block, NULL); + return __dax_fault(vma, vmf, blkdev_get_block); } static int blkdev_dax_pfn_mkwrite(struct vm_area_struct *vma, @@ -1758,7 +1758,7 @@ static int blkdev_dax_pfn_mkwrite(struct vm_area_struct *vma, static int blkdev_dax_pmd_fault(struct vm_area_struct *vma, unsigned long addr, pmd_t *pmd, unsigned int flags) { - return __dax_pmd_fault(vma, addr, pmd, flags, blkdev_get_block, NULL); + return __dax_pmd_fault(vma, addr, pmd, flags, blkdev_get_block); } static const struct vm_operations_struct blkdev_dax_vm_ops = { diff --git a/fs/dax.c b/fs/dax.c index 08799a510b4d..c5ccf745d279 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -612,19 +612,13 @@ static int dax_insert_mapping(struct inode *inode, struct buffer_head *bh, * @vma: The virtual memory area where the fault occurred * @vmf: The description of the fault * @get_block: The filesystem method used to translate file offsets to blocks - * @complete_unwritten: The filesystem method used to convert unwritten blocks - * to written so the data written to them is exposed. This is required for - * required by write faults for filesystems that will return unwritten - * extent mappings from @get_block, but it is optional for reads as - * dax_insert_mapping() will always zero unwritten blocks. If the fs does - * not support unwritten extents, the it should pass NULL. * * When a page fault occurs, filesystems may call this helper in their * fault handler for DAX files. __dax_fault() assumes the caller has done all * the necessary locking for the page fault to proceed successfully. */ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, - get_block_t get_block, dax_iodone_t complete_unwritten) + get_block_t get_block) { struct file *file = vma->vm_file; struct address_space *mapping = file->f_mapping; @@ -727,23 +721,9 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, page = NULL; } - /* - * If we successfully insert the new mapping over an unwritten extent, - * we need to ensure we convert the unwritten extent. If there is an - * error inserting the mapping, the filesystem needs to leave it as - * unwritten to prevent exposure of the stale underlying data to - * userspace, but we still need to call the completion function so - * the private resources on the mapping buffer can be released. We - * indicate what the callback should do via the uptodate variable, same - * as for normal BH based IO completions. - */ + /* Filesystem should not return unwritten buffers to us! */ + WARN_ON_ONCE(buffer_unwritten(&bh)); error = dax_insert_mapping(inode, &bh, vma, vmf); - if (buffer_unwritten(&bh)) { - if (complete_unwritten) - complete_unwritten(&bh, !error); - else - WARN_ON_ONCE(!(vmf->flags & FAULT_FLAG_WRITE)); - } out: if (error == -ENOMEM) @@ -772,7 +752,7 @@ EXPORT_SYMBOL(__dax_fault); * fault handler for DAX files. */ int dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, - get_block_t get_block, dax_iodone_t complete_unwritten) + get_block_t get_block) { int result; struct super_block *sb = file_inode(vma->vm_file)->i_sb; @@ -781,7 +761,7 @@ int dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, sb_start_pagefault(sb); file_update_time(vma->vm_file); } - result = __dax_fault(vma, vmf, get_block, complete_unwritten); + result = __dax_fault(vma, vmf, get_block); if (vmf->flags & FAULT_FLAG_WRITE) sb_end_pagefault(sb); @@ -815,8 +795,7 @@ static void __dax_dbg(struct buffer_head *bh, unsigned long address, #define dax_pmd_dbg(bh, address, reason) __dax_dbg(bh, address, reason, "dax_pmd") int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, - pmd_t *pmd, unsigned int flags, get_block_t get_block, - dax_iodone_t complete_unwritten) + pmd_t *pmd, unsigned int flags, get_block_t get_block) { struct file *file = vma->vm_file; struct address_space *mapping = file->f_mapping; @@ -875,6 +854,7 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, if (get_block(inode, block, &bh, 1) != 0) return VM_FAULT_SIGBUS; alloc = true; + WARN_ON_ONCE(buffer_unwritten(&bh)); } bdev = bh.b_bdev; @@ -1020,9 +1000,6 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, out: i_mmap_unlock_read(mapping); - if (buffer_unwritten(&bh)) - complete_unwritten(&bh, !(result & VM_FAULT_ERROR)); - return result; fallback: @@ -1042,8 +1019,7 @@ EXPORT_SYMBOL_GPL(__dax_pmd_fault); * pmd_fault handler for DAX files. */ int dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, - pmd_t *pmd, unsigned int flags, get_block_t get_block, - dax_iodone_t complete_unwritten) + pmd_t *pmd, unsigned int flags, get_block_t get_block) { int result; struct super_block *sb = file_inode(vma->vm_file)->i_sb; @@ -1052,8 +1028,7 @@ int dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, sb_start_pagefault(sb); file_update_time(vma->vm_file); } - result = __dax_pmd_fault(vma, address, pmd, flags, get_block, - complete_unwritten); + result = __dax_pmd_fault(vma, address, pmd, flags, get_block); if (flags & FAULT_FLAG_WRITE) sb_end_pagefault(sb); diff --git a/fs/ext2/file.c b/fs/ext2/file.c index c1400b109805..868c02317b05 100644 --- a/fs/ext2/file.c +++ b/fs/ext2/file.c @@ -51,7 +51,7 @@ static int ext2_dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf) } down_read(&ei->dax_sem); - ret = __dax_fault(vma, vmf, ext2_get_block, NULL); + ret = __dax_fault(vma, vmf, ext2_get_block); up_read(&ei->dax_sem); if (vmf->flags & FAULT_FLAG_WRITE) @@ -72,7 +72,7 @@ static int ext2_dax_pmd_fault(struct vm_area_struct *vma, unsigned long addr, } down_read(&ei->dax_sem); - ret = __dax_pmd_fault(vma, addr, pmd, flags, ext2_get_block, NULL); + ret = __dax_pmd_fault(vma, addr, pmd, flags, ext2_get_block); up_read(&ei->dax_sem); if (flags & FAULT_FLAG_WRITE) diff --git a/fs/ext4/file.c b/fs/ext4/file.c index dfb33da04589..cdcab0b36faa 100644 --- a/fs/ext4/file.c +++ b/fs/ext4/file.c @@ -207,7 +207,7 @@ static int ext4_dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf) if (IS_ERR(handle)) result = VM_FAULT_SIGBUS; else - result = __dax_fault(vma, vmf, ext4_dax_get_block, NULL); + result = __dax_fault(vma, vmf, ext4_dax_get_block); if (write) { if (!IS_ERR(handle)) @@ -243,7 +243,7 @@ static int ext4_dax_pmd_fault(struct vm_area_struct *vma, unsigned long addr, result = VM_FAULT_SIGBUS; else result = __dax_pmd_fault(vma, addr, pmd, flags, - ext4_dax_get_block, NULL); + ext4_dax_get_block); if (write) { if (!IS_ERR(handle)) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 569938a4a357..c2946f436a3a 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -1558,7 +1558,7 @@ xfs_filemap_page_mkwrite( xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED); if (IS_DAX(inode)) { - ret = __dax_mkwrite(vma, vmf, xfs_get_blocks_dax_fault, NULL); + ret = __dax_mkwrite(vma, vmf, xfs_get_blocks_dax_fault); } else { ret = block_page_mkwrite(vma, vmf, xfs_get_blocks); ret = block_page_mkwrite_return(ret); @@ -1592,7 +1592,7 @@ xfs_filemap_fault( * changes to xfs_get_blocks_direct() to map unwritten extent * ioend for conversion on read-only mappings. */ - ret = __dax_fault(vma, vmf, xfs_get_blocks_dax_fault, NULL); + ret = __dax_fault(vma, vmf, xfs_get_blocks_dax_fault); } else ret = filemap_fault(vma, vmf); xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED); @@ -1629,8 +1629,7 @@ xfs_filemap_pmd_fault( } xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED); - ret = __dax_pmd_fault(vma, addr, pmd, flags, xfs_get_blocks_dax_fault, - NULL); + ret = __dax_pmd_fault(vma, addr, pmd, flags, xfs_get_blocks_dax_fault); xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED); if (flags & FAULT_FLAG_WRITE) diff --git a/include/linux/dax.h b/include/linux/dax.h index 636dd59ab505..7c45ac7ea1d1 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -10,10 +10,8 @@ ssize_t dax_do_io(struct kiocb *, struct inode *, struct iov_iter *, loff_t, int dax_clear_sectors(struct block_device *bdev, sector_t _sector, long _size); int dax_zero_page_range(struct inode *, loff_t from, unsigned len, get_block_t); int dax_truncate_page(struct inode *, loff_t from, get_block_t); -int dax_fault(struct vm_area_struct *, struct vm_fault *, get_block_t, - dax_iodone_t); -int __dax_fault(struct vm_area_struct *, struct vm_fault *, get_block_t, - dax_iodone_t); +int dax_fault(struct vm_area_struct *, struct vm_fault *, get_block_t); +int __dax_fault(struct vm_area_struct *, struct vm_fault *, get_block_t); #ifdef CONFIG_FS_DAX struct page *read_dax_sector(struct block_device *bdev, sector_t n); @@ -27,21 +25,20 @@ static inline struct page *read_dax_sector(struct block_device *bdev, #ifdef CONFIG_TRANSPARENT_HUGEPAGE int dax_pmd_fault(struct vm_area_struct *, unsigned long addr, pmd_t *, - unsigned int flags, get_block_t, dax_iodone_t); + unsigned int flags, get_block_t); int __dax_pmd_fault(struct vm_area_struct *, unsigned long addr, pmd_t *, - unsigned int flags, get_block_t, dax_iodone_t); + unsigned int flags, get_block_t); #else static inline int dax_pmd_fault(struct vm_area_struct *vma, unsigned long addr, - pmd_t *pmd, unsigned int flags, get_block_t gb, - dax_iodone_t di) + pmd_t *pmd, unsigned int flags, get_block_t gb) { return VM_FAULT_FALLBACK; } #define __dax_pmd_fault dax_pmd_fault #endif int dax_pfn_mkwrite(struct vm_area_struct *, struct vm_fault *); -#define dax_mkwrite(vma, vmf, gb, iod) dax_fault(vma, vmf, gb, iod) -#define __dax_mkwrite(vma, vmf, gb, iod) __dax_fault(vma, vmf, gb, iod) +#define dax_mkwrite(vma, vmf, gb) dax_fault(vma, vmf, gb) +#define __dax_mkwrite(vma, vmf, gb) __dax_fault(vma, vmf, gb) static inline bool vma_is_dax(struct vm_area_struct *vma) { diff --git a/include/linux/fs.h b/include/linux/fs.h index 70e61b58baaf..9f2813090d1b 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -74,7 +74,6 @@ typedef int (get_block_t)(struct inode *inode, sector_t iblock, struct buffer_head *bh_result, int create); typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset, ssize_t bytes, void *private); -typedef void (dax_iodone_t)(struct buffer_head *bh_map, int uptodate); #define MAY_EXEC 0x00000001 #define MAY_WRITE 0x00000002 -- 2.6.6 ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 3/7] ext2: Avoid DAX zeroing to corrupt data 2016-05-11 9:58 ` Jan Kara @ 2016-05-11 9:58 ` Jan Kara -1 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm; +Cc: Ted Tso, linux-fsdevel, Jan Kara, linux-ext4 Currently ext2 zeroes any data blocks allocated for DAX inode however it still returns them as BH_New. Thus DAX code zeroes them again in dax_insert_mapping() which can possibly overwrite the data that has been already stored to those blocks by a racing dax_io(). Avoid marking pre-zeroed buffers as new. Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/ext2/inode.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c index 6bd58e6ff038..1f07b758b968 100644 --- a/fs/ext2/inode.c +++ b/fs/ext2/inode.c @@ -745,11 +745,11 @@ static int ext2_get_blocks(struct inode *inode, mutex_unlock(&ei->truncate_mutex); goto cleanup; } - } + } else + set_buffer_new(bh_result); ext2_splice_branch(inode, iblock, partial, indirect_blks, count); mutex_unlock(&ei->truncate_mutex); - set_buffer_new(bh_result); got_it: map_bh(bh_result, inode->i_sb, le32_to_cpu(chain[depth-1].key)); if (count > blocks_to_boundary) -- 2.6.6 _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 3/7] ext2: Avoid DAX zeroing to corrupt data @ 2016-05-11 9:58 ` Jan Kara 0 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm Cc: Ted Tso, linux-ext4, Vishal Verma, Ross Zwisler, Dan Williams, linux-fsdevel, Jan Kara Currently ext2 zeroes any data blocks allocated for DAX inode however it still returns them as BH_New. Thus DAX code zeroes them again in dax_insert_mapping() which can possibly overwrite the data that has been already stored to those blocks by a racing dax_io(). Avoid marking pre-zeroed buffers as new. Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/ext2/inode.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c index 6bd58e6ff038..1f07b758b968 100644 --- a/fs/ext2/inode.c +++ b/fs/ext2/inode.c @@ -745,11 +745,11 @@ static int ext2_get_blocks(struct inode *inode, mutex_unlock(&ei->truncate_mutex); goto cleanup; } - } + } else + set_buffer_new(bh_result); ext2_splice_branch(inode, iblock, partial, indirect_blks, count); mutex_unlock(&ei->truncate_mutex); - set_buffer_new(bh_result); got_it: map_bh(bh_result, inode->i_sb, le32_to_cpu(chain[depth-1].key)); if (count > blocks_to_boundary) -- 2.6.6 ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH 3/7] ext2: Avoid DAX zeroing to corrupt data 2016-05-11 9:58 ` Jan Kara @ 2016-05-12 18:45 ` Ross Zwisler -1 siblings, 0 replies; 30+ messages in thread From: Ross Zwisler @ 2016-05-12 18:45 UTC (permalink / raw) To: Jan Kara; +Cc: Ted Tso, linux-nvdimm, linux-fsdevel, linux-ext4 On Wed, May 11, 2016 at 11:58:49AM +0200, Jan Kara wrote: > Currently ext2 zeroes any data blocks allocated for DAX inode however it > still returns them as BH_New. Thus DAX code zeroes them again in > dax_insert_mapping() which can possibly overwrite the data that has been > already stored to those blocks by a racing dax_io(). Avoid marking > pre-zeroed buffers as new. > > Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> > Signed-off-by: Jan Kara <jack@suse.cz> > --- > fs/ext2/inode.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c > index 6bd58e6ff038..1f07b758b968 100644 > --- a/fs/ext2/inode.c > +++ b/fs/ext2/inode.c > @@ -745,11 +745,11 @@ static int ext2_get_blocks(struct inode *inode, > mutex_unlock(&ei->truncate_mutex); > goto cleanup; > } > - } > + } else > + set_buffer_new(bh_result); > > ext2_splice_branch(inode, iblock, partial, indirect_blks, count); > mutex_unlock(&ei->truncate_mutex); > - set_buffer_new(bh_result); > got_it: > map_bh(bh_result, inode->i_sb, le32_to_cpu(chain[depth-1].key)); > if (count > blocks_to_boundary) > -- > 2.6.6 Interestingly this change is causing a bunch of xfstests regressions for me with ext2 + DAX. All of these tests pass without this one change. # ./check generic/075 generic/091 generic/112 generic/127 generic/231 generic/263 FSTYP -- ext2 PLATFORM -- Linux/x86_64 alara 4.6.0-rc5+ MKFS_OPTIONS -- /dev/pmem0p2 MOUNT_OPTIONS -- -o dax -o context=system_u:object_r:nfs_t:s0 /dev/pmem0p2 /mnt/xfstests_scratch generic/075 2s ... [failed, exit status 1] - output mismatch (see /root/xfstests/results//generic/075.out.bad) --- tests/generic/075.out 2015-10-02 10:19:36.798795839 -0600 +++ /root/xfstests/results//generic/075.out.bad 2016-05-12 12:40:06.558706982 -0600 @@ -4,15 +4,5 @@ ----------------------------------------------- fsx.0 : -d -N numops -S 0 ----------------------------------------------- - ------------------------------------------------ -fsx.1 : -d -N numops -S 0 -x ------------------------------------------------ ... (Run 'diff -u tests/generic/075.out /root/xfstests/results//generic/075.out.bad' to see the entire diff) generic/091 1s ... [failed, exit status 1] - output mismatch (see /root/xfstests/results//generic/091.out.bad) --- tests/generic/091.out 2015-10-02 10:19:36.800795853 -0600 +++ /root/xfstests/results//generic/091.out.bad 2016-05-12 12:40:07.366712217 -0600 @@ -1,7 +1,110 @@ QA output created by 091 fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -fsx -N 10000 -o 32768 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -fsx -N 10000 -o 32768 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -W ... (Run 'diff -u tests/generic/091.out /root/xfstests/results//generic/091.out.bad' to see the entire diff) generic/112 2s ... [failed, exit status 1] - output mismatch (see /root/xfstests/results//generic/112.out.bad) --- tests/generic/112.out 2015-10-02 10:19:36.802795866 -0600 +++ /root/xfstests/results//generic/112.out.bad 2016-05-12 12:40:08.601720218 -0600 @@ -4,15 +4,5 @@ ----------------------------------------------- fsx.0 : -A -d -N numops -S 0 ----------------------------------------------- - ------------------------------------------------ -fsx.1 : -A -d -N numops -S 0 -x ------------------------------------------------ ... (Run 'diff -u tests/generic/112.out /root/xfstests/results//generic/112.out.bad' to see the entire diff) generic/127 14s ... - output mismatch (see /root/xfstests/results//generic/127.out.bad) --- tests/generic/127.out 2016-02-17 16:02:40.110656181 -0700 +++ /root/xfstests/results//generic/127.out.bad 2016-05-12 12:40:16.861773730 -0600 @@ -4,10 +4,905 @@ === FSX Light Mode, Memory Mapping === All 100000 operations completed A-OK! === FSX Standard Mode, No Memory Mapping === -All 100000 operations completed A-OK! +ltp/fsx -q -l 262144 -o 65536 -S 191110531 -N 100000 -R -W fsx_std_nommap +READ BAD DATA: offset = 0xd72d, size = 0x826b, fname = /mnt/xfstests_test/fsx_std_nommap +OFFSET GOOD BAD RANGE ... (Run 'diff -u tests/generic/127.out /root/xfstests/results//generic/127.out.bad' to see the entire diff) generic/231 22s ... [failed, exit status 1] - output mismatch (see /root/xfstests/results//generic/231.out.bad) --- tests/generic/231.out 2016-02-17 16:02:40.120656239 -0700 +++ /root/xfstests/results//generic/231.out.bad 2016-05-12 12:40:19.266789311 -0600 @@ -1,16 +1,1802 @@ QA output created by 231 === FSX Standard Mode, Memory Mapping, 1 Tasks === -All 20000 operations completed A-OK! -Comparing user usage -Comparing group usage -=== FSX Standard Mode, Memory Mapping, 4 Tasks === -All 20000 operations completed A-OK! ... (Run 'diff -u tests/generic/231.out /root/xfstests/results//generic/231.out.bad' to see the entire diff) generic/263 1s ... [failed, exit status 1] - output mismatch (see /root/xfstests/results//generic/263.out.bad) --- tests/generic/263.out 2015-10-02 10:19:36.807795900 -0600 +++ /root/xfstests/results//generic/263.out.bad 2016-05-12 12:40:19.801792777 -0600 @@ -1,3 +1,1429 @@ QA output created by 263 fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z +main: filesystem does not support fallocate mode 0, disabling! +main: filesystem does not support fallocate mode FALLOC_FL_KEEP_SIZE, disabling! +main: filesystem does not support fallocate mode FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, disabling! +main: filesystem does not support fallocate mode FALLOC_FL_ZERO_RANGE, disabling! ... (Run 'diff -u tests/generic/263.out /root/xfstests/results//generic/263.out.bad' to see the entire diff) Ran: generic/075 generic/091 generic/112 generic/127 generic/231 generic/263 Failures: generic/075 generic/091 generic/112 generic/127 generic/231 generic/263 Failed 6 of 6 tests _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 3/7] ext2: Avoid DAX zeroing to corrupt data @ 2016-05-12 18:45 ` Ross Zwisler 0 siblings, 0 replies; 30+ messages in thread From: Ross Zwisler @ 2016-05-12 18:45 UTC (permalink / raw) To: Jan Kara Cc: linux-nvdimm, Ted Tso, linux-ext4, Vishal Verma, Ross Zwisler, Dan Williams, linux-fsdevel On Wed, May 11, 2016 at 11:58:49AM +0200, Jan Kara wrote: > Currently ext2 zeroes any data blocks allocated for DAX inode however it > still returns them as BH_New. Thus DAX code zeroes them again in > dax_insert_mapping() which can possibly overwrite the data that has been > already stored to those blocks by a racing dax_io(). Avoid marking > pre-zeroed buffers as new. > > Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> > Signed-off-by: Jan Kara <jack@suse.cz> > --- > fs/ext2/inode.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c > index 6bd58e6ff038..1f07b758b968 100644 > --- a/fs/ext2/inode.c > +++ b/fs/ext2/inode.c > @@ -745,11 +745,11 @@ static int ext2_get_blocks(struct inode *inode, > mutex_unlock(&ei->truncate_mutex); > goto cleanup; > } > - } > + } else > + set_buffer_new(bh_result); > > ext2_splice_branch(inode, iblock, partial, indirect_blks, count); > mutex_unlock(&ei->truncate_mutex); > - set_buffer_new(bh_result); > got_it: > map_bh(bh_result, inode->i_sb, le32_to_cpu(chain[depth-1].key)); > if (count > blocks_to_boundary) > -- > 2.6.6 Interestingly this change is causing a bunch of xfstests regressions for me with ext2 + DAX. All of these tests pass without this one change. # ./check generic/075 generic/091 generic/112 generic/127 generic/231 generic/263 FSTYP -- ext2 PLATFORM -- Linux/x86_64 alara 4.6.0-rc5+ MKFS_OPTIONS -- /dev/pmem0p2 MOUNT_OPTIONS -- -o dax -o context=system_u:object_r:nfs_t:s0 /dev/pmem0p2 /mnt/xfstests_scratch generic/075 2s ... [failed, exit status 1] - output mismatch (see /root/xfstests/results//generic/075.out.bad) --- tests/generic/075.out 2015-10-02 10:19:36.798795839 -0600 +++ /root/xfstests/results//generic/075.out.bad 2016-05-12 12:40:06.558706982 -0600 @@ -4,15 +4,5 @@ ----------------------------------------------- fsx.0 : -d -N numops -S 0 ----------------------------------------------- - ------------------------------------------------ -fsx.1 : -d -N numops -S 0 -x ------------------------------------------------ ... (Run 'diff -u tests/generic/075.out /root/xfstests/results//generic/075.out.bad' to see the entire diff) generic/091 1s ... [failed, exit status 1] - output mismatch (see /root/xfstests/results//generic/091.out.bad) --- tests/generic/091.out 2015-10-02 10:19:36.800795853 -0600 +++ /root/xfstests/results//generic/091.out.bad 2016-05-12 12:40:07.366712217 -0600 @@ -1,7 +1,110 @@ QA output created by 091 fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -fsx -N 10000 -o 32768 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -fsx -N 10000 -o 32768 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -W ... (Run 'diff -u tests/generic/091.out /root/xfstests/results//generic/091.out.bad' to see the entire diff) generic/112 2s ... [failed, exit status 1] - output mismatch (see /root/xfstests/results//generic/112.out.bad) --- tests/generic/112.out 2015-10-02 10:19:36.802795866 -0600 +++ /root/xfstests/results//generic/112.out.bad 2016-05-12 12:40:08.601720218 -0600 @@ -4,15 +4,5 @@ ----------------------------------------------- fsx.0 : -A -d -N numops -S 0 ----------------------------------------------- - ------------------------------------------------ -fsx.1 : -A -d -N numops -S 0 -x ------------------------------------------------ ... (Run 'diff -u tests/generic/112.out /root/xfstests/results//generic/112.out.bad' to see the entire diff) generic/127 14s ... - output mismatch (see /root/xfstests/results//generic/127.out.bad) --- tests/generic/127.out 2016-02-17 16:02:40.110656181 -0700 +++ /root/xfstests/results//generic/127.out.bad 2016-05-12 12:40:16.861773730 -0600 @@ -4,10 +4,905 @@ === FSX Light Mode, Memory Mapping === All 100000 operations completed A-OK! === FSX Standard Mode, No Memory Mapping === -All 100000 operations completed A-OK! +ltp/fsx -q -l 262144 -o 65536 -S 191110531 -N 100000 -R -W fsx_std_nommap +READ BAD DATA: offset = 0xd72d, size = 0x826b, fname = /mnt/xfstests_test/fsx_std_nommap +OFFSET GOOD BAD RANGE ... (Run 'diff -u tests/generic/127.out /root/xfstests/results//generic/127.out.bad' to see the entire diff) generic/231 22s ... [failed, exit status 1] - output mismatch (see /root/xfstests/results//generic/231.out.bad) --- tests/generic/231.out 2016-02-17 16:02:40.120656239 -0700 +++ /root/xfstests/results//generic/231.out.bad 2016-05-12 12:40:19.266789311 -0600 @@ -1,16 +1,1802 @@ QA output created by 231 === FSX Standard Mode, Memory Mapping, 1 Tasks === -All 20000 operations completed A-OK! -Comparing user usage -Comparing group usage -=== FSX Standard Mode, Memory Mapping, 4 Tasks === -All 20000 operations completed A-OK! ... (Run 'diff -u tests/generic/231.out /root/xfstests/results//generic/231.out.bad' to see the entire diff) generic/263 1s ... [failed, exit status 1] - output mismatch (see /root/xfstests/results//generic/263.out.bad) --- tests/generic/263.out 2015-10-02 10:19:36.807795900 -0600 +++ /root/xfstests/results//generic/263.out.bad 2016-05-12 12:40:19.801792777 -0600 @@ -1,3 +1,1429 @@ QA output created by 263 fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z +main: filesystem does not support fallocate mode 0, disabling! +main: filesystem does not support fallocate mode FALLOC_FL_KEEP_SIZE, disabling! +main: filesystem does not support fallocate mode FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, disabling! +main: filesystem does not support fallocate mode FALLOC_FL_ZERO_RANGE, disabling! ... (Run 'diff -u tests/generic/263.out /root/xfstests/results//generic/263.out.bad' to see the entire diff) Ran: generic/075 generic/091 generic/112 generic/127 generic/231 generic/263 Failures: generic/075 generic/091 generic/112 generic/127 generic/231 generic/263 Failed 6 of 6 tests ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 3/7] ext2: Avoid DAX zeroing to corrupt data 2016-05-12 18:45 ` Ross Zwisler @ 2016-05-16 15:22 ` Jan Kara -1 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-16 15:22 UTC (permalink / raw) To: Ross Zwisler; +Cc: Ted Tso, linux-nvdimm, linux-fsdevel, Jan Kara, linux-ext4 On Thu 12-05-16 12:45:22, Ross Zwisler wrote: > On Wed, May 11, 2016 at 11:58:49AM +0200, Jan Kara wrote: > > Currently ext2 zeroes any data blocks allocated for DAX inode however it > > still returns them as BH_New. Thus DAX code zeroes them again in > > dax_insert_mapping() which can possibly overwrite the data that has been > > already stored to those blocks by a racing dax_io(). Avoid marking > > pre-zeroed buffers as new. > > > > Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> > > Signed-off-by: Jan Kara <jack@suse.cz> > > --- > > fs/ext2/inode.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c > > index 6bd58e6ff038..1f07b758b968 100644 > > --- a/fs/ext2/inode.c > > +++ b/fs/ext2/inode.c > > @@ -745,11 +745,11 @@ static int ext2_get_blocks(struct inode *inode, > > mutex_unlock(&ei->truncate_mutex); > > goto cleanup; > > } > > - } > > + } else > > + set_buffer_new(bh_result); > > > > ext2_splice_branch(inode, iblock, partial, indirect_blks, count); > > mutex_unlock(&ei->truncate_mutex); > > - set_buffer_new(bh_result); > > got_it: > > map_bh(bh_result, inode->i_sb, le32_to_cpu(chain[depth-1].key)); > > if (count > blocks_to_boundary) > > -- > > 2.6.6 > > Interestingly this change is causing a bunch of xfstests regressions for me > with ext2 + DAX. All of these tests pass without this one change. Good catch. Attached patch fixes this issue for me. Preferably it should be merged before the above ext2 change. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 3/7] ext2: Avoid DAX zeroing to corrupt data @ 2016-05-16 15:22 ` Jan Kara 0 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-16 15:22 UTC (permalink / raw) To: Ross Zwisler Cc: Jan Kara, linux-nvdimm, Ted Tso, linux-ext4, Vishal Verma, Dan Williams, linux-fsdevel [-- Attachment #1: Type: text/plain, Size: 1600 bytes --] On Thu 12-05-16 12:45:22, Ross Zwisler wrote: > On Wed, May 11, 2016 at 11:58:49AM +0200, Jan Kara wrote: > > Currently ext2 zeroes any data blocks allocated for DAX inode however it > > still returns them as BH_New. Thus DAX code zeroes them again in > > dax_insert_mapping() which can possibly overwrite the data that has been > > already stored to those blocks by a racing dax_io(). Avoid marking > > pre-zeroed buffers as new. > > > > Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> > > Signed-off-by: Jan Kara <jack@suse.cz> > > --- > > fs/ext2/inode.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c > > index 6bd58e6ff038..1f07b758b968 100644 > > --- a/fs/ext2/inode.c > > +++ b/fs/ext2/inode.c > > @@ -745,11 +745,11 @@ static int ext2_get_blocks(struct inode *inode, > > mutex_unlock(&ei->truncate_mutex); > > goto cleanup; > > } > > - } > > + } else > > + set_buffer_new(bh_result); > > > > ext2_splice_branch(inode, iblock, partial, indirect_blks, count); > > mutex_unlock(&ei->truncate_mutex); > > - set_buffer_new(bh_result); > > got_it: > > map_bh(bh_result, inode->i_sb, le32_to_cpu(chain[depth-1].key)); > > if (count > blocks_to_boundary) > > -- > > 2.6.6 > > Interestingly this change is causing a bunch of xfstests regressions for me > with ext2 + DAX. All of these tests pass without this one change. Good catch. Attached patch fixes this issue for me. Preferably it should be merged before the above ext2 change. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR [-- Attachment #2: 0001-ext2-Fix-block-zeroing-in-ext2_get_blocks-for-DAX.patch --] [-- Type: text/x-patch, Size: 1209 bytes --] >From 287d6b6cb0b6f325696fff93ff0f29ee5fde5736 Mon Sep 17 00:00:00 2001 From: Jan Kara <jack@suse.cz> Date: Mon, 16 May 2016 17:17:04 +0200 Subject: [PATCH] ext2: Fix block zeroing in ext2_get_blocks() for DAX When zeroing allocated blocks for DAX, we accidentally zeroed only the first allocated block instead of all of them. So far this problem is hidden by the fact that page faults always need only a single block and DAX write code zeroes blocks again. But the zeroing in DAX code is racy and needs to be removed so fix the zeroing in ext2 to zero all allocated blocks. Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/ext2/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c index 6bd58e6ff038..038d0ed5f565 100644 --- a/fs/ext2/inode.c +++ b/fs/ext2/inode.c @@ -740,7 +740,7 @@ static int ext2_get_blocks(struct inode *inode, err = dax_clear_sectors(inode->i_sb->s_bdev, le32_to_cpu(chain[depth-1].key) << (inode->i_blkbits - 9), - 1 << inode->i_blkbits); + count << inode->i_blkbits); if (err) { mutex_unlock(&ei->truncate_mutex); goto cleanup; -- 2.6.6 ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH 3/7] ext2: Avoid DAX zeroing to corrupt data 2016-05-16 15:22 ` Jan Kara @ 2016-05-17 6:52 ` Verma, Vishal L -1 siblings, 0 replies; 30+ messages in thread From: Verma, Vishal L @ 2016-05-17 6:52 UTC (permalink / raw) To: ross.zwisler, jack; +Cc: linux-fsdevel, linux-ext4, tytso, linux-nvdimm On Mon, 2016-05-16 at 17:22 +0200, Jan Kara wrote: > On Thu 12-05-16 12:45:22, Ross Zwisler wrote: > > > > On Wed, May 11, 2016 at 11:58:49AM +0200, Jan Kara wrote: > > > > > > Currently ext2 zeroes any data blocks allocated for DAX inode > > > however it > > > still returns them as BH_New. Thus DAX code zeroes them again in > > > dax_insert_mapping() which can possibly overwrite the data that > > > has been > > > already stored to those blocks by a racing dax_io(). Avoid > > > marking > > > pre-zeroed buffers as new. > > > > > > Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> > > > Signed-off-by: Jan Kara <jack@suse.cz> > > > --- > > > fs/ext2/inode.c | 4 ++-- > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c > > > index 6bd58e6ff038..1f07b758b968 100644 > > > --- a/fs/ext2/inode.c > > > +++ b/fs/ext2/inode.c > > > @@ -745,11 +745,11 @@ static int ext2_get_blocks(struct inode > > > *inode, > > > mutex_unlock(&ei->truncate_mutex); > > > goto cleanup; > > > } > > > - } > > > + } else > > > + set_buffer_new(bh_result); > > > > > > ext2_splice_branch(inode, iblock, partial, > > > indirect_blks, count); > > > mutex_unlock(&ei->truncate_mutex); > > > - set_buffer_new(bh_result); > > > got_it: > > > map_bh(bh_result, inode->i_sb, le32_to_cpu(chain[depth- > > > 1].key)); > > > if (count > blocks_to_boundary) > > > -- > > > 2.6.6 > > Interestingly this change is causing a bunch of xfstests > > regressions for me > > with ext2 + DAX. All of these tests pass without this one change. > Good catch. Attached patch fixes this issue for me. Preferably it > should be > merged before the above ext2 change. > > Honza Hey Jan, In my patch 3 of the error handling series, I have: - err = dax_clear_sectors(inode->i_sb->s_bdev, - le32_to_cpu(chain[depth-1].key) << - (inode->i_blkbits - 9), - 1 << inode->i_blkbits); + err = sb_issue_zeroout(inode->i_sb, + le32_to_cpu(chain[depth-1].key), 1, GFP_NOFS); Does this mean I have to change to send the sb_issue_zeroout for 'count' blocks.. i.e. - err = dax_clear_sectors(inode->i_sb->s_bdev, - le32_to_cpu(chain[depth-1].key) << - (inode->i_blkbits - 9), - 1 << inode->i_blkbits); + err = sb_issue_zeroout(inode->i_sb, + le32_to_cpu(chain[depth-1].key), count, GFP_NOFS); If so, I'll update my series tomorrow to include in both of these changes. Thanks, -Vishal _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 3/7] ext2: Avoid DAX zeroing to corrupt data @ 2016-05-17 6:52 ` Verma, Vishal L 0 siblings, 0 replies; 30+ messages in thread From: Verma, Vishal L @ 2016-05-17 6:52 UTC (permalink / raw) To: ross.zwisler, jack Cc: linux-ext4, Williams, Dan J, linux-nvdimm, tytso, linux-fsdevel On Mon, 2016-05-16 at 17:22 +0200, Jan Kara wrote: > On Thu 12-05-16 12:45:22, Ross Zwisler wrote: > > > > On Wed, May 11, 2016 at 11:58:49AM +0200, Jan Kara wrote: > > > > > > Currently ext2 zeroes any data blocks allocated for DAX inode > > > however it > > > still returns them as BH_New. Thus DAX code zeroes them again in > > > dax_insert_mapping() which can possibly overwrite the data that > > > has been > > > already stored to those blocks by a racing dax_io(). Avoid > > > marking > > > pre-zeroed buffers as new. > > > > > > Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> > > > Signed-off-by: Jan Kara <jack@suse.cz> > > > --- > > > fs/ext2/inode.c | 4 ++-- > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c > > > index 6bd58e6ff038..1f07b758b968 100644 > > > --- a/fs/ext2/inode.c > > > +++ b/fs/ext2/inode.c > > > @@ -745,11 +745,11 @@ static int ext2_get_blocks(struct inode > > > *inode, > > > mutex_unlock(&ei->truncate_mutex); > > > goto cleanup; > > > } > > > - } > > > + } else > > > + set_buffer_new(bh_result); > > > > > > ext2_splice_branch(inode, iblock, partial, > > > indirect_blks, count); > > > mutex_unlock(&ei->truncate_mutex); > > > - set_buffer_new(bh_result); > > > got_it: > > > map_bh(bh_result, inode->i_sb, le32_to_cpu(chain[depth- > > > 1].key)); > > > if (count > blocks_to_boundary) > > > -- > > > 2.6.6 > > Interestingly this change is causing a bunch of xfstests > > regressions for me > > with ext2 + DAX. All of these tests pass without this one change. > Good catch. Attached patch fixes this issue for me. Preferably it > should be > merged before the above ext2 change. > > Honza Hey Jan, In my patch 3 of the error handling series, I have: - err = dax_clear_sectors(inode->i_sb->s_bdev, - le32_to_cpu(chain[depth-1].key) << - (inode->i_blkbits - 9), - 1 << inode->i_blkbits); + err = sb_issue_zeroout(inode->i_sb, + le32_to_cpu(chain[depth-1].key), 1, GFP_NOFS); Does this mean I have to change to send the sb_issue_zeroout for 'count' blocks.. i.e. - err = dax_clear_sectors(inode->i_sb->s_bdev, - le32_to_cpu(chain[depth-1].key) << - (inode->i_blkbits - 9), - 1 << inode->i_blkbits); + err = sb_issue_zeroout(inode->i_sb, + le32_to_cpu(chain[depth-1].key), count, GFP_NOFS); If so, I'll update my series tomorrow to include in both of these changes. Thanks, -Vishal ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 3/7] ext2: Avoid DAX zeroing to corrupt data 2016-05-17 6:52 ` Verma, Vishal L (?) @ 2016-05-17 7:19 ` Jan Kara -1 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-17 7:19 UTC (permalink / raw) To: Verma, Vishal L; +Cc: tytso, linux-nvdimm, linux-fsdevel, jack, linux-ext4 On Tue 17-05-16 06:52:47, Verma, Vishal L wrote: > On Mon, 2016-05-16 at 17:22 +0200, Jan Kara wrote: > > On Thu 12-05-16 12:45:22, Ross Zwisler wrote: > > > > > > On Wed, May 11, 2016 at 11:58:49AM +0200, Jan Kara wrote: > > > > > > > > Currently ext2 zeroes any data blocks allocated for DAX inode > > > > however it > > > > still returns them as BH_New. Thus DAX code zeroes them again in > > > > dax_insert_mapping() which can possibly overwrite the data that > > > > has been > > > > already stored to those blocks by a racing dax_io(). Avoid > > > > marking > > > > pre-zeroed buffers as new. > > > > > > > > Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> > > > > Signed-off-by: Jan Kara <jack@suse.cz> > > > > --- > > > > fs/ext2/inode.c | 4 ++-- > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c > > > > index 6bd58e6ff038..1f07b758b968 100644 > > > > --- a/fs/ext2/inode.c > > > > +++ b/fs/ext2/inode.c > > > > @@ -745,11 +745,11 @@ static int ext2_get_blocks(struct inode > > > > *inode, > > > > mutex_unlock(&ei->truncate_mutex); > > > > goto cleanup; > > > > } > > > > - } > > > > + } else > > > > + set_buffer_new(bh_result); > > > > > > > > ext2_splice_branch(inode, iblock, partial, > > > > indirect_blks, count); > > > > mutex_unlock(&ei->truncate_mutex); > > > > - set_buffer_new(bh_result); > > > > got_it: > > > > map_bh(bh_result, inode->i_sb, le32_to_cpu(chain[depth- > > > > 1].key)); > > > > if (count > blocks_to_boundary) > > > > -- > > > > 2.6.6 > > > Interestingly this change is causing a bunch of xfstests > > > regressions for me > > > with ext2 + DAX. All of these tests pass without this one change. > > Good catch. Attached patch fixes this issue for me. Preferably it > > should be > > merged before the above ext2 change. > > > > Honza > > Hey Jan, > > In my patch 3 of the error handling series, I have: > > - err = dax_clear_sectors(inode->i_sb->s_bdev, > - le32_to_cpu(chain[depth-1].key) << > - (inode->i_blkbits - 9), > - 1 << inode->i_blkbits); > + err = sb_issue_zeroout(inode->i_sb, > + le32_to_cpu(chain[depth-1].key), 1, GFP_NOFS); > > Does this mean I have to change to send the sb_issue_zeroout for > 'count' blocks.. i.e. Yes, I've noticed the conflict today as well. > - err = dax_clear_sectors(inode->i_sb->s_bdev, > - le32_to_cpu(chain[depth-1].key) << > - (inode->i_blkbits - 9), > - 1 << inode->i_blkbits); > + err = sb_issue_zeroout(inode->i_sb, > + le32_to_cpu(chain[depth-1].key), count, GFP_NOFS); > > If so, I'll update my series tomorrow to include in both of these changes. I'd prefer these two to stay separate commits (they are really independent). Since you already depend on other patches from the DAX cleanup series, just add this patch to the list of dependencies and base your change on that... Hmm? Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 3/7] ext2: Avoid DAX zeroing to corrupt data @ 2016-05-17 7:19 ` Jan Kara 0 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-17 7:19 UTC (permalink / raw) To: Verma, Vishal L Cc: ross.zwisler, jack, linux-ext4, Williams, Dan J, linux-nvdimm, tytso, linux-fsdevel On Tue 17-05-16 06:52:47, Verma, Vishal L wrote: > On Mon, 2016-05-16 at 17:22 +0200, Jan Kara wrote: > > On Thu 12-05-16 12:45:22, Ross Zwisler wrote: > > > > > > On Wed, May 11, 2016 at 11:58:49AM +0200, Jan Kara wrote: > > > > > > > > Currently ext2 zeroes any data blocks allocated for DAX inode > > > > however it > > > > still returns them as BH_New. Thus DAX code zeroes them again in > > > > dax_insert_mapping() which can possibly overwrite the data that > > > > has been > > > > already stored to those blocks by a racing dax_io(). Avoid > > > > marking > > > > pre-zeroed buffers as new. > > > > > > > > Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> > > > > Signed-off-by: Jan Kara <jack@suse.cz> > > > > --- > > > > fs/ext2/inode.c | 4 ++-- > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c > > > > index 6bd58e6ff038..1f07b758b968 100644 > > > > --- a/fs/ext2/inode.c > > > > +++ b/fs/ext2/inode.c > > > > @@ -745,11 +745,11 @@ static int ext2_get_blocks(struct inode > > > > *inode, > > > > mutex_unlock(&ei->truncate_mutex); > > > > goto cleanup; > > > > } > > > > - } > > > > + } else > > > > + set_buffer_new(bh_result); > > > > > > > > ext2_splice_branch(inode, iblock, partial, > > > > indirect_blks, count); > > > > mutex_unlock(&ei->truncate_mutex); > > > > - set_buffer_new(bh_result); > > > > got_it: > > > > map_bh(bh_result, inode->i_sb, le32_to_cpu(chain[depth- > > > > 1].key)); > > > > if (count > blocks_to_boundary) > > > > -- > > > > 2.6.6 > > > Interestingly this change is causing a bunch of xfstests > > > regressions for me > > > with ext2 + DAX. All of these tests pass without this one change. > > Good catch. Attached patch fixes this issue for me. Preferably it > > should be > > merged before the above ext2 change. > > > > Honza > > Hey Jan, > > In my patch 3 of the error handling series, I have: > > - err = dax_clear_sectors(inode->i_sb->s_bdev, > - le32_to_cpu(chain[depth-1].key) << > - (inode->i_blkbits - 9), > - 1 << inode->i_blkbits); > + err = sb_issue_zeroout(inode->i_sb, > + le32_to_cpu(chain[depth-1].key), 1, GFP_NOFS); > > Does this mean I have to change to send the sb_issue_zeroout for > 'count' blocks.. i.e. Yes, I've noticed the conflict today as well. > - err = dax_clear_sectors(inode->i_sb->s_bdev, > - le32_to_cpu(chain[depth-1].key) << > - (inode->i_blkbits - 9), > - 1 << inode->i_blkbits); > + err = sb_issue_zeroout(inode->i_sb, > + le32_to_cpu(chain[depth-1].key), count, GFP_NOFS); > > If so, I'll update my series tomorrow to include in both of these changes. I'd prefer these two to stay separate commits (they are really independent). Since you already depend on other patches from the DAX cleanup series, just add this patch to the list of dependencies and base your change on that... Hmm? Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 3/7] ext2: Avoid DAX zeroing to corrupt data @ 2016-05-17 7:19 ` Jan Kara 0 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-17 7:19 UTC (permalink / raw) To: Verma, Vishal L Cc: ross.zwisler, jack, linux-ext4, Williams, Dan J, linux-nvdimm, tytso, linux-fsdevel On Tue 17-05-16 06:52:47, Verma, Vishal L wrote: > On Mon, 2016-05-16 at 17:22 +0200, Jan Kara wrote: > > On Thu 12-05-16 12:45:22, Ross Zwisler wrote: > > > > > > On Wed, May 11, 2016 at 11:58:49AM +0200, Jan Kara wrote: > > > > > > > > Currently ext2 zeroes any data blocks allocated for DAX inode > > > > however it > > > > still returns them as BH_New. Thus DAX code zeroes them again in > > > > dax_insert_mapping() which can possibly overwrite the data that > > > > has been > > > > already stored to those blocks by a racing dax_io(). Avoid > > > > marking > > > > pre-zeroed buffers as new. > > > > > > > > Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> > > > > Signed-off-by: Jan Kara <jack@suse.cz> > > > > --- > > > > �fs/ext2/inode.c | 4 ++-- > > > > �1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c > > > > index 6bd58e6ff038..1f07b758b968 100644 > > > > --- a/fs/ext2/inode.c > > > > +++ b/fs/ext2/inode.c > > > > @@ -745,11 +745,11 @@ static int ext2_get_blocks(struct inode > > > > *inode, > > > > � mutex_unlock(&ei->truncate_mutex); > > > > � goto cleanup; > > > > � } > > > > - } > > > > + } else > > > > + set_buffer_new(bh_result); > > > > � > > > > � ext2_splice_branch(inode, iblock, partial, > > > > indirect_blks, count); > > > > � mutex_unlock(&ei->truncate_mutex); > > > > - set_buffer_new(bh_result); > > > > �got_it: > > > > � map_bh(bh_result, inode->i_sb, le32_to_cpu(chain[depth- > > > > 1].key)); > > > > � if (count > blocks_to_boundary) > > > > --� > > > > 2.6.6 > > > Interestingly this change is causing a bunch of xfstests > > > regressions for me > > > with ext2 + DAX.��All of these tests pass without this one change. > > Good catch. Attached patch fixes this issue for me. Preferably it > > should be > > merged before the above ext2 change. > > > > Honza > > Hey Jan, > > In my patch 3 of the error handling series, I have: > > -���������������err = dax_clear_sectors(inode->i_sb->s_bdev, > -�������������������������������le32_to_cpu(chain[depth-1].key) << > -�������������������������������(inode->i_blkbits - 9), > -�������������������������������1 << inode->i_blkbits); > +���������������err = sb_issue_zeroout(inode->i_sb, > +�������������������������������le32_to_cpu(chain[depth-1].key), 1, GFP_NOFS); > > Does this mean I have to change to send the sb_issue_zeroout for > 'count' blocks.. i.e. Yes, I've noticed the conflict today as well. > -���������������err = dax_clear_sectors(inode->i_sb->s_bdev, > -�������������������������������le32_to_cpu(chain[depth-1].key) << > -�������������������������������(inode->i_blkbits - 9), > -�������������������������������1 << inode->i_blkbits); > +���������������err = sb_issue_zeroout(inode->i_sb, > +�������������������������������le32_to_cpu(chain[depth-1].key), count, GFP_NOFS); > > If so, I'll update my series tomorrow to include in both of these changes. I'd prefer these two to stay separate commits (they are really independent). Since you already depend on other patches from the DAX cleanup series, just add this patch to the list of dependencies and base your change on that... Hmm? Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 3/7] ext2: Avoid DAX zeroing to corrupt data 2016-05-17 7:19 ` Jan Kara @ 2016-05-17 17:56 ` Verma, Vishal L -1 siblings, 0 replies; 30+ messages in thread From: Verma, Vishal L @ 2016-05-17 17:56 UTC (permalink / raw) To: jack; +Cc: tytso, linux-nvdimm, linux-fsdevel, linux-ext4 On Tue, 2016-05-17 at 09:19 +0200, Jan Kara wrote: > On Tue 17-05-16 06:52:47, Verma, Vishal L wrote: [...] > > > > > > - err = dax_clear_sectors(inode->i_sb->s_bdev, > > - le32_to_cpu(chain[depth-1].key) << > > - (inode->i_blkbits - 9), > > - 1 << inode->i_blkbits); > > + err = sb_issue_zeroout(inode->i_sb, > > + le32_to_cpu(chain[depth-1].key), > > count, GFP_NOFS); > > > > If so, I'll update my series tomorrow to include in both of these > > changes. > I'd prefer these two to stay separate commits (they are really > independent). Since you already depend on other patches from the DAX > cleanup series, just add this patch to the list of dependencies and > base > your change on that... Hmm? Yes I agree. I'm preparing a branch that adds your fix, and modifying my sb_issue_zeroout to use the same fix. I've verified that those tests pass after both changes. Thanks! -Vishal _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 3/7] ext2: Avoid DAX zeroing to corrupt data @ 2016-05-17 17:56 ` Verma, Vishal L 0 siblings, 0 replies; 30+ messages in thread From: Verma, Vishal L @ 2016-05-17 17:56 UTC (permalink / raw) To: jack Cc: linux-ext4, ross.zwisler, linux-nvdimm, Williams, Dan J, tytso, linux-fsdevel On Tue, 2016-05-17 at 09:19 +0200, Jan Kara wrote: > On Tue 17-05-16 06:52:47, Verma, Vishal L wrote: [...] > > > > > > - err = dax_clear_sectors(inode->i_sb->s_bdev, > > - le32_to_cpu(chain[depth-1].key) << > > - (inode->i_blkbits - 9), > > - 1 << inode->i_blkbits); > > + err = sb_issue_zeroout(inode->i_sb, > > + le32_to_cpu(chain[depth-1].key), > > count, GFP_NOFS); > > > > If so, I'll update my series tomorrow to include in both of these > > changes. > I'd prefer these two to stay separate commits (they are really > independent). Since you already depend on other patches from the DAX > cleanup series, just add this patch to the list of dependencies and > base > your change on that... Hmm? Yes I agree. I'm preparing a branch that adds your fix, and modifying my sb_issue_zeroout to use the same fix. I've verified that those tests pass after both changes. Thanks! -Vishal ^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 4/7] dax: Remove dead zeroing code from fault handlers 2016-05-11 9:58 ` Jan Kara (?) @ 2016-05-11 9:58 ` Jan Kara -1 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm; +Cc: Ted Tso, linux-fsdevel, Jan Kara, linux-ext4 Now that all filesystems zero out blocks allocated for a fault handler, we can just remove the zeroing from the handler itself. Also add checks that no filesystem returns to us unwritten or new buffer. Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/dax.c | 19 +++---------------- 1 file changed, 3 insertions(+), 16 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index c5ccf745d279..ccb8bc399d78 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -587,11 +587,6 @@ static int dax_insert_mapping(struct inode *inode, struct buffer_head *bh, error = PTR_ERR(dax.addr); goto out; } - - if (buffer_unwritten(bh) || buffer_new(bh)) { - clear_pmem(dax.addr, PAGE_SIZE); - wmb_pmem(); - } dax_unmap_atomic(bdev, &dax); error = dax_radix_entry(mapping, vmf->pgoff, dax.sector, false, @@ -670,7 +665,7 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, if (error) goto unlock_page; - if (!buffer_mapped(&bh) && !buffer_unwritten(&bh) && !vmf->cow_page) { + if (!buffer_mapped(&bh) && !vmf->cow_page) { if (vmf->flags & FAULT_FLAG_WRITE) { error = get_block(inode, block, &bh, 1); count_vm_event(PGMAJFAULT); @@ -722,7 +717,7 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, } /* Filesystem should not return unwritten buffers to us! */ - WARN_ON_ONCE(buffer_unwritten(&bh)); + WARN_ON_ONCE(buffer_unwritten(&bh) || buffer_new(&bh)); error = dax_insert_mapping(inode, &bh, vma, vmf); out: @@ -854,7 +849,7 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, if (get_block(inode, block, &bh, 1) != 0) return VM_FAULT_SIGBUS; alloc = true; - WARN_ON_ONCE(buffer_unwritten(&bh)); + WARN_ON_ONCE(buffer_unwritten(&bh) || buffer_new(&bh)); } bdev = bh.b_bdev; @@ -953,14 +948,6 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, dax_pmd_dbg(&bh, address, "pfn not in memmap"); goto fallback; } - - if (buffer_unwritten(&bh) || buffer_new(&bh)) { - clear_pmem(dax.addr, PMD_SIZE); - wmb_pmem(); - count_vm_event(PGMAJFAULT); - mem_cgroup_count_vm_event(vma->vm_mm, PGMAJFAULT); - result |= VM_FAULT_MAJOR; - } dax_unmap_atomic(bdev, &dax); /* -- 2.6.6 _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 4/7] dax: Remove dead zeroing code from fault handlers @ 2016-05-11 9:58 ` Jan Kara 0 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm Cc: Ted Tso, linux-ext4, Vishal Verma, Ross Zwisler, Dan Williams, linux-fsdevel, Jan Kara Now that all filesystems zero out blocks allocated for a fault handler, we can just remove the zeroing from the handler itself. Also add checks that no filesystem returns to us unwritten or new buffer. Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/dax.c | 19 +++---------------- 1 file changed, 3 insertions(+), 16 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index c5ccf745d279..ccb8bc399d78 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -587,11 +587,6 @@ static int dax_insert_mapping(struct inode *inode, struct buffer_head *bh, error = PTR_ERR(dax.addr); goto out; } - - if (buffer_unwritten(bh) || buffer_new(bh)) { - clear_pmem(dax.addr, PAGE_SIZE); - wmb_pmem(); - } dax_unmap_atomic(bdev, &dax); error = dax_radix_entry(mapping, vmf->pgoff, dax.sector, false, @@ -670,7 +665,7 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, if (error) goto unlock_page; - if (!buffer_mapped(&bh) && !buffer_unwritten(&bh) && !vmf->cow_page) { + if (!buffer_mapped(&bh) && !vmf->cow_page) { if (vmf->flags & FAULT_FLAG_WRITE) { error = get_block(inode, block, &bh, 1); count_vm_event(PGMAJFAULT); @@ -722,7 +717,7 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, } /* Filesystem should not return unwritten buffers to us! */ - WARN_ON_ONCE(buffer_unwritten(&bh)); + WARN_ON_ONCE(buffer_unwritten(&bh) || buffer_new(&bh)); error = dax_insert_mapping(inode, &bh, vma, vmf); out: @@ -854,7 +849,7 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, if (get_block(inode, block, &bh, 1) != 0) return VM_FAULT_SIGBUS; alloc = true; - WARN_ON_ONCE(buffer_unwritten(&bh)); + WARN_ON_ONCE(buffer_unwritten(&bh) || buffer_new(&bh)); } bdev = bh.b_bdev; @@ -953,14 +948,6 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, dax_pmd_dbg(&bh, address, "pfn not in memmap"); goto fallback; } ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 4/7] dax: Remove dead zeroing code from fault handlers @ 2016-05-11 9:58 ` Jan Kara 0 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm Cc: Ted Tso, linux-ext4, Vishal Verma, Ross Zwisler, Dan Williams, linux-fsdevel, Jan Kara Now that all filesystems zero out blocks allocated for a fault handler, we can just remove the zeroing from the handler itself. Also add checks that no filesystem returns to us unwritten or new buffer. Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/dax.c | 19 +++---------------- 1 file changed, 3 insertions(+), 16 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index c5ccf745d279..ccb8bc399d78 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -587,11 +587,6 @@ static int dax_insert_mapping(struct inode *inode, struct buffer_head *bh, error = PTR_ERR(dax.addr); goto out; } - - if (buffer_unwritten(bh) || buffer_new(bh)) { - clear_pmem(dax.addr, PAGE_SIZE); - wmb_pmem(); - } dax_unmap_atomic(bdev, &dax); error = dax_radix_entry(mapping, vmf->pgoff, dax.sector, false, @@ -670,7 +665,7 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, if (error) goto unlock_page; - if (!buffer_mapped(&bh) && !buffer_unwritten(&bh) && !vmf->cow_page) { + if (!buffer_mapped(&bh) && !vmf->cow_page) { if (vmf->flags & FAULT_FLAG_WRITE) { error = get_block(inode, block, &bh, 1); count_vm_event(PGMAJFAULT); @@ -722,7 +717,7 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, } /* Filesystem should not return unwritten buffers to us! */ - WARN_ON_ONCE(buffer_unwritten(&bh)); + WARN_ON_ONCE(buffer_unwritten(&bh) || buffer_new(&bh)); error = dax_insert_mapping(inode, &bh, vma, vmf); out: @@ -854,7 +849,7 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, if (get_block(inode, block, &bh, 1) != 0) return VM_FAULT_SIGBUS; alloc = true; - WARN_ON_ONCE(buffer_unwritten(&bh)); + WARN_ON_ONCE(buffer_unwritten(&bh) || buffer_new(&bh)); } bdev = bh.b_bdev; @@ -953,14 +948,6 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, dax_pmd_dbg(&bh, address, "pfn not in memmap"); goto fallback; } - - if (buffer_unwritten(&bh) || buffer_new(&bh)) { - clear_pmem(dax.addr, PMD_SIZE); - wmb_pmem(); - count_vm_event(PGMAJFAULT); - mem_cgroup_count_vm_event(vma->vm_mm, PGMAJFAULT); - result |= VM_FAULT_MAJOR; - } dax_unmap_atomic(bdev, &dax); /* -- 2.6.6 ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 5/7] dax: Remove zeroing from dax_io() 2016-05-11 9:58 ` Jan Kara @ 2016-05-11 9:58 ` Jan Kara -1 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm; +Cc: Ted Tso, linux-fsdevel, Jan Kara, linux-ext4 All the filesystems are now zeroing blocks themselves for DAX IO to avoid races between dax_io() and dax_fault(). Remove the zeroing code from dax_io() and add warning to catch the case when somebody unexpectedly returns new or unwritten buffer. Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/dax.c | 28 ++++++++++------------------ 1 file changed, 10 insertions(+), 18 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index ccb8bc399d78..7c0036dd1570 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -119,18 +119,6 @@ int dax_clear_sectors(struct block_device *bdev, sector_t _sector, long _size) } EXPORT_SYMBOL_GPL(dax_clear_sectors); -/* the clear_pmem() calls are ordered by a wmb_pmem() in the caller */ -static void dax_new_buf(void __pmem *addr, unsigned size, unsigned first, - loff_t pos, loff_t end) -{ - loff_t final = end - pos + first; /* The final byte of the buffer */ - - if (first > 0) - clear_pmem(addr, first); - if (final < size) - clear_pmem(addr + final, size - final); -} - static bool buffer_written(struct buffer_head *bh) { return buffer_mapped(bh) && !buffer_unwritten(bh); @@ -169,6 +157,9 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter, struct blk_dax_ctl dax = { .addr = (void __pmem *) ERR_PTR(-EIO), }; + unsigned blkbits = inode->i_blkbits; + sector_t file_blks = (i_size_read(inode) + (1 << blkbits) - 1) + >> blkbits; if (rw == READ) end = min(end, i_size_read(inode)); @@ -176,7 +167,6 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter, while (pos < end) { size_t len; if (pos == max) { - unsigned blkbits = inode->i_blkbits; long page = pos >> PAGE_SHIFT; sector_t block = page << (PAGE_SHIFT - blkbits); unsigned first = pos - (block << blkbits); @@ -192,6 +182,13 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter, bh->b_size = 1 << blkbits; bh_max = pos - first + bh->b_size; bdev = bh->b_bdev; + /* + * We allow uninitialized buffers for writes + * beyond EOF as those cannot race with faults + */ + WARN_ON_ONCE( + (buffer_new(bh) && block < file_blks) || + (rw == WRITE && buffer_unwritten(bh))); } else { unsigned done = bh->b_size - (bh_max - (pos - first)); @@ -211,11 +208,6 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter, rc = map_len; break; } - if (buffer_unwritten(bh) || buffer_new(bh)) { - dax_new_buf(dax.addr, map_len, first, - pos, end); - need_wmb = true; - } dax.addr += first; size = map_len - first; } -- 2.6.6 _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 5/7] dax: Remove zeroing from dax_io() @ 2016-05-11 9:58 ` Jan Kara 0 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm Cc: Ted Tso, linux-ext4, Vishal Verma, Ross Zwisler, Dan Williams, linux-fsdevel, Jan Kara All the filesystems are now zeroing blocks themselves for DAX IO to avoid races between dax_io() and dax_fault(). Remove the zeroing code from dax_io() and add warning to catch the case when somebody unexpectedly returns new or unwritten buffer. Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/dax.c | 28 ++++++++++------------------ 1 file changed, 10 insertions(+), 18 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index ccb8bc399d78..7c0036dd1570 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -119,18 +119,6 @@ int dax_clear_sectors(struct block_device *bdev, sector_t _sector, long _size) } EXPORT_SYMBOL_GPL(dax_clear_sectors); -/* the clear_pmem() calls are ordered by a wmb_pmem() in the caller */ -static void dax_new_buf(void __pmem *addr, unsigned size, unsigned first, - loff_t pos, loff_t end) -{ - loff_t final = end - pos + first; /* The final byte of the buffer */ - - if (first > 0) - clear_pmem(addr, first); - if (final < size) - clear_pmem(addr + final, size - final); -} - static bool buffer_written(struct buffer_head *bh) { return buffer_mapped(bh) && !buffer_unwritten(bh); @@ -169,6 +157,9 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter, struct blk_dax_ctl dax = { .addr = (void __pmem *) ERR_PTR(-EIO), }; + unsigned blkbits = inode->i_blkbits; + sector_t file_blks = (i_size_read(inode) + (1 << blkbits) - 1) + >> blkbits; if (rw == READ) end = min(end, i_size_read(inode)); @@ -176,7 +167,6 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter, while (pos < end) { size_t len; if (pos == max) { - unsigned blkbits = inode->i_blkbits; long page = pos >> PAGE_SHIFT; sector_t block = page << (PAGE_SHIFT - blkbits); unsigned first = pos - (block << blkbits); @@ -192,6 +182,13 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter, bh->b_size = 1 << blkbits; bh_max = pos - first + bh->b_size; bdev = bh->b_bdev; + /* + * We allow uninitialized buffers for writes + * beyond EOF as those cannot race with faults + */ + WARN_ON_ONCE( + (buffer_new(bh) && block < file_blks) || + (rw == WRITE && buffer_unwritten(bh))); } else { unsigned done = bh->b_size - (bh_max - (pos - first)); @@ -211,11 +208,6 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter, rc = map_len; break; } - if (buffer_unwritten(bh) || buffer_new(bh)) { - dax_new_buf(dax.addr, map_len, first, - pos, end); - need_wmb = true; - } dax.addr += first; size = map_len - first; } -- 2.6.6 ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 6/7] dax: Remove pointless writeback from dax_do_io() 2016-05-11 9:58 ` Jan Kara @ 2016-05-11 9:58 ` Jan Kara -1 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm; +Cc: Ted Tso, linux-fsdevel, Jan Kara, linux-ext4 dax_do_io() is calling filemap_write_and_wait() if DIO_LOCKING flags is set. Presumably this was copied over from direct IO code. However DAX inodes have no pagecache pages to write so the call is pointless. Remove it. Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/dax.c | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 7c0036dd1570..237581441bc1 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -268,15 +268,8 @@ ssize_t dax_do_io(struct kiocb *iocb, struct inode *inode, memset(&bh, 0, sizeof(bh)); bh.b_bdev = inode->i_sb->s_bdev; - if ((flags & DIO_LOCKING) && iov_iter_rw(iter) == READ) { - struct address_space *mapping = inode->i_mapping; + if ((flags & DIO_LOCKING) && iov_iter_rw(iter) == READ) inode_lock(inode); - retval = filemap_write_and_wait_range(mapping, pos, end - 1); - if (retval) { - inode_unlock(inode); - goto out; - } - } /* Protects against truncate */ if (!(flags & DIO_SKIP_DIO_COUNT)) @@ -297,7 +290,6 @@ ssize_t dax_do_io(struct kiocb *iocb, struct inode *inode, if (!(flags & DIO_SKIP_DIO_COUNT)) inode_dio_end(inode); - out: return retval; } EXPORT_SYMBOL_GPL(dax_do_io); -- 2.6.6 _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 6/7] dax: Remove pointless writeback from dax_do_io() @ 2016-05-11 9:58 ` Jan Kara 0 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm Cc: Ted Tso, linux-ext4, Vishal Verma, Ross Zwisler, Dan Williams, linux-fsdevel, Jan Kara dax_do_io() is calling filemap_write_and_wait() if DIO_LOCKING flags is set. Presumably this was copied over from direct IO code. However DAX inodes have no pagecache pages to write so the call is pointless. Remove it. Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/dax.c | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 7c0036dd1570..237581441bc1 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -268,15 +268,8 @@ ssize_t dax_do_io(struct kiocb *iocb, struct inode *inode, memset(&bh, 0, sizeof(bh)); bh.b_bdev = inode->i_sb->s_bdev; - if ((flags & DIO_LOCKING) && iov_iter_rw(iter) == READ) { - struct address_space *mapping = inode->i_mapping; + if ((flags & DIO_LOCKING) && iov_iter_rw(iter) == READ) inode_lock(inode); - retval = filemap_write_and_wait_range(mapping, pos, end - 1); - if (retval) { - inode_unlock(inode); - goto out; - } - } /* Protects against truncate */ if (!(flags & DIO_SKIP_DIO_COUNT)) @@ -297,7 +290,6 @@ ssize_t dax_do_io(struct kiocb *iocb, struct inode *inode, if (!(flags & DIO_SKIP_DIO_COUNT)) inode_dio_end(inode); - out: return retval; } EXPORT_SYMBOL_GPL(dax_do_io); -- 2.6.6 ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 7/7] dax: Remove redundant inode size checks 2016-05-11 9:58 ` Jan Kara (?) @ 2016-05-11 9:58 ` Jan Kara -1 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm; +Cc: Ted Tso, linux-fsdevel, Jan Kara, linux-ext4 Callers of dax fault handlers must make sure these calls cannot race with truncate. Thus it is enough to check inode size when entering the function and we don't have to recheck it again later in the handler. Note that inode size itself can be decreased while the fault handler runs but filesystem locking prevents against any radix tree or block mapping information changes resulting from the truncate and that is what we really care about. Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/dax.c | 60 +----------------------------------------------------------- 1 file changed, 1 insertion(+), 59 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 237581441bc1..9bc6624251b4 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -305,20 +305,11 @@ EXPORT_SYMBOL_GPL(dax_do_io); static int dax_load_hole(struct address_space *mapping, struct page *page, struct vm_fault *vmf) { - unsigned long size; - struct inode *inode = mapping->host; if (!page) page = find_or_create_page(mapping, vmf->pgoff, GFP_KERNEL | __GFP_ZERO); if (!page) return VM_FAULT_OOM; - /* Recheck i_size under page lock to avoid truncate race */ - size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT; - if (vmf->pgoff >= size) { - unlock_page(page); - put_page(page); - return VM_FAULT_SIGBUS; - } vmf->page = page; return VM_FAULT_LOCKED; @@ -549,24 +540,10 @@ static int dax_insert_mapping(struct inode *inode, struct buffer_head *bh, .sector = to_sector(bh, inode), .size = bh->b_size, }; - pgoff_t size; int error; i_mmap_lock_read(mapping); - /* - * Check truncate didn't happen while we were allocating a block. - * If it did, this block may or may not be still allocated to the - * file. We can't tell the filesystem to free it because we can't - * take i_mutex here. In the worst case, the file still has blocks - * allocated past the end of the file. - */ - size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT; - if (unlikely(vmf->pgoff >= size)) { - error = -EIO; - goto out; - } - if (dax_map_atomic(bdev, &dax) < 0) { error = PTR_ERR(dax.addr); goto out; @@ -632,15 +609,6 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, put_page(page); goto repeat; } - size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT; - if (unlikely(vmf->pgoff >= size)) { - /* - * We have a struct page covering a hole in the file - * from a read fault and we've raced with a truncate - */ - error = -EIO; - goto unlock_page; - } } error = get_block(inode, block, &bh, 0); @@ -673,17 +641,8 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, if (error) goto unlock_page; vmf->page = page; - if (!page) { + if (!page) i_mmap_lock_read(mapping); - /* Check we didn't race with truncate */ - size = (i_size_read(inode) + PAGE_SIZE - 1) >> - PAGE_SHIFT; - if (vmf->pgoff >= size) { - i_mmap_unlock_read(mapping); - error = -EIO; - goto out; - } - } return VM_FAULT_LOCKED; } @@ -861,23 +820,6 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, i_mmap_lock_read(mapping); - /* - * If a truncate happened while we were allocating blocks, we may - * leave blocks allocated to the file that are beyond EOF. We can't - * take i_mutex here, so just leave them hanging; they'll be freed - * when the file is deleted. - */ - size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT; - if (pgoff >= size) { - result = VM_FAULT_SIGBUS; - goto out; - } - if ((pgoff | PG_PMD_COLOUR) >= size) { - dax_pmd_dbg(&bh, address, - "offset + huge page size > file size"); - goto fallback; - } - if (!write && !buffer_mapped(&bh) && buffer_uptodate(&bh)) { spinlock_t *ptl; pmd_t entry; -- 2.6.6 _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 7/7] dax: Remove redundant inode size checks @ 2016-05-11 9:58 ` Jan Kara 0 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm Cc: Ted Tso, linux-ext4, Vishal Verma, Ross Zwisler, Dan Williams, linux-fsdevel, Jan Kara Callers of dax fault handlers must make sure these calls cannot race with truncate. Thus it is enough to check inode size when entering the function and we don't have to recheck it again later in the handler. Note that inode size itself can be decreased while the fault handler runs but filesystem locking prevents against any radix tree or block mapping information changes resulting from the truncate and that is what we really care about. Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/dax.c | 60 +----------------------------------------------------------- 1 file changed, 1 insertion(+), 59 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 237581441bc1..9bc6624251b4 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -305,20 +305,11 @@ EXPORT_SYMBOL_GPL(dax_do_io); static int dax_load_hole(struct address_space *mapping, struct page *page, struct vm_fault *vmf) { - unsigned long size; - struct inode *inode = mapping->host; if (!page) page = find_or_create_page(mapping, vmf->pgoff, GFP_KERNEL | __GFP_ZERO); if (!page) return VM_FAULT_OOM; - /* Recheck i_size under page lock to avoid truncate race */ - size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT; - if (vmf->pgoff >= size) { - unlock_page(page); - put_page(page); - return VM_FAULT_SIGBUS; - } vmf->page = page; return VM_FAULT_LOCKED; @@ -549,24 +540,10 @@ static int dax_insert_mapping(struct inode *inode, struct buffer_head *bh, .sector = to_sector(bh, inode), .size = bh->b_size, }; - pgoff_t size; int error; i_mmap_lock_read(mapping); - /* - * Check truncate didn't happen while we were allocating a block. - * If it did, this block may or may not be still allocated to the - * file. We can't tell the filesystem to free it because we can't - * take i_mutex here. In the worst case, the file still has blocks - * allocated past the end of the file. - */ - size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT; - if (unlikely(vmf->pgoff >= size)) { - error = -EIO; - goto out; - } - if (dax_map_atomic(bdev, &dax) < 0) { error = PTR_ERR(dax.addr); goto out; @@ -632,15 +609,6 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, put_page(page); goto repeat; } - size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT; - if (unlikely(vmf->pgoff >= size)) { - /* - * We have a struct page covering a hole in the file - * from a read fault and we've raced with a truncate - */ - error = -EIO; - goto unlock_page; - } } error = get_block(inode, block, &bh, 0); @@ -673,17 +641,8 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, if (error) goto unlock_page; vmf->page = page; - if (!page) { + if (!page) i_mmap_lock_read(mapping); - /* Check we didn't race with truncate */ - size = (i_size_read(inode) + PAGE_SIZE - 1) >> - PAGE_SHIFT; - if (vmf->pgoff >= size) { - i_mmap_unlock_read(mapping); - error = -EIO; - goto out; - } - } return VM_FAULT_LOCKED; } @@ -861,23 +820,6 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, i_mmap_lock_read(mapping); - /* - * If a truncate happened while we were allocating blocks, we may - * leave blocks allocated to the file that are beyond EOF. We can't - * take i_mutex here, so just leave them hanging; they'll be freed - * when the file is deleted. - */ - size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT; - if (pgoff >= size) { - result = VM_FAULT_SIGBUS; - goto out; - } - if ((pgoff | PG_PMD_COLOUR) >= size) { - dax_pmd_dbg(&bh, address, - "offset + huge page size > file size"); - goto fallback; - } ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 7/7] dax: Remove redundant inode size checks @ 2016-05-11 9:58 ` Jan Kara 0 siblings, 0 replies; 30+ messages in thread From: Jan Kara @ 2016-05-11 9:58 UTC (permalink / raw) To: linux-nvdimm Cc: Ted Tso, linux-ext4, Vishal Verma, Ross Zwisler, Dan Williams, linux-fsdevel, Jan Kara Callers of dax fault handlers must make sure these calls cannot race with truncate. Thus it is enough to check inode size when entering the function and we don't have to recheck it again later in the handler. Note that inode size itself can be decreased while the fault handler runs but filesystem locking prevents against any radix tree or block mapping information changes resulting from the truncate and that is what we really care about. Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/dax.c | 60 +----------------------------------------------------------- 1 file changed, 1 insertion(+), 59 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 237581441bc1..9bc6624251b4 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -305,20 +305,11 @@ EXPORT_SYMBOL_GPL(dax_do_io); static int dax_load_hole(struct address_space *mapping, struct page *page, struct vm_fault *vmf) { - unsigned long size; - struct inode *inode = mapping->host; if (!page) page = find_or_create_page(mapping, vmf->pgoff, GFP_KERNEL | __GFP_ZERO); if (!page) return VM_FAULT_OOM; - /* Recheck i_size under page lock to avoid truncate race */ - size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT; - if (vmf->pgoff >= size) { - unlock_page(page); - put_page(page); - return VM_FAULT_SIGBUS; - } vmf->page = page; return VM_FAULT_LOCKED; @@ -549,24 +540,10 @@ static int dax_insert_mapping(struct inode *inode, struct buffer_head *bh, .sector = to_sector(bh, inode), .size = bh->b_size, }; - pgoff_t size; int error; i_mmap_lock_read(mapping); - /* - * Check truncate didn't happen while we were allocating a block. - * If it did, this block may or may not be still allocated to the - * file. We can't tell the filesystem to free it because we can't - * take i_mutex here. In the worst case, the file still has blocks - * allocated past the end of the file. - */ - size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT; - if (unlikely(vmf->pgoff >= size)) { - error = -EIO; - goto out; - } - if (dax_map_atomic(bdev, &dax) < 0) { error = PTR_ERR(dax.addr); goto out; @@ -632,15 +609,6 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, put_page(page); goto repeat; } - size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT; - if (unlikely(vmf->pgoff >= size)) { - /* - * We have a struct page covering a hole in the file - * from a read fault and we've raced with a truncate - */ - error = -EIO; - goto unlock_page; - } } error = get_block(inode, block, &bh, 0); @@ -673,17 +641,8 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf, if (error) goto unlock_page; vmf->page = page; - if (!page) { + if (!page) i_mmap_lock_read(mapping); - /* Check we didn't race with truncate */ - size = (i_size_read(inode) + PAGE_SIZE - 1) >> - PAGE_SHIFT; - if (vmf->pgoff >= size) { - i_mmap_unlock_read(mapping); - error = -EIO; - goto out; - } - } return VM_FAULT_LOCKED; } @@ -861,23 +820,6 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, i_mmap_lock_read(mapping); - /* - * If a truncate happened while we were allocating blocks, we may - * leave blocks allocated to the file that are beyond EOF. We can't - * take i_mutex here, so just leave them hanging; they'll be freed - * when the file is deleted. - */ - size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT; - if (pgoff >= size) { - result = VM_FAULT_SIGBUS; - goto out; - } - if ((pgoff | PG_PMD_COLOUR) >= size) { - dax_pmd_dbg(&bh, address, - "offset + huge page size > file size"); - goto fallback; - } - if (!write && !buffer_mapped(&bh) && buffer_uptodate(&bh)) { spinlock_t *ptl; pmd_t entry; -- 2.6.6 ^ permalink raw reply related [flat|nested] 30+ messages in thread
end of thread, other threads:[~2016-05-17 17:56 UTC | newest] Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-05-11 9:58 [PATCH 0/7 v4] DAX cleanups and fixes Jan Kara 2016-05-11 9:58 ` Jan Kara 2016-05-11 9:58 ` [PATCH 1/7] DAX: move RADIX_DAX_ definitions to dax.c Jan Kara 2016-05-11 9:58 ` Jan Kara 2016-05-11 9:58 ` Jan Kara 2016-05-11 9:58 ` [PATCH 2/7] dax: Remove complete_unwritten argument Jan Kara 2016-05-11 9:58 ` Jan Kara 2016-05-11 9:58 ` [PATCH 3/7] ext2: Avoid DAX zeroing to corrupt data Jan Kara 2016-05-11 9:58 ` Jan Kara 2016-05-12 18:45 ` Ross Zwisler 2016-05-12 18:45 ` Ross Zwisler 2016-05-16 15:22 ` Jan Kara 2016-05-16 15:22 ` Jan Kara 2016-05-17 6:52 ` Verma, Vishal L 2016-05-17 6:52 ` Verma, Vishal L 2016-05-17 7:19 ` Jan Kara 2016-05-17 7:19 ` Jan Kara 2016-05-17 7:19 ` Jan Kara 2016-05-17 17:56 ` Verma, Vishal L 2016-05-17 17:56 ` Verma, Vishal L 2016-05-11 9:58 ` [PATCH 4/7] dax: Remove dead zeroing code from fault handlers Jan Kara 2016-05-11 9:58 ` Jan Kara 2016-05-11 9:58 ` Jan Kara 2016-05-11 9:58 ` [PATCH 5/7] dax: Remove zeroing from dax_io() Jan Kara 2016-05-11 9:58 ` Jan Kara 2016-05-11 9:58 ` [PATCH 6/7] dax: Remove pointless writeback from dax_do_io() Jan Kara 2016-05-11 9:58 ` Jan Kara 2016-05-11 9:58 ` [PATCH 7/7] dax: Remove redundant inode size checks Jan Kara 2016-05-11 9:58 ` Jan Kara 2016-05-11 9:58 ` Jan Kara
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.