* [PATCH 1/4] mm: Introduce and use page_cache_empty
2020-08-04 16:17 [PATCH 0/4] Remove nrexceptional tracking Matthew Wilcox (Oracle)
@ 2020-08-04 16:17 ` Matthew Wilcox (Oracle)
2020-08-06 23:24 ` Kirill A. Shutemov
2020-08-04 16:17 ` [PATCH 2/4] mm: Stop accounting shadow entries Matthew Wilcox (Oracle)
` (3 subsequent siblings)
4 siblings, 1 reply; 11+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-08-04 16:17 UTC (permalink / raw)
To: linux-mm, linux-fsdevel; +Cc: Matthew Wilcox (Oracle), linux-nvdimm
Instead of checking the two counters (nrpages and nrexceptional), we
can just check whether i_pages is empty.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/block_dev.c | 2 +-
fs/dax.c | 2 +-
include/linux/pagemap.h | 5 +++++
mm/truncate.c | 18 +++---------------
4 files changed, 10 insertions(+), 17 deletions(-)
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 0ae656e022fd..2a77bd2c6144 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -79,7 +79,7 @@ static void kill_bdev(struct block_device *bdev)
{
struct address_space *mapping = bdev->bd_inode->i_mapping;
- if (mapping->nrpages == 0 && mapping->nrexceptional == 0)
+ if (page_cache_empty(mapping))
return;
invalidate_bh_lrus();
diff --git a/fs/dax.c b/fs/dax.c
index 11b16729b86f..2f75ee2cd41f 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -949,7 +949,7 @@ int dax_writeback_mapping_range(struct address_space *mapping,
if (WARN_ON_ONCE(inode->i_blkbits != PAGE_SHIFT))
return -EIO;
- if (!mapping->nrexceptional || wbc->sync_mode != WB_SYNC_ALL)
+ if (page_cache_empty(mapping) || wbc->sync_mode != WB_SYNC_ALL)
return 0;
trace_dax_writeback_range(inode, xas.xa_index, end_index);
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 484a36185bb5..a474a92a2a72 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -18,6 +18,11 @@
struct pagevec;
+static inline bool page_cache_empty(struct address_space *mapping)
+{
+ return xa_empty(&mapping->i_pages);
+}
+
/*
* Bits in mapping->flags.
*/
diff --git a/mm/truncate.c b/mm/truncate.c
index dd9ebc1da356..7c4c8ac140be 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -300,7 +300,7 @@ void truncate_inode_pages_range(struct address_space *mapping,
pgoff_t index;
int i;
- if (mapping->nrpages == 0 && mapping->nrexceptional == 0)
+ if (page_cache_empty(mapping))
goto out;
/* Offsets within partial pages */
@@ -488,9 +488,6 @@ EXPORT_SYMBOL(truncate_inode_pages);
*/
void truncate_inode_pages_final(struct address_space *mapping)
{
- unsigned long nrexceptional;
- unsigned long nrpages;
-
/*
* Page reclaim can not participate in regular inode lifetime
* management (can't call iput()) and thus can race with the
@@ -500,16 +497,7 @@ void truncate_inode_pages_final(struct address_space *mapping)
*/
mapping_set_exiting(mapping);
- /*
- * When reclaim installs eviction entries, it increases
- * nrexceptional first, then decreases nrpages. Make sure we see
- * this in the right order or we might miss an entry.
- */
- nrpages = mapping->nrpages;
- smp_rmb();
- nrexceptional = mapping->nrexceptional;
-
- if (nrpages || nrexceptional) {
+ if (!page_cache_empty(mapping)) {
/*
* As truncation uses a lockless tree lookup, cycle
* the tree lock to make sure any ongoing tree
@@ -692,7 +680,7 @@ int invalidate_inode_pages2_range(struct address_space *mapping,
int ret2 = 0;
int did_range_unmap = 0;
- if (mapping->nrpages == 0 && mapping->nrexceptional == 0)
+ if (page_cache_empty(mapping))
goto out;
pagevec_init(&pvec);
--
2.27.0
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/4] mm: Introduce and use page_cache_empty
2020-08-04 16:17 ` [PATCH 1/4] mm: Introduce and use page_cache_empty Matthew Wilcox (Oracle)
@ 2020-08-06 23:24 ` Kirill A. Shutemov
2020-08-16 1:48 ` Matthew Wilcox
0 siblings, 1 reply; 11+ messages in thread
From: Kirill A. Shutemov @ 2020-08-06 23:24 UTC (permalink / raw)
To: Matthew Wilcox (Oracle); +Cc: linux-mm, linux-fsdevel, linux-nvdimm
On Tue, Aug 04, 2020 at 05:17:52PM +0100, Matthew Wilcox (Oracle) wrote:
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index 484a36185bb5..a474a92a2a72 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -18,6 +18,11 @@
>
> struct pagevec;
>
> +static inline bool page_cache_empty(struct address_space *mapping)
> +{
> + return xa_empty(&mapping->i_pages);
What about something like
bool empty = xa_empty(&mapping->i_pages);
VM_BUG_ON(empty && mapping->nrpages);
return empty;
?
> +}
> +
> /*
> * Bits in mapping->flags.
> */
--
Kirill A. Shutemov
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/4] mm: Introduce and use page_cache_empty
2020-08-06 23:24 ` Kirill A. Shutemov
@ 2020-08-16 1:48 ` Matthew Wilcox
0 siblings, 0 replies; 11+ messages in thread
From: Matthew Wilcox @ 2020-08-16 1:48 UTC (permalink / raw)
To: Kirill A. Shutemov; +Cc: linux-mm, linux-fsdevel, linux-nvdimm
On Fri, Aug 07, 2020 at 02:24:00AM +0300, Kirill A. Shutemov wrote:
> On Tue, Aug 04, 2020 at 05:17:52PM +0100, Matthew Wilcox (Oracle) wrote:
> > diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> > index 484a36185bb5..a474a92a2a72 100644
> > --- a/include/linux/pagemap.h
> > +++ b/include/linux/pagemap.h
> > @@ -18,6 +18,11 @@
> >
> > struct pagevec;
> >
> > +static inline bool page_cache_empty(struct address_space *mapping)
> > +{
> > + return xa_empty(&mapping->i_pages);
>
> What about something like
>
> bool empty = xa_empty(&mapping->i_pages);
> VM_BUG_ON(empty && mapping->nrpages);
> return empty;
I tried this and it's triggered by generic/418. The problem
is that it's called when the pagecache lock isn't held (by
invalidate_inode_pages2_range), so it's possible for xa_empty() to
return true, then a page be added to the page cache, and mapping->pages
be incremented to 1. That seems to be what's happened here:
(gdb) p/x *(struct address_space *)0xffff88804b21b360
$2 = {host = 0xffff88804b21b200, i_pages = {xa_lock = {{rlock = {raw_lock = {{
val = {counter = 0x0}, {locked = 0x0, pending = 0x0}, {
locked_pending = 0x0, tail = 0x0}}}}}}, xa_flags = 0x21,
* xa_head = 0xffffea0001e187c0}, gfp_mask = 0x100c4a, i_mmap_writable = {
counter = 0x0}, nr_thps = {counter = 0x0}, i_mmap = {rb_root = {
rb_node = 0x0}, rb_leftmost = 0x0}, i_mmap_rwsem = {count = {
counter = 0x0}, owner = {counter = 0x0}, osq = {tail = {counter = 0x0}},
wait_lock = {raw_lock = {{val = {counter = 0x0}, {locked = 0x0,
pending = 0x0}, {locked_pending = 0x0, tail = 0x0}}}},
wait_list = {next = 0xffff88804b21b3b0, prev = 0xffff88804b21b3b0}},
* nrpages = 0x1, writeback_index = 0x0, a_ops = 0xffffffff81c2ed60,
flags = 0x40, wb_err = 0x0, private_lock = {{rlock = {raw_lock = {{val = {
counter = 0x0}, {locked = 0x0, pending = 0x0}, {
locked_pending = 0x0, tail = 0x0}}}}}}, private_list = {
next = 0xffff88804b21b3e8, prev = 0xffff88804b21b3e8}, private_data = 0x0}
(marked the critical lines with *)
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/4] mm: Stop accounting shadow entries
2020-08-04 16:17 [PATCH 0/4] Remove nrexceptional tracking Matthew Wilcox (Oracle)
2020-08-04 16:17 ` [PATCH 1/4] mm: Introduce and use page_cache_empty Matthew Wilcox (Oracle)
@ 2020-08-04 16:17 ` Matthew Wilcox (Oracle)
2020-08-04 16:17 ` [PATCH 3/4] dax: Account DAX entries as nrpages Matthew Wilcox (Oracle)
` (2 subsequent siblings)
4 siblings, 0 replies; 11+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-08-04 16:17 UTC (permalink / raw)
To: linux-mm, linux-fsdevel; +Cc: Matthew Wilcox (Oracle), linux-nvdimm
We no longer need to keep track of how many shadow entries are
present in a mapping. This saves a few writes to the inode and
memory barriers.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
mm/filemap.c | 12 ------------
mm/truncate.c | 1 -
mm/workingset.c | 1 -
3 files changed, 14 deletions(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index 76383d558b7c..7c3f97bd6dcd 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -139,17 +139,6 @@ static void page_cache_delete(struct address_space *mapping,
page->mapping = NULL;
/* Leave page->index set: truncation lookup relies upon it */
-
- if (shadow) {
- mapping->nrexceptional += nr;
- /*
- * Make sure the nrexceptional update is committed before
- * the nrpages update so that final truncate racing
- * with reclaim does not see both counters 0 at the
- * same time and miss a shadow entry.
- */
- smp_wmb();
- }
mapping->nrpages -= nr;
}
@@ -860,7 +849,6 @@ static int __add_to_page_cache_locked(struct page *page,
goto unlock;
if (xa_is_value(old)) {
- mapping->nrexceptional--;
if (shadowp)
*shadowp = old;
}
diff --git a/mm/truncate.c b/mm/truncate.c
index 7c4c8ac140be..a59184793607 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -40,7 +40,6 @@ static inline void __clear_shadow_entry(struct address_space *mapping,
if (xas_load(&xas) != entry)
return;
xas_store(&xas, NULL);
- mapping->nrexceptional--;
}
static void clear_shadow_entry(struct address_space *mapping, pgoff_t index,
diff --git a/mm/workingset.c b/mm/workingset.c
index fdeabea54e77..0649bfb1ca33 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -547,7 +547,6 @@ static enum lru_status shadow_lru_isolate(struct list_head *item,
goto out_invalid;
if (WARN_ON_ONCE(node->count != node->nr_values))
goto out_invalid;
- mapping->nrexceptional -= node->nr_values;
xas.xa_node = xa_parent_locked(&mapping->i_pages, node);
xas.xa_offset = node->offset;
xas.xa_shift = node->shift + XA_CHUNK_SHIFT;
--
2.27.0
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 3/4] dax: Account DAX entries as nrpages
2020-08-04 16:17 [PATCH 0/4] Remove nrexceptional tracking Matthew Wilcox (Oracle)
2020-08-04 16:17 ` [PATCH 1/4] mm: Introduce and use page_cache_empty Matthew Wilcox (Oracle)
2020-08-04 16:17 ` [PATCH 2/4] mm: Stop accounting shadow entries Matthew Wilcox (Oracle)
@ 2020-08-04 16:17 ` Matthew Wilcox (Oracle)
2020-08-04 16:17 ` [PATCH 4/4] mm: Remove nrexceptional from inode Matthew Wilcox (Oracle)
2020-08-06 19:44 ` [PATCH 0/4] Remove nrexceptional tracking Verma, Vishal L
4 siblings, 0 replies; 11+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-08-04 16:17 UTC (permalink / raw)
To: linux-mm, linux-fsdevel; +Cc: Matthew Wilcox (Oracle), linux-nvdimm
Simplify mapping_needs_writeback() by accounting DAX entries as
pages instead of exceptional entries.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/dax.c | 6 +++---
mm/filemap.c | 3 ---
2 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index 2f75ee2cd41f..71ddab04d91d 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -525,7 +525,7 @@ static void *grab_mapping_entry(struct xa_state *xas,
dax_disassociate_entry(entry, mapping, false);
xas_store(xas, NULL); /* undo the PMD join */
dax_wake_entry(xas, entry, true);
- mapping->nrexceptional--;
+ mapping->nrpages -= PG_PMD_NR;
entry = NULL;
xas_set(xas, index);
}
@@ -541,7 +541,7 @@ static void *grab_mapping_entry(struct xa_state *xas,
dax_lock_entry(xas, entry);
if (xas_error(xas))
goto out_unlock;
- mapping->nrexceptional++;
+ mapping->nrpages += 1UL << order;
}
out_unlock:
@@ -644,7 +644,7 @@ static int __dax_invalidate_entry(struct address_space *mapping,
goto out;
dax_disassociate_entry(entry, mapping, trunc);
xas_store(&xas, NULL);
- mapping->nrexceptional--;
+ mapping->nrpages -= dax_entry_order(entry);
ret = 1;
out:
put_unlocked_entry(&xas, entry);
diff --git a/mm/filemap.c b/mm/filemap.c
index 7c3f97bd6dcd..ef8ecfba7f9b 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -615,9 +615,6 @@ EXPORT_SYMBOL(filemap_fdatawait_keep_errors);
/* Returns true if writeback might be needed or already in progress. */
static bool mapping_needs_writeback(struct address_space *mapping)
{
- if (dax_mapping(mapping))
- return mapping->nrexceptional;
-
return mapping->nrpages;
}
--
2.27.0
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 4/4] mm: Remove nrexceptional from inode
2020-08-04 16:17 [PATCH 0/4] Remove nrexceptional tracking Matthew Wilcox (Oracle)
` (2 preceding siblings ...)
2020-08-04 16:17 ` [PATCH 3/4] dax: Account DAX entries as nrpages Matthew Wilcox (Oracle)
@ 2020-08-04 16:17 ` Matthew Wilcox (Oracle)
2020-08-06 19:44 ` [PATCH 0/4] Remove nrexceptional tracking Verma, Vishal L
4 siblings, 0 replies; 11+ messages in thread
From: Matthew Wilcox (Oracle) @ 2020-08-04 16:17 UTC (permalink / raw)
To: linux-mm, linux-fsdevel; +Cc: Matthew Wilcox (Oracle), linux-nvdimm
We no longer track anything in nrexceptional, so remove it, saving 8
bytes per inode.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
fs/inode.c | 2 +-
include/linux/fs.h | 2 --
2 files changed, 1 insertion(+), 3 deletions(-)
diff --git a/fs/inode.c b/fs/inode.c
index 72c4c347afb7..b5c4aff077b7 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -528,7 +528,7 @@ void clear_inode(struct inode *inode)
*/
xa_lock_irq(&inode->i_data.i_pages);
BUG_ON(inode->i_data.nrpages);
- BUG_ON(inode->i_data.nrexceptional);
+ BUG_ON(!page_cache_empty(&inode->i_data));
xa_unlock_irq(&inode->i_data.i_pages);
BUG_ON(!list_empty(&inode->i_data.private_list));
BUG_ON(!(inode->i_state & I_FREEING));
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f5abba86107d..4fd8923cba43 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -436,7 +436,6 @@ int pagecache_write_end(struct file *, struct address_space *mapping,
* @i_mmap: Tree of private and shared mappings.
* @i_mmap_rwsem: Protects @i_mmap and @i_mmap_writable.
* @nrpages: Number of page entries, protected by the i_pages lock.
- * @nrexceptional: Shadow or DAX entries, protected by the i_pages lock.
* @writeback_index: Writeback starts here.
* @a_ops: Methods.
* @flags: Error bits and flags (AS_*).
@@ -457,7 +456,6 @@ struct address_space {
struct rb_root_cached i_mmap;
struct rw_semaphore i_mmap_rwsem;
unsigned long nrpages;
- unsigned long nrexceptional;
pgoff_t writeback_index;
const struct address_space_operations *a_ops;
unsigned long flags;
--
2.27.0
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 0/4] Remove nrexceptional tracking
2020-08-04 16:17 [PATCH 0/4] Remove nrexceptional tracking Matthew Wilcox (Oracle)
` (3 preceding siblings ...)
2020-08-04 16:17 ` [PATCH 4/4] mm: Remove nrexceptional from inode Matthew Wilcox (Oracle)
@ 2020-08-06 19:44 ` Verma, Vishal L
2020-08-06 20:16 ` Verma, Vishal L
4 siblings, 1 reply; 11+ messages in thread
From: Verma, Vishal L @ 2020-08-06 19:44 UTC (permalink / raw)
To: linux-mm, willy, linux-fsdevel; +Cc: linux-nvdimm
On Tue, 2020-08-04 at 17:17 +0100, Matthew Wilcox (Oracle) wrote:
> We actually use nrexceptional for very little these days. It's a
> constant
> source of pain with the THP patches because we don't know how large a
> shadow entry is, so either we have to ask the xarray how many indices
> it covers, or store that information in the shadow entry (and reduce
> the amount of other information in the shadow entry proportionally).
> While tracking down the most recent case of "evict tells me I've got
> the accounting wrong again", I wondered if it might not be simpler to
> just remove it. So here's a patch set to do just that. I think each
> of these patches is an improvement in isolation, but the combination
> of
> all four is larger than the sum of its parts.
>
> I'm running xfstests on this patchset right now. If one of the DAX
> people could try it out, that'd be fantastic.
>
> Matthew Wilcox (Oracle) (4):
> mm: Introduce and use page_cache_empty
> mm: Stop accounting shadow entries
> dax: Account DAX entries as nrpages
> mm: Remove nrexceptional from inode
Hi Matthew,
I applied these on top of 5.8 and ran them through the nvdimm unit test
suite, and saw some test failures. The first failing test signature is:
+ umount test_dax_mnt
./dax-ext4.sh: line 62: 15749 Segmentation fault umount $MNT
FAIL dax-ext4.sh (exit status: 139)
The line is: https://github.com/pmem/ndctl/blob/master/test/dax.sh#L79
And the failing umount happens right after 'run_test', which calls this:
https://github.com/pmem/ndctl/blob/master/test/dax-pmd.c
>
> fs/block_dev.c | 2 +-
> fs/dax.c | 8 ++++----
> fs/inode.c | 2 +-
> include/linux/fs.h | 2 --
> include/linux/pagemap.h | 5 +++++
> mm/filemap.c | 15 ---------------
> mm/truncate.c | 19 +++----------------
> mm/workingset.c | 1 -
> 8 files changed, 14 insertions(+), 40 deletions(-)
>
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/4] Remove nrexceptional tracking
2020-08-06 19:44 ` [PATCH 0/4] Remove nrexceptional tracking Verma, Vishal L
@ 2020-08-06 20:16 ` Verma, Vishal L
2020-10-08 19:33 ` Matthew Wilcox
0 siblings, 1 reply; 11+ messages in thread
From: Verma, Vishal L @ 2020-08-06 20:16 UTC (permalink / raw)
To: linux-mm, willy, linux-fsdevel; +Cc: linux-nvdimm
On Thu, 2020-08-06 at 19:44 +0000, Verma, Vishal L wrote:
> >
> > I'm running xfstests on this patchset right now. If one of the DAX
> > people could try it out, that'd be fantastic.
> >
> > Matthew Wilcox (Oracle) (4):
> > mm: Introduce and use page_cache_empty
> > mm: Stop accounting shadow entries
> > dax: Account DAX entries as nrpages
> > mm: Remove nrexceptional from inode
>
> Hi Matthew,
>
> I applied these on top of 5.8 and ran them through the nvdimm unit test
> suite, and saw some test failures. The first failing test signature is:
>
> + umount test_dax_mnt
> ./dax-ext4.sh: line 62: 15749 Segmentation fault umount $MNT
> FAIL dax-ext4.sh (exit status: 139)
>
> The line is: https://github.com/pmem/ndctl/blob/master/test/dax.sh#L79
> And the failing umount happens right after 'run_test', which calls this:
> https://github.com/pmem/ndctl/blob/master/test/dax-pmd.c
>
>
And here is the associated kernel splat:
[ 89.570142] EXT4-fs (pmem4): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
[ 89.573407] EXT4-fs (pmem4): mounted filesystem with ordered data mode. Opts: dax
[ 90.450644] Injecting memory failure for pfn 0x148900 at process virtual address 0x7f2ec9b00000
[ 90.452421] Memory failure: 0x148900: Sending SIGBUS to dax-pmd:16886 due to hardware memory corruption
[ 90.454067] Memory failure: 0x148900: recovery action for dax page: Recovered
[ 91.656624] ------------[ cut here ]------------
[ 91.657822] kernel BUG at fs/inode.c:530!
[ 91.658850] invalid opcode: 0000 [#1] SMP PTI
[ 91.659883] CPU: 4 PID: 16891 Comm: umount Tainted: G O 5.8.0-00004-g6161e2d6ac48 #38
[ 91.661861] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 91.664920] RIP: 0010:clear_inode+0x78/0x90
[ 91.665934] Code: 00 a8 20 74 2b a8 40 75 29 48 8b 93 f0 01 00 00 48 8d 83 f0 01 00 00 48 39 c2 75 18 48 c7 83 e0 00 00 00 60 00 00 00 5b 5d c3 <0f> 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 0b 66 66 2e 0f 1f 84 00 00 00 00
[ 91.669979] RSP: 0018:ffffc9000212fd98 EFLAGS: 00010002
[ 91.671151] RAX: 0000000000000004 RBX: ffff88810e33d470 RCX: 8c6318c6318c6319
[ 91.672689] RDX: ffff888110b48040 RSI: 0000000000000004 RDI: 0000000000000046
[ 91.674208] RBP: ffff88810e33d6b8 R08: 0000000000000000 R09: 0000000000000001
[ 91.675760] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88810e33d6b0
[ 91.677332] R13: ffff88811014c828 R14: ffff88811014c9c0 R15: ffff88810e3398c0
[ 91.678940] FS: 00007f86a7e74c80(0000) GS:ffff888117c00000(0000) knlGS:0000000000000000
[ 91.680745] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 91.682001] CR2: 000055bcfb4c3008 CR3: 0000000110b7c000 CR4: 00000000000006e0
[ 91.683581] Call Trace:
[ 91.684268] ext4_clear_inode+0x16/0x80
[ 91.685298] ext4_evict_inode+0x5f/0x610
[ 91.686245] evict+0xcf/0x1f0
[ 91.687017] dispose_list+0x48/0x70
[ 91.687937] evict_inodes+0x152/0x190
[ 91.688918] generic_shutdown_super+0x37/0x100
[ 91.689880] kill_block_super+0x21/0x50
[ 91.690784] deactivate_locked_super+0x36/0xa0
[ 91.691790] cleanup_mnt+0x12d/0x190
[ 91.692658] task_work_run+0x5c/0xa0
[ 91.694739] __prepare_exit_to_usermode+0x1b6/0x1c0
[ 91.695806] do_syscall_64+0x5e/0xb0
[ 91.696656] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 91.697798] RIP: 0033:0x7f86a80c694b
[ 91.698650] Code: 15 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1d 15 0c 00 f7 d8 64 89 01 48
[ 91.702545] RSP: 002b:00007ffd49a8cb98 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[ 91.704204] RAX: 0000000000000000 RBX: 00007f86a81ee204 RCX: 00007f86a80c694b
[ 91.705732] RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000055bcfb4bf2a0
[ 91.707207] RBP: 000055bcfb4bafa0 R08: 0000000000000000 R09: 00007f86a8188a40
[ 91.708809] R10: 000055bcfb4c1550 R11: 0000000000000246 R12: 0000000000000000
[ 91.710040] R13: 000055bcfb4bf2a0 R14: 000055bcfb4bb0b0 R15: 000055bcfb4bb1d0
[ 91.711193] Modules linked in: nd_e820(O) nd_blk(O) nfit(O) kmem nd_pmem(O) nd_btt(O) dax_pmem(O) dax_pmem_core(O) libnvdimm(O) device_dax(O) nfit_test_iomap(O) [last unloaded: nfit_test_iomap]
[ 91.714154] ---[ end trace 016cb116e8654993 ]---
[ 91.715011] RIP: 0010:clear_inode+0x78/0x90
[ 91.715803] Code: 00 a8 20 74 2b a8 40 75 29 48 8b 93 f0 01 00 00 48 8d 83 f0 01 00 00 48 39 c2 75 18 48 c7 83 e0 00 00 00 60 00 00 00 5b 5d c3 <0f> 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 0b 66 66 2e 0f 1f 84 00 00 00 00
[ 91.718950] RSP: 0018:ffffc9000212fd98 EFLAGS: 00010002
[ 91.719976] RAX: 0000000000000004 RBX: ffff88810e33d470 RCX: 8c6318c6318c6319
[ 91.721187] RDX: ffff888110b48040 RSI: 0000000000000004 RDI: 0000000000000046
[ 91.722473] RBP: ffff88810e33d6b8 R08: 0000000000000000 R09: 0000000000000001
[ 91.723633] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88810e33d6b0
[ 91.724838] R13: ffff88811014c828 R14: ffff88811014c9c0 R15: ffff88810e3398c0
[ 91.726109] FS: 00007f86a7e74c80(0000) GS:ffff888117c00000(0000) knlGS:0000000000000000
[ 91.727575] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 91.728596] CR2: 000055bcfb4c3008 CR3: 0000000110b7c000 CR4: 00000000000006e0
[ 91.729794] note: umount[16891] exited with preempt_count 1
[ 91.730746] BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:49
[ 91.732304] in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 16891, name: umount
[ 91.733764] INFO: lockdep is turned off.
[ 91.734476] irq event stamp: 6192
[ 91.735067] hardirqs last enabled at (6191): [<ffffffff81c7a974>] _raw_spin_unlock_irq+0x24/0x40
[ 91.736641] hardirqs last disabled at (6192): [<ffffffff81c7ab27>] _raw_spin_lock_irq+0x17/0x80
[ 91.738141] softirqs last enabled at (6120): [<ffffffff8134b3b8>] bdi_split_work_to_wbs+0x248/0x580
[ 91.739752] softirqs last disabled at (6116): [<ffffffff81347a5d>] wb_queue_work+0x4d/0x180
[ 91.741297] CPU: 4 PID: 16891 Comm: umount Tainted: G D O 5.8.0-00004-g6161e2d6ac48 #38
[ 91.742940] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 91.744887] Call Trace:
[ 91.745401] dump_stack+0x92/0xc8
[ 91.745987] ___might_sleep.cold+0xb6/0xc6
[ 91.746716] exit_signals+0x1c/0x2d0
[ 91.747355] do_exit+0xc0/0xba0
[ 91.747942] ? __prepare_exit_to_usermode+0x1b6/0x1c0
[ 91.748821] rewind_stack_do_exit+0x17/0x20
[ 91.749619] RIP: 0033:0x7f86a80c694b
[ 91.750330] Code: 15 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1d 15 0c 00 f7 d8 64 89 01 48
[ 91.753544] RSP: 002b:00007ffd49a8cb98 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[ 91.754902] RAX: 0000000000000000 RBX: 00007f86a81ee204 RCX: 00007f86a80c694b
[ 91.756176] RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000055bcfb4bf2a0
[ 91.757746] RBP: 000055bcfb4bafa0 R08: 0000000000000000 R09: 00007f86a8188a40
[ 91.759034] R10: 000055bcfb4c1550 R11: 0000000000000246 R12: 0000000000000000
[ 91.760166] R13: 000055bcfb4bf2a0 R14: 000055bcfb4bb0b0 R15: 000055bcfb4bb1d0
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/4] Remove nrexceptional tracking
2020-08-06 20:16 ` Verma, Vishal L
@ 2020-10-08 19:33 ` Matthew Wilcox
2020-10-09 23:04 ` Verma, Vishal L
0 siblings, 1 reply; 11+ messages in thread
From: Matthew Wilcox @ 2020-10-08 19:33 UTC (permalink / raw)
To: Verma, Vishal L; +Cc: linux-mm, linux-fsdevel, linux-nvdimm
On Thu, Aug 06, 2020 at 08:16:02PM +0000, Verma, Vishal L wrote:
> On Thu, 2020-08-06 at 19:44 +0000, Verma, Vishal L wrote:
> > >
> > > I'm running xfstests on this patchset right now. If one of the DAX
> > > people could try it out, that'd be fantastic.
> > >
> > > Matthew Wilcox (Oracle) (4):
> > > mm: Introduce and use page_cache_empty
> > > mm: Stop accounting shadow entries
> > > dax: Account DAX entries as nrpages
> > > mm: Remove nrexceptional from inode
> >
> > Hi Matthew,
> >
> > I applied these on top of 5.8 and ran them through the nvdimm unit test
> > suite, and saw some test failures. The first failing test signature is:
> >
> > + umount test_dax_mnt
> > ./dax-ext4.sh: line 62: 15749 Segmentation fault umount $MNT
> > FAIL dax-ext4.sh (exit status: 139)
Thanks. Fixed:
+++ b/fs/dax.c
@@ -644,7 +644,7 @@ static int __dax_invalidate_entry(struct address_space *mapping,
goto out;
dax_disassociate_entry(entry, mapping, trunc);
xas_store(&xas, NULL);
- mapping->nrpages -= dax_entry_order(entry);
+ mapping->nrpages -= 1UL << dax_entry_order(entry);
ret = 1;
out:
put_unlocked_entry(&xas, entry);
Updated git tree at
https://git.infradead.org/users/willy/pagecache.git/
It survives an xfstests run on an fsdax namespace which supports 2MB pages.
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/4] Remove nrexceptional tracking
2020-10-08 19:33 ` Matthew Wilcox
@ 2020-10-09 23:04 ` Verma, Vishal L
0 siblings, 0 replies; 11+ messages in thread
From: Verma, Vishal L @ 2020-10-09 23:04 UTC (permalink / raw)
To: willy; +Cc: linux-mm, linux-nvdimm, linux-fsdevel
On Thu, 2020-10-08 at 20:33 +0100, Matthew Wilcox wrote:
> On Thu, Aug 06, 2020 at 08:16:02PM +0000, Verma, Vishal L wrote:
> > On Thu, 2020-08-06 at 19:44 +0000, Verma, Vishal L wrote:
> > > > I'm running xfstests on this patchset right now. If one of the DAX
> > > > people could try it out, that'd be fantastic.
> > > >
> > > > Matthew Wilcox (Oracle) (4):
> > > > mm: Introduce and use page_cache_empty
> > > > mm: Stop accounting shadow entries
> > > > dax: Account DAX entries as nrpages
> > > > mm: Remove nrexceptional from inode
> > >
> > > Hi Matthew,
> > >
> > > I applied these on top of 5.8 and ran them through the nvdimm unit test
> > > suite, and saw some test failures. The first failing test signature is:
> > >
> > > + umount test_dax_mnt
> > > ./dax-ext4.sh: line 62: 15749 Segmentation fault umount $MNT
> > > FAIL dax-ext4.sh (exit status: 139)
>
> Thanks. Fixed:
>
> +++ b/fs/dax.c
> @@ -644,7 +644,7 @@ static int __dax_invalidate_entry(struct address_space *mapping,
> goto out;
> dax_disassociate_entry(entry, mapping, trunc);
> xas_store(&xas, NULL);
> - mapping->nrpages -= dax_entry_order(entry);
> + mapping->nrpages -= 1UL << dax_entry_order(entry);
> ret = 1;
> out:
> put_unlocked_entry(&xas, entry);
>
> Updated git tree at
> https://git.infradead.org/users/willy/pagecache.git/
I ran this tree through the unit tests, and everything passes.
(Well, while the tests passed, this tree as-is did have an RCU warning
splat. I rebased to v5.9-rc8 and that was fine).
Feel free to add:
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply [flat|nested] 11+ messages in thread