* Is shmem page accounting wrong on split?
@ 2020-08-28 14:25 Matthew Wilcox
From: Matthew Wilcox @ 2020-08-28 14:25 UTC
To: linux-mm; +Cc: Hugh Dickins, Yang Shi
If I understand truncate of a shmem THP correctly ...
Let's suppose the file has a single 2MB page at index 0, and is being
truncated down to 7 bytes in size.
shmem_setattr()
  i_size_write(7);
  shmem_truncate_range(7, -1);
    shmem_undo_range(7, -1)
      start = 1;
      page = &head[1];
      shmem_punch_compound();
        split_huge_page()
          end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE); # == 1
          __split_huge_page(..., 1, ...);
            __delete_from_page_cache(&head[1], ...);
      truncate_inode_page(page);
        delete_from_page_cache(page)
          __delete_from_page_cache(&head[1])
I think the solution is to call truncate_inode_page() from within
shmem_punch_compound() if we don't call split_huge_page(). I came across
this while reusing all this infrastructure for the XFS THP patchset,
so I'm not in a great position to test this patch.
This solution actually makes my life harder because I have a different
function to call if the page doesn't need to be split. But it's probably
the right solution for upstream today.
diff --git a/mm/shmem.c b/mm/shmem.c
index b2abca3f7f33..a0bc42974c2d 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -819,15 +819,18 @@ void shmem_unlock_mapping(struct address_space *mapping)
 static bool shmem_punch_compound(struct page *page, pgoff_t start, pgoff_t end)
 {
         if (!PageTransCompound(page))
-                return true;
+                goto nosplit;
 
         /* Just proceed to delete a huge page wholly within the range punched */
         if (PageHead(page) &&
             page->index >= start && page->index + HPAGE_PMD_NR <= end)
-                return true;
+                goto nosplit;
 
         /* Try to split huge page, so we can truly punch the hole or truncate */
         return split_huge_page(page) >= 0;
+nosplit:
+        truncate_inode_page(page->mapping, page);
+        return true;
 }
 
 /*
@@ -883,8 +886,7 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend,
                         if ((!unfalloc || !PageUptodate(page)) &&
                             page_mapping(page) == mapping) {
                                 VM_BUG_ON_PAGE(PageWriteback(page), page);
-                                if (shmem_punch_compound(page, start, end))
-                                        truncate_inode_page(mapping, page);
+                                shmem_punch_compound(page, start, end);
                         }
                         unlock_page(page);
@@ -966,9 +968,7 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend,
                                         break;
                                 }
                                 VM_BUG_ON_PAGE(PageWriteback(page), page);
-                                if (shmem_punch_compound(page, start, end))
-                                        truncate_inode_page(mapping, page);
-                                else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
+                                if (!shmem_punch_compound(page, start, end)) {
                                         /* Wipe the page and don't get stuck */
                                         clear_highpage(page);
                                         flush_dcache_page(page);
* Re: Is shmem page accounting wrong on split?
From: Matthew Wilcox @ 2020-08-28 14:55 UTC
To: linux-mm; +Cc: Hugh Dickins, Yang Shi
On Fri, Aug 28, 2020 at 03:25:46PM +0100, Matthew Wilcox wrote:
> If I understand truncate of a shmem THP correctly ...
>
> Let's suppose the file has a single 2MB page at index 0, and is being
> truncated down to 7 bytes in size.
>
> shmem_setattr()
>   i_size_write(7);
>   shmem_truncate_range(7, -1);
>     shmem_undo_range(7, -1)
>       start = 1;
>       page = &head[1];
>       shmem_punch_compound();
>         split_huge_page()
>           end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE); # == 1
>           __split_huge_page(..., 1, ...);
>             __delete_from_page_cache(&head[1], ...);
>       truncate_inode_page(page);
>         delete_from_page_cache(page)
>           __delete_from_page_cache(&head[1])
>
> I think the solution is to call truncate_inode_page() from within
> shmem_punch_compound() if we don't call split_huge_page(). I came across
> this while reusing all this infrastructure for the XFS THP patchset,
> so I'm not in a great position to test this patch.
Oh, this works for truncate, but not hole-punch. __split_huge_page()
won't call __delete_from_page_cache() for pages below the end of the
file. So maybe this instead?
It's a bit cheesy ... maybe split_huge_page() could return 1 to indicate
that it actually disposed of the page passed in?
+++ b/mm/shmem.c
@@ -827,7 +827,7 @@ static bool shmem_punch_compound(struct page *page, pgoff_t start, pgoff_t end)
                 return true;
 
         /* Try to split huge page, so we can truly punch the hole or truncate */
-        return split_huge_page(page) >= 0;
+        return split_huge_page(page) >= 0 && end < -1;
 }
 
 /*
* Re: Is shmem page accounting wrong on split?
From: Yang Shi @ 2020-08-28 15:43 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Linux MM, Hugh Dickins, Yang Shi
On Fri, Aug 28, 2020 at 7:55 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Fri, Aug 28, 2020 at 03:25:46PM +0100, Matthew Wilcox wrote:
> > If I understand truncate of a shmem THP correctly ...
> >
> > Let's suppose the file has a single 2MB page at index 0, and is being
> > truncated down to 7 bytes in size.
> >
> > shmem_setattr()
> >   i_size_write(7);
> >   shmem_truncate_range(7, -1);
> >     shmem_undo_range(7, -1)
> >       start = 1;
> >       page = &head[1];
> >       shmem_punch_compound();
> >         split_huge_page()
> >           end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE); # == 1
> >           __split_huge_page(..., 1, ...);
> >             __delete_from_page_cache(&head[1], ...);
> >       truncate_inode_page(page);
> >         delete_from_page_cache(page)
> >           __delete_from_page_cache(&head[1])
> >
> > I think the solution is to call truncate_inode_page() from within
> > shmem_punch_compound() if we don't call split_huge_page(). I came across
> > this while reusing all this infrastructure for the XFS THP patchset,
> > so I'm not in a great position to test this patch.
>
> Oh, this works for truncate, but not hole-punch. __split_huge_page()
> won't call __delete_from_page_cache() for pages below the end of the
> file. So maybe this instead?
>
> It's a bit cheesy ... maybe split_huge_page() could return 1 to indicate
> that it actually disposed of the page passed in?
I'm fine with having split_huge_page() return 1.
>
> +++ b/mm/shmem.c
> @@ -827,7 +827,7 @@ static bool shmem_punch_compound(struct page *page, pgoff_t start, pgoff_t end)
> return true;
>
> /* Try to split huge page, so we can truly punch the hole or truncate */
> - return split_huge_page(page) >= 0;
> + return split_huge_page(page) >= 0 && end < -1;
It would be clearer if we could have a comment about what "-1"
means. It took me a little while to understand the magic number, but
once I understood it, it looked straightforward to me.
> }
>
> /*
>
>
* Re: Is shmem page accounting wrong on split?
From: Hugh Dickins @ 2020-08-28 17:08 UTC
To: Matthew Wilcox; +Cc: Linux MM, Hugh Dickins, Yang Shi, Yang Shi
On Fri, 28 Aug 2020, Yang Shi wrote:
> On Fri, Aug 28, 2020 at 7:55 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Fri, Aug 28, 2020 at 03:25:46PM +0100, Matthew Wilcox wrote:
> > > If I understand truncate of a shmem THP correctly ...
> > >
> > > Let's suppose the file has a single 2MB page at index 0, and is being
> > > truncated down to 7 bytes in size.
> > >
> > > shmem_setattr()
> > >   i_size_write(7);
> > >   shmem_truncate_range(7, -1);
> > >     shmem_undo_range(7, -1)
> > >       start = 1;
> > >       page = &head[1];
> > >       shmem_punch_compound();
> > >         split_huge_page()
> > >           end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE); # == 1
> > >           __split_huge_page(..., 1, ...);
> > >             __delete_from_page_cache(&head[1], ...);
> > >       truncate_inode_page(page);
> > >         delete_from_page_cache(page)
> > >           __delete_from_page_cache(&head[1])
> > >
> > > I think the solution is to call truncate_inode_page() from within
> > > shmem_punch_compound() if we don't call split_huge_page(). I came across
> > > this while reusing all this infrastructure for the XFS THP patchset,
> > > so I'm not in a great position to test this patch.
It's a good observation of an oddity that I probably didn't think of,
but you haven't said which kind of shmem page accounting goes wrong here
(vm_enough_memory? df of filesystem? du of filesystem? memcg charge?
all of the above? observed in practice?), and what needs solving.
If that page has already been deleted from page cache when splitting,
truncate_inode_page() sees NULL page->mapping != mapping and returns
without doing anything. What's the problem?
Hugh
> >
> > Oh, this works for truncate, but not hole-punch. __split_huge_page()
> > won't call __delete_from_page_cache() for pages below the end of the
> > file. So maybe this instead?
> >
> > It's a bit cheesy ... maybe split_huge_page() could return 1 to indicate
> > that it actually disposed of the page passed in?
>
> I'm fine with having split_huge_page() return 1.
>
> >
> > +++ b/mm/shmem.c
> > @@ -827,7 +827,7 @@ static bool shmem_punch_compound(struct page *page, pgoff_t start, pgoff_t end)
> > return true;
> >
> > /* Try to split huge page, so we can truly punch the hole or truncate */
> > - return split_huge_page(page) >= 0;
> > + return split_huge_page(page) >= 0 && end < -1;
>
> It would be clearer if we could have a comment about what "-1"
> means. It took me a little while to understand the magic number, but
> once I understood it, it looked straightforward to me.
>
> > }
> >
> > /*
* Re: Is shmem page accounting wrong on split?
From: Matthew Wilcox @ 2020-08-28 17:31 UTC
To: Hugh Dickins; +Cc: Linux MM, Yang Shi, Yang Shi
On Fri, Aug 28, 2020 at 10:08:52AM -0700, Hugh Dickins wrote:
> On Fri, 28 Aug 2020, Yang Shi wrote:
> > On Fri, Aug 28, 2020 at 7:55 AM Matthew Wilcox <willy@infradead.org> wrote:
> > >
> > > On Fri, Aug 28, 2020 at 03:25:46PM +0100, Matthew Wilcox wrote:
> > > > If I understand truncate of a shmem THP correctly ...
> > > >
> > > > Let's suppose the file has a single 2MB page at index 0, and is being
> > > > truncated down to 7 bytes in size.
> > > >
> > > > shmem_setattr()
> > > > i_size_write(7);
> > > > shmem_truncate_range(7, -1);
> > > > shmem_undo_range(7, -1)
> > > > start = 1;
> > > > page = &head[1];
> > > > shmem_punch_compound();
> > > > split_huge_page()
> > > > end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE); # == 1
> > > > __split_huge_page(..., 1, ...);
> > > > __delete_from_page_cache(&head[1], ...);
> > > > truncate_inode_page(page);
> > > > delete_from_page_cache(page)
> > > > __delete_from_page_cache(&head[1])
> > > >
> > > > I think the solution is to call truncate_inode_page() from within
> > > > shmem_punch_compound() if we don't call split_huge_page(). I came across
> > > > this while reusing all this infrastructure for the XFS THP patchset,
> > > > so I'm not in a great position to test this patch.
>
> It's a good observation of an oddity that I probably didn't think of,
> but you haven't said which kind of shmem page accounting goes wrong here
> (vm_enough_memory? df of filesystem? du of filesystem? memcg charge?
> all of the above? observed in practice?), and what needs solving.
>
> If that page has already been deleted from page cache when splitting,
> truncate_inode_page() sees NULL page->mapping != mapping and returns
> without doing anything. What's the problem?
Ah! I missed the check in truncate_inode_page(). This should be
fine then.
The problem I've observed in practice is following the same pattern in
truncate_inode_pages_range(). The call to delete_from_page_cache_batch()
trips the assertion that the page hasn't already been deleted from the
page cache. I think the solution is obvious -- don't add the page to
locked_pvec if page->mapping is NULL.
                        if (thp_punch(page, lstart, lend))
                                pagevec_add(&locked_pvec, page);
                        else
                                unlock_page(page);
                }
                for (i = 0; i < pagevec_count(&locked_pvec); i++)
                        truncate_cleanup_page(mapping, locked_pvec.pages[i]);
                delete_from_page_cache_batch(mapping, &locked_pvec);
                for (i = 0; i < pagevec_count(&locked_pvec); i++)
                        unlock_page(locked_pvec.pages[i]);
                truncate_exceptional_pvec_entries(mapping, &pvec, indices);
                pagevec_release(&pvec);
(shmem_punch_compound() got renamed to thp_punch() and moved to truncate.c)
* Re: Is shmem page accounting wrong on split?
From: Matthew Wilcox @ 2020-08-28 18:01 UTC
To: Hugh Dickins; +Cc: Linux MM, Yang Shi, Yang Shi
On Fri, Aug 28, 2020 at 06:31:22PM +0100, Matthew Wilcox wrote:
> On Fri, Aug 28, 2020 at 10:08:52AM -0700, Hugh Dickins wrote:
> > It's a good observation of an oddity that I probably didn't think of,
> > but you haven't said which kind of shmem page accounting goes wrong here
> > (vm_enough_memory? df of filesystem? du of filesystem? memcg charge?
> > all of the above? observed in practice?), and what needs solving.
Oh, I forgot to say which
> The problem I've observed in practice is following the same pattern in
> truncate_inode_pages_range(). The call to delete_from_page_cache_batch()
> trips the assertion that the page hasn't already been deleted from the
> page cache. I think the solution is obvious -- don't add the page to
> locked_pvec if page->mapping is NULL.
Here's the change I'm currently testing. It's survived about eight
minutes of xfstests so far, which is far longer than it was surviving
before.
-        /* Try to split huge page, so we can truly punch the hole or truncate */
-        return split_huge_page(page) >= 0;
+        /*
+         * split_huge_page may remove the page itself; if so, we should
+         * not attempt to remove it again.
+         */
+        return split_huge_page(page) >= 0 && page->mapping;