All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Qian Cai <cai@lca.pw>
Cc: Huang Ying <ying.huang@intel.com>,
	linux-mm@kvack.org, "Kirill A. Shutemov" <kirill@shutemov.name>
Subject: Re: page cache: Store only head pages in i_pages
Date: Fri, 29 Mar 2019 20:04:32 -0700	[thread overview]
Message-ID: <20190330030431.GX10344@bombadil.infradead.org> (raw)
In-Reply-To: <1553894734.26196.30.camel@lca.pw>

On Fri, Mar 29, 2019 at 05:25:34PM -0400, Qian Cai wrote:
> On Fri, 2019-03-29 at 12:59 -0700, Matthew Wilcox wrote:
> > Oh ... is it a race?
> > 
> > so CPU A does:
> > 
> > page = find_get_page(swap_address_space(entry), offset)
> >         page = find_subpage(page, offset);
> > trylock_page(page);
> > 
> > while CPU B does:
> > 
> > xa_lock_irq(&address_space->i_pages);
> > __delete_from_swap_cache(page, entry);
> >         xas_store(&xas, NULL);
> >         ClearPageSwapCache(page);
> > xa_unlock_irq(&address_space->i_pages);
> > 
> > and if the ClearPageSwapCache happens between the xas_load() and the
> > find_subpage(), we're stuffed.  CPU A has a reference to the page, but
> > not a lock, and find_get_page is running under RCU.
> > 
> > I suppose we could fix this by taking the i_pages xa_lock around the
> > call to find_get_pages().  If indeed, that's what this problem is.
> > Want to try this patch?
> 
> Confirmed. Well spotted!

Excellent!  I'm not comfortable with the rule that you have to be holding
the i_pages lock in order to call find_get_page() on a swap address_space.
How does this look to the various smart people who know far more about the
MM than I do?

The idea is to ensure that if this race does happen, the page will be
handled the same way as a pagecache page.  If __delete_from_swap_cache()
can be called while the page is still part of a VMA, then this patch
will break page_to_pgoff().  But I don't think that can happen ... ?

(also, is this the right memory barrier to use to ensure that the old
value of page->index cannot be observed if the PageSwapCache bit is
obseved as having been cleared?)

diff --git a/mm/swap_state.c b/mm/swap_state.c
index 2e15cc335966..a715efcf0991 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -165,13 +165,16 @@ void __delete_from_swap_cache(struct page *page, swp_entry_t entry)
 	VM_BUG_ON_PAGE(!PageSwapCache(page), page);
 	VM_BUG_ON_PAGE(PageWriteback(page), page);
 
+	page->index = idx;
+	smp_mb__before_atomic();
+	ClearPageSwapCache(page);
+
 	for (i = 0; i < nr; i++) {
 		void *entry = xas_store(&xas, NULL);
 		VM_BUG_ON_PAGE(entry != page, entry);
 		set_page_private(page + i, 0);
 		xas_next(&xas);
 	}
-	ClearPageSwapCache(page);
 	address_space->nrpages -= nr;
 	__mod_node_page_state(page_pgdat(page), NR_FILE_PAGES, -nr);
 	ADD_CACHE_INFO(del_total, nr);


  reply	other threads:[~2019-03-30  3:04 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <1553285568.26196.24.camel@lca.pw>
2019-03-23  3:38 ` page cache: Store only head pages in i_pages Matthew Wilcox
2019-03-23 23:50   ` Qian Cai
2019-03-24  2:06     ` Matthew Wilcox
2019-03-24  2:52       ` Qian Cai
2019-03-24  3:04         ` Matthew Wilcox
2019-03-24 15:42           ` Qian Cai
2019-03-27 10:48           ` William Kucharski
2019-03-27 11:50             ` Matthew Wilcox
2019-03-29  1:43           ` Qian Cai
2019-03-29 19:59             ` Matthew Wilcox
2019-03-29 21:25               ` Qian Cai
2019-03-30  3:04                 ` Matthew Wilcox [this message]
2019-03-30 14:10                   ` Matthew Wilcox
2019-03-31  3:23                     ` Matthew Wilcox
2019-04-01  9:18                       ` Kirill A. Shutemov
2019-04-01  9:27                         ` Kirill A. Shutemov
2019-04-04 13:10                           ` Qian Cai
2019-04-04 13:45                             ` Kirill A. Shutemov
2019-04-04 21:28                               ` Qian Cai
2019-04-05 13:37                                 ` Kirill A. Shutemov
2019-04-05 13:51                                   ` Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190330030431.GX10344@bombadil.infradead.org \
    --to=willy@infradead.org \
    --cc=cai@lca.pw \
    --cc=kirill@shutemov.name \
    --cc=linux-mm@kvack.org \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.