All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michel Lespinasse <walken@google.com>
To: Andrea Arcangeli <aarcange@redhat.com>, Hugh Dickins <hughd@google.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>,
	Minchan Kim <minchan.kim@gmail.com>,
	Johannes Weiner <jweiner@redhat.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Shaohua Li <shaohua.li@intel.com>
Subject: Re: [RFC PATCH 0/3] page count lock for simpler put_page
Date: Thu, 4 Aug 2011 23:39:19 -0700	[thread overview]
Message-ID: <CANN689HpuQ3bAW946c4OeoLLAUXHd6nzp+NVxkrFgZo7k3k0Kg@mail.gmail.com> (raw)
In-Reply-To: <1312492042-13184-1-git-send-email-walken@google.com>

On Thu, Aug 4, 2011 at 2:07 PM, Michel Lespinasse <walken@google.com> wrote:
> Patch 3 demonstrates my motivation for this patch series: in my pre-THP
> implementation of idle page tracking, I was able to use get_page_unless_zero
> in a way that __split_huge_page_refcount made unsafe. Building on top of
> patch 2, I can make the required operation safe again. If patch 2 was to
> be rejected, I would like to get suggestions about alternative approaches
> to implement the get_first_page_unless_zero() operation described here.

I should add that I am quite worried about the places that use
get_page_unless_zero (or the page_cache_*_speculative wrappers) today.
My worrisome scenario would be as follows:

- thread T finds a pointer to a page P (possibly from a radix tree in
find_get_page() )
- page P gets freed by another thread
- page P gets re-allocated as the tail of a THP page by another thread
- another thread gets a reference on page P
- thread T proceeds doing page_cache_get_speculative(P), intending to
then check that P is really the page it wanted
- another thread splits up P's compound page;
__split_huge_page_refcount subtracts T's refcount on P from head(P)'s
refcount
- thread T figures out that it didn't get the page it expected, calls
page_cache_release(P). But it's too late - the refcount for what used
to be head(P) has already been corrupted (incorrectly decremented).

Does anything prevent the above ?

I can see that the page_cache_get_speculative comment in
include/linux/pagemap.h maps out one way to prevent the issue. If
thread T continually held an rcu read lock from the time it finds the
pointer to P until the time it calls get_page_unless_zero on that
page, AND there was a synchronize_rcu() call somewhere between the
time a THP page gets allocated and the time __split_huge_page_refcount
might first get called on that page, then things would be safe.
However, that does not seem to be true today: I could not find a
synchronize_rcu() call before __split_huge_page_refcount(), AND there
are also places (such as deactivate_page() for example) that call
get_page_unless_zero without being within an rcu read locked section
(or holding the zone lru lock to provide exclusion against
__split_huge_page_refcount).

Is there another protection mechanism that I have not considered ?

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2011-08-05  6:39 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-08-04 21:07 [RFC PATCH 0/3] page count lock for simpler put_page Michel Lespinasse
2011-08-04 21:07 ` [RFC PATCH 1/3] mm: Replace naked page->_count accesses with accessor functions Michel Lespinasse
2011-08-04 21:07 ` [RFC PATCH 2/3] mm: page count lock Michel Lespinasse
2011-08-07 14:00   ` Minchan Kim
2011-08-04 21:07 ` [RFC PATCH 3/3] mm: get_first_page_unless_zero() Michel Lespinasse
2011-08-07 14:13   ` Minchan Kim
2011-08-05  6:39 ` Michel Lespinasse [this message]
2011-08-07 14:25   ` [RFC PATCH 0/3] page count lock for simpler put_page Minchan Kim
2011-08-09 11:04     ` Michel Lespinasse
2011-08-09 22:22       ` Minchan Kim
2011-08-12 22:35         ` Michel Lespinasse
2011-08-13  4:07           ` Minchan Kim
2011-08-12 15:36       ` Andrea Arcangeli
2011-08-12 16:08         ` SPAM: " Paul E. McKenney
2011-08-12 16:43           ` Andrea Arcangeli
2011-08-12 17:27             ` Paul E. McKenney
2011-08-12 23:45               ` Michel Lespinasse
2011-08-13  1:57                 ` Paul E. McKenney
2011-08-13 23:56                   ` Andrea Arcangeli
2011-08-13  4:18             ` Minchan Kim
2011-08-12 16:57           ` Johannes Weiner
2011-08-12 17:08             ` Andrea Arcangeli
2011-08-12 17:52               ` Johannes Weiner
2011-08-12 18:13                 ` Paul E. McKenney
2011-08-12 19:05                   ` Johannes Weiner
2011-08-12 22:14                     ` Paul E. McKenney
2011-08-12 22:22                 ` Andrea Arcangeli
2011-08-12 18:03               ` Paul E. McKenney
2011-08-12 17:41             ` Paul E. McKenney
2011-08-12 17:56               ` Johannes Weiner
2011-08-12 23:02           ` Michel Lespinasse
2011-08-12 22:50         ` Michel Lespinasse
2011-08-13  4:11         ` Minchan Kim
2011-08-12 16:58   ` Andrea Arcangeli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CANN689HpuQ3bAW946c4OeoLLAUXHd6nzp+NVxkrFgZo7k3k0Kg@mail.gmail.com \
    --to=walken@google.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=jweiner@redhat.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=minchan.kim@gmail.com \
    --cc=riel@redhat.com \
    --cc=shaohua.li@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.