From: Konstantin Khlebnikov <khlebnikov@openvz.org>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 3/3] mm: adjust rss counters for migration entiries
Date: Tue, 24 Jan 2012 11:08:13 +0400 [thread overview]
Message-ID: <4F1E58DD.6030607@openvz.org> (raw)
In-Reply-To: <alpine.LSU.2.00.1201231719580.14979@eggly.anvils>
Hugh Dickins wrote:
> On Wed, 18 Jan 2012, Andrew Morton wrote:
>> On Wed, 11 Jan 2012 17:41:26 +0900
>> KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com> wrote:
>>> On Wed, 11 Jan 2012 12:23:11 +0400
>>> Konstantin Khlebnikov<khlebnikov@openvz.org> wrote:
>
> I only just got around to looking at these, sorry.
>
>>
>> Putting "fix" in the patch title text is a good way of handling this.
>>
>> I renamed [3/3] to "mm: fix rss count leakage during migration" and
>> shall queue it for 3.3. If people think we should also backport it
>> into -stable then please let me know.
>
> I don't think it needs backporting to stable: unless I'm forgetting
> something, the only thing that actually uses these rss counters is the
> OOM killer, and I don't think that will be greatly affected by the bug.
>
Yes, there are only counters. But, I can imagine local DoS attack via this race.
>>
>> I reordered the patches and worked the chagnelogs quite a bit. I now
>> have:
>>
>> : From: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Subject: mm: fix rss count leakage during migration
>> :
>> : Memory migration fills a pte with a migration entry and it doesn't update
>> : the rss counters. Then it replaces the migration entry with the new page
>> : (or the old one if migration failed). But between these two passes this
>> : pte can be unmaped, or a task can fork a child and it will get a copy of
>> : this migration entry. Nobody accounts for this in the rss counters.
>> :
>> : This patch properly adjust rss counters for migration entries in
>> : zap_pte_range() and copy_one_pte(). Thus we avoid extra atomic operations
>> : on the migration fast-path.
>> :
>> : Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Cc: Hugh Dickins<hughd@google.com>
>> : Cc: KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>
>
> That was a good find, Konstantin: thank you.
>
>>
>> and
>>
>> : From: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Subject: mm: add rss counters consistency check
>> :
>> : Warn about non-zero rss counters at final mmdrop.
>> :
>> : This check will prevent reoccurences of bugs such as that fixed in "mm:
>> : fix rss count leakage during migration".
>> :
>> : I didn't hide this check under CONFIG_VM_DEBUG because it rather small and
>> : rss counters cover whole page-table management, so this is a good
>> : invariant.
>> :
>> : Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Cc: Hugh Dickins<hughd@google.com>
>> : Cc: KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>
>
> I'd be happier with this one if you do hide the check under
> CONFIG_VM_DEBUG - or even under CONFIG_DEBUG_VM if you want it to
> be compiled in sometimes ;) I suppose NR_MM_COUNTERS is only 3,
> so it isn't a huge overhead; but I think you're overestimating the
> importance of these counters, and it would look better under DEBUG_VM.
Theoretically, some drivers can touch page tables,
for example if they do that outside of vma we can get some kind of strange memory leaks.
>
>>
>> and
>>
>> : From: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Subject: mm: postpone migrated page mapping reset
>> :
>> : Postpone resetting page->mapping until the final remove_migration_ptes().
>> : Otherwise the expression PageAnon(migration_entry_to_page(entry)) does not
>> : work.
>> :
>> : Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> : Cc: Hugh Dickins<hughd@google.com>
>> : Cc: KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>
>
> Isn't this one actually an essential part of the fix? It should have
> been part of the same patch, but you split them apart, now Andrew has
> reordered them and pushed one part to 3.3, but this needs to go in too?
>
Oops. I missed that. Yes. race-fix does not work for anon-memory without that patch.
But this is non-fatal, there are no new bugs.
> Hugh
next prev parent reply other threads:[~2012-01-24 7:08 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-01-06 17:38 [PATCH 1/3] mm: add rss counters consistency check Konstantin Khlebnikov
2012-01-06 17:38 ` [PATCH 2/3] mm: postpone migrated page mapping reset Konstantin Khlebnikov
2012-01-06 17:38 ` [PATCH 3/3] mm: adjust rss counters for migration entiries Konstantin Khlebnikov
2012-01-06 17:45 ` Konstantin Khlebnikov
2012-01-11 5:41 ` KAMEZAWA Hiroyuki
2012-01-11 8:23 ` Konstantin Khlebnikov
2012-01-11 8:41 ` KAMEZAWA Hiroyuki
2012-01-18 23:21 ` Andrew Morton
2012-01-24 1:42 ` Hugh Dickins
2012-01-24 7:08 ` Konstantin Khlebnikov [this message]
2012-01-25 23:01 ` Hugh Dickins
2012-01-26 0:20 ` Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F1E58DD.6030607@openvz.org \
--to=khlebnikov@openvz.org \
--cc=akpm@linux-foundation.org \
--cc=hughd@google.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).