From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1760659AbaGPIfC (ORCPT ); Wed, 16 Jul 2014 04:35:02 -0400 Received: from cantor2.suse.de ([195.135.220.15]:41996 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758622AbaGPIe7 (ORCPT ); Wed, 16 Jul 2014 04:34:59 -0400 Date: Wed, 16 Jul 2014 10:34:56 +0200 From: Michal Hocko To: Johannes Weiner Cc: Hugh Dickins , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [patch 2/3] mm: memcontrol: rewrite uncharge API fix - double migration Message-ID: <20140716083456.GC7121@dhcp22.suse.cz> References: <1404759133-29218-1-git-send-email-hannes@cmpxchg.org> <1404759133-29218-3-git-send-email-hannes@cmpxchg.org> <20140715144539.GR29639@cmpxchg.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20140715144539.GR29639@cmpxchg.org> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org [Sorry I have missed this thread] On Tue 15-07-14 10:45:39, Johannes Weiner wrote: [...] > From 274b94ad83b38fe7dc1707a8eb4015b3ab1673c5 Mon Sep 17 00:00:00 2001 > From: Johannes Weiner > Date: Thu, 10 Jul 2014 01:02:11 +0000 > Subject: [patch] mm: memcontrol: rewrite uncharge API fix - double migration > > Hugh reports: > > VM_BUG_ON_PAGE(!(pc->flags & PCG_MEM)) > mm/memcontrol.c:6680! > page had count 1 mapcount 0 mapping anon index 0x196 > flags locked uptodate reclaim swapbacked, pcflags 1, memcg not root > mem_cgroup_migrate < move_to_new_page < migrate_pages < compact_zone < > compact_zone_order < try_to_compact_pages < __alloc_pages_direct_compact < > __alloc_pages_nodemask < alloc_pages_vma < do_huge_pmd_anonymous_page < > handle_mm_fault < __do_page_fault > > mem_cgroup_migrate() assumes that a page is only migrated once and > then freed immediately after. > > However, putting the page back on the LRU list and dropping the > isolation refcount is not done atomically. This allows a PFN-based > migrator like compaction to isolate the page, see the expected > anonymous page refcount of 1, and migrate the page once more. > > Furthermore, once the charges are transferred to the new page, the old > page no longer has a pin on the memcg, which might get released before > the page itself now. pc->mem_cgroup is invalid at this point, but > PCG_USED suggests otherwise, provoking use-after-free. The same applies to to the new page because we are transferring only statistics. The old page with PCG_USED would uncharge the res_counter and so the new page is not backed by any and so memcg can go away. This sounds like a more probable scenario to me because old page should go away quite early after successful migration. > Properly uncharge the page after it's been migrated, including the > clearing of PCG_USED, so that a subsequent charge migration attempt > will be able to detect it and bail out. > > Signed-off-by: Johannes Weiner > Reported-by: Hugh Dickins > --- > mm/memcontrol.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 1e3b27f8dc2f..1439537fe7c9 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -6655,7 +6655,6 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage, > > VM_BUG_ON_PAGE(!(pc->flags & PCG_MEM), oldpage); > VM_BUG_ON_PAGE(do_swap_account && !(pc->flags & PCG_MEMSW), oldpage); > - pc->flags &= ~(PCG_MEM | PCG_MEMSW); > > if (PageTransHuge(oldpage)) { > nr_pages <<= compound_order(oldpage); > @@ -6663,6 +6662,13 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage, > VM_BUG_ON_PAGE(!PageTransHuge(newpage), newpage); > } > > + pc->flags = 0; > + > + local_irq_disable(); > + mem_cgroup_charge_statistics(pc->mem_cgroup, oldpage, -nr_pages); > + memcg_check_events(pc->mem_cgroup, oldpage); > + local_irq_enable(); > + > commit_charge(newpage, pc->mem_cgroup, nr_pages, lrucare); > } Looks good to me. I am just wondering whether we should really fiddle with stats and events when actually nothing changed during the transition. I would simply extract core of commit_charge into __commit_charge which would be called from here. The impact is minimal because events are rate limited and stats are per-cpu so it is not a big deal it just looks ugly to me. -- Michal Hocko SUSE Labs