linux-mm.kvack.org archive mirror
From: Matthew Wilcox <willy@infradead.org>
To: Alex Shi <alex.shi@linux.alibaba.com>
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, akpm@linux-foundation.org,
	mgorman@techsingularity.net, tj@kernel.org, hughd@google.com,
	khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com,
	yang.shi@linux.alibaba.com,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Vladimir Davydov" <vdavydov.dev@gmail.com>,
	"Roman Gushchin" <guro@fb.com>,
	"Shakeel Butt" <shakeelb@google.com>,
	"Chris Down" <chris@chrisdown.name>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Vlastimil Babka" <vbabka@suse.cz>, "Qian Cai" <cai@lca.pw>,
	"Andrey Ryabinin" <aryabinin@virtuozzo.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Andrea Arcangeli" <aarcange@redhat.com>,
	"David Rientjes" <rientjes@google.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	swkhack <swkhack@gmail.com>,
	"Potyra, Stefan" <Stefan.Potyra@elektrobit.com>,
	"Mike Rapoport" <rppt@linux.vnet.ibm.com>,
	"Stephen Rothwell" <sfr@canb.auug.org.au>,
	"Colin Ian King" <colin.king@canonical.com>,
	"Jason Gunthorpe" <jgg@ziepe.ca>,
	"Mauro Carvalho Chehab" <mchehab+samsung@kernel.org>,
	"Peng Fan" <peng.fan@nxp.com>,
	"Nikolay Borisov" <nborisov@suse.com>,
	"Ira Weiny" <ira.weiny@intel.com>,
	"Kirill Tkhai" <ktkhai@virtuozzo.com>,
	"Yafang Shao" <laoar.shao@gmail.com>
Subject: Re: [PATCH v3 3/7] mm/lru: replace pgdat lru_lock with lruvec lock
Date: Fri, 15 Nov 2019 20:38:06 -0800
Message-ID: <20191116043806.GD20752@bombadil.infradead.org>
In-Reply-To: <1573874106-23802-4-git-send-email-alex.shi@linux.alibaba.com>

On Sat, Nov 16, 2019 at 11:15:02AM +0800, Alex Shi wrote:
> This is the main patch to replace per node lru_lock with per memcg
> lruvec lock. It also folds the irqsave flags into lruvec.

I have to say, I don't love the part where we fold the irqsave flags
into the lruvec.  I know it saves us an argument, but it opens up the
possibility of mismatched expectations.  For example, we currently have:

static void __split_huge_page(struct page *page, struct list_head *list,
			struct lruvec *lruvec, pgoff_t end)
{
...
	spin_unlock_irqrestore(&lruvec->lru_lock, lruvec->irqflags);

so if we introduce a new caller, we have to be certain that this caller
is also using lock_page_lruvec_irqsave() and not lock_page_lruvec_irq().
I can't think of a way to make the compiler enforce that, and if we don't,
then we can get some odd crashes with interrupts being unexpectedly
enabled or disabled, depending on how ->irqflags was used last.

So it makes the code more subtle.  And that's not a good thing.
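
To make the failure mode concrete, here is a minimal userspace sketch, not
kernel code.  Everything prefixed model_ is hypothetical and exists only
for illustration: the interrupt state is modelled as a single bool, and
the lock as a flag.  It assumes an earlier, correctly paired user left a
"disabled" value behind in ->irqflags.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: one "CPU" whose interrupt state is a bool. */
static bool irqs_enabled = true;

struct model_lruvec {
	bool locked;
	unsigned long irqflags;	/* the folded-in flags under discussion */
};

/* Models spin_lock_irq(): disables interrupts, saves nothing. */
static void model_lock_irq(struct model_lruvec *lv)
{
	irqs_enabled = false;
	assert(!lv->locked);
	lv->locked = true;
}

/* Models spin_lock_irqsave() with the flags stored in the lruvec. */
static void model_lock_irqsave(struct model_lruvec *lv)
{
	unsigned long flags = irqs_enabled;

	irqs_enabled = false;
	assert(!lv->locked);
	lv->locked = true;
	lv->irqflags = flags;
}

/* Models spin_unlock_irqrestore(&lruvec->lru_lock, lruvec->irqflags). */
static void model_unlock_irqrestore(struct model_lruvec *lv)
{
	assert(lv->locked);
	lv->locked = false;
	irqs_enabled = lv->irqflags != 0;
}

/* Correct pairing: the caller gets back the state it entered with. */
static bool model_matched_caller(void)
{
	struct model_lruvec lv = { .locked = false, .irqflags = 0 };

	irqs_enabled = true;
	model_lock_irqsave(&lv);
	model_unlock_irqrestore(&lv);
	return irqs_enabled;
}

/* Mismatch: lock with _irq, unlock through the stale ->irqflags. */
static bool model_mismatched_caller(void)
{
	struct model_lruvec lv = { .locked = false, .irqflags = 0 };

	/* An earlier, correctly paired user ran with interrupts off. */
	irqs_enabled = false;
	model_lock_irqsave(&lv);
	model_unlock_irqrestore(&lv);

	/* New caller enters with interrupts on and never writes ->irqflags. */
	irqs_enabled = true;
	model_lock_irq(&lv);
	model_unlock_irqrestore(&lv);	/* restores the earlier user's state */
	return irqs_enabled;
}
```

In this model the matched caller returns with interrupts enabled, as it
entered, while the mismatched caller returns with them disabled.  Nothing
in the types distinguishes the two callers, which is the part the compiler
cannot check.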

> +static inline struct lruvec *lock_page_lruvec_irq(struct page *page,
> +						struct pglist_data *pgdat)
> +{
> +	struct lruvec *lruvec = mem_cgroup_page_lruvec(page, pgdat);
> +
> +	spin_lock_irq(&lruvec->lru_lock);
> +
> +	return lruvec;
> +}

...

> +static struct lruvec *lock_page_lru(struct page *page, int *isolated)
>  {
>  	pg_data_t *pgdat = page_pgdat(page);
> +	struct lruvec *lruvec = lock_page_lruvec_irq(page, pgdat);
>  
> -	spin_lock_irq(&pgdat->lru_lock);
>  	if (PageLRU(page)) {
> -		struct lruvec *lruvec;
>  
> -		lruvec = mem_cgroup_page_lruvec(page, pgdat);
>  		ClearPageLRU(page);
>  		del_page_from_lru_list(page, lruvec, page_lru(page));
>  		*isolated = 1;
>  	} else
>  		*isolated = 0;
> +
> +	return lruvec;
>  }

But what if the page is !PageLRU?  What lruvec did we just lock?
According to the comments on mem_cgroup_page_lruvec(),

 * This function is only safe when following the LRU page isolation
 * and putback protocol: the LRU lock must be held, and the page must
 * either be PageLRU() or the caller must have isolated/allocated it.

and now it's being called in order to find out which LRU lock to take.
So either this comment is wrong and needs to be updated, or this patch
has a race.
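
The shape of the concern can be sketched in userspace as follows.  This is
a single-threaded model with hypothetical model_ names, not kernel code,
and the "move" between lookup and lock is an assumed interleaving injected
by hand.  Whether such a move can really happen to a !PageLRU page is
exactly the open question above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_lruvec {
	int id;
	bool locked;
};

struct model_page {
	bool on_lru;
	struct model_lruvec *lruvec;	/* stands in for the page's memcg lruvec */
};

/* Models mem_cgroup_page_lruvec(): a plain lookup, no lock held. */
static struct model_lruvec *model_page_lruvec(struct model_page *page)
{
	return page->lruvec;
}

/*
 * Models "look up the lruvec, then lock it".  If moved_to is non-NULL,
 * the page is moved to that lruvec in the window between the lookup and
 * the lock (assumed interleaving).  Returns true if the lock that was
 * taken still covers the page.
 */
static bool model_lookup_then_lock(struct model_page *page,
				   struct model_lruvec *moved_to)
{
	struct model_lruvec *lruvec = model_page_lruvec(page);
	bool covers;

	if (moved_to)
		page->lruvec = moved_to;	/* the window */

	assert(!lruvec->locked);
	lruvec->locked = true;
	covers = model_page_lruvec(page) == lruvec;
	lruvec->locked = false;

	return covers;
}
```

With no move in the window the lock covers the page; with a move, the
caller holds the old lruvec's lock while the page belongs to another one.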
