From: Dave Hansen <dave.hansen@intel.com>
To: "Odzioba, Lukasz" <lukasz.odzioba@intel.com>,
	Michal Hocko <mhocko@kernel.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"Shutemov, Kirill" <kirill.shutemov@intel.com>,
	"Anaczkowski, Lukasz" <lukasz.anaczkowski@intel.com>,
	"Shutemov, Kirill" <kirill.shutemov@intel.com>
Subject: Re: mm: pages are not freed from lru_add_pvecs after process termination
Date: Fri, 6 May 2016 09:04:34 -0700
Message-ID: <572CC092.5020702@intel.com>
In-Reply-To: <D6EDEBF1F91015459DB866AC4EE162CC023C402E@IRSMSX103.ger.corp.intel.com>

On 05/06/2016 08:10 AM, Odzioba, Lukasz wrote:
> On Thu 05-05-16 09:21:00, Michal Hocko wrote: 
>> Or maybe the async nature of flushing turns
>> out to be just impractical and unreliable and we will end up skipping
>> THP (or all compound pages) for pcp LRU add cache. Let's see...
> 
> What if we simply skip lru_add pvecs for compound pages?
> That way we still have compound pages on LRU's, but the problem goes
> away.  It is not quite what this naïve patch does, but it works nicely for me.
> 
> diff --git a/mm/swap.c b/mm/swap.c
> index 03aacbc..c75d5e1 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -392,7 +392,9 @@ static void __lru_cache_add(struct page *page)
>         get_page(page);
>         if (!pagevec_space(pvec))
>                 __pagevec_lru_add(pvec);
>         pagevec_add(pvec, page);
> +       if (PageCompound(page))
> +               __pagevec_lru_add(pvec);
>         put_cpu_var(lru_add_pvec);
>  }

That's not _quite_ what I had in mind since that drains the entire pvec
every time a large page is encountered.  But I'm conflicted about what
the right behavior _is_.

We'd be taking the LRU lock for 'page' anyway, so we might as well drain
the pvec.

Or does the additional work to put the page onto a pvec and then
immediately drain it outweigh that advantage?

Or does it just not matter?

Kirill, do you have a suggestion for how we should be checking for THP
pages in code like this?  PageCompound() will surely _work_ for anon-THP
and your file-THP, but is it the best way to check?

> Do we have any tests that I could use to measure performance impact
> of such changes before I start to tweak it up? Or maybe it doesn't make
> sense at all ?

You probably want to very carefully calculate the time to fault a page,
then separately to free a page.  If we can't manage to detect a delta on
a little microbenchmark like that then we'll probably never see one in
practice.

You'll want to measure the fault time for 4k pages, 2M pages, and then
possibly a mix.

You'll want to do this in a highly parallel test to make sure any
additional LRU lock overhead shows up.

Thread overview:
2016-04-27 17:01 mm: pages are not freed from lru_add_pvecs after process termination Odzioba, Lukasz
2016-04-27 17:11 ` Dave Hansen
2016-04-28 14:37   ` Michal Hocko
2016-05-02 13:00     ` Michal Hocko
2016-05-04 19:41       ` Odzioba, Lukasz
2016-05-04 20:16         ` Dave Hansen
2016-05-04 20:36         ` Michal Hocko
2016-05-05  7:21           ` Michal Hocko
2016-05-05 17:25             ` Odzioba, Lukasz
2016-05-11  7:38               ` Michal Hocko
2016-05-06 15:10             ` Odzioba, Lukasz
2016-05-06 16:04               ` Dave Hansen [this message]
2016-05-11  7:53                 ` Michal Hocko
2016-05-13 11:29                   ` Vlastimil Babka
2016-05-13 12:05                   ` Odzioba, Lukasz
2016-06-07  9:02                   ` Odzioba, Lukasz
2016-06-07 11:19                     ` Michal Hocko
2016-06-08  8:51                       ` Odzioba, Lukasz
2016-05-02 14:39   ` Vlastimil Babka
2016-05-02 15:01     ` Kirill A. Shutemov
2016-05-02 15:13       ` Vlastimil Babka
2016-05-02 15:49       ` Dave Hansen
2016-05-02 16:02         ` Kirill A. Shutemov
2016-05-03  7:37           ` Michal Hocko
2016-05-03 10:07             ` Kirill A. Shutemov
