From: Nathan Zimmer <nzimmer@sgi.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Mike Travis <travis@sgi.com>, Nathan Zimmer <nzimmer@sgi.com>,
	Peter Anvin <hpa@zytor.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>, Robin Holt <holt@sgi.com>,
	Rob Landley <rob@landley.net>,
	Daniel J Blueman <daniel@numascale-asia.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Yinghai Lu <yinghai@kernel.org>, Mel Gorman <mgorman@suse.de>
Subject: Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator
Date: Wed, 14 Aug 2013 17:15:06 -0500
Message-ID: <20130814221505.GA147490@asylum.americas.sgi.com>
In-Reply-To: <20130814110556.GH10849@gmail.com>

On Wed, Aug 14, 2013 at 01:05:56PM +0200, Ingo Molnar wrote:
> 
> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > [...]
> > 
> > Ok, so I don't know all the issues, and in many ways I don't even really 
> > care. You could do it other ways, I don't think this is a big deal. The 
> > part I hate is the runtime hook into the core MM page allocation code, 
> > so I'm just throwing out any random thing that comes to my mind that 
> > could be used to avoid that part.
> 
> So, my hope was that it's possible to have a single, simple, zero-cost 
> runtime check [zero cost for already initialized pages], because it can be 
> merged into already existing page flag mask checks present here and 
> executed for every freshly allocated page:
> 
> static inline int check_new_page(struct page *page)
> {
>         if (unlikely(page_mapcount(page) |
>                 (page->mapping != NULL)  |
>                 (atomic_read(&page->_count) != 0)  |
>                 (page->flags & PAGE_FLAGS_CHECK_AT_PREP) |
>                 (mem_cgroup_bad_page_check(page)))) {
>                 bad_page(page);
>                 return 1;
>         }
>         return 0;
> }
> 
> We already run this for every new page allocated and the initialization 
> check could hide in PAGE_FLAGS_CHECK_AT_PREP in a zero-cost fashion.
> 
> I'd not do any of the ensure_page_is_initialized() or 
> __expand_page_initialization() complications in this patch-set - each page 
> head represents itself and gets iterated when check_new_page() is done.
> 
> During regular bootup we'd initialize like before, except we don't set up 
> the page heads but memset() them to zero. With each page head 32 bytes 
> this would mean 8 GB of page head memory to clear per 1 TB - with 16 TB 
> that's 128 GB to clear - that ought to be possible to do rather quickly, 
> perhaps with some smart SMP cross-call approach that makes sure that each 
> memset is done in a node-local fashion. [*]
> 
> Such an approach should IMO be far smaller and less invasive than the 
> patches presented so far: it should be below 100 lines or so.
> 
> I don't know why there's such a big difference between the theory I 
> outlined and the invasive patch-set implemented so far in practice, 
> perhaps I'm missing some complication. I was trying to probe that 
> difference, before giving up on the idea and punting back to the async 
> hotplug-ish approach which would obviously work well too.
> 
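For reference, the 8 GB per 1 TB figure quoted above can be reproduced with a
few lines of C.  This is only a sketch of the arithmetic: the 32-byte page head
is the size given in the quote, while the 4 KiB base page size and the helper
name page_head_bytes() are assumptions made here for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages.  The 32-byte page head is from the quote. */
#define PAGE_SIZE_BYTES 4096ULL
#define PAGE_HEAD_BYTES 32ULL
#define GIB             (1ULL << 30)
#define TIB             (1ULL << 40)

/*
 * Hypothetical helper: bytes of page-head metadata needed to describe
 * mem_bytes of RAM, one page head per base page.
 */
static uint64_t page_head_bytes(uint64_t mem_bytes)
{
        /* Only whole pages are described by a page head. */
        assert(mem_bytes % PAGE_SIZE_BYTES == 0);
        return (mem_bytes / PAGE_SIZE_BYTES) * PAGE_HEAD_BYTES;
}
```

With these assumptions, page_head_bytes(1 * TIB) / GIB is 8 and
page_head_bytes(16 * TIB) / GIB is 128, which matches the numbers in the quote.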

The reason, which I failed to mention, is that once we pull a page off the lru
in either __rmqueue_fallback or __rmqueue_smallest, the first thing we do with
it is expand() or sometimes move_freepages().  These then trip over some BUG_ON
and VM_BUG_ON checks.
Those BUG_ONs are what keep causing me to delve into the ensure/expand foolishness.
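
To make the problem concrete, here is a small userspace model, not kernel code.
It loosely follows the shape of expand(): a free block of order `high` is split
down to order `low`, and on every step the upper half is handed back as a free
block by writing to the page head at page[size].  Those buddy heads are touched
before check_new_page() ever sees them.  All toy_* names are hypothetical, and
the model only counts accesses to uninitialized heads; it does not claim to
reproduce which BUG_ON or VM_BUG_ON actually fires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a page head; not the kernel's struct page. */
struct toy_page {
        bool initialized;   /* has this page head been set up yet? */
        unsigned int order; /* order recorded when it heads a free block */
};

/* Models the on-demand step: set up a page head the first time it is used. */
static void toy_ensure_initialized(struct toy_page *page)
{
        if (!page->initialized) {
                page->order = 0;
                page->initialized = true;
        }
}

/*
 * Split the block headed by `page` from order `high` down to order `low`.
 * Returns how many buddy heads were still uninitialized when touched, i.e.
 * how many times a sanity check on that head could have tripped.
 * With `ensure` set, each buddy head is initialized before it is used.
 */
static int toy_expand(struct toy_page *page, unsigned int low,
                      unsigned int high, bool ensure)
{
        size_t size = (size_t)1 << high;
        int tripped = 0;

        while (high > low) {
                high--;
                size >>= 1;

                /* Head of the upper half that goes back on a free list. */
                struct toy_page *buddy = &page[size];

                if (ensure)
                        toy_ensure_initialized(buddy);
                if (!buddy->initialized)
                        tripped++;

                buddy->order = high;
        }
        return tripped;
}
```

Splitting an order-4 block down to order 0 touches the heads at offsets 8, 4, 2
and 1.  If only the head at offset 0 has been initialized, all four are touched
uninitialized unless something like the ensure step runs first.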

Nate



