From: Matthew Wilcox <willy@infradead.org>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: linux-mm@kvack.org, Minchan Kim <minchan@kernel.org>,
Nitin Gupta <ngupta@vflare.org>,
Sergey Senozhatsky <senozhatsky@chromium.org>,
Kent Overstreet <kent.overstreet@gmail.com>,
Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: pageless memory & zsmalloc
Date: Thu, 7 Oct 2021 19:11:36 +0100
Message-ID: <YV84WBH9eqW4xB3d@casper.infradead.org>
In-Reply-To: <3a78e51a-66f2-5d4b-70ee-c2bc3969d095@suse.cz>

On Thu, Oct 07, 2021 at 05:03:12PM +0200, Vlastimil Babka wrote:
> On 10/5/21 19:51, Matthew Wilcox wrote:
> > We're trying to tidy up the mess in struct page, and as part of removing
> > slab from struct page, zsmalloc came on my radar because it's using some
> > of slab's fields. The eventual endgame is to get struct page down to a
> > single word which points to the "memory descriptor" (ie the current
> > zspage).
> >
> > zsmalloc, like vmalloc, allocates order-0 pages. Unlike vmalloc,
> > zsmalloc allows compaction. Currently (from the file):
> >
> > * Usage of struct page fields:
> > * page->private: points to zspage
> > * page->freelist(index): links together all component pages of a zspage
> > * For the huge page, this is always 0, so we use this field
> > * to store handle.
> > * page->units: first object offset in a subpage of zspage
> > *
> > * Usage of struct page flags:
> > * PG_private: identifies the first component page
> > * PG_owner_priv_1: identifies the huge component page
> >
> > This isn't quite everything. For compaction, zsmalloc also uses
> > page->mapping (set in __SetPageMovable()), PG_lock (to sync with
> > compaction) and page->_refcount (compaction gets a refcount on the page).
> >
> > Since zsmalloc is so well-contained, I propose we completely stop
> > using struct page in it, as we intend to do for the rest of the users
> > of struct page. That is, the _only_ element of struct page we use is
> > compound_head and it points to struct zspage.
> >
> > That means every single page allocated by zsmalloc is PageTail(). Also it
>
> I would be worried there is code, i.e. some pfn scanner that will see a
> PageTail, lookup its compound_head() and order and use it to skip over the
> rest of tail pages. Which would fail spectacularly if compound_head()
> pointed somewhere else than to the same memmap array to a struct page.

Yes, that's definitely a concern. What does work is a pfn scanner
doing pfn |= (1 << page_order(page)) - 1; because page_order(zspage)
is 0, that is a no-op. It's something that will need to be audited
before we do this.
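
To make the masking step described above concrete, here is a toy userspace
sketch, not kernel code. struct toy_page, toy_block_end() and toy_scan() are
hypothetical names invented for this illustration, and the sketch assumes
every block of order N starts at a pfn aligned to 2^N. It only models the
arithmetic: OR-ing in the low bits moves pfn to the last page of its block,
and with order 0 the mask is 0, so pfn is left unchanged.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for a memmap entry; only the order matters here. */
struct toy_page {
	unsigned int order;	/* 0 models an order-0 page such as zsmalloc's */
};

/*
 * Last pfn of the 2^order block containing pfn, assuming the block is
 * naturally aligned. With order == 0 the mask is 0 and pfn is unchanged.
 */
static unsigned long toy_block_end(unsigned long pfn, unsigned int order)
{
	assert(order < sizeof(unsigned long) * CHAR_BIT);
	return pfn | ((1UL << order) - 1);
}

/*
 * Count how many entries a scanner touches in [start, end) when it jumps
 * to the end of each block before the loop increment.
 */
static unsigned long toy_scan(const struct toy_page *map,
			      unsigned long start, unsigned long end)
{
	unsigned long pfn, visited = 0;

	for (pfn = start; pfn < end; pfn++) {
		visited++;
		pfn = toy_block_end(pfn, map[pfn].order);
	}
	return visited;
}
```

A scanner written this way never dereferences compound_head(), so it does
not care where that pointer leads. Scanners that do follow compound_head()
back into the memmap are the ones the audit would need to find.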