* Distinguishing VMalloc pages
@ 2018-06-11 12:11 Matthew Wilcox
  2018-06-11 17:25 ` Christopher Lameter
  2018-06-12  9:54 ` Igor Stoppa
  0 siblings, 2 replies; 6+ messages in thread
From: Matthew Wilcox @ 2018-06-11 12:11 UTC (permalink / raw)
  To: Andrey Ryabinin, Michal Hocko, linux-mm


I think we all like the idea of being able to look at a page [1] and
determine what it's used for.  We have two places that we already look:

PageSlab
page_type

It's not possible to use page_type for VMalloc pages because that field
is in use for mapcount.  We don't want to use another page flag bit.

I tried to use the page->mapping field in my earlier patch and that was
a problem because page_mapping() would return non-NULL, which broke
user-space unmapping of vmalloced pages through the zap_pte_range ->
set_page_dirty path.

I can see two alternatives to pursue here.  One is that we already have
special casing in page_mapping():

 	if ((unsigned long)mapping & PAGE_MAPPING_ANON)
 		return NULL;

So changing:
-#define MAPPING_VMalloc                (void *)0x440
+#define MAPPING_VMalloc                (void *)0x441

in my original patch would lead to page_mapping() returning NULL.
Are there other paths where having a special value in page->mapping is
going to cause a problem?  Indeed, is having the PAGE_MAPPING_ANON bit
set in these pages going to cause a problem?  I just don't know those
code paths well enough.
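
A minimal userspace sketch of the check described above, assuming
PAGE_MAPPING_ANON is bit 0 of page->mapping.  The names
sketch_page_mapping(), MAPPING_VMALLOC_OLD and MAPPING_VMALLOC_NEW are
hypothetical, and the real page_mapping() handles more cases than this
single test; it only illustrates why 0x441 is filtered and 0x440 is not.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the anon marker is the low bit of page->mapping. */
#define PAGE_MAPPING_ANON	0x1UL

/* Hypothetical names for the two marker values discussed above. */
#define MAPPING_VMALLOC_OLD	((void *)0x440)
#define MAPPING_VMALLOC_NEW	((void *)0x441)

/*
 * Simplified stand-in for page_mapping(): only the special case quoted
 * above is modelled.  Any value with bit 0 set is reported as NULL.
 */
static void *sketch_page_mapping(void *mapping)
{
	if ((uintptr_t)mapping & PAGE_MAPPING_ANON)
		return NULL;
	return mapping;
}
```

With these definitions, sketch_page_mapping(MAPPING_VMALLOC_OLD) returns
the marker itself (non-NULL), while sketch_page_mapping(MAPPING_VMALLOC_NEW)
returns NULL because bit 0 is set.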

Another possibility is putting a special value in one of the other
fields of struct page.

1. page->private is not available; everybody uses that field for
everything already, and there's no way that any value could be special
enough to be unique.
2. page->index (on 32-bit systems) can already have all possible values.
3. page->lru.  The second word is already used for many random things,
but the first word is always either a pointer or compound_head (with
bit 0 set).  So we could use a set of values with bits 0 & 1 clear, and
below 4kB (ie 1023 values total) to distinguish pages.
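
A rough sketch of how option 3 could be tested, assuming a 4kB page size
and that zero is excluded from the marker set (which is what gives 1023
rather than 1024 values).  is_type_marker() and count_type_markers() are
hypothetical helpers, not existing kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4kB pages, so no valid kernel pointer is below this. */
#define SKETCH_PAGE_SIZE	4096UL

/*
 * A first word of page->lru counts as a type marker if it is non-zero,
 * below one page, and has bits 0 and 1 clear (bit 0 set would mean
 * compound_head).
 */
static bool is_type_marker(uintptr_t first_word)
{
	return first_word != 0 &&
	       first_word < SKETCH_PAGE_SIZE &&
	       (first_word & 0x3) == 0;
}

/* Count how many distinct marker values the scheme provides. */
static unsigned int count_type_markers(void)
{
	unsigned int n = 0;
	uintptr_t v;

	for (v = 0; v < SKETCH_PAGE_SIZE; v++)
		if (is_type_marker(v))
			n++;
	return n;
}
```

The markers are the multiples of 4 from 4 to 4092, so
count_type_markers() returns 1023.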

Any preferences/recommendations/words of warning?

[1] It may be helpful to refer to the 'new64' tab for a visual depiction:
https://docs.google.com/spreadsheets/d/1tvCszs_7FXrjei9_mtFiKV6nW1FLnYyvPvW-qNZhdog/edit#gid=1941250461


* Re: Distinguishing VMalloc pages
  2018-06-11 12:11 Distinguishing VMalloc pages Matthew Wilcox
@ 2018-06-11 17:25 ` Christopher Lameter
  2018-06-11 17:59   ` Matthew Wilcox
  2018-06-12  9:54 ` Igor Stoppa
  1 sibling, 1 reply; 6+ messages in thread
From: Christopher Lameter @ 2018-06-11 17:25 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Andrey Ryabinin, Michal Hocko, linux-mm

On Mon, 11 Jun 2018, Matthew Wilcox wrote:

>
> I think we all like the idea of being able to look at a page [1] and
> determine what it's used for.  We have two places that we already look:
>
> PageSlab
> page_type

Since we already have PageSlab: Is it possible to use that flag
differently so that it is maybe something like PageTyped(xx)? I think
there may be some bits available somewhere if PageSlab() is set, and these
typed pages usually are not on the LRU. So if it's untyped, the page is on
the LRU; otherwise the type can be identified somehow?


* Re: Distinguishing VMalloc pages
  2018-06-11 17:25 ` Christopher Lameter
@ 2018-06-11 17:59   ` Matthew Wilcox
  0 siblings, 0 replies; 6+ messages in thread
From: Matthew Wilcox @ 2018-06-11 17:59 UTC (permalink / raw)
  To: Christopher Lameter; +Cc: Andrey Ryabinin, Michal Hocko, linux-mm

On Mon, Jun 11, 2018 at 05:25:21PM +0000, Christopher Lameter wrote:
> On Mon, 11 Jun 2018, Matthew Wilcox wrote:
> 
> >
> > I think we all like the idea of being able to look at a page [1] and
> > determine what it's used for.  We have two places that we already look:
> >
> > PageSlab
> > page_type
> 
> Since we already have PageSlab: Is it possible to use that flag
> differently so that it is maybe something like PageTyped(xx)? I think
> there may be some bits available somewhere if PageSlab() is set, and these
> typed pages usually are not on the LRU. So if it's untyped, the page is on
> the LRU; otherwise the type can be identified somehow?

Yes, I've been thinking about that option too; thanks for bringing it up!

We need to go through the PageFlags and see which combinations of them
are valid.  I started on that in that same spreadsheet (purposes tab) ...

Type flags: SL RS HP
State: LO ER RF UP DI LR AC WA O1 A1 PR P2 WB
HD MD RC SB UV ML UC YG ID

Mapping - 0xxx
Slab - 1000
VMalloc - 1001
Reserved - 1010
HWPoison - 1011
Kernel - 1100
PageTable - 1101
PageBuddy - 1110
1111 unused for now

SL is the Slab bit.  RS is Reserved and HP is HWPoison.  I believe that
all three of those bits are mutually exclusive (but maybe I'm wrong).

At any rate, SwapBacked only makes sense on anonymous pages (right?) and
MappedToDisk certainly doesn't make sense on slab pages, so we can use
those two bits ... I think.
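
A small sketch of the encoding table above, treating the type as a plain
4-bit code.  How those four bits would be gathered from the actual page
flags is not settled in this thread, so the enum and sketch_decode_type()
are purely illustrative, hypothetical names.

```c
/* Hypothetical codes, copied from the table above. */
enum sketch_page_type {
	SKETCH_MAPPING   = 0x0,	/* 0xxx: any code with the top bit clear */
	SKETCH_SLAB      = 0x8,	/* 1000 */
	SKETCH_VMALLOC   = 0x9,	/* 1001 */
	SKETCH_RESERVED  = 0xa,	/* 1010 */
	SKETCH_HWPOISON  = 0xb,	/* 1011 */
	SKETCH_KERNEL    = 0xc,	/* 1100 */
	SKETCH_PAGETABLE = 0xd,	/* 1101 */
	SKETCH_PAGEBUDDY = 0xe,	/* 1110 */
	SKETCH_UNUSED    = 0xf,	/* 1111: unused for now */
};

/*
 * Decode a 4-bit type code.  Codes with the top bit clear all collapse
 * to "Mapping"; the remaining eight codes map one-to-one.
 */
static enum sketch_page_type sketch_decode_type(unsigned int code)
{
	code &= 0xf;
	if (!(code & 0x8))
		return SKETCH_MAPPING;
	return (enum sketch_page_type)code;
}
```

For example, sketch_decode_type(0x9) yields SKETCH_VMALLOC, and any code
from 0x0 to 0x7 yields SKETCH_MAPPING.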


* Re: Distinguishing VMalloc pages
  2018-06-11 12:11 Distinguishing VMalloc pages Matthew Wilcox
  2018-06-11 17:25 ` Christopher Lameter
@ 2018-06-12  9:54 ` Igor Stoppa
  2018-06-12 11:36   ` Matthew Wilcox
  1 sibling, 1 reply; 6+ messages in thread
From: Igor Stoppa @ 2018-06-12  9:54 UTC (permalink / raw)
  To: Matthew Wilcox, Andrey Ryabinin, Michal Hocko, linux-mm

On 11/06/18 15:11, Matthew Wilcox wrote:
> 
> I think we all like the idea of being able to look at a page [1] and
> determine what it's used for.  We have two places that we already look:
> 
> PageSlab
> page_type
> 
> It's not possible to use page_type for VMalloc pages because that field
> is in use for mapcount.  We don't want to use another page flag bit.
> 
> I tried to use the page->mapping field in my earlier patch and that was
> a problem because page_mapping() would return non-NULL, which broke
> user-space unmapping of vmalloced pages through the zap_pte_range ->
> set_page_dirty path.

This seems pretty similar to what I am doing in a preparatory patch for
pmalloc (I'm still working on this, I just got swamped in day-job 
related stuff, but I am progressing toward an example with IMA).
So it looks like my patch won't work, after all?

Although, in your case, you noticed a problem with userspace, while I do
not care at all about that, so maybe there is some wriggling space there ...

> 
> I can see two alternatives to pursue here.  One is that we already have
> special casing in page_mapping():
> 
>   	if ((unsigned long)mapping & PAGE_MAPPING_ANON)
>   		return NULL;
> 
> So changing:
> -#define MAPPING_VMalloc                (void *)0x440
> +#define MAPPING_VMalloc                (void *)0x441
> 
> in my original patch would lead to page_mapping() returning NULL.
> Are there other paths where having a special value in page->mapping is
> going to cause a problem?  Indeed, is having the PAGE_MAPPING_ANON bit
> set in these pages going to cause a problem?  I just don't know those
> code paths well enough.
> 
> Another possibility is putting a special value in one of the other
> fields of struct page.
> 
> 1. page->private is not available; everybody uses that field for
> everything already, and there's no way that any value could be special
> enough to be unique.
> 2. page->index (on 32-bit systems) can already have all possible values.
> 3. page->lru.  The second word is already used for many random things,
> but the first word is always either a pointer or compound_head (with
> bit 0 set).  So we could use a set of values with bits 0 & 1 clear, and
> below 4kB (ie 1023 values total) to distinguish pages.
> 
> Any preferences/recommendations/words of warning?


Why not have a reference (either direct or indirect) to the actual
vmap area, and then the flag there, instead?

I do not know the specific use case you have in mind - if any - but I
think that if one is already trying to figure out what sort of use the
vmalloc page is put to, then probably pretty soon there will be a need
for a reference to the area.

So what if the page could hold a reference to the area, where there would
be more space available for specifying what it is used for?

--
igor


* Re: Distinguishing VMalloc pages
  2018-06-12  9:54 ` Igor Stoppa
@ 2018-06-12 11:36   ` Matthew Wilcox
  2018-06-12 12:35     ` Igor Stoppa
  0 siblings, 1 reply; 6+ messages in thread
From: Matthew Wilcox @ 2018-06-12 11:36 UTC (permalink / raw)
  To: Igor Stoppa; +Cc: Andrey Ryabinin, Michal Hocko, linux-mm

On Tue, Jun 12, 2018 at 12:54:09PM +0300, Igor Stoppa wrote:
> On 11/06/18 15:11, Matthew Wilcox wrote:
> > I tried to use the page->mapping field in my earlier patch and that was
> > a problem because page_mapping() would return non-NULL, which broke
> > user-space unmapping of vmalloced pages through the zap_pte_range ->
> > set_page_dirty path.
> 
> This seems pretty similar to what I am doing in a preparatory patch for
> pmalloc (I'm still working on this, I just got swamped in day-job related
> stuff, but I am progressing toward an example with IMA).
> So it looks like my patch won't work, after all?
> 
> Although, in your case, you noticed a problem with userspace, while I do
> not care at all about that, so maybe there is some wriggling space there ...

Yes; if your pages can never be mapped to userspace, then there's no
problem.  Many other users of struct page use the page->mapping field
for other purposes.

> Why not have a reference (either direct or indirect) to the actual
> vmap area, and then the flag there, instead?

Because what we're trying to do is find out "Given a random struct page,
what is it used for".  It might be page cache, it might be slab, it
might be anything.  We can't go round randomly dereferencing pointers
and seeing what pot of gold is at the end of that rainbow.

> I do not know the specific use case you have in mind - if any - but I
> think that if one is already trying to figure out what sort of use the
> vmalloc page is put to, then probably pretty soon there will be a need
> for a reference to the area.
> 
> So what if the page could hold a reference to the area, where there would
> be more space available for specifying what it is used for?

It might be useful to refer to the earlier patch which included that
information:

https://www.spinics.net/lists/linux-mm/msg152818.html


* Re: Distinguishing VMalloc pages
  2018-06-12 11:36   ` Matthew Wilcox
@ 2018-06-12 12:35     ` Igor Stoppa
  0 siblings, 0 replies; 6+ messages in thread
From: Igor Stoppa @ 2018-06-12 12:35 UTC (permalink / raw)
  To: Matthew Wilcox, Igor Stoppa; +Cc: Andrey Ryabinin, Michal Hocko, linux-mm



On 12/06/18 14:36, Matthew Wilcox wrote:
> On Tue, Jun 12, 2018 at 12:54:09PM +0300, Igor Stoppa wrote:

[...]

>> Although, in your case, you noticed a problem with userspace, while I do
>> not care at all about that, so maybe there is some wriggling space there ...
> 
> Yes; if your pages can never be mapped to userspace, then there's no
> problem.  Many other users of struct page use the page->mapping field
> for other purposes.
> 
>> Why not have a reference (either direct or indirect) to the actual
>> vmap area, and then the flag there, instead?
> 
> Because what we're trying to do is find out "Given a random struct page,
> what is it used for".  It might be page cache, it might be slab, it
> might be anything.  We can't go round randomly dereferencing pointers
> and seeing what pot of gold is at the end of that rainbow.

Ah, I had understood that it was already given that it was a vmalloc page.

[...]

> It might be useful to refer to the earlier patch which included that
> information:
> 
> https://www.spinics.net/lists/linux-mm/msg152818.html

thank you,
igor
