All of lore.kernel.org
 help / color / mirror / Atom feed
From: Charan Teja Kalla <quic_charante@quicinc.com>
To: Michal Hocko <mhocko@suse.com>
Cc: <akpm@linux-foundation.org>, <david@redhat.com>,
	<pasha.tatashin@soleen.com>, <sieberf@amazon.com>,
	<shakeelb@google.com>, <sjpark@amazon.de>, <dhowells@redhat.com>,
	<willy@infradead.org>, <quic_pkondeti@quicinc.com>,
	<linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>
Subject: Re: [PATCH V3] mm: fix use-after free of page_ext after race with memory-offline
Date: Tue, 16 Aug 2022 15:04:01 +0530	[thread overview]
Message-ID: <286e47e7-3d63-133c-aa6c-05100b557d42@quicinc.com> (raw)
In-Reply-To: <Yvpg6odyDsXrjw5i@dhcp22.suse.cz>

Thanks Michal!!

On 8/15/2022 8:36 PM, Michal Hocko wrote:
> On Tue 09-08-22 20:16:43, Charan Teja Kalla wrote:
> [...]
>> diff --git a/include/linux/page_ext.h b/include/linux/page_ext.h
>> index fabb2e1..0e259da 100644
>> --- a/include/linux/page_ext.h
>> +++ b/include/linux/page_ext.h
> [...]
>> @@ -87,5 +83,14 @@ static inline void page_ext_init_flatmem_late(void)
>>  static inline void page_ext_init_flatmem(void)
>>  {
>>  }
>> +
>> +static inline struct page *page_ext_get(struct page *page)
> struct page_ext *
> 
oops!! It didn't get caught as this is in !CONFIG_PAGE_EXTENSION.
>> +{
>> +	return NULL;
>> +}
>> +
>> +static inline void page_ext_put(void)
>> +{
>> +}
>>  #endif /* CONFIG_PAGE_EXTENSION */
>>  #endif /* __LINUX_PAGE_EXT_H */
> [...]
>> diff --git a/mm/page_ext.c b/mm/page_ext.c
>> index 3dc715d..91d7bd2 100644
>> --- a/mm/page_ext.c
>> +++ b/mm/page_ext.c
>> @@ -9,6 +9,7 @@
>>  #include <linux/page_owner.h>
>>  #include <linux/page_idle.h>
>>  #include <linux/page_table_check.h>
>> +#include <linux/rcupdate.h>
>>  
>>  /*
>>   * struct page extension
>> @@ -59,6 +60,10 @@
>>   * can utilize this callback to initialize the state of it correctly.
>>   */
>>  
>> +#ifdef CONFIG_SPARSEMEM
>> +#define PAGE_EXT_INVALID       (0x1)
>> +#endif
>> +
>>  #if defined(CONFIG_PAGE_IDLE_FLAG) && !defined(CONFIG_64BIT)
>>  static bool need_page_idle(void)
>>  {
>> @@ -84,6 +89,7 @@ static struct page_ext_operations *page_ext_ops[] __initdata = {
>>  unsigned long page_ext_size = sizeof(struct page_ext);
>>  
>>  static unsigned long total_usage;
>> +static struct page_ext *lookup_page_ext(const struct page *page);
>>  
>>  static bool __init invoke_need_callbacks(void)
>>  {
>> @@ -125,6 +131,37 @@ static inline struct page_ext *get_entry(void *base, unsigned long index)
>>  	return base + page_ext_size * index;
>>  }
>>  
>> +/*
>> + * This function gives proper page_ext of a memory section
>> + * during race with the offline operation on a memory block
>> + * this section falls into. Not using this function to get
>> + * page_ext of a page, in code paths where extra refcount
>> + * is not taken on that page eg: pfn walking, can lead to
>> + * use-after-free access of page_ext.
> I do not think this is really useful comment, it goes into way too much
> detail about memory hotplug yet not enough to actually understand the
> interaction because there are no references to the actual
> synchronization scheme. I would go with something like:
> 
> /*
>  * Get a page_ext associated with the given page. Returns NULL if
>  * no such page_ext exists otherwise ensures that the page_ext will
>  * stay alive until page_ext_put is called.
>  * This implies a non-sleeping context.
>  */

Will update as per the Matthew input @
https://lore.kernel.org/all/YvplthTjM8Ez5DIq@casper.infradead.org/
>> + */
>> +struct page_ext *page_ext_get(struct page *page)
>> +{
>> +	struct page_ext *page_ext;
>> +
>> +	rcu_read_lock();
>> +	page_ext = lookup_page_ext(page);
>> +	if (!page_ext) {
>> +		rcu_read_unlock();
>> +		return NULL;
>> +	}
>> +
>> +	return page_ext;
>> +}
>> +
>> +/*
>> + * Must be called after work is done with the page_ext received
>> + * with page_ext_get().
>> + */
>> +
>> +void page_ext_put(void)
>> +{
>> +	rcu_read_unlock();
>> +}
> Thinking about this some more I am not sure this is a good interface. It
> doesn't have any reference to the actual object this is called for. This
> is nicely visible in __folio_copy_owner which just calles page_ext_put()
> twice because there are 2 page_exts and I can already see how somebody
> might get confused this is just an error and send a patch to drop one of
> them.
> 
> I do understand why you went this way because having a parameter which
> is not used will likely lead to the same situation. On the other hand it
> could be annotated to not raise warnings. One potential way to
> workaround that would be
> 
> void page_ext_put(struct page_ext *page_ext)
> {
> 	if (unlikely(!page_ext))
> 		return;
> 	
> 	rcu_read_unlock();
> }
> 
> which would help to make the api slightly more robust in case somebody
> does page_ext_put in a branch where page_ext_get returns NULL.
> 
Looks better. Will change this accordingly.

> No strong opinion on that though. WDYI?
> 
>>  #ifndef CONFIG_SPARSEMEM
>>  
>>  
> [...]
>> @@ -183,19 +184,26 @@ static inline void __set_page_owner_handle(struct page_ext *page_ext,
>>  noinline void __set_page_owner(struct page *page, unsigned short order,
>>  					gfp_t gfp_mask)
>>  {
>> -	struct page_ext *page_ext = lookup_page_ext(page);
>> +	struct page_ext *page_ext = page_ext_get(page);
>>  	depot_stack_handle_t handle;
>>  
>>  	if (unlikely(!page_ext))
>>  		return;
> Either add a comment like this
> 	/* save_stack can sleep in general so we have to page_ext_put */


Vlastimil suggested to go for save stack first since !page_ext is mostly
unlikely.  Snip from his comments:
Why not simply do the save_stack() first and then page_ext_get() just
once? It should be really rare that it's NULL, so I don't think we save
much by avoiding an unnecessary save_stack(), while the overhead of
doing two get/put instead of one will affect every call.

https://lore.kernel.org/all/f5fd4942-b03e-1d1c-213b-9cd5283ced91@suse.cz/
>> +	page_ext_put();
>>  
>>  	handle = save_stack(gfp_mask);
> or just drop the initial page_ext_get altogether. This function is
> called only when page_ext is supposed to be initialized and !page_ext
> case above should be very unlikely. Or is there any reason to keep this?
> 


>> +
>> +	/* Ensure page_ext is valid after page_ext_put() above */
>> +	page_ext = page_ext_get(page);
>> +	if (unlikely(!page_ext))
>> +		return;
>>  	__set_page_owner_handle(page_ext, handle, order, gfp_mask);
>> +	page_ext_put();
>>  }
>>  
> [...]
>> @@ -558,13 +590,17 @@ read_page_owner(struct file *file, char __user *buf, size_t count, loff_t *ppos)
>>  		 */
>>  		handle = READ_ONCE(page_owner->handle);
>>  		if (!handle)
>> -			continue;
>> +			goto loop;
>>  
>>  		/* Record the next PFN to read in the file offset */
>>  		*ppos = (pfn - min_low_pfn) + 1;
>>  
>> +		memcpy(&page_owner_tmp, page_owner, sizeof(struct page_owner));
>> +		page_ext_put();
> why not
> 		page_owner_tmp = *page_owner;

Done!!
> 
>>  		return print_page_owner(buf, count, pfn, page,
>> -				page_owner, handle);
>> +				&page_owner_tmp, handle);
>> +loop:
>> +		page_ext_put();
>>  	}
>>  
>>  	return 0;

  parent reply	other threads:[~2022-08-16 10:40 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-09 14:46 [PATCH V3] mm: fix use-after free of page_ext after race with memory-offline Charan Teja Kalla
2022-08-10  1:57 ` Andrew Morton
2022-08-10  7:23   ` Michal Hocko
2022-08-10  8:27     ` Charan Teja Kalla
2022-08-10 11:40 ` Vlastimil Babka
2022-08-10 14:31   ` Charan Teja Kalla
2022-08-15 15:06 ` Michal Hocko
2022-08-15 15:26   ` Matthew Wilcox
2022-08-15 15:31     ` Michal Hocko
2022-08-16  9:34   ` Charan Teja Kalla [this message]
2022-08-16 16:15     ` Michal Hocko
2022-08-18 14:01       ` Charan Teja Kalla

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=286e47e7-3d63-133c-aa6c-05100b557d42@quicinc.com \
    --to=quic_charante@quicinc.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=dhowells@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=pasha.tatashin@soleen.com \
    --cc=quic_pkondeti@quicinc.com \
    --cc=shakeelb@google.com \
    --cc=sieberf@amazon.com \
    --cc=sjpark@amazon.de \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.