All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christian Borntraeger <borntraeger@de.ibm.com>
To: David Hildenbrand <david@redhat.com>,
	Alexander Duyck <alexander.duyck@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Qian Cai <cai@lca.pw>, Heiko Carstens <heiko.carstens@de.ibm.com>,
	gor@linux.ibm.com, linux-s390@vger.kernel.org,
	linux-mm <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm/page_alloc: fix a crash in free_pages_prepare()
Date: Mon, 30 Sep 2019 08:27:39 +0200	[thread overview]
Message-ID: <364ff782-b9d0-5228-0392-ecb61b786dce@de.ibm.com> (raw)
In-Reply-To: <5ee9164f-71c5-4082-a80d-8fbc5dc50750@redhat.com>



On 28.09.19 11:06, David Hildenbrand wrote:
> On 28.09.19 00:17, Alexander Duyck wrote:
>> On Fri, Sep 27, 2019 at 2:59 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>>>
>>> On Fri, 27 Sep 2019 17:28:06 -0400 Qian Cai <cai@lca.pw> wrote:
>>>
>>>>>
>>>>> So I think you've moved the arch_free_page() to be after the final
>>>>> thing which can access page contents, yes?  If so, we should have a
>>>>> comment in free_pages_prepare() to attmept to prevent this problem from
>>>>> reoccurring as the code evolves?
>>>>
>>>> Right, something like this above arch_free_page() there?
>>>>
>>>> /*
>>>>  * It needs to be just above kernel_map_pages(), as s390 could mark those
>>>>  * pages unused and then trigger a fault when accessing.
>>>>  */
>>>
>>> I did this.
>>>
>>> --- a/mm/page_alloc.c~mm-page_alloc-fix-a-crash-in-free_pages_prepare-fix
>>> +++ a/mm/page_alloc.c
>>> @@ -1179,7 +1179,13 @@ static __always_inline bool free_pages_p
>>>                 kernel_init_free_pages(page, 1 << order);
>>>
>>>         kernel_poison_pages(page, 1 << order, 0);
>>> +       /*
>>> +        * arch_free_page() can make the page's contents inaccessible.  s390
>>> +        * does this.  So nothing which can access the page's contents should
>>> +        * happen after this.
>>> +        */
>>>         arch_free_page(page, order);
>>> +
>>>         if (debug_pagealloc_enabled())
>>>                 kernel_map_pages(page, 1 << order, 0);
>>>
>>
>> So the question I would have is what is the state of the page after it
>> has been marked unused and then pulled back in? I'm assuming it will
>> be all 0s.
> 
> I think this comment relates to the s390x implementation, so I'll try to
> explain that. After arch_free_page() the page might have been zapped in
> the hypervisor, but that might happen deferred. The guest ends up
> triggering the ESSA instruction in arch_free_page(). That instruction
> sets some extended-page-table-related ("pgste") bits in the hypervisor
> tables for the guest ("gmap") and fills a buffer with these entries. The
> page is marked _PGSTE_GPS_USAGE_UNUSED.

Yes. And that also means that architecturally it can be 0 or it can contain
the old content depending on whether the host has paged that page out or not
and how many pages have been marked unused.
I am also sure that the implementation of z/VM and KVM do differ in that regard.
For example KVM does not make use of the logical zero state but z/VM does.

In essence you can consider this like a ballooner that takes away the page lazily.


For a writeup of the details see 
https://www.kernel.org/doc/ols/2006/ols2006v2-pages-321-336.pdf
(This also contains additional states that were never merged upstream)


  reply	other threads:[~2019-09-30  6:27 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-27 19:47 [PATCH] mm/page_alloc: fix a crash in free_pages_prepare() Qian Cai
2019-09-27 20:48 ` Andrew Morton
2019-09-27 21:15   ` Qian Cai
2019-09-27 21:15     ` Qian Cai
2019-09-30  7:44     ` Heiko Carstens
2019-09-27 21:02 ` Andrew Morton
2019-09-27 21:28   ` Qian Cai
2019-09-27 21:28     ` Qian Cai
2019-09-27 21:59     ` Andrew Morton
2019-09-27 22:17       ` Alexander Duyck
2019-09-27 22:17         ` Alexander Duyck
2019-09-28  9:06         ` David Hildenbrand
2019-09-30  6:27           ` Christian Borntraeger [this message]
2019-09-30  6:30       ` Christian Borntraeger
2019-09-30 11:00 ` Michal Hocko
2019-09-30 11:04 ` Kirill A. Shutemov
2019-09-30 11:22   ` Qian Cai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=364ff782-b9d0-5228-0392-ecb61b786dce@de.ibm.com \
    --to=borntraeger@de.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexander.duyck@gmail.com \
    --cc=cai@lca.pw \
    --cc=david@redhat.com \
    --cc=gor@linux.ibm.com \
    --cc=heiko.carstens@de.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-s390@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.