linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Marek Szyprowski <m.szyprowski@samsung.com>
To: Peter Xu <peterx@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Nadav Amit <nadav.amit@gmail.com>,
	Matthew Wilcox <willy@infradead.org>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>,
	David Hildenbrand <david@redhat.com>,
	Hugh Dickins <hughd@google.com>,
	Jerome Glisse <jglisse@redhat.com>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Alistair Popple <apopple@nvidia.com>
Subject: Re: [PATCH v8 03/23] mm: Check against orig_pte for finish_fault()
Date: Thu, 14 Apr 2022 09:51:01 +0200	[thread overview]
Message-ID: <6ccf5f5f-8dc5-16cc-f06c-78401b822a54@samsung.com> (raw)
In-Reply-To: <Ylb9rXJyPm8/ao8f@xz-m1.local>

Hi Peter,

On 13.04.2022 18:43, Peter Xu wrote:
> On Wed, Apr 13, 2022 at 04:03:28PM +0200, Marek Szyprowski wrote:
>> On 05.04.2022 03:48, Peter Xu wrote:
>>> We used to check against none pte in finish_fault(), with the assumption
>>> that the orig_pte is always none pte.
>>>
>>> This change prepares us to be able to call do_fault() on !none ptes.  For
>>> example, we should allow that to happen for pte marker so that we can restore
>>> information out of the pte markers.
>>>
>>> Let's change the "pte_none" check into detecting changes since we fetched
>>> orig_pte.  One trivial thing to take care of here is, when pmd==NULL for
>>> the pgtable we may not initialize orig_pte at all in handle_pte_fault().
>>>
>>> By default orig_pte will be all zeros however the problem is not all
>>> architectures are using all-zeros for a none pte.  pte_clear() will be the
>>> right thing to use here so that we'll always have a valid orig_pte value
>>> for the whole handle_pte_fault() call.
>>>
>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>> This patch landed in today's linux next-202204213 as commit fa6009949163
>> ("mm: check against orig_pte for finish_fault()"). Unfortunately it
>> causes serious system instability on some ARM 32bit machines. I've
>> observed it on all tested boards (various Samsung Exynos based,
>> Raspberry Pi 3b and 4b, even QEMU's virt 32bit machine) when kernel was
>> compiled from multi_v7_defconfig.
> Thanks for the report.
>
>> Here is a crash log from QEMU's ARM 32bit virt machine:
>>
>> 8<--- cut here ---
>> Unable to handle kernel paging request at virtual address e093263c
>> [e093263c] *pgd=42083811, *pte=00000000, *ppte=00000000
>> Internal error: Oops: 807 [#1] SMP ARM
>> Modules linked in:
>> CPU: 1 PID: 37 Comm: kworker/u4:0 Not tainted
>> 5.18.0-rc2-00176-gfa6009949163 #11684
>> Hardware name: Generic DT based system
>> PC is at cpu_ca15_set_pte_ext+0x4c/0x58
>> LR is at handle_mm_fault+0x46c/0xbb0
> I had a feeling that for some reason the pte_clear() isn't working right
> there when it's applying to a kernel stack variable for arm32.  I'm totally
> newbie to arm32, so what I'm reading is this:
>
> https://people.kernel.org/linusw/arm32-page-tables
>
> Especially:
>
> https://protect2.fireeye.com/v1/url?k=35bc90ac-6a27a9bd-35bd1be3-0cc47a31cdbc-b032cb1d178dc691&q=1&e=c82daefb-c86b-4ca1-8db1-cadbdc124ed2&u=https%3A%2F%2Fdflund.se%2F%7Etriad%2Fimages%2Fclassic-mmu-page-table.jpg
>
> It does match with what I read from arm32's proc-v7-2level.S of it, where
> from the comment above cpu_v7_set_pte_ext:
>
>    * - ptep  - pointer to level 2 translation table entry
>    *    (hardware version is stored at +2048 bytes)        <----------
>
> So it seems to me that arm32 needs to store some metadata at offset 0x800
> of any pte_t* pointer passed over to pte_clear(), then it must be a real
> pgtable or it'll write to random places in the kernel, am I correct?
>
> Does it mean that all pte_*() operations upon a kernel stack var will be
> wrong?  I thought it could happen easily in the rest of mm too but I didn't
> yet check much.  The fact shows that it's mostly possible the current code
> just work well with arm32 and no such violation occured yet.
>
> That does sound a bit tricky, IMHO.  But I don't have an immediate solution
> to make it less tricky.. though I have a thought of workaround, by simply
> not calling pte_clear() on the stack var.
>
> Would you try the attached patch to replace this problematic patch? So we
> need to revert commit fa6009949163 and apply the new one.  Please let me
> know whether it'll solve the problem, so far I only compile tested it, but
> I'll run some more test to make sure the uffd-wp scenarios will be working
> right with the new version.

I've reverted fa6009949163 and applied the attached patch on top of 
linux next-20220314. The ARM 32bit issues went away. :)

Feel free to add:

Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>

Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


  reply	other threads:[~2022-04-14  7:51 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-05  1:46 [PATCH v8 00/23] userfaultfd-wp: Support shmem and hugetlbfs Peter Xu
2022-04-05  1:46 ` [PATCH v8 01/23] mm: Introduce PTE_MARKER swap entry Peter Xu
2022-04-12  1:07   ` Alistair Popple
2022-04-12 19:45     ` Peter Xu
2022-04-13  0:30       ` Alistair Popple
2022-04-13 13:44         ` Peter Xu
2022-04-19  8:25   ` Alistair Popple
2022-04-19 19:44     ` Peter Xu
2022-04-05  1:48 ` [PATCH v8 02/23] mm: Teach core mm about pte markers Peter Xu
2022-04-12  1:22   ` Alistair Popple
2022-04-12 19:53     ` Peter Xu
2022-04-05  1:48 ` [PATCH v8 03/23] mm: Check against orig_pte for finish_fault() Peter Xu
2022-04-12  2:05   ` Alistair Popple
2022-04-12 19:54     ` Peter Xu
     [not found]   ` <CGME20220413140330eucas1p167da41e079712b829ef8237dc27b049c@eucas1p1.samsung.com>
2022-04-13 14:03     ` Marek Szyprowski
2022-04-13 16:43       ` Peter Xu
2022-04-14  7:51         ` Marek Szyprowski [this message]
2022-04-14 16:30           ` Peter Xu
2022-04-14 20:57             ` Andrew Morton
2022-04-14 21:08               ` Peter Xu
2022-04-15 14:21   ` Guenter Roeck
2022-04-15 14:41     ` Peter Xu
2022-04-05  1:48 ` [PATCH v8 04/23] mm/uffd: PTE_MARKER_UFFD_WP Peter Xu
2022-04-06  1:41   ` kernel test robot
2022-04-05  1:48 ` [PATCH v8 05/23] mm/shmem: Take care of UFFDIO_COPY_MODE_WP Peter Xu
2022-04-05  1:48 ` [PATCH v8 06/23] mm/shmem: Handle uffd-wp special pte in page fault handler Peter Xu
2022-05-11 16:30   ` David Hildenbrand
2022-05-12 16:34     ` Peter Xu
2022-04-05  1:48 ` [PATCH v8 07/23] mm/shmem: Persist uffd-wp bit across zapping for file-backed Peter Xu
2022-04-05  1:48 ` [PATCH v8 08/23] mm/shmem: Allow uffd wr-protect none pte for file-backed mem Peter Xu
2022-04-05  1:48 ` [PATCH v8 09/23] mm/shmem: Allows file-back mem to be uffd wr-protected on thps Peter Xu
2022-04-05  1:48 ` [PATCH v8 10/23] mm/shmem: Handle uffd-wp during fork() Peter Xu
2022-04-06  6:16   ` kernel test robot
2022-04-06 12:18     ` Peter Xu
2022-04-05  1:48 ` [PATCH v8 11/23] mm/hugetlb: Introduce huge pte version of uffd-wp helpers Peter Xu
2022-04-05  1:49 ` [PATCH v8 12/23] mm/hugetlb: Hook page faults for uffd write protection Peter Xu
2022-04-05  1:49 ` [PATCH v8 13/23] mm/hugetlb: Take care of UFFDIO_COPY_MODE_WP Peter Xu
2022-04-05  1:49 ` [PATCH v8 14/23] mm/hugetlb: Handle UFFDIO_WRITEPROTECT Peter Xu
2022-04-05  1:49 ` [PATCH v8 15/23] mm/hugetlb: Handle pte markers in page faults Peter Xu
2022-04-06 13:37   ` kernel test robot
2022-04-06 15:02     ` Peter Xu
2022-04-05  1:49 ` [PATCH v8 16/23] mm/hugetlb: Allow uffd wr-protect none ptes Peter Xu
2022-04-05  1:49 ` [PATCH v8 17/23] mm/hugetlb: Only drop uffd-wp special pte if required Peter Xu
2022-04-05  1:49 ` [PATCH v8 18/23] mm/hugetlb: Handle uffd-wp during fork() Peter Xu
2022-04-05  1:49 ` [PATCH v8 19/23] mm/khugepaged: Don't recycle vma pgtable if uffd-wp registered Peter Xu
2022-04-05  1:49 ` [PATCH v8 20/23] mm/pagemap: Recognize uffd-wp bit for shmem/hugetlbfs Peter Xu
2022-04-05  1:49 ` [PATCH v8 21/23] mm/uffd: Enable write protection for shmem & hugetlbfs Peter Xu
2022-04-05  1:49 ` [PATCH v8 22/23] mm: Enable PTE markers by default Peter Xu
2022-04-19 15:13   ` Johannes Weiner
2022-04-19 19:59     ` Peter Xu
2022-04-19 20:14       ` Johannes Weiner
2022-04-19 20:28         ` Peter Xu
2022-04-19 21:24           ` Johannes Weiner
2022-04-19 22:01             ` Peter Xu
2022-04-20 13:46               ` Johannes Weiner
2022-04-20 14:25                 ` Peter Xu
2022-04-05  1:49 ` [PATCH v8 23/23] selftests/uffd: Enable uffd-wp for shmem/hugetlbfs Peter Xu
2022-04-05 22:16 ` [PATCH v8 00/23] userfaultfd-wp: Support shmem and hugetlbfs Andrew Morton
2022-04-05 22:42   ` Peter Xu
2022-04-05 22:49     ` Andrew Morton
2022-04-05 23:02       ` Peter Xu
2022-04-05 23:08         ` Andrew Morton
2022-05-10 19:05 ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6ccf5f5f-8dc5-16cc-f06c-78401b822a54@samsung.com \
    --to=m.szyprowski@samsung.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=apopple@nvidia.com \
    --cc=axelrasmussen@google.com \
    --cc=david@redhat.com \
    --cc=hughd@google.com \
    --cc=jglisse@redhat.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=nadav.amit@gmail.com \
    --cc=peterx@redhat.com \
    --cc=rppt@linux.vnet.ibm.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).