linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Android Kernel Team <kernel-team@android.com>,
	Jan Kara <jack@suse.cz>, Minchan Kim <minchan@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Hugh Dickins <hughd@google.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	Linux-MM <linux-mm@kvack.org>,
	Vinayak Menon <vinmenon@codeaurora.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Will Deacon <will@kernel.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting
Date: Mon, 28 Dec 2020 10:47:36 -0800	[thread overview]
Message-ID: <CAHk-=wg4bzJ9ugrOp7DBoMjNpHechm4QWb0-HC3A_pN564RU5A@mail.gmail.com> (raw)
In-Reply-To: <20201228125352.phnj2x2ci3kwfld5@box>

[-- Attachment #1: Type: text/plain, Size: 1983 bytes --]

On Mon, Dec 28, 2020 at 4:53 AM Kirill A. Shutemov <kirill@shutemov.name> wrote:
>
> So far I only found one more pin leak and always-true check. I don't see
> how can it lead to crash or corruption. Keep looking.

Well, I noticed that the nommu.c version of filemap_map_pages() needs
fixing, but that's obviously not the case Hugh sees.

No,m I think the problem is the

        pte_unmap_unlock(vmf->pte, vmf->ptl);

at the end of filemap_map_pages().

Why?

Because we've been updating vmf->pte as we go along:

                vmf->pte += xas.xa_index - last_pgoff;

and I think that by the time we get to that "pte_unmap_unlock()",
vmf->pte potentially points to past the edge of the page directory.

I think that is the bug that Hugh sees - simply because we get
entirely confused about the page table locking. And it would match the
latest change, which was all about moving that unlock from the caller
to filemap_map_pages(), and now it's missing the pte fixup..

I personally think it's wrong to update vmf->pte at all. We should
just have a local 'ptep' pointer that we update as we walk along. But
that requires another change to the calling convention, namely to
"do_set_pte()".

Also, considering how complicated this patch is getting, I think it
might be good to try to split it up a bit.

In particular, I think the calling convention change for
"filemap_map_pages()" could be done first and independently. And then
as a second step, move the VM_FAULT_NOPAGE and "pte_unmap_lock()" from
the callers to filemap_map_pages().

And then only as the final step, do that nice re-organization with
first_map_page/next_map_page() and moving the locking from
alloc_set_pte() into filemap_map_pages()..

How does that sound?

Anyway, Hugh, if it's about overshooting the pte pointer, maybe this
absolutely horrendous and disgusting patch fixes it without the above
kinds of more extensive cleanups. Worth testing, perhaps, even if it's
too ugly for words?

               Linus

[-- Attachment #2: patch --]
[-- Type: application/octet-stream, Size: 1360 bytes --]

 mm/filemap.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/mm/filemap.c b/mm/filemap.c
index 42d7a58e3a14..74384f9f1776 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3022,6 +3022,8 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, unsigned long address,
 	struct page *head, *page;
 	unsigned int mmap_miss = READ_ONCE(file->f_ra.mmap_miss);
 	vm_fault_t ret = 0;
+	pte_t *orig_pte;
+	unsigned long orig_address;
 
 	rcu_read_lock();
 	head = first_map_page(vmf, &xas, end_pgoff);
@@ -3037,6 +3039,14 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, unsigned long address,
 
 	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
 				       vmf->address, &vmf->ptl);
+
+	/*
+	 * Disgusting - we should not update vmf->pte and ->address,
+	 * but do_set_pte() needs it
+	 */
+	orig_pte = vmf->pte;
+	orig_address = vmf->address;
+
 	do {
 		page = find_subpage(head, xas.xa_index);
 		if (PageHWPoison(page))
@@ -3066,6 +3076,10 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, unsigned long address,
 		unlock_page(head);
 		put_page(head);
 	} while ((head = next_map_page(vmf, &xas, end_pgoff)) != NULL);
+
+	/* Hackety hack - reset the vmf state back */
+	vmf->pte = orig_pte;
+	vmf->address = orig_address;
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
 	rcu_read_unlock();
 	WRITE_ONCE(file->f_ra.mmap_miss, mmap_miss);

[-- Attachment #3: Type: text/plain, Size: 176 bytes --]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2020-12-28 18:49 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-09 16:39 [PATCH 0/2] Create 'old' ptes for faultaround mappings on arm64 with hardware access flag Will Deacon
2020-12-09 16:39 ` [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting Will Deacon
2020-12-09 17:58   ` Linus Torvalds
2020-12-09 18:40     ` Will Deacon
2020-12-09 19:04       ` Linus Torvalds
2020-12-09 20:32         ` Matthew Wilcox
2020-12-09 21:04           ` Linus Torvalds
2020-12-10 15:08         ` Kirill A. Shutemov
2020-12-10 17:23           ` Linus Torvalds
2020-12-14 16:07             ` Kirill A. Shutemov
2020-12-14 17:54               ` Linus Torvalds
2020-12-14 18:56                 ` Matthew Wilcox
2020-12-16 17:07                 ` Kirill A. Shutemov
2020-12-16 18:41                   ` Linus Torvalds
2020-12-17 10:54                     ` Kirill A. Shutemov
2020-12-17 18:22                       ` Linus Torvalds
2020-12-18 11:04                         ` Kirill A. Shutemov
2020-12-18 18:56                           ` Linus Torvalds
2020-12-19 12:41                             ` Kirill A. Shutemov
2020-12-19 20:08                               ` Linus Torvalds
2020-12-19 20:34                               ` Linus Torvalds
2020-12-22 10:00                                 ` Kirill A. Shutemov
2020-12-24  4:04                                   ` Hugh Dickins
2020-12-25 11:31                                     ` Kirill A. Shutemov
2020-12-26 17:57                                       ` Linus Torvalds
2020-12-26 20:43                                         ` Kirill A. Shutemov
2020-12-26 21:03                                           ` Hugh Dickins
2020-12-26 21:16                                             ` Linus Torvalds
2020-12-26 22:40                                               ` Kirill A. Shutemov
2020-12-27  0:45                                                 ` Hugh Dickins
2020-12-27  2:38                                                   ` Hugh Dickins
2020-12-27 19:38                                                     ` Linus Torvalds
2020-12-27 20:32                                                       ` Damian Tometzki
2020-12-27 22:35                                                         ` Hugh Dickins
2020-12-27 23:12                                                           ` Linus Torvalds
2020-12-27 23:40                                                             ` Linus Torvalds
2020-12-27 23:55                                                               ` Kirill A. Shutemov
2020-12-27 23:48                                                       ` Kirill A. Shutemov
2020-12-28  1:54                                                         ` Linus Torvalds
2020-12-28  6:43                                                           ` Hugh Dickins
2020-12-28 12:53                                                             ` Kirill A. Shutemov
2020-12-28 18:47                                                               ` Linus Torvalds [this message]
2020-12-28 21:58                                                                 ` Linus Torvalds
2020-12-29 13:28                                                                   ` Kirill A. Shutemov
2020-12-29 15:19                                                                     ` Matthew Wilcox
2020-12-29 20:52                                                                     ` Linus Torvalds
2020-12-28 22:05                                                                 ` Kirill A. Shutemov
2020-12-28 22:12                                                                   ` Kirill A. Shutemov
2020-12-29  4:35                                                                     ` Hugh Dickins
2020-12-28 23:28                                                                   ` Linus Torvalds
2020-12-26 21:07                                           ` Linus Torvalds
2020-12-26 21:41                                           ` Matthew Wilcox
2020-12-09 16:39 ` [PATCH 2/2] arm64: mm: Implement arch_wants_old_faultaround_pte() Will Deacon
2020-12-09 18:35   ` Catalin Marinas
2020-12-09 18:46     ` Will Deacon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wg4bzJ9ugrOp7DBoMjNpHechm4QWb0-HC3A_pN564RU5A@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=catalin.marinas@arm.com \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=kernel-team@android.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan@kernel.org \
    --cc=vinmenon@codeaurora.org \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).