All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Arcangeli <aarcange@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Dmitry Vyukov <dvyukov@google.com>,
	syzbot 
	<bot+6a5269ce759a7bb12754ed9622076dc93f65a1f6@syzkaller.appspotmail.com>,
	Jan Beulich <JBeulich@suse.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Laurent Dufour <ldufour@linux.vnet.ibm.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Andy Lutomirski <luto@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	syzkaller-bugs@googlegroups.com,
	Thomas Gleixner <tglx@linutronix.de>,
	the arch/x86 maintainers <x86@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.com>, Hugh Dickins <hughd@google.com>,
	David Rientjes <rientjes@google.com>,
	linux-mm <linux-mm@kvack.org>,
	Thorsten Leemhuis <regressions@leemhuis.info>
Subject: Re: KASAN: use-after-free Read in __do_page_fault
Date: Tue, 31 Oct 2017 20:13:36 +0100	[thread overview]
Message-ID: <20171031191336.GA2799@redhat.com> (raw)
In-Reply-To: <CA+55aFwWJyArZMEuo1-4+VaiP95e__cRHkVvrfiQ+NUVJ15DNQ@mail.gmail.com>

On Tue, Oct 31, 2017 at 08:37:47AM -0700, Linus Torvalds wrote:
> Yes. Accessing "vma" after calling "handle_mm_fault()" is a bug. An
> unfortunate issue with userfaultfd.
> 
> The suggested fix to simply look up pkey beforehand seems sane and simple.

Agreed.

> 
> But sadly, from a quick check, it looks like arch/um/ has the same
> bug, but even worse. It will do
> 
>  (a) handle_mm_fault() in a loop without re-calculating vma. Don't ask me why.
> 
>  (b) flush_tlb_page(vma, address); afterwards

Yes, that flush_tlb_page is unsafe. Luckily it's only using it for
vma->vm_mm so it doesn't sound major issue to fix it.

> 
> but much more importantly, I think __get_user_pages() is broken in two ways:
> 
>  - faultin_page() does:
> 
>         ret = handle_mm_fault(vma, address, fault_flags);
>         ...
>         if ((ret & VM_FAULT_WRITE) && !(vma->vm_flags & VM_WRITE))
> 
>    (easily fixed the same way)
> 
>  - more annoyingly and harder to fix: the retry case in
> __get_user_pages(), and the VMA saving there.
> 
> Ho humm.
> 
> Andrea, looking at that get_user_pages() case, I really think it's
> userfaultfd that is broken.
> 
> Could we perhaps limit userfaultfd to _only_ do the VM_FAULT_RETRY,
> and simply fail for non-retry faults?

In the get_user_pages case we already limit it to do only
VM_FAULT_RETRY so no use after free should materialize whenever gup is
involved.

The problematic path for the return to userland (get_user_pages
returns to kernel) is this one:

	if (return_to_userland) {
		if (signal_pending(current) &&
		    !fatal_signal_pending(current)) {
			/*
			 * If we got a SIGSTOP or SIGCONT and this is
			 * a normal userland page fault, just let
			 * userland return so the signal will be
			 * handled and gdb debugging works.  The page
			 * fault code immediately after we return from
			 * this function is going to release the
			 * mmap_sem and it's not depending on it
			 * (unlike gup would if we were not to return
			 * VM_FAULT_RETRY).
			 *
			 * If a fatal signal is pending we still take
			 * the streamlined VM_FAULT_RETRY failure path
			 * and there's no need to retake the mmap_sem
			 * in such case.
			 */
			down_read(&mm->mmap_sem);
			ret = VM_FAULT_NOPAGE;
		}
	}

We could remove the above branch all together and then
handle_userfault() would always return VM_FAULT_RETRY whenever it
decides to release the mmap_sem. The above makes debugging with gdb
more user friendly and it potentially lowers the latency of signals as
signals can unblock handle_userfault. The downside is that the return
to userland cannot dereference the vma after calling handle_mm_fault.

Thanks,
Andrea

WARNING: multiple messages have this Message-ID (diff)
From: Andrea Arcangeli <aarcange@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Dmitry Vyukov <dvyukov@google.com>,
	syzbot
	<bot+6a5269ce759a7bb12754ed9622076dc93f65a1f6@syzkaller.appspotmail.com>,
	Jan Beulich <JBeulich@suse.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Laurent Dufour <ldufour@linux.vnet.ibm.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Andy Lutomirski <luto@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	syzkaller-bugs@googlegroups.com,
	Thomas Gleixner <tglx@linutronix.de>,
	the arch/x86 maintainers <x86@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.com>, Hugh Dickins <hughd@google.com>,
	David Rientjes <rientjes@google.com>,
	linux-mm <linux-mm@kvack.org>,
	Thorsten Leemhuis <regressions@leemhuis.info>
Subject: Re: KASAN: use-after-free Read in __do_page_fault
Date: Tue, 31 Oct 2017 20:13:36 +0100	[thread overview]
Message-ID: <20171031191336.GA2799@redhat.com> (raw)
In-Reply-To: <CA+55aFwWJyArZMEuo1-4+VaiP95e__cRHkVvrfiQ+NUVJ15DNQ@mail.gmail.com>

On Tue, Oct 31, 2017 at 08:37:47AM -0700, Linus Torvalds wrote:
> Yes. Accessing "vma" after calling "handle_mm_fault()" is a bug. An
> unfortunate issue with userfaultfd.
> 
> The suggested fix to simply look up pkey beforehand seems sane and simple.

Agreed.

> 
> But sadly, from a quick check, it looks like arch/um/ has the same
> bug, but even worse. It will do
> 
>  (a) handle_mm_fault() in a loop without re-calculating vma. Don't ask me why.
> 
>  (b) flush_tlb_page(vma, address); afterwards

Yes, that flush_tlb_page is unsafe. Luckily it's only using it for
vma->vm_mm so it doesn't sound major issue to fix it.

> 
> but much more importantly, I think __get_user_pages() is broken in two ways:
> 
>  - faultin_page() does:
> 
>         ret = handle_mm_fault(vma, address, fault_flags);
>         ...
>         if ((ret & VM_FAULT_WRITE) && !(vma->vm_flags & VM_WRITE))
> 
>    (easily fixed the same way)
> 
>  - more annoyingly and harder to fix: the retry case in
> __get_user_pages(), and the VMA saving there.
> 
> Ho humm.
> 
> Andrea, looking at that get_user_pages() case, I really think it's
> userfaultfd that is broken.
> 
> Could we perhaps limit userfaultfd to _only_ do the VM_FAULT_RETRY,
> and simply fail for non-retry faults?

In the get_user_pages case we already limit it to do only
VM_FAULT_RETRY so no use after free should materialize whenever gup is
involved.

The problematic path for the return to userland (get_user_pages
returns to kernel) is this one:

	if (return_to_userland) {
		if (signal_pending(current) &&
		    !fatal_signal_pending(current)) {
			/*
			 * If we got a SIGSTOP or SIGCONT and this is
			 * a normal userland page fault, just let
			 * userland return so the signal will be
			 * handled and gdb debugging works.  The page
			 * fault code immediately after we return from
			 * this function is going to release the
			 * mmap_sem and it's not depending on it
			 * (unlike gup would if we were not to return
			 * VM_FAULT_RETRY).
			 *
			 * If a fatal signal is pending we still take
			 * the streamlined VM_FAULT_RETRY failure path
			 * and there's no need to retake the mmap_sem
			 * in such case.
			 */
			down_read(&mm->mmap_sem);
			ret = VM_FAULT_NOPAGE;
		}
	}

We could remove the above branch all together and then
handle_userfault() would always return VM_FAULT_RETRY whenever it
decides to release the mmap_sem. The above makes debugging with gdb
more user friendly and it potentially lowers the latency of signals as
signals can unblock handle_userfault. The downside is that the return
to userland cannot dereference the vma after calling handle_mm_fault.

Thanks,
Andrea

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-10-31 19:13 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-30 19:12 KASAN: use-after-free Read in __do_page_fault syzbot
2017-10-30 19:15 ` Dmitry Vyukov
2017-10-30 19:15   ` Dmitry Vyukov
2017-10-31 12:00   ` Vlastimil Babka
2017-10-31 12:00     ` Vlastimil Babka
2017-10-31 12:42     ` Dmitry Vyukov
2017-10-31 12:42       ` Dmitry Vyukov
2017-10-31 13:20       ` Vlastimil Babka
2017-10-31 13:20         ` Vlastimil Babka
2017-10-31 13:57         ` Vlastimil Babka
2017-10-31 13:57           ` Vlastimil Babka
2017-10-31 14:11           ` Kirill A. Shutemov
2017-10-31 14:11             ` Kirill A. Shutemov
2017-10-31 14:28             ` Vlastimil Babka
2017-10-31 14:28               ` Vlastimil Babka
2017-10-31 19:15               ` Andrea Arcangeli
2017-10-31 19:15                 ` Andrea Arcangeli
2017-11-01  7:42                 ` Vlastimil Babka
2017-11-01  7:42                   ` Vlastimil Babka
2017-11-01 10:17                   ` Andrea Arcangeli
2017-11-01 10:17                     ` Andrea Arcangeli
2017-11-01 12:14                     ` Vlastimil Babka
2017-11-01 12:14                       ` Vlastimil Babka
2017-10-31 15:37           ` Linus Torvalds
2017-10-31 15:37             ` Linus Torvalds
2017-10-31 19:13             ` Andrea Arcangeli [this message]
2017-10-31 19:13               ` Andrea Arcangeli
2017-11-01 15:26               ` Linus Torvalds
2017-11-01 15:26                 ` Linus Torvalds
2017-11-02 19:36                 ` Andrea Arcangeli
2017-11-02 19:36                   ` Andrea Arcangeli
2017-11-02 10:00           ` Laurent Dufour
2017-11-02 10:00             ` Laurent Dufour

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171031191336.GA2799@redhat.com \
    --to=aarcange@redhat.com \
    --cc=JBeulich@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=bot+6a5269ce759a7bb12754ed9622076dc93f65a1f6@syzkaller.appspotmail.com \
    --cc=dvyukov@google.com \
    --cc=hpa@zytor.com \
    --cc=hughd@google.com \
    --cc=jpoimboe@redhat.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=ldufour@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mhocko@suse.com \
    --cc=mingo@redhat.com \
    --cc=regressions@leemhuis.info \
    --cc=rientjes@google.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.