From: Andrea Arcangeli <aarcange@redhat.com> To: Linus Torvalds <torvalds@linux-foundation.org> Cc: Vlastimil Babka <vbabka@suse.cz>, Dmitry Vyukov <dvyukov@google.com>, syzbot <bot+6a5269ce759a7bb12754ed9622076dc93f65a1f6@syzkaller.appspotmail.com>, Jan Beulich <JBeulich@suse.com>, "H. Peter Anvin" <hpa@zytor.com>, Josh Poimboeuf <jpoimboe@redhat.com>, "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>, Laurent Dufour <ldufour@linux.vnet.ibm.com>, LKML <linux-kernel@vger.kernel.org>, Andy Lutomirski <luto@kernel.org>, Ingo Molnar <mingo@redhat.com>, syzkaller-bugs@googlegroups.com, Thomas Gleixner <tglx@linutronix.de>, the arch/x86 maintainers <x86@kernel.org>, Andrew Morton <akpm@linux-foundation.org>, Michal Hocko <mhocko@suse.com>, Hugh Dickins <hughd@google.com>, David Rientjes <rientjes@google.com>, linux-mm <linux-mm@kvack.org>, Thorsten Leemhuis <regressions@leemhuis.info> Subject: Re: KASAN: use-after-free Read in __do_page_fault Date: Tue, 31 Oct 2017 20:13:36 +0100 [thread overview] Message-ID: <20171031191336.GA2799@redhat.com> (raw) In-Reply-To: <CA+55aFwWJyArZMEuo1-4+VaiP95e__cRHkVvrfiQ+NUVJ15DNQ@mail.gmail.com> On Tue, Oct 31, 2017 at 08:37:47AM -0700, Linus Torvalds wrote: > Yes. Accessing "vma" after calling "handle_mm_fault()" is a bug. An > unfortunate issue with userfaultfd. > > The suggested fix to simply look up pkey beforehand seems sane and simple. Agreed. > > But sadly, from a quick check, it looks like arch/um/ has the same > bug, but even worse. It will do > > (a) handle_mm_fault() in a loop without re-calculating vma. Don't ask me why. > > (b) flush_tlb_page(vma, address); afterwards Yes, that flush_tlb_page is unsafe. Luckily it's only using it for vma->vm_mm so it doesn't sound major issue to fix it. > > but much more importantly, I think __get_user_pages() is broken in two ways: > > - faultin_page() does: > > ret = handle_mm_fault(vma, address, fault_flags); > ... > if ((ret & VM_FAULT_WRITE) && !(vma->vm_flags & VM_WRITE)) > > (easily fixed the same way) > > - more annoyingly and harder to fix: the retry case in > __get_user_pages(), and the VMA saving there. > > Ho humm. > > Andrea, looking at that get_user_pages() case, I really think it's > userfaultfd that is broken. > > Could we perhaps limit userfaultfd to _only_ do the VM_FAULT_RETRY, > and simply fail for non-retry faults? In the get_user_pages case we already limit it to do only VM_FAULT_RETRY so no use after free should materialize whenever gup is involved. The problematic path for the return to userland (get_user_pages returns to kernel) is this one: if (return_to_userland) { if (signal_pending(current) && !fatal_signal_pending(current)) { /* * If we got a SIGSTOP or SIGCONT and this is * a normal userland page fault, just let * userland return so the signal will be * handled and gdb debugging works. The page * fault code immediately after we return from * this function is going to release the * mmap_sem and it's not depending on it * (unlike gup would if we were not to return * VM_FAULT_RETRY). * * If a fatal signal is pending we still take * the streamlined VM_FAULT_RETRY failure path * and there's no need to retake the mmap_sem * in such case. */ down_read(&mm->mmap_sem); ret = VM_FAULT_NOPAGE; } } We could remove the above branch all together and then handle_userfault() would always return VM_FAULT_RETRY whenever it decides to release the mmap_sem. The above makes debugging with gdb more user friendly and it potentially lowers the latency of signals as signals can unblock handle_userfault. The downside is that the return to userland cannot dereference the vma after calling handle_mm_fault. Thanks, Andrea
WARNING: multiple messages have this Message-ID (diff)
From: Andrea Arcangeli <aarcange@redhat.com> To: Linus Torvalds <torvalds@linux-foundation.org> Cc: Vlastimil Babka <vbabka@suse.cz>, Dmitry Vyukov <dvyukov@google.com>, syzbot <bot+6a5269ce759a7bb12754ed9622076dc93f65a1f6@syzkaller.appspotmail.com>, Jan Beulich <JBeulich@suse.com>, "H. Peter Anvin" <hpa@zytor.com>, Josh Poimboeuf <jpoimboe@redhat.com>, "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>, Laurent Dufour <ldufour@linux.vnet.ibm.com>, LKML <linux-kernel@vger.kernel.org>, Andy Lutomirski <luto@kernel.org>, Ingo Molnar <mingo@redhat.com>, syzkaller-bugs@googlegroups.com, Thomas Gleixner <tglx@linutronix.de>, the arch/x86 maintainers <x86@kernel.org>, Andrew Morton <akpm@linux-foundation.org>, Michal Hocko <mhocko@suse.com>, Hugh Dickins <hughd@google.com>, David Rientjes <rientjes@google.com>, linux-mm <linux-mm@kvack.org>, Thorsten Leemhuis <regressions@leemhuis.info> Subject: Re: KASAN: use-after-free Read in __do_page_fault Date: Tue, 31 Oct 2017 20:13:36 +0100 [thread overview] Message-ID: <20171031191336.GA2799@redhat.com> (raw) In-Reply-To: <CA+55aFwWJyArZMEuo1-4+VaiP95e__cRHkVvrfiQ+NUVJ15DNQ@mail.gmail.com> On Tue, Oct 31, 2017 at 08:37:47AM -0700, Linus Torvalds wrote: > Yes. Accessing "vma" after calling "handle_mm_fault()" is a bug. An > unfortunate issue with userfaultfd. > > The suggested fix to simply look up pkey beforehand seems sane and simple. Agreed. > > But sadly, from a quick check, it looks like arch/um/ has the same > bug, but even worse. It will do > > (a) handle_mm_fault() in a loop without re-calculating vma. Don't ask me why. > > (b) flush_tlb_page(vma, address); afterwards Yes, that flush_tlb_page is unsafe. Luckily it's only using it for vma->vm_mm so it doesn't sound major issue to fix it. > > but much more importantly, I think __get_user_pages() is broken in two ways: > > - faultin_page() does: > > ret = handle_mm_fault(vma, address, fault_flags); > ... > if ((ret & VM_FAULT_WRITE) && !(vma->vm_flags & VM_WRITE)) > > (easily fixed the same way) > > - more annoyingly and harder to fix: the retry case in > __get_user_pages(), and the VMA saving there. > > Ho humm. > > Andrea, looking at that get_user_pages() case, I really think it's > userfaultfd that is broken. > > Could we perhaps limit userfaultfd to _only_ do the VM_FAULT_RETRY, > and simply fail for non-retry faults? In the get_user_pages case we already limit it to do only VM_FAULT_RETRY so no use after free should materialize whenever gup is involved. The problematic path for the return to userland (get_user_pages returns to kernel) is this one: if (return_to_userland) { if (signal_pending(current) && !fatal_signal_pending(current)) { /* * If we got a SIGSTOP or SIGCONT and this is * a normal userland page fault, just let * userland return so the signal will be * handled and gdb debugging works. The page * fault code immediately after we return from * this function is going to release the * mmap_sem and it's not depending on it * (unlike gup would if we were not to return * VM_FAULT_RETRY). * * If a fatal signal is pending we still take * the streamlined VM_FAULT_RETRY failure path * and there's no need to retake the mmap_sem * in such case. */ down_read(&mm->mmap_sem); ret = VM_FAULT_NOPAGE; } } We could remove the above branch all together and then handle_userfault() would always return VM_FAULT_RETRY whenever it decides to release the mmap_sem. The above makes debugging with gdb more user friendly and it potentially lowers the latency of signals as signals can unblock handle_userfault. The downside is that the return to userland cannot dereference the vma after calling handle_mm_fault. Thanks, Andrea -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2017-10-31 19:13 UTC|newest] Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top 2017-10-30 19:12 KASAN: use-after-free Read in __do_page_fault syzbot 2017-10-30 19:15 ` Dmitry Vyukov 2017-10-30 19:15 ` Dmitry Vyukov 2017-10-31 12:00 ` Vlastimil Babka 2017-10-31 12:00 ` Vlastimil Babka 2017-10-31 12:42 ` Dmitry Vyukov 2017-10-31 12:42 ` Dmitry Vyukov 2017-10-31 13:20 ` Vlastimil Babka 2017-10-31 13:20 ` Vlastimil Babka 2017-10-31 13:57 ` Vlastimil Babka 2017-10-31 13:57 ` Vlastimil Babka 2017-10-31 14:11 ` Kirill A. Shutemov 2017-10-31 14:11 ` Kirill A. Shutemov 2017-10-31 14:28 ` Vlastimil Babka 2017-10-31 14:28 ` Vlastimil Babka 2017-10-31 19:15 ` Andrea Arcangeli 2017-10-31 19:15 ` Andrea Arcangeli 2017-11-01 7:42 ` Vlastimil Babka 2017-11-01 7:42 ` Vlastimil Babka 2017-11-01 10:17 ` Andrea Arcangeli 2017-11-01 10:17 ` Andrea Arcangeli 2017-11-01 12:14 ` Vlastimil Babka 2017-11-01 12:14 ` Vlastimil Babka 2017-10-31 15:37 ` Linus Torvalds 2017-10-31 15:37 ` Linus Torvalds 2017-10-31 19:13 ` Andrea Arcangeli [this message] 2017-10-31 19:13 ` Andrea Arcangeli 2017-11-01 15:26 ` Linus Torvalds 2017-11-01 15:26 ` Linus Torvalds 2017-11-02 19:36 ` Andrea Arcangeli 2017-11-02 19:36 ` Andrea Arcangeli 2017-11-02 10:00 ` Laurent Dufour 2017-11-02 10:00 ` Laurent Dufour
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20171031191336.GA2799@redhat.com \ --to=aarcange@redhat.com \ --cc=JBeulich@suse.com \ --cc=akpm@linux-foundation.org \ --cc=bot+6a5269ce759a7bb12754ed9622076dc93f65a1f6@syzkaller.appspotmail.com \ --cc=dvyukov@google.com \ --cc=hpa@zytor.com \ --cc=hughd@google.com \ --cc=jpoimboe@redhat.com \ --cc=kirill.shutemov@linux.intel.com \ --cc=ldufour@linux.vnet.ibm.com \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=luto@kernel.org \ --cc=mhocko@suse.com \ --cc=mingo@redhat.com \ --cc=regressions@leemhuis.info \ --cc=rientjes@google.com \ --cc=syzkaller-bugs@googlegroups.com \ --cc=tglx@linutronix.de \ --cc=torvalds@linux-foundation.org \ --cc=vbabka@suse.cz \ --cc=x86@kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.