All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
	linux-arch@vger.kernel.org, linux-alpha@vger.kernel.org,
	linux-ia64@vger.kernel.org, linux-hexagon@vger.kernel.org,
	linux-m68k@lists.linux-m68k.org, Michal Simek <monstr@monstr.eu>,
	Dinh Nguyen <dinguyen@kernel.org>,
	openrisc@lists.librecores.org, linux-parisc@vger.kernel.org,
	linux-riscv@lists.infradead.org, sparclinux@vger.kernel.org
Subject: Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
Date: Wed, 1 Feb 2023 14:48:22 -0500	[thread overview]
Message-ID: <Y9rCBqwbLlLf1fHe@x1n> (raw)
In-Reply-To: <CAHk-=wjNwwnBckTo8HLSdsd1ndoAR=5RBoZhdOyzhsnDAYWL9g@mail.gmail.com>

On Tue, Jan 31, 2023 at 04:00:22PM -0800, Linus Torvalds wrote:
> So most of the time it's probably not going to matter all that much
> which signal gets sent in practice.

I do also see a common pattern of the possibility to have a generic fault
handler like generic_page_fault().

It probably should start with taking the mmap_sem until providing some
retval that is much easier to digest further by the arch-dependent code, so
it can directly do something rather than parsing the bitmask in a
duplicated way (hence the new retval should hopefully not a bitmask anymore
but a "what to do").

Maybe it can be something like:

/**
 * enum page_fault_retval - Higher level fault retval, generalized from
 * vm_fault_reason above that is only used by hardware page fault handlers.
 * It generalizes the bitmask-versioned retval into something that the arch
 * dependent code should react upon.
 *
 * @PF_RET_COMPLETED:		The page fault is completed successfully
 * @PF_RET_BAD_AREA:		The page fault address falls in a bad area
 *				(e.g., vma not found, expand_stack() fails..)
 * @PF_RET_ACCESS_ERR:		The page fault has access errors
 *				(e.g., write fault on !VM_WRITE vmas)
 * @PF_RET_KERN_FIXUP:		The page fault requires kernel fixups
 *				(e.g., during copy_to_user() but fault failed?)
 * @PF_RET_HWPOISON:		The page fault encountered poisoned pages
 * @PF_RET_SIGNAL:		The page fault encountered poisoned pages
 * ...
 */
enum page_fault_retval {
	PF_RET_DONE = 0,
	PF_RET_BAD_AREA,
	PF_RET_ACCESS_ERR,
	PF_RET_KERN_FIXUP,
        PF_RET_HWPOISON,
        PF_RET_SIGNAL,
	...
};

As a start we may still want to return some more information (perhaps still
the vm_fault_t alongside?  Or another union that will provide different
information based on different PF_RET_*).  One major thing is I see how we
handle VM_FAULT_HWPOISON and also the fact that we encode something more
into the bitmask on page sizes (VM_FAULT_HINDEX_MASK).

So the generic helper could, hopefully, hide the complexity of:

  - Taking and releasing of mmap lock
  - find_vma(), and also relevant checks on access or stack handling
  - handle_mm_fault() itself (of course...)
  - detect signals
  - handle page fault retries (so, in the new layer of retval there should
    have nothing telling it to retry; it should always be the ultimate result)
  - parse different errors into "what the arch code should do", and
    generalize the common ones, e.g.
    - OOM, do pagefault_out_of_memory() for user-mode
    - VM_FAULT_SIGSEGV, which should be able to merge into PF_RET_BAD_AREA?
    - ...

It'll simplify things if we can unify some small details like whether the
-EFAULT above should contain a sigbus.

A trivial detail I found when I was looking at this is, x86_64 passes in
different signals to kernelmode_fixup_or_oops() - in do_user_addr_fault()
there're three call sites and each of them pass over a differerent signal.
IIUC that will only make a difference if there's a nested page fault during
the vsyscall emulation (but I may be wrong too because I'm new to this
code), and I have no idea when it'll happen and whether that needs to be
strictly followed.

Thanks,

-- 
Peter Xu


WARNING: multiple messages have this Message-ID (diff)
From: Peter Xu <peterx@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
	linux-arch@vger.kernel.org, linux-alpha@vger.kernel.org,
	linux-ia64@vger.kernel.org, linux-hexagon@vger.kernel.org,
	linux-m68k@lists.linux-m68k.org, Michal Simek <monstr@monstr.eu>,
	Dinh Nguyen <dinguyen@kernel.org>,
	openrisc@lists.librecores.org, linux-parisc@vger.kernel.org,
	linux-riscv@lists.infradead.org, sparclinux@vger.kernel.org
Subject: Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
Date: Wed, 1 Feb 2023 14:48:22 -0500	[thread overview]
Message-ID: <Y9rCBqwbLlLf1fHe@x1n> (raw)
In-Reply-To: <CAHk-=wjNwwnBckTo8HLSdsd1ndoAR=5RBoZhdOyzhsnDAYWL9g@mail.gmail.com>

On Tue, Jan 31, 2023 at 04:00:22PM -0800, Linus Torvalds wrote:
> So most of the time it's probably not going to matter all that much
> which signal gets sent in practice.

I do also see a common pattern of the possibility to have a generic fault
handler like generic_page_fault().

It probably should start with taking the mmap_sem until providing some
retval that is much easier to digest further by the arch-dependent code, so
it can directly do something rather than parsing the bitmask in a
duplicated way (hence the new retval should hopefully not a bitmask anymore
but a "what to do").

Maybe it can be something like:

/**
 * enum page_fault_retval - Higher level fault retval, generalized from
 * vm_fault_reason above that is only used by hardware page fault handlers.
 * It generalizes the bitmask-versioned retval into something that the arch
 * dependent code should react upon.
 *
 * @PF_RET_COMPLETED:		The page fault is completed successfully
 * @PF_RET_BAD_AREA:		The page fault address falls in a bad area
 *				(e.g., vma not found, expand_stack() fails..)
 * @PF_RET_ACCESS_ERR:		The page fault has access errors
 *				(e.g., write fault on !VM_WRITE vmas)
 * @PF_RET_KERN_FIXUP:		The page fault requires kernel fixups
 *				(e.g., during copy_to_user() but fault failed?)
 * @PF_RET_HWPOISON:		The page fault encountered poisoned pages
 * @PF_RET_SIGNAL:		The page fault encountered poisoned pages
 * ...
 */
enum page_fault_retval {
	PF_RET_DONE = 0,
	PF_RET_BAD_AREA,
	PF_RET_ACCESS_ERR,
	PF_RET_KERN_FIXUP,
        PF_RET_HWPOISON,
        PF_RET_SIGNAL,
	...
};

As a start we may still want to return some more information (perhaps still
the vm_fault_t alongside?  Or another union that will provide different
information based on different PF_RET_*).  One major thing is I see how we
handle VM_FAULT_HWPOISON and also the fact that we encode something more
into the bitmask on page sizes (VM_FAULT_HINDEX_MASK).

So the generic helper could, hopefully, hide the complexity of:

  - Taking and releasing of mmap lock
  - find_vma(), and also relevant checks on access or stack handling
  - handle_mm_fault() itself (of course...)
  - detect signals
  - handle page fault retries (so, in the new layer of retval there should
    have nothing telling it to retry; it should always be the ultimate result)
  - parse different errors into "what the arch code should do", and
    generalize the common ones, e.g.
    - OOM, do pagefault_out_of_memory() for user-mode
    - VM_FAULT_SIGSEGV, which should be able to merge into PF_RET_BAD_AREA?
    - ...

It'll simplify things if we can unify some small details like whether the
-EFAULT above should contain a sigbus.

A trivial detail I found when I was looking at this is, x86_64 passes in
different signals to kernelmode_fixup_or_oops() - in do_user_addr_fault()
there're three call sites and each of them pass over a differerent signal.
IIUC that will only make a difference if there's a nested page fault during
the vsyscall emulation (but I may be wrong too because I'm new to this
code), and I have no idea when it'll happen and whether that needs to be
strictly followed.

Thanks,

-- 
Peter Xu


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

WARNING: multiple messages have this Message-ID (diff)
From: Peter Xu <peterx@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
	linux-arch@vger.kernel.org, linux-alpha@vger.kernel.org,
	linux-ia64@vger.kernel.org, linux-hexagon@vger.kernel.org,
	linux-m68k@lists.linux-m68k.org, Michal Simek <monstr@monstr.eu>,
	Dinh Nguyen <dinguyen@kernel.org>,
	openrisc@lists.librecores.org, linux-parisc@vger.kernel.org,
	linux-riscv@lists.infradead.org, sparclinux@vger.kernel.org
Subject: Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
Date: Wed, 01 Feb 2023 19:48:22 +0000	[thread overview]
Message-ID: <Y9rCBqwbLlLf1fHe@x1n> (raw)
In-Reply-To: <CAHk-=wjNwwnBckTo8HLSdsd1ndoAR=5RBoZhdOyzhsnDAYWL9g@mail.gmail.com>

On Tue, Jan 31, 2023 at 04:00:22PM -0800, Linus Torvalds wrote:
> So most of the time it's probably not going to matter all that much
> which signal gets sent in practice.

I do also see a common pattern of the possibility to have a generic fault
handler like generic_page_fault().

It probably should start with taking the mmap_sem until providing some
retval that is much easier to digest further by the arch-dependent code, so
it can directly do something rather than parsing the bitmask in a
duplicated way (hence the new retval should hopefully not a bitmask anymore
but a "what to do").

Maybe it can be something like:

/**
 * enum page_fault_retval - Higher level fault retval, generalized from
 * vm_fault_reason above that is only used by hardware page fault handlers.
 * It generalizes the bitmask-versioned retval into something that the arch
 * dependent code should react upon.
 *
 * @PF_RET_COMPLETED:		The page fault is completed successfully
 * @PF_RET_BAD_AREA:		The page fault address falls in a bad area
 *				(e.g., vma not found, expand_stack() fails..)
 * @PF_RET_ACCESS_ERR:		The page fault has access errors
 *				(e.g., write fault on !VM_WRITE vmas)
 * @PF_RET_KERN_FIXUP:		The page fault requires kernel fixups
 *				(e.g., during copy_to_user() but fault failed?)
 * @PF_RET_HWPOISON:		The page fault encountered poisoned pages
 * @PF_RET_SIGNAL:		The page fault encountered poisoned pages
 * ...
 */
enum page_fault_retval {
	PF_RET_DONE = 0,
	PF_RET_BAD_AREA,
	PF_RET_ACCESS_ERR,
	PF_RET_KERN_FIXUP,
        PF_RET_HWPOISON,
        PF_RET_SIGNAL,
	...
};

As a start we may still want to return some more information (perhaps still
the vm_fault_t alongside?  Or another union that will provide different
information based on different PF_RET_*).  One major thing is I see how we
handle VM_FAULT_HWPOISON and also the fact that we encode something more
into the bitmask on page sizes (VM_FAULT_HINDEX_MASK).

So the generic helper could, hopefully, hide the complexity of:

  - Taking and releasing of mmap lock
  - find_vma(), and also relevant checks on access or stack handling
  - handle_mm_fault() itself (of course...)
  - detect signals
  - handle page fault retries (so, in the new layer of retval there should
    have nothing telling it to retry; it should always be the ultimate result)
  - parse different errors into "what the arch code should do", and
    generalize the common ones, e.g.
    - OOM, do pagefault_out_of_memory() for user-mode
    - VM_FAULT_SIGSEGV, which should be able to merge into PF_RET_BAD_AREA?
    - ...

It'll simplify things if we can unify some small details like whether the
-EFAULT above should contain a sigbus.

A trivial detail I found when I was looking at this is, x86_64 passes in
different signals to kernelmode_fixup_or_oops() - in do_user_addr_fault()
there're three call sites and each of them pass over a differerent signal.
IIUC that will only make a difference if there's a nested page fault during
the vsyscall emulation (but I may be wrong too because I'm new to this
code), and I have no idea when it'll happen and whether that needs to be
strictly followed.

Thanks,

-- 
Peter Xu

  reply	other threads:[~2023-02-01 19:49 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-31 20:02 [RFC][PATCHSET] VM_FAULT_RETRY fixes Al Viro
2023-01-31 20:02 ` Al Viro
2023-01-31 20:03 ` [PATCH 01/10] alpha: fix livelock in uaccess Al Viro
2023-01-31 20:03   ` Al Viro
2023-03-07  0:48   ` patchwork-bot+linux-riscv
2023-03-07  0:48     ` patchwork-bot+linux-riscv
2023-01-31 20:03 ` [PATCH 02/10] hexagon: " Al Viro
2023-01-31 20:03   ` Al Viro
2023-02-10  2:59   ` Brian Cain
2023-02-10  2:59     ` Brian Cain
2023-01-31 20:04 ` [PATCH 03/10] ia64: " Al Viro
2023-01-31 20:04   ` Al Viro
2023-01-31 20:04 ` [PATCH 04/10] m68k: " Al Viro
2023-01-31 20:04   ` Al Viro
2023-02-05  6:18   ` Finn Thain
2023-02-05  6:18     ` Finn Thain
2023-02-05  6:18     ` Finn Thain
2023-02-05 18:51     ` Linus Torvalds
2023-02-05 18:51       ` Linus Torvalds
2023-02-05 18:51       ` Linus Torvalds
2023-02-07  3:07       ` Finn Thain
2023-02-07  3:07         ` Finn Thain
2023-02-07  3:07         ` Finn Thain
2023-02-05 20:39     ` Al Viro
2023-02-05 20:39       ` Al Viro
2023-02-05 20:39       ` Al Viro
2023-02-05 20:41       ` Linus Torvalds
2023-02-05 20:41         ` Linus Torvalds
2023-02-05 20:41         ` Linus Torvalds
2023-02-06 12:08   ` Geert Uytterhoeven
2023-02-06 12:08     ` Geert Uytterhoeven
2023-02-06 12:08     ` Geert Uytterhoeven
2023-01-31 20:05 ` [PATCH 05/10] microblaze: " Al Viro
2023-01-31 20:05   ` Al Viro
2023-01-31 20:05 ` [PATCH 06/10] nios2: " Al Viro
2023-01-31 20:05   ` Al Viro
2023-01-31 20:06 ` [PATCH 07/10] openrisc: " Al Viro
2023-01-31 20:06   ` Al Viro
2023-01-31 20:06 ` [PATCH 08/10] parisc: " Al Viro
2023-01-31 20:06   ` Al Viro
2023-02-06 16:58   ` Helge Deller
2023-02-06 16:58     ` Helge Deller
2023-02-06 16:58     ` Helge Deller
2023-02-28 17:34     ` Al Viro
2023-02-28 17:34       ` Al Viro
2023-02-28 18:26       ` Helge Deller
2023-02-28 19:14         ` Al Viro
2023-02-28 19:32           ` Helge Deller
2023-02-28 20:00             ` Helge Deller
2023-02-28 20:22               ` Helge Deller
2023-02-28 22:57                 ` Al Viro
2023-03-01  4:00                   ` Helge Deller
2023-03-02 17:53                     ` Al Viro
2023-02-28 15:22   ` Guenter Roeck
2023-02-28 15:22     ` Guenter Roeck
2023-02-28 15:22     ` Guenter Roeck
2023-02-28 19:18     ` Michael Schmitz
2023-02-28 19:18       ` Michael Schmitz
2023-02-28 19:18       ` Michael Schmitz
2023-01-31 20:06 ` [PATCH 09/10] riscv: " Al Viro
2023-01-31 20:06   ` Al Viro
2023-02-06 20:06   ` Björn Töpel
2023-02-06 20:06     ` Björn Töpel
2023-02-06 20:06     ` Björn Töpel
2023-02-07 16:11   ` Geert Uytterhoeven
2023-02-07 16:11     ` Geert Uytterhoeven
2023-02-07 16:11     ` Geert Uytterhoeven
2023-01-31 20:07 ` [PATCH 10/10] sparc: " Al Viro
2023-01-31 20:07   ` Al Viro
2023-01-31 20:24 ` [RFC][PATCHSET] VM_FAULT_RETRY fixes Linus Torvalds
2023-01-31 20:24   ` Linus Torvalds
2023-01-31 20:24   ` Linus Torvalds
2023-01-31 21:10   ` Al Viro
2023-01-31 21:10     ` Al Viro
2023-01-31 21:19     ` Linus Torvalds
2023-01-31 21:19       ` Linus Torvalds
2023-01-31 21:19       ` Linus Torvalds
2023-01-31 21:49       ` Al Viro
2023-01-31 21:49         ` Al Viro
2023-02-01  0:00         ` Linus Torvalds
2023-02-01  0:00           ` Linus Torvalds
2023-02-01  0:00           ` Linus Torvalds
2023-02-01 19:48           ` Peter Xu [this message]
2023-02-01 19:48             ` Peter Xu
2023-02-01 19:48             ` Peter Xu
2023-02-01 22:18             ` Al Viro
2023-02-01 22:18               ` Al Viro
2023-02-01 22:18               ` Al Viro
2023-02-02  0:57               ` Al Viro
2023-02-02  0:57                 ` Al Viro
2023-02-02  0:57                 ` Al Viro
2023-02-02 22:56               ` Peter Xu
2023-02-02 22:56                 ` Peter Xu
2023-02-02 22:56                 ` Peter Xu
2023-02-04  0:26                 ` Al Viro
2023-02-04  0:26                   ` Al Viro
2023-02-04  0:26                   ` Al Viro
2023-02-05  5:10                   ` Al Viro
2023-02-05  5:10                     ` Al Viro
2023-02-05  5:10                     ` Al Viro
2023-02-04  0:47         ` [loongarch oddities] " Al Viro
2023-02-01  8:21       ` Helge Deller
2023-02-01  8:21         ` Helge Deller
2023-02-01  8:21         ` Helge Deller
2023-02-01 19:51         ` Linus Torvalds
2023-02-01 19:51           ` Linus Torvalds
2023-02-01 19:51           ` Linus Torvalds
2023-02-02  6:58       ` Al Viro
2023-02-02  6:58         ` Al Viro
2023-02-02  8:54         ` Michael Cree
2023-02-02  9:56           ` John Paul Adrian Glaubitz
2023-02-02 15:20           ` Al Viro
2023-02-02 20:20             ` Al Viro
2023-02-02 20:34         ` Linus Torvalds
2023-02-01 10:50 ` Mark Rutland
2023-02-01 10:50   ` Mark Rutland
2023-02-01 10:50   ` Mark Rutland
2023-02-06 12:08   ` Geert Uytterhoeven
2023-02-06 12:08     ` Geert Uytterhoeven
2023-02-06 12:08     ` Geert Uytterhoeven

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y9rCBqwbLlLf1fHe@x1n \
    --to=peterx@redhat.com \
    --cc=dinguyen@kernel.org \
    --cc=linux-alpha@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-hexagon@vger.kernel.org \
    --cc=linux-ia64@vger.kernel.org \
    --cc=linux-m68k@lists.linux-m68k.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=monstr@monstr.eu \
    --cc=openrisc@lists.librecores.org \
    --cc=sparclinux@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.