Linux-m68k Archive on lore.kernel.org
 help / color / Atom feed
* Possible duplicate page fault accounting on some archs after commit 4064b9827063
@ 2020-06-10 15:48 Gerald Schaefer
  2020-06-10 16:50 ` Peter Xu
  2020-06-10 16:53 ` Linus Torvalds
  0 siblings, 2 replies; 4+ messages in thread
From: Gerald Schaefer @ 2020-06-10 15:48 UTC (permalink / raw)
  To: Peter Xu, linux-arch, linux-kernel
  Cc: linux-alpha, linux-arm-kernel, linux-hexagon, linux-ia64,
	linux-m68k, Michal Simek, linux-mips, Nick Hu, Ley Foon Tan,
	openrisc, linux-parisc, linux-riscv, linux-sh, sparclinux,
	linux-um, Guan Xuetao, linux-xtensa, Heiko Carstens,
	Linus Torvalds, Andrea Arcangeli

Hi,

Some architectures have their page fault accounting code inside the fault
retry loop, and rely on only going through that code once. Before commit
4064b9827063 ("mm: allow VM_FAULT_RETRY for multiple times"), that was
ensured by testing for and clearing FAULT_FLAG_ALLOW_RETRY.

That commit had to remove the clearing of FAULT_FLAG_ALLOW_RETRY for all
architectures, and introduced a subtle change to page fault accounting
logic in the affected archs. It is now possible to go through the retry
loop multiple times, and the affected archs would then account multiple
page faults instead of just one.

This was found by coincidence in s390 code, and a quick check showed that
there are quite a lot of other architectures that seem to be affected in a
similar way. I'm preparing a fix for s390, by moving the accounting behind
the retry loop, similar to x86. It is not completely straight-forward, so
I leave the fix for other archs to the respective maintainers.

Added the lists for possibly affected archs on cc, but no guarantee for
completeness.

Regards,
Gerald

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Possible duplicate page fault accounting on some archs after commit 4064b9827063
  2020-06-10 15:48 Possible duplicate page fault accounting on some archs after commit 4064b9827063 Gerald Schaefer
@ 2020-06-10 16:50 ` Peter Xu
  2020-06-15 21:34   ` Peter Xu
  2020-06-10 16:53 ` Linus Torvalds
  1 sibling, 1 reply; 4+ messages in thread
From: Peter Xu @ 2020-06-10 16:50 UTC (permalink / raw)
  To: Gerald Schaefer
  Cc: linux-arch, linux-kernel, linux-alpha, linux-arm-kernel,
	linux-hexagon, linux-ia64, linux-m68k, Michal Simek, linux-mips,
	Nick Hu, Ley Foon Tan, openrisc, linux-parisc, linux-riscv,
	linux-sh, sparclinux, linux-um, Guan Xuetao, linux-xtensa,
	Heiko Carstens, Linus Torvalds, Andrea Arcangeli

On Wed, Jun 10, 2020 at 05:48:11PM +0200, Gerald Schaefer wrote:
> Hi,

Hi, Gerald,

> 
> Some architectures have their page fault accounting code inside the fault
> retry loop, and rely on only going through that code once. Before commit
> 4064b9827063 ("mm: allow VM_FAULT_RETRY for multiple times"), that was
> ensured by testing for and clearing FAULT_FLAG_ALLOW_RETRY.
> 
> That commit had to remove the clearing of FAULT_FLAG_ALLOW_RETRY for all
> architectures, and introduced a subtle change to page fault accounting
> logic in the affected archs. It is now possible to go through the retry
> loop multiple times, and the affected archs would then account multiple
> page faults instead of just one.
> 
> This was found by coincidence in s390 code, and a quick check showed that
> there are quite a lot of other architectures that seem to be affected in a
> similar way. I'm preparing a fix for s390, by moving the accounting behind
> the retry loop, similar to x86. It is not completely straight-forward, so
> I leave the fix for other archs to the respective maintainers.

Sorry for not noticing this before.  The accounting part should definitely be
put at least into a check against fault_flag_allow_retry_first() to mimic what
was done before.  And I agree it would be even better to put it after the retry
logic, so if any of the page faults gets a major fault, it'll be accounted as a
major fault which makes more sense to me, just like what x86 is doing now with:

	major |= fault & VM_FAULT_MAJOR;

I'm not sure what's the preference of the arch maintainers, just let me know if
it's preferred to use a single series to address this issue for all affected
archs (or the archs besides s390), then I'll do.

Thanks!

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Possible duplicate page fault accounting on some archs after commit 4064b9827063
  2020-06-10 15:48 Possible duplicate page fault accounting on some archs after commit 4064b9827063 Gerald Schaefer
  2020-06-10 16:50 ` Peter Xu
@ 2020-06-10 16:53 ` Linus Torvalds
  1 sibling, 0 replies; 4+ messages in thread
From: Linus Torvalds @ 2020-06-10 16:53 UTC (permalink / raw)
  To: Gerald Schaefer
  Cc: Peter Xu, linux-arch, Linux Kernel Mailing List, alpha,
	Linux ARM, linux-hexagon, linux-ia64, linux-m68k, Michal Simek,
	linux-mips, Nick Hu, Ley Foon Tan, openrisc, linux-parisc,
	linux-riscv, Linux-sh list, sparclinux, linux-um, Guan Xuetao,
	linux-xtensa, Heiko Carstens, Andrea Arcangeli

On Wed, Jun 10, 2020 at 8:48 AM Gerald Schaefer
<gerald.schaefer@de.ibm.com> wrote:
>
> This was found by coincidence in s390 code, and a quick check showed that
> there are quite a lot of other architectures that seem to be affected in a
> similar way. I'm preparing a fix for s390, by moving the accounting behind
> the retry loop, similar to x86. It is not completely straight-forward, so
> I leave the fix for other archs to the respective maintainers.

Hmm. I wonder if we could move the handling into  handle_mm_fault() itself.

It's _fairly_ trivial to do on the arch side, just as long as you
remember to make the VM_FAULT_MAJOR bit sticky like x86 does with that

        major |= fault & VM_FAULT_MAJOR;

right after handle_mm_fault(). But it certainly doesn't seem like it
would be hard to move into common code in handle_mm_fault() either, by
just not doing the accounting if it's about to return VM_FAULT_RETRY
or VM_FAULT_ERROR.

That said, we want that perf_sw_event() accounting too, so we'd have
to pass in a 'struct regs *' as well. And it's not clear which way
accounting should go for other callers of handle_mm_fault() (ie gup
etc).

So I guess just having architectures fix it up individually and make
sure they don't do it for retry conditions is the right thing to do..

             Linus

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Possible duplicate page fault accounting on some archs after commit 4064b9827063
  2020-06-10 16:50 ` Peter Xu
@ 2020-06-15 21:34   ` Peter Xu
  0 siblings, 0 replies; 4+ messages in thread
From: Peter Xu @ 2020-06-15 21:34 UTC (permalink / raw)
  To: Gerald Schaefer
  Cc: linux-arch, linux-kernel, linux-alpha, linux-arm-kernel,
	linux-hexagon, linux-ia64, linux-m68k, Michal Simek, linux-mips,
	Nick Hu, Ley Foon Tan, openrisc, linux-parisc, linux-riscv,
	linux-sh, sparclinux, linux-um, Guan Xuetao, linux-xtensa,
	Heiko Carstens, Linus Torvalds, Andrea Arcangeli

On Wed, Jun 10, 2020 at 12:50:23PM -0400, Peter Xu wrote:
> On Wed, Jun 10, 2020 at 05:48:11PM +0200, Gerald Schaefer wrote:
> > Hi,
> 
> Hi, Gerald,
> 
> > 
> > Some architectures have their page fault accounting code inside the fault
> > retry loop, and rely on only going through that code once. Before commit
> > 4064b9827063 ("mm: allow VM_FAULT_RETRY for multiple times"), that was
> > ensured by testing for and clearing FAULT_FLAG_ALLOW_RETRY.
> > 
> > That commit had to remove the clearing of FAULT_FLAG_ALLOW_RETRY for all
> > architectures, and introduced a subtle change to page fault accounting
> > logic in the affected archs. It is now possible to go through the retry
> > loop multiple times, and the affected archs would then account multiple
> > page faults instead of just one.
> > 
> > This was found by coincidence in s390 code, and a quick check showed that
> > there are quite a lot of other architectures that seem to be affected in a
> > similar way. I'm preparing a fix for s390, by moving the accounting behind
> > the retry loop, similar to x86. It is not completely straight-forward, so
> > I leave the fix for other archs to the respective maintainers.
> 
> Sorry for not noticing this before.  The accounting part should definitely be
> put at least into a check against fault_flag_allow_retry_first() to mimic what
> was done before.  And I agree it would be even better to put it after the retry
> logic, so if any of the page faults gets a major fault, it'll be accounted as a
> major fault which makes more sense to me, just like what x86 is doing now with:
> 
> 	major |= fault & VM_FAULT_MAJOR;
> 
> I'm not sure what's the preference of the arch maintainers, just let me know if
> it's preferred to use a single series to address this issue for all affected
> archs (or the archs besides s390), then I'll do.

To make sure this won't fall through the cracks... I'll give it a shot with a
single series to address this issue for all archs.  Although it might not be
easy to do accounting directly in handle_mm_fault(), it might be still a chance
to introduce a helper so the accounting can be done in general code.

Thanks,

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, back to index

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-10 15:48 Possible duplicate page fault accounting on some archs after commit 4064b9827063 Gerald Schaefer
2020-06-10 16:50 ` Peter Xu
2020-06-15 21:34   ` Peter Xu
2020-06-10 16:53 ` Linus Torvalds

Linux-m68k Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-m68k/0 linux-m68k/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-m68k linux-m68k/ https://lore.kernel.org/linux-m68k \
		linux-m68k@vger.kernel.org linux-m68k@lists.linux-m68k.org
	public-inbox-index linux-m68k

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-m68k


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git