linux-kernel.vger.kernel.org archive mirror
From: Peter Xu <peterx@redhat.com>
To: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>,
	linux-kernel@vger.kernel.org,
	Gerald Schaefer <gerald.schaefer@de.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	linux-s390@vger.kernel.org
Subject: Re: [PATCH 19/25] mm/s390: Use mm_fault_accounting()
Date: Wed, 17 Jun 2020 12:06:17 -0400
Message-ID: <20200617160617.GD76766@xz-x1>
In-Reply-To: <edb88596-6f2c-2648-748d-591a0b1e0131@de.ibm.com>

Hi, Christian,

On Wed, Jun 17, 2020 at 08:19:29AM +0200, Christian Borntraeger wrote:
> 
> 
> On 16.06.20 18:35, Peter Xu wrote:
> > Hi, Alexander,
> > 
> > On Tue, Jun 16, 2020 at 05:59:33PM +0200, Alexander Gordeev wrote:
> >>> @@ -489,21 +489,7 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
> >>>  	if (unlikely(fault & VM_FAULT_ERROR))
> >>>  		goto out_up;
> >>>
> >>> -	/*
> >>> -	 * Major/minor page fault accounting is only done on the
> >>> -	 * initial attempt. If we go through a retry, it is extremely
> >>> -	 * likely that the page will be found in page cache at that point.
> >>> -	 */
> >>>  	if (flags & FAULT_FLAG_ALLOW_RETRY) {
> >>> -		if (fault & VM_FAULT_MAJOR) {
> >>> -			tsk->maj_flt++;
> >>> -			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
> >>> -				      regs, address);
> >>> -		} else {
> >>> -			tsk->min_flt++;
> >>> -			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
> >>> -				      regs, address);
> >>> -		}
> >>>  		if (fault & VM_FAULT_RETRY) {
> >>>  			if (IS_ENABLED(CONFIG_PGSTE) && gmap &&
> >>>  			    (flags & FAULT_FLAG_RETRY_NOWAIT)) {

[1]

> >>
> >> Seems like the call to mm_fault_accounting() will be missed if
> >> we entered here with FAULT_FLAG_RETRY_NOWAIT flag set, since it
> >> jumps to "out_up"...
> > 
> > This is true as a functional change.  However that also means that we've got a
> > VM_FAULT_RETRY, which hints that this fault has been requested to retry rather
> > than handled correctly (for instance, due to some try_lock failed during the
> > fault process).
> > 
> > To me, that case should not be counted as a page fault at all?  Or we might get
> > the same duplicated accounting when the page fault retried from a higher stack.
> > 
> > Thanks
> 
> This case below (the one with the gmap) is the KVM case for doing a so called
> pseudo page fault to our guests. (we notify our guests about major host page
> faults and let it reschedule to something else instead of halting the vcpu).
> This is being resolved with either gup or fixup_user_fault asynchronously by
> KVM code (this can also be sync when the guest does not match some conditions)
> We do not change the counters in that code as far as I can tell so we should
> continue to do it here.
> 
> (see arch/s390/kvm/kvm-s390.c
> static int vcpu_post_run(struct kvm_vcpu *vcpu, int exit_reason)
> {
> [...]
>         } else if (current->thread.gmap_pfault) {
>                 trace_kvm_s390_major_guest_pfault(vcpu);
>                 current->thread.gmap_pfault = 0;
>                 if (kvm_arch_setup_async_pf(vcpu))
>                         return 0;
>                 return kvm_arch_fault_in_page(vcpu, current->thread.gmap_addr, 1);
>         }

Please correct me if I'm wrong... but I still think what this patch does is the
right thing to do.

Note again that, IMHO, reaching [1] above means the page fault was not handled
successfully, so we need to fall back to the KVM async page fault path.  We
shouldn't increment the counters until the fault is finally handled.  That
final accounting should be done in the async pf path, in the gup code where the
page fault is actually resolved:

  kvm_arch_fault_in_page
    gmap_fault
      fixup_user_fault

Where in fixup_user_fault() we have:

	if (tsk) {
		if (major)
			tsk->maj_flt++;
		else
			tsk->min_flt++;
	}
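
To make the argument concrete, here is a small userspace model of the
accounting rule.  It is only a sketch, not kernel code: every sketch_* name and
flag value is made up for illustration, and only the maj_flt/min_flt counter
names come from the snippet above.  It assumes the first attempt may ask for a
retry and that the async path then resolves the fault for good.

```c
#include <assert.h>

/* Hypothetical stand-ins for vm_fault_t bits; the values are illustrative. */
enum {
	SKETCH_FAULT_MAJOR = 1 << 0,
	SKETCH_FAULT_RETRY = 1 << 1,
};

/* Hypothetical stand-in for the per-task fault counters. */
struct sketch_task {
	unsigned long maj_flt;
	unsigned long min_flt;
};

/* Count one fault, but only if it completed: a retry is never counted. */
static void sketch_account(struct sketch_task *tsk, unsigned int fault)
{
	if (!tsk || (fault & SKETCH_FAULT_RETRY))
		return;
	if (fault & SKETCH_FAULT_MAJOR)
		tsk->maj_flt++;
	else
		tsk->min_flt++;
}

/*
 * Model of one fault: first_result is what the initial attempt returned,
 * async_result is what the async path returns if a retry was requested.
 */
static unsigned int sketch_fault_path(struct sketch_task *tsk,
				      unsigned int first_result,
				      unsigned int async_result)
{
	if (!(first_result & SKETCH_FAULT_RETRY)) {
		/* Handled on the first attempt: account it here. */
		sketch_account(tsk, first_result);
		return first_result;
	}

	/* Retry requested: nothing has been counted so far. */
	assert(!(async_result & SKETCH_FAULT_RETRY));

	/* The async path resolves the fault and does the only accounting. */
	sketch_account(tsk, async_result);
	return async_result;
}
```

In this model, a retry followed by a major resolution in the async path bumps
maj_flt exactly once and leaves min_flt alone, so the fault is neither lost nor
counted twice.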

Thanks,

-- 
Peter Xu

