* [PATCH] KVM: try __get_user_pages_fast even if not in atomic context
@ 2018-07-27 15:46 Paolo Bonzini
  2018-07-30  8:40 ` David Hildenbrand
  2018-08-06  7:51 ` Xiao Guangrong
  0 siblings, 2 replies; 5+ messages in thread
From: Paolo Bonzini @ 2018-07-27 15:46 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: rkrcmar, Vitaly Kuznetsov, Junaid Shahid, Xiao Guangrong

We are currently cutting hva_to_pfn_fast short if we do not want an
immediate exit, which is represented by !async && !atomic.  However,
this is unnecessary, and __get_user_pages_fast is *much* faster
because the regular get_user_pages takes pmd_lock/pte_lock.
In fact, when many CPUs take a nested vmexit at the same time
the contention on those locks is visible, and this patch removes
about 25% (compared to 4.18) from vmexit.flat on a 16 vCPU
nested guest.

Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 virt/kvm/kvm_main.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 861bb20e8451..0f26ff7ddedb 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1343,18 +1343,16 @@ static inline int check_user_page_hwpoison(unsigned long addr)
 }
 
 /*
- * The atomic path to get the writable pfn which will be stored in @pfn,
- * true indicates success, otherwise false is returned.
+ * The fast path to get the writable pfn which will be stored in @pfn,
+ * true indicates success, otherwise false is returned.  It's also the
+ * only part that runs if we are in atomic context.
  */
-static bool hva_to_pfn_fast(unsigned long addr, bool atomic, bool *async,
-			    bool write_fault, bool *writable, kvm_pfn_t *pfn)
+static bool hva_to_pfn_fast(unsigned long addr, bool write_fault,
+			    bool *writable, kvm_pfn_t *pfn)
 {
 	struct page *page[1];
 	int npages;
 
-	if (!(async || atomic))
-		return false;
-
 	/*
 	 * Fast pin a writable pfn only if it is a write fault request
 	 * or the caller allows to map a writable pfn for a read fault
@@ -1498,7 +1496,7 @@ static kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
 	/* we can do it either atomically or asynchronously, not both */
 	BUG_ON(atomic && async);
 
-	if (hva_to_pfn_fast(addr, atomic, async, write_fault, writable, &pfn))
+	if (hva_to_pfn_fast(addr, write_fault, writable, &pfn))
 		return pfn;
 
 	if (atomic)
-- 
1.8.3.1
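
[For readers skimming the diff, a minimal sketch of hva_to_pfn()'s
control flow after this patch. The helper names match the 4.18-era
code, but hwpoison handling and the VM_IO/VM_PFNMAP fallback are
omitted, so this is illustrative rather than the actual function body:]

static kvm_pfn_t hva_to_pfn_sketch(unsigned long addr, bool atomic,
				   bool *async, bool write_fault,
				   bool *writable)
{
	kvm_pfn_t pfn;
	int npages;

	/*
	 * With this patch the lock-free walk is always tried first,
	 * even when neither @atomic nor @async is set; it avoids the
	 * pmd_lock/pte_lock contention described in the changelog.
	 */
	if (hva_to_pfn_fast(addr, write_fault, writable, &pfn))
		return pfn;

	/* In atomic context we must not sleep, so give up here. */
	if (atomic)
		return KVM_PFN_ERR_FAULT;

	/* Otherwise fall back to the sleeping slow path. */
	npages = hva_to_pfn_slow(addr, async, write_fault, writable, &pfn);
	if (npages == 1)
		return pfn;

	return KVM_PFN_ERR_FAULT;	/* simplified error handling */
}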



* Re: [PATCH] KVM: try __get_user_pages_fast even if not in atomic context
  2018-07-27 15:46 [PATCH] KVM: try __get_user_pages_fast even if not in atomic context Paolo Bonzini
@ 2018-07-30  8:40 ` David Hildenbrand
  2018-08-06  7:51 ` Xiao Guangrong
  1 sibling, 0 replies; 5+ messages in thread
From: David Hildenbrand @ 2018-07-30  8:40 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: rkrcmar, Vitaly Kuznetsov, Junaid Shahid, Xiao Guangrong

On 27.07.2018 17:46, Paolo Bonzini wrote:
> We are currently cutting hva_to_pfn_fast short if we do not want an
> immediate exit, which is represented by !async && !atomic.  However,
> this is unnecessary, and __get_user_pages_fast is *much* faster
> because the regular get_user_pages takes pmd_lock/pte_lock.
> In fact, when many CPUs take a nested vmexit at the same time
> the contention on those locks is visible, and this patch removes
> about 25% (compared to 4.18) from vmexit.flat on a 16 vCPU
> nested guest.
> 
> Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  virt/kvm/kvm_main.c | 14 ++++++--------
>  1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 861bb20e8451..0f26ff7ddedb 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1343,18 +1343,16 @@ static inline int check_user_page_hwpoison(unsigned long addr)
>  }
>  
>  /*
> - * The atomic path to get the writable pfn which will be stored in @pfn,
> - * true indicates success, otherwise false is returned.
> + * The fast path to get the writable pfn which will be stored in @pfn,
> + * true indicates success, otherwise false is returned.  It's also the
> + * only part that runs if we are in atomic context.
>   */
> -static bool hva_to_pfn_fast(unsigned long addr, bool atomic, bool *async,
> -			    bool write_fault, bool *writable, kvm_pfn_t *pfn)
> +static bool hva_to_pfn_fast(unsigned long addr, bool write_fault,
> +			    bool *writable, kvm_pfn_t *pfn)
>  {
>  	struct page *page[1];
>  	int npages;
>  
> -	if (!(async || atomic))
> -		return false;
> -
>  	/*
>  	 * Fast pin a writable pfn only if it is a write fault request
>  	 * or the caller allows to map a writable pfn for a read fault
> @@ -1498,7 +1496,7 @@ static kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
>  	/* we can do it either atomically or asynchronously, not both */
>  	BUG_ON(atomic && async);
>  
> -	if (hva_to_pfn_fast(addr, atomic, async, write_fault, writable, &pfn))
> +	if (hva_to_pfn_fast(addr, write_fault, writable, &pfn))
>  		return pfn;
>  
>  	if (atomic)
> 

Indeed, for the !atomic case this looks perfectly valid. And !async
should be fine as well, as there is nothing to wait for if the page is
already in memory.

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 

Thanks,

David / dhildenb


* Re: [PATCH] KVM: try __get_user_pages_fast even if not in atomic context
  2018-07-27 15:46 [PATCH] KVM: try __get_user_pages_fast even if not in atomic context Paolo Bonzini
  2018-07-30  8:40 ` David Hildenbrand
@ 2018-08-06  7:51 ` Xiao Guangrong
  2018-08-06 11:44   ` Paolo Bonzini
  1 sibling, 1 reply; 5+ messages in thread
From: Xiao Guangrong @ 2018-08-06  7:51 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: rkrcmar, Vitaly Kuznetsov, Junaid Shahid, Xiao Guangrong



On 07/27/2018 11:46 PM, Paolo Bonzini wrote:
> We are currently cutting hva_to_pfn_fast short if we do not want an
> immediate exit, which is represented by !async && !atomic.  However,
> this is unnecessary, and __get_user_pages_fast is *much* faster
> because the regular get_user_pages takes pmd_lock/pte_lock.
> In fact, when many CPUs take a nested vmexit at the same time
> the contention on those locks is visible, and this patch removes
> about 25% (compared to 4.18) from vmexit.flat on a 16 vCPU
> nested guest.
> 

Nice improvement.

Then after that, we will unconditionally try hva_to_pfn_fast(); does
it hurt the case where the mappings are not yet present in the host's
page tables?

Can we apply this technique to other places using gup, or even squash
it into get_user_pages()?


* Re: [PATCH] KVM: try __get_user_pages_fast even if not in atomic context
  2018-08-06  7:51 ` Xiao Guangrong
@ 2018-08-06 11:44   ` Paolo Bonzini
  2018-08-06 17:39     ` Andrea Arcangeli
  0 siblings, 1 reply; 5+ messages in thread
From: Paolo Bonzini @ 2018-08-06 11:44 UTC (permalink / raw)
  To: Xiao Guangrong, linux-kernel, kvm
  Cc: rkrcmar, Vitaly Kuznetsov, Junaid Shahid, Xiao Guangrong,
	Andrea Arcangeli

On 06/08/2018 09:51, Xiao Guangrong wrote:
> 
> 
> On 07/27/2018 11:46 PM, Paolo Bonzini wrote:
>> We are currently cutting hva_to_pfn_fast short if we do not want an
>> immediate exit, which is represented by !async && !atomic.  However,
>> this is unnecessary, and __get_user_pages_fast is *much* faster
>> because the regular get_user_pages takes pmd_lock/pte_lock.
>> In fact, when many CPUs take a nested vmexit at the same time
>> the contention on those locks is visible, and this patch removes
>> about 25% (compared to 4.18) from vmexit.flat on a 16 vCPU
>> nested guest.
>>
> 
> Nice improvement.
> 
> Then after that, we will unconditionally try hva_to_pfn_fast(); does
> it hurt the case where the mappings are not yet present in the host's
> page tables?

I don't think so, because that case is quite slow anyway: when the
mapping is not present, the failed fast-path attempt is negligible
next to the cost of faulting the page in.

> Can we apply this technique to other places using gup, or even
> squash it into get_user_pages()?

That may make sense.  Andrea, do you have an idea?

Paolo


* Re: [PATCH] KVM: try __get_user_pages_fast even if not in atomic context
  2018-08-06 11:44   ` Paolo Bonzini
@ 2018-08-06 17:39     ` Andrea Arcangeli
  0 siblings, 0 replies; 5+ messages in thread
From: Andrea Arcangeli @ 2018-08-06 17:39 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Xiao Guangrong, linux-kernel, kvm, rkrcmar, Vitaly Kuznetsov,
	Junaid Shahid, Xiao Guangrong

Hello,

On Mon, Aug 06, 2018 at 01:44:49PM +0200, Paolo Bonzini wrote:
> On 06/08/2018 09:51, Xiao Guangrong wrote:
> > 
> > 
> > On 07/27/2018 11:46 PM, Paolo Bonzini wrote:
> >> We are currently cutting hva_to_pfn_fast short if we do not want an
> >> immediate exit, which is represented by !async && !atomic.  However,
> >> this is unnecessary, and __get_user_pages_fast is *much* faster
> >> because the regular get_user_pages takes pmd_lock/pte_lock.
> >> In fact, when many CPUs take a nested vmexit at the same time
> >> the contention on those locks is visible, and this patch removes
> >> about 25% (compared to 4.18) from vmexit.flat on a 16 vCPU
> >> nested guest.
> >>
> > 
> > Nice improvement.
> > 
> > Then after that, we will unconditionally try hva_to_pfn_fast(); does
> > it hurt the case where the mappings are not yet present in the host's
> > page tables?
> 
> I don't think so, because that case is quite slow anyway: when the
> mapping is not present, the failed fast-path attempt is negligible
> next to the cost of faulting the page in.

There will be a minimal impact, but it's worth it.

The reason it's worth it is that we shouldn't be calling
get_user_pages_unlocked in hva_to_pfn_slow if we could instead pass
FOLL_HWPOISON to get_user_pages_fast.

And get_user_pages_fast is really just __get_user_pages_fast +
get_user_pages_unlocked, with just one difference (see below).
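
[A hedged sketch of that decomposition, using the 4.18-era signatures;
the real get_user_pages_fast in mm/gup.c does more careful bookkeeping
around partial fast-path results:]

int gup_fast_sketch(unsigned long start, int nr_pages, int write,
		    struct page **pages)
{
	/* Lock-free attempt; pins as many pages as it can. */
	int nr = __get_user_pages_fast(start, nr_pages, write, pages);
	long ret;

	if (nr == nr_pages)
		return nr;

	/* Slow-path fallback for whatever the fast walk missed. */
	ret = get_user_pages_unlocked(start + ((unsigned long)nr << PAGE_SHIFT),
				      nr_pages - nr, pages + nr,
				      write ? FOLL_WRITE : 0);
	if (ret < 0)
		return nr ? nr : ret;	/* report partial success if any */

	return nr + ret;
}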

Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>

> 
> > Can we apply this technique to other places using gup, or even
> > squash it into get_user_pages()?
> 
> That may make sense.  Andrea, do you have an idea?

Regarding further improvements: looking at commit
5b65c4677a57a1d4414212f9995aa0e46a21ff80, it may be worth adding a new
gup variant, __get_user_pages_fast_irq_enabled, so that our slow path
"__get_user_pages_fast_irq_enabled + get_user_pages_unlocked" becomes
really as fast as get_user_pages_fast (which we can't call in the
atomic case, and which can't take the foll flags; making it take the
foll flags would also make it somewhat slower by adding branches).

If I understand the commit header correctly, "Before" refers to when
get_user_pages_fast was still calling __get_user_pages_fast, and
"After" is the optimized version that uses local_irq_disable/enable
instead of local_irq_save/restore.

So we'd need to call a new __get_user_pages_fast_irq_enabled instead
of __get_user_pages_fast. It would only be safe to call with irqs
enabled, which is always the case for KVM, even in the atomic case
(KVM's atomic case is atomic only because of the spinlock, not because
irqs are disabled). Such a new method would then also be OK to call
from interrupt context, as long as irqs are enabled at the call site.

Such a change would also help reduce the minimal impact on the _slow
case. x86 would surely be fine with the generic version, and it's
trivial to implement; I haven't checked the details of other
architectures.
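
[A hypothetical sketch of such a variant; the name, and the reuse of
the (static) gup_pgd_range() helper from mm/gup.c, are assumptions for
illustration, not an existing API:]

/*
 * Like __get_user_pages_fast, but the caller guarantees irqs are
 * enabled, so the cheaper local_irq_disable/enable pair can be used
 * instead of local_irq_save/restore.
 */
int __get_user_pages_fast_irq_enabled(unsigned long start, int nr_pages,
				      int write, struct page **pages)
{
	unsigned long end = start + ((unsigned long)nr_pages << PAGE_SHIFT);
	int nr = 0;

	local_irq_disable();
	gup_pgd_range(start, end, write, pages, &nr);
	local_irq_enable();

	return nr;
}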

Thanks,
Andrea

