kvm.vger.kernel.org archive mirror
* Re: [PATCH] KVM: X86: Avoid explictly fetch instruction in x86_decode_insn()
@ 2020-03-04  2:37 linmiaohe
  2020-03-04  7:30 ` Paolo Bonzini
  2020-03-04 15:32 ` Peter Xu
  0 siblings, 2 replies; 7+ messages in thread
From: linmiaohe @ 2020-03-04  2:37 UTC (permalink / raw)
  To: Peter Xu
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, linux-kernel, kvm

Hi:
Peter Xu <peterx@redhat.com> writes:
>insn_fetch() will always implicitly refill instruction buffer properly when the buffer is empty, so we don't need to explicitly fetch it even if insn_len==0 for x86_decode_insn().
>
>Signed-off-by: Peter Xu <peterx@redhat.com>
>---
> arch/x86/kvm/emulate.c | 5 -----
> 1 file changed, 5 deletions(-)
>
>diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index dd19fb3539e0..04f33c1ca926 100644
>--- a/arch/x86/kvm/emulate.c
>+++ b/arch/x86/kvm/emulate.c
>@@ -5175,11 +5175,6 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
> 	ctxt->opcode_len = 1;
> 	if (insn_len > 0)
> 		memcpy(ctxt->fetch.data, insn, insn_len);
>-	else {
>-		rc = __do_insn_fetch_bytes(ctxt, 1);
>-		if (rc != X86EMUL_CONTINUE)
>-			goto done;
>-	}
> 
> 	switch (mode) {
> 	case X86EMUL_MODE_REAL:

Looks good, thanks. But it seems we should also take care of the comment in __do_insn_fetch_bytes(), as we no longer
load the instruction at the beginning of x86_decode_insn(), which may be misleading:
		/*
         * One instruction can only straddle two pages,
         * and one has been loaded at the beginning of
         * x86_decode_insn.  So, if not enough bytes
         * still, we must have hit the 15-byte boundary.
         */
        if (unlikely(size < op_size))
                return emulate_gp(ctxt, 0);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] KVM: X86: Avoid explictly fetch instruction in x86_decode_insn()
  2020-03-04  2:37 [PATCH] KVM: X86: Avoid explictly fetch instruction in x86_decode_insn() linmiaohe
@ 2020-03-04  7:30 ` Paolo Bonzini
  2020-03-04 15:18   ` Peter Xu
  2020-03-04 15:32 ` Peter Xu
  1 sibling, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2020-03-04  7:30 UTC (permalink / raw)
  To: linmiaohe, Peter Xu
  Cc: Sean Christopherson, Vitaly Kuznetsov, linux-kernel, kvm

On 04/03/20 03:37, linmiaohe wrote:
> Hi:
> Peter Xu <peterx@redhat.com> writes:
>> insn_fetch() will always implicitly refill instruction buffer properly when the buffer is empty, so we don't need to explicitly fetch it even if insn_len==0 for x86_decode_insn().
>>
>> Signed-off-by: Peter Xu <peterx@redhat.com>
>> ---
>> arch/x86/kvm/emulate.c | 5 -----
>> 1 file changed, 5 deletions(-)
>>
>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index dd19fb3539e0..04f33c1ca926 100644
>> --- a/arch/x86/kvm/emulate.c
>> +++ b/arch/x86/kvm/emulate.c
>> @@ -5175,11 +5175,6 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
>> 	ctxt->opcode_len = 1;
>> 	if (insn_len > 0)
>> 		memcpy(ctxt->fetch.data, insn, insn_len);
>> -	else {
>> -		rc = __do_insn_fetch_bytes(ctxt, 1);
>> -		if (rc != X86EMUL_CONTINUE)
>> -			goto done;
>> -	}
>>
>> 	switch (mode) {
>> 	case X86EMUL_MODE_REAL:

This is a small (but measurable) optimization: going through
__do_insn_fetch_bytes instead of do_insn_fetch_bytes is a little bit
faster because it lets you mark the branch in do_insn_fetch_bytes as
unlikely, and in general it allows the branch predictor to do a better job.

Paolo

> Looks good, thanks. But it seems we should also take care of the comment in __do_insn_fetch_bytes(), as we do not
> load instruction at the beginning of x86_decode_insn() now, which may be misleading:
> 		/*
>          * One instruction can only straddle two pages,
>          * and one has been loaded at the beginning of
>          * x86_decode_insn.  So, if not enough bytes
>          * still, we must have hit the 15-byte boundary.
>          */
>         if (unlikely(size < op_size))
>                 return emulate_gp(ctxt, 0);
> 



* Re: [PATCH] KVM: X86: Avoid explictly fetch instruction in x86_decode_insn()
  2020-03-04  7:30 ` Paolo Bonzini
@ 2020-03-04 15:18   ` Peter Xu
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Xu @ 2020-03-04 15:18 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linmiaohe, Sean Christopherson, Vitaly Kuznetsov, linux-kernel, kvm

On Wed, Mar 04, 2020 at 08:30:49AM +0100, Paolo Bonzini wrote:
> On 04/03/20 03:37, linmiaohe wrote:
> > Hi:
> > Peter Xu <peterx@redhat.com> writes:
> >> insn_fetch() will always implicitly refill instruction buffer properly when the buffer is empty, so we don't need to explicitly fetch it even if insn_len==0 for x86_decode_insn().
> >>
> >> Signed-off-by: Peter Xu <peterx@redhat.com>
> >> ---
> >> arch/x86/kvm/emulate.c | 5 -----
> >> 1 file changed, 5 deletions(-)
> >>
> >> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index dd19fb3539e0..04f33c1ca926 100644
> >> --- a/arch/x86/kvm/emulate.c
> >> +++ b/arch/x86/kvm/emulate.c
> >> @@ -5175,11 +5175,6 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
> >> 	ctxt->opcode_len = 1;
> >> 	if (insn_len > 0)
> >> 		memcpy(ctxt->fetch.data, insn, insn_len);
> >> -	else {
> >> -		rc = __do_insn_fetch_bytes(ctxt, 1);
> >> -		if (rc != X86EMUL_CONTINUE)
> >> -			goto done;
> >> -	}
> >>
> >> 	switch (mode) {
> >> 	case X86EMUL_MODE_REAL:
> 
> This is a a small (but measurable) optimization; going through
> __do_insn_fetch_bytes instead of do_insn_fetch_bytes is a little bit
> faster because it lets you mark the branch in do_insn_fetch_bytes as
> unlikely, and in general it allows the branch predictor to do a better job.

Ah I see, that makes sense.  Thanks!

-- 
Peter Xu



* Re: [PATCH] KVM: X86: Avoid explictly fetch instruction in x86_decode_insn()
  2020-03-04  2:37 [PATCH] KVM: X86: Avoid explictly fetch instruction in x86_decode_insn() linmiaohe
  2020-03-04  7:30 ` Paolo Bonzini
@ 2020-03-04 15:32 ` Peter Xu
  2020-03-04 18:19   ` Paolo Bonzini
  1 sibling, 1 reply; 7+ messages in thread
From: Peter Xu @ 2020-03-04 15:32 UTC (permalink / raw)
  To: linmiaohe
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, linux-kernel, kvm

On Wed, Mar 04, 2020 at 02:37:06AM +0000, linmiaohe wrote:
> Hi:
> Peter Xu <peterx@redhat.com> writes:
> >insn_fetch() will always implicitly refill instruction buffer properly when the buffer is empty, so we don't need to explicitly fetch it even if insn_len==0 for x86_decode_insn().
> >
> >Signed-off-by: Peter Xu <peterx@redhat.com>
> >---
> > arch/x86/kvm/emulate.c | 5 -----
> > 1 file changed, 5 deletions(-)
> >
> >diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index dd19fb3539e0..04f33c1ca926 100644
> >--- a/arch/x86/kvm/emulate.c
> >+++ b/arch/x86/kvm/emulate.c
> >@@ -5175,11 +5175,6 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
> > 	ctxt->opcode_len = 1;
> > 	if (insn_len > 0)
> > 		memcpy(ctxt->fetch.data, insn, insn_len);
> >-	else {
> >-		rc = __do_insn_fetch_bytes(ctxt, 1);
> >-		if (rc != X86EMUL_CONTINUE)
> >-			goto done;
> >-	}
> > 
> > 	switch (mode) {
> > 	case X86EMUL_MODE_REAL:
> 
> Looks good, thanks. But it seems we should also take care of the comment in __do_insn_fetch_bytes(), as we do not
> load instruction at the beginning of x86_decode_insn() now, which may be misleading:
> 		/*
>          * One instruction can only straddle two pages,
>          * and one has been loaded at the beginning of
>          * x86_decode_insn.  So, if not enough bytes
>          * still, we must have hit the 15-byte boundary.
>          */
>         if (unlikely(size < op_size))
>                 return emulate_gp(ctxt, 0);

Right, thanks for spotting that (even if the patch is to be dropped :).

I guess it's not only the comment: the check might even fail if we
apply the patch, because when the fetch is the first attempt and it
unluckily crosses a page boundary (we only fetch up to either 15
bytes or the page boundary), that single fetch could return fewer
bytes than the op_size provided.

Thanks,

-- 
Peter Xu



* Re: [PATCH] KVM: X86: Avoid explictly fetch instruction in x86_decode_insn()
  2020-03-04 15:32 ` Peter Xu
@ 2020-03-04 18:19   ` Paolo Bonzini
  2020-03-04 20:30     ` Nadav Amit
  0 siblings, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2020-03-04 18:19 UTC (permalink / raw)
  To: Peter Xu, linmiaohe
  Cc: Sean Christopherson, Vitaly Kuznetsov, linux-kernel, kvm

On 04/03/20 16:32, Peter Xu wrote:
>> Looks good, thanks. But it seems we should also take care of the comment in __do_insn_fetch_bytes(), as we do not
>> load instruction at the beginning of x86_decode_insn() now, which may be misleading:
>> 		/*
>>          * One instruction can only straddle two pages,
>>          * and one has been loaded at the beginning of
>>          * x86_decode_insn.  So, if not enough bytes
>>          * still, we must have hit the 15-byte boundary.
>>          */
>>         if (unlikely(size < op_size))
>>                 return emulate_gp(ctxt, 0);
> Right, thanks for spotting that (even if the patch to be dropped :).
> 
> I guess not only the comment, but the check might even fail if we
> apply the patch. Because when the fetch is the 1st attempt and
> unluckily that acrosses one page boundary (because we'll only fetch
> until either 15 bytes or the page boundary), so that single fetch
> could be smaller than op_size provided.

Right, priming the decode cache with one byte from the current page
cannot fail, and then we know that the next call must be at the
beginning of the next page.

Paolo



* Re: [PATCH] KVM: X86: Avoid explictly fetch instruction in x86_decode_insn()
  2020-03-04 18:19   ` Paolo Bonzini
@ 2020-03-04 20:30     ` Nadav Amit
  0 siblings, 0 replies; 7+ messages in thread
From: Nadav Amit @ 2020-03-04 20:30 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Xu, linmiaohe, Sean Christopherson, Vitaly Kuznetsov,
	linux-kernel, kvm

> On Mar 4, 2020, at 10:19 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> On 04/03/20 16:32, Peter Xu wrote:
>>> Looks good, thanks. But it seems we should also take care of the comment in __do_insn_fetch_bytes(), as we do not
>>> load instruction at the beginning of x86_decode_insn() now, which may be misleading:
>>> 		/*
>>>         * One instruction can only straddle two pages,
>>>         * and one has been loaded at the beginning of
>>>         * x86_decode_insn.  So, if not enough bytes
>>>         * still, we must have hit the 15-byte boundary.
>>>         */
>>>        if (unlikely(size < op_size))
>>>                return emulate_gp(ctxt, 0);
>> Right, thanks for spotting that (even if the patch to be dropped :).
>> 
>> I guess not only the comment, but the check might even fail if we
>> apply the patch. Because when the fetch is the 1st attempt and
>> unluckily that acrosses one page boundary (because we'll only fetch
>> until either 15 bytes or the page boundary), so that single fetch
>> could be smaller than op_size provided.
> 
> Right, priming the decode cache with one byte from the current page
> cannot fail, and then we know that the next call must be at the
> beginning of the next page.

IIRC I encountered (and fixed) a similar KVM bug in the past. It is a shame
I never wrote a unit test (and I don’t have time now), but it would be nice
to have one.



* [PATCH] KVM: X86: Avoid explictly fetch instruction in x86_decode_insn()
@ 2020-03-04  2:16 Peter Xu
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Xu @ 2020-03-04  2:16 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Sean Christopherson, peterx, Vitaly Kuznetsov

insn_fetch() always implicitly refills the instruction buffer when it
is empty, so we don't need to fetch the instruction explicitly in
x86_decode_insn() even when insn_len == 0.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 arch/x86/kvm/emulate.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index dd19fb3539e0..04f33c1ca926 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -5175,11 +5175,6 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
 	ctxt->opcode_len = 1;
 	if (insn_len > 0)
 		memcpy(ctxt->fetch.data, insn, insn_len);
-	else {
-		rc = __do_insn_fetch_bytes(ctxt, 1);
-		if (rc != X86EMUL_CONTINUE)
-			goto done;
-	}
 
 	switch (mode) {
 	case X86EMUL_MODE_REAL:
-- 
2.24.1



end of thread, other threads:[~2020-03-04 20:30 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-04  2:37 [PATCH] KVM: X86: Avoid explictly fetch instruction in x86_decode_insn() linmiaohe
2020-03-04  7:30 ` Paolo Bonzini
2020-03-04 15:18   ` Peter Xu
2020-03-04 15:32 ` Peter Xu
2020-03-04 18:19   ` Paolo Bonzini
2020-03-04 20:30     ` Nadav Amit
  -- strict thread matches above, loose matches on Subject: below --
2020-03-04  2:16 Peter Xu
