linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3] x86: svm: use kvm_fast_pio_in()
@ 2015-03-02 21:02 Joel Schopp
  2015-03-03 16:42 ` Radim Krčmář
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Joel Schopp @ 2015-03-02 21:02 UTC (permalink / raw)
  To: Gleb Natapov, Paolo Bonzini, kvm
  Cc: David Kaplan, David Kaplan, rkrcmar, Joerg Roedel, linux-kernel,
	Borislav Petkov

From: David Kaplan <David.Kaplan@amd.com>

We can make the in instruction go faster the same way the out instruction is
already.

Changes from v2[Joel]:
	* changed rax from u32 to unsigned long
	* changed a couple return 0 to BUG_ON()
	* changed 8 to sizeof(new_rax)
	* added trace hook
	* removed redundant clearing of count
Changes from v1[Joel]
	* Added kvm_fast_pio_in() implementation that was left out of v1

Signed-off-by: David Kaplan <David.Kaplan@amd.com>
[extracted from larger unlrelated patch, forward ported, addressed reviews, tested]
Signed-off-by: Joel Schopp <joel.schopp@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/svm.c              |    4 +++-
 arch/x86/kvm/x86.c              |   30 ++++++++++++++++++++++++++++++
 3 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index a236e39..b976824 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -931,6 +931,7 @@ int kvm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr);
 struct x86_emulate_ctxt;
 
 int kvm_fast_pio_out(struct kvm_vcpu *vcpu, int size, unsigned short port);
+int kvm_fast_pio_in(struct kvm_vcpu *vcpu, int size, unsigned short port);
 void kvm_emulate_cpuid(struct kvm_vcpu *vcpu);
 int kvm_emulate_halt(struct kvm_vcpu *vcpu);
 int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d319e0c..f8c906b 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1899,7 +1899,7 @@ static int io_interception(struct vcpu_svm *svm)
 	++svm->vcpu.stat.io_exits;
 	string = (io_info & SVM_IOIO_STR_MASK) != 0;
 	in = (io_info & SVM_IOIO_TYPE_MASK) != 0;
-	if (string || in)
+	if (string)
 		return emulate_instruction(vcpu, 0) == EMULATE_DONE;
 
 	port = io_info >> 16;
@@ -1907,6 +1907,8 @@ static int io_interception(struct vcpu_svm *svm)
 	svm->next_rip = svm->vmcb->control.exit_info_2;
 	skip_emulated_instruction(&svm->vcpu);
 
+	if (in)
+		return kvm_fast_pio_in(vcpu, size, port);
 	return kvm_fast_pio_out(vcpu, size, port);
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bd7a70b..d05efaf 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5463,6 +5463,36 @@ int kvm_fast_pio_out(struct kvm_vcpu *vcpu, int size, unsigned short port)
 }
 EXPORT_SYMBOL_GPL(kvm_fast_pio_out);
 
+static int complete_fast_pio(struct kvm_vcpu *vcpu)
+{
+	unsigned long new_rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
+
+	BUG_ON(!vcpu->arch.pio.count);
+	BUG_ON(vcpu->arch.pio.count * vcpu->arch.pio.size > sizeof(new_rax));
+
+	memcpy(&new_rax, vcpu, sizeof(new_rax));
+	trace_kvm_pio(KVM_PIO_IN, vcpu->arch.pio.port, vcpu->arch.pio.size,
+		      vcpu->arch.pio.count, vcpu->arch.pio_data);
+	kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
+	vcpu->arch.pio.count = 0;
+	return 1;
+}
+
+int kvm_fast_pio_in(struct kvm_vcpu *vcpu, int size, unsigned short port)
+{
+	unsigned long val;
+	int ret = emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt, size,
+					   port, &val, 1);
+
+	if (ret)
+		kvm_register_write(vcpu, VCPU_REGS_RAX, val);
+	else
+		vcpu->arch.complete_userspace_io = complete_fast_pio;
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(kvm_fast_pio_in);
+
 static void tsc_bad(void *info)
 {
 	__this_cpu_write(cpu_tsc_khz, 0);


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v3] x86: svm: use kvm_fast_pio_in()
  2015-03-02 21:02 [PATCH v3] x86: svm: use kvm_fast_pio_in() Joel Schopp
@ 2015-03-03 16:42 ` Radim Krčmář
  2015-03-03 19:48   ` Joel Schopp
  2015-03-03 16:44 ` Radim Krčmář
  2015-03-13  0:47 ` Marcelo Tosatti
  2 siblings, 1 reply; 9+ messages in thread
From: Radim Krčmář @ 2015-03-03 16:42 UTC (permalink / raw)
  To: Joel Schopp
  Cc: Gleb Natapov, Paolo Bonzini, kvm, David Kaplan, Joerg Roedel,
	linux-kernel, Borislav Petkov

2015-03-02 15:02-0600, Joel Schopp:
> From: David Kaplan <David.Kaplan@amd.com>
> 
> We can make the in instruction go faster the same way the out instruction is
> already.

(How much faster do benchmarks run?)

> Changes from v2[Joel]:
> 	* changed rax from u32 to unsigned long
> 	* changed a couple return 0 to BUG_ON()
> 	* changed 8 to sizeof(new_rax)
> 	* added trace hook
> 	* removed redundant clearing of count
> Changes from v1[Joel]
> 	* Added kvm_fast_pio_in() implementation that was left out of v1
> 
> Signed-off-by: David Kaplan <David.Kaplan@amd.com>
> [extracted from larger unlrelated patch, forward ported, addressed reviews, tested]
> Signed-off-by: Joel Schopp <joel.schopp@amd.com>
> ---
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> @@ -5463,6 +5463,36 @@ int kvm_fast_pio_out(struct kvm_vcpu *vcpu, int size, unsigned short port)
>  }
>  EXPORT_SYMBOL_GPL(kvm_fast_pio_out);
>  
> +static int complete_fast_pio(struct kvm_vcpu *vcpu)

(complete_fast_pio_in()?)

> +{
> +	unsigned long new_rax = kvm_register_read(vcpu, VCPU_REGS_RAX);

Shouldn't we handle writes in EAX differently than in AX and AL, because
of implicit zero extension.

> +
> +	BUG_ON(!vcpu->arch.pio.count);
> +	BUG_ON(vcpu->arch.pio.count * vcpu->arch.pio.size > sizeof(new_rax));

(Looking at it again, a check for 'vcpu->arch.pio.count == 1' would be
 sufficient.)

> +
> +	memcpy(&new_rax, vcpu, sizeof(new_rax));
> +	trace_kvm_pio(KVM_PIO_IN, vcpu->arch.pio.port, vcpu->arch.pio.size,
> +		      vcpu->arch.pio.count, vcpu->arch.pio_data);
> +	kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
> +	vcpu->arch.pio.count = 0;

I think it is better to call emulator_pio_in_emulated directly, like

   	emulator_pio_in_out(&vcpu->arch.emulate_ctxt, vcpu->arch.pio.size,
   			vcpu->arch.pio.port, &new_rax, 1);
   	kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);

because we know that vcpu->arch.pio.count != 0.

Refactoring could avoid the weird vcpu->ctxt->vcpu conversion.
(A better name is always welcome.)

---
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 96a8333f3db0..d0e5b086f2e1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4663,22 +4663,23 @@ static int emulator_pio_in_out(struct kvm_vcpu *vcpu, int size,
 	return 0;
 }
 
+static void emulator_complete_pio_in(struct kvm_vcpu *vcpu, int size,
+		unsigned short port, void *val, unsigned int count)
+{
+	memcpy(val, vcpu->arch.pio_data, size * count);
+	trace_kvm_pio(KVM_PIO_IN, port, size, count, vcpu->arch.pio_data);
+	vcpu->arch.pio.count = 0;
+}
+
 static int emulator_pio_in_emulated(struct x86_emulate_ctxt *ctxt,
 				    int size, unsigned short port, void *val,
 				    unsigned int count)
 {
 	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
-	int ret;
 
-	if (vcpu->arch.pio.count)
-		goto data_avail;
-
-	ret = emulator_pio_in_out(vcpu, size, port, val, count, true);
-	if (ret) {
-data_avail:
-		memcpy(val, vcpu->arch.pio_data, size * count);
-		trace_kvm_pio(KVM_PIO_IN, port, size, count, vcpu->arch.pio_data);
-		vcpu->arch.pio.count = 0;
+	if (vcpu->arch.pio.count ||
+	    emulator_pio_in_out(vcpu, size, port, val, count, true)) {
+		emulator_complete_pio_in(vcpu, size, port, val, count);
 		return 1;
 	}
 

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v3] x86: svm: use kvm_fast_pio_in()
  2015-03-02 21:02 [PATCH v3] x86: svm: use kvm_fast_pio_in() Joel Schopp
  2015-03-03 16:42 ` Radim Krčmář
@ 2015-03-03 16:44 ` Radim Krčmář
  2015-03-03 20:03   ` Joel Schopp
  2015-03-13  0:47 ` Marcelo Tosatti
  2 siblings, 1 reply; 9+ messages in thread
From: Radim Krčmář @ 2015-03-03 16:44 UTC (permalink / raw)
  To: Joel Schopp
  Cc: Gleb Natapov, Paolo Bonzini, kvm, David Kaplan, Joerg Roedel,
	linux-kernel, Borislav Petkov

2015-03-02 15:02-0600, Joel Schopp:
> +int kvm_fast_pio_in(struct kvm_vcpu *vcpu, int size, unsigned short port)
> +{
> +	unsigned long val;
> +	int ret = emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt, size,
> +					   port, &val, 1);
> +

Btw. does this return 1 in some scenario?

> +	if (ret)
> +		kvm_register_write(vcpu, VCPU_REGS_RAX, val);
> +	else
> +		vcpu->arch.complete_userspace_io = complete_fast_pio;
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(kvm_fast_pio_in);

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3] x86: svm: use kvm_fast_pio_in()
  2015-03-03 16:42 ` Radim Krčmář
@ 2015-03-03 19:48   ` Joel Schopp
  2015-03-03 20:42     ` Radim Krčmář
  0 siblings, 1 reply; 9+ messages in thread
From: Joel Schopp @ 2015-03-03 19:48 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Gleb Natapov, Paolo Bonzini, kvm, David Kaplan, Joerg Roedel,
	linux-kernel, Borislav Petkov

Thank you for your detailed review on several of my patches.

>>  
>> +static int complete_fast_pio(struct kvm_vcpu *vcpu)
> (complete_fast_pio_in()?)
If I do a v4 I'll adopt that name.
>> +{
>> +	unsigned long new_rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
> Shouldn't we handle writes in EAX differently than in AX and AL, because
> of implicit zero extension.
I don't think the implicit zero extension hurts us here, but maybe there
is something I'm missing that I need understand. Could you explain this
further?
>
>> +
>> +	BUG_ON(!vcpu->arch.pio.count);
>> +	BUG_ON(vcpu->arch.pio.count * vcpu->arch.pio.size > sizeof(new_rax));
> (Looking at it again, a check for 'vcpu->arch.pio.count == 1' would be
>  sufficient.)
I prefer the checks that are there now after your last review,
especially since surrounded by BUG_ON they only run on debug kernels.

>
>> +
>> +	memcpy(&new_rax, vcpu, sizeof(new_rax));
>> +	trace_kvm_pio(KVM_PIO_IN, vcpu->arch.pio.port, vcpu->arch.pio.size,
>> +		      vcpu->arch.pio.count, vcpu->arch.pio_data);
>> +	kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
>> +	vcpu->arch.pio.count = 0;
> I think it is better to call emulator_pio_in_emulated directly, like
>
>    	emulator_pio_in_out(&vcpu->arch.emulate_ctxt, vcpu->arch.pio.size,
>    			vcpu->arch.pio.port, &new_rax, 1);
>    	kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
>
> because we know that vcpu->arch.pio.count != 0.
I think two extra lines of code in my patch vs your suggestion are worth
it to a) reduce execution path length b) increase readability c) avoid
breaking the abstraction by not checking the return code d) avoid any
future bugs introduced by changes the function that would return a value
other than 1. 
>
> Refactoring could avoid the weird vcpu->ctxt->vcpu conversion.
> (A better name is always welcome.)
The pointer chasing is making me dizzy.  I'm not sure why
emulator_pio_in_emulated takes a x86_emulate_ctxt when all it does it
immediately translate that to a vcpu and never use the x86_emulate_ctxt,
why not pass the vcpu in the first place?


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3] x86: svm: use kvm_fast_pio_in()
  2015-03-03 16:44 ` Radim Krčmář
@ 2015-03-03 20:03   ` Joel Schopp
  2015-03-03 20:44     ` Radim Krčmář
  0 siblings, 1 reply; 9+ messages in thread
From: Joel Schopp @ 2015-03-03 20:03 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Gleb Natapov, Paolo Bonzini, kvm, David Kaplan, Joerg Roedel,
	linux-kernel, Borislav Petkov


On 03/03/2015 10:44 AM, Radim Krčmář wrote:
> 2015-03-02 15:02-0600, Joel Schopp:
>> +int kvm_fast_pio_in(struct kvm_vcpu *vcpu, int size, unsigned short port)
>> +{
>> +	unsigned long val;
>> +	int ret = emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt, size,
>> +					   port, &val, 1);
>> +
> Btw. does this return 1 in some scenario?
If a function returns a value it is always a good idea to check it and
act appropriately.  That said...
emulator_pio_in_emulated will return 1 if emulator_pio_in_out returns 1
or if vcpu->arch.pio.count != 0
emulator_pio_in_out returns 1 if kernel_pio returns 0
kernel_pio returns 0 if kvm_io_bus_read returns 0



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3] x86: svm: use kvm_fast_pio_in()
  2015-03-03 19:48   ` Joel Schopp
@ 2015-03-03 20:42     ` Radim Krčmář
  2015-04-07 12:55       ` Paolo Bonzini
  0 siblings, 1 reply; 9+ messages in thread
From: Radim Krčmář @ 2015-03-03 20:42 UTC (permalink / raw)
  To: Joel Schopp
  Cc: Gleb Natapov, Paolo Bonzini, kvm, David Kaplan, Joerg Roedel,
	linux-kernel, Borislav Petkov

2015-03-03 13:48-0600, Joel Schopp:
> >> +	unsigned long new_rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
> > Shouldn't we handle writes in EAX differently than in AX and AL, because
> > of implicit zero extension.
> I don't think the implicit zero extension hurts us here, but maybe there
> is something I'm missing that I need understand. Could you explain this
> further?

According to APM vol.2, 2.5.3 Operands and Results, when using EAX,
we should zero upper 32 bits of RAX:

  Zero Extension of Results. In 64-bit mode, when performing 32-bit
  operations with a GPR destination, the processor zero-extends the 32-bit
  result into the full 64-bit destination. Both 8-bit and 16-bit
  operations on GPRs preserve all unwritten upper bits of the destination
  GPR. This is consistent with legacy 16-bit and 32-bit semantics for
  partial-width results.

Is IN not covered?

> >> +	BUG_ON(!vcpu->arch.pio.count);
> >> +	BUG_ON(vcpu->arch.pio.count * vcpu->arch.pio.size > sizeof(new_rax));
> > (Looking at it again, a check for 'vcpu->arch.pio.count == 1' would be
> >  sufficient.)
> I prefer the checks that are there now after your last review,
> especially since surrounded by BUG_ON they only run on debug kernels.

BUG_ON is checked on essentially all kernels that run KVM.
(All distribution-based configs should have it.)

If we wanted to validate the size, then this is strictly better:
  BUG_ON(vcpu->arch.pio.count != 1 || vcpu->arch.pio.size > sizeof(new_rax))

> >> +	memcpy(&new_rax, vcpu, sizeof(new_rax));
> >> +	trace_kvm_pio(KVM_PIO_IN, vcpu->arch.pio.port, vcpu->arch.pio.size,
> >> +		      vcpu->arch.pio.count, vcpu->arch.pio_data);
> >> +	kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
> >> +	vcpu->arch.pio.count = 0;
> > I think it is better to call emulator_pio_in_emulated directly, like
> >
> >    	emulator_pio_in_out(&vcpu->arch.emulate_ctxt, vcpu->arch.pio.size,
> >    			vcpu->arch.pio.port, &new_rax, 1);
> >    	kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
> >
> > because we know that vcpu->arch.pio.count != 0.
> I think two extra lines of code in my patch vs your suggestion are worth
> it to a) reduce execution path length b) increase readability c) avoid
> breaking the abstraction by not checking the return code d) avoid any
> future bugs introduced by changes the function that would return a value
> other than 1. 

True, it is horrible, the attached patch should have addressed (c) and
(d), and it could be inlined to match (a).

Pasting the same code creates bug opportunities when we forget to modify
all places.  This class of problems can be harder to deal with, that (c)
and (d), because we can't simply print all callers.

> > Refactoring could avoid the weird vcpu->ctxt->vcpu conversion.
> > (A better name is always welcome.)
> The pointer chasing is making me dizzy.  I'm not sure why
> emulator_pio_in_emulated takes a x86_emulate_ctxt when all it does it
> immediately translate that to a vcpu and never use the x86_emulate_ctxt,
> why not pass the vcpu in the first place?

It is a part of x86_emulate_ops, where ctxt is more important ...

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3] x86: svm: use kvm_fast_pio_in()
  2015-03-03 20:03   ` Joel Schopp
@ 2015-03-03 20:44     ` Radim Krčmář
  0 siblings, 0 replies; 9+ messages in thread
From: Radim Krčmář @ 2015-03-03 20:44 UTC (permalink / raw)
  To: Joel Schopp
  Cc: Gleb Natapov, Paolo Bonzini, kvm, David Kaplan, Joerg Roedel,
	linux-kernel, Borislav Petkov

2015-03-03 14:03-0600, Joel Schopp:
> On 03/03/2015 10:44 AM, Radim Krčmář wrote:
> > 2015-03-02 15:02-0600, Joel Schopp:
> >> +	int ret = emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt, size,
> >> +					   port, &val, 1);
> > Btw. does this return 1 in some scenario?
> If a function returns a value it is always a good idea to check it and
> act appropriately.  That said...
> emulator_pio_in_emulated will return 1 if emulator_pio_in_out returns 1

Ah, I have completely forgotten, thanks!

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3] x86: svm: use kvm_fast_pio_in()
  2015-03-02 21:02 [PATCH v3] x86: svm: use kvm_fast_pio_in() Joel Schopp
  2015-03-03 16:42 ` Radim Krčmář
  2015-03-03 16:44 ` Radim Krčmář
@ 2015-03-13  0:47 ` Marcelo Tosatti
  2 siblings, 0 replies; 9+ messages in thread
From: Marcelo Tosatti @ 2015-03-13  0:47 UTC (permalink / raw)
  To: Joel Schopp
  Cc: Gleb Natapov, Paolo Bonzini, kvm, David Kaplan, rkrcmar,
	Joerg Roedel, linux-kernel, Borislav Petkov

On Mon, Mar 02, 2015 at 03:02:02PM -0600, Joel Schopp wrote:
> From: David Kaplan <David.Kaplan@amd.com>
> 
> We can make the in instruction go faster the same way the out instruction is
> already.
> 
> Changes from v2[Joel]:
> 	* changed rax from u32 to unsigned long
> 	* changed a couple return 0 to BUG_ON()
> 	* changed 8 to sizeof(new_rax)
> 	* added trace hook
> 	* removed redundant clearing of count
> Changes from v1[Joel]
> 	* Added kvm_fast_pio_in() implementation that was left out of v1
> 
> Signed-off-by: David Kaplan <David.Kaplan@amd.com>
> [extracted from larger unlrelated patch, forward ported, addressed reviews, tested]
> Signed-off-by: Joel Schopp <joel.schopp@amd.com>
> ---
>  arch/x86/include/asm/kvm_host.h |    1 +
>  arch/x86/kvm/svm.c              |    4 +++-
>  arch/x86/kvm/x86.c              |   30 ++++++++++++++++++++++++++++++
>  3 files changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index a236e39..b976824 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -931,6 +931,7 @@ int kvm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr);
>  struct x86_emulate_ctxt;
>  
>  int kvm_fast_pio_out(struct kvm_vcpu *vcpu, int size, unsigned short port);
> +int kvm_fast_pio_in(struct kvm_vcpu *vcpu, int size, unsigned short port);
>  void kvm_emulate_cpuid(struct kvm_vcpu *vcpu);
>  int kvm_emulate_halt(struct kvm_vcpu *vcpu);
>  int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu);
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index d319e0c..f8c906b 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -1899,7 +1899,7 @@ static int io_interception(struct vcpu_svm *svm)
>  	++svm->vcpu.stat.io_exits;
>  	string = (io_info & SVM_IOIO_STR_MASK) != 0;
>  	in = (io_info & SVM_IOIO_TYPE_MASK) != 0;
> -	if (string || in)
> +	if (string)
>  		return emulate_instruction(vcpu, 0) == EMULATE_DONE;
>  
>  	port = io_info >> 16;
> @@ -1907,6 +1907,8 @@ static int io_interception(struct vcpu_svm *svm)
>  	svm->next_rip = svm->vmcb->control.exit_info_2;
>  	skip_emulated_instruction(&svm->vcpu);
>  
> +	if (in)
> +		return kvm_fast_pio_in(vcpu, size, port);
>  	return kvm_fast_pio_out(vcpu, size, port);
>  }
>  
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index bd7a70b..d05efaf 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5463,6 +5463,36 @@ int kvm_fast_pio_out(struct kvm_vcpu *vcpu, int size, unsigned short port)
>  }
>  EXPORT_SYMBOL_GPL(kvm_fast_pio_out);
>  
> +static int complete_fast_pio(struct kvm_vcpu *vcpu)
> +{
> +	unsigned long new_rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
> +
> +	BUG_ON(!vcpu->arch.pio.count);
> +	BUG_ON(vcpu->arch.pio.count * vcpu->arch.pio.size > sizeof(new_rax));
> +
> +	memcpy(&new_rax, vcpu, sizeof(new_rax));

Weird.

> +	trace_kvm_pio(KVM_PIO_IN, vcpu->arch.pio.port, vcpu->arch.pio.size,
> +		      vcpu->arch.pio.count, vcpu->arch.pio_data);
> +	kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
> +	vcpu->arch.pio.count = 0;
> +	return 1;
> +}
> +
> +int kvm_fast_pio_in(struct kvm_vcpu *vcpu, int size, unsigned short port)
> +{
> +	unsigned long val;

Please zero initialize val.
Please check sanity of size.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3] x86: svm: use kvm_fast_pio_in()
  2015-03-03 20:42     ` Radim Krčmář
@ 2015-04-07 12:55       ` Paolo Bonzini
  0 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2015-04-07 12:55 UTC (permalink / raw)
  To: Radim Krčmář, Joel Schopp
  Cc: Gleb Natapov, kvm, David Kaplan, Joerg Roedel, linux-kernel,
	Borislav Petkov



On 03/03/2015 21:42, Radim Krčmář wrote:
> 2015-03-03 13:48-0600, Joel Schopp:
>>>> +	unsigned long new_rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
>>> Shouldn't we handle writes in EAX differently than in AX and AL, because
>>> of implicit zero extension.
>> I don't think the implicit zero extension hurts us here, but maybe there
>> is something I'm missing that I need understand. Could you explain this
>> further?
> 
> According to APM vol.2, 2.5.3 Operands and Results, when using EAX,
> we should zero upper 32 bits of RAX:
> 
>   Zero Extension of Results. In 64-bit mode, when performing 32-bit
>   operations with a GPR destination, the processor zero-extends the 32-bit
>   result into the full 64-bit destination. Both 8-bit and 16-bit
>   operations on GPRs preserve all unwritten upper bits of the destination
>   GPR. This is consistent with legacy 16-bit and 32-bit semantics for
>   partial-width results.
> 
> Is IN not covered?

It is.  You need to zero the upper 32 bits.

>>>> +	BUG_ON(!vcpu->arch.pio.count);
>>>> +	BUG_ON(vcpu->arch.pio.count * vcpu->arch.pio.size > sizeof(new_rax));
>>> (Looking at it again, a check for 'vcpu->arch.pio.count == 1' would be
>>>  sufficient.)
>> I prefer the checks that are there now after your last review,
>> especially since surrounded by BUG_ON they only run on debug kernels.
> 
> BUG_ON is checked on essentially all kernels that run KVM.
> (All distribution-based configs should have it.)

Correct.

> If we wanted to validate the size, then this is strictly better:
>   BUG_ON(vcpu->arch.pio.count != 1 || vcpu->arch.pio.size > sizeof(new_rax))

That would be a very weird assertion considering that
vcpu->arch.pio.size will architecturally be at most 4.

The first arm of the || is sufficient.

>>>> +	memcpy(&new_rax, vcpu, sizeof(new_rax));
>>>> +	trace_kvm_pio(KVM_PIO_IN, vcpu->arch.pio.port, vcpu->arch.pio.size,
>>>> +		      vcpu->arch.pio.count, vcpu->arch.pio_data);
>>>> +	kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
>>>> +	vcpu->arch.pio.count = 0;
>>> I think it is better to call emulator_pio_in_emulated directly, like
>>>
>>>    	emulator_pio_in_out(&vcpu->arch.emulate_ctxt, vcpu->arch.pio.size,
>>>    			vcpu->arch.pio.port, &new_rax, 1);
>>>    	kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
>>>
>>> because we know that vcpu->arch.pio.count != 0.
> 
> Pasting the same code creates bug opportunities when we forget to modify
> all places.  This class of problems can be harder to deal with, that (c)
> and (d), because we can't simply print all callers.

I agree with this and prefer calling emulator_pio_in_emulated in
complete_fast_pio_in, indeed.

>>> Refactoring could avoid the weird vcpu->ctxt->vcpu conversion.
>>> (A better name is always welcome.)

No need for that.

>> The pointer chasing is making me dizzy.  I'm not sure why
>> emulator_pio_in_emulated takes a x86_emulate_ctxt when all it does it
>> immediately translate that to a vcpu and never use the x86_emulate_ctxt,
>> why not pass the vcpu in the first place?

Because the emulator is written to be usable outside the Linux kernel as
well.

Also, the fast path (used if kernel_pio returns 0) doesn't read
VCPU_REGS_RAX, thus using an uninitialized variable here:

>>> +	unsigned long val;
>>> +	int ret = emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt, size,
>>> +					   port, &val, 1);
>>> +
>>> +	if (ret)
>>> +		kvm_register_write(vcpu, VCPU_REGS_RAX, val);

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-04-07 12:55 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-02 21:02 [PATCH v3] x86: svm: use kvm_fast_pio_in() Joel Schopp
2015-03-03 16:42 ` Radim Krčmář
2015-03-03 19:48   ` Joel Schopp
2015-03-03 20:42     ` Radim Krčmář
2015-04-07 12:55       ` Paolo Bonzini
2015-03-03 16:44 ` Radim Krčmář
2015-03-03 20:03   ` Joel Schopp
2015-03-03 20:44     ` Radim Krčmář
2015-03-13  0:47 ` Marcelo Tosatti

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).