linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] KVM: X86: Don't reset dr6 unconditionally when the vcpu being scheduled out
@ 2021-08-08 23:29 Lai Jiangshan
  2021-08-09 16:54 ` Sean Christopherson
  0 siblings, 1 reply; 15+ messages in thread
From: Lai Jiangshan @ 2021-08-08 23:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, kvm

From: Lai Jiangshan <laijs@linux.alibaba.com>

The commit efdab992813fb ("KVM: x86: fix escape of guest dr6 to the host")
fixed a bug.  It did a great job and reset dr6 unconditionally when the
vcpu being scheduled out since the linux kernel only activates breakpoints
in rescheduling (perf events).

But writing to debug registers is slow, and it can be shown in perf results
sometimes even neither the host nor the guest activate breakpoints.

It'd be better to reset it conditionally and this patch moves the code of
reseting dr6 to the path of VM-exit where we can check the related
conditions.  And the comment is also updated.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
And even in the future, the linux kernel might activate breakpoint
in interrupts (rather than in rescheduling only),  the host would
not be confused by the stale dr6 after this patch.  The possible future
author who would implement it wouldn't need to care about how the kvm
mananges debug registers since it sticks to the host's expectations.

Another solution is changing breakpoint.c and make the linux kernel
always reset dr6 in activating breakpoints.  So that dr6 is allowed
to be stale when host breakpoints is not enabled and we don't need
to reset dr6 in this case. But it would be no harm to reset it even in
both places and killing stale states is good in programing.

 arch/x86/kvm/x86.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4116567f3d44..39b5dad2dd19 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4310,12 +4310,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 
 	static_call(kvm_x86_vcpu_put)(vcpu);
 	vcpu->arch.last_host_tsc = rdtsc();
-	/*
-	 * If userspace has set any breakpoints or watchpoints, dr6 is restored
-	 * on every vmexit, but if not, we might have a stale dr6 from the
-	 * guest. do_debug expects dr6 to be cleared after it runs, do the same.
-	 */
-	set_debugreg(0, 6);
 }
 
 static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
@@ -9375,6 +9369,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	fastpath_t exit_fastpath;
 
 	bool req_immediate_exit = false;
+	bool reset_dr6 = false;
 
 	/* Forbid vmenter if vcpu dirty ring is soft-full */
 	if (unlikely(vcpu->kvm->dirty_ring_size &&
@@ -9601,6 +9596,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		set_debugreg(vcpu->arch.eff_db[3], 3);
 		set_debugreg(vcpu->arch.dr6, 6);
 		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
+		reset_dr6 = true;
 	} else if (unlikely(hw_breakpoint_active())) {
 		set_debugreg(0, 7);
 	}
@@ -9631,17 +9627,34 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		kvm_update_dr0123(vcpu);
 		kvm_update_dr7(vcpu);
 		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
+		reset_dr6 = true;
 	}
 
 	/*
 	 * If the guest has used debug registers, at least dr7
 	 * will be disabled while returning to the host.
+	 *
+	 * If we have active breakpoints in the host, restore the old state.
+	 *
 	 * If we don't have active breakpoints in the host, we don't
-	 * care about the messed up debug address registers. But if
-	 * we have some of them active, restore the old state.
+	 * care about the messed up debug address registers but dr6
+	 * which is expected to be cleared normally.  Otherwise we might
+	 * leak a stale dr6 from the guest and confuse the host since
+	 * neither the host reset dr6 on activating next breakpoints nor
+	 * the hardware clear it on dilivering #DB.  The Intel SDM says:
+	 *
+	 *   Certain debug exceptions may clear bits 0-3. The remaining
+	 *   contents of the DR6 register are never cleared by the
+	 *   processor. To avoid confusion in identifying debug
+	 *   exceptions, debug handlers should clear the register before
+	 *   returning to the interrupted task.
+	 *
+	 * Keep it simple and live up to expectations: clear DR6 immediately.
 	 */
 	if (hw_breakpoint_active())
 		hw_breakpoint_restore();
+	else if (unlikely(reset_dr6))
+		set_debugreg(DR6_RESERVED, 6);
 
 	vcpu->arch.last_vmentry_cpu = vcpu->cpu;
 	vcpu->arch.last_guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
-- 
2.19.1.6.gb485710b


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: X86: Don't reset dr6 unconditionally when the vcpu being scheduled out
  2021-08-08 23:29 [PATCH] KVM: X86: Don't reset dr6 unconditionally when the vcpu being scheduled out Lai Jiangshan
@ 2021-08-09 16:54 ` Sean Christopherson
  2021-08-09 17:43   ` [PATCH V2 1/3] KVM: X86: Remove unneeded KVM_DEBUGREG_RELOAD Lai Jiangshan
  2021-08-10  9:59   ` [PATCH] KVM: X86: Don't reset dr6 unconditionally when the vcpu being scheduled out Paolo Bonzini
  0 siblings, 2 replies; 15+ messages in thread
From: Sean Christopherson @ 2021-08-09 16:54 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: linux-kernel, Lai Jiangshan, Paolo Bonzini, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, kvm

On Mon, Aug 09, 2021, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@linux.alibaba.com>
> 
> The commit efdab992813fb ("KVM: x86: fix escape of guest dr6 to the host")
> fixed a bug.  It did a great job and reset dr6 unconditionally when the
> vcpu being scheduled out since the linux kernel only activates breakpoints
> in rescheduling (perf events).
> 
> But writing to debug registers is slow, and it can be shown in perf results
> sometimes even neither the host nor the guest activate breakpoints.
> 
> It'd be better to reset it conditionally and this patch moves the code of
> reseting dr6 to the path of VM-exit where we can check the related
> conditions.  And the comment is also updated.
> 
> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
> ---
> And even in the future, the linux kernel might activate breakpoint
> in interrupts (rather than in rescheduling only),  the host would
> not be confused by the stale dr6 after this patch.  The possible future
> author who would implement it wouldn't need to care about how the kvm
> mananges debug registers since it sticks to the host's expectations.

Eh, I don't think this is a valid argument.  KGBD already manipulates breakpoints
in NMI context, activating breakpoints from interrupt context would absolutely
require a full audit of the kernel.

> Another solution is changing breakpoint.c and make the linux kernel
> always reset dr6 in activating breakpoints.  So that dr6 is allowed
> to be stale when host breakpoints is not enabled and we don't need
> to reset dr6 in this case. But it would be no harm to reset it even in
> both places and killing stale states is good in programing.

Hmm, other than further penalizing guests that enable debug breakpoints.

What about adding a arch.switch_db_regs flag to note that DR6 is loaded with a
guest value and keeping the reset code in kvm_arch_vcpu_put()?

>  arch/x86/kvm/x86.c | 29 +++++++++++++++++++++--------
>  1 file changed, 21 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 4116567f3d44..39b5dad2dd19 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4310,12 +4310,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>  
>  	static_call(kvm_x86_vcpu_put)(vcpu);
>  	vcpu->arch.last_host_tsc = rdtsc();
> -	/*
> -	 * If userspace has set any breakpoints or watchpoints, dr6 is restored
> -	 * on every vmexit, but if not, we might have a stale dr6 from the
> -	 * guest. do_debug expects dr6 to be cleared after it runs, do the same.
> -	 */
> -	set_debugreg(0, 6);
>  }
>  
>  static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
> @@ -9375,6 +9369,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  	fastpath_t exit_fastpath;
>  
>  	bool req_immediate_exit = false;
> +	bool reset_dr6 = false;
>  
>  	/* Forbid vmenter if vcpu dirty ring is soft-full */
>  	if (unlikely(vcpu->kvm->dirty_ring_size &&
> @@ -9601,6 +9596,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  		set_debugreg(vcpu->arch.eff_db[3], 3);
>  		set_debugreg(vcpu->arch.dr6, 6);

Not directly related to this patch, but why does KVM_DEBUGREG_RELOAD exist?
Commit ae561edeb421 ("KVM: x86: DR0-DR3 are not clear on reset") added it to
ensure DR0-3 are fresh when they're modified through non-standard paths, but I
don't see any reason why the new values _must_ be loaded into hardware.  eff_db
needs to be updated, but I don't see why hardware DRs need to be updated unless
hardware breakpoints are active or DR exiting is disabled, and in those cases
updating hardware is handled by KVM_DEBUGREG_WONT_EXIT and KVM_DEBUGREG_BP_ENABLED.

Am I missing something?

>  		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
> +		reset_dr6 = true;
>  	} else if (unlikely(hw_breakpoint_active())) {
>  		set_debugreg(0, 7);
>  	}
> @@ -9631,17 +9627,34 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  		kvm_update_dr0123(vcpu);
>  		kvm_update_dr7(vcpu);
>  		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
> +		reset_dr6 = true;
>  	}
>  
>  	/*
>  	 * If the guest has used debug registers, at least dr7
>  	 * will be disabled while returning to the host.
> +	 *
> +	 * If we have active breakpoints in the host, restore the old state.
> +	 *
>  	 * If we don't have active breakpoints in the host, we don't
> -	 * care about the messed up debug address registers. But if
> -	 * we have some of them active, restore the old state.
> +	 * care about the messed up debug address registers but dr6
> +	 * which is expected to be cleared normally.  Otherwise we might
> +	 * leak a stale dr6 from the guest and confuse the host since
> +	 * neither the host reset dr6 on activating next breakpoints nor
> +	 * the hardware clear it on dilivering #DB.  The Intel SDM says:
> +	 *
> +	 *   Certain debug exceptions may clear bits 0-3. The remaining
> +	 *   contents of the DR6 register are never cleared by the
> +	 *   processor. To avoid confusion in identifying debug
> +	 *   exceptions, debug handlers should clear the register before
> +	 *   returning to the interrupted task.
> +	 *
> +	 * Keep it simple and live up to expectations: clear DR6 immediately.
>  	 */
>  	if (hw_breakpoint_active())
>  		hw_breakpoint_restore();
> +	else if (unlikely(reset_dr6))
> +		set_debugreg(DR6_RESERVED, 6);
>  
>  	vcpu->arch.last_vmentry_cpu = vcpu->cpu;
>  	vcpu->arch.last_guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
> -- 
> 2.19.1.6.gb485710b
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH V2 1/3] KVM: X86: Remove unneeded KVM_DEBUGREG_RELOAD
  2021-08-09 16:54 ` Sean Christopherson
@ 2021-08-09 17:43   ` Lai Jiangshan
  2021-08-09 17:43     ` [PATCH V2 2/3] KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT Lai Jiangshan
  2021-08-09 17:43     ` [PATCH V2 3/3] KVM: X86: Reset " Lai Jiangshan
  2021-08-10  9:59   ` [PATCH] KVM: X86: Don't reset dr6 unconditionally when the vcpu being scheduled out Paolo Bonzini
  1 sibling, 2 replies; 15+ messages in thread
From: Lai Jiangshan @ 2021-08-09 17:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, kvm

From: Lai Jiangshan <laijs@linux.alibaba.com>

Commit ae561edeb421 ("KVM: x86: DR0-DR3 are not clear on reset") added code to
ensure eff_db are updated when they're modified through non-standard paths.

But there is no reason to also update hardware DRs unless hardware breakpoints
are active or DR exiting is disabled, and in those cases updating hardware is
handled by KVM_DEBUGREG_WONT_EXIT and KVM_DEBUGREG_BP_ENABLED.

KVM_DEBUGREG_RELOAD just causes unnecesarry load of hardware DRs and is better
to be removed.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/include/asm/kvm_host.h | 1 -
 arch/x86/kvm/x86.c              | 3 ---
 2 files changed, 4 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 974cbfb1eefe..9623855a5838 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -522,7 +522,6 @@ struct kvm_pmu_ops;
 enum {
 	KVM_DEBUGREG_BP_ENABLED = 1,
 	KVM_DEBUGREG_WONT_EXIT = 2,
-	KVM_DEBUGREG_RELOAD = 4,
 };
 
 struct kvm_mtrr_range {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4116567f3d44..ad47a09ce307 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1180,7 +1180,6 @@ static void kvm_update_dr0123(struct kvm_vcpu *vcpu)
 	if (!(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP)) {
 		for (i = 0; i < KVM_NR_DB_REGS; i++)
 			vcpu->arch.eff_db[i] = vcpu->arch.db[i];
-		vcpu->arch.switch_db_regs |= KVM_DEBUGREG_RELOAD;
 	}
 }
 
@@ -9600,7 +9599,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		set_debugreg(vcpu->arch.eff_db[2], 2);
 		set_debugreg(vcpu->arch.eff_db[3], 3);
 		set_debugreg(vcpu->arch.dr6, 6);
-		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
 	} else if (unlikely(hw_breakpoint_active())) {
 		set_debugreg(0, 7);
 	}
@@ -9630,7 +9628,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		static_call(kvm_x86_sync_dirty_debug_regs)(vcpu);
 		kvm_update_dr0123(vcpu);
 		kvm_update_dr7(vcpu);
-		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
 	}
 
 	/*
-- 
2.19.1.6.gb485710b


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH V2 2/3] KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT
  2021-08-09 17:43   ` [PATCH V2 1/3] KVM: X86: Remove unneeded KVM_DEBUGREG_RELOAD Lai Jiangshan
@ 2021-08-09 17:43     ` Lai Jiangshan
  2021-08-10 10:07       ` Paolo Bonzini
  2021-08-09 17:43     ` [PATCH V2 3/3] KVM: X86: Reset " Lai Jiangshan
  1 sibling, 1 reply; 15+ messages in thread
From: Lai Jiangshan @ 2021-08-09 17:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, kvm

From: Lai Jiangshan <laijs@linux.alibaba.com>

Commit c77fb5fe6f03 ("KVM: x86: Allow the guest to run with dirty debug
registers") allows the guest accessing to DRs without exiting when
KVM_DEBUGREG_WONT_EXIT and we need to ensure that they are synchronized
on entry to the guest---including DR6 that was not synced before the commit.

But the commit sets the hardware DR6 not only when KVM_DEBUGREG_WONT_EXIT,
but also when KVM_DEBUGREG_BP_ENABLED.  The second case is unnecessary
and just leads to a more case which leaks stale DR6 to the host which has
to be resolved by unconditionally reseting DR6 in kvm_arch_vcpu_put().

We'd better to set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT,
so that we can fine-grain control the cases when we need to reset it
which is done in later patch.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/kvm/x86.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ad47a09ce307..d2aa49722064 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9598,7 +9598,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		set_debugreg(vcpu->arch.eff_db[1], 1);
 		set_debugreg(vcpu->arch.eff_db[2], 2);
 		set_debugreg(vcpu->arch.eff_db[3], 3);
-		set_debugreg(vcpu->arch.dr6, 6);
+		/* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */
+		if (vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
+			set_debugreg(vcpu->arch.dr6, 6);
 	} else if (unlikely(hw_breakpoint_active())) {
 		set_debugreg(0, 7);
 	}
-- 
2.19.1.6.gb485710b


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH V2 3/3] KVM: X86: Reset DR6 only when KVM_DEBUGREG_WONT_EXIT
  2021-08-09 17:43   ` [PATCH V2 1/3] KVM: X86: Remove unneeded KVM_DEBUGREG_RELOAD Lai Jiangshan
  2021-08-09 17:43     ` [PATCH V2 2/3] KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT Lai Jiangshan
@ 2021-08-09 17:43     ` Lai Jiangshan
  2021-08-10  9:42       ` Paolo Bonzini
  2021-08-10 10:14       ` Paolo Bonzini
  1 sibling, 2 replies; 15+ messages in thread
From: Lai Jiangshan @ 2021-08-09 17:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, kvm

From: Lai Jiangshan <laijs@linux.alibaba.com>

The commit efdab992813fb ("KVM: x86: fix escape of guest dr6 to the host")
fixed a bug by reseting DR6 unconditionally when the vcpu being scheduled out.

But writing to debug registers is slow, and it can be shown in perf results
sometimes even neither the host nor the guest activate breakpoints.

It'd be better to reset it conditionally and this patch moves the code of
reseting DR6 to the path of VM-exit and only reset it when
KVM_DEBUGREG_WONT_EXIT which is the only case that DR6 is guest value.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/kvm/x86.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d2aa49722064..f40cdd7687d8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4309,12 +4309,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 
 	static_call(kvm_x86_vcpu_put)(vcpu);
 	vcpu->arch.last_host_tsc = rdtsc();
-	/*
-	 * If userspace has set any breakpoints or watchpoints, dr6 is restored
-	 * on every vmexit, but if not, we might have a stale dr6 from the
-	 * guest. do_debug expects dr6 to be cleared after it runs, do the same.
-	 */
-	set_debugreg(0, 6);
 }
 
 static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
@@ -9630,6 +9624,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		static_call(kvm_x86_sync_dirty_debug_regs)(vcpu);
 		kvm_update_dr0123(vcpu);
 		kvm_update_dr7(vcpu);
+		/* Reset Dr6 which is guest value. */
+		set_debugreg(DR6_RESERVED, 6);
 	}
 
 	/*
-- 
2.19.1.6.gb485710b


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH V2 3/3] KVM: X86: Reset DR6 only when KVM_DEBUGREG_WONT_EXIT
  2021-08-09 17:43     ` [PATCH V2 3/3] KVM: X86: Reset " Lai Jiangshan
@ 2021-08-10  9:42       ` Paolo Bonzini
  2021-08-10 10:14       ` Paolo Bonzini
  1 sibling, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2021-08-10  9:42 UTC (permalink / raw)
  To: Lai Jiangshan, linux-kernel
  Cc: Lai Jiangshan, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, kvm

On 09/08/21 19:43, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@linux.alibaba.com>
> 
> The commit efdab992813fb ("KVM: x86: fix escape of guest dr6 to the host")
> fixed a bug by reseting DR6 unconditionally when the vcpu being scheduled out.
> 
> But writing to debug registers is slow, and it can be shown in perf results
> sometimes even neither the host nor the guest activate breakpoints.
> 
> It'd be better to reset it conditionally and this patch moves the code of
> reseting DR6 to the path of VM-exit and only reset it when
> KVM_DEBUGREG_WONT_EXIT which is the only case that DR6 is guest value.
> 
> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
> ---
>   arch/x86/kvm/x86.c | 8 ++------
>   1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index d2aa49722064..f40cdd7687d8 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4309,12 +4309,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>   
>   	static_call(kvm_x86_vcpu_put)(vcpu);
>   	vcpu->arch.last_host_tsc = rdtsc();
> -	/*
> -	 * If userspace has set any breakpoints or watchpoints, dr6 is restored
> -	 * on every vmexit, but if not, we might have a stale dr6 from the
> -	 * guest. do_debug expects dr6 to be cleared after it runs, do the same.
> -	 */
> -	set_debugreg(0, 6);
>   }
>   
>   static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
> @@ -9630,6 +9624,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>   		static_call(kvm_x86_sync_dirty_debug_regs)(vcpu);
>   		kvm_update_dr0123(vcpu);
>   		kvm_update_dr7(vcpu);
> +		/* Reset Dr6 which is guest value. */

Better keep the rationale from the original comment,

	/*
	 * do_debug expects dr6 to be cleared after it runs, so do
	 * not leave the guest value in the host DR6.
	 */

Paolo

> +		set_debugreg(DR6_RESERVED, 6);
>   	}
>   
>   	/*
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: X86: Don't reset dr6 unconditionally when the vcpu being scheduled out
  2021-08-09 16:54 ` Sean Christopherson
  2021-08-09 17:43   ` [PATCH V2 1/3] KVM: X86: Remove unneeded KVM_DEBUGREG_RELOAD Lai Jiangshan
@ 2021-08-10  9:59   ` Paolo Bonzini
  1 sibling, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2021-08-10  9:59 UTC (permalink / raw)
  To: Sean Christopherson, Lai Jiangshan
  Cc: linux-kernel, Lai Jiangshan, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, kvm

On 09/08/21 18:54, Sean Christopherson wrote:
> Not directly related to this patch, but why does KVM_DEBUGREG_RELOAD exist?
> Commit ae561edeb421 ("KVM: x86: DR0-DR3 are not clear on reset") added it to
> ensure DR0-3 are fresh when they're modified through non-standard paths, but I
> don't see any reason why the new values_must_  be loaded into hardware.  eff_db
> needs to be updated, but I don't see why hardware DRs need to be updated unless
> hardware breakpoints are active or DR exiting is disabled, and in those cases
> updating hardware is handled by KVM_DEBUGREG_WONT_EXIT and KVM_DEBUGREG_BP_ENABLED.

The original implementation of KVM_DEBUGREG_WONT_EXIT (by yours truly) 
had a bug where it did not call kvm_update_dr7 and thus 
KVM_DEBUGREG_BP_ENABLED was not set correctly.  I agree that commit 
70e4da7a8ff6 ("KVM: x86: fix root cause for missed hardware 
breakpoints") should have gotten rid of KVM_DEBUGREG_RELOAD altogether.


Paolo


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH V2 2/3] KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT
  2021-08-09 17:43     ` [PATCH V2 2/3] KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT Lai Jiangshan
@ 2021-08-10 10:07       ` Paolo Bonzini
  2021-08-10 10:30         ` Lai Jiangshan
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2021-08-10 10:07 UTC (permalink / raw)
  To: Lai Jiangshan, linux-kernel
  Cc: Lai Jiangshan, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, kvm

On 09/08/21 19:43, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@linux.alibaba.com>
> 
> Commit c77fb5fe6f03 ("KVM: x86: Allow the guest to run with dirty debug
> registers") allows the guest accessing to DRs without exiting when
> KVM_DEBUGREG_WONT_EXIT and we need to ensure that they are synchronized
> on entry to the guest---including DR6 that was not synced before the commit.
> 
> But the commit sets the hardware DR6 not only when KVM_DEBUGREG_WONT_EXIT,
> but also when KVM_DEBUGREG_BP_ENABLED.  The second case is unnecessary
> and just leads to a more case which leaks stale DR6 to the host which has
> to be resolved by unconditionally reseting DR6 in kvm_arch_vcpu_put().
> 
> We'd better to set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT,
> so that we can fine-grain control the cases when we need to reset it
> which is done in later patch.
> 
> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
> ---
>   arch/x86/kvm/x86.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ad47a09ce307..d2aa49722064 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9598,7 +9598,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>   		set_debugreg(vcpu->arch.eff_db[1], 1);
>   		set_debugreg(vcpu->arch.eff_db[2], 2);
>   		set_debugreg(vcpu->arch.eff_db[3], 3);
> -		set_debugreg(vcpu->arch.dr6, 6);
> +		/* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */
> +		if (vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
> +			set_debugreg(vcpu->arch.dr6, 6);
>   	} else if (unlikely(hw_breakpoint_active())) {
>   		set_debugreg(0, 7);
>   	}
> 

Even better, this should be moved to vmx.c's vcpu_enter_guest.  This
matches the handling in svm.c:

         /*
          * Run with all-zero DR6 unless needed, so that we can get the exact cause
          * of a #DB.
          */
         if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT))
                 svm_set_dr6(svm, vcpu->arch.dr6);
         else
                 svm_set_dr6(svm, DR6_ACTIVE_LOW);

That is,

     KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT
     
     Commit c77fb5fe6f03 ("KVM: x86: Allow the guest to run with dirty debug
     registers") allows the guest accessing to DRs without exiting when
     KVM_DEBUGREG_WONT_EXIT and we need to ensure that they are synchronized
     on entry to the guest---including DR6 that was not synced before the commit.
     
     But the commit sets the hardware DR6 not only when KVM_DEBUGREG_WONT_EXIT,
     but also when KVM_DEBUGREG_BP_ENABLED.  The second case is unnecessary
     and just leads to a more case which leaks stale DR6 to the host which has
     to be resolved by unconditionally reseting DR6 in kvm_arch_vcpu_put().
     
     Even if KVM_DEBUGREG_WONT_EXIT, however, setting the host DR6 only matters
     on VMX because SVM always uses the DR6 value from the VMCB.  So move this
     line to vmx.c and make it conditional on KVM_DEBUGREG_WONT_EXIT.
     
     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ae8e62df16dd..21a3ef3012cf 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6625,6 +6625,10 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
  		vmx->loaded_vmcs->host_state.cr4 = cr4;
  	}
  
+	/* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */
+	if (vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
+		set_debugreg(vcpu->arch.dr6, 6);
+
  	/* When single-stepping over STI and MOV SS, we must clear the
  	 * corresponding interruptibility bits in the guest state. Otherwise
  	 * vmentry fails as it then expects bit 14 (BS) in pending debug
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a111899ab2b4..fbc536b21585 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9597,7 +9597,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
  		set_debugreg(vcpu->arch.eff_db[1], 1);
  		set_debugreg(vcpu->arch.eff_db[2], 2);
  		set_debugreg(vcpu->arch.eff_db[3], 3);
-		set_debugreg(vcpu->arch.dr6, 6);
  	} else if (unlikely(hw_breakpoint_active())) {
  		set_debugreg(0, 7);
  	}

Paolo


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH V2 3/3] KVM: X86: Reset DR6 only when KVM_DEBUGREG_WONT_EXIT
  2021-08-09 17:43     ` [PATCH V2 3/3] KVM: X86: Reset " Lai Jiangshan
  2021-08-10  9:42       ` Paolo Bonzini
@ 2021-08-10 10:14       ` Paolo Bonzini
  2021-08-10 10:34         ` Lai Jiangshan
  1 sibling, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2021-08-10 10:14 UTC (permalink / raw)
  To: Lai Jiangshan, linux-kernel
  Cc: Lai Jiangshan, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, kvm

On 09/08/21 19:43, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@linux.alibaba.com>
> 
> The commit efdab992813fb ("KVM: x86: fix escape of guest dr6 to the host")
> fixed a bug by reseting DR6 unconditionally when the vcpu being scheduled out.
> 
> But writing to debug registers is slow, and it can be shown in perf results
> sometimes even neither the host nor the guest activate breakpoints.
> 
> It'd be better to reset it conditionally and this patch moves the code of
> reseting DR6 to the path of VM-exit and only reset it when
> KVM_DEBUGREG_WONT_EXIT which is the only case that DR6 is guest value.
> 
> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
> ---
>   arch/x86/kvm/x86.c | 8 ++------
>   1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index d2aa49722064..f40cdd7687d8 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4309,12 +4309,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>   
>   	static_call(kvm_x86_vcpu_put)(vcpu);
>   	vcpu->arch.last_host_tsc = rdtsc();
> -	/*
> -	 * If userspace has set any breakpoints or watchpoints, dr6 is restored
> -	 * on every vmexit, but if not, we might have a stale dr6 from the
> -	 * guest. do_debug expects dr6 to be cleared after it runs, do the same.
> -	 */
> -	set_debugreg(0, 6);
>   }
>   
>   static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
> @@ -9630,6 +9624,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>   		static_call(kvm_x86_sync_dirty_debug_regs)(vcpu);
>   		kvm_update_dr0123(vcpu);
>   		kvm_update_dr7(vcpu);
> +		/* Reset Dr6 which is guest value. */
> +		set_debugreg(DR6_RESERVED, 6);
>   	}
>   
>   	/*
> 

... and this should also be done exclusively for VMX, in vmx_sync_dirty_debug_regs:

     KVM: VMX: Reset DR6 only when KVM_DEBUGREG_WONT_EXIT
     
     The commit efdab992813fb ("KVM: x86: fix escape of guest dr6 to the host")
     fixed a bug by resetting DR6 unconditionally when the vcpu being scheduled out.
     
     But writing to debug registers is slow, and it can be visible in perf results
     sometimes, even if neither the host nor the guest activate breakpoints.
     
     Since KVM_DEBUGREG_WONT_EXIT on Intel processors is the only case
     where DR6 gets the guest value, and it never happens at all on SVM,
     the register can be cleared in vmx.c right after reading it.
     
     Reported-by: Lai Jiangshan <laijs@linux.alibaba.com>
     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 21a3ef3012cf..3a91302d05c0 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5110,6 +5110,12 @@ static void vmx_sync_dirty_debug_regs(struct kvm_vcpu *vcpu)
  
  	vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_WONT_EXIT;
  	exec_controls_setbit(to_vmx(vcpu), CPU_BASED_MOV_DR_EXITING);
+
+	/*
+	 * do_debug expects dr6 to be cleared after it runs, avoid that it sees
+	 * a stale dr6 from the guest.
+	 */
+	set_debugreg(DR6_RESERVED, 6);
  }
  
  static void vmx_set_dr7(struct kvm_vcpu *vcpu, unsigned long val)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fbc536b21585..04c393551fb0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4313,12 +4313,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
  
  	static_call(kvm_x86_vcpu_put)(vcpu);
  	vcpu->arch.last_host_tsc = rdtsc();
-	/*
-	 * If userspace has set any breakpoints or watchpoints, dr6 is restored
-	 * on every vmexit, but if not, we might have a stale dr6 from the
-	 * guest. do_debug expects dr6 to be cleared after it runs, do the same.
-	 */
-	set_debugreg(0, 6);
  }
  
  static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,


Paolo


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH V2 2/3] KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT
  2021-08-10 10:07       ` Paolo Bonzini
@ 2021-08-10 10:30         ` Lai Jiangshan
  2021-08-10 10:35           ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Lai Jiangshan @ 2021-08-10 10:30 UTC (permalink / raw)
  To: Paolo Bonzini, Lai Jiangshan, linux-kernel
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, kvm



On 2021/8/10 18:07, Paolo Bonzini wrote:
> On 09/08/21 19:43, Lai Jiangshan wrote:
>> From: Lai Jiangshan <laijs@linux.alibaba.com>
>>
>> Commit c77fb5fe6f03 ("KVM: x86: Allow the guest to run with dirty debug
>> registers") allows the guest accessing to DRs without exiting when
>> KVM_DEBUGREG_WONT_EXIT and we need to ensure that they are synchronized
>> on entry to the guest---including DR6 that was not synced before the commit.
>>
>> But the commit sets the hardware DR6 not only when KVM_DEBUGREG_WONT_EXIT,
>> but also when KVM_DEBUGREG_BP_ENABLED.  The second case is unnecessary
>> and just leads to a more case which leaks stale DR6 to the host which has
>> to be resolved by unconditionally reseting DR6 in kvm_arch_vcpu_put().
>>
>> We'd better to set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT,
>> so that we can fine-grain control the cases when we need to reset it
>> which is done in later patch.
>>
>> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
>> ---
>>   arch/x86/kvm/x86.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index ad47a09ce307..d2aa49722064 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -9598,7 +9598,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>           set_debugreg(vcpu->arch.eff_db[1], 1);
>>           set_debugreg(vcpu->arch.eff_db[2], 2);
>>           set_debugreg(vcpu->arch.eff_db[3], 3);
>> -        set_debugreg(vcpu->arch.dr6, 6);
>> +        /* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */
>> +        if (vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
>> +            set_debugreg(vcpu->arch.dr6, 6);
>>       } else if (unlikely(hw_breakpoint_active())) {
>>           set_debugreg(0, 7);
>>       }
>>
> 
> Even better, this should be moved to vmx.c's vcpu_enter_guest.  This
> matches the handling in svm.c:
> 
>          /*
>           * Run with all-zero DR6 unless needed, so that we can get the exact cause
>           * of a #DB.
>           */
>          if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT))
>                  svm_set_dr6(svm, vcpu->arch.dr6);
>          else
>                  svm_set_dr6(svm, DR6_ACTIVE_LOW);
> 
> That is,
> 
>      KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT
>      Commit c77fb5fe6f03 ("KVM: x86: Allow the guest to run with dirty debug
>      registers") allows the guest accessing to DRs without exiting when
>      KVM_DEBUGREG_WONT_EXIT and we need to ensure that they are synchronized
>      on entry to the guest---including DR6 that was not synced before the commit.
>      But the commit sets the hardware DR6 not only when KVM_DEBUGREG_WONT_EXIT,
>      but also when KVM_DEBUGREG_BP_ENABLED.  The second case is unnecessary
>      and just leads to a more case which leaks stale DR6 to the host which has
>      to be resolved by unconditionally reseting DR6 in kvm_arch_vcpu_put().
>      Even if KVM_DEBUGREG_WONT_EXIT, however, setting the host DR6 only matters
>      on VMX because SVM always uses the DR6 value from the VMCB.  So move this
>      line to vmx.c and make it conditional on KVM_DEBUGREG_WONT_EXIT.
>      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index ae8e62df16dd..21a3ef3012cf 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6625,6 +6625,10 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
>           vmx->loaded_vmcs->host_state.cr4 = cr4;
>       }
> 
> +    /* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */
> +    if (vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
> +        set_debugreg(vcpu->arch.dr6, 6);


I also noticed the related code in svm.c, but I refrained myself
to add a new branch in vmx_vcpu_run().  But after I see you put
the code of resetting dr6 in vmx_sync_dirty_debug_regs(), the whole
solution is much clean and better.

And if any chance you are also concern about the additional branch,
could you add a new callback to set dr6 and call the callback from
x86.c when KVM_DEBUGREG_WONT_EXIT.

The possible implementation of the callback:
for vmx: set_debugreg(vcpu->arch.dr6, 6);
for svm: svm_set_dr6(svm, vcpu->arch.dr6);
          and always do svm_set_dr6(svm, DR6_ACTIVE_LOW); at the end of the
          svm_handle_exit().

Thanks
Lai

> +
>       /* When single-stepping over STI and MOV SS, we must clear the
>        * corresponding interruptibility bits in the guest state. Otherwise
>        * vmentry fails as it then expects bit 14 (BS) in pending debug
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a111899ab2b4..fbc536b21585 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9597,7 +9597,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>           set_debugreg(vcpu->arch.eff_db[1], 1);
>           set_debugreg(vcpu->arch.eff_db[2], 2);
>           set_debugreg(vcpu->arch.eff_db[3], 3);
> -        set_debugreg(vcpu->arch.dr6, 6);
>       } else if (unlikely(hw_breakpoint_active())) {
>           set_debugreg(0, 7);
>       }
> 
> Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH V2 3/3] KVM: X86: Reset DR6 only when KVM_DEBUGREG_WONT_EXIT
  2021-08-10 10:14       ` Paolo Bonzini
@ 2021-08-10 10:34         ` Lai Jiangshan
  2021-08-10 10:41           ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Lai Jiangshan @ 2021-08-10 10:34 UTC (permalink / raw)
  To: Paolo Bonzini, Lai Jiangshan, linux-kernel
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, kvm



On 2021/8/10 18:14, Paolo Bonzini wrote:
> On 09/08/21 19:43, Lai Jiangshan wrote:
>> From: Lai Jiangshan <laijs@linux.alibaba.com>
>>
>> The commit efdab992813fb ("KVM: x86: fix escape of guest dr6 to the host")
>> fixed a bug by reseting DR6 unconditionally when the vcpu being scheduled out.
>>
>> But writing to debug registers is slow, and it can be shown in perf results
>> sometimes even neither the host nor the guest activate breakpoints.
>>
>> It'd be better to reset it conditionally and this patch moves the code of
>> reseting DR6 to the path of VM-exit and only reset it when
>> KVM_DEBUGREG_WONT_EXIT which is the only case that DR6 is guest value.
>>
>> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
>> ---
>>   arch/x86/kvm/x86.c | 8 ++------
>>   1 file changed, 2 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index d2aa49722064..f40cdd7687d8 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -4309,12 +4309,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>       static_call(kvm_x86_vcpu_put)(vcpu);
>>       vcpu->arch.last_host_tsc = rdtsc();
>> -    /*
>> -     * If userspace has set any breakpoints or watchpoints, dr6 is restored
>> -     * on every vmexit, but if not, we might have a stale dr6 from the
>> -     * guest. do_debug expects dr6 to be cleared after it runs, do the same.
>> -     */
>> -    set_debugreg(0, 6);
>>   }
>>   static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
>> @@ -9630,6 +9624,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>           static_call(kvm_x86_sync_dirty_debug_regs)(vcpu);
>>           kvm_update_dr0123(vcpu);
>>           kvm_update_dr7(vcpu);
>> +        /* Reset Dr6 which is guest value. */
>> +        set_debugreg(DR6_RESERVED, 6);
>>       }
>>       /*
>>
> 
> ... and this should also be done exclusively for VMX, in vmx_sync_dirty_debug_regs:
> 
>      KVM: VMX: Reset DR6 only when KVM_DEBUGREG_WONT_EXIT
>      The commit efdab992813fb ("KVM: x86: fix escape of guest dr6 to the host")
>      fixed a bug by resetting DR6 unconditionally when the vcpu being scheduled out.
>      But writing to debug registers is slow, and it can be visible in perf results
>      sometimes, even if neither the host nor the guest activate breakpoints.
>      Since KVM_DEBUGREG_WONT_EXIT on Intel processors is the only case
>      where DR6 gets the guest value, and it never happens at all on SVM,
>      the register can be cleared in vmx.c right after reading it.
>      Reported-by: Lai Jiangshan <laijs@linux.alibaba.com>
>      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 21a3ef3012cf..3a91302d05c0 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -5110,6 +5110,12 @@ static void vmx_sync_dirty_debug_regs(struct kvm_vcpu *vcpu)
> 
>       vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_WONT_EXIT;
>       exec_controls_setbit(to_vmx(vcpu), CPU_BASED_MOV_DR_EXITING);
> +
> +    /*
> +     * do_debug expects dr6 to be cleared after it runs, avoid that it sees
> +     * a stale dr6 from the guest.
> +     */


do_debug() is renamed. Maybe you can use "The host kernel #DB handler".


> +    set_debugreg(DR6_RESERVED, 6);
>   }
> 
>   static void vmx_set_dr7(struct kvm_vcpu *vcpu, unsigned long val)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index fbc536b21585..04c393551fb0 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4313,12 +4313,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> 
>       static_call(kvm_x86_vcpu_put)(vcpu);
>       vcpu->arch.last_host_tsc = rdtsc();
> -    /*
> -     * If userspace has set any breakpoints or watchpoints, dr6 is restored
> -     * on every vmexit, but if not, we might have a stale dr6 from the
> -     * guest. do_debug expects dr6 to be cleared after it runs, do the same.
> -     */
> -    set_debugreg(0, 6);
>   }
> 
>   static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
> 
> 
> Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH V2 2/3] KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT
  2021-08-10 10:30         ` Lai Jiangshan
@ 2021-08-10 10:35           ` Paolo Bonzini
  2021-08-10 10:46             ` Lai Jiangshan
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2021-08-10 10:35 UTC (permalink / raw)
  To: Lai Jiangshan, Lai Jiangshan, linux-kernel
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, kvm

On 10/08/21 12:30, Lai Jiangshan wrote:
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index ae8e62df16dd..21a3ef3012cf 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -6625,6 +6625,10 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu 
>> *vcpu)
>>           vmx->loaded_vmcs->host_state.cr4 = cr4;
>>       }
>>
>> +    /* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */
>> +    if (vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
>> +        set_debugreg(vcpu->arch.dr6, 6);
> 
> 
> I also noticed the related code in svm.c, but I refrained myself
> to add a new branch in vmx_vcpu_run().  But after I see you put
> the code of resetting dr6 in vmx_sync_dirty_debug_regs(), the whole
> solution is much clean and better.
> 
> And if any chance you are also concern about the additional branch,
> could you add a new callback to set dr6 and call the callback from
> x86.c when KVM_DEBUGREG_WONT_EXIT.

The extra branch should be well predicted, and the idea you sketched 
below would cause DR6 to be marked uselessly as dirty in SVM, so I think 
this is cleaner.  Let's add an "unlikely" around it too.

Paolo

> The possible implementation of the callback:
> for vmx: set_debugreg(vcpu->arch.dr6, 6);
> for svm: svm_set_dr6(svm, vcpu->arch.dr6);
>           and always do svm_set_dr6(svm, DR6_ACTIVE_LOW); at the end of the
>           svm_handle_exit().
> 
> Thanks
> Lai
> 
>> +
>>       /* When single-stepping over STI and MOV SS, we must clear the
>>        * corresponding interruptibility bits in the guest state. 
>> Otherwise
>>        * vmentry fails as it then expects bit 14 (BS) in pending debug
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index a111899ab2b4..fbc536b21585 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -9597,7 +9597,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>           set_debugreg(vcpu->arch.eff_db[1], 1);
>>           set_debugreg(vcpu->arch.eff_db[2], 2);
>>           set_debugreg(vcpu->arch.eff_db[3], 3);
>> -        set_debugreg(vcpu->arch.dr6, 6);
>>       } else if (unlikely(hw_breakpoint_active())) {
>>           set_debugreg(0, 7);
>>       }
>>
>> Paolo
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH V2 3/3] KVM: X86: Reset DR6 only when KVM_DEBUGREG_WONT_EXIT
  2021-08-10 10:34         ` Lai Jiangshan
@ 2021-08-10 10:41           ` Paolo Bonzini
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2021-08-10 10:41 UTC (permalink / raw)
  To: Lai Jiangshan, Lai Jiangshan, linux-kernel
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, kvm

On 10/08/21 12:34, Lai Jiangshan wrote:
>>
>> +     * do_debug expects dr6 to be cleared after it runs, avoid that 
>> it sees
>> +     * a stale dr6 from the guest.
>> +     */
> 
> 
> do_debug() is renamed. Maybe you can use "The host kernel #DB handler".

Or exc_debug.

Paolo


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH V2 2/3] KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT
  2021-08-10 10:35           ` Paolo Bonzini
@ 2021-08-10 10:46             ` Lai Jiangshan
  2021-08-10 12:49               ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Lai Jiangshan @ 2021-08-10 10:46 UTC (permalink / raw)
  To: Paolo Bonzini, Lai Jiangshan, linux-kernel
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, kvm



On 2021/8/10 18:35, Paolo Bonzini wrote:
> On 10/08/21 12:30, Lai Jiangshan wrote:
>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>>> index ae8e62df16dd..21a3ef3012cf 100644
>>> --- a/arch/x86/kvm/vmx/vmx.c
>>> +++ b/arch/x86/kvm/vmx/vmx.c
>>> @@ -6625,6 +6625,10 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
>>>           vmx->loaded_vmcs->host_state.cr4 = cr4;
>>>       }
>>>
>>> +    /* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */
>>> +    if (vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
>>> +        set_debugreg(vcpu->arch.dr6, 6);
>>
>>
>> I also noticed the related code in svm.c, but I refrained myself
>> to add a new branch in vmx_vcpu_run().  But after I see you put
>> the code of resetting dr6 in vmx_sync_dirty_debug_regs(), the whole
>> solution is much clean and better.
>>
>> And if any chance you are also concern about the additional branch,
>> could you add a new callback to set dr6 and call the callback from
>> x86.c when KVM_DEBUGREG_WONT_EXIT.
> 
> The extra branch should be well predicted, and the idea you sketched below would cause DR6 to be marked uselessly as 
> dirty in SVM, so I think this is cleaner.  Let's add an "unlikely" around it too.

I'm OK with it. But I don't think the sketched idea would cause DR6 to be marked uselessly as dirty in SVM. It doesn't 
mark it dirty if the value is unchanged, and the value is always DR6_ACTIVE_LOW except when it just clears 
KVM_DEBUGREG_WONT_EXIT.

> 
> Paolo
> 
>> The possible implementation of the callback:
>> for vmx: set_debugreg(vcpu->arch.dr6, 6);
>> for svm: svm_set_dr6(svm, vcpu->arch.dr6);
>>           and always do svm_set_dr6(svm, DR6_ACTIVE_LOW); at the end of the
>>           svm_handle_exit().
>>
>> Thanks
>> Lai
>>
>>> +
>>>       /* When single-stepping over STI and MOV SS, we must clear the
>>>        * corresponding interruptibility bits in the guest state. Otherwise
>>>        * vmentry fails as it then expects bit 14 (BS) in pending debug
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index a111899ab2b4..fbc536b21585 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -9597,7 +9597,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>           set_debugreg(vcpu->arch.eff_db[1], 1);
>>>           set_debugreg(vcpu->arch.eff_db[2], 2);
>>>           set_debugreg(vcpu->arch.eff_db[3], 3);
>>> -        set_debugreg(vcpu->arch.dr6, 6);
>>>       } else if (unlikely(hw_breakpoint_active())) {
>>>           set_debugreg(0, 7);
>>>       }
>>>
>>> Paolo
>>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH V2 2/3] KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT
  2021-08-10 10:46             ` Lai Jiangshan
@ 2021-08-10 12:49               ` Paolo Bonzini
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2021-08-10 12:49 UTC (permalink / raw)
  To: Lai Jiangshan, Lai Jiangshan, linux-kernel
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, kvm

On 10/08/21 12:46, Lai Jiangshan wrote:
> I'm OK with it. But I don't think the sketched idea would cause DR6 to 
> be marked uselessly as dirty in SVM. It doesn't mark it dirty if the 
> value is unchanged, and the value is always DR6_ACTIVE_LOW except when 
> it just clears KVM_DEBUGREG_WONT_EXIT.

It would be marked dirty if it is not DR6_ACTIVE_LOW, because it would 
be set first to DR6_ACTIVE_LOW in svm_handle_exit and then set back to 
the guest value on the next entry.

Paolo


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-08-10 12:50 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-08 23:29 [PATCH] KVM: X86: Don't reset dr6 unconditionally when the vcpu being scheduled out Lai Jiangshan
2021-08-09 16:54 ` Sean Christopherson
2021-08-09 17:43   ` [PATCH V2 1/3] KVM: X86: Remove unneeded KVM_DEBUGREG_RELOAD Lai Jiangshan
2021-08-09 17:43     ` [PATCH V2 2/3] KVM: X86: Set the hardware DR6 only when KVM_DEBUGREG_WONT_EXIT Lai Jiangshan
2021-08-10 10:07       ` Paolo Bonzini
2021-08-10 10:30         ` Lai Jiangshan
2021-08-10 10:35           ` Paolo Bonzini
2021-08-10 10:46             ` Lai Jiangshan
2021-08-10 12:49               ` Paolo Bonzini
2021-08-09 17:43     ` [PATCH V2 3/3] KVM: X86: Reset " Lai Jiangshan
2021-08-10  9:42       ` Paolo Bonzini
2021-08-10 10:14       ` Paolo Bonzini
2021-08-10 10:34         ` Lai Jiangshan
2021-08-10 10:41           ` Paolo Bonzini
2021-08-10  9:59   ` [PATCH] KVM: X86: Don't reset dr6 unconditionally when the vcpu being scheduled out Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).