All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/7] x86/pv: #DB vs %dr6 fixes, part 2
@ 2023-09-15 20:36 Andrew Cooper
  2023-09-15 20:36 ` [PATCH 1/7] x86/emul: ASSERT that X86EMUL_DONE doesn't escape to callers Andrew Cooper
                   ` (7 more replies)
  0 siblings, 8 replies; 31+ messages in thread
From: Andrew Cooper @ 2023-09-15 20:36 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu, Jinoh Kang

This time with a bit of sanity testing.  See patches for details.

Andrew Cooper (7):
  x86/emul: ASSERT that X86EMUL_DONE doesn't escape to callers
  x86/emul: Fix and extend #DB trap handling
  x86/pv: Fix the determiniation of whether to inject #DB
  x86/pv: Drop priv_op_ctxt.bpmatch and use pending_dbg instead
  x86: Introduce x86_merge_dr6()
  x86: Extend x86_event with a pending_dbg field
  x86/pv: Rewrite %dr6 handling

 xen/arch/x86/debug.c                   | 20 +++++++++++++++++
 xen/arch/x86/include/asm/debugreg.h    |  7 ++++++
 xen/arch/x86/include/asm/domain.h      | 18 ++++++++++++++--
 xen/arch/x86/include/asm/hvm/hvm.h     |  3 ++-
 xen/arch/x86/include/asm/x86-defns.h   |  7 ++++++
 xen/arch/x86/pv/emul-priv-op.c         | 30 +++++++++++++-------------
 xen/arch/x86/pv/emulate.c              |  9 ++++++--
 xen/arch/x86/pv/ro-page-fault.c        |  4 ++--
 xen/arch/x86/pv/traps.c                | 17 +++++++++++----
 xen/arch/x86/traps.c                   | 23 ++++++++++----------
 xen/arch/x86/x86_emulate/x86_emulate.c | 26 ++++++++++++++++------
 xen/arch/x86/x86_emulate/x86_emulate.h | 19 ++++++++++++----
 12 files changed, 134 insertions(+), 49 deletions(-)

-- 
2.30.2



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 1/7] x86/emul: ASSERT that X86EMUL_DONE doesn't escape to callers
  2023-09-15 20:36 [PATCH v2 0/7] x86/pv: #DB vs %dr6 fixes, part 2 Andrew Cooper
@ 2023-09-15 20:36 ` Andrew Cooper
  2023-09-18 11:17   ` Jan Beulich
  2023-09-15 20:36 ` [PATCH 2/7] x86/emul: Fix and extend #DB trap handling Andrew Cooper
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Andrew Cooper @ 2023-09-15 20:36 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu, Jinoh Kang

This property is far from clear.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Jinoh Kang <jinoh.kang.kr@gmail.com>

v2:
 * New
---
 xen/arch/x86/x86_emulate/x86_emulate.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index e88245eae9fb..94caec1d142c 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -8651,6 +8651,12 @@ int x86_emulate_wrapper(
 
     rc = x86_emulate(ctxt, ops);
 
+    /*
+     * X86EMUL_DONE is an internal signal in the emulator, and is not expected
+     * to ever escape out to callers.
+     */
+    ASSERT(rc != X86EMUL_DONE);
+
     /*
      * Most retire flags should only be set for successful instruction
      * emulation.
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 2/7] x86/emul: Fix and extend #DB trap handling
  2023-09-15 20:36 [PATCH v2 0/7] x86/pv: #DB vs %dr6 fixes, part 2 Andrew Cooper
  2023-09-15 20:36 ` [PATCH 1/7] x86/emul: ASSERT that X86EMUL_DONE doesn't escape to callers Andrew Cooper
@ 2023-09-15 20:36 ` Andrew Cooper
  2023-09-18  9:24   ` Andrew Cooper
  2023-09-18 11:28   ` Jan Beulich
  2023-09-15 20:36 ` [PATCH 3/7] x86/pv: Fix the determiniation of whether to inject #DB Andrew Cooper
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 31+ messages in thread
From: Andrew Cooper @ 2023-09-15 20:36 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu, Jinoh Kang

Lots of this is very very broken, but we need to start somewhere.

First, the bugfix.  Hooks which use X86EMUL_DONE to skip the general emulation
still need to evaluate singlestep as part of completing the instruction.
Defer the logic until X86EMUL_DONE has been converted to X86EMUL_OKAY.

Second, the improvement.  PENDING_DBG, INTERRUPTIBILITY and ACTIVITY are
internal pipeline state which Intel exposed to software in the VMCS, and AMD
exposed a subset of in the VMCB.  Importantly, bits set in PENDING_DBG can
survive across multiple instruction boundaries if e.g. delivery of #DB is
delayed by a MovSS.

For now, introduce a full pending_dbg field into the retire union.  This keeps
the sh_page_fault() and init_context() paths working but in due course the
field will want to lose the "retire" infix.

In addition, set singlestep into pending_dbg as appropriate.  Leave the old
singlestep bitfield in place until we can adjust the callers to the new
scheme.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Jinoh Kang <jinoh.kang.kr@gmail.com>

v2:
 * Only evaluate singlestep on X86EMUL_OKAY, but do so after X86EMUL_DONE has
   been adjusted to X86EMUL_OKAY.
 * Adjust comments in light of X86EMUL_DONE not getting back to callers.
---
 xen/arch/x86/x86_emulate/x86_emulate.c | 20 +++++++++++++-------
 xen/arch/x86/x86_emulate/x86_emulate.h | 14 +++++++++++---
 2 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 94caec1d142c..de7f99500e3f 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -8379,13 +8379,6 @@ x86_emulate(
     if ( !mode_64bit() )
         _regs.r(ip) = (uint32_t)_regs.r(ip);
 
-    /* Should a singlestep #DB be raised? */
-    if ( rc == X86EMUL_OKAY && singlestep && !ctxt->retire.mov_ss )
-    {
-        ctxt->retire.singlestep = true;
-        ctxt->retire.sti = false;
-    }
-
     if ( rc != X86EMUL_DONE )
         *ctxt->regs = _regs;
     else
@@ -8394,6 +8387,19 @@ x86_emulate(
         rc = X86EMUL_OKAY;
     }
 
+    /* Should a singlestep #DB be raised? */
+    if ( rc == X86EMUL_OKAY && singlestep )
+    {
+        ctxt->retire.pending_dbg |= X86_DR6_BS;
+
+        /* BROKEN - TODO, merge into pending_dbg. */
+        if ( !ctxt->retire.mov_ss )
+        {
+            ctxt->retire.singlestep = true;
+            ctxt->retire.sti = false;
+        }
+    }
+
     ctxt->regs->eflags &= ~X86_EFLAGS_RF;
 
  done:
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 698750267a90..fbc023c37e34 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -588,15 +588,23 @@ struct x86_emulate_ctxt
     /* Canonical opcode (see below) (valid only on X86EMUL_OKAY). */
     unsigned int opcode;
 
-    /* Retirement state, set by the emulator (valid only on X86EMUL_OKAY). */
+    /*
+     * Retirement state, set by the emulator (valid only on X86EMUL_OKAY).
+     *
+     * TODO: all this state should be input/output from the VMCS PENDING_DBG,
+     * INTERRUPTIBILITY and ACTIVITIY fields.
+     */
     union {
-        uint8_t raw;
+        unsigned long raw;
         struct {
+            /* Accumulated %dr6 trap bits, positive polarity. */
+            unsigned int pending_dbg;
+
             bool hlt:1;          /* Instruction HLTed. */
             bool mov_ss:1;       /* Instruction sets MOV-SS irq shadow. */
             bool sti:1;          /* Instruction sets STI irq shadow. */
             bool unblock_nmi:1;  /* Instruction clears NMI blocking. */
-            bool singlestep:1;   /* Singlestepping was active. */
+            bool singlestep:1;   /* Singlestepping was active. (TODO, merge into pending_dbg) */
         };
     } retire;
 
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 3/7] x86/pv: Fix the determiniation of whether to inject #DB
  2023-09-15 20:36 [PATCH v2 0/7] x86/pv: #DB vs %dr6 fixes, part 2 Andrew Cooper
  2023-09-15 20:36 ` [PATCH 1/7] x86/emul: ASSERT that X86EMUL_DONE doesn't escape to callers Andrew Cooper
  2023-09-15 20:36 ` [PATCH 2/7] x86/emul: Fix and extend #DB trap handling Andrew Cooper
@ 2023-09-15 20:36 ` Andrew Cooper
  2023-09-15 20:36 ` [PATCH 4/7] x86/pv: Drop priv_op_ctxt.bpmatch and use pending_dbg instead Andrew Cooper
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 31+ messages in thread
From: Andrew Cooper @ 2023-09-15 20:36 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Jan Beulich, Roger Pau Monné,
	Wei Liu, Jinoh Kang

We long ago fixed the emulator to not inject exceptions behind our back.
Therefore, assert that that a PV event (including interrupts, because that
would be buggy too) isn't pending, rather than skipping the #DB injection if
one is.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Jinoh Kang <jinoh.kang.kr@gmail.com>

v2:
 * Drop X86EMUL_DONE adjustment.
---
 xen/arch/x86/pv/emul-priv-op.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 142bc4818cb5..0d9f84f458ba 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -1358,14 +1358,17 @@ int pv_emulate_privileged_op(struct cpu_user_regs *regs)
     switch ( rc )
     {
     case X86EMUL_OKAY:
+        ASSERT(!curr->arch.pv.trap_bounce.flags);
+
         if ( ctxt.ctxt.retire.singlestep )
             ctxt.bpmatch |= DR_STEP;
+
         if ( ctxt.bpmatch )
         {
             curr->arch.dr6 |= ctxt.bpmatch | DR_STATUS_RESERVED_ONE;
-            if ( !(curr->arch.pv.trap_bounce.flags & TBF_EXCEPTION) )
-                pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC);
+            pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC);
         }
+
         /* fall through */
     case X86EMUL_RETRY:
         return EXCRET_fault_fixed;
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 4/7] x86/pv: Drop priv_op_ctxt.bpmatch and use pending_dbg instead
  2023-09-15 20:36 [PATCH v2 0/7] x86/pv: #DB vs %dr6 fixes, part 2 Andrew Cooper
                   ` (2 preceding siblings ...)
  2023-09-15 20:36 ` [PATCH 3/7] x86/pv: Fix the determiniation of whether to inject #DB Andrew Cooper
@ 2023-09-15 20:36 ` Andrew Cooper
  2023-09-16  7:36   ` Jinoh Kang
  2023-09-15 20:36 ` [PATCH 5/7] x86: Introduce x86_merge_dr6() Andrew Cooper
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Andrew Cooper @ 2023-09-15 20:36 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Jan Beulich, Roger Pau Monné,
	Wei Liu, Jinoh Kang

With a full pending_dbg field in x86_emulate_ctxt, use it rather than using a
local bpmatch field.

This simplifies the X86EMUL_OKAY path as singlestep is already accumulated by
x86_emulate() when appropriate.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Jinoh Kang <jinoh.kang.kr@gmail.com>

v2:
 * Tweak commit message to avoid referencing X86EMUL_DONE.
---
 xen/arch/x86/pv/emul-priv-op.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 0d9f84f458ba..2f73ed0682e1 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -34,7 +34,6 @@ struct priv_op_ctxt {
         unsigned long base, limit;
     } cs;
     char *io_emul_stub;
-    unsigned int bpmatch;
 };
 
 /* I/O emulation helpers.  Use non-standard calling conventions. */
@@ -367,7 +366,8 @@ static int cf_check read_io(
     if ( !guest_io_okay(port, bytes, curr, ctxt->regs) )
         return X86EMUL_UNHANDLEABLE;
 
-    poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes);
+    poc->ctxt.retire.pending_dbg |=
+        check_guest_io_breakpoint(curr, port, bytes);
 
     if ( admin_io_okay(port, bytes, currd) )
     {
@@ -472,7 +472,8 @@ static int cf_check write_io(
     if ( !guest_io_okay(port, bytes, curr, ctxt->regs) )
         return X86EMUL_UNHANDLEABLE;
 
-    poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes);
+    poc->ctxt.retire.pending_dbg |=
+        check_guest_io_breakpoint(curr, port, bytes);
 
     if ( admin_io_okay(port, bytes, currd) )
     {
@@ -636,7 +637,8 @@ static int cf_check rep_ins(
         return X86EMUL_EXCEPTION;
     }
 
-    poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes_per_rep);
+    poc->ctxt.retire.pending_dbg |=
+        check_guest_io_breakpoint(curr, port, bytes_per_rep);
 
     while ( *reps < goal )
     {
@@ -658,7 +660,7 @@ static int cf_check rep_ins(
 
         ++*reps;
 
-        if ( poc->bpmatch || hypercall_preempt_check() )
+        if ( poc->ctxt.retire.pending_dbg || hypercall_preempt_check() )
             break;
 
         /* x86_emulate() clips the repetition count to ensure we don't wrap. */
@@ -703,7 +705,8 @@ static int cf_check rep_outs(
         return X86EMUL_EXCEPTION;
     }
 
-    poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes_per_rep);
+    poc->ctxt.retire.pending_dbg |=
+        check_guest_io_breakpoint(curr, port, bytes_per_rep);
 
     while ( *reps < goal )
     {
@@ -726,7 +729,7 @@ static int cf_check rep_outs(
 
         ++*reps;
 
-        if ( poc->bpmatch || hypercall_preempt_check() )
+        if ( poc->ctxt.retire.pending_dbg || hypercall_preempt_check() )
             break;
 
         /* x86_emulate() clips the repetition count to ensure we don't wrap. */
@@ -1360,12 +1363,9 @@ int pv_emulate_privileged_op(struct cpu_user_regs *regs)
     case X86EMUL_OKAY:
         ASSERT(!curr->arch.pv.trap_bounce.flags);
 
-        if ( ctxt.ctxt.retire.singlestep )
-            ctxt.bpmatch |= DR_STEP;
-
-        if ( ctxt.bpmatch )
+        if ( ctxt.ctxt.retire.pending_dbg )
         {
-            curr->arch.dr6 |= ctxt.bpmatch | DR_STATUS_RESERVED_ONE;
+            curr->arch.dr6 |= ctxt.ctxt.retire.pending_dbg | DR_STATUS_RESERVED_ONE;
             pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC);
         }
 
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5/7] x86: Introduce x86_merge_dr6()
  2023-09-15 20:36 [PATCH v2 0/7] x86/pv: #DB vs %dr6 fixes, part 2 Andrew Cooper
                   ` (3 preceding siblings ...)
  2023-09-15 20:36 ` [PATCH 4/7] x86/pv: Drop priv_op_ctxt.bpmatch and use pending_dbg instead Andrew Cooper
@ 2023-09-15 20:36 ` Andrew Cooper
  2023-09-18 11:37   ` Jan Beulich
  2023-09-15 20:36 ` [PATCH 6/7] x86: Extend x86_event with a pending_dbg field Andrew Cooper
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Andrew Cooper @ 2023-09-15 20:36 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu, Jinoh Kang

The current logic used to update %dr6 when injecting #DB is buggy.

The SDM/APM documention on %dr6 updates is far from ideal, but does at least
make clear that it's non-trivial.  The actual behaviour is to overwrite
B{0..3} and accumulate all other bits.

Introduce x86_merge_dr6() to perform the operaton properly.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Jinoh Kang <jinoh.kang.kr@gmail.com>

v2:
 * Extend commit message
---
 xen/arch/x86/debug.c                 | 20 ++++++++++++++++++++
 xen/arch/x86/include/asm/debugreg.h  |  7 +++++++
 xen/arch/x86/include/asm/x86-defns.h |  7 +++++++
 3 files changed, 34 insertions(+)

diff --git a/xen/arch/x86/debug.c b/xen/arch/x86/debug.c
index 127fe83021cd..bfcd83ea4d0b 100644
--- a/xen/arch/x86/debug.c
+++ b/xen/arch/x86/debug.c
@@ -3,6 +3,7 @@
  * Copyright (C) 2023 XenServer.
  */
 #include <xen/kernel.h>
+#include <xen/lib.h>
 
 #include <xen/lib/x86/cpu-policy.h>
 
@@ -28,6 +29,25 @@ unsigned int x86_adj_dr6_rsvd(const struct cpu_policy *p, unsigned int dr6)
     return dr6;
 }
 
+unsigned int x86_merge_dr6(const struct cpu_policy *p, unsigned int dr6,
+                           unsigned int new)
+{
+    /* Flip dr6 to have positive polarity. */
+    dr6 ^= X86_DR6_DEFAULT;
+
+    /* Sanity check that only known values are passed in. */
+    ASSERT(!(dr6 & ~X86_DR6_KNOWN_MASK));
+    ASSERT(!(new & ~X86_DR6_KNOWN_MASK));
+
+    /* Breakpoint matches are overridden.  All other bits accumulate. */
+    dr6 = (dr6 & ~X86_DR6_BP_MASK) | new;
+
+    /* Flip dr6 back to having default polarity. */
+    dr6 ^= X86_DR6_DEFAULT;
+
+    return x86_adj_dr6_rsvd(p, dr6);
+}
+
 unsigned int x86_adj_dr7_rsvd(const struct cpu_policy *p, unsigned int dr7)
 {
     unsigned int zeros = X86_DR7_ZEROS;
diff --git a/xen/arch/x86/include/asm/debugreg.h b/xen/arch/x86/include/asm/debugreg.h
index 39ba312b84ee..e98a9ce977fa 100644
--- a/xen/arch/x86/include/asm/debugreg.h
+++ b/xen/arch/x86/include/asm/debugreg.h
@@ -89,4 +89,11 @@ struct cpu_policy;
 unsigned int x86_adj_dr6_rsvd(const struct cpu_policy *p, unsigned int dr6);
 unsigned int x86_adj_dr7_rsvd(const struct cpu_policy *p, unsigned int dr7);
 
+/*
+ * Merge new bits into dr6.  'new' is always given in positive polarity,
+ * matching the Intel VMCS PENDING_DBG semantics.
+ */
+unsigned int x86_merge_dr6(const struct cpu_policy *p, unsigned int dr6,
+                           unsigned int new);
+
 #endif /* _X86_DEBUGREG_H */
diff --git a/xen/arch/x86/include/asm/x86-defns.h b/xen/arch/x86/include/asm/x86-defns.h
index 5838631ef634..edfecc89bd08 100644
--- a/xen/arch/x86/include/asm/x86-defns.h
+++ b/xen/arch/x86/include/asm/x86-defns.h
@@ -116,6 +116,13 @@
 #define X86_DR6_BT              (_AC(1, UL) << 15)   /* Task switch                 */
 #define X86_DR6_RTM             (_AC(1, UL) << 16)   /* #DB/#BP in RTM region (INV) */
 
+#define X86_DR6_BP_MASK                                 \
+    (X86_DR6_B0 | X86_DR6_B1 | X86_DR6_B2 | X86_DR6_B3)
+
+#define X86_DR6_KNOWN_MASK                                      \
+    (X86_DR6_BP_MASK | X86_DR6_BLD | X86_DR6_BD | X86_DR6_BS |  \
+     X86_DR6_BT | X86_DR6_RTM)
+
 #define X86_DR6_ZEROS           _AC(0x00001000, UL)  /* %dr6 bits forced to 0       */
 #define X86_DR6_DEFAULT         _AC(0xffff0ff0, UL)  /* Default %dr6 value          */
 
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 6/7] x86: Extend x86_event with a pending_dbg field
  2023-09-15 20:36 [PATCH v2 0/7] x86/pv: #DB vs %dr6 fixes, part 2 Andrew Cooper
                   ` (4 preceding siblings ...)
  2023-09-15 20:36 ` [PATCH 5/7] x86: Introduce x86_merge_dr6() Andrew Cooper
@ 2023-09-15 20:36 ` Andrew Cooper
  2023-09-16  7:44   ` Jinoh Kang
  2023-09-18 11:39   ` Jan Beulich
  2023-09-15 20:36 ` [PATCH 7/7] x86/pv: Rewrite %dr6 handling Andrew Cooper
  2023-09-19 20:53 ` [PATCH v2 0/7] x86/pv: #DB vs %dr6 fixes, part 2 Andrew Cooper
  7 siblings, 2 replies; 31+ messages in thread
From: Andrew Cooper @ 2023-09-15 20:36 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu, Jinoh Kang

... using the Intel VMCS PENDING_DBG semantics, and sharing storage with cr2.

This requires working around anonymous union bugs in obsolete versions of GCC,
which in turn needs to drop unnecessary const qualifiers.

Also introduce a pv_inject_DB() wrapper use this field nicely.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Jinoh Kang <jinoh.kang.kr@gmail.com>

v2:
 * Split out of prior patch.
---
 xen/arch/x86/include/asm/domain.h      | 18 ++++++++++++++++--
 xen/arch/x86/include/asm/hvm/hvm.h     |  3 ++-
 xen/arch/x86/x86_emulate/x86_emulate.h |  5 ++++-
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h
index c2d9fc333be5..fd1f306222be 100644
--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -729,15 +729,29 @@ static inline void pv_inject_hw_exception(unsigned int vector, int errcode)
     pv_inject_event(&event);
 }
 
+static inline void pv_inject_DB(unsigned long pending_dbg)
+{
+    struct x86_event event = {
+        .vector      = X86_EXC_DB,
+        .type        = X86_EVENTTYPE_HW_EXCEPTION,
+        .error_code  = X86_EVENT_NO_EC,
+    };
+
+    event.pending_dbg = pending_dbg;
+
+    pv_inject_event(&event);
+}
+
 static inline void pv_inject_page_fault(int errcode, unsigned long cr2)
 {
-    const struct x86_event event = {
+    struct x86_event event = {
         .vector = X86_EXC_PF,
         .type = X86_EVENTTYPE_HW_EXCEPTION,
         .error_code = errcode,
-        .cr2 = cr2,
     };
 
+    event.cr2 = cr2;
+
     pv_inject_event(&event);
 }
 
diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
index 6d53713fc3a9..ea966f4429f9 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -532,9 +532,10 @@ static inline void hvm_inject_page_fault(int errcode, unsigned long cr2)
         .vector = X86_EXC_PF,
         .type = X86_EVENTTYPE_HW_EXCEPTION,
         .error_code = errcode,
-        .cr2 = cr2,
     };
 
+    event.cr2 = cr2;
+
     hvm_inject_event(&event);
 }
 
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index fbc023c37e34..e567a9b635d9 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -78,7 +78,10 @@ struct x86_event {
     uint8_t       type;         /* X86_EVENTTYPE_* */
     uint8_t       insn_len;     /* Instruction length */
     int32_t       error_code;   /* X86_EVENT_NO_EC if n/a */
-    unsigned long cr2;          /* Only for X86_EXC_PF h/w exception */
+    union {
+        unsigned long cr2;         /* #PF */
+        unsigned long pending_dbg; /* #DB (new DR6 bits, positive polarity) */
+    };
 };
 
 /*
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 7/7] x86/pv: Rewrite %dr6 handling
  2023-09-15 20:36 [PATCH v2 0/7] x86/pv: #DB vs %dr6 fixes, part 2 Andrew Cooper
                   ` (5 preceding siblings ...)
  2023-09-15 20:36 ` [PATCH 6/7] x86: Extend x86_event with a pending_dbg field Andrew Cooper
@ 2023-09-15 20:36 ` Andrew Cooper
  2023-09-16  7:50   ` Jinoh Kang
  2023-09-18 11:46   ` Jan Beulich
  2023-09-19 20:53 ` [PATCH v2 0/7] x86/pv: #DB vs %dr6 fixes, part 2 Andrew Cooper
  7 siblings, 2 replies; 31+ messages in thread
From: Andrew Cooper @ 2023-09-15 20:36 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu, Jinoh Kang

All #DB exceptions result in an update of %dr6, but this isn't handled
properly by Xen for any guest type.

Remove all ad-hoc dr6 handling, leaving it to pv_inject_event() in most cases
and using the new x86_merge_dr6() helper.

In do_debug(), swap the dr6 to pending_dbg in order to operate entirely with
positive polarity.  Among other things, this helps spot RTM/BLD in the
diagnostic message.

Drop the unconditional v->arch.dr6 adjustment.  pv_inject_event() performs the
adjustment in the common case, but retain the prior behaviour if a debugger is
attached.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Jinoh Kang <jinoh.kang.kr@gmail.com>

v2:
 * Split pieces out into an earlier patch.
 * Extend do_debug() to use pending_dbg entirely, rather than have the same
   variable used with different polarity at different times.
---
 xen/arch/x86/pv/emul-priv-op.c  |  5 +----
 xen/arch/x86/pv/emulate.c       |  9 +++++++--
 xen/arch/x86/pv/ro-page-fault.c |  4 ++--
 xen/arch/x86/pv/traps.c         | 17 +++++++++++++----
 xen/arch/x86/traps.c            | 23 +++++++++++------------
 5 files changed, 34 insertions(+), 24 deletions(-)

diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 2f73ed0682e1..fe2011f28e34 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -1364,10 +1364,7 @@ int pv_emulate_privileged_op(struct cpu_user_regs *regs)
         ASSERT(!curr->arch.pv.trap_bounce.flags);
 
         if ( ctxt.ctxt.retire.pending_dbg )
-        {
-            curr->arch.dr6 |= ctxt.ctxt.retire.pending_dbg | DR_STATUS_RESERVED_ONE;
-            pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC);
-        }
+            pv_inject_DB(ctxt.ctxt.retire.pending_dbg);
 
         /* fall through */
     case X86EMUL_RETRY:
diff --git a/xen/arch/x86/pv/emulate.c b/xen/arch/x86/pv/emulate.c
index e7a1c0a2cc4f..29425a0a00ec 100644
--- a/xen/arch/x86/pv/emulate.c
+++ b/xen/arch/x86/pv/emulate.c
@@ -71,10 +71,15 @@ void pv_emul_instruction_done(struct cpu_user_regs *regs, unsigned long rip)
 {
     regs->rip = rip;
     regs->eflags &= ~X86_EFLAGS_RF;
+
     if ( regs->eflags & X86_EFLAGS_TF )
     {
-        current->arch.dr6 |= DR_STEP | DR_STATUS_RESERVED_ONE;
-        pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC);
+        /*
+         * TODO, this should generally use TF from the start of the
+         * instruction.  It's only a latent bug for now, as this path isn't
+         * used for any instruction which modifies eflags.
+         */
+        pv_inject_DB(X86_DR6_BS);
     }
 }
 
diff --git a/xen/arch/x86/pv/ro-page-fault.c b/xen/arch/x86/pv/ro-page-fault.c
index cad28ef928ad..f6bb33556e72 100644
--- a/xen/arch/x86/pv/ro-page-fault.c
+++ b/xen/arch/x86/pv/ro-page-fault.c
@@ -389,8 +389,8 @@ int pv_ro_page_fault(unsigned long addr, struct cpu_user_regs *regs)
 
         /* Fallthrough */
     case X86EMUL_OKAY:
-        if ( ctxt.retire.singlestep )
-            pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC);
+        if ( ctxt.retire.pending_dbg )
+            pv_inject_DB(ctxt.retire.pending_dbg);
 
         /* Fallthrough */
     case X86EMUL_RETRY:
diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
index 74f333da7e1c..553b04bca956 100644
--- a/xen/arch/x86/pv/traps.c
+++ b/xen/arch/x86/pv/traps.c
@@ -13,6 +13,7 @@
 #include <xen/softirq.h>
 
 #include <asm/pv/trace.h>
+#include <asm/debugreg.h>
 #include <asm/shared.h>
 #include <asm/traps.h>
 #include <irq_vectors.h>
@@ -50,9 +51,9 @@ void pv_inject_event(const struct x86_event *event)
     tb->cs    = ti->cs;
     tb->eip   = ti->address;
 
-    if ( event->type == X86_EVENTTYPE_HW_EXCEPTION &&
-         vector == X86_EXC_PF )
+    switch ( vector | -(event->type == X86_EVENTTYPE_SW_INTERRUPT) )
     {
+    case X86_EXC_PF:
         curr->arch.pv.ctrlreg[2] = event->cr2;
         arch_set_cr2(curr, event->cr2);
 
@@ -62,9 +63,17 @@ void pv_inject_event(const struct x86_event *event)
             error_code |= PFEC_user_mode;
 
         trace_pv_page_fault(event->cr2, error_code);
-    }
-    else
+        break;
+
+    case X86_EXC_DB:
+        curr->arch.dr6 = x86_merge_dr6(curr->domain->arch.cpu_policy,
+                                       curr->arch.dr6, event->pending_dbg);
+        /* Fallthrough */
+
+    default:
         trace_pv_trap(vector, regs->rip, use_error_code, error_code);
+        break;
+    }
 
     if ( use_error_code )
     {
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index dead728ce329..447edc827b3a 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1887,11 +1887,11 @@ void do_device_not_available(struct cpu_user_regs *regs)
 /* SAF-1-safe */
 void do_debug(struct cpu_user_regs *regs)
 {
-    unsigned long dr6;
+    unsigned long pending_dbg;
     struct vcpu *v = current;
 
-    /* Stash dr6 as early as possible. */
-    dr6 = read_debugreg(6);
+    /* Stash dr6 as early as possible, operating with positive polarity. */
+    pending_dbg = read_debugreg(6) ^ X86_DR6_DEFAULT;
 
     /*
      * At the time of writing (March 2018), on the subject of %dr6:
@@ -1963,13 +1963,13 @@ void do_debug(struct cpu_user_regs *regs)
          * If however we do, safety measures need to be enacted.  Use a big
          * hammer and clear all debug settings.
          */
-        if ( dr6 & (DR_TRAP3 | DR_TRAP2 | DR_TRAP1 | DR_TRAP0) )
+        if ( pending_dbg & X86_DR6_BP_MASK )
         {
             unsigned int bp, dr7 = read_debugreg(7);
 
             for ( bp = 0; bp < 4; ++bp )
             {
-                if ( (dr6 & (1u << bp)) && /* Breakpoint triggered? */
+                if ( (pending_dbg & (1u << bp)) && /* Breakpoint triggered? */
                      (dr7 & (3u << (bp * DR_ENABLE_SIZE))) && /* Enabled? */
                      ((dr7 & (3u << ((bp * DR_CONTROL_SIZE) + /* Insn? */
                                      DR_CONTROL_SHIFT))) == DR_RW_EXECUTE) )
@@ -1990,24 +1990,23 @@ void do_debug(struct cpu_user_regs *regs)
          * so ensure the message is ratelimited.
          */
         gprintk(XENLOG_WARNING,
-                "Hit #DB in Xen context: %04x:%p [%ps], stk %04x:%p, dr6 %lx\n",
+                "Hit #DB in Xen context: %04x:%p [%ps], stk %04x:%p, pending_dbg %lx\n",
                 regs->cs, _p(regs->rip), _p(regs->rip),
-                regs->ss, _p(regs->rsp), dr6);
+                regs->ss, _p(regs->rsp), pending_dbg);
 
         return;
     }
 
-    /* Save debug status register where guest OS can peek at it */
-    v->arch.dr6 |= (dr6 & ~X86_DR6_DEFAULT);
-    v->arch.dr6 &= (dr6 | ~X86_DR6_DEFAULT);
-
     if ( guest_kernel_mode(v, regs) && v->domain->debugger_attached )
     {
+        /* Save debug status register where gdbsx can peek at it */
+        v->arch.dr6 = x86_merge_dr6(v->domain->arch.cpu_policy,
+                                    v->arch.dr6, pending_dbg);
         domain_pause_for_debugger();
         return;
     }
 
-    pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC);
+    pv_inject_DB(pending_dbg);
 }
 
 /* SAF-1-safe */
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/7] x86/pv: Drop priv_op_ctxt.bpmatch and use pending_dbg instead
  2023-09-15 20:36 ` [PATCH 4/7] x86/pv: Drop priv_op_ctxt.bpmatch and use pending_dbg instead Andrew Cooper
@ 2023-09-16  7:36   ` Jinoh Kang
  2023-09-16 14:00     ` Andrew Cooper
  0 siblings, 1 reply; 31+ messages in thread
From: Jinoh Kang @ 2023-09-16  7:36 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Jan Beulich, Roger Pau Monné, Wei Liu

On 9/16/23 05:36, Andrew Cooper wrote:
> @@ -658,7 +660,7 @@ static int cf_check rep_ins(
>  
>          ++*reps;
>  
> -        if ( poc->bpmatch || hypercall_preempt_check() )
> +        if ( poc->ctxt.retire.pending_dbg || hypercall_preempt_check() )
>              break;
>  
>          /* x86_emulate() clips the repetition count to ensure we don't wrap. */

(snip)

> @@ -726,7 +729,7 @@ static int cf_check rep_outs(
>  
>          ++*reps;
>  
> -        if ( poc->bpmatch || hypercall_preempt_check() )
> +        if ( poc->ctxt.retire.pending_dbg || hypercall_preempt_check() )
>              break;
>  
>          /* x86_emulate() clips the repetition count to ensure we don't wrap. */

These two hunks look like a behavioral change in singlestep mode.

This is actually a fix, assuming the emulator previously did not handle
'rep {in,out}s' in singlestep mode correctly, since it now checks for
PENDING_DBG.BS in addition to PENDING_DBG.B[0-4].

If this is the case, (at least) this part of the patch looks like a stable
candidate.  You might want to edit the commit message to reflect that.

(Ideally all the HWBP handling should be part of the emulator logic, but
 I don't see an easy way to generalize the PV-specific logic.  It could
 be its own patch anyway.)

-- 
Sincerely,
Jinoh Kang



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/7] x86: Extend x86_event with a pending_dbg field
  2023-09-15 20:36 ` [PATCH 6/7] x86: Extend x86_event with a pending_dbg field Andrew Cooper
@ 2023-09-16  7:44   ` Jinoh Kang
  2023-09-18 11:39   ` Jan Beulich
  1 sibling, 0 replies; 31+ messages in thread
From: Jinoh Kang @ 2023-09-16  7:44 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Jan Beulich, Roger Pau Monné, Wei Liu

On 9/16/23 05:36, Andrew Cooper wrote:
> ... using the Intel VMCS PENDING_DBG semantics, and sharing storage with cr2.
> 
> This requires working around anonymous union bugs in obsolete versions of GCC,
> which in turn needs to drop unnecessary const qualifiers.

I'd write this as a inline comment as long as it doesn't make the code
too verbose.

Please disregard this if it doesn't match our coding style.

> 
> Also introduce a pv_inject_DB() wrapper use this field nicely.

Minor nit: Also introduce a pv_inject_DB() wrapper to* use this field nicely.

> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Wei Liu <wl@xen.org>
> CC: Jinoh Kang <jinoh.kang.kr@gmail.com>
> 
> v2:
>  * Split out of prior patch.
> ---
>  xen/arch/x86/include/asm/domain.h      | 18 ++++++++++++++++--
>  xen/arch/x86/include/asm/hvm/hvm.h     |  3 ++-
>  xen/arch/x86/x86_emulate/x86_emulate.h |  5 ++++-
>  3 files changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h
> index c2d9fc333be5..fd1f306222be 100644
> --- a/xen/arch/x86/include/asm/domain.h
> +++ b/xen/arch/x86/include/asm/domain.h
> @@ -729,15 +729,29 @@ static inline void pv_inject_hw_exception(unsigned int vector, int errcode)
>      pv_inject_event(&event);
>  }
>  
> +static inline void pv_inject_DB(unsigned long pending_dbg)
> +{
> +    struct x86_event event = {
> +        .vector      = X86_EXC_DB,
> +        .type        = X86_EVENTTYPE_HW_EXCEPTION,
> +        .error_code  = X86_EVENT_NO_EC,
> +    };
> +
> +    event.pending_dbg = pending_dbg;
> +
> +    pv_inject_event(&event);
> +}
> +
>  static inline void pv_inject_page_fault(int errcode, unsigned long cr2)
>  {
> -    const struct x86_event event = {
> +    struct x86_event event = {
>          .vector = X86_EXC_PF,
>          .type = X86_EVENTTYPE_HW_EXCEPTION,
>          .error_code = errcode,
> -        .cr2 = cr2,
>      };
>  
> +    event.cr2 = cr2;
> +
>      pv_inject_event(&event);
>  }
>  
> diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
> index 6d53713fc3a9..ea966f4429f9 100644
> --- a/xen/arch/x86/include/asm/hvm/hvm.h
> +++ b/xen/arch/x86/include/asm/hvm/hvm.h
> @@ -532,9 +532,10 @@ static inline void hvm_inject_page_fault(int errcode, unsigned long cr2)
>          .vector = X86_EXC_PF,
>          .type = X86_EVENTTYPE_HW_EXCEPTION,
>          .error_code = errcode,
> -        .cr2 = cr2,
>      };
>  
> +    event.cr2 = cr2;
> +
>      hvm_inject_event(&event);
>  }
>  
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
> index fbc023c37e34..e567a9b635d9 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -78,7 +78,10 @@ struct x86_event {
>      uint8_t       type;         /* X86_EVENTTYPE_* */
>      uint8_t       insn_len;     /* Instruction length */
>      int32_t       error_code;   /* X86_EVENT_NO_EC if n/a */
> -    unsigned long cr2;          /* Only for X86_EXC_PF h/w exception */
> +    union {
> +        unsigned long cr2;         /* #PF */
> +        unsigned long pending_dbg; /* #DB (new DR6 bits, positive polarity) */
> +    };
>  };
>  
>  /*

-- 
Sincerely,
Jinoh Kang



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 7/7] x86/pv: Rewrite %dr6 handling
  2023-09-15 20:36 ` [PATCH 7/7] x86/pv: Rewrite %dr6 handling Andrew Cooper
@ 2023-09-16  7:50   ` Jinoh Kang
  2023-09-16 12:56     ` Andrew Cooper
  2023-09-18 11:46   ` Jan Beulich
  1 sibling, 1 reply; 31+ messages in thread
From: Jinoh Kang @ 2023-09-16  7:50 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Jan Beulich, Roger Pau Monné, Wei Liu

On 9/16/23 05:36, Andrew Cooper wrote:
> All #DB exceptions result in an update of %dr6, but this isn't handled
> properly by Xen for any guest type.
> 
> Remove all ad-hoc dr6 handling, leaving it to pv_inject_event() in most cases
> and using the new x86_merge_dr6() helper.
> 
> In do_debug(), swap the dr6 to pending_dbg in order to operate entirely with
> positive polarity.  Among other things, this helps spot RTM/BLD in the
> diagnostic message.
> 
> Drop the unconditional v->arch.dr6 adjustment.  pv_inject_event() performs the
> adjustment in the common case, but retain the prior behaviour if a debugger is
> attached.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Wei Liu <wl@xen.org>
> CC: Jinoh Kang <jinoh.kang.kr@gmail.com>
> 
> v2:
>  * Split pieces out into an earlier patch.
>  * Extend do_debug() to use pending_dbg entirely, rather than have the same
>    variable used with different polarity at different times.
> ---
>  xen/arch/x86/pv/emul-priv-op.c  |  5 +----
>  xen/arch/x86/pv/emulate.c       |  9 +++++++--
>  xen/arch/x86/pv/ro-page-fault.c |  4 ++--
>  xen/arch/x86/pv/traps.c         | 17 +++++++++++++----
>  xen/arch/x86/traps.c            | 23 +++++++++++------------
>  5 files changed, 34 insertions(+), 24 deletions(-)
> 
> diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
> index 2f73ed0682e1..fe2011f28e34 100644
> --- a/xen/arch/x86/pv/emul-priv-op.c
> +++ b/xen/arch/x86/pv/emul-priv-op.c
> @@ -1364,10 +1364,7 @@ int pv_emulate_privileged_op(struct cpu_user_regs *regs)
>          ASSERT(!curr->arch.pv.trap_bounce.flags);
>  
>          if ( ctxt.ctxt.retire.pending_dbg )
> -        {
> -            curr->arch.dr6 |= ctxt.ctxt.retire.pending_dbg | DR_STATUS_RESERVED_ONE;
> -            pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC);
> -        }
> +            pv_inject_DB(ctxt.ctxt.retire.pending_dbg);
>  
>          /* fall through */
>      case X86EMUL_RETRY:
> diff --git a/xen/arch/x86/pv/emulate.c b/xen/arch/x86/pv/emulate.c
> index e7a1c0a2cc4f..29425a0a00ec 100644
> --- a/xen/arch/x86/pv/emulate.c
> +++ b/xen/arch/x86/pv/emulate.c
> @@ -71,10 +71,15 @@ void pv_emul_instruction_done(struct cpu_user_regs *regs, unsigned long rip)
>  {
>      regs->rip = rip;
>      regs->eflags &= ~X86_EFLAGS_RF;
> +
>      if ( regs->eflags & X86_EFLAGS_TF )
>      {
> -        current->arch.dr6 |= DR_STEP | DR_STATUS_RESERVED_ONE;
> -        pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC);
> +        /*
> +         * TODO, this should generally use TF from the start of the
> +         * instruction.  It's only a latent bug for now, as this path isn't
> +         * used for any instruction which modifies eflags.
> +         */
> +        pv_inject_DB(X86_DR6_BS);
>      }
>  }
>  
> diff --git a/xen/arch/x86/pv/ro-page-fault.c b/xen/arch/x86/pv/ro-page-fault.c
> index cad28ef928ad..f6bb33556e72 100644
> --- a/xen/arch/x86/pv/ro-page-fault.c
> +++ b/xen/arch/x86/pv/ro-page-fault.c
> @@ -389,8 +389,8 @@ int pv_ro_page_fault(unsigned long addr, struct cpu_user_regs *regs)
>  
>          /* Fallthrough */
>      case X86EMUL_OKAY:
> -        if ( ctxt.retire.singlestep )
> -            pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC);
> +        if ( ctxt.retire.pending_dbg )
> +            pv_inject_DB(ctxt.retire.pending_dbg);
>  
>          /* Fallthrough */
>      case X86EMUL_RETRY:
> diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
> index 74f333da7e1c..553b04bca956 100644
> --- a/xen/arch/x86/pv/traps.c
> +++ b/xen/arch/x86/pv/traps.c
> @@ -13,6 +13,7 @@
>  #include <xen/softirq.h>
>  
>  #include <asm/pv/trace.h>
> +#include <asm/debugreg.h>
>  #include <asm/shared.h>
>  #include <asm/traps.h>
>  #include <irq_vectors.h>
> @@ -50,9 +51,9 @@ void pv_inject_event(const struct x86_event *event)
>      tb->cs    = ti->cs;
>      tb->eip   = ti->address;
>  
> -    if ( event->type == X86_EVENTTYPE_HW_EXCEPTION &&
> -         vector == X86_EXC_PF )
> +    switch ( vector | -(event->type == X86_EVENTTYPE_SW_INTERRUPT) )
>      {
> +    case X86_EXC_PF:
>          curr->arch.pv.ctrlreg[2] = event->cr2;
>          arch_set_cr2(curr, event->cr2);
>  
> @@ -62,9 +63,17 @@ void pv_inject_event(const struct x86_event *event)
>              error_code |= PFEC_user_mode;
>  
>          trace_pv_page_fault(event->cr2, error_code);
> -    }
> -    else
> +        break;
> +
> +    case X86_EXC_DB:
> +        curr->arch.dr6 = x86_merge_dr6(curr->domain->arch.cpu_policy,
> +                                       curr->arch.dr6, event->pending_dbg);
> +        /* Fallthrough */
> +
> +    default:
>          trace_pv_trap(vector, regs->rip, use_error_code, error_code);
> +        break;
> +    }
>  
>      if ( use_error_code )
>      {
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index dead728ce329..447edc827b3a 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1887,11 +1887,11 @@ void do_device_not_available(struct cpu_user_regs *regs)
>  /* SAF-1-safe */
>  void do_debug(struct cpu_user_regs *regs)
>  {
> -    unsigned long dr6;
> +    unsigned long pending_dbg;
>      struct vcpu *v = current;
>  
> -    /* Stash dr6 as early as possible. */
> -    dr6 = read_debugreg(6);
> +    /* Stash dr6 as early as possible, operating with positive polarity. */
> +    pending_dbg = read_debugreg(6) ^ X86_DR6_DEFAULT;

We don't reset DR6 after reading it, and there is no guarantee that the PV guest
will reset it either, so it doesn't match PENDING_DBG exactly IIRC.

On the other hand, nothing will probably care about its double-accumulating
quirk except perhaps monitor users.

Do we want to document that, shadow DR6 for PV (which seems a little overkill
if we don't trap DR6 access from PV already), or just live with that?

>  
>      /*
>       * At the time of writing (March 2018), on the subject of %dr6:
> @@ -1963,13 +1963,13 @@ void do_debug(struct cpu_user_regs *regs)
>           * If however we do, safety measures need to be enacted.  Use a big
>           * hammer and clear all debug settings.
>           */
> -        if ( dr6 & (DR_TRAP3 | DR_TRAP2 | DR_TRAP1 | DR_TRAP0) )
> +        if ( pending_dbg & X86_DR6_BP_MASK )
>          {
>              unsigned int bp, dr7 = read_debugreg(7);
>  
>              for ( bp = 0; bp < 4; ++bp )
>              {
> -                if ( (dr6 & (1u << bp)) && /* Breakpoint triggered? */
> +                if ( (pending_dbg & (1u << bp)) && /* Breakpoint triggered? */
>                       (dr7 & (3u << (bp * DR_ENABLE_SIZE))) && /* Enabled? */
>                       ((dr7 & (3u << ((bp * DR_CONTROL_SIZE) + /* Insn? */
>                                       DR_CONTROL_SHIFT))) == DR_RW_EXECUTE) )
> @@ -1990,24 +1990,23 @@ void do_debug(struct cpu_user_regs *regs)
>           * so ensure the message is ratelimited.
>           */
>          gprintk(XENLOG_WARNING,
> -                "Hit #DB in Xen context: %04x:%p [%ps], stk %04x:%p, dr6 %lx\n",
> +                "Hit #DB in Xen context: %04x:%p [%ps], stk %04x:%p, pending_dbg %lx\n",
>                  regs->cs, _p(regs->rip), _p(regs->rip),
> -                regs->ss, _p(regs->rsp), dr6);
> +                regs->ss, _p(regs->rsp), pending_dbg);
>  
>          return;
>      }
>  
> -    /* Save debug status register where guest OS can peek at it */
> -    v->arch.dr6 |= (dr6 & ~X86_DR6_DEFAULT);
> -    v->arch.dr6 &= (dr6 | ~X86_DR6_DEFAULT);
> -
>      if ( guest_kernel_mode(v, regs) && v->domain->debugger_attached )
>      {
> +        /* Save debug status register where gdbsx can peek at it */
> +        v->arch.dr6 = x86_merge_dr6(v->domain->arch.cpu_policy,
> +                                    v->arch.dr6, pending_dbg);
>          domain_pause_for_debugger();
>          return;
>      }
>  
> -    pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC);
> +    pv_inject_DB(pending_dbg);
>  }
>  
>  /* SAF-1-safe */

-- 
Sincerely,
Jinoh Kang



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 7/7] x86/pv: Rewrite %dr6 handling
  2023-09-16  7:50   ` Jinoh Kang
@ 2023-09-16 12:56     ` Andrew Cooper
  2023-09-16 13:10       ` Jinoh Kang
  0 siblings, 1 reply; 31+ messages in thread
From: Andrew Cooper @ 2023-09-16 12:56 UTC (permalink / raw)
  To: Jinoh Kang, Xen-devel; +Cc: Jan Beulich, Roger Pau Monné, Wei Liu

On 16/09/2023 8:50 am, Jinoh Kang wrote:
> On 9/16/23 05:36, Andrew Cooper wrote:
>> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
>> index dead728ce329..447edc827b3a 100644
>> --- a/xen/arch/x86/traps.c
>> +++ b/xen/arch/x86/traps.c
>> @@ -1887,11 +1887,11 @@ void do_device_not_available(struct cpu_user_regs *regs)
>>  /* SAF-1-safe */
>>  void do_debug(struct cpu_user_regs *regs)
>>  {
>> -    unsigned long dr6;
>> +    unsigned long pending_dbg;
>>      struct vcpu *v = current;
>>  
>> -    /* Stash dr6 as early as possible. */
>> -    dr6 = read_debugreg(6);
>> +    /* Stash dr6 as early as possible, operating with positive polarity. */
>> +    pending_dbg = read_debugreg(6) ^ X86_DR6_DEFAULT;
> We don't reset DR6 after reading it, and there is no guarantee that the PV guest
> will reset it either, so it doesn't match PENDING_DBG exactly IIRC.
>
> On the other hand, nothing will probably care about its double-accumulating
> quirk except perhaps monitor users.
>
> Do we want to document that, shadow DR6 for PV (which seems a little overkill
> if we don't trap DR6 access from PV already), or just live with that?

Different DR6's.

This is Xen responding to a real #DB (most likely from a PV guest, but
maybe from debugging activity in Xen itself), and in ...

>>  
>>      /*
>>       * At the time of writing (March 2018), on the subject of %dr6:

... this piece of context missing from the patch, Xen always writes
X86_DR6_DEFAULT back in order to clear the sticky bits.

This behaviour hasn't changed.  Xen always sees a "clean" DR6 on every
new #DB.

For the PV guest, what matters is ...

>> @@ -1963,13 +1963,13 @@ void do_debug(struct cpu_user_regs *regs)
>>           * If however we do, safety measures need to be enacted.  Use a big
>>           * hammer and clear all debug settings.
>>           */
>> -        if ( dr6 & (DR_TRAP3 | DR_TRAP2 | DR_TRAP1 | DR_TRAP0) )
>> +        if ( pending_dbg & X86_DR6_BP_MASK )
>>          {
>>              unsigned int bp, dr7 = read_debugreg(7);
>>  
>>              for ( bp = 0; bp < 4; ++bp )
>>              {
>> -                if ( (dr6 & (1u << bp)) && /* Breakpoint triggered? */
>> +                if ( (pending_dbg & (1u << bp)) && /* Breakpoint triggered? */
>>                       (dr7 & (3u << (bp * DR_ENABLE_SIZE))) && /* Enabled? */
>>                       ((dr7 & (3u << ((bp * DR_CONTROL_SIZE) + /* Insn? */
>>                                       DR_CONTROL_SHIFT))) == DR_RW_EXECUTE) )
>> @@ -1990,24 +1990,23 @@ void do_debug(struct cpu_user_regs *regs)
>>           * so ensure the message is ratelimited.
>>           */
>>          gprintk(XENLOG_WARNING,
>> -                "Hit #DB in Xen context: %04x:%p [%ps], stk %04x:%p, dr6 %lx\n",
>> +                "Hit #DB in Xen context: %04x:%p [%ps], stk %04x:%p, pending_dbg %lx\n",
>>                  regs->cs, _p(regs->rip), _p(regs->rip),
>> -                regs->ss, _p(regs->rsp), dr6);
>> +                regs->ss, _p(regs->rsp), pending_dbg);
>>  
>>          return;
>>      }
>>  
>> -    /* Save debug status register where guest OS can peek at it */
>> -    v->arch.dr6 |= (dr6 & ~X86_DR6_DEFAULT);
>> -    v->arch.dr6 &= (dr6 | ~X86_DR6_DEFAULT);
>> -
>>      if ( guest_kernel_mode(v, regs) && v->domain->debugger_attached )
>>      {
>> +        /* Save debug status register where gdbsx can peek at it */
>> +        v->arch.dr6 = x86_merge_dr6(v->domain->arch.cpu_policy,
>> +                                    v->arch.dr6, pending_dbg);
>>          domain_pause_for_debugger();
>>          return;
>>      }
>>  
>> -    pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC);
>> +    pv_inject_DB(pending_dbg);

... this, which will merge all new pending bits into the guest's DR6.

If the guest chooses not to clean out DR6 each time, then it will see
content accumulate.

To your earlier point of shadowing, we already have to do that.  DR6 is
just one of many registers we need to context switch for the vCPU.

PV guests, being ring-deprivieged, trap into Xen for every DR access,
and Xen handles every one of their #DB exceptions.  This is one reason
why I split the work into several parts.  PV guests are easier to get
sorted properly, and patch 1,2,5,6 are all common improvements relevant
to HVM too.

~Andrew


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 7/7] x86/pv: Rewrite %dr6 handling
  2023-09-16 12:56     ` Andrew Cooper
@ 2023-09-16 13:10       ` Jinoh Kang
  2023-09-16 13:39         ` Andrew Cooper
  0 siblings, 1 reply; 31+ messages in thread
From: Jinoh Kang @ 2023-09-16 13:10 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Jan Beulich, Roger Pau Monné, Wei Liu

On 9/16/23 21:56, Andrew Cooper wrote:
>> We don't reset DR6 after reading it, and there is no guarantee that the PV guest
>> will reset it either, so it doesn't match PENDING_DBG exactly IIRC.
>>
>> On the other hand, nothing will probably care about its double-accumulating
>> quirk except perhaps monitor users.
>>
>> Do we want to document that, shadow DR6 for PV (which seems a little overkill
>> if we don't trap DR6 access from PV already), or just live with that?
> 
> Different DR6's.
> 
> This is Xen responding to a real #DB (most likely from a PV guest, but
> maybe from debugging activity in Xen itself), and in ...
> 
>>>  
>>>      /*
>>>       * At the time of writing (March 2018), on the subject of %dr6:
> 
> ... this piece of context missing from the patch, Xen always writes
> X86_DR6_DEFAULT back in order to clear the sticky bits.

Oh, that'll do.

How come have I not seen this?  Maybe I was in dire need of morning coffee.
I assumed absence just from a missing context...

> 
> This behaviour hasn't changed.  Xen always sees a "clean" DR6 on every
> new #DB.
> 
> For the PV guest, what matters is ...
> 

(snip)

>>> -    /* Save debug status register where guest OS can peek at it */
>>> -    v->arch.dr6 |= (dr6 & ~X86_DR6_DEFAULT);
>>> -    v->arch.dr6 &= (dr6 | ~X86_DR6_DEFAULT);
>>> -
>>>      if ( guest_kernel_mode(v, regs) && v->domain->debugger_attached )
>>>      {
>>> +        /* Save debug status register where gdbsx can peek at it */
>>> +        v->arch.dr6 = x86_merge_dr6(v->domain->arch.cpu_policy,
>>> +                                    v->arch.dr6, pending_dbg);
>>>          domain_pause_for_debugger();
>>>          return;
>>>      }
>>>  
>>> -    pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC);
>>> +    pv_inject_DB(pending_dbg);
> 
> ... this, which will merge all new pending bits into the guest's DR6.
> 
> If the guest chooses not to clean out DR6 each time, then it will see
> content accumulate.
> 
> To your earlier point of shadowing, we already have to do that.  DR6 is
> just one of many registers we need to context switch for the vCPU.
> 
> PV guests, being ring-deprivieged, trap into Xen for every DR access,
> and Xen handles every one of their #DB exceptions.  This is one reason
> why I split the work into several parts.  PV guests are easier to get
> sorted properly, and patch 1,2,5,6 are all common improvements relevant
> to HVM too.

That confirms my knowledge.  Honestly if I got such a foolish remark
I would question the reviewer's understanding of PV mode (not that you did
anything wrong).  Sorry for wasting your time.

> 
> ~Andrew

-- 
Sincerely,
Jinoh Kang



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 7/7] x86/pv: Rewrite %dr6 handling
  2023-09-16 13:10       ` Jinoh Kang
@ 2023-09-16 13:39         ` Andrew Cooper
  0 siblings, 0 replies; 31+ messages in thread
From: Andrew Cooper @ 2023-09-16 13:39 UTC (permalink / raw)
  To: Jinoh Kang, Xen-devel; +Cc: Jan Beulich, Roger Pau Monné, Wei Liu

On 16/09/2023 2:10 pm, Jinoh Kang wrote:
> On 9/16/23 21:56, Andrew Cooper wrote:
>>>> -    /* Save debug status register where guest OS can peek at it */
>>>> -    v->arch.dr6 |= (dr6 & ~X86_DR6_DEFAULT);
>>>> -    v->arch.dr6 &= (dr6 | ~X86_DR6_DEFAULT);
>>>> -
>>>>      if ( guest_kernel_mode(v, regs) && v->domain->debugger_attached )
>>>>      {
>>>> +        /* Save debug status register where gdbsx can peek at it */
>>>> +        v->arch.dr6 = x86_merge_dr6(v->domain->arch.cpu_policy,
>>>> +                                    v->arch.dr6, pending_dbg);
>>>>          domain_pause_for_debugger();
>>>>          return;
>>>>      }
>>>>  
>>>> -    pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC);
>>>> +    pv_inject_DB(pending_dbg);
>> ... this, which will merge all new pending bits into the guest's DR6.
>>
>> If the guest chooses not to clean out DR6 each time, then it will see
>> content accumulate.
>>
>> To your earlier point of shadowing, we already have to do that.  DR6 is
>> just one of many registers we need to context switch for the vCPU.
>>
>> PV guests, being ring-deprivieged, trap into Xen for every DR access,
>> and Xen handles every one of their #DB exceptions.  This is one reason
>> why I split the work into several parts.  PV guests are easier to get
>> sorted properly, and patch 1,2,5,6 are all common improvements relevant
>> to HVM too.
> That confirms my knowledge.  Honestly if I got such a foolish remark
> I would question the reviewer's understanding of PV mode (not that you did
> anything wrong).  Sorry for wasting your time.

Not at all.

This is exceptionally tricky, and as you can see from the v1 series,
don't assume that I know it all perfectly.

The specifics of PV guests are unique to Xen, and there are 0 people who
understand it fully.  For example, doing the sanity testing for this v2
series, I rediscovered the completely undocumented properly that a PV
guests IOPL is separate from an architectural ideal of IOPL, defaults to
0 which has the meaning "guest kernel takes a #GP for access to IO ports".

Please do continue to ask questions like this if you're not sure.  It is
not a waste of time cross-checking that the logic is correct.

~Andrew


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/7] x86/pv: Drop priv_op_ctxt.bpmatch and use pending_dbg instead
  2023-09-16  7:36   ` Jinoh Kang
@ 2023-09-16 14:00     ` Andrew Cooper
  2023-09-16 14:42       ` Jinoh Kang
  0 siblings, 1 reply; 31+ messages in thread
From: Andrew Cooper @ 2023-09-16 14:00 UTC (permalink / raw)
  To: Jinoh Kang, Xen-devel; +Cc: Jan Beulich, Roger Pau Monné, Wei Liu

On 16/09/2023 8:36 am, Jinoh Kang wrote:
> On 9/16/23 05:36, Andrew Cooper wrote:
>> @@ -658,7 +660,7 @@ static int cf_check rep_ins(
>>  
>>          ++*reps;
>>  
>> -        if ( poc->bpmatch || hypercall_preempt_check() )
>> +        if ( poc->ctxt.retire.pending_dbg || hypercall_preempt_check() )
>>              break;
>>  
>>          /* x86_emulate() clips the repetition count to ensure we don't wrap. */
> (snip)
>
>> @@ -726,7 +729,7 @@ static int cf_check rep_outs(
>>  
>>          ++*reps;
>>  
>> -        if ( poc->bpmatch || hypercall_preempt_check() )
>> +        if ( poc->ctxt.retire.pending_dbg || hypercall_preempt_check() )
>>              break;
>>  
>>          /* x86_emulate() clips the repetition count to ensure we don't wrap. */
> These two hunks look like a behavioral change in singlestep mode.
>
> This is actually a fix, assuming the emulator previously did not handle
> 'rep {in,out}s' in singlestep mode correctly, since it now checks for
> PENDING_DBG.BS in addition to PENDING_DBG.B[0-4].

The emulator should handle this correctly already.  I had been meaning
to test this, but hadn't so far - guess I should fix that.

x86_emulate.c line 511 in get_rep_prefix() sets max_reps to 1 if
SingleStepping is active.

This in turn causes the emulator to use the io_{read,write}() hook
rather than the rep hook.

This is important, because singlestepping becoming pending is normally
evaluated at the end of the instruction.  i.e. in this example it
wouldn't show up in pending_dbg (yet).

What definitely is broken here is the recognition of a data breakpoint
on the memory operand of the INS/OUTS instruction, but it's broken
everywhere else for PV guest emulation too, so needs to go on the TODO list.

> If this is the case, (at least) this part of the patch looks like a stable
> candidate.  You might want to edit the commit message to reflect that.

We're going to try and get all the %dr6 handling issues sorted, then
decide whether to backport the lot or not.  It will entirely depend on
how invasive the fixes end up being, but I hope they'll be ok to backport.

> (Ideally all the HWBP handling should be part of the emulator logic, but
>  I don't see an easy way to generalize the PV-specific logic.  It could
>  be its own patch anyway.)

The emulator does have HWBP handling for HVM guests, because that's
architectural behaviour to look in the TSS.

PV guests are the odd-ones-out with non-standard behaviour.

~Andrew


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/7] x86/pv: Drop priv_op_ctxt.bpmatch and use pending_dbg instead
  2023-09-16 14:00     ` Andrew Cooper
@ 2023-09-16 14:42       ` Jinoh Kang
  2023-09-16 16:18         ` Andrew Cooper
  0 siblings, 1 reply; 31+ messages in thread
From: Jinoh Kang @ 2023-09-16 14:42 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Jan Beulich, Roger Pau Monné, Wei Liu

On 9/16/23 23:00, Andrew Cooper wrote:
> On 16/09/2023 8:36 am, Jinoh Kang wrote:

(snip)

>> These two hunks look like a behavioral change in singlestep mode.
>>
>> This is actually a fix, assuming the emulator previously did not handle
>> 'rep {in,out}s' in singlestep mode correctly, since it now checks for
>> PENDING_DBG.BS in addition to PENDING_DBG.B[0-4].
> 
> The emulator should handle this correctly already.  I had been meaning
> to test this, but hadn't so far - guess I should fix that.
> 
> x86_emulate.c line 511 in get_rep_prefix() sets max_reps to 1 if
> SingleStepping is active.

Thanks for informing.  Although that EFLAGS.TF check in the macro now
makes me--almost reflexively--imagine all sort of creative pathological
cases, like "mov ss, ax; rep ins"...

> 
> This in turn causes the emulator to use the io_{read,write}() hook
> rather than the rep hook.

Right.  (Frankly that part of code has too many branches to be readable.
Also the "presumably most efficient" part of the comment hints at perf
optimization sans any profiling attempts...)

> 
> This is important, because singlestepping becoming pending is normally
> evaluated at the end of the instruction.  i.e. in this example it
> wouldn't show up in pending_dbg (yet).
> 
> What definitely is broken here is the recognition of a data breakpoint
> on the memory operand of the INS/OUTS instruction, but it's broken
> everywhere else for PV guest emulation too, so needs to go on the TODO list.

(Another thing definitely broken here is the recognition of I/O breakpoints
 post the first iteration.  Maybe it would be beneficial to do differential
 testing between the {read,write}_io slowpath and rep_{in,out}s fastpath.)

> 
>> If this is the case, (at least) this part of the patch looks like a stable
>> candidate.  You might want to edit the commit message to reflect that.
> 
> We're going to try and get all the %dr6 handling issues sorted, then
> decide whether to backport the lot or not.  It will entirely depend on
> how invasive the fixes end up being, but I hope they'll be ok to backport.
> 
>> (Ideally all the HWBP handling should be part of the emulator logic, but
>>  I don't see an easy way to generalize the PV-specific logic.  It could
>>  be its own patch anyway.)
> 
> The emulator does have HWBP handling for HVM guests, because that's
> architectural behaviour to look in the TSS.

I was under such impression since I didn't immediately notice I/O
breakpoint handling in ins/outs path; maybe I haven't looked into it deeper...

> 
> PV guests are the odd-ones-out with non-standard behaviour.
> 
> ~Andrew

-- 
Sincerely,
Jinoh Kang



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/7] x86/pv: Drop priv_op_ctxt.bpmatch and use pending_dbg instead
  2023-09-16 14:42       ` Jinoh Kang
@ 2023-09-16 16:18         ` Andrew Cooper
  0 siblings, 0 replies; 31+ messages in thread
From: Andrew Cooper @ 2023-09-16 16:18 UTC (permalink / raw)
  To: Jinoh Kang, Xen-devel; +Cc: Jan Beulich, Roger Pau Monné, Wei Liu

On 16/09/2023 3:42 pm, Jinoh Kang wrote:
> On 9/16/23 23:00, Andrew Cooper wrote:
>> On 16/09/2023 8:36 am, Jinoh Kang wrote:
> (snip)
>
>>> These two hunks look like a behavioral change in singlestep mode.
>>>
>>> This is actually a fix, assuming the emulator previously did not handle
>>> 'rep {in,out}s' in singlestep mode correctly, since it now checks for
>>> PENDING_DBG.BS in addition to PENDING_DBG.B[0-4].
>> The emulator should handle this correctly already.  I had been meaning
>> to test this, but hadn't so far - guess I should fix that.
>>
>> x86_emulate.c line 511 in get_rep_prefix() sets max_reps to 1 if
>> SingleStepping is active.
> Thanks for informing.  Although that EFLAGS.TF check in the macro now
> makes me--almost reflexively--imagine all sort of creative pathological
> cases, like "mov ss, ax; rep ins"...

Yeah isn't x86 fun...

That's why it's important to accumulate in pending_dbg, which does hold
the breakpoint recognition across point where the blocked-by-movss state
inhibits the generation of #DB, and causes it to generate #DB on the
subsequent instruction boundary.

>> This in turn causes the emulator to use the io_{read,write}() hook
>> rather than the rep hook.
> Right.  (Frankly that part of code has too many branches to be readable.
> Also the "presumably most efficient" part of the comment hints at perf
> optimization sans any profiling attempts...)

Yeah it's not the greatest code.  I cant remember the exact history
there, but IIRC we used to handle the rep looping entirely in
x86_emulate() and there were no rep hooks.

There are two cases where the perf hit did get noticed.  REP OUTSB to
the qemu/bochs/Xen debug port, and Windows which does a REP MOVSB across
the various bits of PCI MMIO.

>> This is important, because singlestepping becoming pending is normally
>> evaluated at the end of the instruction.  i.e. in this example it
>> wouldn't show up in pending_dbg (yet).
>>
>> What definitely is broken here is the recognition of a data breakpoint
>> on the memory operand of the INS/OUTS instruction, but it's broken
>> everywhere else for PV guest emulation too, so needs to go on the TODO list.
> (Another thing definitely broken here is the recognition of I/O breakpoints
>  post the first iteration.  Maybe it would be beneficial to do differential
>  testing between the {read,write}_io slowpath and rep_{in,out}s fastpath.)

PV, or HVM guest?

IO breakpoint recognition is missing generally in HVM guests.  I opened
https://gitlab.com/xen-project/xen/-/work_items/171 yesterday for it,
because it's not 30s work to fix.

But, IO breakpoints are invariant of REP.  The IO port doesn't change,
so the breakpoint(s) either match on every iteration, or not at all. 
(Of course, this doesn't mean that Xen is handling the recognition
correctly.)

In this patch, there is IO breakpoint recognition on all 4 of the PV IO
port hooks, so I think it ought to work properly, whether it's the first
iteration or not.

Obviously if not, I've got another bug to figure out...


It's worth saying that patch 2 does fix a bug (an X86EMUL_OKAY/DONE
confusion), but I'm pretty sure it was only a singlestep on admin-ok PV
IO ports which would be skipped, not the breakpoints.

~Andrew


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/7] x86/emul: Fix and extend #DB trap handling
  2023-09-15 20:36 ` [PATCH 2/7] x86/emul: Fix and extend #DB trap handling Andrew Cooper
@ 2023-09-18  9:24   ` Andrew Cooper
  2023-09-18 11:30     ` Jan Beulich
  2023-09-18 11:28   ` Jan Beulich
  1 sibling, 1 reply; 31+ messages in thread
From: Andrew Cooper @ 2023-09-18  9:24 UTC (permalink / raw)
  To: Xen-devel; +Cc: Jan Beulich, Roger Pau Monné, Wei Liu, Jinoh Kang

On 15/09/2023 9:36 pm, Andrew Cooper wrote:
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
> index 94caec1d142c..de7f99500e3f 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -8379,13 +8379,6 @@ x86_emulate(
>      if ( !mode_64bit() )
>          _regs.r(ip) = (uint32_t)_regs.r(ip);
>  
> -    /* Should a singlestep #DB be raised? */
> -    if ( rc == X86EMUL_OKAY && singlestep && !ctxt->retire.mov_ss )
> -    {
> -        ctxt->retire.singlestep = true;
> -        ctxt->retire.sti = false;
> -    }
> -
>      if ( rc != X86EMUL_DONE )
>          *ctxt->regs = _regs;
>      else
> @@ -8394,6 +8387,19 @@ x86_emulate(
>          rc = X86EMUL_OKAY;
>      }
>  
> +    /* Should a singlestep #DB be raised? */
> +    if ( rc == X86EMUL_OKAY && singlestep )
> +    {
> +        ctxt->retire.pending_dbg |= X86_DR6_BS;
> +
> +        /* BROKEN - TODO, merge into pending_dbg. */
> +        if ( !ctxt->retire.mov_ss )
> +        {
> +            ctxt->retire.singlestep = true;
> +            ctxt->retire.sti = false;
> +        }

I occurs to me that setting X86_DR6_BS outside of the !mov_ss case will
break HVM (when HVM swaps from singlestep to pending_dbg) until one of
the further TODOs is addressed.

This will need bringing back within the conditional to avoid regressions
in the short term.

~Andrew


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/7] x86/emul: ASSERT that X86EMUL_DONE doesn't escape to callers
  2023-09-15 20:36 ` [PATCH 1/7] x86/emul: ASSERT that X86EMUL_DONE doesn't escape to callers Andrew Cooper
@ 2023-09-18 11:17   ` Jan Beulich
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2023-09-18 11:17 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Jinoh Kang, Xen-devel

On 15.09.2023 22:36, Andrew Cooper wrote:
> This property is far from clear.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/7] x86/emul: Fix and extend #DB trap handling
  2023-09-15 20:36 ` [PATCH 2/7] x86/emul: Fix and extend #DB trap handling Andrew Cooper
  2023-09-18  9:24   ` Andrew Cooper
@ 2023-09-18 11:28   ` Jan Beulich
  1 sibling, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2023-09-18 11:28 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Jinoh Kang, Xen-devel

On 15.09.2023 22:36, Andrew Cooper wrote:
> Lots of this is very very broken, but we need to start somewhere.
> 
> First, the bugfix.  Hooks which use X86EMUL_DONE to skip the general emulation
> still need to evaluate singlestep as part of completing the instruction.
> Defer the logic until X86EMUL_DONE has been converted to X86EMUL_OKAY.

Doesn't this warrant a Fixes: tag against 4999bf3e8b4c?

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/7] x86/emul: Fix and extend #DB trap handling
  2023-09-18  9:24   ` Andrew Cooper
@ 2023-09-18 11:30     ` Jan Beulich
  2023-09-18 12:19       ` Andrew Cooper
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2023-09-18 11:30 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Jinoh Kang, Xen-devel

On 18.09.2023 11:24, Andrew Cooper wrote:
> On 15/09/2023 9:36 pm, Andrew Cooper wrote:
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>> @@ -8379,13 +8379,6 @@ x86_emulate(
>>      if ( !mode_64bit() )
>>          _regs.r(ip) = (uint32_t)_regs.r(ip);
>>  
>> -    /* Should a singlestep #DB be raised? */
>> -    if ( rc == X86EMUL_OKAY && singlestep && !ctxt->retire.mov_ss )
>> -    {
>> -        ctxt->retire.singlestep = true;
>> -        ctxt->retire.sti = false;
>> -    }
>> -
>>      if ( rc != X86EMUL_DONE )
>>          *ctxt->regs = _regs;
>>      else
>> @@ -8394,6 +8387,19 @@ x86_emulate(
>>          rc = X86EMUL_OKAY;
>>      }
>>  
>> +    /* Should a singlestep #DB be raised? */
>> +    if ( rc == X86EMUL_OKAY && singlestep )
>> +    {
>> +        ctxt->retire.pending_dbg |= X86_DR6_BS;
>> +
>> +        /* BROKEN - TODO, merge into pending_dbg. */
>> +        if ( !ctxt->retire.mov_ss )
>> +        {
>> +            ctxt->retire.singlestep = true;
>> +            ctxt->retire.sti = false;
>> +        }
> 
> I occurs to me that setting X86_DR6_BS outside of the !mov_ss case will
> break HVM (when HVM swaps from singlestep to pending_dbg) until one of
> the further TODOs is addressed.
> 
> This will need bringing back within the conditional to avoid regressions
> in the short term.

I'm afraid I don't understand this: Isn't the purpose to latch state no
matter whether it'll be consumed right away, or only on the next insn?

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/7] x86: Introduce x86_merge_dr6()
  2023-09-15 20:36 ` [PATCH 5/7] x86: Introduce x86_merge_dr6() Andrew Cooper
@ 2023-09-18 11:37   ` Jan Beulich
  2023-09-22 16:11     ` Andrew Cooper
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2023-09-18 11:37 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Jinoh Kang, Xen-devel

On 15.09.2023 22:36, Andrew Cooper wrote:
> The current logic used to update %dr6 when injecting #DB is buggy.
> 
> The SDM/APM documention on %dr6 updates is far from ideal, but does at least
> make clear that it's non-trivial.  The actual behaviour is to overwrite
> B{0..3} and accumulate all other bits.

As mentioned before, I'm okay to ack this patch provided it is explicitly said
where the information is coming from. Just saying that SDM/PM are incomplete
isn't enough, sorry. With that added (can't really be R-b, I'm afraid):
Acked-by: Jan Beulich <jbeulich@suse.com>

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/7] x86: Extend x86_event with a pending_dbg field
  2023-09-15 20:36 ` [PATCH 6/7] x86: Extend x86_event with a pending_dbg field Andrew Cooper
  2023-09-16  7:44   ` Jinoh Kang
@ 2023-09-18 11:39   ` Jan Beulich
  1 sibling, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2023-09-18 11:39 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Jinoh Kang, Xen-devel

On 15.09.2023 22:36, Andrew Cooper wrote:
> ... using the Intel VMCS PENDING_DBG semantics, and sharing storage with cr2.
> 
> This requires working around anonymous union bugs in obsolete versions of GCC,
> which in turn needs to drop unnecessary const qualifiers.
> 
> Also introduce a pv_inject_DB() wrapper use this field nicely.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 7/7] x86/pv: Rewrite %dr6 handling
  2023-09-15 20:36 ` [PATCH 7/7] x86/pv: Rewrite %dr6 handling Andrew Cooper
  2023-09-16  7:50   ` Jinoh Kang
@ 2023-09-18 11:46   ` Jan Beulich
  1 sibling, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2023-09-18 11:46 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Jinoh Kang, Xen-devel

On 15.09.2023 22:36, Andrew Cooper wrote:
> All #DB exceptions result in an update of %dr6, but this isn't handled
> properly by Xen for any guest type.
> 
> Remove all ad-hoc dr6 handling, leaving it to pv_inject_event() in most cases
> and using the new x86_merge_dr6() helper.
> 
> In do_debug(), swap the dr6 to pending_dbg in order to operate entirely with
> positive polarity.  Among other things, this helps spot RTM/BLD in the
> diagnostic message.
> 
> Drop the unconditional v->arch.dr6 adjustment.  pv_inject_event() performs the
> adjustment in the common case, but retain the prior behaviour if a debugger is
> attached.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

One minor aspect to consider:

> @@ -1990,24 +1990,23 @@ void do_debug(struct cpu_user_regs *regs)
>           * so ensure the message is ratelimited.
>           */
>          gprintk(XENLOG_WARNING,
> -                "Hit #DB in Xen context: %04x:%p [%ps], stk %04x:%p, dr6 %lx\n",
> +                "Hit #DB in Xen context: %04x:%p [%ps], stk %04x:%p, pending_dbg %lx\n",

Would you mind shorting to just "pending"?

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/7] x86/emul: Fix and extend #DB trap handling
  2023-09-18 11:30     ` Jan Beulich
@ 2023-09-18 12:19       ` Andrew Cooper
  2023-09-18 12:47         ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Andrew Cooper @ 2023-09-18 12:19 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monné, Wei Liu, Jinoh Kang, Xen-devel

On 18/09/2023 12:30 pm, Jan Beulich wrote:
> On 18.09.2023 11:24, Andrew Cooper wrote:
>> On 15/09/2023 9:36 pm, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>> @@ -8379,13 +8379,6 @@ x86_emulate(
>>>      if ( !mode_64bit() )
>>>          _regs.r(ip) = (uint32_t)_regs.r(ip);
>>>  
>>> -    /* Should a singlestep #DB be raised? */
>>> -    if ( rc == X86EMUL_OKAY && singlestep && !ctxt->retire.mov_ss )
>>> -    {
>>> -        ctxt->retire.singlestep = true;
>>> -        ctxt->retire.sti = false;
>>> -    }
>>> -
>>>      if ( rc != X86EMUL_DONE )
>>>          *ctxt->regs = _regs;
>>>      else
>>> @@ -8394,6 +8387,19 @@ x86_emulate(
>>>          rc = X86EMUL_OKAY;
>>>      }
>>>  
>>> +    /* Should a singlestep #DB be raised? */
>>> +    if ( rc == X86EMUL_OKAY && singlestep )
>>> +    {
>>> +        ctxt->retire.pending_dbg |= X86_DR6_BS;
>>> +
>>> +        /* BROKEN - TODO, merge into pending_dbg. */
>>> +        if ( !ctxt->retire.mov_ss )
>>> +        {
>>> +            ctxt->retire.singlestep = true;
>>> +            ctxt->retire.sti = false;
>>> +        }
>> I occurs to me that setting X86_DR6_BS outside of the !mov_ss case will
>> break HVM (when HVM swaps from singlestep to pending_dbg) until one of
>> the further TODOs is addressed.
>>
>> This will need bringing back within the conditional to avoid regressions
>> in the short term.
> I'm afraid I don't understand this: Isn't the purpose to latch state no
> matter whether it'll be consumed right away, or only on the next insn?

Yes, that is the intention in the longterm.

But in the short term, where I'm doing just enough to fix the %dr6 bits,
putting this unconditionally in PENDING_DBG will break the emulation of
mov-to-ss until the bigger todo of "wire INTERRUPTIBILITY/ACTIVITY state
into the emulation context" is complete.

The latter is definitely too big to fit into 4.18, and I can't
intentionally regress mov-to-ss in a series we intend to backport.

~Andrew


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/7] x86/emul: Fix and extend #DB trap handling
  2023-09-18 12:19       ` Andrew Cooper
@ 2023-09-18 12:47         ` Jan Beulich
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2023-09-18 12:47 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Jinoh Kang, Xen-devel

On 18.09.2023 14:19, Andrew Cooper wrote:
> On 18/09/2023 12:30 pm, Jan Beulich wrote:
>> On 18.09.2023 11:24, Andrew Cooper wrote:
>>> On 15/09/2023 9:36 pm, Andrew Cooper wrote:
>>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>>> @@ -8379,13 +8379,6 @@ x86_emulate(
>>>>      if ( !mode_64bit() )
>>>>          _regs.r(ip) = (uint32_t)_regs.r(ip);
>>>>  
>>>> -    /* Should a singlestep #DB be raised? */
>>>> -    if ( rc == X86EMUL_OKAY && singlestep && !ctxt->retire.mov_ss )
>>>> -    {
>>>> -        ctxt->retire.singlestep = true;
>>>> -        ctxt->retire.sti = false;
>>>> -    }
>>>> -
>>>>      if ( rc != X86EMUL_DONE )
>>>>          *ctxt->regs = _regs;
>>>>      else
>>>> @@ -8394,6 +8387,19 @@ x86_emulate(
>>>>          rc = X86EMUL_OKAY;
>>>>      }
>>>>  
>>>> +    /* Should a singlestep #DB be raised? */
>>>> +    if ( rc == X86EMUL_OKAY && singlestep )
>>>> +    {
>>>> +        ctxt->retire.pending_dbg |= X86_DR6_BS;
>>>> +
>>>> +        /* BROKEN - TODO, merge into pending_dbg. */
>>>> +        if ( !ctxt->retire.mov_ss )
>>>> +        {
>>>> +            ctxt->retire.singlestep = true;
>>>> +            ctxt->retire.sti = false;
>>>> +        }
>>> I occurs to me that setting X86_DR6_BS outside of the !mov_ss case will
>>> break HVM (when HVM swaps from singlestep to pending_dbg) until one of
>>> the further TODOs is addressed.
>>>
>>> This will need bringing back within the conditional to avoid regressions
>>> in the short term.
>> I'm afraid I don't understand this: Isn't the purpose to latch state no
>> matter whether it'll be consumed right away, or only on the next insn?
> 
> Yes, that is the intention in the longterm.
> 
> But in the short term, where I'm doing just enough to fix the %dr6 bits,
> putting this unconditionally in PENDING_DBG will break the emulation of
> mov-to-ss until the bigger todo of "wire INTERRUPTIBILITY/ACTIVITY state
> into the emulation context" is complete.

Since I assume we're talking about the tail of _hvm_emulate_one(), my
problem is that I cannot see how setting X86_DR6_BS would lead to a
problem there. Plus you don't touch x86/hvm/ at all in the series, and
the pending_dbg field gets newly introduced in the patch here. Hence it
looks to me as if for HVM the field could take any value, without
breaking the code. But then, from you explicitly pointing out a problem,
I can only infer that I'm overlooking something.

> The latter is definitely too big to fit into 4.18, and I can't
> intentionally regress mov-to-ss in a series we intend to backport.

Of course.

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/7] x86/pv: #DB vs %dr6 fixes, part 2
  2023-09-15 20:36 [PATCH v2 0/7] x86/pv: #DB vs %dr6 fixes, part 2 Andrew Cooper
                   ` (6 preceding siblings ...)
  2023-09-15 20:36 ` [PATCH 7/7] x86/pv: Rewrite %dr6 handling Andrew Cooper
@ 2023-09-19 20:53 ` Andrew Cooper
  7 siblings, 0 replies; 31+ messages in thread
From: Andrew Cooper @ 2023-09-19 20:53 UTC (permalink / raw)
  To: Xen-devel; +Cc: Jan Beulich, Roger Pau Monné, Wei Liu, Jinoh Kang

On 15/09/2023 9:36 pm, Andrew Cooper wrote:
> This time with a bit of sanity testing.  See patches for details.
>
> Andrew Cooper (7):
>   x86/emul: ASSERT that X86EMUL_DONE doesn't escape to callers
>   x86/emul: Fix and extend #DB trap handling
>   x86/pv: Fix the determiniation of whether to inject #DB
>   x86/pv: Drop priv_op_ctxt.bpmatch and use pending_dbg instead
>   x86: Introduce x86_merge_dr6()
>   x86: Extend x86_event with a pending_dbg field
>   x86/pv: Rewrite %dr6 handling

I've found even more PV debug bugs.

1) The MSR leak fix stopped MSR_DEBUGCTL leaking into PV guests, and
even "broke" my XTF test (still not blocking in CI because of bisector
interaction issues).  Really, this was one buggy case swapped for
another, but it needs resolving nevertheless.

2) activate_debugregs() has a misleading dr7 check in it (the caller is
gated on the same condition, making it tautological).  Worse however, it
loads %dr6 which interferes with with the #DB handler trying to get a
clean view of "new bits".  PV guests can't access %dr6 at all, so I'm
reasonably sure it's state we simply don't need to load at all.

3) The comment in paravirt_ctxt_switch_from() about debug regs is
buggy.  The logic is correct, but the reasoning is false.

4) set_debugreg() doesn't account for the dr7 state before loading
dr{0..3} in current context.  This manifests as spuriously loading
breakpoint registers despite not having debugging "active".  I think
it's benign.

5) The order of operations in activate_debugregs() is wrong.  There's a
period of time where we've got the new vCPU's breakpoints active using
the old vCPU's mask registers.  Again, I think it's benign.

~Andrew


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/7] x86: Introduce x86_merge_dr6()
  2023-09-18 11:37   ` Jan Beulich
@ 2023-09-22 16:11     ` Andrew Cooper
  2023-09-25  7:30       ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Andrew Cooper @ 2023-09-22 16:11 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monné, Wei Liu, Jinoh Kang, Xen-devel

On 18/09/2023 12:37 pm, Jan Beulich wrote:
> On 15.09.2023 22:36, Andrew Cooper wrote:
>> The current logic used to update %dr6 when injecting #DB is buggy.
>>
>> The SDM/APM documention on %dr6 updates is far from ideal, but does at least
>> make clear that it's non-trivial.  The actual behaviour is to overwrite
>> B{0..3} and accumulate all other bits.
> As mentioned before, I'm okay to ack this patch provided it is explicitly said
> where the information is coming from.

The information *is* coming from the relevant paragraph of the relevant
chapters of the relevant manuals.

I don't need to teach programmers how to suck eggs.  Nor am I going to
quote buggy manuals (corrections are pending for both) as a
justification for restating several paragraphs of information as a oneliner.

~Andrew


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/7] x86: Introduce x86_merge_dr6()
  2023-09-22 16:11     ` Andrew Cooper
@ 2023-09-25  7:30       ` Jan Beulich
  2023-09-25 10:20         ` Jinoh Kang
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2023-09-25  7:30 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Jinoh Kang, Xen-devel

On 22.09.2023 18:11, Andrew Cooper wrote:
> On 18/09/2023 12:37 pm, Jan Beulich wrote:
>> On 15.09.2023 22:36, Andrew Cooper wrote:
>>> The current logic used to update %dr6 when injecting #DB is buggy.
>>>
>>> The SDM/APM documention on %dr6 updates is far from ideal, but does at least
>>> make clear that it's non-trivial.  The actual behaviour is to overwrite
>>> B{0..3} and accumulate all other bits.
>> As mentioned before, I'm okay to ack this patch provided it is explicitly said
>> where the information is coming from.
> 
> The information *is* coming from the relevant paragraph of the relevant
> chapters of the relevant manuals.
> 
> I don't need to teach programmers how to suck eggs.  Nor am I going to
> quote buggy manuals (corrections are pending for both) as a
> justification for restating several paragraphs of information as a oneliner.

Earlier on you said this to my original request:

'SDM Vol3 18.2.3 Debug Status Register (DR6) says

 "Certain debug exceptions may clear bits 0-3. The remaining contents of
 the DR6 register are never cleared by the processor."'

"Certain" and "may" do not describe the behavior that your change implements.
Hence imo there's still a need to clarify where the extra information is
coming from. Pending corrections are of course appreciated; in case you have
been told the new wording already, perhaps you could quote that?

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/7] x86: Introduce x86_merge_dr6()
  2023-09-25  7:30       ` Jan Beulich
@ 2023-09-25 10:20         ` Jinoh Kang
  2023-10-28 13:55           ` Jinoh Kang
  0 siblings, 1 reply; 31+ messages in thread
From: Jinoh Kang @ 2023-09-25 10:20 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 9/25/23 16:30, Jan Beulich wrote:
> On 22.09.2023 18:11, Andrew Cooper wrote:
>> On 18/09/2023 12:37 pm, Jan Beulich wrote:
>>> On 15.09.2023 22:36, Andrew Cooper wrote:
>>>> The current logic used to update %dr6 when injecting #DB is buggy.
>>>>
>>>> The SDM/APM documention on %dr6 updates is far from ideal, but does at least
>>>> make clear that it's non-trivial.  The actual behaviour is to overwrite
>>>> B{0..3} and accumulate all other bits.
>>> As mentioned before, I'm okay to ack this patch provided it is explicitly said
>>> where the information is coming from.
>>
>> The information *is* coming from the relevant paragraph of the relevant
>> chapters of the relevant manuals.
>>
>> I don't need to teach programmers how to suck eggs.  Nor am I going to
>> quote buggy manuals (corrections are pending for both) as a
>> justification for restating several paragraphs of information as a oneliner.
> 
> Earlier on you said this to my original request:
> 
> 'SDM Vol3 18.2.3 Debug Status Register (DR6) says
> 
>  "Certain debug exceptions may clear bits 0-3. The remaining contents of
>  the DR6 register are never cleared by the processor."'
> 
> "Certain" and "may" do not describe the behavior that your change implements.
> Hence imo there's still a need to clarify where the extra information is
> coming from. Pending corrections are of course appreciated; in case you have
> been told the new wording already, perhaps you could quote that?

As an outsider's perspective, I think this kind of thing is where selftests
really shine.  I got the impression that Xen will need to rely on numerous other
platform oddities, the documentation of which are often unavailable.

Of course, adding a whole new test infrastructure in code freeze is not viable.
Maybe I have missed something, but I only see three paths forward here:

1. Reference the most relevant paragraph in SDM/APM, but don't quote it.
   Keep the current explanation, and state that the manual is vague anyway.

2. Acknowledge that SDM/APM is incomplete, and completely abandon the manual
   as the *authoritative* source of information.  Perhaps embed a sample test
   program that demonstrates the behavior, if it isn't too long.

3. Actually assert in runtime that DR6 behaves as expected.

> 
> Jan

-- 
Sincerely,
Jinoh Kang



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/7] x86: Introduce x86_merge_dr6()
  2023-09-25 10:20         ` Jinoh Kang
@ 2023-10-28 13:55           ` Jinoh Kang
  0 siblings, 0 replies; 31+ messages in thread
From: Jinoh Kang @ 2023-10-28 13:55 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 9/25/23 19:20, Jinoh Kang wrote:
> As an outsider's perspective, I think this kind of thing is where selftests
> really shine.  I got the impression that Xen will need to rely on numerous other
> platform oddities, the documentation of which are often unavailable.
> 
> Of course, adding a whole new test infrastructure in code freeze is not viable.
> Maybe I have missed something, but I only see three paths forward here:
> 
> 1. Reference the most relevant paragraph in SDM/APM, but don't quote it.
>    Keep the current explanation, and state that the manual is vague anyway.
> 
> 2. Acknowledge that SDM/APM is incomplete, and completely abandon the manual
>    as the *authoritative* source of information.  Perhaps embed a sample test
>    program that demonstrates the behavior, if it isn't too long.
> 
> 3. Actually assert in runtime that DR6 behaves as expected.
> 

Just a heads-up; has there been any progress on this part?

Please let me know if you need anything.  I'm happy to help!

-- 
Sincerely,
Jinoh Kang



^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2023-10-28 13:55 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-15 20:36 [PATCH v2 0/7] x86/pv: #DB vs %dr6 fixes, part 2 Andrew Cooper
2023-09-15 20:36 ` [PATCH 1/7] x86/emul: ASSERT that X86EMUL_DONE doesn't escape to callers Andrew Cooper
2023-09-18 11:17   ` Jan Beulich
2023-09-15 20:36 ` [PATCH 2/7] x86/emul: Fix and extend #DB trap handling Andrew Cooper
2023-09-18  9:24   ` Andrew Cooper
2023-09-18 11:30     ` Jan Beulich
2023-09-18 12:19       ` Andrew Cooper
2023-09-18 12:47         ` Jan Beulich
2023-09-18 11:28   ` Jan Beulich
2023-09-15 20:36 ` [PATCH 3/7] x86/pv: Fix the determiniation of whether to inject #DB Andrew Cooper
2023-09-15 20:36 ` [PATCH 4/7] x86/pv: Drop priv_op_ctxt.bpmatch and use pending_dbg instead Andrew Cooper
2023-09-16  7:36   ` Jinoh Kang
2023-09-16 14:00     ` Andrew Cooper
2023-09-16 14:42       ` Jinoh Kang
2023-09-16 16:18         ` Andrew Cooper
2023-09-15 20:36 ` [PATCH 5/7] x86: Introduce x86_merge_dr6() Andrew Cooper
2023-09-18 11:37   ` Jan Beulich
2023-09-22 16:11     ` Andrew Cooper
2023-09-25  7:30       ` Jan Beulich
2023-09-25 10:20         ` Jinoh Kang
2023-10-28 13:55           ` Jinoh Kang
2023-09-15 20:36 ` [PATCH 6/7] x86: Extend x86_event with a pending_dbg field Andrew Cooper
2023-09-16  7:44   ` Jinoh Kang
2023-09-18 11:39   ` Jan Beulich
2023-09-15 20:36 ` [PATCH 7/7] x86/pv: Rewrite %dr6 handling Andrew Cooper
2023-09-16  7:50   ` Jinoh Kang
2023-09-16 12:56     ` Andrew Cooper
2023-09-16 13:10       ` Jinoh Kang
2023-09-16 13:39         ` Andrew Cooper
2023-09-18 11:46   ` Jan Beulich
2023-09-19 20:53 ` [PATCH v2 0/7] x86/pv: #DB vs %dr6 fixes, part 2 Andrew Cooper

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.