* Causing VMEXITs when kprobes are hit in the guest VM
@ 2022-05-01 15:30 Arnabjyoti Kalita
  2022-05-03 20:45 ` Sean Christopherson
  0 siblings, 1 reply; 12+ messages in thread
From: Arnabjyoti Kalita @ 2022-05-01 15:30 UTC (permalink / raw)
  To: kvm

Hello all,

I intend to run a kernel module inside my guest VM. The kernel module
sets kprobes on a couple of functions in the Linux kernel. After
registering the kprobes successfully, I can see that the kprobes are
being hit repeatedly.

I would like to cause a VMEXIT when these kprobes are hit. I know that
kprobes use a breakpoint instruction (INT 3) to successfully execute
the pre and post handlers. This would mean that the execution of the
instruction INT 3 should technically cause a VMEXIT. However, I do not
get any software exception type VMEXITs when these kprobes are hit.

I have used the commands "perf kvm stat record" and "perf kvm stat
report --event=vmexit" to try and observe the VMEXIT reasons and I do
not see any VMEXIT of type "EXCEPTION_NMI" being returned in the
period that the kprobe was being hit.

My host uses a modified Linux kernel 5.8.0 while my guest runs a 4.4.0
Linux kernel. Both the guest and the host use the x86_64 architecture.
I am using QEMU version 5.0.1. What changes are needed in the Linux
kernel to make sure that I get an exception in the form of a VMEXIT
whenever the kprobes are hit?

Thank you very much.

Best Regards,
Arnabjyoti Kalita


* Re: Causing VMEXITs when kprobes are hit in the guest VM
  2022-05-01 15:30 Causing VMEXITs when kprobes are hit in the guest VM Arnabjyoti Kalita
@ 2022-05-03 20:45 ` Sean Christopherson
  2022-05-06  5:14   ` Arnabjyoti Kalita
  0 siblings, 1 reply; 12+ messages in thread
From: Sean Christopherson @ 2022-05-03 20:45 UTC (permalink / raw)
  To: Arnabjyoti Kalita; +Cc: kvm

On Sun, May 01, 2022, Arnabjyoti Kalita wrote:
> Hello all,
> 
> I intend to run a kernel module inside my guest VM. The kernel module
> sets kprobes on a couple of functions in the linux kernel. After
> registering the kprobes successfully, I can see that the kprobes are
> being hit repeatedly.
> 
> I would like to cause a VMEXIT when these kprobes are hit. I know that
> kprobes use a breakpoint instruction (INT 3) to successfully execute
> the pre and post handlers. This would mean that the execution of the
> instruction INT 3 should technically cause a VMEXIT.

No, it should cause #BP.  KVM doesn't intercept #BP by default because there's no
reason to do so.

> However, I do not get any software exception type VMEXITs when these kprobes
> are hit.
> 
> I have used the commands "perf kvm stat record" and "perf kvm stat
> report --event=vmexit" to try and observe the VMEXIT reasons and I do
> not see any VMEXIT of type "EXCEPTION_NMI" being returned in the
> period that the kprobe was being hit.
> 
> My host uses a modified Linux kernel 5.8.0 while my guest runs a 4.4.0
> Linux kernel. Both the guest and the host use the x86_64 architecture.
> I am using QEMU version 5.0.1. What changes are needed in the Linux
> kernel to make sure that I get an exception in the form of a VMEXIT
> whenever the kprobes are hit?

This can be done entirely from userspace by enabling KVM_GUESTDBG_USE_SW_BP, e.g.

	struct kvm_guest_debug debug;

	memset(&debug, 0, sizeof(debug));
	debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP;
	ioctl(vcpu->fd, KVM_SET_GUEST_DEBUG, &debug);

That will intercept #BP and exit to userspace with KVM_EXIT_DEBUG.  Note, it's
userspace's responsibility to re-inject the #BP if userspace wants to forward the
#BP to the guest.

There's a bit more info in Documentation/virt/kvm/api.rst under KVM_SET_GUEST_DEBUG.
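
For illustration, here is a minimal sketch of the userspace side on an x86_64
Linux host, assuming the installed headers provide <linux/kvm.h>. The helper
names fill_sw_bp_debug() and is_intercepted_bp() and the constant
GUEST_BP_VECTOR are hypothetical, not part of KVM or QEMU; the ioctl() calls
themselves are left to the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <linux/kvm.h>

/* x86 vector number of #BP; the macro name is made up for this sketch. */
#define GUEST_BP_VECTOR 3

/* Fill in the request that asks KVM to intercept guest INT3 (#BP). */
static void fill_sw_bp_debug(struct kvm_guest_debug *dbg)
{
	assert(dbg != NULL);
	memset(dbg, 0, sizeof(*dbg));
	dbg->control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP;
}

/*
 * Return 1 and store the reported PC if this exit is an intercepted #BP,
 * 0 for any other exit.
 */
static int is_intercepted_bp(const struct kvm_run *run, uint64_t *pc)
{
	assert(run != NULL && pc != NULL);
	if (run->exit_reason != KVM_EXIT_DEBUG)
		return 0;
	if (run->debug.arch.exception != GUEST_BP_VECTOR)
		return 0;
	*pc = run->debug.arch.pc;
	return 1;
}
```

A VMM would pass the filled structure to KVM_SET_GUEST_DEBUG on the vCPU file
descriptor, then call is_intercepted_bp() on the mmap'ed kvm_run each time
KVM_RUN returns.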


* Re: Causing VMEXITs when kprobes are hit in the guest VM
  2022-05-03 20:45 ` Sean Christopherson
@ 2022-05-06  5:14   ` Arnabjyoti Kalita
  2022-05-07  6:30     ` Arnabjyoti Kalita
  0 siblings, 1 reply; 12+ messages in thread
From: Arnabjyoti Kalita @ 2022-05-06  5:14 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm

Dear Sean,

Thank you very much for your answer.

I managed to enable software breakpoint interception from userspace
using the API you provided, and I now get VMEXITs of type
"KVM_EXIT_DEBUG". I will not re-inject the #BP since I do not
necessarily want the guest to do anything with it. All I will do is
record (in userspace) the ID of the CPU that caused the breakpoint and
let the guest continue execution.

Best Regards,
Arnabjyoti Kalita


* Re: Causing VMEXITs when kprobes are hit in the guest VM
  2022-05-06  5:14   ` Arnabjyoti Kalita
@ 2022-05-07  6:30     ` Arnabjyoti Kalita
  2022-05-11  0:49       ` Jim Mattson
  0 siblings, 1 reply; 12+ messages in thread
From: Arnabjyoti Kalita @ 2022-05-07  6:30 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm

Dear Sean and all,

When a VMEXIT of type "KVM_EXIT_DEBUG" happens because a breakpoint
was triggered as an instruction was about to be executed, does the
instruction at the breakpoint address actually execute before the
VMEXIT happens?

I am attempting to record the occurrence of the debug exception in
userspace. I do not want to do anything extra with the debug
exception. I have modified the kernel code (handle_exception_nmi) to
do something like this -

case BP_VECTOR:
    /*
     * Update instruction length as we may reinject #BP from
     * user space while in guest debugging mode. Reading it for
     * #DB as well causes no harm, it is not used in that case.
     */
    vmx->vcpu.arch.event_exit_inst_len = vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
    kvm_run->exit_reason = KVM_EXIT_DEBUG;
    ......
    kvm_run->debug.arch.pc = vmcs_readl(GUEST_CS_BASE) + rip;
    kvm_run->debug.arch.exception = ex_no;
    /* Change: update RIP here */
    kvm_rip_write(vcpu, rip + vmcs_read32(VM_EXIT_INSTRUCTION_LEN));
    break;

This allows the guest to proceed after the breakpoint exception is
triggered. However, the guest kernel keeps running into page faults at
arbitrary points in time, so I'm not sure whether I need to handle
something else too.

I have modified the userspace code so that it does not inject any
exception; it just records the occurrence of this VMEXIT and lets the
guest continue.

Is this the right approach?

Thank you very much.

Best Regards,
Arnabjyoti Kalita



* Re: Causing VMEXITs when kprobes are hit in the guest VM
  2022-05-07  6:30     ` Arnabjyoti Kalita
@ 2022-05-11  0:49       ` Jim Mattson
  2022-05-11 13:59         ` Sean Christopherson
  0 siblings, 1 reply; 12+ messages in thread
From: Jim Mattson @ 2022-05-11  0:49 UTC (permalink / raw)
  To: Arnabjyoti Kalita; +Cc: Sean Christopherson, kvm

On Fri, May 6, 2022 at 11:31 PM Arnabjyoti Kalita
<akalita@cs.stonybrook.edu> wrote:
>
> Is this the right approach?

Probably not. I'm not sure how kprobes work, but the tracepoint hooks
at function entry are multi-byte nopl instructions. The int3
instruction that raises a #BP fault is only one byte. If you advance
past that byte, you will try to execute the remaining bytes of the
original nopl. You want to skip past the entire nopl.
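
The byte arithmetic behind this can be sketched in a few lines of C. The
snippet assumes the 5-byte NOP encoding 0f 1f 44 00 00 as the instruction
being patched; the helper names are made up for the example and nothing here
touches a real guest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC

/* 5-byte NOP: nopl 0x0(%rax,%rax,1) */
static const uint8_t nop5[5] = { 0x0F, 0x1F, 0x44, 0x00, 0x00 };

/* Copy insn into out, replacing only its first byte with INT3. */
static void arm_breakpoint(const uint8_t *insn, size_t len, uint8_t *out)
{
	assert(insn != NULL && out != NULL && len > 0);
	for (size_t i = 0; i < len; i++)
		out[i] = insn[i];
	out[0] = INT3_OPCODE;
}

/*
 * Return 1 if resuming at resume_off would start decoding in the middle of
 * the original instruction, which occupies offsets [0, insn_len).
 */
static int lands_mid_instruction(size_t resume_off, size_t insn_len)
{
	return resume_off > 0 && resume_off < insn_len;
}
```

With these helpers, lands_mid_instruction(1, sizeof(nop5)) is true: skipping
only the one-byte INT3 resumes at the leftover 0x1f byte of the nopl, whereas
resuming at offset 5 lands on the next instruction boundary.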


* Re: Causing VMEXITs when kprobes are hit in the guest VM
  2022-05-11  0:49       ` Jim Mattson
@ 2022-05-11 13:59         ` Sean Christopherson
  2022-05-11 14:08           ` Arnabjyoti Kalita
  0 siblings, 1 reply; 12+ messages in thread
From: Sean Christopherson @ 2022-05-11 13:59 UTC (permalink / raw)
  To: Jim Mattson; +Cc: Arnabjyoti Kalita, kvm

On Wed, May 11, 2022, Jim Mattson wrote:
> Probably not. I'm not sure how kprobes work, but the tracepoint hooks
> at function entry are multi-byte nopl instructions. The int3
> instruction that raises a #BP fault is only one byte. If you advance
> past that byte, you will try to execute the remaining bytes of the
> original nopl. You want to skip past the entire nopl.

And kprobes aren't the only thing that will generate #BP, e.g. the kernel uses
INT3 for patching, userspace debuggers in the guest can insert INT3, etc...  The
correct thing to do is to re-inject the #BP back into the guest without touching
RIP.


* Re: Causing VMEXITs when kprobes are hit in the guest VM
  2022-05-11 13:59         ` Sean Christopherson
@ 2022-05-11 14:08           ` Arnabjyoti Kalita
  2022-05-11 14:16             ` Sean Christopherson
  0 siblings, 1 reply; 12+ messages in thread
From: Arnabjyoti Kalita @ 2022-05-11 14:08 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Jim Mattson, kvm

Hello Jim and Sean,

Thank you for your answers.

If I re-inject the #BP back into the guest, does it automatically take
care of updating the RIP and continuing execution?

I have seen that advancing RIP is unpredictable: it works for some
instructions but not for others, so ideally I would not go that
route.

Regards,
Arnabjyoti Kalita


* Re: Causing VMEXITs when kprobes are hit in the guest VM
  2022-05-11 14:08           ` Arnabjyoti Kalita
@ 2022-05-11 14:16             ` Sean Christopherson
  2022-05-11 14:38               ` Arnabjyoti Kalita
  0 siblings, 1 reply; 12+ messages in thread
From: Sean Christopherson @ 2022-05-11 14:16 UTC (permalink / raw)
  To: Arnabjyoti Kalita; +Cc: Jim Mattson, kvm

On Wed, May 11, 2022, Arnabjyoti Kalita wrote:
> Hello Jim and Sean,
> 
> Thank you for your answers.
> 
> If I re-inject the #BP back into the guest, does it automatically take
> care of updating the RIP and continuing execution?

Yes, the guest "automatically" handles the #BP.  What the appropriate handling may
be is up to the guest, i.e. skipping an instruction may or may not be the correct
thing to do.  Injecting the #BP after VM-Exit is simply emulating what would happen
from the guest's perspective if KVM had never intercepted the #BP in the first place.

Note, KVM doesn't have to initiate the injection, you can handle that from userspace
via KVM_SET_VCPU_EVENTS.  But if it's just as easy to hack KVM, that's totally fine
too, so long as userspace doesn't double inject.


* Re: Causing VMEXITs when kprobes are hit in the guest VM
  2022-05-11 14:16             ` Sean Christopherson
@ 2022-05-11 14:38               ` Arnabjyoti Kalita
  2022-05-11 15:04                 ` Sean Christopherson
  0 siblings, 1 reply; 12+ messages in thread
From: Arnabjyoti Kalita @ 2022-05-11 14:38 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Jim Mattson, kvm

Hello Sean,

Thank you for your answer. It might be easier, and my code might be
more "portable", if I let the guest handle the #BP itself.

When a VMEXIT happens, if I let KVM inject the #BP, exit to userspace
so that I can do what I want there, and then resume the guest via
KVM_RUN, I understand that the guest will handle the #BP and continue
execution.

What could be the various ways a guest could handle #BP? Can we "make"
the guest skip the instruction that caused the #BP ?

Best Regards,
Arnabjyoti Kalita



* Re: Causing VMEXITs when kprobes are hit in the guest VM
  2022-05-11 14:38               ` Arnabjyoti Kalita
@ 2022-05-11 15:04                 ` Sean Christopherson
  2022-05-11 17:02                   ` Arnabjyoti Kalita
  0 siblings, 1 reply; 12+ messages in thread
From: Sean Christopherson @ 2022-05-11 15:04 UTC (permalink / raw)
  To: Arnabjyoti Kalita; +Cc: Jim Mattson, kvm

On Wed, May 11, 2022, Arnabjyoti Kalita wrote:
> What could be the various ways a guest could handle #BP?

The kernel uses INT3 to patch instructions/flows, e.g. for alternatives.  For those,
the INT3 handler will unwind to the original RIP and retry.  The #BP will keep
occurring until the patching completes.  See text_poke_bp_batch(), poke_int3_handler(),
etc...

Userspace debuggers will do something similar; after catching the #BP, the original
instruction is restored and restarted.

The reason INT3 is a single byte is so that software can "atomically" trap/patch an
instruction without having to worry about cache line splits.  CPUs are guaranteed
to either see the INT3 or the original instruction in its entirety, i.e. other CPUs
will never decode a half-baked instruction.
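
The ordering that makes this work can be sketched on a plain byte buffer. This
is only a model of the idea, not the kernel's text_poke_bp_batch(): the
function name patch_via_int3() is hypothetical, and the cross-CPU
synchronization the kernel performs between steps is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INT3_OPCODE 0xCC

/*
 * Rewrite text[0..len) to new_insn in three steps and record in seen[] the
 * first byte visible after each step.
 */
static void patch_via_int3(uint8_t *text, const uint8_t *new_insn,
			   size_t len, uint8_t seen[3])
{
	assert(text != NULL && new_insn != NULL && seen != NULL && len > 0);

	/* Step 1: a single-byte store turns the instruction into INT3. */
	text[0] = INT3_OPCODE;
	seen[0] = text[0];

	/* Step 2: rewrite the tail while the first byte still traps. */
	memcpy(text + 1, new_insn + 1, len - 1);
	seen[1] = text[0];

	/* Step 3: a single-byte store publishes the new first byte. */
	text[0] = new_insn[0];
	seen[2] = text[0];
}
```

At every intermediate point the first byte is either INT3 or the start of a
complete instruction, which is the property described above.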

The kernel has even fancier uses for things like static_call(), e.g. emulating
CALL, RET, and JMP from the #BP handler.

> Can we "make" the guest skip the instruction that caused the #BP ?

Well, technically yes, that's effectively what would happen if the host skips the
INT3 and doesn't inject the #BP.  Can you do that and expect the guest not to
crash?  Nope.


* Re: Causing VMEXITs when kprobes are hit in the guest VM
  2022-05-11 15:04                 ` Sean Christopherson
@ 2022-05-11 17:02                   ` Arnabjyoti Kalita
  2022-05-23 19:44                     ` Arnabjyoti Kalita
  0 siblings, 1 reply; 12+ messages in thread
From: Arnabjyoti Kalita @ 2022-05-11 17:02 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Jim Mattson, kvm

Thank you for your answer, Sean.

I think I now have a fair idea on how to proceed. I will re-inject the
#BP into the guest from KVM and see what happens. I'm hoping the guest
will handle the #BP and continue execution without me needing to make
any more changes.

Best Regards,
Arnabjyoti Kalita


* Re: Causing VMEXITs when kprobes are hit in the guest VM
  2022-05-11 17:02                   ` Arnabjyoti Kalita
@ 2022-05-23 19:44                     ` Arnabjyoti Kalita
  0 siblings, 0 replies; 12+ messages in thread
From: Arnabjyoti Kalita @ 2022-05-23 19:44 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Jim Mattson, kvm

Dear Sean and all,

I tried to re-inject the #BP into the guest from userspace using the
KVM_SET_VCPU_EVENTS ioctl. However, after some time the KVM_EXIT_DEBUG
VMEXITs stop happening. I suspect that the guest RIP is not being
updated when I re-inject the #BP exception. The kprobe logs tell me
that the kprobe functions are being hit very frequently (I am
attaching a kprobe at "free_one_page()" and I do not expect it to be
called almost every microsecond).

This is the code I wrote in userspace (QEMU version 5.0.1) to
re-inject the #BP exception into the guest.

switch (run->exit_reason) {
    /* other exit reason handling */
    case KVM_EXIT_DEBUG: {
         /* Braces are needed because the case body starts with a declaration. */
         struct kvm_vcpu_events events = {};
         events.exception.nr = run->debug.arch.exception;
         events.exception.has_error_code = 0;
         events.exception.pending = 1;
         events.exception.injected = 1;
         events.exception.error_code = 0;
         if (kvm_vcpu_ioctl(cpu, KVM_SET_VCPU_EVENTS, &events) < 0)
             printf("Error while doing ioctl KVM_SET_VCPU_EVENTS\n");
         ret = 0;
         break;
    }
}

Do you see a need to initialize any other member of struct
kvm_vcpu_events? Do I need to change any of the member values that I
am passing to the ioctl? Why is the RIP still not updating?

Thank you very much for your answer again.

Best Regards,
Arnabjyoti Kalita



