* [PATCH 1/3] scripts/gdb: rework lx-symbols gdb script
2021-03-15 22:10 [PATCH 0/3] KVM: my debug patch queue Maxim Levitsky
@ 2021-03-15 22:10 ` Maxim Levitsky
2021-03-16 13:38 ` Jan Kiszka
2021-03-15 22:10 ` [PATCH 2/3] KVM: x86: guest debug: don't inject interrupts while single stepping Maxim Levitsky
2021-03-15 22:10 ` [PATCH 3/3] KVM: SVM: allow to intercept all exceptions for debug Maxim Levitsky
2 siblings, 1 reply; 32+ messages in thread
From: Maxim Levitsky @ 2021-03-15 22:10 UTC (permalink / raw)
To: kvm
Cc: Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Jan Kiszka, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Joerg Roedel, Sean Christopherson, Jim Mattson, Borislav Petkov,
Stefano Garzarella, Maxim Levitsky, H. Peter Anvin,
Paolo Bonzini, Ingo Molnar
Fix several issues that are present in lx-symbols script:
* Track module unloads by placing another software breakpoint at 'free_module'
(force uninline this symbol just in case), and use remove-symbol-file
gdb command to unload the symobls of the module that is unloading.
That gives the gdb a chance to mark all software breakpoints from
this module as pending again.
Also remove the module from the 'known' module list once it is unloaded.
* Since we now track module unload, we don't need to reload all
symbols anymore when 'known' module loaded again (that can't happen anymore).
This allows reloading a module in the debugged kernel to finish much faster,
while lx-symbols tracks module loads and unloads.
* Disable/enable all gdb breakpoints on both module load and unload breakpoint
hits, and not only in 'load_all_symbols' as was done before.
(load_all_symbols is no longer called on breakpoint hit)
That allows gdb to avoid getting confused about the state of the (now two)
internal breakpoints we place.
Otherwise it will leave them in the kernel code segment, when continuing
which triggers a guest kernel panic as soon as it skips over the 'int3'
instruction and executes the garbage tail of the optcode on which
the breakpoint was placed.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
kernel/module.c | 8 ++-
scripts/gdb/linux/symbols.py | 106 +++++++++++++++++++++++++----------
2 files changed, 83 insertions(+), 31 deletions(-)
diff --git a/kernel/module.c b/kernel/module.c
index 30479355ab850..ea81fc06ea1f5 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -901,8 +901,12 @@ int module_refcount(struct module *mod)
}
EXPORT_SYMBOL(module_refcount);
-/* This exists whether we can unload or not */
-static void free_module(struct module *mod);
+/* This exists whether we can unload or not
+ * Keep it uninlined to provide a reliable breakpoint target,
+ * e.g. for the gdb helper command 'lx-symbols'.
+ */
+
+static noinline void free_module(struct module *mod);
SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
unsigned int, flags)
diff --git a/scripts/gdb/linux/symbols.py b/scripts/gdb/linux/symbols.py
index 1be9763cf8bb2..4ce879548a1ae 100644
--- a/scripts/gdb/linux/symbols.py
+++ b/scripts/gdb/linux/symbols.py
@@ -17,6 +17,24 @@ import re
from linux import modules, utils
+def save_state():
+ breakpoints = []
+ if hasattr(gdb, 'breakpoints') and not gdb.breakpoints() is None:
+ for bp in gdb.breakpoints():
+ breakpoints.append({'breakpoint': bp, 'enabled': bp.enabled})
+ bp.enabled = False
+
+ show_pagination = gdb.execute("show pagination", to_string=True)
+ pagination = show_pagination.endswith("on.\n")
+ gdb.execute("set pagination off")
+
+ return {"breakpoints":breakpoints, "show_pagination": show_pagination}
+
+def load_state(state):
+ for breakpoint in state["breakpoints"]:
+ breakpoint['breakpoint'].enabled = breakpoint['enabled']
+ gdb.execute("set pagination %s" % ("on" if state["show_pagination"] else "off"))
+
if hasattr(gdb, 'Breakpoint'):
class LoadModuleBreakpoint(gdb.Breakpoint):
@@ -30,26 +48,38 @@ if hasattr(gdb, 'Breakpoint'):
module_name = module['name'].string()
cmd = self.gdb_command
+ # module already loaded, false alarm
+ if module_name in cmd.loaded_modules:
+ return False
+
# enforce update if object file is not found
cmd.module_files_updated = False
# Disable pagination while reporting symbol (re-)loading.
# The console input is blocked in this context so that we would
# get stuck waiting for the user to acknowledge paged output.
- show_pagination = gdb.execute("show pagination", to_string=True)
- pagination = show_pagination.endswith("on.\n")
- gdb.execute("set pagination off")
+ state = save_state()
+ cmd.load_module_symbols(module)
+ load_state(state)
+ return False
- if module_name in cmd.loaded_modules:
- gdb.write("refreshing all symbols to reload module "
- "'{0}'\n".format(module_name))
- cmd.load_all_symbols()
- else:
- cmd.load_module_symbols(module)
+ class UnLoadModuleBreakpoint(gdb.Breakpoint):
+ def __init__(self, spec, gdb_command):
+ super(UnLoadModuleBreakpoint, self).__init__(spec, internal=True)
+ self.silent = True
+ self.gdb_command = gdb_command
+
+ def stop(self):
+ module = gdb.parse_and_eval("mod")
+ module_name = module['name'].string()
+ cmd = self.gdb_command
- # restore pagination state
- gdb.execute("set pagination %s" % ("on" if pagination else "off"))
+ if not module_name in cmd.loaded_modules:
+ return False
+ state = save_state()
+ cmd.unload_module_symbols(module)
+ load_state(state)
return False
@@ -64,8 +94,9 @@ lx-symbols command."""
module_paths = []
module_files = []
module_files_updated = False
- loaded_modules = []
- breakpoint = None
+ loaded_modules = {}
+ module_load_breakpoint = None
+ module_unload_breakpoint = None
def __init__(self):
super(LxSymbols, self).__init__("lx-symbols", gdb.COMMAND_FILES,
@@ -129,21 +160,32 @@ lx-symbols command."""
filename=module_file,
addr=module_addr,
sections=self._section_arguments(module))
+
gdb.execute(cmdline, to_string=True)
- if module_name not in self.loaded_modules:
- self.loaded_modules.append(module_name)
+ self.loaded_modules[module_name] = {"module_file": module_file,
+ "module_addr": module_addr}
else:
gdb.write("no module object found for '{0}'\n".format(module_name))
+ def unload_module_symbols(self, module):
+ module_name = module['name'].string()
+
+ module_file = self.loaded_modules[module_name]["module_file"]
+ module_addr = self.loaded_modules[module_name]["module_addr"]
+
+ gdb.write("unloading @{addr}: {filename}\n".format(
+ addr=module_addr, filename=module_file))
+ cmdline = "remove-symbol-file {filename}".format(
+ filename=module_file)
+
+ gdb.execute(cmdline, to_string=True)
+ del self.loaded_modules[module_name]
+
+
def load_all_symbols(self):
gdb.write("loading vmlinux\n")
- # Dropping symbols will disable all breakpoints. So save their states
- # and restore them afterward.
- saved_states = []
- if hasattr(gdb, 'breakpoints') and not gdb.breakpoints() is None:
- for bp in gdb.breakpoints():
- saved_states.append({'breakpoint': bp, 'enabled': bp.enabled})
+ state = save_state()
# drop all current symbols and reload vmlinux
orig_vmlinux = 'vmlinux'
@@ -153,15 +195,14 @@ lx-symbols command."""
gdb.execute("symbol-file", to_string=True)
gdb.execute("symbol-file {0}".format(orig_vmlinux))
- self.loaded_modules = []
+ self.loaded_modules = {}
module_list = modules.module_list()
if not module_list:
gdb.write("no modules found\n")
else:
[self.load_module_symbols(module) for module in module_list]
- for saved_state in saved_states:
- saved_state['breakpoint'].enabled = saved_state['enabled']
+ load_state(state)
def invoke(self, arg, from_tty):
self.module_paths = [os.path.expanduser(p) for p in arg.split()]
@@ -174,11 +215,18 @@ lx-symbols command."""
self.load_all_symbols()
if hasattr(gdb, 'Breakpoint'):
- if self.breakpoint is not None:
- self.breakpoint.delete()
- self.breakpoint = None
- self.breakpoint = LoadModuleBreakpoint(
- "kernel/module.c:do_init_module", self)
+ if self.module_load_breakpoint is not None:
+ self.module_load_breakpoint.delete()
+ self.module_load_breakpoint = None
+ self.module_load_breakpoint = \
+ LoadModuleBreakpoint("kernel/module.c:do_init_module", self)
+
+ if self.module_unload_breakpoint is not None:
+ self.module_unload_breakpoint.delete()
+ self.module_unload_breakpoint = None
+ self.module_unload_breakpoint = \
+ UnLoadModuleBreakpoint("kernel/module.c:free_module", self)
+
else:
gdb.write("Note: symbol update on module loading not supported "
"with this gdb version\n")
--
2.26.2
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 1/3] scripts/gdb: rework lx-symbols gdb script
2021-03-15 22:10 ` [PATCH 1/3] scripts/gdb: rework lx-symbols gdb script Maxim Levitsky
@ 2021-03-16 13:38 ` Jan Kiszka
2021-03-16 14:12 ` Maxim Levitsky
0 siblings, 1 reply; 32+ messages in thread
From: Jan Kiszka @ 2021-03-16 13:38 UTC (permalink / raw)
To: Maxim Levitsky, kvm
Cc: Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Joerg Roedel, Sean Christopherson, Jim Mattson, Borislav Petkov,
Stefano Garzarella, H. Peter Anvin, Paolo Bonzini, Ingo Molnar
On 15.03.21 23:10, Maxim Levitsky wrote:
> Fix several issues that are present in lx-symbols script:
>
> * Track module unloads by placing another software breakpoint at 'free_module'
> (force uninline this symbol just in case), and use remove-symbol-file
> gdb command to unload the symobls of the module that is unloading.
>
> That gives the gdb a chance to mark all software breakpoints from
> this module as pending again.
> Also remove the module from the 'known' module list once it is unloaded.
>
> * Since we now track module unload, we don't need to reload all
> symbols anymore when 'known' module loaded again (that can't happen anymore).
> This allows reloading a module in the debugged kernel to finish much faster,
> while lx-symbols tracks module loads and unloads.
>
> * Disable/enable all gdb breakpoints on both module load and unload breakpoint
> hits, and not only in 'load_all_symbols' as was done before.
> (load_all_symbols is no longer called on breakpoint hit)
> That allows gdb to avoid getting confused about the state of the (now two)
> internal breakpoints we place.
>
> Otherwise it will leave them in the kernel code segment, when continuing
> which triggers a guest kernel panic as soon as it skips over the 'int3'
> instruction and executes the garbage tail of the optcode on which
> the breakpoint was placed.
>
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
> kernel/module.c | 8 ++-
> scripts/gdb/linux/symbols.py | 106 +++++++++++++++++++++++++----------
> 2 files changed, 83 insertions(+), 31 deletions(-)
>
> diff --git a/kernel/module.c b/kernel/module.c
> index 30479355ab850..ea81fc06ea1f5 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -901,8 +901,12 @@ int module_refcount(struct module *mod)
> }
> EXPORT_SYMBOL(module_refcount);
>
> -/* This exists whether we can unload or not */
> -static void free_module(struct module *mod);
> +/* This exists whether we can unload or not
> + * Keep it uninlined to provide a reliable breakpoint target,
> + * e.g. for the gdb helper command 'lx-symbols'.
> + */
> +
> +static noinline void free_module(struct module *mod);
>
> SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
> unsigned int, flags)
> diff --git a/scripts/gdb/linux/symbols.py b/scripts/gdb/linux/symbols.py
> index 1be9763cf8bb2..4ce879548a1ae 100644
> --- a/scripts/gdb/linux/symbols.py
> +++ b/scripts/gdb/linux/symbols.py
> @@ -17,6 +17,24 @@ import re
>
> from linux import modules, utils
>
> +def save_state():
Naming is a bit too generic. And it's not only saving the state, it's
also disabling things.
> + breakpoints = []
> + if hasattr(gdb, 'breakpoints') and not gdb.breakpoints() is None:
> + for bp in gdb.breakpoints():
> + breakpoints.append({'breakpoint': bp, 'enabled': bp.enabled})
> + bp.enabled = False
> +
> + show_pagination = gdb.execute("show pagination", to_string=True)
> + pagination = show_pagination.endswith("on.\n")
> + gdb.execute("set pagination off")
> +
> + return {"breakpoints":breakpoints, "show_pagination": show_pagination}
> +
> +def load_state(state):
Maybe rather something with "restore", to make naming balanced. Or is
there a use case where "state" is not coming from the function above?
> + for breakpoint in state["breakpoints"]:
> + breakpoint['breakpoint'].enabled = breakpoint['enabled']
> + gdb.execute("set pagination %s" % ("on" if state["show_pagination"] else "off"))
> +
>
> if hasattr(gdb, 'Breakpoint'):
> class LoadModuleBreakpoint(gdb.Breakpoint):
> @@ -30,26 +48,38 @@ if hasattr(gdb, 'Breakpoint'):
> module_name = module['name'].string()
> cmd = self.gdb_command
>
> + # module already loaded, false alarm
> + if module_name in cmd.loaded_modules:
> + return False
Possibly at all, now that we track unloading? Can our state tracking
become out-of-sync?
> +
> # enforce update if object file is not found
> cmd.module_files_updated = False
>
> # Disable pagination while reporting symbol (re-)loading.
> # The console input is blocked in this context so that we would
> # get stuck waiting for the user to acknowledge paged output.
> - show_pagination = gdb.execute("show pagination", to_string=True)
> - pagination = show_pagination.endswith("on.\n")
> - gdb.execute("set pagination off")
> + state = save_state()
> + cmd.load_module_symbols(module)
> + load_state(state)
> + return False
>
> - if module_name in cmd.loaded_modules:
> - gdb.write("refreshing all symbols to reload module "
> - "'{0}'\n".format(module_name))
> - cmd.load_all_symbols()
> - else:
> - cmd.load_module_symbols(module)
> + class UnLoadModuleBreakpoint(gdb.Breakpoint):
> + def __init__(self, spec, gdb_command):
> + super(UnLoadModuleBreakpoint, self).__init__(spec, internal=True)
> + self.silent = True
> + self.gdb_command = gdb_command
> +
> + def stop(self):
> + module = gdb.parse_and_eval("mod")
> + module_name = module['name'].string()
> + cmd = self.gdb_command
>
> - # restore pagination state
> - gdb.execute("set pagination %s" % ("on" if pagination else "off"))
> + if not module_name in cmd.loaded_modules:
> + return False
>
Same question as above. For robustness, checking is not bad. But maybe
it's worth reporting as well.
> + state = save_state()
> + cmd.unload_module_symbols(module)
> + load_state(state)
> return False
>
>
> @@ -64,8 +94,9 @@ lx-symbols command."""
> module_paths = []
> module_files = []
> module_files_updated = False
> - loaded_modules = []
> - breakpoint = None
> + loaded_modules = {}
> + module_load_breakpoint = None
> + module_unload_breakpoint = None
>
> def __init__(self):
> super(LxSymbols, self).__init__("lx-symbols", gdb.COMMAND_FILES,
> @@ -129,21 +160,32 @@ lx-symbols command."""
> filename=module_file,
> addr=module_addr,
> sections=self._section_arguments(module))
> +
> gdb.execute(cmdline, to_string=True)
> - if module_name not in self.loaded_modules:
> - self.loaded_modules.append(module_name)
> + self.loaded_modules[module_name] = {"module_file": module_file,
> + "module_addr": module_addr}
> else:
> gdb.write("no module object found for '{0}'\n".format(module_name))
>
> + def unload_module_symbols(self, module):
> + module_name = module['name'].string()
> +
> + module_file = self.loaded_modules[module_name]["module_file"]
> + module_addr = self.loaded_modules[module_name]["module_addr"]
> +
> + gdb.write("unloading @{addr}: {filename}\n".format(
> + addr=module_addr, filename=module_file))
> + cmdline = "remove-symbol-file {filename}".format(
> + filename=module_file)
> +
> + gdb.execute(cmdline, to_string=True)
> + del self.loaded_modules[module_name]
> +
> +
> def load_all_symbols(self):
> gdb.write("loading vmlinux\n")
>
> - # Dropping symbols will disable all breakpoints. So save their states
> - # and restore them afterward.
> - saved_states = []
> - if hasattr(gdb, 'breakpoints') and not gdb.breakpoints() is None:
> - for bp in gdb.breakpoints():
> - saved_states.append({'breakpoint': bp, 'enabled': bp.enabled})
> + state = save_state()
>
> # drop all current symbols and reload vmlinux
> orig_vmlinux = 'vmlinux'
> @@ -153,15 +195,14 @@ lx-symbols command."""
> gdb.execute("symbol-file", to_string=True)
> gdb.execute("symbol-file {0}".format(orig_vmlinux))
>
> - self.loaded_modules = []
> + self.loaded_modules = {}
> module_list = modules.module_list()
> if not module_list:
> gdb.write("no modules found\n")
> else:
> [self.load_module_symbols(module) for module in module_list]
>
> - for saved_state in saved_states:
> - saved_state['breakpoint'].enabled = saved_state['enabled']
> + load_state(state)
>
> def invoke(self, arg, from_tty):
> self.module_paths = [os.path.expanduser(p) for p in arg.split()]
> @@ -174,11 +215,18 @@ lx-symbols command."""
> self.load_all_symbols()
>
> if hasattr(gdb, 'Breakpoint'):
> - if self.breakpoint is not None:
> - self.breakpoint.delete()
> - self.breakpoint = None
> - self.breakpoint = LoadModuleBreakpoint(
> - "kernel/module.c:do_init_module", self)
> + if self.module_load_breakpoint is not None:
> + self.module_load_breakpoint.delete()
> + self.module_load_breakpoint = None
> + self.module_load_breakpoint = \
> + LoadModuleBreakpoint("kernel/module.c:do_init_module", self)
> +
> + if self.module_unload_breakpoint is not None:
> + self.module_unload_breakpoint.delete()
> + self.module_unload_breakpoint = None
> + self.module_unload_breakpoint = \
> + UnLoadModuleBreakpoint("kernel/module.c:free_module", self)
> +
> else:
> gdb.write("Note: symbol update on module loading not supported "
> "with this gdb version\n")
>
Good improvement!
Jan
--
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 1/3] scripts/gdb: rework lx-symbols gdb script
2021-03-16 13:38 ` Jan Kiszka
@ 2021-03-16 14:12 ` Maxim Levitsky
0 siblings, 0 replies; 32+ messages in thread
From: Maxim Levitsky @ 2021-03-16 14:12 UTC (permalink / raw)
To: Jan Kiszka, kvm
Cc: Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Joerg Roedel, Sean Christopherson, Jim Mattson, Borislav Petkov,
Stefano Garzarella, H. Peter Anvin, Paolo Bonzini, Ingo Molnar
On Tue, 2021-03-16 at 14:38 +0100, Jan Kiszka wrote:
> On 15.03.21 23:10, Maxim Levitsky wrote:
> > Fix several issues that are present in lx-symbols script:
> >
> > * Track module unloads by placing another software breakpoint at 'free_module'
> > (force uninline this symbol just in case), and use remove-symbol-file
> > gdb command to unload the symobls of the module that is unloading.
> >
> > That gives the gdb a chance to mark all software breakpoints from
> > this module as pending again.
> > Also remove the module from the 'known' module list once it is unloaded.
> >
> > * Since we now track module unload, we don't need to reload all
> > symbols anymore when 'known' module loaded again (that can't happen anymore).
> > This allows reloading a module in the debugged kernel to finish much faster,
> > while lx-symbols tracks module loads and unloads.
> >
> > * Disable/enable all gdb breakpoints on both module load and unload breakpoint
> > hits, and not only in 'load_all_symbols' as was done before.
> > (load_all_symbols is no longer called on breakpoint hit)
> > That allows gdb to avoid getting confused about the state of the (now two)
> > internal breakpoints we place.
> >
> > Otherwise it will leave them in the kernel code segment, when continuing
> > which triggers a guest kernel panic as soon as it skips over the 'int3'
> > instruction and executes the garbage tail of the optcode on which
> > the breakpoint was placed.
> >
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > ---
> > kernel/module.c | 8 ++-
> > scripts/gdb/linux/symbols.py | 106 +++++++++++++++++++++++++----------
> > 2 files changed, 83 insertions(+), 31 deletions(-)
> >
> > diff --git a/kernel/module.c b/kernel/module.c
> > index 30479355ab850..ea81fc06ea1f5 100644
> > --- a/kernel/module.c
> > +++ b/kernel/module.c
> > @@ -901,8 +901,12 @@ int module_refcount(struct module *mod)
> > }
> > EXPORT_SYMBOL(module_refcount);
> >
> > -/* This exists whether we can unload or not */
> > -static void free_module(struct module *mod);
> > +/* This exists whether we can unload or not
> > + * Keep it uninlined to provide a reliable breakpoint target,
> > + * e.g. for the gdb helper command 'lx-symbols'.
> > + */
> > +
> > +static noinline void free_module(struct module *mod);
> >
> > SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
> > unsigned int, flags)
> > diff --git a/scripts/gdb/linux/symbols.py b/scripts/gdb/linux/symbols.py
> > index 1be9763cf8bb2..4ce879548a1ae 100644
> > --- a/scripts/gdb/linux/symbols.py
> > +++ b/scripts/gdb/linux/symbols.py
> > @@ -17,6 +17,24 @@ import re
> >
> > from linux import modules, utils
> >
> > +def save_state():
>
> Naming is a bit too generic. And it's not only saving the state, it's
> also disabling things.
>
> > + breakpoints = []
> > + if hasattr(gdb, 'breakpoints') and not gdb.breakpoints() is None:
> > + for bp in gdb.breakpoints():
> > + breakpoints.append({'breakpoint': bp, 'enabled': bp.enabled})
> > + bp.enabled = False
> > +
> > + show_pagination = gdb.execute("show pagination", to_string=True)
> > + pagination = show_pagination.endswith("on.\n")
> > + gdb.execute("set pagination off")
> > +
> > + return {"breakpoints":breakpoints, "show_pagination": show_pagination}
> > +
> > +def load_state(state):
>
> Maybe rather something with "restore", to make naming balanced. Or is
> there a use case where "state" is not coming from the function above?
I didn't put much thought into naming these functions.
I'll think of something better.
>
> > + for breakpoint in state["breakpoints"]:
> > + breakpoint['breakpoint'].enabled = breakpoint['enabled']
> > + gdb.execute("set pagination %s" % ("on" if state["show_pagination"] else "off"))
> > +
> >
> > if hasattr(gdb, 'Breakpoint'):
> > class LoadModuleBreakpoint(gdb.Breakpoint):
> > @@ -30,26 +48,38 @@ if hasattr(gdb, 'Breakpoint'):
> > module_name = module['name'].string()
> > cmd = self.gdb_command
> >
> > + # module already loaded, false alarm
> > + if module_name in cmd.loaded_modules:
> > + return False
>
> Possibly at all, now that we track unloading? Can our state tracking
> become out-of-sync?
Sadly yes and that happens a lot unless kvm is patched to
avoid injecting interrupts on a single step.
What is happening is that this breakpoint is hit, then symbols
are loaded which is a relatively slow process, and by the time
the gdb resumes the guest, a timer interrupt is already pending
(kernel local apic timer is running while vcpus are stopped),
which makes the guest kernel take the interrupt (interrupts are
not disabled in these two intercepted functions) and eventually
return to the breakpoint, and trigger its python handler again.
This happens so often that especially with multiple vcpus,
it is possible to enter live-lock like state where guest+gdb
are stuck in this loop forever.
When KVM is patched, that indeed shouldn't happen but the check
won't hurt here.
Plus due to the feedback I received on the patch 2, I will end up
implementing the 'don't inject interrupts on single step' as
a new KVM debug feature flag, thus you will need a newer qemu
to make use of it.
>
> > +
> > # enforce update if object file is not found
> > cmd.module_files_updated = False
> >
> > # Disable pagination while reporting symbol (re-)loading.
> > # The console input is blocked in this context so that we would
> > # get stuck waiting for the user to acknowledge paged output.
> > - show_pagination = gdb.execute("show pagination", to_string=True)
> > - pagination = show_pagination.endswith("on.\n")
> > - gdb.execute("set pagination off")
> > + state = save_state()
> > + cmd.load_module_symbols(module)
> > + load_state(state)
> > + return False
> >
> > - if module_name in cmd.loaded_modules:
> > - gdb.write("refreshing all symbols to reload module "
> > - "'{0}'\n".format(module_name))
> > - cmd.load_all_symbols()
> > - else:
> > - cmd.load_module_symbols(module)
> > + class UnLoadModuleBreakpoint(gdb.Breakpoint):
> > + def __init__(self, spec, gdb_command):
> > + super(UnLoadModuleBreakpoint, self).__init__(spec, internal=True)
> > + self.silent = True
> > + self.gdb_command = gdb_command
> > +
> > + def stop(self):
> > + module = gdb.parse_and_eval("mod")
> > + module_name = module['name'].string()
> > + cmd = self.gdb_command
> >
> > - # restore pagination state
> > - gdb.execute("set pagination %s" % ("on" if pagination else "off"))
> > + if not module_name in cmd.loaded_modules:
> > + return False
> >
>
> Same question as above. For robustness, checking is not bad. But maybe
> it's worth reporting as well.
>
> > + state = save_state()
> > + cmd.unload_module_symbols(module)
> > + load_state(state)
> > return False
> >
> >
> > @@ -64,8 +94,9 @@ lx-symbols command."""
> > module_paths = []
> > module_files = []
> > module_files_updated = False
> > - loaded_modules = []
> > - breakpoint = None
> > + loaded_modules = {}
> > + module_load_breakpoint = None
> > + module_unload_breakpoint = None
> >
> > def __init__(self):
> > super(LxSymbols, self).__init__("lx-symbols", gdb.COMMAND_FILES,
> > @@ -129,21 +160,32 @@ lx-symbols command."""
> > filename=module_file,
> > addr=module_addr,
> > sections=self._section_arguments(module))
> > +
> > gdb.execute(cmdline, to_string=True)
> > - if module_name not in self.loaded_modules:
> > - self.loaded_modules.append(module_name)
> > + self.loaded_modules[module_name] = {"module_file": module_file,
> > + "module_addr": module_addr}
> > else:
> > gdb.write("no module object found for '{0}'\n".format(module_name))
> >
> > + def unload_module_symbols(self, module):
> > + module_name = module['name'].string()
> > +
> > + module_file = self.loaded_modules[module_name]["module_file"]
> > + module_addr = self.loaded_modules[module_name]["module_addr"]
> > +
> > + gdb.write("unloading @{addr}: {filename}\n".format(
> > + addr=module_addr, filename=module_file))
> > + cmdline = "remove-symbol-file {filename}".format(
> > + filename=module_file)
> > +
> > + gdb.execute(cmdline, to_string=True)
> > + del self.loaded_modules[module_name]
> > +
> > +
> > def load_all_symbols(self):
> > gdb.write("loading vmlinux\n")
> >
> > - # Dropping symbols will disable all breakpoints. So save their states
> > - # and restore them afterward.
> > - saved_states = []
> > - if hasattr(gdb, 'breakpoints') and not gdb.breakpoints() is None:
> > - for bp in gdb.breakpoints():
> > - saved_states.append({'breakpoint': bp, 'enabled': bp.enabled})
> > + state = save_state()
> >
> > # drop all current symbols and reload vmlinux
> > orig_vmlinux = 'vmlinux'
> > @@ -153,15 +195,14 @@ lx-symbols command."""
> > gdb.execute("symbol-file", to_string=True)
> > gdb.execute("symbol-file {0}".format(orig_vmlinux))
> >
> > - self.loaded_modules = []
> > + self.loaded_modules = {}
> > module_list = modules.module_list()
> > if not module_list:
> > gdb.write("no modules found\n")
> > else:
> > [self.load_module_symbols(module) for module in module_list]
> >
> > - for saved_state in saved_states:
> > - saved_state['breakpoint'].enabled = saved_state['enabled']
> > + load_state(state)
> >
> > def invoke(self, arg, from_tty):
> > self.module_paths = [os.path.expanduser(p) for p in arg.split()]
> > @@ -174,11 +215,18 @@ lx-symbols command."""
> > self.load_all_symbols()
> >
> > if hasattr(gdb, 'Breakpoint'):
> > - if self.breakpoint is not None:
> > - self.breakpoint.delete()
> > - self.breakpoint = None
> > - self.breakpoint = LoadModuleBreakpoint(
> > - "kernel/module.c:do_init_module", self)
> > + if self.module_load_breakpoint is not None:
> > + self.module_load_breakpoint.delete()
> > + self.module_load_breakpoint = None
> > + self.module_load_breakpoint = \
> > + LoadModuleBreakpoint("kernel/module.c:do_init_module", self)
> > +
> > + if self.module_unload_breakpoint is not None:
> > + self.module_unload_breakpoint.delete()
> > + self.module_unload_breakpoint = None
> > + self.module_unload_breakpoint = \
> > + UnLoadModuleBreakpoint("kernel/module.c:free_module", self)
> > +
> > else:
> > gdb.write("Note: symbol update on module loading not supported "
> > "with this gdb version\n")
> >
>
> Good improvement!
Thanks a lot!
Best regards,
Maxim Levitsky
>
> Jan
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 2/3] KVM: x86: guest debug: don't inject interrupts while single stepping
2021-03-15 22:10 [PATCH 0/3] KVM: my debug patch queue Maxim Levitsky
2021-03-15 22:10 ` [PATCH 1/3] scripts/gdb: rework lx-symbols gdb script Maxim Levitsky
@ 2021-03-15 22:10 ` Maxim Levitsky
2021-03-15 23:37 ` Sean Christopherson
2021-03-15 22:10 ` [PATCH 3/3] KVM: SVM: allow to intercept all exceptions for debug Maxim Levitsky
2 siblings, 1 reply; 32+ messages in thread
From: Maxim Levitsky @ 2021-03-15 22:10 UTC (permalink / raw)
To: kvm
Cc: Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Jan Kiszka, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Joerg Roedel, Sean Christopherson, Jim Mattson, Borislav Petkov,
Stefano Garzarella, Maxim Levitsky, H. Peter Anvin,
Paolo Bonzini, Ingo Molnar
This change greatly helps with two issues:
* Resuming from a breakpoint is much more reliable.
When resuming execution from a breakpoint, with interrupts enabled, more often
than not, KVM would inject an interrupt and make the CPU jump immediately to
the interrupt handler and eventually return to the breakpoint, to trigger it
again.
From the user point of view it looks like the CPU never executed a
single instruction and in some cases that can even prevent forward progress,
for example, when the breakpoint is placed by an automated script
(e.g lx-symbols), which does something in response to the breakpoint and then
continues the guest automatically.
If the script execution takes enough time for another interrupt to arrive,
the guest will be stuck on the same breakpoint RIP forever.
* Normal single stepping is much more predictable, since it won't land the
debugger into an interrupt handler, so it is much more usable.
(If entry to an interrupt handler is desired, the user can still place a
breakpoint at it and resume the guest, which won't activate this workaround
and let the gdb still stop at the interrupt handler)
Since this change is only active when guest is debugged, it won't affect
KVM running normal 'production' VMs.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Tested-by: Stefano Garzarella <sgarzare@redhat.com>
---
arch/x86/kvm/x86.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a9d95f90a0487..b75d990fcf12b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8458,6 +8458,12 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
can_inject = false;
}
+ /*
+ * Don't inject interrupts while single stepping to make guest debug easier
+ */
+ if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
+ return;
+
/*
* Finally, inject interrupt events. If an event cannot be injected
* due to architectural conditions (e.g. IF=0) a window-open exit
--
2.26.2
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 2/3] KVM: x86: guest debug: don't inject interrupts while single stepping
2021-03-15 22:10 ` [PATCH 2/3] KVM: x86: guest debug: don't inject interrupts while single stepping Maxim Levitsky
@ 2021-03-15 23:37 ` Sean Christopherson
2021-03-16 9:16 ` Jan Kiszka
2021-03-16 10:55 ` Maxim Levitsky
0 siblings, 2 replies; 32+ messages in thread
From: Sean Christopherson @ 2021-03-15 23:37 UTC (permalink / raw)
To: Maxim Levitsky
Cc: kvm, Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Jan Kiszka, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Joerg Roedel, Jim Mattson, Borislav Petkov, Stefano Garzarella,
H. Peter Anvin, Paolo Bonzini, Ingo Molnar
On Tue, Mar 16, 2021, Maxim Levitsky wrote:
> This change greatly helps with two issues:
>
> * Resuming from a breakpoint is much more reliable.
>
> When resuming execution from a breakpoint, with interrupts enabled, more often
> than not, KVM would inject an interrupt and make the CPU jump immediately to
> the interrupt handler and eventually return to the breakpoint, to trigger it
> again.
>
> From the user point of view it looks like the CPU never executed a
> single instruction and in some cases that can even prevent forward progress,
> for example, when the breakpoint is placed by an automated script
> (e.g lx-symbols), which does something in response to the breakpoint and then
> continues the guest automatically.
> If the script execution takes enough time for another interrupt to arrive,
> the guest will be stuck on the same breakpoint RIP forever.
>
> * Normal single stepping is much more predictable, since it won't land the
> debugger into an interrupt handler, so it is much more usable.
>
> (If entry to an interrupt handler is desired, the user can still place a
> breakpoint at it and resume the guest, which won't activate this workaround
> and let the gdb still stop at the interrupt handler)
>
> Since this change is only active when guest is debugged, it won't affect
> KVM running normal 'production' VMs.
>
>
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> Tested-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
> arch/x86/kvm/x86.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a9d95f90a0487..b75d990fcf12b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8458,6 +8458,12 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
> can_inject = false;
> }
>
> + /*
> + * Don't inject interrupts while single stepping to make guest debug easier
> + */
> + if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
> + return;
Is this something userspace can deal with? E.g. disable IRQs and/or set NMI
blocking at the start of single-stepping, unwind at the end? Deviating this far
from architectural behavior will end in tears at some point.
> +
> /*
> * Finally, inject interrupt events. If an event cannot be injected
> * due to architectural conditions (e.g. IF=0) a window-open exit
> --
> 2.26.2
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/3] KVM: x86: guest debug: don't inject interrupts while single stepping
2021-03-15 23:37 ` Sean Christopherson
@ 2021-03-16 9:16 ` Jan Kiszka
2021-03-16 10:59 ` Maxim Levitsky
2021-03-16 10:55 ` Maxim Levitsky
1 sibling, 1 reply; 32+ messages in thread
From: Jan Kiszka @ 2021-03-16 9:16 UTC (permalink / raw)
To: Sean Christopherson, Maxim Levitsky
Cc: kvm, Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Joerg Roedel, Jim Mattson, Borislav Petkov, Stefano Garzarella,
H. Peter Anvin, Paolo Bonzini, Ingo Molnar
On 16.03.21 00:37, Sean Christopherson wrote:
> On Tue, Mar 16, 2021, Maxim Levitsky wrote:
>> This change greatly helps with two issues:
>>
>> * Resuming from a breakpoint is much more reliable.
>>
>> When resuming execution from a breakpoint, with interrupts enabled, more often
>> than not, KVM would inject an interrupt and make the CPU jump immediately to
>> the interrupt handler and eventually return to the breakpoint, to trigger it
>> again.
>>
>> From the user point of view it looks like the CPU never executed a
>> single instruction and in some cases that can even prevent forward progress,
>> for example, when the breakpoint is placed by an automated script
>> (e.g lx-symbols), which does something in response to the breakpoint and then
>> continues the guest automatically.
>> If the script execution takes enough time for another interrupt to arrive,
>> the guest will be stuck on the same breakpoint RIP forever.
>>
>> * Normal single stepping is much more predictable, since it won't land the
>> debugger into an interrupt handler, so it is much more usable.
>>
>> (If entry to an interrupt handler is desired, the user can still place a
>> breakpoint at it and resume the guest, which won't activate this workaround
>> and let the gdb still stop at the interrupt handler)
>>
>> Since this change is only active when guest is debugged, it won't affect
>> KVM running normal 'production' VMs.
>>
>>
>> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
>> Tested-by: Stefano Garzarella <sgarzare@redhat.com>
>> ---
>> arch/x86/kvm/x86.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index a9d95f90a0487..b75d990fcf12b 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -8458,6 +8458,12 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
>> can_inject = false;
>> }
>>
>> + /*
>> + * Don't inject interrupts while single stepping to make guest debug easier
>> + */
>> + if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
>> + return;
>
> Is this something userspace can deal with? E.g. disable IRQs and/or set NMI
> blocking at the start of single-stepping, unwind at the end? Deviating this far
> from architectural behavior will end in tears at some point.
>
Does this happen to address this suspicious workaround in the kernel?
/*
* The kernel doesn't use TF single-step outside of:
*
* - Kprobes, consumed through kprobe_debug_handler()
* - KGDB, consumed through notify_debug()
*
* So if we get here with DR_STEP set, something is wonky.
*
* A known way to trigger this is through QEMU's GDB stub,
* which leaks #DB into the guest and causes IST recursion.
*/
if (WARN_ON_ONCE(dr6 & DR_STEP))
regs->flags &= ~X86_EFLAGS_TF;
(arch/x86/kernel/traps.c, exc_debug_kernel)
I wonder why this got merged while no one fixed QEMU/KVM, for years? Oh,
yeah, question to myself as well, dancing around broken guest debugging
for a long time while trying to fix other issues...
Jan
>> +
>> /*
>> * Finally, inject interrupt events. If an event cannot be injected
>> * due to architectural conditions (e.g. IF=0) a window-open exit
>> --
>> 2.26.2
>>
--
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/3] KVM: x86: guest debug: don't inject interrupts while single stepping
2021-03-16 9:16 ` Jan Kiszka
@ 2021-03-16 10:59 ` Maxim Levitsky
2021-03-16 11:27 ` Jan Kiszka
0 siblings, 1 reply; 32+ messages in thread
From: Maxim Levitsky @ 2021-03-16 10:59 UTC (permalink / raw)
To: Jan Kiszka, Sean Christopherson
Cc: kvm, Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Joerg Roedel, Jim Mattson, Borislav Petkov, Stefano Garzarella,
H. Peter Anvin, Paolo Bonzini, Ingo Molnar
On Tue, 2021-03-16 at 10:16 +0100, Jan Kiszka wrote:
> On 16.03.21 00:37, Sean Christopherson wrote:
> > On Tue, Mar 16, 2021, Maxim Levitsky wrote:
> > > This change greatly helps with two issues:
> > >
> > > * Resuming from a breakpoint is much more reliable.
> > >
> > > When resuming execution from a breakpoint, with interrupts enabled, more often
> > > than not, KVM would inject an interrupt and make the CPU jump immediately to
> > > the interrupt handler and eventually return to the breakpoint, to trigger it
> > > again.
> > >
> > > From the user point of view it looks like the CPU never executed a
> > > single instruction and in some cases that can even prevent forward progress,
> > > for example, when the breakpoint is placed by an automated script
> > > (e.g lx-symbols), which does something in response to the breakpoint and then
> > > continues the guest automatically.
> > > If the script execution takes enough time for another interrupt to arrive,
> > > the guest will be stuck on the same breakpoint RIP forever.
> > >
> > > * Normal single stepping is much more predictable, since it won't land the
> > > debugger into an interrupt handler, so it is much more usable.
> > >
> > > (If entry to an interrupt handler is desired, the user can still place a
> > > breakpoint at it and resume the guest, which won't activate this workaround
> > > and let the gdb still stop at the interrupt handler)
> > >
> > > Since this change is only active when guest is debugged, it won't affect
> > > KVM running normal 'production' VMs.
> > >
> > >
> > > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > > Tested-by: Stefano Garzarella <sgarzare@redhat.com>
> > > ---
> > > arch/x86/kvm/x86.c | 6 ++++++
> > > 1 file changed, 6 insertions(+)
> > >
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index a9d95f90a0487..b75d990fcf12b 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -8458,6 +8458,12 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
> > > can_inject = false;
> > > }
> > >
> > > + /*
> > > + * Don't inject interrupts while single stepping to make guest debug easier
> > > + */
> > > + if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
> > > + return;
> >
> > Is this something userspace can deal with? E.g. disable IRQs and/or set NMI
> > blocking at the start of single-stepping, unwind at the end? Deviating this far
> > from architectural behavior will end in tears at some point.
> >
>
> Does this happen to address this suspicious workaround in the kernel?
>
> /*
> * The kernel doesn't use TF single-step outside of:
> *
> * - Kprobes, consumed through kprobe_debug_handler()
> * - KGDB, consumed through notify_debug()
> *
> * So if we get here with DR_STEP set, something is wonky.
> *
> * A known way to trigger this is through QEMU's GDB stub,
> * which leaks #DB into the guest and causes IST recursion.
> */
> if (WARN_ON_ONCE(dr6 & DR_STEP))
> regs->flags &= ~X86_EFLAGS_TF;
>
> (arch/x86/kernel/traps.c, exc_debug_kernel)
>
> I wonder why this got merged while no one fixed QEMU/KVM, for years? Oh,
> yeah, question to myself as well, dancing around broken guest debugging
> for a long time while trying to fix other issues...
To be honest I didn't see that warning even once, but I can imagine KVM
leaking #DB due to bugs in that code. That area historically didn't receive
much attention since it can only be triggered by
KVM_GET/SET_GUEST_DEBUG which isn't used in production.
The only issue that I on the other hand did
see which is mostly gdb fault is that it fails to remove a software breakpoint
when resuming over it, if that breakpoint's python handler messes up
with gdb's symbols, which is what lx-symbols does.
And that despite the fact that lx-symbol doesn't mess with the object
(that is the kernel) where the breakpoint is defined.
Just adding/removing one symbol file is enough to trigger this issue.
Since lx-symbols already works this around when it reloads all symbols,
I extended that workaround to happen also when loading/unloading
only a single symbol file.
Best regards,
Maxim Levitsky
>
> Jan
>
> > > +
> > > /*
> > > * Finally, inject interrupt events. If an event cannot be injected
> > > * due to architectural conditions (e.g. IF=0) a window-open exit
> > > --
> > > 2.26.2
> > >
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/3] KVM: x86: guest debug: don't inject interrupts while single stepping
2021-03-16 10:59 ` Maxim Levitsky
@ 2021-03-16 11:27 ` Jan Kiszka
2021-03-16 12:34 ` Maxim Levitsky
0 siblings, 1 reply; 32+ messages in thread
From: Jan Kiszka @ 2021-03-16 11:27 UTC (permalink / raw)
To: Maxim Levitsky, Sean Christopherson
Cc: kvm, Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Joerg Roedel, Jim Mattson, Borislav Petkov, Stefano Garzarella,
H. Peter Anvin, Paolo Bonzini, Ingo Molnar
On 16.03.21 11:59, Maxim Levitsky wrote:
> On Tue, 2021-03-16 at 10:16 +0100, Jan Kiszka wrote:
>> On 16.03.21 00:37, Sean Christopherson wrote:
>>> On Tue, Mar 16, 2021, Maxim Levitsky wrote:
>>>> This change greatly helps with two issues:
>>>>
>>>> * Resuming from a breakpoint is much more reliable.
>>>>
>>>> When resuming execution from a breakpoint, with interrupts enabled, more often
>>>> than not, KVM would inject an interrupt and make the CPU jump immediately to
>>>> the interrupt handler and eventually return to the breakpoint, to trigger it
>>>> again.
>>>>
>>>> From the user point of view it looks like the CPU never executed a
>>>> single instruction and in some cases that can even prevent forward progress,
>>>> for example, when the breakpoint is placed by an automated script
>>>> (e.g lx-symbols), which does something in response to the breakpoint and then
>>>> continues the guest automatically.
>>>> If the script execution takes enough time for another interrupt to arrive,
>>>> the guest will be stuck on the same breakpoint RIP forever.
>>>>
>>>> * Normal single stepping is much more predictable, since it won't land the
>>>> debugger into an interrupt handler, so it is much more usable.
>>>>
>>>> (If entry to an interrupt handler is desired, the user can still place a
>>>> breakpoint at it and resume the guest, which won't activate this workaround
>>>> and let the gdb still stop at the interrupt handler)
>>>>
>>>> Since this change is only active when guest is debugged, it won't affect
>>>> KVM running normal 'production' VMs.
>>>>
>>>>
>>>> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
>>>> Tested-by: Stefano Garzarella <sgarzare@redhat.com>
>>>> ---
>>>> arch/x86/kvm/x86.c | 6 ++++++
>>>> 1 file changed, 6 insertions(+)
>>>>
>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>> index a9d95f90a0487..b75d990fcf12b 100644
>>>> --- a/arch/x86/kvm/x86.c
>>>> +++ b/arch/x86/kvm/x86.c
>>>> @@ -8458,6 +8458,12 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
>>>> can_inject = false;
>>>> }
>>>>
>>>> + /*
>>>> + * Don't inject interrupts while single stepping to make guest debug easier
>>>> + */
>>>> + if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
>>>> + return;
>>>
>>> Is this something userspace can deal with? E.g. disable IRQs and/or set NMI
>>> blocking at the start of single-stepping, unwind at the end? Deviating this far
>>> from architectural behavior will end in tears at some point.
>>>
>>
>> Does this happen to address this suspicious workaround in the kernel?
>>
>> /*
>> * The kernel doesn't use TF single-step outside of:
>> *
>> * - Kprobes, consumed through kprobe_debug_handler()
>> * - KGDB, consumed through notify_debug()
>> *
>> * So if we get here with DR_STEP set, something is wonky.
>> *
>> * A known way to trigger this is through QEMU's GDB stub,
>> * which leaks #DB into the guest and causes IST recursion.
>> */
>> if (WARN_ON_ONCE(dr6 & DR_STEP))
>> regs->flags &= ~X86_EFLAGS_TF;
>>
>> (arch/x86/kernel/traps.c, exc_debug_kernel)
>>
>> I wonder why this got merged while no one fixed QEMU/KVM, for years? Oh,
>> yeah, question to myself as well, dancing around broken guest debugging
>> for a long time while trying to fix other issues...
>
> To be honest I didn't see that warning even once, but I can imagine KVM
> leaking #DB due to bugs in that code. That area historically didn't receive
> much attention since it can only be triggered by
> KVM_GET/SET_GUEST_DEBUG which isn't used in production.
I've triggered it recently while debugging a guest, that's why I got
aware of the code path. Long ago, all this used to work (soft BPs,
single-stepping etc.)
>
> The only issue that I on the other hand did
> see which is mostly gdb fault is that it fails to remove a software breakpoint
> when resuming over it, if that breakpoint's python handler messes up
> with gdb's symbols, which is what lx-symbols does.
>
> And that despite the fact that lx-symbol doesn't mess with the object
> (that is the kernel) where the breakpoint is defined.
>
> Just adding/removing one symbol file is enough to trigger this issue.
>
> Since lx-symbols already works this around when it reloads all symbols,
> I extended that workaround to happen also when loading/unloading
> only a single symbol file.
You have no issue with interactive debugging when NOT using gdb scripts
/ lx-symbol?
Jan
--
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/3] KVM: x86: guest debug: don't inject interrupts while single stepping
2021-03-16 11:27 ` Jan Kiszka
@ 2021-03-16 12:34 ` Maxim Levitsky
2021-03-16 13:46 ` Jan Kiszka
2021-03-18 12:21 ` Jan Kiszka
0 siblings, 2 replies; 32+ messages in thread
From: Maxim Levitsky @ 2021-03-16 12:34 UTC (permalink / raw)
To: Jan Kiszka, Sean Christopherson
Cc: kvm, Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Joerg Roedel, Jim Mattson, Borislav Petkov, Stefano Garzarella,
H. Peter Anvin, Paolo Bonzini, Ingo Molnar
On Tue, 2021-03-16 at 12:27 +0100, Jan Kiszka wrote:
> On 16.03.21 11:59, Maxim Levitsky wrote:
> > On Tue, 2021-03-16 at 10:16 +0100, Jan Kiszka wrote:
> > > On 16.03.21 00:37, Sean Christopherson wrote:
> > > > On Tue, Mar 16, 2021, Maxim Levitsky wrote:
> > > > > This change greatly helps with two issues:
> > > > >
> > > > > * Resuming from a breakpoint is much more reliable.
> > > > >
> > > > > When resuming execution from a breakpoint, with interrupts enabled, more often
> > > > > than not, KVM would inject an interrupt and make the CPU jump immediately to
> > > > > the interrupt handler and eventually return to the breakpoint, to trigger it
> > > > > again.
> > > > >
> > > > > From the user point of view it looks like the CPU never executed a
> > > > > single instruction and in some cases that can even prevent forward progress,
> > > > > for example, when the breakpoint is placed by an automated script
> > > > > (e.g lx-symbols), which does something in response to the breakpoint and then
> > > > > continues the guest automatically.
> > > > > If the script execution takes enough time for another interrupt to arrive,
> > > > > the guest will be stuck on the same breakpoint RIP forever.
> > > > >
> > > > > * Normal single stepping is much more predictable, since it won't land the
> > > > > debugger into an interrupt handler, so it is much more usable.
> > > > >
> > > > > (If entry to an interrupt handler is desired, the user can still place a
> > > > > breakpoint at it and resume the guest, which won't activate this workaround
> > > > > and let the gdb still stop at the interrupt handler)
> > > > >
> > > > > Since this change is only active when guest is debugged, it won't affect
> > > > > KVM running normal 'production' VMs.
> > > > >
> > > > >
> > > > > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > > > > Tested-by: Stefano Garzarella <sgarzare@redhat.com>
> > > > > ---
> > > > > arch/x86/kvm/x86.c | 6 ++++++
> > > > > 1 file changed, 6 insertions(+)
> > > > >
> > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > > index a9d95f90a0487..b75d990fcf12b 100644
> > > > > --- a/arch/x86/kvm/x86.c
> > > > > +++ b/arch/x86/kvm/x86.c
> > > > > @@ -8458,6 +8458,12 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
> > > > > can_inject = false;
> > > > > }
> > > > >
> > > > > + /*
> > > > > + * Don't inject interrupts while single stepping to make guest debug easier
> > > > > + */
> > > > > + if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
> > > > > + return;
> > > >
> > > > Is this something userspace can deal with? E.g. disable IRQs and/or set NMI
> > > > blocking at the start of single-stepping, unwind at the end? Deviating this far
> > > > from architectural behavior will end in tears at some point.
> > > >
> > >
> > > Does this happen to address this suspicious workaround in the kernel?
> > >
> > > /*
> > > * The kernel doesn't use TF single-step outside of:
> > > *
> > > * - Kprobes, consumed through kprobe_debug_handler()
> > > * - KGDB, consumed through notify_debug()
> > > *
> > > * So if we get here with DR_STEP set, something is wonky.
> > > *
> > > * A known way to trigger this is through QEMU's GDB stub,
> > > * which leaks #DB into the guest and causes IST recursion.
> > > */
> > > if (WARN_ON_ONCE(dr6 & DR_STEP))
> > > regs->flags &= ~X86_EFLAGS_TF;
> > >
> > > (arch/x86/kernel/traps.c, exc_debug_kernel)
> > >
> > > I wonder why this got merged while no one fixed QEMU/KVM, for years? Oh,
> > > yeah, question to myself as well, dancing around broken guest debugging
> > > for a long time while trying to fix other issues...
> >
> > To be honest I didn't see that warning even once, but I can imagine KVM
> > leaking #DB due to bugs in that code. That area historically didn't receive
> > much attention since it can only be triggered by
> > KVM_GET/SET_GUEST_DEBUG which isn't used in production.
>
> I've triggered it recently while debugging a guest, that's why I got
> aware of the code path. Long ago, all this used to work (soft BPs,
> single-stepping etc.)
>
> > The only issue that I on the other hand did
> > see which is mostly gdb fault is that it fails to remove a software breakpoint
> > when resuming over it, if that breakpoint's python handler messes up
> > with gdb's symbols, which is what lx-symbols does.
> >
> > And that despite the fact that lx-symbol doesn't mess with the object
> > (that is the kernel) where the breakpoint is defined.
> >
> > Just adding/removing one symbol file is enough to trigger this issue.
> >
> > Since lx-symbols already works this around when it reloads all symbols,
> > I extended that workaround to happen also when loading/unloading
> > only a single symbol file.
>
> You have no issue with interactive debugging when NOT using gdb scripts
> / lx-symbol?
To be honest I don't use guest debugging that much,
so I probably missed some issues.
Now that I fixed lx-symbols though I'll probably use
guest debugging much more.
I will keep an eye on any issues that I find.
The main push to fix lx-symbols actually came
from me wanting to understand if there is something
broken with KVM's guest debugging knowing that
lx-symbols crashes the guest when module is loaded
after lx-symbols was executed.
That lx-symbols related guest crash I traced to issue
with gdb as I explained, and the lack of blocking of the interrupts
on single step is not a bug but more a missing feature
that should be implemented to make single step easier to use.
Another issue which isn't a bug is that you can't place a software
breakpoint if kernel is not loaded (since there is no code in memory)
or if the kernel haven't done basic paging initialization
(since there is no paging yet to know where to place the breakpoint).
Hardware breakpoints work for this fine though.
So in summary I haven't found any major issues with KVM's guest debug
yet.
If I do notice issues with guest debug, I will try to isolate
and debug them.
For the issue that you mentioned, do you have a way to reproduce it?
Best regards,
Maxim Levitsky
>
> Jan
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/3] KVM: x86: guest debug: don't inject interrupts while single stepping
2021-03-16 12:34 ` Maxim Levitsky
@ 2021-03-16 13:46 ` Jan Kiszka
2021-03-16 14:34 ` Maxim Levitsky
2021-03-18 12:21 ` Jan Kiszka
1 sibling, 1 reply; 32+ messages in thread
From: Jan Kiszka @ 2021-03-16 13:46 UTC (permalink / raw)
To: Maxim Levitsky, Sean Christopherson
Cc: kvm, Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Joerg Roedel, Jim Mattson, Borislav Petkov, Stefano Garzarella,
H. Peter Anvin, Paolo Bonzini, Ingo Molnar
On 16.03.21 13:34, Maxim Levitsky wrote:
> On Tue, 2021-03-16 at 12:27 +0100, Jan Kiszka wrote:
>> On 16.03.21 11:59, Maxim Levitsky wrote:
>>> On Tue, 2021-03-16 at 10:16 +0100, Jan Kiszka wrote:
>>>> On 16.03.21 00:37, Sean Christopherson wrote:
>>>>> On Tue, Mar 16, 2021, Maxim Levitsky wrote:
>>>>>> This change greatly helps with two issues:
>>>>>>
>>>>>> * Resuming from a breakpoint is much more reliable.
>>>>>>
>>>>>> When resuming execution from a breakpoint, with interrupts enabled, more often
>>>>>> than not, KVM would inject an interrupt and make the CPU jump immediately to
>>>>>> the interrupt handler and eventually return to the breakpoint, to trigger it
>>>>>> again.
>>>>>>
>>>>>> From the user point of view it looks like the CPU never executed a
>>>>>> single instruction and in some cases that can even prevent forward progress,
>>>>>> for example, when the breakpoint is placed by an automated script
>>>>>> (e.g lx-symbols), which does something in response to the breakpoint and then
>>>>>> continues the guest automatically.
>>>>>> If the script execution takes enough time for another interrupt to arrive,
>>>>>> the guest will be stuck on the same breakpoint RIP forever.
>>>>>>
>>>>>> * Normal single stepping is much more predictable, since it won't land the
>>>>>> debugger into an interrupt handler, so it is much more usable.
>>>>>>
>>>>>> (If entry to an interrupt handler is desired, the user can still place a
>>>>>> breakpoint at it and resume the guest, which won't activate this workaround
>>>>>> and let the gdb still stop at the interrupt handler)
>>>>>>
>>>>>> Since this change is only active when guest is debugged, it won't affect
>>>>>> KVM running normal 'production' VMs.
>>>>>>
>>>>>>
>>>>>> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
>>>>>> Tested-by: Stefano Garzarella <sgarzare@redhat.com>
>>>>>> ---
>>>>>> arch/x86/kvm/x86.c | 6 ++++++
>>>>>> 1 file changed, 6 insertions(+)
>>>>>>
>>>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>>>> index a9d95f90a0487..b75d990fcf12b 100644
>>>>>> --- a/arch/x86/kvm/x86.c
>>>>>> +++ b/arch/x86/kvm/x86.c
>>>>>> @@ -8458,6 +8458,12 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
>>>>>> can_inject = false;
>>>>>> }
>>>>>>
>>>>>> + /*
>>>>>> + * Don't inject interrupts while single stepping to make guest debug easier
>>>>>> + */
>>>>>> + if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
>>>>>> + return;
>>>>>
>>>>> Is this something userspace can deal with? E.g. disable IRQs and/or set NMI
>>>>> blocking at the start of single-stepping, unwind at the end? Deviating this far
>>>>> from architectural behavior will end in tears at some point.
>>>>>
>>>>
>>>> Does this happen to address this suspicious workaround in the kernel?
>>>>
>>>> /*
>>>> * The kernel doesn't use TF single-step outside of:
>>>> *
>>>> * - Kprobes, consumed through kprobe_debug_handler()
>>>> * - KGDB, consumed through notify_debug()
>>>> *
>>>> * So if we get here with DR_STEP set, something is wonky.
>>>> *
>>>> * A known way to trigger this is through QEMU's GDB stub,
>>>> * which leaks #DB into the guest and causes IST recursion.
>>>> */
>>>> if (WARN_ON_ONCE(dr6 & DR_STEP))
>>>> regs->flags &= ~X86_EFLAGS_TF;
>>>>
>>>> (arch/x86/kernel/traps.c, exc_debug_kernel)
>>>>
>>>> I wonder why this got merged while no one fixed QEMU/KVM, for years? Oh,
>>>> yeah, question to myself as well, dancing around broken guest debugging
>>>> for a long time while trying to fix other issues...
>>>
>>> To be honest I didn't see that warning even once, but I can imagine KVM
>>> leaking #DB due to bugs in that code. That area historically didn't receive
>>> much attention since it can only be triggered by
>>> KVM_GET/SET_GUEST_DEBUG which isn't used in production.
>>
>> I've triggered it recently while debugging a guest, that's why I got
>> aware of the code path. Long ago, all this used to work (soft BPs,
>> single-stepping etc.)
>>
>>> The only issue that I on the other hand did
>>> see which is mostly gdb fault is that it fails to remove a software breakpoint
>>> when resuming over it, if that breakpoint's python handler messes up
>>> with gdb's symbols, which is what lx-symbols does.
>>>
>>> And that despite the fact that lx-symbol doesn't mess with the object
>>> (that is the kernel) where the breakpoint is defined.
>>>
>>> Just adding/removing one symbol file is enough to trigger this issue.
>>>
>>> Since lx-symbols already works this around when it reloads all symbols,
>>> I extended that workaround to happen also when loading/unloading
>>> only a single symbol file.
>>
>> You have no issue with interactive debugging when NOT using gdb scripts
>> / lx-symbol?
>
> To be honest I don't use guest debugging that much,
> so I probably missed some issues.
>
> Now that I fixed lx-symbols though I'll probably use
> guest debugging much more.
> I will keep an eye on any issues that I find.
>
> The main push to fix lx-symbols actually came
> from me wanting to understand if there is something
> broken with KVM's guest debugging knowing that
> lx-symbols crashes the guest when module is loaded
> after lx-symbols was executed.
>
> That lx-symbols related guest crash I traced to issue
> with gdb as I explained, and the lack of blocking of the interrupts
> on single step is not a bug but more a missing feature
> that should be implemented to make single step easier to use.
Again, this used to work fine. But maybe this patch can change the
picture by avoid that the unavoidable short TF leakage into the guest
escalates beyond the single instruction to step over.
>
> Another issue which isn't a bug is that you can't place a software
> breakpoint if kernel is not loaded (since there is no code in memory)
> or if the kernel haven't done basic paging initialization
> (since there is no paging yet to know where to place the breakpoint).
> Hardware breakpoints work for this fine though.
>
> So in summary I haven't found any major issues with KVM's guest debug
> yet.
>
> If I do notice issues with guest debug, I will try to isolate
> and debug them.
> For the issue that you mentioned, do you have a way to reproduce it?
I need to spend some time in factoring out a clean test setup, will come
back to you. I'm always pushing this to the back - and then grumble when
hitting it while debugging something urgent. Your patch is a nice reason
to do this systematically now.
Jan
--
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/3] KVM: x86: guest debug: don't inject interrupts while single stepping
2021-03-16 13:46 ` Jan Kiszka
@ 2021-03-16 14:34 ` Maxim Levitsky
2021-03-16 15:31 ` Jan Kiszka
0 siblings, 1 reply; 32+ messages in thread
From: Maxim Levitsky @ 2021-03-16 14:34 UTC (permalink / raw)
To: Jan Kiszka, Sean Christopherson
Cc: kvm, Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Joerg Roedel, Jim Mattson, Borislav Petkov, Stefano Garzarella,
H. Peter Anvin, Paolo Bonzini, Ingo Molnar
On Tue, 2021-03-16 at 14:46 +0100, Jan Kiszka wrote:
> On 16.03.21 13:34, Maxim Levitsky wrote:
> > On Tue, 2021-03-16 at 12:27 +0100, Jan Kiszka wrote:
> > > On 16.03.21 11:59, Maxim Levitsky wrote:
> > > > On Tue, 2021-03-16 at 10:16 +0100, Jan Kiszka wrote:
> > > > > On 16.03.21 00:37, Sean Christopherson wrote:
> > > > > > On Tue, Mar 16, 2021, Maxim Levitsky wrote:
> > > > > > > This change greatly helps with two issues:
> > > > > > >
> > > > > > > * Resuming from a breakpoint is much more reliable.
> > > > > > >
> > > > > > > When resuming execution from a breakpoint, with interrupts enabled, more often
> > > > > > > than not, KVM would inject an interrupt and make the CPU jump immediately to
> > > > > > > the interrupt handler and eventually return to the breakpoint, to trigger it
> > > > > > > again.
> > > > > > >
> > > > > > > From the user point of view it looks like the CPU never executed a
> > > > > > > single instruction and in some cases that can even prevent forward progress,
> > > > > > > for example, when the breakpoint is placed by an automated script
> > > > > > > (e.g lx-symbols), which does something in response to the breakpoint and then
> > > > > > > continues the guest automatically.
> > > > > > > If the script execution takes enough time for another interrupt to arrive,
> > > > > > > the guest will be stuck on the same breakpoint RIP forever.
> > > > > > >
> > > > > > > * Normal single stepping is much more predictable, since it won't land the
> > > > > > > debugger into an interrupt handler, so it is much more usable.
> > > > > > >
> > > > > > > (If entry to an interrupt handler is desired, the user can still place a
> > > > > > > breakpoint at it and resume the guest, which won't activate this workaround
> > > > > > > and let the gdb still stop at the interrupt handler)
> > > > > > >
> > > > > > > Since this change is only active when guest is debugged, it won't affect
> > > > > > > KVM running normal 'production' VMs.
> > > > > > >
> > > > > > >
> > > > > > > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > > > > > > Tested-by: Stefano Garzarella <sgarzare@redhat.com>
> > > > > > > ---
> > > > > > > arch/x86/kvm/x86.c | 6 ++++++
> > > > > > > 1 file changed, 6 insertions(+)
> > > > > > >
> > > > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > > > > index a9d95f90a0487..b75d990fcf12b 100644
> > > > > > > --- a/arch/x86/kvm/x86.c
> > > > > > > +++ b/arch/x86/kvm/x86.c
> > > > > > > @@ -8458,6 +8458,12 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
> > > > > > > can_inject = false;
> > > > > > > }
> > > > > > >
> > > > > > > + /*
> > > > > > > + * Don't inject interrupts while single stepping to make guest debug easier
> > > > > > > + */
> > > > > > > + if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
> > > > > > > + return;
> > > > > >
> > > > > > Is this something userspace can deal with? E.g. disable IRQs and/or set NMI
> > > > > > blocking at the start of single-stepping, unwind at the end? Deviating this far
> > > > > > from architectural behavior will end in tears at some point.
> > > > > >
> > > > >
> > > > > Does this happen to address this suspicious workaround in the kernel?
> > > > >
> > > > > /*
> > > > > * The kernel doesn't use TF single-step outside of:
> > > > > *
> > > > > * - Kprobes, consumed through kprobe_debug_handler()
> > > > > * - KGDB, consumed through notify_debug()
> > > > > *
> > > > > * So if we get here with DR_STEP set, something is wonky.
> > > > > *
> > > > > * A known way to trigger this is through QEMU's GDB stub,
> > > > > * which leaks #DB into the guest and causes IST recursion.
> > > > > */
> > > > > if (WARN_ON_ONCE(dr6 & DR_STEP))
> > > > > regs->flags &= ~X86_EFLAGS_TF;
> > > > >
> > > > > (arch/x86/kernel/traps.c, exc_debug_kernel)
> > > > >
> > > > > I wonder why this got merged while no one fixed QEMU/KVM, for years? Oh,
> > > > > yeah, question to myself as well, dancing around broken guest debugging
> > > > > for a long time while trying to fix other issues...
> > > >
> > > > To be honest I didn't see that warning even once, but I can imagine KVM
> > > > leaking #DB due to bugs in that code. That area historically didn't receive
> > > > much attention since it can only be triggered by
> > > > KVM_GET/SET_GUEST_DEBUG which isn't used in production.
> > >
> > > I've triggered it recently while debugging a guest, that's why I got
> > > aware of the code path. Long ago, all this used to work (soft BPs,
> > > single-stepping etc.)
> > >
> > > > The only issue that I on the other hand did
> > > > see which is mostly gdb fault is that it fails to remove a software breakpoint
> > > > when resuming over it, if that breakpoint's python handler messes up
> > > > with gdb's symbols, which is what lx-symbols does.
> > > >
> > > > And that despite the fact that lx-symbol doesn't mess with the object
> > > > (that is the kernel) where the breakpoint is defined.
> > > >
> > > > Just adding/removing one symbol file is enough to trigger this issue.
> > > >
> > > > Since lx-symbols already works this around when it reloads all symbols,
> > > > I extended that workaround to happen also when loading/unloading
> > > > only a single symbol file.
> > >
> > > You have no issue with interactive debugging when NOT using gdb scripts
> > > / lx-symbol?
> >
> > To be honest I don't use guest debugging that much,
> > so I probably missed some issues.
> >
> > Now that I fixed lx-symbols though I'll probably use
> > guest debugging much more.
> > I will keep an eye on any issues that I find.
> >
> > The main push to fix lx-symbols actually came
> > from me wanting to understand if there is something
> > broken with KVM's guest debugging knowing that
> > lx-symbols crashes the guest when module is loaded
> > after lx-symbols was executed.
> >
> > That lx-symbols related guest crash I traced to issue
> > with gdb as I explained, and the lack of blocking of the interrupts
> > on single step is not a bug but more a missing feature
> > that should be implemented to make single step easier to use.
>
> Again, this used to work fine. But maybe this patch can change the
> picture by avoid that the unavoidable short TF leakage into the guest
> escalates beyond the single instruction to step over.
Actually now I think I understand what is going on.
The TF flag isn't auto cleared as RF flag is, and if the instruction
which is single stepped gets an interrupt it is pushed onto the interrupt stack.
(then it is cleared for the duration of the interrupt handler)
Since we use the TF flag for single stepping the guest, this indeed can
cause it to be leaked.
So this patch actually should mitigate this almost completely.
Also now I understand why Intel has the 'monitor trap' feature, I think it
is exactly for the cases when hypervisor wants to single step the guest
without the fear of changing of the guest visible cpu state.
KVM on VMX should probably switch to using monitor trap for single stepping.
Best regards,
Maxim Levitsky
>
> > Another issue which isn't a bug is that you can't place a software
> > breakpoint if kernel is not loaded (since there is no code in memory)
> > or if the kernel haven't done basic paging initialization
> > (since there is no paging yet to know where to place the breakpoint).
> > Hardware breakpoints work for this fine though.
> >
> > So in summary I haven't found any major issues with KVM's guest debug
> > yet.
> >
> > If I do notice issues with guest debug, I will try to isolate
> > and debug them.
> > For the issue that you mentioned, do you have a way to reproduce it?
>
> I need to spend some time in factoring out a clean test setup, will come
> back to you. I'm always pushing this to the back - and then grumble when
> hitting it while debugging something urgent. Your patch is a nice reason
> to do this systematically now.
Thanks for the review,
Best regards,
Maxim Levitsky
>
> Jan
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/3] KVM: x86: guest debug: don't inject interrupts while single stepping
2021-03-16 14:34 ` Maxim Levitsky
@ 2021-03-16 15:31 ` Jan Kiszka
[not found] ` <e2cd978e357155dbab21a523bb8981973bd10da7.camel@redhat.com>
0 siblings, 1 reply; 32+ messages in thread
From: Jan Kiszka @ 2021-03-16 15:31 UTC (permalink / raw)
To: Maxim Levitsky, Sean Christopherson
Cc: kvm, Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Joerg Roedel, Jim Mattson, Borislav Petkov, Stefano Garzarella,
H. Peter Anvin, Paolo Bonzini, Ingo Molnar
On 16.03.21 15:34, Maxim Levitsky wrote:
> On Tue, 2021-03-16 at 14:46 +0100, Jan Kiszka wrote:
>> On 16.03.21 13:34, Maxim Levitsky wrote:
>>> On Tue, 2021-03-16 at 12:27 +0100, Jan Kiszka wrote:
>>>> On 16.03.21 11:59, Maxim Levitsky wrote:
>>>>> On Tue, 2021-03-16 at 10:16 +0100, Jan Kiszka wrote:
>>>>>> On 16.03.21 00:37, Sean Christopherson wrote:
>>>>>>> On Tue, Mar 16, 2021, Maxim Levitsky wrote:
>>>>>>>> This change greatly helps with two issues:
>>>>>>>>
>>>>>>>> * Resuming from a breakpoint is much more reliable.
>>>>>>>>
>>>>>>>> When resuming execution from a breakpoint, with interrupts enabled, more often
>>>>>>>> than not, KVM would inject an interrupt and make the CPU jump immediately to
>>>>>>>> the interrupt handler and eventually return to the breakpoint, to trigger it
>>>>>>>> again.
>>>>>>>>
>>>>>>>> From the user point of view it looks like the CPU never executed a
>>>>>>>> single instruction and in some cases that can even prevent forward progress,
>>>>>>>> for example, when the breakpoint is placed by an automated script
>>>>>>>> (e.g lx-symbols), which does something in response to the breakpoint and then
>>>>>>>> continues the guest automatically.
>>>>>>>> If the script execution takes enough time for another interrupt to arrive,
>>>>>>>> the guest will be stuck on the same breakpoint RIP forever.
>>>>>>>>
>>>>>>>> * Normal single stepping is much more predictable, since it won't land the
>>>>>>>> debugger into an interrupt handler, so it is much more usable.
>>>>>>>>
>>>>>>>> (If entry to an interrupt handler is desired, the user can still place a
>>>>>>>> breakpoint at it and resume the guest, which won't activate this workaround
>>>>>>>> and let the gdb still stop at the interrupt handler)
>>>>>>>>
>>>>>>>> Since this change is only active when guest is debugged, it won't affect
>>>>>>>> KVM running normal 'production' VMs.
>>>>>>>>
>>>>>>>>
>>>>>>>> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
>>>>>>>> Tested-by: Stefano Garzarella <sgarzare@redhat.com>
>>>>>>>> ---
>>>>>>>> arch/x86/kvm/x86.c | 6 ++++++
>>>>>>>> 1 file changed, 6 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>>>>>> index a9d95f90a0487..b75d990fcf12b 100644
>>>>>>>> --- a/arch/x86/kvm/x86.c
>>>>>>>> +++ b/arch/x86/kvm/x86.c
>>>>>>>> @@ -8458,6 +8458,12 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
>>>>>>>> can_inject = false;
>>>>>>>> }
>>>>>>>>
>>>>>>>> + /*
>>>>>>>> + * Don't inject interrupts while single stepping to make guest debug easier
>>>>>>>> + */
>>>>>>>> + if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
>>>>>>>> + return;
>>>>>>>
>>>>>>> Is this something userspace can deal with? E.g. disable IRQs and/or set NMI
>>>>>>> blocking at the start of single-stepping, unwind at the end? Deviating this far
>>>>>>> from architectural behavior will end in tears at some point.
>>>>>>>
>>>>>>
>>>>>> Does this happen to address this suspicious workaround in the kernel?
>>>>>>
>>>>>> /*
>>>>>> * The kernel doesn't use TF single-step outside of:
>>>>>> *
>>>>>> * - Kprobes, consumed through kprobe_debug_handler()
>>>>>> * - KGDB, consumed through notify_debug()
>>>>>> *
>>>>>> * So if we get here with DR_STEP set, something is wonky.
>>>>>> *
>>>>>> * A known way to trigger this is through QEMU's GDB stub,
>>>>>> * which leaks #DB into the guest and causes IST recursion.
>>>>>> */
>>>>>> if (WARN_ON_ONCE(dr6 & DR_STEP))
>>>>>> regs->flags &= ~X86_EFLAGS_TF;
>>>>>>
>>>>>> (arch/x86/kernel/traps.c, exc_debug_kernel)
>>>>>>
>>>>>> I wonder why this got merged while no one fixed QEMU/KVM, for years? Oh,
>>>>>> yeah, question to myself as well, dancing around broken guest debugging
>>>>>> for a long time while trying to fix other issues...
>>>>>
>>>>> To be honest I didn't see that warning even once, but I can imagine KVM
>>>>> leaking #DB due to bugs in that code. That area historically didn't receive
>>>>> much attention since it can only be triggered by
>>>>> KVM_GET/SET_GUEST_DEBUG which isn't used in production.
>>>>
>>>> I've triggered it recently while debugging a guest, that's why I got
>>>> aware of the code path. Long ago, all this used to work (soft BPs,
>>>> single-stepping etc.)
>>>>
>>>>> The only issue that I on the other hand did
>>>>> see which is mostly gdb fault is that it fails to remove a software breakpoint
>>>>> when resuming over it, if that breakpoint's python handler messes up
>>>>> with gdb's symbols, which is what lx-symbols does.
>>>>>
>>>>> And that despite the fact that lx-symbol doesn't mess with the object
>>>>> (that is the kernel) where the breakpoint is defined.
>>>>>
>>>>> Just adding/removing one symbol file is enough to trigger this issue.
>>>>>
>>>>> Since lx-symbols already works this around when it reloads all symbols,
>>>>> I extended that workaround to happen also when loading/unloading
>>>>> only a single symbol file.
>>>>
>>>> You have no issue with interactive debugging when NOT using gdb scripts
>>>> / lx-symbol?
>>>
>>> To be honest I don't use guest debugging that much,
>>> so I probably missed some issues.
>>>
>>> Now that I fixed lx-symbols though I'll probably use
>>> guest debugging much more.
>>> I will keep an eye on any issues that I find.
>>>
>>> The main push to fix lx-symbols actually came
>>> from me wanting to understand if there is something
>>> broken with KVM's guest debugging knowing that
>>> lx-symbols crashes the guest when module is loaded
>>> after lx-symbols was executed.
>>>
>>> That lx-symbols related guest crash I traced to issue
>>> with gdb as I explained, and the lack of blocking of the interrupts
>>> on single step is not a bug but more a missing feature
>>> that should be implemented to make single step easier to use.
>>
>> Again, this used to work fine. But maybe this patch can change the
>> picture by avoid that the unavoidable short TF leakage into the guest
>> escalates beyond the single instruction to step over.
>
>
> Actually now I think I understand what is going on.
>
> The TF flag isn't auto cleared as RF flag is, and if the instruction
> which is single stepped gets an interrupt it is pushed onto the interrupt stack.
> (then it is cleared for the duration of the interrupt handler)
> Since we use the TF flag for single stepping the guest, this indeed can
> cause it to be leaked.
>
> So this patch actually should mitigate this almost completely.
>
> Also now I understand why Intel has the 'monitor trap' feature, I think it
> is exactly for the cases when hypervisor wants to single step the guest
> without the fear of changing of the guest visible cpu state.
Exactly.
>
> KVM on VMX should probably switch to using monitor trap for single stepping.
Back then, when I was hacking on the gdb-stub and KVM support, the
monitor trap flag was not yet broadly available, but the idea to once
use it was already there. Now it can be considered broadly available,
but it would still require some changes to get it in.
Unfortunately, we don't have such thing with SVM, even recent versions,
right? So, a proper way of avoiding diverting event injections while we
are having the guest in an "incorrect" state should definitely be the goal.
Given that KVM knows whether TF originates solely from guest debugging
or was (also) injected by the guest, we should be able to identify the
cases where your approach is best to apply. And that without any extra
control knob that everyone will only forget to set.
Jan
--
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/3] KVM: x86: guest debug: don't inject interrupts while single stepping
2021-03-16 12:34 ` Maxim Levitsky
2021-03-16 13:46 ` Jan Kiszka
@ 2021-03-18 12:21 ` Jan Kiszka
1 sibling, 0 replies; 32+ messages in thread
From: Jan Kiszka @ 2021-03-18 12:21 UTC (permalink / raw)
To: Maxim Levitsky, Sean Christopherson
Cc: kvm, Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Joerg Roedel, Jim Mattson, Borislav Petkov, Stefano Garzarella,
H. Peter Anvin, Paolo Bonzini, Ingo Molnar
On 16.03.21 13:34, Maxim Levitsky wrote:
> On Tue, 2021-03-16 at 12:27 +0100, Jan Kiszka wrote:
>> On 16.03.21 11:59, Maxim Levitsky wrote:
>>> On Tue, 2021-03-16 at 10:16 +0100, Jan Kiszka wrote:
>>>> On 16.03.21 00:37, Sean Christopherson wrote:
>>>>> On Tue, Mar 16, 2021, Maxim Levitsky wrote:
>>>>>> This change greatly helps with two issues:
>>>>>>
>>>>>> * Resuming from a breakpoint is much more reliable.
>>>>>>
>>>>>> When resuming execution from a breakpoint, with interrupts enabled, more often
>>>>>> than not, KVM would inject an interrupt and make the CPU jump immediately to
>>>>>> the interrupt handler and eventually return to the breakpoint, to trigger it
>>>>>> again.
>>>>>>
>>>>>> From the user point of view it looks like the CPU never executed a
>>>>>> single instruction and in some cases that can even prevent forward progress,
>>>>>> for example, when the breakpoint is placed by an automated script
>>>>>> (e.g lx-symbols), which does something in response to the breakpoint and then
>>>>>> continues the guest automatically.
>>>>>> If the script execution takes enough time for another interrupt to arrive,
>>>>>> the guest will be stuck on the same breakpoint RIP forever.
>>>>>>
>>>>>> * Normal single stepping is much more predictable, since it won't land the
>>>>>> debugger into an interrupt handler, so it is much more usable.
>>>>>>
>>>>>> (If entry to an interrupt handler is desired, the user can still place a
>>>>>> breakpoint at it and resume the guest, which won't activate this workaround
>>>>>> and let the gdb still stop at the interrupt handler)
>>>>>>
>>>>>> Since this change is only active when guest is debugged, it won't affect
>>>>>> KVM running normal 'production' VMs.
>>>>>>
>>>>>>
>>>>>> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
>>>>>> Tested-by: Stefano Garzarella <sgarzare@redhat.com>
>>>>>> ---
>>>>>> arch/x86/kvm/x86.c | 6 ++++++
>>>>>> 1 file changed, 6 insertions(+)
>>>>>>
>>>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>>>> index a9d95f90a0487..b75d990fcf12b 100644
>>>>>> --- a/arch/x86/kvm/x86.c
>>>>>> +++ b/arch/x86/kvm/x86.c
>>>>>> @@ -8458,6 +8458,12 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
>>>>>> can_inject = false;
>>>>>> }
>>>>>>
>>>>>> + /*
>>>>>> + * Don't inject interrupts while single stepping to make guest debug easier
>>>>>> + */
>>>>>> + if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
>>>>>> + return;
>>>>>
>>>>> Is this something userspace can deal with? E.g. disable IRQs and/or set NMI
>>>>> blocking at the start of single-stepping, unwind at the end? Deviating this far
>>>>> from architectural behavior will end in tears at some point.
>>>>>
>>>>
>>>> Does this happen to address this suspicious workaround in the kernel?
>>>>
>>>> /*
>>>> * The kernel doesn't use TF single-step outside of:
>>>> *
>>>> * - Kprobes, consumed through kprobe_debug_handler()
>>>> * - KGDB, consumed through notify_debug()
>>>> *
>>>> * So if we get here with DR_STEP set, something is wonky.
>>>> *
>>>> * A known way to trigger this is through QEMU's GDB stub,
>>>> * which leaks #DB into the guest and causes IST recursion.
>>>> */
>>>> if (WARN_ON_ONCE(dr6 & DR_STEP))
>>>> regs->flags &= ~X86_EFLAGS_TF;
>>>>
>>>> (arch/x86/kernel/traps.c, exc_debug_kernel)
>>>>
>>>> I wonder why this got merged while no one fixed QEMU/KVM, for years? Oh,
>>>> yeah, question to myself as well, dancing around broken guest debugging
>>>> for a long time while trying to fix other issues...
>>>
>>> To be honest I didn't see that warning even once, but I can imagine KVM
>>> leaking #DB due to bugs in that code. That area historically didn't receive
>>> much attention since it can only be triggered by
>>> KVM_GET/SET_GUEST_DEBUG which isn't used in production.
>>
>> I've triggered it recently while debugging a guest, that's why I got
>> aware of the code path. Long ago, all this used to work (soft BPs,
>> single-stepping etc.)
>>
>>> The only issue that I on the other hand did
>>> see which is mostly gdb fault is that it fails to remove a software breakpoint
>>> when resuming over it, if that breakpoint's python handler messes up
>>> with gdb's symbols, which is what lx-symbols does.
>>>
>>> And that despite the fact that lx-symbol doesn't mess with the object
>>> (that is the kernel) where the breakpoint is defined.
>>>
>>> Just adding/removing one symbol file is enough to trigger this issue.
>>>
>>> Since lx-symbols already works this around when it reloads all symbols,
>>> I extended that workaround to happen also when loading/unloading
>>> only a single symbol file.
>>
>> You have no issue with interactive debugging when NOT using gdb scripts
>> / lx-symbol?
>
> To be honest I don't use guest debugging that much,
> so I probably missed some issues.
>
> Now that I fixed lx-symbols though I'll probably use
> guest debugging much more.
> I will keep an eye on any issues that I find.
>
> The main push to fix lx-symbols actually came
> from me wanting to understand if there is something
> broken with KVM's guest debugging knowing that
> lx-symbols crashes the guest when module is loaded
> after lx-symbols was executed.
>
> That lx-symbols related guest crash I traced to issue
> with gdb as I explained, and the lack of blocking of the interrupts
> on single step is not a bug but more a missing feature
> that should be implemented to make single step easier to use.
>
> Another issue which isn't a bug is that you can't place a software
> breakpoint if kernel is not loaded (since there is no code in memory)
> or if the kernel haven't done basic paging initialization
> (since there is no paging yet to know where to place the breakpoint).
> Hardware breakpoints work for this fine though.
>
> So in summary I haven't found any major issues with KVM's guest debug
> yet.
>
> If I do notice issues with guest debug, I will try to isolate
> and debug them.
> For the issue that you mentioned, do you have a way to reproduce it?
To pick this up again, I did some experiments and was easily able to
reproduce the main problem:
- checkout linux master (1df27313f50 yesterday)
- build a fitting host kernel with KVM
- build a target kernel with defconfig + CONFIG_KVM_GUEST +
CONFIG_DEBUG_INFO, no gdb scripts for now
- boot the guest
qemu-system-x86_64 linux.img -enable-kvm -smp 4 -s
-kernel bzImage -append "console=ttyS0 root=... nokaslr"
- gdb vmlinux
- tar rem :1234
- b __x64_sys_execve
- continue whenever you hit the breakpoint, and the guest will
eventually hang
- or stepi, and you may end up in the interrupt handler
- specifically doing that on the serial console or setting the bp in
early boot exposes the issue
I've also seen
WARNING: CPU: 3 PID: 751 at ../arch/x86/kernel/traps.c:915
exc_debug+0x16b/0x1c0
from time to time.
When I apply this patch to the host, the problems are gone.
Interestingly, my OpenSUSE 5.3.18-lp152.66-default does not show this
behavior and /seems/ stable from quick testing. Not sure if there were
changes in upstream between that baseline and head or if Suse is
carrying a local fix.
In any case, I think we need the following:
- default disabling of event injections when guest debugging injected
TF
- possibly some control interface to allow events BUT then avoids any
TF injection to prevent guest state corruptions
- exposure of that interface to the gdb frontend, maybe by making it
available via the QEMU monitor (monitor ...)
- a for kvm-unit-tests to trigger the above corner cases
Jan
--
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/3] KVM: x86: guest debug: don't inject interrupts while single stepping
2021-03-15 23:37 ` Sean Christopherson
2021-03-16 9:16 ` Jan Kiszka
@ 2021-03-16 10:55 ` Maxim Levitsky
1 sibling, 0 replies; 32+ messages in thread
From: Maxim Levitsky @ 2021-03-16 10:55 UTC (permalink / raw)
To: Sean Christopherson
Cc: kvm, Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Jan Kiszka, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Joerg Roedel, Jim Mattson, Borislav Petkov, Stefano Garzarella,
H. Peter Anvin, Paolo Bonzini, Ingo Molnar
On Mon, 2021-03-15 at 16:37 -0700, Sean Christopherson wrote:
> On Tue, Mar 16, 2021, Maxim Levitsky wrote:
> > This change greatly helps with two issues:
> >
> > * Resuming from a breakpoint is much more reliable.
> >
> > When resuming execution from a breakpoint, with interrupts enabled, more often
> > than not, KVM would inject an interrupt and make the CPU jump immediately to
> > the interrupt handler and eventually return to the breakpoint, to trigger it
> > again.
> >
> > From the user point of view it looks like the CPU never executed a
> > single instruction and in some cases that can even prevent forward progress,
> > for example, when the breakpoint is placed by an automated script
> > (e.g lx-symbols), which does something in response to the breakpoint and then
> > continues the guest automatically.
> > If the script execution takes enough time for another interrupt to arrive,
> > the guest will be stuck on the same breakpoint RIP forever.
> >
> > * Normal single stepping is much more predictable, since it won't land the
> > debugger into an interrupt handler, so it is much more usable.
> >
> > (If entry to an interrupt handler is desired, the user can still place a
> > breakpoint at it and resume the guest, which won't activate this workaround
> > and let the gdb still stop at the interrupt handler)
> >
> > Since this change is only active when guest is debugged, it won't affect
> > KVM running normal 'production' VMs.
> >
> >
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > Tested-by: Stefano Garzarella <sgarzare@redhat.com>
> > ---
> > arch/x86/kvm/x86.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index a9d95f90a0487..b75d990fcf12b 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -8458,6 +8458,12 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
> > can_inject = false;
> > }
> >
> > + /*
> > + * Don't inject interrupts while single stepping to make guest debug easier
> > + */
> > + if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
> > + return;
>
> Is this something userspace can deal with? E.g. disable IRQs and/or set NMI
> blocking at the start of single-stepping, unwind at the end? Deviating this far
> from architectural behavior will end in tears at some point.
I don't worry about NMI, but for IRQs, userspace can clear EFLAGS.IF, but that
can be messy to unwind, if an instruction that clears the interrupt flag was
single stepped over.
There is also notion of interrupt shadow but it also is reserved for things
like delaying interrupts for one cycle after sti, and such.
IMHO KVM_GUESTDBG_SINGLESTEP is already non architectural feature (userspace
basically tell the KVM to single step the guest but it doesn't set TF flag
or something like that), so changing its definition shouldn't be a problem.
If you worry about some automated script breaking due to the change,
(I expect that KVM_GUESTDBG_SINGLESTEP is mostly used manually, especially
since single stepping is never 100% reliable due to various issues like that),
I can add another flag to it which will block all the interrupts.
(like say KVM_GUESTDBG_BLOCKEVENTS).
In fact qemu already has single step flags, enabled over special qemu gdb extension
'maintenance packet qqemu.sstepbits'
Those single step flags allow to disable interrupts and qemu timers during the single stepping,
(and both modes are enabled by default)
However kvm code in qemu ignores these bits.
What do you think?
Best regards,
Maxim Levitsky
>
> > +
> > /*
> > * Finally, inject interrupt events. If an event cannot be injected
> > * due to architectural conditions (e.g. IF=0) a window-open exit
> > --
> > 2.26.2
> >
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 3/3] KVM: SVM: allow to intercept all exceptions for debug
2021-03-15 22:10 [PATCH 0/3] KVM: my debug patch queue Maxim Levitsky
2021-03-15 22:10 ` [PATCH 1/3] scripts/gdb: rework lx-symbols gdb script Maxim Levitsky
2021-03-15 22:10 ` [PATCH 2/3] KVM: x86: guest debug: don't inject interrupts while single stepping Maxim Levitsky
@ 2021-03-15 22:10 ` Maxim Levitsky
2021-03-16 8:32 ` Joerg Roedel
2021-03-16 8:34 ` Borislav Petkov
2 siblings, 2 replies; 32+ messages in thread
From: Maxim Levitsky @ 2021-03-15 22:10 UTC (permalink / raw)
To: kvm
Cc: Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Jan Kiszka, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Joerg Roedel, Sean Christopherson, Jim Mattson, Borislav Petkov,
Stefano Garzarella, Maxim Levitsky, H. Peter Anvin,
Paolo Bonzini, Ingo Molnar, Borislav Petkov
Add a new debug module param 'debug_intercept_exceptions' which will allow the
KVM to intercept any guest exception, and forward it to the guest.
This can be very useful for guest debugging and/or KVM debugging with kvm trace.
This is not intended to be used on production systems.
This is based on an idea first shown here:
https://patchwork.kernel.org/project/kvm/patch/20160301192822.GD22677@pd.tnic/
CC: Borislav Petkov <bp@suse.de>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
arch/x86/include/asm/kvm_host.h | 2 +
arch/x86/kvm/svm/svm.c | 77 ++++++++++++++++++++++++++++++++-
arch/x86/kvm/svm/svm.h | 5 ++-
arch/x86/kvm/x86.c | 5 ++-
4 files changed, 85 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index a52f973bdff6d..c8f44a88b3153 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1564,6 +1564,8 @@ int kvm_emulate_rdpmc(struct kvm_vcpu *vcpu);
void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr);
void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code);
void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr, unsigned long payload);
+void kvm_queue_exception_e_p(struct kvm_vcpu *vcpu, unsigned nr,
+ u32 error_code, unsigned long payload);
void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr);
void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code);
void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 271196400495f..94156a367a663 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -197,6 +197,9 @@ module_param(sev_es, int, 0444);
bool __read_mostly dump_invalid_vmcb;
module_param(dump_invalid_vmcb, bool, 0644);
+uint debug_intercept_exceptions;
+module_param(debug_intercept_exceptions, uint, 0444);
+
static bool svm_gp_erratum_intercept = true;
static u8 rsm_ins_bytes[] = "\x0f\xaa";
@@ -220,6 +223,8 @@ static const u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000};
#define MSRS_RANGE_SIZE 2048
#define MSRS_IN_RANGE (MSRS_RANGE_SIZE * 8 / 2)
+static void init_debug_exceptions_intercept(struct vcpu_svm *svm);
+
u32 svm_msrpm_offset(u32 msr)
{
u32 offset;
@@ -1137,6 +1142,8 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
set_exception_intercept(svm, MC_VECTOR);
set_exception_intercept(svm, AC_VECTOR);
set_exception_intercept(svm, DB_VECTOR);
+
+ init_debug_exceptions_intercept(svm);
/*
* Guest access to VMware backdoor ports could legitimately
* trigger #GP because of TSS I/O permission bitmap.
@@ -1913,6 +1920,17 @@ static int pf_interception(struct kvm_vcpu *vcpu)
u64 fault_address = svm->vmcb->control.exit_info_2;
u64 error_code = svm->vmcb->control.exit_info_1;
+ if ((debug_intercept_exceptions & (1 << PF_VECTOR)))
+ if (npt_enabled && !vcpu->arch.apf.host_apf_flags) {
+ /* If #PF was only intercepted for debug, inject
+ * it directly to the guest, since the mmu code
+ * is not ready to deal with such page faults
+ */
+ kvm_queue_exception_e_p(vcpu, PF_VECTOR,
+ error_code, fault_address);
+ return 1;
+ }
+
return kvm_handle_page_fault(vcpu, error_code, fault_address,
static_cpu_has(X86_FEATURE_DECODEASSISTS) ?
svm->vmcb->control.insn_bytes : NULL,
@@ -3025,7 +3043,7 @@ static int invpcid_interception(struct kvm_vcpu *vcpu)
return kvm_handle_invpcid(vcpu, type, gva);
}
-static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
+static int (*svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
[SVM_EXIT_READ_CR0] = cr_interception,
[SVM_EXIT_READ_CR3] = cr_interception,
[SVM_EXIT_READ_CR4] = cr_interception,
@@ -3099,6 +3117,63 @@ static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
[SVM_EXIT_VMGEXIT] = sev_handle_vmgexit,
};
+static int generic_exception_interception(struct kvm_vcpu *vcpu)
+{
+ /*
+ * Generic exception handler which forwards a guest exception
+ * as-is to the guest.
+ * For exceptions that don't have a special intercept handler.
+ *
+ * Used for 'debug_intercept_exceptions' KVM debug feature only.
+ */
+ struct vcpu_svm *svm = to_svm(vcpu);
+ int exc = svm->vmcb->control.exit_code - SVM_EXIT_EXCP_BASE;
+
+ WARN_ON(exc < 0 || exc > 31);
+
+ if (exc == TS_VECTOR) {
+ /*
+ * SVM doesn't provide us with an error code to be able to
+ * re-inject the #TS exception, so just disable its
+ * interception, and let the guest re-execute the instruction.
+ */
+ vmcb_clr_intercept(&svm->vmcb01.ptr->control,
+ INTERCEPT_EXCEPTION_OFFSET + TS_VECTOR);
+ recalc_intercepts(svm);
+ return 1;
+ } else if (exc == DF_VECTOR) {
+ /* SVM doesn't provide us with an error code for the #DF */
+ kvm_queue_exception_e(vcpu, exc, 0);
+ return 1;
+ }
+
+ if (x86_exception_has_error_code(exc))
+ kvm_queue_exception_e(vcpu, exc, svm->vmcb->control.exit_info_1);
+ else
+ kvm_queue_exception(vcpu, exc);
+ return 1;
+}
+
+static void init_debug_exceptions_intercept(struct vcpu_svm *svm)
+{
+ int exc;
+
+ for (exc = 0 ; exc < 32 ; exc++) {
+ if (!(debug_intercept_exceptions & (1 << exc)))
+ continue;
+
+ /* Those are defined to have undefined behavior in the SVM spec */
+ if (exc == 2 || exc == 9)
+ continue;
+
+ set_exception_intercept(svm, exc);
+
+ if (!svm_exit_handlers[SVM_EXIT_EXCP_BASE + exc])
+ svm_exit_handlers[SVM_EXIT_EXCP_BASE + exc] =
+ generic_exception_interception;
+ }
+}
+
static void dump_vmcb(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 8e276c4fb33df..e0ff9ca996df8 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -32,6 +32,7 @@ static const u32 host_save_user_msrs[] = {
#define MSRPM_OFFSETS 16
extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
extern bool npt_enabled;
+extern uint debug_intercept_exceptions;
enum {
VMCB_INTERCEPTS, /* Intercept vectors, TSC offset,
@@ -333,7 +334,9 @@ static inline void clr_exception_intercept(struct vcpu_svm *svm, u32 bit)
struct vmcb *vmcb = svm->vmcb01.ptr;
WARN_ON_ONCE(bit >= 32);
- vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
+
+ if (!((1 << bit) & debug_intercept_exceptions))
+ vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
recalc_intercepts(svm);
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b75d990fcf12b..be509944622bd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -627,12 +627,13 @@ void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr,
}
EXPORT_SYMBOL_GPL(kvm_queue_exception_p);
-static void kvm_queue_exception_e_p(struct kvm_vcpu *vcpu, unsigned nr,
- u32 error_code, unsigned long payload)
+void kvm_queue_exception_e_p(struct kvm_vcpu *vcpu, unsigned nr,
+ u32 error_code, unsigned long payload)
{
kvm_multiple_exception(vcpu, nr, true, error_code,
true, payload, false);
}
+EXPORT_SYMBOL_GPL(kvm_queue_exception_e_p);
int kvm_complete_insn_gp(struct kvm_vcpu *vcpu, int err)
{
--
2.26.2
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 3/3] KVM: SVM: allow to intercept all exceptions for debug
2021-03-15 22:10 ` [PATCH 3/3] KVM: SVM: allow to intercept all exceptions for debug Maxim Levitsky
@ 2021-03-16 8:32 ` Joerg Roedel
2021-03-16 10:51 ` Maxim Levitsky
2021-03-16 8:34 ` Borislav Petkov
1 sibling, 1 reply; 32+ messages in thread
From: Joerg Roedel @ 2021-03-16 8:32 UTC (permalink / raw)
To: Maxim Levitsky
Cc: kvm, Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Jan Kiszka, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Sean Christopherson, Jim Mattson, Borislav Petkov,
Stefano Garzarella, H. Peter Anvin, Paolo Bonzini, Ingo Molnar,
Borislav Petkov
Hi Maxim,
On Tue, Mar 16, 2021 at 12:10:20AM +0200, Maxim Levitsky wrote:
> -static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
> +static int (*svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
Can you keep this const and always set the necessary handlers? If
exceptions are not intercepted they will not be used.
> @@ -333,7 +334,9 @@ static inline void clr_exception_intercept(struct vcpu_svm *svm, u32 bit)
> struct vmcb *vmcb = svm->vmcb01.ptr;
>
> WARN_ON_ONCE(bit >= 32);
> - vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
> +
> + if (!((1 << bit) & debug_intercept_exceptions))
> + vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
This will break SEV-ES guests, as those will not cause an intercept but
now start to get #VC exceptions on every other exception that is raised.
SEV-ES guests are not prepared for that and will not even boot, so
please don't enable this feature for them.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/3] KVM: SVM: allow to intercept all exceptions for debug
2021-03-16 8:32 ` Joerg Roedel
@ 2021-03-16 10:51 ` Maxim Levitsky
2021-03-18 9:19 ` Joerg Roedel
0 siblings, 1 reply; 32+ messages in thread
From: Maxim Levitsky @ 2021-03-16 10:51 UTC (permalink / raw)
To: Joerg Roedel
Cc: kvm, Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Jan Kiszka, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Sean Christopherson, Jim Mattson, Borislav Petkov,
Stefano Garzarella, H. Peter Anvin, Paolo Bonzini, Ingo Molnar,
Borislav Petkov
On Tue, 2021-03-16 at 09:32 +0100, Joerg Roedel wrote:
> Hi Maxim,
>
> On Tue, Mar 16, 2021 at 12:10:20AM +0200, Maxim Levitsky wrote:
> > -static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
> > +static int (*svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
>
> Can you keep this const and always set the necessary handlers? If
> exceptions are not intercepted they will not be used.
>
> > @@ -333,7 +334,9 @@ static inline void clr_exception_intercept(struct vcpu_svm *svm, u32 bit)
> > struct vmcb *vmcb = svm->vmcb01.ptr;
> >
> > WARN_ON_ONCE(bit >= 32);
> > - vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
> > +
> > + if (!((1 << bit) & debug_intercept_exceptions))
> > + vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
>
> This will break SEV-ES guests, as those will not cause an intercept but
> now start to get #VC exceptions on every other exception that is raised.
> SEV-ES guests are not prepared for that and will not even boot, so
> please don't enable this feature for them.
I agree but what is wrong with that?
This is a debug feature, and it only can be enabled by the root,
and so someone might actually want this case to happen
(e.g to see if a SEV guest can cope with extra #VC exceptions).
I have nothing against not allowing this for SEV-ES guests though.
What do you think?
Best regards,
Maxim Levitsky
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/3] KVM: SVM: allow to intercept all exceptions for debug
2021-03-16 10:51 ` Maxim Levitsky
@ 2021-03-18 9:19 ` Joerg Roedel
2021-03-18 9:24 ` Maxim Levitsky
0 siblings, 1 reply; 32+ messages in thread
From: Joerg Roedel @ 2021-03-18 9:19 UTC (permalink / raw)
To: Maxim Levitsky
Cc: kvm, Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Jan Kiszka, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Sean Christopherson, Jim Mattson, Borislav Petkov,
Stefano Garzarella, H. Peter Anvin, Paolo Bonzini, Ingo Molnar,
Borislav Petkov
On Tue, Mar 16, 2021 at 12:51:20PM +0200, Maxim Levitsky wrote:
> I agree but what is wrong with that?
> This is a debug feature, and it only can be enabled by the root,
> and so someone might actually want this case to happen
> (e.g to see if a SEV guest can cope with extra #VC exceptions).
That doesn't make sense, we know that and SEV-ES guest can't cope with
extra #VC exceptions, so there is no point in testing this. It is more a
way to shot oneself into the foot for the user and a potential source of
bug reports for SEV-ES guests.
> I have nothing against not allowing this for SEV-ES guests though.
> What do you think?
I think SEV-ES guests should only have the intercept bits set which
guests acutally support.
Regards,
Joerg
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/3] KVM: SVM: allow to intercept all exceptions for debug
2021-03-18 9:19 ` Joerg Roedel
@ 2021-03-18 9:24 ` Maxim Levitsky
2021-03-18 15:47 ` Joerg Roedel
0 siblings, 1 reply; 32+ messages in thread
From: Maxim Levitsky @ 2021-03-18 9:24 UTC (permalink / raw)
To: Joerg Roedel
Cc: kvm, Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Jan Kiszka, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Sean Christopherson, Jim Mattson, Borislav Petkov,
Stefano Garzarella, H. Peter Anvin, Paolo Bonzini, Ingo Molnar,
Borislav Petkov
On Thu, 2021-03-18 at 10:19 +0100, Joerg Roedel wrote:
> On Tue, Mar 16, 2021 at 12:51:20PM +0200, Maxim Levitsky wrote:
> > I agree but what is wrong with that?
> > This is a debug feature, and it only can be enabled by the root,
> > and so someone might actually want this case to happen
> > (e.g to see if a SEV guest can cope with extra #VC exceptions).
>
> That doesn't make sense, we know that and SEV-ES guest can't cope with
> extra #VC exceptions, so there is no point in testing this. It is more a
> way to shot oneself into the foot for the user and a potential source of
> bug reports for SEV-ES guests.
But again this is a debug feature, and it is intended to allow the user
to shoot himself in the foot. Bug reports for a debug feature
are autoclosed. It is no different from say poking kernel memory with
its built-in gdbstub, for example.
Best regards,
Maxim Levitsky
>
>
> > I have nothing against not allowing this for SEV-ES guests though.
> > What do you think?
>
> I think SEV-ES guests should only have the intercept bits set which
> guests acutally support
>
> Regards,
>
> Joerg
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/3] KVM: SVM: allow to intercept all exceptions for debug
2021-03-18 9:24 ` Maxim Levitsky
@ 2021-03-18 15:47 ` Joerg Roedel
2021-03-18 16:35 ` Sean Christopherson
0 siblings, 1 reply; 32+ messages in thread
From: Joerg Roedel @ 2021-03-18 15:47 UTC (permalink / raw)
To: Maxim Levitsky
Cc: kvm, Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Jan Kiszka, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Sean Christopherson, Jim Mattson, Borislav Petkov,
Stefano Garzarella, H. Peter Anvin, Paolo Bonzini, Ingo Molnar,
Borislav Petkov
On Thu, Mar 18, 2021 at 11:24:25AM +0200, Maxim Levitsky wrote:
> But again this is a debug feature, and it is intended to allow the user
> to shoot himself in the foot.
And one can't debug SEV-ES guests with it, so what is the point of
enabling it for them too?
Regards,
Joerg
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/3] KVM: SVM: allow to intercept all exceptions for debug
2021-03-18 15:47 ` Joerg Roedel
@ 2021-03-18 16:35 ` Sean Christopherson
2021-03-18 16:41 ` Maxim Levitsky
0 siblings, 1 reply; 32+ messages in thread
From: Sean Christopherson @ 2021-03-18 16:35 UTC (permalink / raw)
To: Joerg Roedel
Cc: Maxim Levitsky, kvm, Vitaly Kuznetsov, linux-kernel,
Thomas Gleixner, Wanpeng Li, Kieran Bingham, Jessica Yu,
Jan Kiszka, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Jim Mattson, Borislav Petkov, Stefano Garzarella, H. Peter Anvin,
Paolo Bonzini, Ingo Molnar, Borislav Petkov
On Thu, Mar 18, 2021, Joerg Roedel wrote:
> On Thu, Mar 18, 2021 at 11:24:25AM +0200, Maxim Levitsky wrote:
> > But again this is a debug feature, and it is intended to allow the user
> > to shoot himself in the foot.
>
> And one can't debug SEV-ES guests with it, so what is the point of
> enabling it for them too?
Agreed. I can see myself enabling debug features by default, it would be nice
to not having to go out of my way to disable them for SEV-ES/SNP guests.
Skipping SEV-ES guests should not be difficult; KVM could probably even
print a message stating that the debug hook is being ignored. One thought would
be to snapshot debug_intercept_exceptions at VM creation, and simply zero it out
for incompatible guests. That would also allow changing debug_intercept_exceptions
without reloading KVM, which IMO would be very convenient.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/3] KVM: SVM: allow to intercept all exceptions for debug
2021-03-18 16:35 ` Sean Christopherson
@ 2021-03-18 16:41 ` Maxim Levitsky
2021-03-18 17:22 ` Sean Christopherson
0 siblings, 1 reply; 32+ messages in thread
From: Maxim Levitsky @ 2021-03-18 16:41 UTC (permalink / raw)
To: Sean Christopherson, Joerg Roedel
Cc: kvm, Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Jan Kiszka, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Jim Mattson, Borislav Petkov, Stefano Garzarella, H. Peter Anvin,
Paolo Bonzini, Ingo Molnar, Borislav Petkov
On Thu, 2021-03-18 at 16:35 +0000, Sean Christopherson wrote:
> On Thu, Mar 18, 2021, Joerg Roedel wrote:
> > On Thu, Mar 18, 2021 at 11:24:25AM +0200, Maxim Levitsky wrote:
> > > But again this is a debug feature, and it is intended to allow the user
> > > to shoot himself in the foot.
> >
> > And one can't debug SEV-ES guests with it, so what is the point of
> > enabling it for them too?
You can create a special SEV-ES guest which does handle all exceptions via
#VC, or just observe it fail which can be useful for some whatever reason.
>
> Agreed. I can see myself enabling debug features by default, it would be nice
> to not having to go out of my way to disable them for SEV-ES/SNP guests.
This does sound like a valid reason to disable this for SEV-ES.
>
> Skipping SEV-ES guests should not be difficult; KVM could probably even
> print a message stating that the debug hook is being ignored. One thought would
> be to snapshot debug_intercept_exceptions at VM creation, and simply zero it out
> for incompatible guests. That would also allow changing debug_intercept_exceptions
> without reloading KVM, which IMO would be very convenient.
>
So all right I'll disable this for SEV-ES.
The idea to change the debug_intercept_exceptions on the fly is also a good idea,
I will implement it in next version of the patches.
Thanks for the review,
Best regards,
Maxim Levitsky
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/3] KVM: SVM: allow to intercept all exceptions for debug
2021-03-18 16:41 ` Maxim Levitsky
@ 2021-03-18 17:22 ` Sean Christopherson
0 siblings, 0 replies; 32+ messages in thread
From: Sean Christopherson @ 2021-03-18 17:22 UTC (permalink / raw)
To: Maxim Levitsky
Cc: Joerg Roedel, kvm, Vitaly Kuznetsov, linux-kernel,
Thomas Gleixner, Wanpeng Li, Kieran Bingham, Jessica Yu,
Jan Kiszka, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Jim Mattson, Borislav Petkov, Stefano Garzarella, H. Peter Anvin,
Paolo Bonzini, Ingo Molnar, Borislav Petkov
On Thu, Mar 18, 2021, Maxim Levitsky wrote:
> On Thu, 2021-03-18 at 16:35 +0000, Sean Christopherson wrote:
> > Skipping SEV-ES guests should not be difficult; KVM could probably even
> > print a message stating that the debug hook is being ignored. One thought would
> > be to snapshot debug_intercept_exceptions at VM creation, and simply zero it out
> > for incompatible guests. That would also allow changing debug_intercept_exceptions
> > without reloading KVM, which IMO would be very convenient.
> >
> So all right I'll disable this for SEV-ES.
Belated thought. KVM doesn't know a guest will be an SEV-ES guest until
sev_es_guest_init(), and KVM currently doesn't prevent creating vCPUs before
KVM_SEV_ES_INIT. But, I'm 99% confident that's a KVM bug. For your purposes,
I think you can assume kvm->arch.debug_intercept_exceptions will _not_ change
after vCPU creation.
> The idea to change the debug_intercept_exceptions on the fly is also a good idea,
> I will implement it in next version of the patches.
Can you also move the module param to x86? It doesn't need to be wired up for
VMX right away, but it makes sense to do it at some point, and ideally folks
won't have to update their scripts when that happens.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/3] KVM: SVM: allow to intercept all exceptions for debug
2021-03-15 22:10 ` [PATCH 3/3] KVM: SVM: allow to intercept all exceptions for debug Maxim Levitsky
2021-03-16 8:32 ` Joerg Roedel
@ 2021-03-16 8:34 ` Borislav Petkov
1 sibling, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2021-03-16 8:34 UTC (permalink / raw)
To: Maxim Levitsky
Cc: kvm, Vitaly Kuznetsov, linux-kernel, Thomas Gleixner, Wanpeng Li,
Kieran Bingham, Jessica Yu, Jan Kiszka, Andrew Morton,
maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
Joerg Roedel, Sean Christopherson, Jim Mattson, Borislav Petkov,
Stefano Garzarella, H. Peter Anvin, Paolo Bonzini, Ingo Molnar
On Tue, Mar 16, 2021 at 12:10:20AM +0200, Maxim Levitsky wrote:
> Add a new debug module param 'debug_intercept_exceptions' which will allow the
> KVM to intercept any guest exception, and forward it to the guest.
>
> This can be very useful for guest debugging and/or KVM debugging with kvm trace.
> This is not intended to be used on production systems.
>
> This is based on an idea first shown here:
> https://patchwork.kernel.org/project/kvm/patch/20160301192822.GD22677@pd.tnic/
>
> CC: Borislav Petkov <bp@suse.de>
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
> arch/x86/include/asm/kvm_host.h | 2 +
> arch/x86/kvm/svm/svm.c | 77 ++++++++++++++++++++++++++++++++-
> arch/x86/kvm/svm/svm.h | 5 ++-
> arch/x86/kvm/x86.c | 5 ++-
> 4 files changed, 85 insertions(+), 4 deletions(-)
Looks interesting, I'll give it a try when I get a chance in the coming days.
Thx!
--
Regards/Gruss,
Boris.
SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg
^ permalink raw reply [flat|nested] 32+ messages in thread