All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] scripts/gdb: rework lx-symbols gdb script
@ 2021-08-11 13:31 Maxim Levitsky
  2021-08-11 19:01 ` Jan Kiszka
  0 siblings, 1 reply; 6+ messages in thread
From: Maxim Levitsky @ 2021-08-11 13:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: Johannes Berg, Jessica Yu, Jan Kiszka, Maxim Levitsky,
	Andrew Morton, Kieran Bingham

Fix several issues that are present in lx-symbols script:

* Track module unloads by placing another software breakpoint at
  'free_module'
  (force uninline this symbol just in case), and use remove-symbol-file
  gdb command to unload the symobls of the module that is unloading.

  That gives the gdb a chance to mark all software breakpoints from
  this module as pending again.
  Also remove the module from the 'known' module list once it is unloaded.

* Since we now track module unload, we don't need to reload all
  symbols anymore when 'known' module loaded again
  (that can't happen anymore).
  This allows reloading a module in the debugged kernel to finish
  much faster, while lx-symbols tracks module loads and unloads.

* Disable/enable all gdb breakpoints on both module load and unload
  breakpoint hits, and not only in 'load_all_symbols' as was done before.
  (load_all_symbols is no longer called on breakpoint hit)
  That allows gdb to avoid getting confused about the state of the
  (now two) internal breakpoints we place.
  Otherwise it will leave them in the kernel code segment, when
  continuing which triggers a guest kernel panic as soon as it skips
  over the 'int3' instruction and executes the garbage tail of the optcode
  on which the breakpoint was placed.

* Block SIGINT while the script is running as this seems to crash gdb

* Add a basic check that kernel is already loaded into the guest memory
  to avoid confusing errors.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 kernel/module.c              |   8 +-
 scripts/gdb/linux/symbols.py | 203 +++++++++++++++++++++++------------
 2 files changed, 143 insertions(+), 68 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index ed13917ea5f3..242bd4bb0b55 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -906,8 +906,12 @@ int module_refcount(struct module *mod)
 }
 EXPORT_SYMBOL(module_refcount);
 
-/* This exists whether we can unload or not */
-static void free_module(struct module *mod);
+/* This exists whether we can unload or not
+ * Keep it uninlined to provide a reliable breakpoint target,
+ * e.g. for the gdb helper command 'lx-symbols'.
+ */
+
+static noinline void free_module(struct module *mod);
 
 SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
 		unsigned int, flags)
diff --git a/scripts/gdb/linux/symbols.py b/scripts/gdb/linux/symbols.py
index 08d264ac328b..78e278fb4bad 100644
--- a/scripts/gdb/linux/symbols.py
+++ b/scripts/gdb/linux/symbols.py
@@ -14,45 +14,23 @@
 import gdb
 import os
 import re
+import signal
 
 from linux import modules, utils
 
 
 if hasattr(gdb, 'Breakpoint'):
-    class LoadModuleBreakpoint(gdb.Breakpoint):
-        def __init__(self, spec, gdb_command):
-            super(LoadModuleBreakpoint, self).__init__(spec, internal=True)
+
+    class BreakpointWrapper(gdb.Breakpoint):
+        def __init__(self, callback, **kwargs):
+            super(BreakpointWrapper, self).__init__(internal=True, **kwargs)
             self.silent = True
-            self.gdb_command = gdb_command
+            self.callback = callback
 
         def stop(self):
-            module = gdb.parse_and_eval("mod")
-            module_name = module['name'].string()
-            cmd = self.gdb_command
-
-            # enforce update if object file is not found
-            cmd.module_files_updated = False
-
-            # Disable pagination while reporting symbol (re-)loading.
-            # The console input is blocked in this context so that we would
-            # get stuck waiting for the user to acknowledge paged output.
-            show_pagination = gdb.execute("show pagination", to_string=True)
-            pagination = show_pagination.endswith("on.\n")
-            gdb.execute("set pagination off")
-
-            if module_name in cmd.loaded_modules:
-                gdb.write("refreshing all symbols to reload module "
-                          "'{0}'\n".format(module_name))
-                cmd.load_all_symbols()
-            else:
-                cmd.load_module_symbols(module)
-
-            # restore pagination state
-            gdb.execute("set pagination %s" % ("on" if pagination else "off"))
-
+            self.callback()
             return False
 
-
 class LxSymbols(gdb.Command):
     """(Re-)load symbols of Linux kernel and currently loaded modules.
 
@@ -61,15 +39,52 @@ are scanned recursively, starting in the same directory. Optionally, the module
 search path can be extended by a space separated list of paths passed to the
 lx-symbols command."""
 
-    module_paths = []
-    module_files = []
-    module_files_updated = False
-    loaded_modules = []
-    breakpoint = None
-
     def __init__(self):
         super(LxSymbols, self).__init__("lx-symbols", gdb.COMMAND_FILES,
                                         gdb.COMPLETE_FILENAME)
+        self.module_paths = []
+        self.module_files = []
+        self.module_files_updated = False
+        self.loaded_modules = {}
+        self.internal_breakpoints = []
+
+    # prepare GDB for loading/unloading a module
+    def _prepare_for_module_load_unload(self):
+
+        self.blocked_sigint = False
+
+        # block SIGINT during execution to avoid gdb crash
+        sigmask = signal.pthread_sigmask(signal.SIG_BLOCK, [])
+        if not signal.SIGINT in sigmask:
+            self.blocked_sigint = True
+            signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGINT})
+
+        # disable all breakpoints to workaround a GDB bug where it would
+        # not correctly resume from an internal breakpoint we placed
+        # in do_module_init/free_module (it leaves the int3
+        self.saved_breakpoints = []
+        if hasattr(gdb, 'breakpoints') and not gdb.breakpoints() is None:
+            for bp in gdb.breakpoints():
+                self.saved_breakpoints.append({'breakpoint': bp, 'enabled': bp.enabled})
+                bp.enabled = False
+
+        # disable pagination to avoid asking user for continue
+        show_pagination = gdb.execute("show pagination", to_string=True)
+        self.saved_pagination = show_pagination.endswith("on.\n")
+        gdb.execute("set pagination off")
+
+    def _unprepare_for_module_load_unload(self):
+        # restore breakpoint state
+        for breakpoint in self.saved_breakpoints:
+            breakpoint['breakpoint'].enabled = breakpoint['enabled']
+
+        # restore pagination state
+        gdb.execute("set pagination %s" % ("on" if self.saved_pagination else "off"))
+
+        # unblock SIGINT
+        if self.blocked_sigint:
+            sigmask = signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGINT})
+            self.blocked_sigint = False
 
     def _update_module_files(self):
         self.module_files = []
@@ -107,7 +122,7 @@ lx-symbols command."""
                     name=section_name, addr=str(address)))
         return "".join(args)
 
-    def load_module_symbols(self, module):
+    def _do_load_module_symbols(self, module):
         module_name = module['name'].string()
         module_addr = str(module['core_layout']['base']).split()[0]
 
@@ -130,56 +145,112 @@ lx-symbols command."""
                 addr=module_addr,
                 sections=self._section_arguments(module))
             gdb.execute(cmdline, to_string=True)
-            if module_name not in self.loaded_modules:
-                self.loaded_modules.append(module_name)
+
+            self.loaded_modules[module_name] = {"module_file": module_file,
+                                                "module_addr": module_addr}
         else:
             gdb.write("no module object found for '{0}'\n".format(module_name))
 
+
+    def load_module_symbols(self):
+        module = gdb.parse_and_eval("mod")
+
+                # module already loaded, false alarm
+        # can happen if 'do_init_module' breakpoint is hit multiple times
+        # due to interrupts
+        module_name = module['name'].string()
+        if module_name in self.loaded_modules:
+            gdb.write("spurious module load breakpoint\n")
+            return
+
+        # enforce update if object file is not found
+        self.module_files_updated = False
+        self._prepare_for_module_load_unload()
+        try:
+            self._do_load_module_symbols(module)
+        finally:
+            self._unprepare_for_module_load_unload()
+
+
+    def unload_module_symbols(self):
+        module = gdb.parse_and_eval("mod")
+        module_name = module['name'].string()
+
+        # module already unloaded, false alarm
+        # can happen if 'free_module' breakpoint is hit multiple times
+        # due to interrupts
+        if not module_name in self.loaded_modules:
+            gdb.write("spurious module unload breakpoint\n")
+            return
+
+        module_file = self.loaded_modules[module_name]["module_file"]
+        module_addr = self.loaded_modules[module_name]["module_addr"]
+
+        self._prepare_for_module_load_unload()
+        try:
+            gdb.write("unloading @{addr}: {filename}\n".format(
+                addr=module_addr, filename=module_file))
+            cmdline = "remove-symbol-file {filename}".format(
+                filename=module_file)
+            gdb.execute(cmdline, to_string=True)
+            del self.loaded_modules[module_name]
+
+        finally:
+            self._unprepare_for_module_load_unload()
+
     def load_all_symbols(self):
         gdb.write("loading vmlinux\n")
 
-        # Dropping symbols will disable all breakpoints. So save their states
-        # and restore them afterward.
-        saved_states = []
-        if hasattr(gdb, 'breakpoints') and not gdb.breakpoints() is None:
-            for bp in gdb.breakpoints():
-                saved_states.append({'breakpoint': bp, 'enabled': bp.enabled})
-
-        # drop all current symbols and reload vmlinux
-        orig_vmlinux = 'vmlinux'
-        for obj in gdb.objfiles():
-            if obj.filename.endswith('vmlinux'):
-                orig_vmlinux = obj.filename
-        gdb.execute("symbol-file", to_string=True)
-        gdb.execute("symbol-file {0}".format(orig_vmlinux))
-
-        self.loaded_modules = []
-        module_list = modules.module_list()
-        if not module_list:
-            gdb.write("no modules found\n")
-        else:
-            [self.load_module_symbols(module) for module in module_list]
+        self._prepare_for_module_load_unload()
+        try:
+            # drop all current symbols and reload vmlinux
+            orig_vmlinux = 'vmlinux'
+            for obj in gdb.objfiles():
+                if obj.filename.endswith('vmlinux'):
+                    orig_vmlinux = obj.filename
+            gdb.execute("symbol-file", to_string=True)
+            gdb.execute("symbol-file {0}".format(orig_vmlinux))
+            self.loaded_modules = {}
+            module_list = modules.module_list()
+            if not module_list:
+                gdb.write("no modules found\n")
+            else:
+                [self._do_load_module_symbols(module) for module in module_list]
+        finally:
+            self._unprepare_for_module_load_unload()
 
-        for saved_state in saved_states:
-            saved_state['breakpoint'].enabled = saved_state['enabled']
+        self._unprepare_for_module_load_unload()
 
     def invoke(self, arg, from_tty):
         self.module_paths = [os.path.abspath(os.path.expanduser(p))
                              for p in arg.split()]
         self.module_paths.append(os.getcwd())
 
+        try:
+            gdb.parse_and_eval("*start_kernel").fetch_lazy()
+        except gdb.MemoryError:
+            gdb.write("Error: Kernel is not yet loaded\n")
+            return
+
         # enforce update
         self.module_files = []
         self.module_files_updated = False
 
+        for bp in self.internal_breakpoints:
+            bp.delete()
+        self.internal_breakpoints = []
+
         self.load_all_symbols()
 
         if hasattr(gdb, 'Breakpoint'):
-            if self.breakpoint is not None:
-                self.breakpoint.delete()
-                self.breakpoint = None
-            self.breakpoint = LoadModuleBreakpoint(
-                "kernel/module.c:do_init_module", self)
+            self.internal_breakpoints.append(
+                BreakpointWrapper(self.load_module_symbols,
+                                  spec="kernel/module.c:do_init_module",
+                                  ))
+            self.internal_breakpoints.append(
+                BreakpointWrapper(self.unload_module_symbols,
+                                  spec="kernel/module.c:free_module",
+                                  ))
         else:
             gdb.write("Note: symbol update on module loading not supported "
                       "with this gdb version\n")
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] scripts/gdb: rework lx-symbols gdb script
  2021-08-11 13:31 [PATCH] scripts/gdb: rework lx-symbols gdb script Maxim Levitsky
@ 2021-08-11 19:01 ` Jan Kiszka
  2021-08-11 20:10   ` Maxim Levitsky
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Kiszka @ 2021-08-11 19:01 UTC (permalink / raw)
  To: Maxim Levitsky, linux-kernel
  Cc: Johannes Berg, Jessica Yu, Andrew Morton, Kieran Bingham

On 11.08.21 15:31, Maxim Levitsky wrote:
> Fix several issues that are present in lx-symbols script:
> 
> * Track module unloads by placing another software breakpoint at
>   'free_module'
>   (force uninline this symbol just in case), and use remove-symbol-file
>   gdb command to unload the symobls of the module that is unloading.
> 
>   That gives the gdb a chance to mark all software breakpoints from
>   this module as pending again.
>   Also remove the module from the 'known' module list once it is unloaded.
> 
> * Since we now track module unload, we don't need to reload all
>   symbols anymore when 'known' module loaded again
>   (that can't happen anymore).
>   This allows reloading a module in the debugged kernel to finish
>   much faster, while lx-symbols tracks module loads and unloads.
> 
> * Disable/enable all gdb breakpoints on both module load and unload
>   breakpoint hits, and not only in 'load_all_symbols' as was done before.
>   (load_all_symbols is no longer called on breakpoint hit)
>   That allows gdb to avoid getting confused about the state of the
>   (now two) internal breakpoints we place.
>   Otherwise it will leave them in the kernel code segment, when
>   continuing which triggers a guest kernel panic as soon as it skips
>   over the 'int3' instruction and executes the garbage tail of the optcode
>   on which the breakpoint was placed.
> 
> * Block SIGINT while the script is running as this seems to crash gdb
> 
> * Add a basic check that kernel is already loaded into the guest memory
>   to avoid confusing errors.
> 
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  kernel/module.c              |   8 +-
>  scripts/gdb/linux/symbols.py | 203 +++++++++++++++++++++++------------
>  2 files changed, 143 insertions(+), 68 deletions(-)
> 
> diff --git a/kernel/module.c b/kernel/module.c
> index ed13917ea5f3..242bd4bb0b55 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -906,8 +906,12 @@ int module_refcount(struct module *mod)
>  }
>  EXPORT_SYMBOL(module_refcount);
>  
> -/* This exists whether we can unload or not */
> -static void free_module(struct module *mod);
> +/* This exists whether we can unload or not
> + * Keep it uninlined to provide a reliable breakpoint target,
> + * e.g. for the gdb helper command 'lx-symbols'.
> + */
> +
> +static noinline void free_module(struct module *mod);
>  
>  SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
>  		unsigned int, flags)

You likely want and need to push that as separate patch, analogously to
be02a1862304.

And as you are factoring the patch, maybe think about whether you can
split the changes to symbols.py into logical steps as well. The commit
messages suggests that, thought that might be misleading.

> diff --git a/scripts/gdb/linux/symbols.py b/scripts/gdb/linux/symbols.py
> index 08d264ac328b..78e278fb4bad 100644
> --- a/scripts/gdb/linux/symbols.py
> +++ b/scripts/gdb/linux/symbols.py
> @@ -14,45 +14,23 @@
>  import gdb
>  import os
>  import re
> +import signal
>  
>  from linux import modules, utils
>  
>  
>  if hasattr(gdb, 'Breakpoint'):
> -    class LoadModuleBreakpoint(gdb.Breakpoint):
> -        def __init__(self, spec, gdb_command):
> -            super(LoadModuleBreakpoint, self).__init__(spec, internal=True)
> +
> +    class BreakpointWrapper(gdb.Breakpoint):
> +        def __init__(self, callback, **kwargs):
> +            super(BreakpointWrapper, self).__init__(internal=True, **kwargs)
>              self.silent = True
> -            self.gdb_command = gdb_command
> +            self.callback = callback
>  
>          def stop(self):
> -            module = gdb.parse_and_eval("mod")
> -            module_name = module['name'].string()
> -            cmd = self.gdb_command
> -
> -            # enforce update if object file is not found
> -            cmd.module_files_updated = False
> -
> -            # Disable pagination while reporting symbol (re-)loading.
> -            # The console input is blocked in this context so that we would
> -            # get stuck waiting for the user to acknowledge paged output.
> -            show_pagination = gdb.execute("show pagination", to_string=True)
> -            pagination = show_pagination.endswith("on.\n")
> -            gdb.execute("set pagination off")
> -
> -            if module_name in cmd.loaded_modules:
> -                gdb.write("refreshing all symbols to reload module "
> -                          "'{0}'\n".format(module_name))
> -                cmd.load_all_symbols()
> -            else:
> -                cmd.load_module_symbols(module)
> -
> -            # restore pagination state
> -            gdb.execute("set pagination %s" % ("on" if pagination else "off"))
> -
> +            self.callback()
>              return False
>  
> -
>  class LxSymbols(gdb.Command):
>      """(Re-)load symbols of Linux kernel and currently loaded modules.
>  
> @@ -61,15 +39,52 @@ are scanned recursively, starting in the same directory. Optionally, the module
>  search path can be extended by a space separated list of paths passed to the
>  lx-symbols command."""
>  
> -    module_paths = []
> -    module_files = []
> -    module_files_updated = False
> -    loaded_modules = []
> -    breakpoint = None
> -
>      def __init__(self):
>          super(LxSymbols, self).__init__("lx-symbols", gdb.COMMAND_FILES,
>                                          gdb.COMPLETE_FILENAME)
> +        self.module_paths = []
> +        self.module_files = []
> +        self.module_files_updated = False
> +        self.loaded_modules = {}
> +        self.internal_breakpoints = []
> +
> +    # prepare GDB for loading/unloading a module
> +    def _prepare_for_module_load_unload(self):
> +
> +        self.blocked_sigint = False
> +
> +        # block SIGINT during execution to avoid gdb crash
> +        sigmask = signal.pthread_sigmask(signal.SIG_BLOCK, [])
> +        if not signal.SIGINT in sigmask:
> +            self.blocked_sigint = True
> +            signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGINT})
> +
> +        # disable all breakpoints to workaround a GDB bug where it would
> +        # not correctly resume from an internal breakpoint we placed
> +        # in do_module_init/free_module (it leaves the int3

Seems the comment ends prematurely.

Any reference to a gdb bug tracker entry? Or affected versions? The
description is a bit too fuzzy.

> +        self.saved_breakpoints = []
> +        if hasattr(gdb, 'breakpoints') and not gdb.breakpoints() is None:
> +            for bp in gdb.breakpoints():
> +                self.saved_breakpoints.append({'breakpoint': bp, 'enabled': bp.enabled})
> +                bp.enabled = False
> +
> +        # disable pagination to avoid asking user for continue
> +        show_pagination = gdb.execute("show pagination", to_string=True)
> +        self.saved_pagination = show_pagination.endswith("on.\n")
> +        gdb.execute("set pagination off")
> +
> +    def _unprepare_for_module_load_unload(self):
> +        # restore breakpoint state
> +        for breakpoint in self.saved_breakpoints:
> +            breakpoint['breakpoint'].enabled = breakpoint['enabled']
> +
> +        # restore pagination state
> +        gdb.execute("set pagination %s" % ("on" if self.saved_pagination else "off"))
> +
> +        # unblock SIGINT
> +        if self.blocked_sigint:
> +            sigmask = signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGINT})
> +            self.blocked_sigint = False
>  
>      def _update_module_files(self):
>          self.module_files = []
> @@ -107,7 +122,7 @@ lx-symbols command."""
>                      name=section_name, addr=str(address)))
>          return "".join(args)
>  
> -    def load_module_symbols(self, module):
> +    def _do_load_module_symbols(self, module):
>          module_name = module['name'].string()
>          module_addr = str(module['core_layout']['base']).split()[0]
>  
> @@ -130,56 +145,112 @@ lx-symbols command."""
>                  addr=module_addr,
>                  sections=self._section_arguments(module))
>              gdb.execute(cmdline, to_string=True)
> -            if module_name not in self.loaded_modules:
> -                self.loaded_modules.append(module_name)
> +
> +            self.loaded_modules[module_name] = {"module_file": module_file,
> +                                                "module_addr": module_addr}
>          else:
>              gdb.write("no module object found for '{0}'\n".format(module_name))
>  
> +
> +    def load_module_symbols(self):
> +        module = gdb.parse_and_eval("mod")
> +
> +        # module already loaded, false alarm
> +        # can happen if 'do_init_module' breakpoint is hit multiple times
> +        # due to interrupts

Maybe better move this comment into the if-clause to make the context
clearer.

> +        module_name = module['name'].string()
> +        if module_name in self.loaded_modules:
> +            gdb.write("spurious module load breakpoint\n")
> +            return
> +
> +        # enforce update if object file is not found
> +        self.module_files_updated = False
> +        self._prepare_for_module_load_unload()
> +        try:
> +            self._do_load_module_symbols(module)
> +        finally:
> +            self._unprepare_for_module_load_unload()
> +
> +
> +    def unload_module_symbols(self):
> +        module = gdb.parse_and_eval("mod")
> +        module_name = module['name'].string()
> +
> +        # module already unloaded, false alarm
> +        # can happen if 'free_module' breakpoint is hit multiple times
> +        # due to interrupts

Same as above.

> +        if not module_name in self.loaded_modules:
> +            gdb.write("spurious module unload breakpoint\n")
> +            return
> +
> +        module_file = self.loaded_modules[module_name]["module_file"]
> +        module_addr = self.loaded_modules[module_name]["module_addr"]
> +
> +        self._prepare_for_module_load_unload()
> +        try:
> +            gdb.write("unloading @{addr}: {filename}\n".format(
> +                addr=module_addr, filename=module_file))
> +            cmdline = "remove-symbol-file {filename}".format(
> +                filename=module_file)
> +            gdb.execute(cmdline, to_string=True)
> +            del self.loaded_modules[module_name]
> +
> +        finally:
> +            self._unprepare_for_module_load_unload()
> +
>      def load_all_symbols(self):
>          gdb.write("loading vmlinux\n")
>  
> -        # Dropping symbols will disable all breakpoints. So save their states
> -        # and restore them afterward.
> -        saved_states = []
> -        if hasattr(gdb, 'breakpoints') and not gdb.breakpoints() is None:
> -            for bp in gdb.breakpoints():
> -                saved_states.append({'breakpoint': bp, 'enabled': bp.enabled})
> -
> -        # drop all current symbols and reload vmlinux
> -        orig_vmlinux = 'vmlinux'
> -        for obj in gdb.objfiles():
> -            if obj.filename.endswith('vmlinux'):
> -                orig_vmlinux = obj.filename
> -        gdb.execute("symbol-file", to_string=True)
> -        gdb.execute("symbol-file {0}".format(orig_vmlinux))
> -
> -        self.loaded_modules = []
> -        module_list = modules.module_list()
> -        if not module_list:
> -            gdb.write("no modules found\n")
> -        else:
> -            [self.load_module_symbols(module) for module in module_list]
> +        self._prepare_for_module_load_unload()
> +        try:
> +            # drop all current symbols and reload vmlinux
> +            orig_vmlinux = 'vmlinux'
> +            for obj in gdb.objfiles():
> +                if obj.filename.endswith('vmlinux'):
> +                    orig_vmlinux = obj.filename
> +            gdb.execute("symbol-file", to_string=True)
> +            gdb.execute("symbol-file {0}".format(orig_vmlinux))
> +            self.loaded_modules = {}
> +            module_list = modules.module_list()
> +            if not module_list:
> +                gdb.write("no modules found\n")
> +            else:
> +                [self._do_load_module_symbols(module) for module in module_list]

Is that common python style? Elsewhere, you do

for var in list:
    function(var)

> +        finally:
> +            self._unprepare_for_module_load_unload()
>  
> -        for saved_state in saved_states:
> -            saved_state['breakpoint'].enabled = saved_state['enabled']
> +        self._unprepare_for_module_load_unload()
>  
>      def invoke(self, arg, from_tty):
>          self.module_paths = [os.path.abspath(os.path.expanduser(p))
>                               for p in arg.split()]
>          self.module_paths.append(os.getcwd())
>  
> +        try:
> +            gdb.parse_and_eval("*start_kernel").fetch_lazy()
> +        except gdb.MemoryError:
> +            gdb.write("Error: Kernel is not yet loaded\n")
> +            return
> +
>          # enforce update
>          self.module_files = []
>          self.module_files_updated = False
>  
> +        for bp in self.internal_breakpoints:
> +            bp.delete()
> +        self.internal_breakpoints = []
> +
>          self.load_all_symbols()
>  
>          if hasattr(gdb, 'Breakpoint'):
> -            if self.breakpoint is not None:
> -                self.breakpoint.delete()
> -                self.breakpoint = None
> -            self.breakpoint = LoadModuleBreakpoint(
> -                "kernel/module.c:do_init_module", self)
> +            self.internal_breakpoints.append(
> +                BreakpointWrapper(self.load_module_symbols,
> +                                  spec="kernel/module.c:do_init_module",
> +                                  ))
> +            self.internal_breakpoints.append(
> +                BreakpointWrapper(self.unload_module_symbols,
> +                                  spec="kernel/module.c:free_module",
> +                                  ))
>          else:
>              gdb.write("Note: symbol update on module loading not supported "
>                        "with this gdb version\n")
> 

The logic of the changes look good to me.

Jan

-- 
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] scripts/gdb: rework lx-symbols gdb script
  2021-08-11 19:01 ` Jan Kiszka
@ 2021-08-11 20:10   ` Maxim Levitsky
  2021-08-11 20:15     ` Jan Kiszka
  0 siblings, 1 reply; 6+ messages in thread
From: Maxim Levitsky @ 2021-08-11 20:10 UTC (permalink / raw)
  To: Jan Kiszka, linux-kernel
  Cc: Johannes Berg, Jessica Yu, Andrew Morton, Kieran Bingham

On Wed, 2021-08-11 at 21:01 +0200, Jan Kiszka wrote:
> On 11.08.21 15:31, Maxim Levitsky wrote:
> > Fix several issues that are present in lx-symbols script:
> > 
> > * Track module unloads by placing another software breakpoint at
> >   'free_module'
> >   (force uninline this symbol just in case), and use remove-symbol-file
> >   gdb command to unload the symobls of the module that is unloading.
> > 
> >   That gives the gdb a chance to mark all software breakpoints from
> >   this module as pending again.
> >   Also remove the module from the 'known' module list once it is unloaded.
> > 
> > * Since we now track module unload, we don't need to reload all
> >   symbols anymore when 'known' module loaded again
> >   (that can't happen anymore).
> >   This allows reloading a module in the debugged kernel to finish
> >   much faster, while lx-symbols tracks module loads and unloads.
> > 
> > * Disable/enable all gdb breakpoints on both module load and unload
> >   breakpoint hits, and not only in 'load_all_symbols' as was done before.
> >   (load_all_symbols is no longer called on breakpoint hit)
> >   That allows gdb to avoid getting confused about the state of the
> >   (now two) internal breakpoints we place.
> >   Otherwise it will leave them in the kernel code segment, when
> >   continuing which triggers a guest kernel panic as soon as it skips
> >   over the 'int3' instruction and executes the garbage tail of the optcode
> >   on which the breakpoint was placed.
> > 
> > * Block SIGINT while the script is running as this seems to crash gdb
> > 
> > * Add a basic check that kernel is already loaded into the guest memory
> >   to avoid confusing errors.
> > 
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > ---
> >  kernel/module.c              |   8 +-
> >  scripts/gdb/linux/symbols.py | 203 +++++++++++++++++++++++------------
> >  2 files changed, 143 insertions(+), 68 deletions(-)
> > 
> > diff --git a/kernel/module.c b/kernel/module.c
> > index ed13917ea5f3..242bd4bb0b55 100644
> > --- a/kernel/module.c
> > +++ b/kernel/module.c
> > @@ -906,8 +906,12 @@ int module_refcount(struct module *mod)
> >  }
> >  EXPORT_SYMBOL(module_refcount);
> >  
> > -/* This exists whether we can unload or not */
> > -static void free_module(struct module *mod);
> > +/* This exists whether we can unload or not
> > + * Keep it uninlined to provide a reliable breakpoint target,
> > + * e.g. for the gdb helper command 'lx-symbols'.
> > + */
> > +
> > +static noinline void free_module(struct module *mod);
> >  
> >  SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
> >  		unsigned int, flags)
> 
> You likely want and need to push that as separate patch, analogously to
> be02a1862304.

I will do.

> 
> And as you are factoring the patch, maybe think about whether you can
> split the changes to symbols.py into logical steps as well. The commit
> messages suggests that, thought that might be misleading.

I can try doing that.

> 
> > diff --git a/scripts/gdb/linux/symbols.py b/scripts/gdb/linux/symbols.py
> > index 08d264ac328b..78e278fb4bad 100644
> > --- a/scripts/gdb/linux/symbols.py
> > +++ b/scripts/gdb/linux/symbols.py
> > @@ -14,45 +14,23 @@
> >  import gdb
> >  import os
> >  import re
> > +import signal
> >  
> >  from linux import modules, utils
> >  
> >  
> >  if hasattr(gdb, 'Breakpoint'):
> > -    class LoadModuleBreakpoint(gdb.Breakpoint):
> > -        def __init__(self, spec, gdb_command):
> > -            super(LoadModuleBreakpoint, self).__init__(spec, internal=True)
> > +
> > +    class BreakpointWrapper(gdb.Breakpoint):
> > +        def __init__(self, callback, **kwargs):
> > +            super(BreakpointWrapper, self).__init__(internal=True, **kwargs)
> >              self.silent = True
> > -            self.gdb_command = gdb_command
> > +            self.callback = callback
> >  
> >          def stop(self):
> > -            module = gdb.parse_and_eval("mod")
> > -            module_name = module['name'].string()
> > -            cmd = self.gdb_command
> > -
> > -            # enforce update if object file is not found
> > -            cmd.module_files_updated = False
> > -
> > -            # Disable pagination while reporting symbol (re-)loading.
> > -            # The console input is blocked in this context so that we would
> > -            # get stuck waiting for the user to acknowledge paged output.
> > -            show_pagination = gdb.execute("show pagination", to_string=True)
> > -            pagination = show_pagination.endswith("on.\n")
> > -            gdb.execute("set pagination off")
> > -
> > -            if module_name in cmd.loaded_modules:
> > -                gdb.write("refreshing all symbols to reload module "
> > -                          "'{0}'\n".format(module_name))
> > -                cmd.load_all_symbols()
> > -            else:
> > -                cmd.load_module_symbols(module)
> > -
> > -            # restore pagination state
> > -            gdb.execute("set pagination %s" % ("on" if pagination else "off"))
> > -
> > +            self.callback()
> >              return False
> >  
> > -
> >  class LxSymbols(gdb.Command):
> >      """(Re-)load symbols of Linux kernel and currently loaded modules.
> >  
> > @@ -61,15 +39,52 @@ are scanned recursively, starting in the same directory. Optionally, the module
> >  search path can be extended by a space separated list of paths passed to the
> >  lx-symbols command."""
> >  
> > -    module_paths = []
> > -    module_files = []
> > -    module_files_updated = False
> > -    loaded_modules = []
> > -    breakpoint = None
> > -
> >      def __init__(self):
> >          super(LxSymbols, self).__init__("lx-symbols", gdb.COMMAND_FILES,
> >                                          gdb.COMPLETE_FILENAME)
> > +        self.module_paths = []
> > +        self.module_files = []
> > +        self.module_files_updated = False
> > +        self.loaded_modules = {}
> > +        self.internal_breakpoints = []
> > +
> > +    # prepare GDB for loading/unloading a module
> > +    def _prepare_for_module_load_unload(self):
> > +
> > +        self.blocked_sigint = False
> > +
> > +        # block SIGINT during execution to avoid gdb crash
> > +        sigmask = signal.pthread_sigmask(signal.SIG_BLOCK, [])
> > +        if not signal.SIGINT in sigmask:
> > +            self.blocked_sigint = True
> > +            signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGINT})
> > +
> > +        # disable all breakpoints to workaround a GDB bug where it would
> > +        # not correctly resume from an internal breakpoint we placed
> > +        # in do_module_init/free_module (it leaves the int3
> 
> Seems the comment ends prematurely.

Yep, GDB leaves the int3 instruction in the guest memory, and the guest dies after
it encounters the truncated instruction that follows it.

> 
> Any reference to a gdb bug tracker entry? Or affected versions? The
> description is a bit too fuzzy.

Well stricly speaking this isn't a GDB bug.
GDB documentation explictly prohibits what we are doing in this script.

https://sourceware.org/gdb/current/onlinedocs/gdb/Breakpoints-In-Python.html

"You should not alter the execution state of the inferior (i.e., step, next, etc.), alter the current frame context 
(i.e., change the current active frame), or alter, add or delete any breakpoint. 
As a general rule, you should not alter any data within GDB or the inferior at this time."

However we are reloading the whole symbol table as a response to a breakpoint.

However there is pretty much no other way to do what this script does so the next best thing
is to workaround this as was already partially done, and I just made it more robust.

Same for blocking SIGINT which I added, which otherwise crashes GDB
while the symbols are reloading.
It probably will also be blamed on this.

Do you think I might have some luck taking with GDB maintainers and asking them to support
this use case of updating symbol table when a breakpoint hits?

> 
> > +        self.saved_breakpoints = []
> > +        if hasattr(gdb, 'breakpoints') and not gdb.breakpoints() is None:
> > +            for bp in gdb.breakpoints():
> > +                self.saved_breakpoints.append({'breakpoint': bp, 'enabled': bp.enabled})
> > +                bp.enabled = False
> > +
> > +        # disable pagination to avoid asking user for continue
> > +        show_pagination = gdb.execute("show pagination", to_string=True)
> > +        self.saved_pagination = show_pagination.endswith("on.\n")
> > +        gdb.execute("set pagination off")
> > +
> > +    def _unprepare_for_module_load_unload(self):
> > +        # restore breakpoint state
> > +        for breakpoint in self.saved_breakpoints:
> > +            breakpoint['breakpoint'].enabled = breakpoint['enabled']
> > +
> > +        # restore pagination state
> > +        gdb.execute("set pagination %s" % ("on" if self.saved_pagination else "off"))
> > +
> > +        # unblock SIGINT
> > +        if self.blocked_sigint:
> > +            sigmask = signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGINT})
> > +            self.blocked_sigint = False
> >  
> >      def _update_module_files(self):
> >          self.module_files = []
> > @@ -107,7 +122,7 @@ lx-symbols command."""
> >                      name=section_name, addr=str(address)))
> >          return "".join(args)
> >  
> > -    def load_module_symbols(self, module):
> > +    def _do_load_module_symbols(self, module):
> >          module_name = module['name'].string()
> >          module_addr = str(module['core_layout']['base']).split()[0]
> >  
> > @@ -130,56 +145,112 @@ lx-symbols command."""
> >                  addr=module_addr,
> >                  sections=self._section_arguments(module))
> >              gdb.execute(cmdline, to_string=True)
> > -            if module_name not in self.loaded_modules:
> > -                self.loaded_modules.append(module_name)
> > +
> > +            self.loaded_modules[module_name] = {"module_file": module_file,
> > +                                                "module_addr": module_addr}
> >          else:
> >              gdb.write("no module object found for '{0}'\n".format(module_name))
> >  
> > +
> > +    def load_module_symbols(self):
> > +        module = gdb.parse_and_eval("mod")
> > +
> > +        # module already loaded, false alarm
> > +        # can happen if 'do_init_module' breakpoint is hit multiple times
> > +        # due to interrupts
> 
> Maybe better move this comment into the if-clause to make the context
> clearer.

Good idea, will do.

> 
> > +        module_name = module['name'].string()
> > +        if module_name in self.loaded_modules:
> > +            gdb.write("spurious module load breakpoint\n")
> > +            return
> > +
> > +        # enforce update if object file is not found
> > +        self.module_files_updated = False
> > +        self._prepare_for_module_load_unload()
> > +        try:
> > +            self._do_load_module_symbols(module)
> > +        finally:
> > +            self._unprepare_for_module_load_unload()
> > +
> > +
> > +    def unload_module_symbols(self):
> > +        module = gdb.parse_and_eval("mod")
> > +        module_name = module['name'].string()
> > +
> > +        # module already unloaded, false alarm
> > +        # can happen if 'free_module' breakpoint is hit multiple times
> > +        # due to interrupts
> 
> Same as above.
> 
> > +        if not module_name in self.loaded_modules:
> > +            gdb.write("spurious module unload breakpoint\n")
> > +            return
> > +
> > +        module_file = self.loaded_modules[module_name]["module_file"]
> > +        module_addr = self.loaded_modules[module_name]["module_addr"]
> > +
> > +        self._prepare_for_module_load_unload()
> > +        try:
> > +            gdb.write("unloading @{addr}: {filename}\n".format(
> > +                addr=module_addr, filename=module_file))
> > +            cmdline = "remove-symbol-file {filename}".format(
> > +                filename=module_file)
> > +            gdb.execute(cmdline, to_string=True)
> > +            del self.loaded_modules[module_name]
> > +
> > +        finally:
> > +            self._unprepare_for_module_load_unload()
> > +
> >      def load_all_symbols(self):
> >          gdb.write("loading vmlinux\n")
> >  
> > -        # Dropping symbols will disable all breakpoints. So save their states
> > -        # and restore them afterward.
> > -        saved_states = []
> > -        if hasattr(gdb, 'breakpoints') and not gdb.breakpoints() is None:
> > -            for bp in gdb.breakpoints():
> > -                saved_states.append({'breakpoint': bp, 'enabled': bp.enabled})
> > -
> > -        # drop all current symbols and reload vmlinux
> > -        orig_vmlinux = 'vmlinux'
> > -        for obj in gdb.objfiles():
> > -            if obj.filename.endswith('vmlinux'):
> > -                orig_vmlinux = obj.filename
> > -        gdb.execute("symbol-file", to_string=True)
> > -        gdb.execute("symbol-file {0}".format(orig_vmlinux))
> > -
> > -        self.loaded_modules = []
> > -        module_list = modules.module_list()
> > -        if not module_list:
> > -            gdb.write("no modules found\n")
> > -        else:
> > -            [self.load_module_symbols(module) for module in module_list]
> > +        self._prepare_for_module_load_unload()
> > +        try:
> > +            # drop all current symbols and reload vmlinux
> > +            orig_vmlinux = 'vmlinux'
> > +            for obj in gdb.objfiles():
> > +                if obj.filename.endswith('vmlinux'):
> > +                    orig_vmlinux = obj.filename
> > +            gdb.execute("symbol-file", to_string=True)
> > +            gdb.execute("symbol-file {0}".format(orig_vmlinux))
> > +            self.loaded_modules = {}
> > +            module_list = modules.module_list()
> > +            if not module_list:
> > +                gdb.write("no modules found\n")
> > +            else:
> > +                [self._do_load_module_symbols(module) for module in module_list]
> 
> Is that common python style? Elsewhere, you do
> 
> for var in list:
>     function(var)

It is a code I moved verbatim from the above. 
I can change it to use the more common syntax.

> 
> > +        finally:
> > +            self._unprepare_for_module_load_unload()
> >  
> > -        for saved_state in saved_states:
> > -            saved_state['breakpoint'].enabled = saved_state['enabled']
> > +        self._unprepare_for_module_load_unload()
> >  
> >      def invoke(self, arg, from_tty):
> >          self.module_paths = [os.path.abspath(os.path.expanduser(p))
> >                               for p in arg.split()]
> >          self.module_paths.append(os.getcwd())
> >  
> > +        try:
> > +            gdb.parse_and_eval("*start_kernel").fetch_lazy()
> > +        except gdb.MemoryError:
> > +            gdb.write("Error: Kernel is not yet loaded\n")
> > +            return
> > +
> >          # enforce update
> >          self.module_files = []
> >          self.module_files_updated = False
> >  
> > +        for bp in self.internal_breakpoints:
> > +            bp.delete()
> > +        self.internal_breakpoints = []
> > +
> >          self.load_all_symbols()
> >  
> >          if hasattr(gdb, 'Breakpoint'):
> > -            if self.breakpoint is not None:
> > -                self.breakpoint.delete()
> > -                self.breakpoint = None
> > -            self.breakpoint = LoadModuleBreakpoint(
> > -                "kernel/module.c:do_init_module", self)
> > +            self.internal_breakpoints.append(
> > +                BreakpointWrapper(self.load_module_symbols,
> > +                                  spec="kernel/module.c:do_init_module",
> > +                                  ))
> > +            self.internal_breakpoints.append(
> > +                BreakpointWrapper(self.unload_module_symbols,
> > +                                  spec="kernel/module.c:free_module",
> > +                                  ))
> >          else:
> >              gdb.write("Note: symbol update on module loading not supported "
> >                        "with this gdb version\n")
> > 
> 
> The logic of the changes look good to me.

Thanks!

Best regards,
	Maxim Levitsky

> 
> Jan
> 



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] scripts/gdb: rework lx-symbols gdb script
  2021-08-11 20:10   ` Maxim Levitsky
@ 2021-08-11 20:15     ` Jan Kiszka
  2021-08-11 20:41       ` Maxim Levitsky
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Kiszka @ 2021-08-11 20:15 UTC (permalink / raw)
  To: Maxim Levitsky, linux-kernel
  Cc: Johannes Berg, Jessica Yu, Andrew Morton, Kieran Bingham

On 11.08.21 22:10, Maxim Levitsky wrote:
> On Wed, 2021-08-11 at 21:01 +0200, Jan Kiszka wrote:
>> On 11.08.21 15:31, Maxim Levitsky wrote:
>>> Fix several issues that are present in lx-symbols script:
>>>
>>> * Track module unloads by placing another software breakpoint at
>>>   'free_module'
>>>   (force uninline this symbol just in case), and use remove-symbol-file
>>>   gdb command to unload the symobls of the module that is unloading.
>>>
>>>   That gives the gdb a chance to mark all software breakpoints from
>>>   this module as pending again.
>>>   Also remove the module from the 'known' module list once it is unloaded.
>>>
>>> * Since we now track module unload, we don't need to reload all
>>>   symbols anymore when 'known' module loaded again
>>>   (that can't happen anymore).
>>>   This allows reloading a module in the debugged kernel to finish
>>>   much faster, while lx-symbols tracks module loads and unloads.
>>>
>>> * Disable/enable all gdb breakpoints on both module load and unload
>>>   breakpoint hits, and not only in 'load_all_symbols' as was done before.
>>>   (load_all_symbols is no longer called on breakpoint hit)
>>>   That allows gdb to avoid getting confused about the state of the
>>>   (now two) internal breakpoints we place.
>>>   Otherwise it will leave them in the kernel code segment, when
>>>   continuing which triggers a guest kernel panic as soon as it skips
>>>   over the 'int3' instruction and executes the garbage tail of the optcode
>>>   on which the breakpoint was placed.
>>>
>>> * Block SIGINT while the script is running as this seems to crash gdb
>>>
>>> * Add a basic check that kernel is already loaded into the guest memory
>>>   to avoid confusing errors.
>>>
>>> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
>>> ---
>>>  kernel/module.c              |   8 +-
>>>  scripts/gdb/linux/symbols.py | 203 +++++++++++++++++++++++------------
>>>  2 files changed, 143 insertions(+), 68 deletions(-)
>>>
>>> diff --git a/kernel/module.c b/kernel/module.c
>>> index ed13917ea5f3..242bd4bb0b55 100644
>>> --- a/kernel/module.c
>>> +++ b/kernel/module.c
>>> @@ -906,8 +906,12 @@ int module_refcount(struct module *mod)
>>>  }
>>>  EXPORT_SYMBOL(module_refcount);
>>>  
>>> -/* This exists whether we can unload or not */
>>> -static void free_module(struct module *mod);
>>> +/* This exists whether we can unload or not
>>> + * Keep it uninlined to provide a reliable breakpoint target,
>>> + * e.g. for the gdb helper command 'lx-symbols'.
>>> + */
>>> +
>>> +static noinline void free_module(struct module *mod);
>>>  
>>>  SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
>>>  		unsigned int, flags)
>>
>> You likely want and need to push that as separate patch, analogously to
>> be02a1862304.
> 
> I will do.
> 
>>
>> And as you are factoring the patch, maybe think about whether you can
>> split the changes to symbols.py into logical steps as well. The commit
>> messages suggests that, thought that might be misleading.
> 
> I can try doing that.
> 
>>
>>> diff --git a/scripts/gdb/linux/symbols.py b/scripts/gdb/linux/symbols.py
>>> index 08d264ac328b..78e278fb4bad 100644
>>> --- a/scripts/gdb/linux/symbols.py
>>> +++ b/scripts/gdb/linux/symbols.py
>>> @@ -14,45 +14,23 @@
>>>  import gdb
>>>  import os
>>>  import re
>>> +import signal
>>>  
>>>  from linux import modules, utils
>>>  
>>>  
>>>  if hasattr(gdb, 'Breakpoint'):
>>> -    class LoadModuleBreakpoint(gdb.Breakpoint):
>>> -        def __init__(self, spec, gdb_command):
>>> -            super(LoadModuleBreakpoint, self).__init__(spec, internal=True)
>>> +
>>> +    class BreakpointWrapper(gdb.Breakpoint):
>>> +        def __init__(self, callback, **kwargs):
>>> +            super(BreakpointWrapper, self).__init__(internal=True, **kwargs)
>>>              self.silent = True
>>> -            self.gdb_command = gdb_command
>>> +            self.callback = callback
>>>  
>>>          def stop(self):
>>> -            module = gdb.parse_and_eval("mod")
>>> -            module_name = module['name'].string()
>>> -            cmd = self.gdb_command
>>> -
>>> -            # enforce update if object file is not found
>>> -            cmd.module_files_updated = False
>>> -
>>> -            # Disable pagination while reporting symbol (re-)loading.
>>> -            # The console input is blocked in this context so that we would
>>> -            # get stuck waiting for the user to acknowledge paged output.
>>> -            show_pagination = gdb.execute("show pagination", to_string=True)
>>> -            pagination = show_pagination.endswith("on.\n")
>>> -            gdb.execute("set pagination off")
>>> -
>>> -            if module_name in cmd.loaded_modules:
>>> -                gdb.write("refreshing all symbols to reload module "
>>> -                          "'{0}'\n".format(module_name))
>>> -                cmd.load_all_symbols()
>>> -            else:
>>> -                cmd.load_module_symbols(module)
>>> -
>>> -            # restore pagination state
>>> -            gdb.execute("set pagination %s" % ("on" if pagination else "off"))
>>> -
>>> +            self.callback()
>>>              return False
>>>  
>>> -
>>>  class LxSymbols(gdb.Command):
>>>      """(Re-)load symbols of Linux kernel and currently loaded modules.
>>>  
>>> @@ -61,15 +39,52 @@ are scanned recursively, starting in the same directory. Optionally, the module
>>>  search path can be extended by a space separated list of paths passed to the
>>>  lx-symbols command."""
>>>  
>>> -    module_paths = []
>>> -    module_files = []
>>> -    module_files_updated = False
>>> -    loaded_modules = []
>>> -    breakpoint = None
>>> -
>>>      def __init__(self):
>>>          super(LxSymbols, self).__init__("lx-symbols", gdb.COMMAND_FILES,
>>>                                          gdb.COMPLETE_FILENAME)
>>> +        self.module_paths = []
>>> +        self.module_files = []
>>> +        self.module_files_updated = False
>>> +        self.loaded_modules = {}
>>> +        self.internal_breakpoints = []
>>> +
>>> +    # prepare GDB for loading/unloading a module
>>> +    def _prepare_for_module_load_unload(self):
>>> +
>>> +        self.blocked_sigint = False
>>> +
>>> +        # block SIGINT during execution to avoid gdb crash
>>> +        sigmask = signal.pthread_sigmask(signal.SIG_BLOCK, [])
>>> +        if not signal.SIGINT in sigmask:
>>> +            self.blocked_sigint = True
>>> +            signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGINT})
>>> +
>>> +        # disable all breakpoints to workaround a GDB bug where it would
>>> +        # not correctly resume from an internal breakpoint we placed
>>> +        # in do_module_init/free_module (it leaves the int3
>>
>> Seems the comment ends prematurely.
> 
> Yep, GDB leaves the int3 instruction in the guest memory, and the guest dies after
> it encounters the truncated instruction that follows it.
> 
>>
>> Any reference to a gdb bug tracker entry? Or affected versions? The
>> description is a bit too fuzzy.
> 
> Well stricly speaking this isn't a GDB bug.
> GDB documentation explictly prohibits what we are doing in this script.
> 
> https://sourceware.org/gdb/current/onlinedocs/gdb/Breakpoints-In-Python.html
> 
> "You should not alter the execution state of the inferior (i.e., step, next, etc.), alter the current frame context 
> (i.e., change the current active frame), or alter, add or delete any breakpoint. 
> As a general rule, you should not alter any data within GDB or the inferior at this time."
> 
> However we are reloading the whole symbol table as a response to a breakpoint.
> 
> However there is pretty much no other way to do what this script does so the next best thing
> is to workaround this as was already partially done, and I just made it more robust.
> 
> Same for blocking SIGINT which I added, which otherwise crashes GDB
> while the symbols are reloading.
> It probably will also be blamed on this.
> 
> Do you think I might have some luck taking with GDB maintainers and asking them to support
> this use case of updating symbol table when a breakpoint hits?
> 

We should at least clarify if it's a GDB bug or our use case is outside
of the envisioned ones, thus need to account for that. Then we should
not call it a bug.

[...]

>>> +            if not module_list:
>>> +                gdb.write("no modules found\n")
>>> +            else:
>>> +                [self._do_load_module_symbols(module) for module in module_list]
>>
>> Is that common python style? Elsewhere, you do
>>
>> for var in list:
>>     function(var)
> 
> It is a code I moved verbatim from the above. 
> I can change it to use the more common syntax.

Oh, missed that. And it seems I once wrote it this way - no idea anymore
why...

Jan

-- 
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] scripts/gdb: rework lx-symbols gdb script
  2021-08-11 20:15     ` Jan Kiszka
@ 2021-08-11 20:41       ` Maxim Levitsky
  2021-08-12  5:40         ` Jan Kiszka
  0 siblings, 1 reply; 6+ messages in thread
From: Maxim Levitsky @ 2021-08-11 20:41 UTC (permalink / raw)
  To: Jan Kiszka, linux-kernel
  Cc: Johannes Berg, Jessica Yu, Andrew Morton, Kieran Bingham

On Wed, 2021-08-11 at 22:15 +0200, Jan Kiszka wrote:
> On 11.08.21 22:10, Maxim Levitsky wrote:
> > On Wed, 2021-08-11 at 21:01 +0200, Jan Kiszka wrote:
> > > On 11.08.21 15:31, Maxim Levitsky wrote:
> > > > Fix several issues that are present in lx-symbols script:
> > > > 
> > > > * Track module unloads by placing another software breakpoint at
> > > >   'free_module'
> > > >   (force uninline this symbol just in case), and use remove-symbol-file
> > > >   gdb command to unload the symobls of the module that is unloading.
> > > > 
> > > >   That gives the gdb a chance to mark all software breakpoints from
> > > >   this module as pending again.
> > > >   Also remove the module from the 'known' module list once it is unloaded.
> > > > 
> > > > * Since we now track module unload, we don't need to reload all
> > > >   symbols anymore when 'known' module loaded again
> > > >   (that can't happen anymore).
> > > >   This allows reloading a module in the debugged kernel to finish
> > > >   much faster, while lx-symbols tracks module loads and unloads.
> > > > 
> > > > * Disable/enable all gdb breakpoints on both module load and unload
> > > >   breakpoint hits, and not only in 'load_all_symbols' as was done before.
> > > >   (load_all_symbols is no longer called on breakpoint hit)
> > > >   That allows gdb to avoid getting confused about the state of the
> > > >   (now two) internal breakpoints we place.
> > > >   Otherwise it will leave them in the kernel code segment, when
> > > >   continuing which triggers a guest kernel panic as soon as it skips
> > > >   over the 'int3' instruction and executes the garbage tail of the optcode
> > > >   on which the breakpoint was placed.
> > > > 
> > > > * Block SIGINT while the script is running as this seems to crash gdb
> > > > 
> > > > * Add a basic check that kernel is already loaded into the guest memory
> > > >   to avoid confusing errors.
> > > > 
> > > > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > > > ---
> > > >  kernel/module.c              |   8 +-
> > > >  scripts/gdb/linux/symbols.py | 203 +++++++++++++++++++++++------------
> > > >  2 files changed, 143 insertions(+), 68 deletions(-)
> > > > 
> > > > diff --git a/kernel/module.c b/kernel/module.c
> > > > index ed13917ea5f3..242bd4bb0b55 100644
> > > > --- a/kernel/module.c
> > > > +++ b/kernel/module.c
> > > > @@ -906,8 +906,12 @@ int module_refcount(struct module *mod)
> > > >  }
> > > >  EXPORT_SYMBOL(module_refcount);
> > > >  
> > > > -/* This exists whether we can unload or not */
> > > > -static void free_module(struct module *mod);
> > > > +/* This exists whether we can unload or not
> > > > + * Keep it uninlined to provide a reliable breakpoint target,
> > > > + * e.g. for the gdb helper command 'lx-symbols'.
> > > > + */
> > > > +
> > > > +static noinline void free_module(struct module *mod);
> > > >  
> > > >  SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
> > > >  		unsigned int, flags)
> > > 
> > > You likely want and need to push that as separate patch, analogously to
> > > be02a1862304.
> > 
> > I will do.
> > 
> > > And as you are factoring the patch, maybe think about whether you can
> > > split the changes to symbols.py into logical steps as well. The commit
> > > messages suggests that, thought that might be misleading.
> > 
> > I can try doing that.
> > 
> > > > diff --git a/scripts/gdb/linux/symbols.py b/scripts/gdb/linux/symbols.py
> > > > index 08d264ac328b..78e278fb4bad 100644
> > > > --- a/scripts/gdb/linux/symbols.py
> > > > +++ b/scripts/gdb/linux/symbols.py
> > > > @@ -14,45 +14,23 @@
> > > >  import gdb
> > > >  import os
> > > >  import re
> > > > +import signal
> > > >  
> > > >  from linux import modules, utils
> > > >  
> > > >  
> > > >  if hasattr(gdb, 'Breakpoint'):
> > > > -    class LoadModuleBreakpoint(gdb.Breakpoint):
> > > > -        def __init__(self, spec, gdb_command):
> > > > -            super(LoadModuleBreakpoint, self).__init__(spec, internal=True)
> > > > +
> > > > +    class BreakpointWrapper(gdb.Breakpoint):
> > > > +        def __init__(self, callback, **kwargs):
> > > > +            super(BreakpointWrapper, self).__init__(internal=True, **kwargs)
> > > >              self.silent = True
> > > > -            self.gdb_command = gdb_command
> > > > +            self.callback = callback
> > > >  
> > > >          def stop(self):
> > > > -            module = gdb.parse_and_eval("mod")
> > > > -            module_name = module['name'].string()
> > > > -            cmd = self.gdb_command
> > > > -
> > > > -            # enforce update if object file is not found
> > > > -            cmd.module_files_updated = False
> > > > -
> > > > -            # Disable pagination while reporting symbol (re-)loading.
> > > > -            # The console input is blocked in this context so that we would
> > > > -            # get stuck waiting for the user to acknowledge paged output.
> > > > -            show_pagination = gdb.execute("show pagination", to_string=True)
> > > > -            pagination = show_pagination.endswith("on.\n")
> > > > -            gdb.execute("set pagination off")
> > > > -
> > > > -            if module_name in cmd.loaded_modules:
> > > > -                gdb.write("refreshing all symbols to reload module "
> > > > -                          "'{0}'\n".format(module_name))
> > > > -                cmd.load_all_symbols()
> > > > -            else:
> > > > -                cmd.load_module_symbols(module)
> > > > -
> > > > -            # restore pagination state
> > > > -            gdb.execute("set pagination %s" % ("on" if pagination else "off"))
> > > > -
> > > > +            self.callback()
> > > >              return False
> > > >  
> > > > -
> > > >  class LxSymbols(gdb.Command):
> > > >      """(Re-)load symbols of Linux kernel and currently loaded modules.
> > > >  
> > > > @@ -61,15 +39,52 @@ are scanned recursively, starting in the same directory. Optionally, the module
> > > >  search path can be extended by a space separated list of paths passed to the
> > > >  lx-symbols command."""
> > > >  
> > > > -    module_paths = []
> > > > -    module_files = []
> > > > -    module_files_updated = False
> > > > -    loaded_modules = []
> > > > -    breakpoint = None
> > > > -
> > > >      def __init__(self):
> > > >          super(LxSymbols, self).__init__("lx-symbols", gdb.COMMAND_FILES,
> > > >                                          gdb.COMPLETE_FILENAME)
> > > > +        self.module_paths = []
> > > > +        self.module_files = []
> > > > +        self.module_files_updated = False
> > > > +        self.loaded_modules = {}
> > > > +        self.internal_breakpoints = []
> > > > +
> > > > +    # prepare GDB for loading/unloading a module
> > > > +    def _prepare_for_module_load_unload(self):
> > > > +
> > > > +        self.blocked_sigint = False
> > > > +
> > > > +        # block SIGINT during execution to avoid gdb crash
> > > > +        sigmask = signal.pthread_sigmask(signal.SIG_BLOCK, [])
> > > > +        if not signal.SIGINT in sigmask:
> > > > +            self.blocked_sigint = True
> > > > +            signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGINT})
> > > > +
> > > > +        # disable all breakpoints to workaround a GDB bug where it would
> > > > +        # not correctly resume from an internal breakpoint we placed
> > > > +        # in do_module_init/free_module (it leaves the int3
> > > 
> > > Seems the comment ends prematurely.
> > 
> > Yep, GDB leaves the int3 instruction in the guest memory, and the guest dies after
> > it encounters the truncated instruction that follows it.
> > 
> > > Any reference to a gdb bug tracker entry? Or affected versions? The
> > > description is a bit too fuzzy.
> > 
> > Well stricly speaking this isn't a GDB bug.
> > GDB documentation explictly prohibits what we are doing in this script.
> > 
> > https://sourceware.org/gdb/current/onlinedocs/gdb/Breakpoints-In-Python.html
> > 
> > "You should not alter the execution state of the inferior (i.e., step, next, etc.), alter the current frame context 
> > (i.e., change the current active frame), or alter, add or delete any breakpoint. 
> > As a general rule, you should not alter any data within GDB or the inferior at this time."
> > 
> > However we are reloading the whole symbol table as a response to a breakpoint.
> > 
> > However there is pretty much no other way to do what this script does so the next best thing
> > is to workaround this as was already partially done, and I just made it more robust.
> > 
> > Same for blocking SIGINT which I added, which otherwise crashes GDB
> > while the symbols are reloading.
> > It probably will also be blamed on this.
> > 
> > Do you think I might have some luck taking with GDB maintainers and asking them to support
> > this use case of updating symbol table when a breakpoint hits?
> > 
> 
> We should at least clarify if it's a GDB bug or our use case is outside
> of the envisioned ones, thus need to account for that. Then we should
> not call it a bug.

100% agree.

Do you think it makes sense to CC gdb@sourceware.org to this discussion or
should I make a new post there. I do think I have some energy to try and discuss
this issue with them.

> 
> [...]
> 
> > > > +            if not module_list:
> > > > +                gdb.write("no modules found\n")
> > > > +            else:
> > > > +                [self._do_load_module_symbols(module) for module in module_list]
> > > 
> > > Is that common python style? Elsewhere, you do
> > > 
> > > for var in list:
> > >     function(var)
> > 
> > It is a code I moved verbatim from the above. 
> > I can change it to use the more common syntax.
> 
> Oh, missed that. And it seems I once wrote it this way - no idea anymore
> why...

Python has various wierd syntaxes which we all at some point tried to adopt,
but eventually reverted back to something simple.

Thanks for the review!

Best regards,
	Maxim Levitsky

> 
> Jan
> 



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] scripts/gdb: rework lx-symbols gdb script
  2021-08-11 20:41       ` Maxim Levitsky
@ 2021-08-12  5:40         ` Jan Kiszka
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Kiszka @ 2021-08-12  5:40 UTC (permalink / raw)
  To: Maxim Levitsky, linux-kernel
  Cc: Johannes Berg, Jessica Yu, Andrew Morton, Kieran Bingham

On 11.08.21 22:41, Maxim Levitsky wrote:
> On Wed, 2021-08-11 at 22:15 +0200, Jan Kiszka wrote:
>> On 11.08.21 22:10, Maxim Levitsky wrote:
>>> On Wed, 2021-08-11 at 21:01 +0200, Jan Kiszka wrote:
>>>> On 11.08.21 15:31, Maxim Levitsky wrote:
>>>>> Fix several issues that are present in lx-symbols script:
>>>>>
>>>>> * Track module unloads by placing another software breakpoint at
>>>>>   'free_module'
>>>>>   (force uninline this symbol just in case), and use remove-symbol-file
>>>>>   gdb command to unload the symobls of the module that is unloading.
>>>>>
>>>>>   That gives the gdb a chance to mark all software breakpoints from
>>>>>   this module as pending again.
>>>>>   Also remove the module from the 'known' module list once it is unloaded.
>>>>>
>>>>> * Since we now track module unload, we don't need to reload all
>>>>>   symbols anymore when 'known' module loaded again
>>>>>   (that can't happen anymore).
>>>>>   This allows reloading a module in the debugged kernel to finish
>>>>>   much faster, while lx-symbols tracks module loads and unloads.
>>>>>
>>>>> * Disable/enable all gdb breakpoints on both module load and unload
>>>>>   breakpoint hits, and not only in 'load_all_symbols' as was done before.
>>>>>   (load_all_symbols is no longer called on breakpoint hit)
>>>>>   That allows gdb to avoid getting confused about the state of the
>>>>>   (now two) internal breakpoints we place.
>>>>>   Otherwise it will leave them in the kernel code segment, when
>>>>>   continuing which triggers a guest kernel panic as soon as it skips
>>>>>   over the 'int3' instruction and executes the garbage tail of the optcode
>>>>>   on which the breakpoint was placed.
>>>>>
>>>>> * Block SIGINT while the script is running as this seems to crash gdb
>>>>>
>>>>> * Add a basic check that kernel is already loaded into the guest memory
>>>>>   to avoid confusing errors.
>>>>>
>>>>> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
>>>>> ---
>>>>>  kernel/module.c              |   8 +-
>>>>>  scripts/gdb/linux/symbols.py | 203 +++++++++++++++++++++++------------
>>>>>  2 files changed, 143 insertions(+), 68 deletions(-)
>>>>>
>>>>> diff --git a/kernel/module.c b/kernel/module.c
>>>>> index ed13917ea5f3..242bd4bb0b55 100644
>>>>> --- a/kernel/module.c
>>>>> +++ b/kernel/module.c
>>>>> @@ -906,8 +906,12 @@ int module_refcount(struct module *mod)
>>>>>  }
>>>>>  EXPORT_SYMBOL(module_refcount);
>>>>>  
>>>>> -/* This exists whether we can unload or not */
>>>>> -static void free_module(struct module *mod);
>>>>> +/* This exists whether we can unload or not
>>>>> + * Keep it uninlined to provide a reliable breakpoint target,
>>>>> + * e.g. for the gdb helper command 'lx-symbols'.
>>>>> + */
>>>>> +
>>>>> +static noinline void free_module(struct module *mod);
>>>>>  
>>>>>  SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
>>>>>  		unsigned int, flags)
>>>>
>>>> You likely want and need to push that as separate patch, analogously to
>>>> be02a1862304.
>>>
>>> I will do.
>>>
>>>> And as you are factoring the patch, maybe think about whether you can
>>>> split the changes to symbols.py into logical steps as well. The commit
>>>> messages suggests that, thought that might be misleading.
>>>
>>> I can try doing that.
>>>
>>>>> diff --git a/scripts/gdb/linux/symbols.py b/scripts/gdb/linux/symbols.py
>>>>> index 08d264ac328b..78e278fb4bad 100644
>>>>> --- a/scripts/gdb/linux/symbols.py
>>>>> +++ b/scripts/gdb/linux/symbols.py
>>>>> @@ -14,45 +14,23 @@
>>>>>  import gdb
>>>>>  import os
>>>>>  import re
>>>>> +import signal
>>>>>  
>>>>>  from linux import modules, utils
>>>>>  
>>>>>  
>>>>>  if hasattr(gdb, 'Breakpoint'):
>>>>> -    class LoadModuleBreakpoint(gdb.Breakpoint):
>>>>> -        def __init__(self, spec, gdb_command):
>>>>> -            super(LoadModuleBreakpoint, self).__init__(spec, internal=True)
>>>>> +
>>>>> +    class BreakpointWrapper(gdb.Breakpoint):
>>>>> +        def __init__(self, callback, **kwargs):
>>>>> +            super(BreakpointWrapper, self).__init__(internal=True, **kwargs)
>>>>>              self.silent = True
>>>>> -            self.gdb_command = gdb_command
>>>>> +            self.callback = callback
>>>>>  
>>>>>          def stop(self):
>>>>> -            module = gdb.parse_and_eval("mod")
>>>>> -            module_name = module['name'].string()
>>>>> -            cmd = self.gdb_command
>>>>> -
>>>>> -            # enforce update if object file is not found
>>>>> -            cmd.module_files_updated = False
>>>>> -
>>>>> -            # Disable pagination while reporting symbol (re-)loading.
>>>>> -            # The console input is blocked in this context so that we would
>>>>> -            # get stuck waiting for the user to acknowledge paged output.
>>>>> -            show_pagination = gdb.execute("show pagination", to_string=True)
>>>>> -            pagination = show_pagination.endswith("on.\n")
>>>>> -            gdb.execute("set pagination off")
>>>>> -
>>>>> -            if module_name in cmd.loaded_modules:
>>>>> -                gdb.write("refreshing all symbols to reload module "
>>>>> -                          "'{0}'\n".format(module_name))
>>>>> -                cmd.load_all_symbols()
>>>>> -            else:
>>>>> -                cmd.load_module_symbols(module)
>>>>> -
>>>>> -            # restore pagination state
>>>>> -            gdb.execute("set pagination %s" % ("on" if pagination else "off"))
>>>>> -
>>>>> +            self.callback()
>>>>>              return False
>>>>>  
>>>>> -
>>>>>  class LxSymbols(gdb.Command):
>>>>>      """(Re-)load symbols of Linux kernel and currently loaded modules.
>>>>>  
>>>>> @@ -61,15 +39,52 @@ are scanned recursively, starting in the same directory. Optionally, the module
>>>>>  search path can be extended by a space separated list of paths passed to the
>>>>>  lx-symbols command."""
>>>>>  
>>>>> -    module_paths = []
>>>>> -    module_files = []
>>>>> -    module_files_updated = False
>>>>> -    loaded_modules = []
>>>>> -    breakpoint = None
>>>>> -
>>>>>      def __init__(self):
>>>>>          super(LxSymbols, self).__init__("lx-symbols", gdb.COMMAND_FILES,
>>>>>                                          gdb.COMPLETE_FILENAME)
>>>>> +        self.module_paths = []
>>>>> +        self.module_files = []
>>>>> +        self.module_files_updated = False
>>>>> +        self.loaded_modules = {}
>>>>> +        self.internal_breakpoints = []
>>>>> +
>>>>> +    # prepare GDB for loading/unloading a module
>>>>> +    def _prepare_for_module_load_unload(self):
>>>>> +
>>>>> +        self.blocked_sigint = False
>>>>> +
>>>>> +        # block SIGINT during execution to avoid gdb crash
>>>>> +        sigmask = signal.pthread_sigmask(signal.SIG_BLOCK, [])
>>>>> +        if not signal.SIGINT in sigmask:
>>>>> +            self.blocked_sigint = True
>>>>> +            signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGINT})
>>>>> +
>>>>> +        # disable all breakpoints to workaround a GDB bug where it would
>>>>> +        # not correctly resume from an internal breakpoint we placed
>>>>> +        # in do_module_init/free_module (it leaves the int3
>>>>
>>>> Seems the comment ends prematurely.
>>>
>>> Yep, GDB leaves the int3 instruction in the guest memory, and the guest dies after
>>> it encounters the truncated instruction that follows it.
>>>
>>>> Any reference to a gdb bug tracker entry? Or affected versions? The
>>>> description is a bit too fuzzy.
>>>
>>> Well stricly speaking this isn't a GDB bug.
>>> GDB documentation explictly prohibits what we are doing in this script.
>>>
>>> https://sourceware.org/gdb/current/onlinedocs/gdb/Breakpoints-In-Python.html
>>>
>>> "You should not alter the execution state of the inferior (i.e., step, next, etc.), alter the current frame context 
>>> (i.e., change the current active frame), or alter, add or delete any breakpoint. 
>>> As a general rule, you should not alter any data within GDB or the inferior at this time."
>>>
>>> However we are reloading the whole symbol table as a response to a breakpoint.
>>>
>>> However there is pretty much no other way to do what this script does so the next best thing
>>> is to workaround this as was already partially done, and I just made it more robust.
>>>
>>> Same for blocking SIGINT which I added, which otherwise crashes GDB
>>> while the symbols are reloading.
>>> It probably will also be blamed on this.
>>>
>>> Do you think I might have some luck taking with GDB maintainers and asking them to support
>>> this use case of updating symbol table when a breakpoint hits?
>>>
>>
>> We should at least clarify if it's a GDB bug or our use case is outside
>> of the envisioned ones, thus need to account for that. Then we should
>> not call it a bug.
> 
> 100% agree.
> 
> Do you think it makes sense to CC gdb@sourceware.org to this discussion or
> should I make a new post there. I do think I have some energy to try and discuss
> this issue with them.

Better start a new thread there, setting the full context and
referencing this one.

Jan

-- 
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-08-12  5:40 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-11 13:31 [PATCH] scripts/gdb: rework lx-symbols gdb script Maxim Levitsky
2021-08-11 19:01 ` Jan Kiszka
2021-08-11 20:10   ` Maxim Levitsky
2021-08-11 20:15     ` Jan Kiszka
2021-08-11 20:41       ` Maxim Levitsky
2021-08-12  5:40         ` Jan Kiszka

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.