linux-debuggers.vger.kernel.org archive mirror
* Linux Kernel Debugging Tools Monthly Meeting on Wednesday, January 25th
From: Omar Sandoval @ 2023-01-23 20:10 UTC
  To: linux-debuggers

Hello! The first Linux Kernel Debugging Tools meeting of 2023 is this
Wednesday, January 25th at 11:30 AM Pacific time. We've been having
these for a few years as a forum to discuss development of Linux kernel
debugging tools like drgn, crash, and more. Now that we have this
mailing list, I figured it'd be nice to publicize it more widely. If you
would like to attend, please email me offlist.

The agenda so far is roughly:

- Is there a good way to iterate over all online `struct page`s? See
  https://github.com/osandov/drgn/pull/228
- Pending drgn work
    - Pluggable symbol finder from Stephen
    - Slab helpers from Imran
- Dealing with `DW_OP_entry_value` and `DW_TAG_call_site_parameter`. See
  https://github.com/osandov/drgn/issues/233
- Call for drgn/contrib scripts:
  https://github.com/osandov/drgn/tree/main/contrib

Please reply with anything else you'd like to add to the agenda.

Thanks!
Omar


* Re: Linux Kernel Debugging Tools Monthly Meeting on Wednesday, January 25th
From: Omar Sandoval @ 2023-01-25 21:36 UTC
  To: linux-debuggers

Here are notes from the meeting:

- There doesn't seem to be a better way currently to get the constants
  needed for iterating sparsemem. We should bite the bullet and make
  them enums upstream. This won't be too hard, but it'll be a bit
  tedious since they're defined separately for each architecture. And it
  will only solve it for new kernels.
- Stephen Brennan is working on making symbol finding pluggable (as
  opposed to only from ELF symbols). This will make it possible to look
  up symbols for kernel modules from `kallsyms` without loading the
  module debug info, as well as symbols for BPF functions, and more.
- Imran Khan has a bunch of new slab helpers, and on top of those,
  helpers for validating that the slab metadata makes sense, to help
  with debugging memory corruptions. Some of these require `slub_debug`
  to be enabled.
- Ross Zwisler is working on setting aside the vmcore from the kdump
  kernel for the full kernel to access. This works on x86 by marking it
  in the e820 table, but Arm will need something different (maybe in
  device tree).
- I'll merge the workaround for `DW_OP_entry_value` from
  https://github.com/osandov/drgn/issues/233, but there's potentially
  more we can do to actually recover the value when
  `DW_TAG_call_site_parameter` is available.
- Jeremy Carin, a teaching assistant at Columbia University's Operating
  Systems course, is looking for good resources to teach beginners
  kernel debugging.
    - We weren't aware of anything great that already exists, but we had
      some ideas for the future. Specifically, I would love to have
      step-by-step walkthroughs of kernel bugs, with the core dumps and
      debug info available for download so that you can follow along
      with drgn.
- I may or may not be around for the meeting next month since I have a
  baby coming around then.
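
To make the pluggable symbol finder item above a bit more concrete,
here is a minimal sketch of what a `kallsyms`-backed lookup could look
like. It is not drgn's API: the names `KallsymsEntry`, `parse_kallsyms`,
and `lookup_address` are hypothetical, and the sample addresses and the
module name are made up. The only thing taken as given is the usual
`/proc/kallsyms` text layout (address, type, name, optional `[module]`).

```python
import bisect
from typing import List, NamedTuple, Optional


class KallsymsEntry(NamedTuple):
    address: int
    kind: str               # symbol type letter, e.g. "T" or "t"
    name: str
    module: Optional[str]   # None for symbols in the core kernel


def parse_kallsyms(text: str) -> List[KallsymsEntry]:
    """Parse /proc/kallsyms-style text into entries sorted by address."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        module = fields[3].strip("[]") if len(fields) > 3 else None
        entries.append(
            KallsymsEntry(int(fields[0], 16), fields[1], fields[2], module)
        )
    entries.sort(key=lambda e: e.address)
    return entries


def lookup_address(
    entries: List[KallsymsEntry], address: int
) -> Optional[KallsymsEntry]:
    """Return the closest symbol at or below address, if any."""
    addresses = [e.address for e in entries]
    i = bisect.bisect_right(addresses, address) - 1
    return entries[i] if i >= 0 else None


# Made-up sample data in the /proc/kallsyms layout.
SAMPLE = (
    "ffffffff81000000 T _text\n"
    "ffffffff81001000 T do_one_initcall\n"
    "ffffffffc0002000 t hypothetical_helper\t[hypothetical_mod]\n"
)

if __name__ == "__main__":
    syms = parse_kallsyms(SAMPLE)
    print(lookup_address(syms, 0xFFFFFFFFC0002004))
```

Since `kallsyms` carries no symbol sizes, the lookup can only report the
nearest preceding symbol. A real finder would need to decide how to
handle addresses that fall in the gaps between symbols.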

Thanks everyone!


* Re: Linux Kernel Debugging Tools Monthly Meeting on Wednesday, January 25th
From: Stephen Brennan @ 2023-01-26  0:47 UTC
  To: Omar Sandoval, linux-debuggers

Thanks for the notes!

Sorry for running out the door at the end of that. A couple of comments
inline.

Omar Sandoval <osandov@osandov.com> writes:
> Here are notes from the meeting:
>
> - There doesn't seem to be a better way currently to get the constants
>   needed for iterating sparsemem. We should bite the bullet and make
>   them enums upstream. This won't be too hard, but it'll be a bit
>   tedious since they're defined separately for each architecture. And it
>   will only solve it for new kernels.

Initially I read this and thought: "future BTF and CTF type backends
wouldn't be able to take advantage of this". But the more I think about
it, the more I think I was wrong: BTF and CTF include enums and
enumerators. There is still an open question about when a type gets
included in BTF. For instance, if no function or global variable has an
enum type, will its definition still be included in the BTF? I need to
do more research there.

I only mention this because I want to keep track of what the future
"alternative debuginfo backends" will be able to support, i.e. which
drgn features will be available, so that users aren't surprised. E.g.,
source code mapping for stack frames is (maybe obviously) not available
for CTF and BTF.
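
As an illustration of why these constants matter, here is a small sketch
of the section arithmetic that iterating sparsemem relies on. The values
are assumptions: they are what I believe x86_64 with 4-level paging
uses, and they are hard-coded here precisely because they are
preprocessor macros in the kernel and so do not show up in debug info.
The helper names mirror the kernel's, but this is plain Python, not
drgn code.

```python
# Assumed x86_64 (4-level paging) values. Other architectures differ,
# which is why a debugger cannot simply hard-code them.
PAGE_SHIFT = 12
SECTION_SIZE_BITS = 27
MAX_PHYSMEM_BITS = 46

# Derived the same way the kernel derives them from the macros above.
PFN_SECTION_SHIFT = SECTION_SIZE_BITS - PAGE_SHIFT
PAGES_PER_SECTION = 1 << PFN_SECTION_SHIFT
NR_MEM_SECTIONS = 1 << (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)


def pfn_to_section_nr(pfn: int) -> int:
    """Index of the memory section containing this page frame number."""
    return pfn >> PFN_SECTION_SHIFT


def section_nr_to_pfn(section_nr: int) -> int:
    """First page frame number covered by this memory section."""
    return section_nr << PFN_SECTION_SHIFT


def section_pfns(section_nr: int) -> range:
    """All page frame numbers covered by one memory section."""
    start = section_nr_to_pfn(section_nr)
    return range(start, start + PAGES_PER_SECTION)


if __name__ == "__main__":
    print(PAGES_PER_SECTION, NR_MEM_SECTIONS)
    print(pfn_to_section_nr(0x100000))
```

If the constants were enumerators instead, a debugger could read them
from the type information rather than carrying a per-architecture table
like the one at the top of this sketch.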

> - Stephen Brennan is working on making symbol finding pluggable (as
>   opposed to only from ELF symbols). This will make it possible to look
>   up symbols for kernel modules from `kallsyms` without loading the
>   module debug info, as well as symbols for BPF functions, and more.
> - Imran Khan has a bunch of new slab helpers, and on top of those,
>   helpers for validating that the slab metadata makes sense, to help
>   with debugging memory corruptions. Some of these require `slub_debug`
>   to be enabled.
> - Ross Zwisler is working on setting aside the vmcore from the kdump
>   kernel for the full kernel to access. This works on x86 by marking it
>   in the e820 table, but Arm will need something different (maybe in
>   device tree).
> - I'll merge the workaround for `DW_OP_entry_value` from
>   https://github.com/osandov/drgn/issues/233, but there's potentially
>   more we can do to actually recover the value when
>   `DW_TAG_call_site_parameter` is available.

Honestly, it's a bit tough for me to visualize or understand what the
DW_TAG_call_site_parameter values would be, or even how common they are.

Do you think there's a possibility of some very basic, almost throwaway
implementation that could be put into an unstable API somewhere?
Anything from a patch that could be applied so I could distribute a
pre-compiled beta, all the way to a release containing an "unstable" API
that may be changed or removed at any time.

Sometimes having _anything_ to play with will help come up with a
design, even if it means we throw away the previous implementation. I'd
even try to do it myself if you had an hour or two to share some
pointers.
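
To help visualize it, here is a toy model of the lookup, under heavy
assumptions. In DWARF 5, a `DW_TAG_call_site` entry records the return
address of a call, and each child `DW_TAG_call_site_parameter` pairs the
location a parameter occupies at function entry (`DW_AT_location`) with
an expression for its value as seen from the caller
(`DW_AT_call_value`). The classes and `recover_entry_value` below are
hypothetical, not drgn's API, and the DWARF expression is modeled as a
plain Python callable over the caller's registers.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Registers = Dict[str, int]


@dataclass
class CallSiteParameter:
    # DW_AT_location: where the parameter lives on entry to the callee.
    location: str
    # DW_AT_call_value: how to compute it from the caller's frame.
    call_value: Callable[[Registers], int]


@dataclass
class CallSite:
    # DW_AT_call_return_pc: address the callee returns to.
    return_pc: int
    parameters: List[CallSiteParameter] = field(default_factory=list)


def recover_entry_value(
    call_sites: List[CallSite],
    return_pc: int,
    register: str,
    caller_registers: Registers,
) -> Optional[int]:
    """Model of resolving DW_OP_entry_value(register) in a callee frame.

    Finds the caller's call site by return address, then evaluates the
    matching parameter's value expression in the caller's frame.
    Returns None when the compiler emitted nothing usable.
    """
    for site in call_sites:
        if site.return_pc != return_pc:
            continue
        for param in site.parameters:
            if param.location == register:
                return param.call_value(caller_registers)
    return None


if __name__ == "__main__":
    # Models a caller that did "mov rdi, rbx; call f". rbx is
    # callee-saved, so an unwinder can still recover it in the caller.
    sites = [
        CallSite(0x401234, [CallSiteParameter("rdi", lambda r: r["rbx"])])
    ]
    print(recover_entry_value(sites, 0x401234, "rdi", {"rbx": 0xDEAD}))
```

The value is only recoverable when the call-value expression depends on
something the unwinder can still reconstruct in the caller's frame,
such as a constant or a callee-saved register. How often compilers emit
such entries for kernel code is exactly the open question.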

> - Jeremy Carin, a teaching assistant at Columbia University's Operating
>   Systems course, is looking for good resources to teach beginners
>   kernel debugging.

Honestly, bravo to Jeremy and this course. My Operating Systems course
was entirely pseudocode, and we were allowed to substitute Java for C on
the homework assignments if we preferred. Don't ask me how that worked:
it didn't.

>     - We weren't aware of anything great that already exists, but we had
>       some ideas for the future. Specifically, I would love to have
>       step-by-step walkthroughs of kernel bugs, with the core dumps and
>       debug info available for download so that you can follow along
>       with drgn.

I've thought about this for the purposes of onboarding and training with
tools. The main difficulty I see is finding the actual bug to debug.

We can of course manually corrupt data, insert a use after free, or add
some logic error, but that's... unsatisfying. And many bugs tend to
require so much background knowledge that it would overshadow the tool
training. There have been precious few bugs where I looked at a vmcore
and found a solution readily apparent (and most of those times were when
I was the one who wrote the bug).

I don't know if I have a solution there, just some thoughts.

> - I may or may not be around for the meeting next month since I have a
>   baby coming around then.
>
> Thanks everyone!

