linux-debuggers.vger.kernel.org archive mirror
From: Stephen Brennan <stephen.s.brennan@oracle.com>
To: Petr Tesarik <petr@tesarici.cz>
Cc: Omar Sandoval <osandov@osandov.com>, <linux-debuggers@vger.kernel.org>
Subject: Segmentation fault with drgn + libkdumpfile
Date: Fri, 05 Jan 2024 10:38:16 -0800
Message-ID: <8734vb1v8n.fsf@oracle.com>

Hi Petr,

I recently encountered a segmentation fault with libkdumpfile & drgn
which appears to be related to the cache implementation. I've included
the stack trace at the end of this message, since it's a bit of a longer
one. The exact issue occurred with a test vmcore that I could probably
share with you privately if you'd like. In any case, the reproducer is
fairly straightforward in drgn code:

# Run inside the drgn CLI, where prog and for_each_task are predefined.
for t in for_each_task(prog):
    prog.stack_trace(t)
for t in for_each_task(prog):
    prog.stack_trace(t)

The repetition is required: the segfault only occurs on the second
iteration of the loop, which, in hindsight, is a textbook sign that the
issue has to do with caching. I expect the issue is specific to this
vmcore, since it doesn't reproduce on others.
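
Outside the interactive drgn CLI, the same reproducer could be written
as a standalone script. The following is only a sketch: the helper names
trace_all_tasks() and reproduce() are made up for illustration, and the
vmcore path is whatever file you are testing with. It assumes drgn is
installed and debuginfo for the vmcore can be found.

```python
def trace_all_tasks(prog, iter_tasks, passes=2):
    """Unwind every task `passes` times; return the task count per pass.

    `iter_tasks` is any callable returning an iterable of tasks for
    `prog` (for example drgn's for_each_task helper).
    """
    counts = []
    for _ in range(passes):
        n = 0
        for task in iter_tasks(prog):
            # In the report above, the crash happens on the second pass.
            prog.stack_trace(task)
            n += 1
        counts.append(n)
    return counts


def reproduce(vmcore_path):
    """Hypothetical entry point: load a vmcore and run the two passes."""
    # Imported lazily so the helper above can be used without drgn.
    import drgn
    from drgn.helpers.linux.pid import for_each_task

    prog = drgn.Program()
    prog.set_core_dump(vmcore_path)
    prog.load_default_debug_info()
    return trace_all_tasks(prog, for_each_task)
```

Calling reproduce("/path/to/vmcore") would either return the number of
tasks unwound in each pass or, with an affected libkdumpfile, crash the
interpreter during the second pass.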

I put that into a git bisect script and bisected to the libkdumpfile
commit that introduced it:

commit 487a8042ea5da580e1fdb5b8f91c8bd7cad05cd6
Author: Petr Tesarik <petr@tesarici.cz>
Date:   Wed Jan 11 22:53:01 2023 +0100

    Cache: Calculate eprobe in reinit_entry()

    If this function is called to reuse a ghost entry, the probe list
    has not been walked yet, so eprobe is left uninitialized.

    This passed the test case, because the correct old value was left
    on stack. Modify the test case to poison the stack.

    Signed-off-by: Petr Tesarik <petr@tesarici.cz>

 src/kdumpfile/cache.c      |  6 +++++-
 src/kdumpfile/test-cache.c | 13 +++++++++++++
 2 files changed, 18 insertions(+), 1 deletion(-)
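
For anyone wanting to repeat the bisect, a wrapper for "git bisect run"
might look roughly like the sketch below. It is not the script used for
this report. The one thing worth getting right is the exit status:
git bisect run treats 0 as good, 1-127 (except 125) as bad, 125 as
"skip this commit", and aborts on anything else. A segfault reported by
a shell is 139 (128 + SIGSEGV), which would abort the bisect, so the
wrapper maps every failure to 1. The names bisect_status() and
run_reproducer() and the build_ok flag are illustrative assumptions.

```python
import subprocess
import sys

GOOD, BAD, SKIP = 0, 1, 125


def bisect_status(returncode, build_ok=True):
    """Map a reproducer's return code to a git bisect run exit status."""
    if not build_ok:
        # The commit could not be built, so it cannot be judged.
        return SKIP
    # subprocess reports death by signal N as -N (SIGSEGV is -11);
    # any nonzero result, including that one, counts as bad.
    return GOOD if returncode == 0 else BAD


def run_reproducer(cmd):
    """Run the reproducer command and return its raw return code."""
    return subprocess.run(cmd).returncode
```

A bisect script would then rebuild libkdumpfile at the current commit
and end with something like
sys.exit(bisect_status(run_reproducer(["drgn", "-c", "vmcore", "repro.py"]))),
where vmcore and repro.py are placeholder file names.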

I haven't yet tried to debug the logic of the cache implementation and
create a patch. I'm totally willing to try that, but I figured I would
send this report to you first, to see if there's something obvious that
sticks out to your eyes.

Thanks!
Stephen


#0  0x00007fb9a1ac1ef3 in lzo1x_decompress_safe () from /lib64/liblzo2.so.2
#1  0x00007fb9af164151 in diskdump_read_page (pio=pio@entry=0x7fff1f5fa760)
    at diskdump.c:584
#2  0x00007fb9af16bcd8 in _kdumpfile_priv_cache_get_page (pio=0x7fff1f5fa760,
    fn=0x7fb9af163cd0 <diskdump_read_page>) at read.c:69
#3  0x00007fb9af16bc15 in get_page (pio=0x7fff1f5fa760)
    at /home/stepbren/repos/libkdumpfile/src/kdumpfile/kdumpfile-priv.h:1512
#4  get_page_xlat (pio=pio@entry=0x7fff1f5fa760) at read.c:126
#5  0x00007fb9af16be57 in get_page_maybe_xlat (pio=0x7fff1f5fa760) at read.c:137
#6  _kdumpfile_priv_read_locked (ctx=ctx@entry=0x564a0a9475e0,
    as=as@entry=KDUMP_KVADDR, addr=addr@entry=18446612133360081960,
    buffer=buffer@entry=0x7fff1f5fa917, plength=plength@entry=0x7fff1f5fa848)
    at read.c:169
#7  0x00007fb9af16beee in kdump_read (ctx=ctx@entry=0x564a0a9475e0, as=KDUMP_KVADDR,
    addr=addr@entry=18446612133360081960, buffer=0x7fff1f5fa917,
    plength=plength@entry=0x7fff1f5fa848) at read.c:196
#8  0x00007fb9a1bc9a8f in drgn_read_kdump (buf=<optimized out>,
    address=18446612133360081960, count=<optimized out>, offset=<optimized out>,
    arg=0x564a0a9475e0, physical=<optimized out>) at ../../libdrgn/kdump.c:73
#9  0x00007fb9a1bb67bd in drgn_memory_reader_read (reader=reader@entry=0x564a0aa3c440,
    buf=buf@entry=0x7fff1f5fa917, address=<optimized out>, count=count@entry=4,
    physical=physical@entry=false) at ../../libdrgn/memory_reader.c:260
#10 0x00007fb9a1bc0160 in drgn_program_read_memory (prog=0x564a0aa3c440,
    buf=buf@entry=0x7fff1f5fa917, address=<optimized out>, count=count@entry=4,
    physical=physical@entry=false) at ../../libdrgn/program.c:1648
#11 0x00007fb9a1bb6b61 in drgn_object_read_reference (obj=0x7fff1f5fab10,
    value=value@entry=0x7fff1f5fa990) at ../../libdrgn/object.c:739
#12 0x00007fb9a1bb82c8 in drgn_object_read_value (obj=<optimized out>,
    value=0x7fff1f5fa990, ret=0x7fff1f5fa998) at ../../libdrgn/object.c:782
#13 0x00007fb9a1bb8311 in drgn_object_value_signed (obj=0x7fff1f5fab10,
    ret=0x7fff1f5fa9f0) at ../../libdrgn/object.c:905
#14 0x00007fb9a1bb8632 in drgn_object_is_zero_impl (obj=obj@entry=0x7fff1f5fab10,
    ret=ret@entry=0x7fff1f5faab6) at ../../libdrgn/object.c:1250
#15 0x00007fb9a1bb9338 in drgn_object_is_zero (obj=obj@entry=0x7fff1f5fab10,
    ret=ret@entry=0x7fff1f5faab6) at ../../libdrgn/object.c:1322
#16 0x00007fb9a1babb13 in c_op_bool (obj=0x7fff1f5fab10, ret=0x7fff1f5faab6)
    at ../../libdrgn/language_c.c:3307
#17 0x00007fb9a1bc3471 in drgn_get_initial_registers (ret=0x7fff1f5faad0,
    thread_obj=<optimized out>, tid=0, prog=0x564a0aa3c440)
    at ../../libdrgn/stack_trace.c:631
#18 drgn_get_stack_trace (prog=0x564a0aa3c440, tid=tid@entry=0, obj=<optimized out>,
    prstatus=prstatus@entry=0x0, ret=ret@entry=0x7fff1f5fab88)
    at ../../libdrgn/stack_trace.c:1091
#19 0x00007fb9a1bc4726 in drgn_get_stack_trace (ret=0x7fff1f5fab88, prstatus=0x0,
    obj=<optimized out>, tid=0, prog=<optimized out>)
    at ../../libdrgn/stack_trace.c:1151
#20 0x00007fb9a1b7fc20 in Program_stack_trace (self=0x564a0aa3c430,
    args=<optimized out>, kwds=<optimized out>) at ../../libdrgn/python/program.c:849
...
many more python stack frames :)


Thread overview: 11+ messages
2024-01-05 18:38 Stephen Brennan [this message]
2024-01-05 19:23 ` Segmentation fault with drgn + libkdumpfile Petr Tesařík
2024-01-05 21:53   ` Stephen Brennan
2024-01-08 20:40     ` Petr Tesařík
2024-01-09  9:06       ` Petr Tesařík
2024-01-10  1:40         ` Stephen Brennan
2024-01-10  8:36           ` Petr Tesařík
2024-01-10 13:49             ` Petr Tesařík
2024-01-10 18:03               ` Petr Tesařík
2024-01-10 19:48                 ` Stephen Brennan
2024-01-10 19:58                   ` Petr Tesařík
