From: Stephen Brennan <stephen.s.brennan@oracle.com>
To: "Petr Tesařík" <petr@tesarici.cz>
Cc: Omar Sandoval <osandov@osandov.com>, linux-debuggers@vger.kernel.org
Subject: Re: Segmentation fault with drgn + libkdumpfile
Date: Tue, 09 Jan 2024 17:40:15 -0800
Message-ID: <87sf36yni8.fsf@oracle.com>
In-Reply-To: <20240109100609.4e956beb@meshulam.tesarici.cz>
Petr Tesařík <petr@tesarici.cz> writes:
> On Mon, 8 Jan 2024 21:40:08 +0100
> Petr Tesařík <petr@tesarici.cz> wrote:
>
>> On Fri, 05 Jan 2024 13:53:15 -0800
>> Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
>>
>> > Petr Tesařík <petr@tesarici.cz> writes:
>> > > On Fri, 05 Jan 2024 10:38:16 -0800
>> > > Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
>> > >
>> > >> Hi Petr,
>> > >>
>> > >> I recently encountered a segmentation fault with libkdumpfile & drgn
>> > >> which appears to be related to the cache implementation. I've included
>> > >> the stack trace at the end of this message, since it's a bit of a longer
>> > >> one. The exact issue occurred with a test vmcore that I could probably
>> > >> share with you privately if you'd like. In any case, the reproducer is
>> > >> fairly straightforward in drgn code:
>> > >>
>> > >> for t in for_each_task(prog):
>> > >>     prog.stack_trace(t)
>> > >> for t in for_each_task(prog):
>> > >>     prog.stack_trace(t)
>> > >>
>> > >> The repetition is required; the segfault only occurs on the second
>> > >> iteration of the loop. Which, in hindsight, is a textbook sign that the
>> > >> issue has to do with caching. I'd expect that the issue is specific to
>> > >> this vmcore, since it doesn't reproduce on others.
>> > >>
>> > >> I stuck that into a git bisect script and bisected the libkdumpfile
>> > >> commit that introduced it:
>> > >>
>> > >> commit 487a8042ea5da580e1fdb5b8f91c8bd7cad05cd6
>> > >> Author: Petr Tesarik <petr@tesarici.cz>
>> > >> Date:   Wed Jan 11 22:53:01 2023 +0100
>> > >>
>> > >>     Cache: Calculate eprobe in reinit_entry()
>> > >>
>> > >>     If this function is called to reuse a ghost entry, the probe list
>> > >>     has not been walked yet, so eprobe is left uninitialized.
>> > >>
>> > >>     This passed the test case, because the correct old value was left
>> > >>     on stack. Modify the test case to poison the stack.
>> > >>
>> > >>     Signed-off-by: Petr Tesarik <petr@tesarici.cz>
>> > >>
>> > >>  src/kdumpfile/cache.c      |  6 +++++-
>> > >>  src/kdumpfile/test-cache.c | 13 +++++++++++++
>> > >>  2 files changed, 18 insertions(+), 1 deletion(-)
>> > >
>> > > This looks like a red herring to me. The cache most likely continues in
>> > > a corrupted state without this commit, which may mask the issue (until
>> > > it resurfaces later).
>> >
>> > I see, that makes a lot of sense.
>> >
>> > >> I haven't yet tried to debug the logic of the cache implementation and
>> > >> create a patch. I'm totally willing to try that, but I figured I would
>> > >> send this report to you first, to see if there's something obvious that
>> > >> sticks out to your eyes.
>> > >
>> > > No, but I should be able to recreate the issue if I get a log of the
>> > > cache API calls:
>> > >
>> > > - cache_alloc() - to know the number of elements
>> > > - cache_get_entry()
>> > > - cache_put_entry()
>> > > - cache_insert()
>> > > - cache_discard()
>> > > - cache_flush() - not likely after initialization, but...
>> >
>> > I went ahead and logged each of these calls as you suggested. I tried to
>> > log them at the beginning of each function and always include the
>> > cache pointer, cache entry, and key. I took the resulting log and
>> > filtered it to just contain the most recently logged cache prior to the
>> > crash, compressed it, and attached it. For completeness, the patch
>> > I used is below (applies to tip branch 8254897 ("Merge pull request #78
>> > from fweimer-rh/c99")).
>> >
>> > I'll also see if I can reproduce it based on the log.
>>
>> Thank you for the log. I haven't had much time to look at it, but the
>> first line is a good hint already:
>>
>> 0x56098b68c4c0: cache_alloc(1024, 0)
>>
>> Zero size means the data pointers are managed by the caller, so this
>> must be the cache of mmap()'ed segments. That's the only cache which
>> installs a cleanup callback with set_cache_entry_cleanup(). There is
>> only one call to the cleanup callback for evicted entries in cache.c:
>>
>> 	/* Get an unused cached entry. */
>> 	if (cs->nuprobe != 0 &&
>> 	    (cs->nuprec == 0 || cache->nprobe + bias > cache->dprobe))
>> 		evict = evict_probe(cache, cs);
>> 	else
>> 		evict = evict_prec(cache, cs);
>> 	if (cache->entry_cleanup)
>> 		cache->entry_cleanup(cache->cleanup_data, evict);
>>
>> The entries can be evicted from the probe partition or from the precious
>> partition. This might be relevant. Please, can you re-run and log where
>> the evicted entry comes from?
>
> I found some time this morning, and it turns out that wouldn't help. Because of a bug
> in fcache_new(), the number of elements in the cache is big enough that
> cache entries are never evicted in your case. It's quite weird to hit a
> cache metadata bug after elements have been inserted. FWIW I am not
> able to reproduce the bug by replaying the logged file read pattern.
>
> Since you have a reliable reproducer, it cannot be a Heisenbug. But it
> could be caused by the other cache - the cache of decompressed pages.
> Do you know for sure that lzo1x_decompress_safe() crashes while trying
> to _read_ from the input buffer, and not while trying to _write_ to the
> output buffer?
Hi Petr,
Sorry for the delay here; I got pulled into other issues and am trying
to attend to all my work in a round-robin fashion :)
The fault is definitely in lzo1x_decompress_safe() *writing* to address
0. I fetched debuginfo for all the necessary libraries and we see the
following stack trace:
%<-----------------------
#0 0x00007fcd9adddef3 in lzo1x_decompress_safe (in=<optimized out>,
in_len=<optimized out>, out=0x0, out_len=0x7ffdee2c1388, wrkmem=<optimized out>)
at src/lzo1x_d.ch:120
#1 0x00007fcd9ae25be1 in diskdump_read_page (pio=0x7ffdee2c1590) at diskdump.c:584
#2 0x00007fcd9ae32d4d in _kdumpfile_priv_cache_get_page (pio=0x7ffdee2c1590,
fn=0x7fcd9ae257ae <diskdump_read_page>) at read.c:69
#3 0x00007fcd9ae25e44 in diskdump_get_page (pio=0x7ffdee2c1590) at diskdump.c:647
#4 0x00007fcd9ae32be0 in get_page (pio=0x7ffdee2c1590)
at /home/stepbren/repos/libkdumpfile/src/kdumpfile/kdumpfile-priv.h:1512
#5 0x00007fcd9ae32ed4 in get_page_xlat (pio=0x7ffdee2c1590) at read.c:126
#6 0x00007fcd9ae32f22 in get_page_maybe_xlat (pio=0x7ffdee2c1590) at read.c:137
#7 0x00007fcd9ae32fb1 in _kdumpfile_priv_read_locked (ctx=0x55745bfca8f0,
as=KDUMP_KVADDR, addr=18446612133360081960, buffer=0x7ffdee2c17df,
plength=0x7ffdee2c1698) at read.c:169
#8 0x00007fcd9ae330dd in kdump_read (ctx=0x55745bfca8f0, as=KDUMP_KVADDR,
addr=18446612133360081960, buffer=0x7ffdee2c17df, plength=0x7ffdee2c1698)
at read.c:196
#9 0x00007fcd9afb0cc4 in drgn_read_kdump (buf=0x7ffdee2c17df,
address=18446612133360081960, count=4, offset=18446612133360081960,
arg=0x55745bfca8f0, physical=false) at ../../libdrgn/kdump.c:73
%<-----------------------
In frame 1, where we are calling the decompressor:
%<-----------------------
(gdb) frame 1
#1 0x00007fcd9ae25be1 in diskdump_read_page (pio=0x7ffdee2c1590) at diskdump.c:584
584			int ret = lzo1x_decompress_safe(fch.data, pd.size,
(gdb) list
579			if (ret != KDUMP_OK)
580				return ret;
581		} else if (pd.flags & DUMP_DH_COMPRESSED_LZO) {
582	#if USE_LZO
583			lzo_uint retlen = get_page_size(ctx);
584			int ret = lzo1x_decompress_safe(fch.data, pd.size,
585							pio->chunk.data,
586							&retlen,
587							LZO1X_MEM_DECOMPRESS);
588			fcache_put_chunk(&fch);
(gdb) p retlen
$7 = 0
(gdb) p pio->chunk.data
$8 = (void *) 0x0
(gdb) p fch.data
$9 = (void *) 0x7fcd7cc33da4
(gdb) p pd.size
$10 = 816
%<-----------------------
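To make the failure mode concrete, here is a tiny Python model of what the gdb session shows: the decompressor is handed a NULL output buffer (pio->chunk.data == 0) and faults on its first write. A guard in the read path would turn the crash into an error return. This is only a sketch of that idea, not libkdumpfile's actual code, and the status values are stand-ins:

```python
# Stand-ins for kdump_status values; names are illustrative only.
KDUMP_OK, KDUMP_ERR_CORRUPT = 0, 1

def read_compressed_page(out_buf, compressed):
    """Model of the decompression call: refuse a missing output buffer."""
    if out_buf is None:
        # The real code instead calls lzo1x_decompress_safe() with
        # out == NULL and segfaults writing to address 0.
        return KDUMP_ERR_CORRUPT
    # Placeholder for the actual LZO decompression into the page buffer.
    out_buf[:len(compressed)] = compressed
    return KDUMP_OK

print(read_compressed_page(None, b"\x00" * 816))            # → 1 (error, like our entry->data == NULL)
print(read_compressed_page(bytearray(4096), b"\x00" * 816)) # → 0 (OK)
```

Of course, a guard only papers over the question of how a cache entry with data == NULL got handed out in the first place.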
As far as I can tell, pio->chunk.data comes directly from the
cache_get_page() function in frame 2:
%<-----------------------
(gdb) up
#2 0x00007fcd9ae32d4d in _kdumpfile_priv_cache_get_page (pio=0x7ffdee2c1590,
fn=0x7fcd9ae257ae <diskdump_read_page>) at read.c:69
69		ret = fn(pio);
(gdb) list
64		pio->chunk.data = entry->data;
65		pio->chunk.embed_fces->ce = entry;
66		if (cache_entry_valid(entry))
67			return KDUMP_OK;
68
69		ret = fn(pio);
70		mutex_lock(&ctx->shared->cache_lock);
71		if (ret == KDUMP_OK)
72			cache_insert(pio->chunk.embed_fces->cache, entry);
73		else
(gdb) p *entry
$11 = {key = 1045860353, state = cs_precious, next = 626, prev = 626, refcnt = 1,
data = 0x0}
(gdb) p *pio
$12 = {ctx = 0x55745bfca8f0, addr = {addr = 1045860352, as = ADDRXLAT_MACHPHYSADDR},
chunk = {data = 0x0, nent = 1, {embed_fces = {{data = 0xffff880ff1470788,
len = 140728599320032, ce = 0x55745c1003d8, cache = 0x55745c0fb540}, {
data = 0x55745bfd42f0, len = 140728599320112,
ce = 0x7fcd9ae330ef <kdump_read+102>, cache = 0xffff88003e569c28}},
fces = 0xffff880ff1470788}}}
%<-----------------------
And here is the cache structure:
%<-----------------------
(gdb) p *pio->chunk.embed_fces->cache
$16 = {split = 487, nprec = 1020, ngprec = 248, nprobe = 3, ngprobe = 239, dprobe = 2,
cap = 1024, inflight = 626, ninflight = 1, hits = {number = 168473, address = 168473,
string = 0x29219 <error: Cannot access memory at address 0x29219>,
bitmap = 0x29219, blob = 0x29219}, misses = {number = 1913, address = 1913,
string = 0x779 <error: Cannot access memory at address 0x779>, bitmap = 0x779,
blob = 0x779}, elemsize = 4096, data = 0x7fcd997fe010, entry_cleanup = 0x0,
cleanup_data = 0x0, ce = 0x55745c0fb598}
%<-----------------------
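For what it's worth, the counters in that dump still appear to add up. This is my reading of the fields, not a documented libkdumpfile invariant, but if I assume the precious, probe, and in-flight counts together account for the capacity (with ngprec/ngprobe tracking only ghost entries for evicted keys):

```python
# Counter values copied from the gdb dump of the cache above.
cache = dict(split=487, nprec=1020, ngprec=248, nprobe=3, ngprobe=239,
             dprobe=2, cap=1024, ninflight=1)

# Assumed accounting: in-use entries partition the capacity, while ghost
# counters (ngprec/ngprobe) remember evicted keys separately.
in_use = cache["nprec"] + cache["nprobe"] + cache["ninflight"]
print(in_use, cache["cap"])   # → 1024 1024
```

If that reading is right, the list bookkeeping looks consistent, which makes the valid-looking precious entry with data = 0x0 all the more suspicious.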
Thanks for looking into this! I'll continue investigating more as well.
Stephen
Thread overview: 11+ messages
2024-01-05 18:38 Segmentation fault with drgn + libkdumpfile Stephen Brennan
2024-01-05 19:23 ` Petr Tesařík
2024-01-05 21:53 ` Stephen Brennan
2024-01-08 20:40 ` Petr Tesařík
2024-01-09 9:06 ` Petr Tesařík
2024-01-10 1:40 ` Stephen Brennan [this message]
2024-01-10 8:36 ` Petr Tesařík
2024-01-10 13:49 ` Petr Tesařík
2024-01-10 18:03 ` Petr Tesařík
2024-01-10 19:48 ` Stephen Brennan
2024-01-10 19:58 ` Petr Tesařík